TopDev

Data Engineer (Azure)
SOFTWORLD VIETNAM LTD.
Ho Chi Minh City, Ho Chi Minh | 1 day ago | Negotiable
Job Description
Your role & responsibilities
- Collaborate closely with Business Analysts, Data Architects, and Functional SMEs to understand source system data structures and business rules.
- Analyse source system schemas, tables, relationships, and data patterns to identify relevant data elements required for migration.
- Develop and maintain Source-to-Target Mapping (STM) documents.
- Identify data gaps, inconsistencies, and transformation requirements during the mapping process.
- Design and develop data ingestion pipelines using Microsoft Fabric Data Pipelines, Azure Data Factory, or Synapse pipelines.
- Build data pipelines to extract data from SaaS platforms, legacy databases, files, APIs, and enterprise applications.
- Load data into Microsoft Fabric OneLake (Lakehouse or Warehouse) environments.
- Implement pipelines for initial bulk migrations, incremental loads, delta processing, and cutover migrations.
- Develop transformation logic using Fabric Notebooks (PySpark / Spark SQL), Dataflows Gen2, and SQL transformations in Fabric Warehouse.
- Implement business transformation rules defined in source-to-target mappings.
- Standardize and cleanse data prior to loading into target systems.
- Design transformation workflows following Bronze / Silver / Gold architecture patterns.
- Implement ingestion into Fabric Lakehouse tables (Delta format).
- Manage partitioning, indexing, and file compaction strategies.
- Implement validation checks, including record counts, field-level reconciliation, completeness checks, and consistency validation.
- Build automated data quality validation frameworks.
- Monitor pipeline execution and troubleshoot issues.
- Ensure migration runs meet defined performance and timing requirements.
- Optimize Spark jobs and SQL queries in Microsoft Fabric.
- Implement parallel processing and batch strategies for large datasets.
- Implement CI/CD pipelines using Azure DevOps and Fabric deployment pipelines.
- Maintain version control for notebooks, pipelines, SQL scripts, and transformation logic.
- Implement security using Microsoft Entra ID, RBAC, and Fabric workspace security.
Your skills & qualifications
Core Data Engineering Skills
- Strong SQL development skills
- Experience building ETL/ELT pipelines
- Experience working with large-scale datasets
- Understanding of data modelling and data architecture
Microsoft Fabric Skills
- Fabric Lakehouse
- Fabric Data Pipelines
- Fabric Notebooks
- Dataflows Gen2
- Fabric Warehouse
- OneLake architecture
- Delta Lake format
Azure Platform Skills
- Azure Data Factory
- Azure Storage / ADLS Gen2
- Azure DevOps
- Azure Key Vault
- Microsoft Entra ID
Programming Skills
- SQL
- Python (PySpark preferred)
- Spark SQL