About the position
We’re hiring an experienced Azure Data Engineer to design, build, and scale our next-generation data platform for private lending. You’ll own robust ETL/ELT pipelines using Azure Data Factory, Databricks (Delta Lake), Synapse Analytics, and ADLS Gen2—integrating structured and semi-structured data from core lending systems, CRM, payment gateways, credit bureaus, and third-party REST APIs. You’ll collaborate with data scientists, BI developers, and risk/compliance teams to deliver analytics for risk modeling, portfolio performance, collections, and regulatory reporting.
What You’ll Do (Key Responsibilities)
- Design & Build Pipelines: Develop scalable, secure ADF pipelines and Databricks notebooks for batch and near-real-time ingestion and transformations; implement medallion architecture (Bronze/Silver/Gold) on ADLS Gen2/Delta.
- REST API Integrations: Implement robust API ingestion with OAuth2, pagination (cursor/offset), retry/backoff, error handling, idempotent upserts, and incremental watermarks (see the ingestion sketch after this list).
- Data Modeling & ELT: Build curated fact/dimension models (loan, repayment, collateral, delinquency) optimized for Synapse/Power BI; implement SCD Type 2, partitioning, and schema evolution.
- SQL Excellence: Author high-performance T-SQL (stored procedures, views, indexing strategies, MERGE/upserts) for transformations, reconciliations, and regulatory extracts.
- On-Prem to Cloud: Modernize/replace SSIS packages with ADF/Data Flows or Databricks; orchestrate dependencies and schedules.
- CDC & Streaming: Implement CDC from SQL Server or file drops (Auto Loader), handling late-arriving data and deduplication.
- Security & Governance: Apply Managed Identity, Key Vault, RBAC/ACLs, encryption at rest/in transit, and PII controls; integrate with Purview for lineage and catalog.
- Observability: Configure Log Analytics, dashboards, and alerting for ADF/Databricks/Synapse; drive reliability and cost optimization.
- DevOps: Use Azure DevOps (Git + YAML pipelines) for CI/CD across ADF/Databricks/Synapse; implement environment promotion and configuration as code.
- Stakeholder Collaboration: Partner with product, risk, finance, and compliance teams to translate requirements into robust data solutions and SLAs.
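To illustrate the API ingestion pattern referenced above, here is a minimal sketch in Python using the requests library. The token URL, API endpoint, credential names, and response fields (data, next_cursor, updated_after) are hypothetical placeholders, not a specific vendor's API.

```python
import time
import requests

# Hypothetical endpoints for illustration only.
TOKEN_URL = "https://auth.example.com/oauth2/token"
API_URL = "https://api.example.com/v1/repayments"


def get_token(client_id: str, client_secret: str) -> str:
    """OAuth2 client-credentials grant."""
    resp = requests.post(
        TOKEN_URL,
        data={"grant_type": "client_credentials",
              "client_id": client_id,
              "client_secret": client_secret},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]


def fetch_since(token: str, watermark: str, max_retries: int = 5) -> list:
    """Cursor-paginated, incremental pull with exponential backoff on throttling."""
    cursor, records = None, []
    while True:
        params = {"updated_after": watermark, "limit": 500}
        if cursor:
            params["cursor"] = cursor
        for attempt in range(max_retries):
            resp = requests.get(
                API_URL,
                params=params,
                headers={"Authorization": f"Bearer {token}"},
                timeout=60,
            )
            # Back off on rate limits / transient server errors, raise on the last attempt.
            if resp.status_code in (429, 500, 502, 503) and attempt < max_retries - 1:
                time.sleep(2 ** attempt)
                continue
            resp.raise_for_status()
            break
        payload = resp.json()
        records.extend(payload["data"])
        cursor = payload.get("next_cursor")  # stop when the API signals the last page
        if not cursor:
            return records
```

In practice the watermark (e.g., the max updated_after value already landed in Bronze) would be read from and written back to the lake or a control table so each run only pulls new or changed records.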
Requirements
- 6–8 years in data engineering, including 3+ years on Azure (ADF, Databricks, Synapse, ADLS Gen2).
- Strong SQL/T-SQL: complex joins, window functions, dynamic SQL, stored procedures, performance tuning, and MERGE/upserts.
- Hands-on with SQL Server and SSIS (migration/modernization to ADF preferred).
- Proven REST API integration experience: authentication (OAuth2, client credentials), pagination, throttling/rate limits, retries, and incremental ingestion.
- Expertise with Databricks (PySpark) and Delta Lake (ACID, time travel, schema evolution, optimize/vacuum); an upsert sketch follows this list.
- Experience building data models and curated layers for BI/analytics; familiarity with Synapse Serverless/Dedicated.
- Solid grasp of data quality (validations, reconciliations), error handling, and observability.
- Strong understanding of data security (PII, masking, encryption, access controls) and compliance-oriented design.
- CI/CD with Azure DevOps (repos, branching, pull requests, pipelines).
- Excellent communication and ownership mindset in cross-functional environments.
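As a reference point for the Databricks/Delta Lake expectation above, a minimal idempotent upsert sketch in PySpark using the Delta Lake merge API. The storage paths, business key (repayment_id), and column names are hypothetical.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# Hypothetical Bronze batch and Silver target for illustration.
bronze_batch = spark.read.format("delta").load(
    "abfss://bronze@lakehouse.dfs.core.windows.net/repayments_batch")
silver_path = "abfss://silver@lakehouse.dfs.core.windows.net/repayments"

# Deduplicate the incoming batch on the business key, keeping the latest version per key.
latest = (bronze_batch
          .withColumn("rn", F.row_number().over(
              Window.partitionBy("repayment_id").orderBy(F.col("updated_at").desc())))
          .filter("rn = 1")
          .drop("rn"))

# Idempotent upsert into the curated Delta table: re-running the same batch changes nothing.
(DeltaTable.forPath(spark, silver_path)
 .alias("t")
 .merge(latest.alias("s"), "t.repayment_id = s.repayment_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())
```

The same merge pattern extends to SCD Type 2 by closing out the current row (end date / current flag) in the whenMatched branch and inserting the new version instead of updating in place.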
Nice-to-Have (Good to Bring)
- Domain knowledge in lending/fintech, e.g., loan origination/servicing, delinquency, collections, vintage/roll rate analytics, regulatory reporting.
- Python for orchestration/helper scripts, Power BI exposure, or Event Hubs/Azure Functions for event-driven ingestion.
- Purview (glossary, lineage), Great Expectations/Deequ-style data quality frameworks (a simple check is sketched after this list).
- Cost optimization (cluster sizing, partitioning/file size best practices, serverless vs dedicated trade-offs).
- Infrastructure-as-code experience (Bicep/Terraform).
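For the data quality point above, a minimal hand-rolled sketch in PySpark in the spirit of Great Expectations/Deequ-style checks; the table path, column names, and reconciliation rule are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical curated (Gold) table for illustration.
loans = spark.read.format("delta").load(
    "abfss://gold@lakehouse.dfs.core.windows.net/fact_loan")

checks = {
    # Business key must be non-null and unique.
    "null_loan_ids": loans.filter(F.col("loan_id").isNull()).count(),
    "duplicate_loan_ids": loans.groupBy("loan_id").count().filter("count > 1").count(),
    # Balances should reconcile: outstanding = principal - repaid (small tolerance).
    "balance_mismatches": loans.filter(
        F.abs(F.col("principal") - F.col("repaid") - F.col("outstanding")) > 0.01
    ).count(),
}

failed = {name: n for name, n in checks.items() if n > 0}
if failed:
    # In a pipeline this would surface as an alert (e.g., Log Analytics) rather than a raise.
    raise ValueError(f"Data quality checks failed: {failed}")
```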
Desired Skills:
- Azure
- ADF
- Databricks
- Synapse
- ADLS Gen2
- REST API