Join our team as an Azure Data Engineer and build a next-generation data platform for private lending. You'll design, build, and scale ETL/ELT pipelines using Azure Data Factory, Databricks (Delta Lake), Synapse Analytics, and ADLS Gen2. Collaborate with data scientists, BI developers, and risk/compliance teams to deliver analytics for risk modeling, portfolio performance, and regulatory reporting.
We are seeking a highly skilled Azure Data Engineer to lead the design, build, and scaling of our data platform for private lending operations. The successful candidate will own robust ETL and ELT pipelines end to end, using Azure Data Factory (ADF), Databricks (with Delta Lake), Synapse Analytics, and Azure Data Lake Storage Gen2 (ADLS Gen2), and will integrate structured and semi-structured data from sources such as core lending systems, CRM platforms, payment gateways, credit bureaus, and third-party REST APIs. Working closely with data scientists, Business Intelligence (BI) developers, and risk and compliance teams, you will deliver analytics for risk modeling, portfolio performance analysis, collections optimization, and regulatory reporting. This is an opportunity to shape our data infrastructure and directly inform critical business decisions.

The core responsibilities span technical and collaborative work:

- Design and build pipelines: Develop scalable, secure ADF pipelines and Databricks notebooks for batch and near-real-time ingestion and transformation, and implement the medallion architecture (Bronze, Silver, Gold layers) on ADLS Gen2 and Delta Lake.
- REST API integrations: Implement robust API ingestion with OAuth2 authentication, cursor- and offset-based pagination, retry and backoff, error handling, idempotent upserts, and incremental watermarking (a Python sketch follows this list).
- Data modeling and ELT: Build curated fact and dimension models for lending data (e.g. loan, repayment, collateral, delinquency) optimized for Synapse Analytics and Power BI; implement Slowly Changing Dimension (SCD) Type 2, partitioning strategies, and schema evolution.
- SQL excellence: Author high-performance T-SQL (stored procedures, views, efficient indexing) and use MERGE and upsert operations for transformations, reconciliation, and regulatory extracts (see the merge sketch after this list).
- On-premises to cloud migration: Modernize and replace legacy SQL Server Integration Services (SSIS) packages with ADF Data Flows or Databricks, and orchestrate dependencies and scheduling for reliable pipeline execution.
- CDC and streaming: Implement Change Data Capture from SQL Server, or file-drop ingestion with Auto Loader, handling late-arriving data and deduplication (a streaming sketch also follows).
- Security and governance: Apply Managed Identity, Key Vault for secrets, RBAC and ACLs, encryption at rest and in transit, and robust PII controls, and integrate with Azure Purview for data lineage and cataloging.
- Observability: Configure Log Analytics, dashboards, and alerting for ADF, Databricks, and Synapse, driving pipeline reliability and cost optimization.
- DevOps: Use Azure DevOps (Git and YAML pipelines) for CI/CD across ADF, Databricks, and Synapse, with environment promotion and configuration managed as code.
- Stakeholder collaboration: Partner with product, risk, finance, and compliance teams to translate business requirements into reliable data solutions and define SLAs for data delivery.
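As a taste of the API-integration work, here is a minimal Python sketch of incremental REST ingestion with OAuth2 client-credentials auth, cursor pagination, retry with exponential backoff, and a watermark filter. All endpoints, parameters, and field names (`login.example.com`, `updated_since`, `next_cursor`) are hypothetical placeholders, not any specific vendor's API.

```python
import time
import requests

TOKEN_URL = "https://login.example.com/oauth2/token"  # hypothetical identity provider
API_URL = "https://api.example.com/v1/repayments"     # hypothetical lending API


def get_token(client_id: str, client_secret: str) -> str:
    """Obtain a bearer token via the OAuth2 client-credentials grant."""
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]


def fetch_since(token: str, watermark: str, max_retries: int = 5):
    """Yield pages of records updated after `watermark`, following a cursor."""
    params = {"updated_since": watermark, "page_size": 500}
    while True:
        for attempt in range(max_retries):
            resp = requests.get(
                API_URL,
                headers={"Authorization": f"Bearer {token}"},
                params=params,
                timeout=30,
            )
            # Back off exponentially on throttling (429) or server errors (5xx).
            if resp.status_code == 429 or resp.status_code >= 500:
                time.sleep(2 ** attempt)
                continue
            resp.raise_for_status()
            break
        else:
            raise RuntimeError("API still failing after retries")
        body = resp.json()
        yield body["items"]
        cursor = body.get("next_cursor")  # cursor-based pagination
        if cursor is None:
            return
        params["cursor"] = cursor
```

In practice the new watermark (e.g. the maximum `updated_at` seen) would be persisted only after a page lands in the Bronze layer, so a failed run replays the same slice and the idempotent upsert below absorbs any duplicates.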
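For the idempotent-upsert requirement, one common Databricks expression is a Delta Lake MERGE from Bronze into Silver. A minimal sketch, assuming a Delta-enabled Spark session and illustrative table and key names (`bronze.repayments`, `payment_id`) rather than the team's actual schema:

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Incremental slice of the (hypothetical) Bronze table, deduplicated per key
# so MERGE sees at most one source row per payment_id.
updates = (
    spark.table("bronze.repayments")
    .where(F.col("updated_at") > "2024-01-01T00:00:00")  # watermark from last run
    .dropDuplicates(["payment_id"])
)

# Idempotent upsert: re-running the same batch leaves Silver unchanged.
(
    DeltaTable.forName(spark, "silver.repayments").alias("t")
    .merge(updates.alias("s"), "t.payment_id = s.payment_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```

The same shape carries over to T-SQL's MERGE statement on the Synapse side, and extends to SCD Type 2 by inserting a new row version and closing the old one instead of updating in place.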
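And for the CDC/streaming bullet, a minimal Auto Loader sketch; the `cloudFiles` source is Databricks-specific, and the ADLS paths, target table, and `event_time` timestamp column are assumptions for illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical ADLS Gen2 landing and checkpoint locations.
SRC = "abfss://landing@lakeacct.dfs.core.windows.net/payments/"
CHK = "abfss://bronze@lakeacct.dfs.core.windows.net/_checkpoints/payments/"

stream = (
    spark.readStream.format("cloudFiles")        # Databricks Auto Loader
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", CHK)    # schema inference + evolution
    .load(SRC)
    # Tolerate two hours of late-arriving records and drop replayed
    # duplicates within that window (assumes event_time is a timestamp).
    .withWatermark("event_time", "2 hours")
    .dropDuplicates(["payment_id", "event_time"])
)

(
    stream.writeStream
    .option("checkpointLocation", CHK)
    .trigger(availableNow=True)                  # incremental, batch-style run
    .toTable("bronze.payments")
)
```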
The ideal candidate has a strong foundation in data engineering principles and a proven track record of delivering data solutions. Required and desirable qualifications:

- 6-8 years of experience in data engineering, including at least 3 years of hands-on Azure experience (ADF, Databricks, Synapse, ADLS Gen2).
- Strong SQL and T-SQL proficiency: complex joins, window functions, dynamic SQL, stored procedures, performance tuning, and MERGE/upsert operations.
- Proven REST API integration experience: OAuth2 and client-credentials authentication, pagination, throttling and rate limits, retries, and incremental ingestion.
- Expertise with Databricks (PySpark) and Delta Lake, including ACID guarantees, time travel, schema evolution, and OPTIMIZE/VACUUM maintenance (a short sketch follows this list).
- Experience building data models and curated layers optimized for BI and analytics; familiarity with Synapse Serverless and Dedicated SQL pools is a plus.
- A strong grasp of data security: PII protection, data masking, encryption, and access control.
- Excellent communication skills and a proactive, ownership mindset in cross-functional teams.
- Highly valuable: lending and fintech domain knowledge, including loan origination, loan servicing, delinquency analysis, collections, vintage and roll-rate analytics, and regulatory reporting.
- Beneficial: familiarity with Azure Purview (glossary and data lineage) and data quality frameworks such as Great Expectations or Deequ.
- Experience with cost optimization: cluster sizing, partitioning and file-size best practices, and the trade-offs between serverless and dedicated compute.
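To illustrate the Delta Lake maintenance knowledge the list above asks for, a short Databricks-flavored sketch of OPTIMIZE, VACUUM, and time travel (ZORDER is Databricks/Delta-specific; table names are again illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows frequently filtered by loan_id.
spark.sql("OPTIMIZE silver.repayments ZORDER BY (loan_id)")

# Drop unreferenced files, keeping 7 days (168 h) of history for time travel.
spark.sql("VACUUM silver.repayments RETAIN 168 HOURS")

# Time travel: reproduce a regulatory extract as of an earlier table version.
snapshot = spark.sql("SELECT * FROM silver.repayments VERSION AS OF 42")
```

Note the trade-off: aggressive VACUUM retention saves storage but shortens the time-travel window, which matters when regulatory extracts must be reproducible.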
Keywords: Azure Data Engineer, Data Engineering, Azure Data Factory, Databricks, Synapse Analytics
