Join our team as a Data Engineer, focusing on building scalable data pipelines and analytics solutions on Databricks. This role involves end-to-end data flow management, optimization, and collaboration with various teams to transform raw data into actionable insights. Strong expertise in Databricks, Spark, Delta Lake, and data engineering principles is essential.
We are looking for a skilled Data Engineer to build scalable data pipelines and analytics solutions, particularly on the Databricks platform. The successful candidate will be responsible for the entire data lifecycle, from ingestion to delivery of trusted insights.
This involves designing, developing, implementing, and maintaining end-to-end data flows, continually optimizing performance, and collaborating closely with data scientists, analysts, and business stakeholders. The core objective is to transform raw data into actionable insights that drive informed decision-making across the organization. The role demands a proactive individual with a strong understanding of data engineering principles and a passion for creating efficient, reliable data solutions.
The role covers a wide range of responsibilities, starting with the hands-on design, development, testing, and maintenance of robust data pipelines and ETL/ELT processes within the Databricks ecosystem. This includes working extensively with Delta Lake, Apache Spark (PySpark), and SQL, and using Python, Scala, and SQL notebooks. The engineer will also architect scalable data models and design Data Vault or dimensional schemas to support reporting, business intelligence (BI) initiatives, and advanced analytics projects. The role requires a focus on data quality, data lineage, and governance: implementing rigorous data quality practices, monitoring key metrics, and proactively resolving data issues as they arise. Collaboration is key; the Data Engineer will work closely with Data Platform Engineers on cluster configuration, performance tuning, and cost management in cloud environments, specifically Azure Databricks. The role also involves building and maintaining data ingestion processes from various sources, including relational database management systems (RDBMS), Software as a Service (SaaS) applications, files, and streaming queues, using modern data engineering patterns such as Change Data Capture (CDC), event-driven pipelines, change streams, and Lakeflow Declarative Pipelines.
The engineer will also be expected to develop and maintain Continuous Integration/Continuous Deployment (CI/CD) pipelines for data workflows, ensuring proper versioning, thorough testing, and automated deployments. Partnering with data scientists and analysts to provide clean data, reusable notebooks, and data products, and supporting feature stores and model deployment pipelines, is also a vital part of the role.
Key skills include a strong foundation in data engineering, a thorough understanding of cloud data platforms such as Azure, and significant experience with Databricks. The ideal candidate will have demonstrable expertise in Apache Spark (PySpark), Databricks notebooks, Delta Lake, and SQL. Experience with object storage solutions such as Azure Data Lake Storage (ADLS) is highly desirable. The ability to build and maintain efficient ETL/ELT pipelines, coupled with strong data modeling skills and a proven track record of performance optimization, is essential. Experience with CI/CD methodologies for data pipelines, along with familiarity with orchestration tools such as GitHub Actions, Asset Bundles, or Databricks Jobs, is also highly valued. Strong problem-solving abilities, attention to detail, and the capacity to work effectively within a collaborative, cross-functional team are crucial. Experience with streaming data technologies, such as Structured Streaming, Kafka, and Delta Live Tables, is a plus. Knowledge of data visualization and business intelligence tools such as Splunk, Power BI, or Grafana will be beneficial. A Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field is required, and certifications in Databricks or relevant cloud provider platforms are a significant advantage. The ability to document data lineage, architecture, and operational runbooks, and to participate in architectural reviews and best-practice governance, is a must.
The successful candidate will be a proactive, solutions-oriented individual with a passion for data and a desire to contribute to a data-driven organization. This role offers the opportunity to make a significant impact on the company's data infrastructure and analytics capabilities, working with cutting-edge technologies and collaborating with a talented team of professionals.
