Lead Data Engineer - Azure Databricks/Kafka
About the Role
Design and develop streaming ingestion pipelines using Apache Spark (Structured Streaming) and Databricks Auto Loader to consume files from cloud storage or messages from Kafka/RabbitMQ/Confluent Cloud and ingest them into Delta Lake, while supporting schema evolution and exactly-once semantics.
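For readers unfamiliar with the term, exactly-once ingestion is commonly achieved by making each micro-batch write idempotent, so that a retried batch is not written twice. The following is a minimal plain-Python sketch of that idea only; `IdempotentSink` and `write_batch` are hypothetical names for illustration, not Spark, Databricks, or Delta Lake APIs.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentSink:
    """Toy stand-in for a transactional table (hypothetical, not a Delta API)."""
    rows: list = field(default_factory=list)
    last_committed_batch: int = -1  # highest batch id already written

    def write_batch(self, batch_id: int, records: list) -> bool:
        """Append records unless this batch id was already committed."""
        if batch_id <= self.last_committed_batch:
            return False  # replay after a retry or restart: skip it
        # In a real table, the data and the batch id would commit atomically.
        self.rows.extend(records)
        self.last_committed_batch = batch_id
        return True


sink = IdempotentSink()
sink.write_batch(0, [{"id": 1}, {"id": 2}])
sink.write_batch(1, [{"id": 3}])
replayed = sink.write_batch(1, [{"id": 3}])  # simulated retry of batch 1
```

The retried batch is rejected because its id is not greater than the last committed id, so the sink still holds three rows.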
Full Job Posting
Overview
Design and develop streaming ingestion pipelines using Apache Spark (Structured Streaming) and Databricks Auto Loader to consume files from cloud storage or messages from Kafka/RabbitMQ/Confluent Cloud and ingest them into Delta Lake, while supporting schema evolution and exactly-once semantics.

Implement CDC and deduplication logic by capturing change events from source databases using Debezium, the built-in CDC features of SQL Server or Oracle, or other connectors, and by applying watermarking and drop-duplicates strategies based on primary keys and event timestamps.

Scale ingestion through configuration by building a config-driven framework, for example with Airflow, DBX Jobs, or Delta Live Tables, that iterates over metadata tables to deploy or update ingestion pipelines for hundreds of tables and sources without code duplication.

Implement monitoring, observability, and security by capturing streaming query metrics and publishing them to monitoring platforms such as Prometheus and Grafana, setting up dashboards for lag, files processed, and processing duration, and enforcing role-based access control, encryption, and data masking.

Participate in DevOps processes by using CI/CD pipelines, such as Jenkins or GitHub Actions, to automate job deployment, managing infrastructure with Terraform or similar tools, and following best practices for version control and code reviews.

This role requires 5–8 years of experience designing and building data pipelines using Apache Spark, Databricks, or equivalent big data frameworks, along with hands-on expertise in streaming and messaging systems such as Apache Kafka, Confluent Cloud, RabbitMQ, or Azure Event Hubs, including creating producers, consumers, and topics and integrating them into downstream processing.

Candidates should have a deep understanding of relational databases and CDC, with proficiency in SQL Server, Oracle, or other RDBMSs and experience capturing change events using Debezium or native CDC tools; proficiency in a programming language such as Python, Scala, or Java; solid knowledge of SQL for data manipulation and transformation; cloud platform expertise, specifically with Azure or AWS services for data storage, compute, and orchestration; and knowledge of data lakehouse architectures, Delta Lake, partitioning strategies, and performance optimization. Additionally, familiarity with Git, CI/CD pipelines, and infrastructure-as-code is essential.
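The deduplication responsibility above (watermarking plus dropping duplicates by primary key and event timestamp) can be illustrated with a simplified plain-Python model. This is a sketch under stated assumptions, not Spark code: the function `dedupe_latest`, the field names `id`, `event_ts`, and `op`, and the 10-minute watermark are all hypothetical, and real Structured Streaming watermarks manage state eviction differently.

```python
from datetime import datetime, timedelta


def dedupe_latest(events, key="id", ts="event_ts", watermark=timedelta(minutes=10)):
    """Keep the newest event per primary key.

    Events are processed in arrival order. An event is dropped as too late
    when its timestamp is more than `watermark` behind the maximum event
    time seen so far.
    """
    latest = {}
    max_seen = None
    for event in events:
        event_time = event[ts]
        if max_seen is None or event_time > max_seen:
            max_seen = event_time
        if event_time < max_seen - watermark:
            continue  # arrived behind the watermark: ignore
        current = latest.get(event[key])
        # Strictly newer wins, so exact duplicates are dropped.
        if current is None or event_time > current[ts]:
            latest[event[key]] = event
    return list(latest.values())


t0 = datetime(2024, 1, 1, 12, 0)
events = [
    {"id": 1, "event_ts": t0, "op": "insert"},
    {"id": 1, "event_ts": t0, "op": "insert"},  # exact duplicate
    {"id": 2, "event_ts": t0 + timedelta(minutes=30), "op": "insert"},
    {"id": 1, "event_ts": t0 + timedelta(minutes=5), "op": "update"},  # too late
    {"id": 2, "event_ts": t0 + timedelta(minutes=31), "op": "update"},
]
result = dedupe_latest(events)
```

Here the duplicate insert for key 1 is discarded, the late update for key 1 falls behind the watermark and is ignored, and key 2 resolves to its most recent update.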
More from this employer
More jobs at Virtusa
SAS Transaction Monitoring SME
Dubai, UAE
Virtusa is seeking a SAS Transaction & Fraud Monitoring Applications Expert to lead hands-on technical design and implementation of SAS Transaction Monitoring and Fraud Management solutions. The role requires strong expe
Project Manager – Retail Lending Programme
Dubai, UAE
Virtusa is seeking a Project Manager to lead a retail lending programme involving migration of LOS, LMS, and Collections systems. Requires 12-14 years of banking/financial services experience and strong project delivery
QA - Agentic SDLC
Dubai, UAE
We are seeking a QA professional to govern AI-driven development lifecycles by writing autonomous testing agents, engineering testing scenarios, and building governance frameworks. The role involves designing and managin
Finacle Automation Testing (Selenium/ Playwright)
Dubai, UAE
Virtusa is seeking a skilled Finacle Test Engineer to join their banking technology team. The role involves designing and executing test cases for Finacle 11X core banking modules, building automated test suites using Se
Business Analyst Finacle
Dubai, UAE
The Finacle Business Analyst will act as a key liaison between Business and IT, responsible for requirement gathering, solution design, customization support, and delivery of Finacle-related initiatives. The role focuses
Android
Dubai, UAE
Develop and maintain native Android applications using Kotlin and Java, implement clean architecture and modern UI with Jetpack Compose, and ensure high-quality code through testing and CI/CD.
Data Project Manager - Treasury
Dubai, UAE
Key Responsibilities End-to-End Delivery Ownership: Lead the complete lifecycle of data platform delivery, managing the transition from high-velocity design sprints into multi-month, complex engineering and implementatio
Security Tools Engineer
Dubai, UAE
Develop and support SailPoint IdentityIQ / Identity Security Cloud workflows, rules, connectors, certifications, provisioning, and access request processes. Integrate applications, Active Directory, LDAP, databases, and