Data Quality Engineer (Databricks)
About the Role
The role involves designing and implementing data quality frameworks using Databricks, PySpark, and Delta Lake, focusing on automated profiling and quality metrics.
Full Job Posting
Job Description
Data Quality Engineer (Databricks) x 4 Positions
Location: Abu Dhabi - UAE - Onsite, Open to Relocate
Duration: 6 months (Extendable to One Year)
Experience
5 to 7 Years
Project start date is 1st July. Immediate joiners will be preferred.
Role Overview
The Data Quality Engineer will be responsible for designing, implementing, and operating ADC's enterprise data quality framework within the Databricks platform.
The role will deliver automated profiling, quality rule execution, cleansing, monitoring, remediation support, and quality reporting capabilities across 170 datasets and 1,346 prioritised Critical Data Elements (CDEs).
Working closely with Data Modellers, Data Catalogue Specialists, business data owners, and platform engineers, the Data Quality Engineer will establish scalable and reusable quality controls that improve trust, accuracy, completeness, consistency, timeliness, validity, and uniqueness across ADC's data estate.
Key Responsibilities
Databricks Platform Configuration and Administration
Configure and manage the Databricks environment supporting enterprise data quality operations.
Establish and maintain:
Compute clusters.
PySpark notebook frameworks.
Delta Lake storage structures.
Unity Catalog integration.
Optimise platform performance for large-scale profiling and rule execution across all in-scope datasets and CDEs.
Implement development, testing, and production deployment standards for data quality assets.
Data Profiling and Quality Assessment
Design and develop AI-assisted profiling notebooks using PySpark.
Perform baseline data quality assessments across the six quality dimensions:
Completeness.
Accuracy.
Consistency.
Validity.
Timeliness.
Uniqueness.
Capture and analyse:
Null value rates.
Duplicate records.
Invalid values.
Format violations.
Outliers.
Schema drift.
Produce quality profiling outputs for all prioritised CDEs and datasets.
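As a rough illustration of the profiling measures listed above, the sketch below computes a null rate and a duplicate count for one column. It is written in plain Python rather than PySpark so that it runs without a cluster; the function name `profile_column` and the sample rows are hypothetical and not part of the posting.

```python
from collections import Counter
from typing import Any, Dict, List

def profile_column(rows: List[Dict[str, Any]], column: str) -> Dict[str, float]:
    """Return basic profiling metrics for one column (hypothetical helper)."""
    values = [row.get(column) for row in rows]
    total = len(values)
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    # Every occurrence beyond the first of a value counts as a duplicate.
    duplicates = sum(c - 1 for c in counts.values() if c > 1)
    return {
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "duplicate_count": duplicates,
        "distinct_count": len(counts),
    }

# Made-up sample data: one null and one repeated value.
rows = [{"id": 1}, {"id": 1}, {"id": None}, {"id": 2}]
stats = profile_column(rows, "id")
```

In a Databricks notebook the same measures would more likely be expressed as DataFrame aggregations; this version only shows what is being counted.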
Data Quality Rule Factory Development
Design and implement a reusable Data Quality Rule Factory.
Build parameterised PySpark-based rule templates capable of supporting large-scale rule deployment.
Enable automated generation and management of approximately 6,730 data quality rules without manual rule-by-rule development.
Ensure rules are reusable, configurable, and maintainable across multiple datasets and domains.
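The posting's figures (approximately 6,730 rules across 1,346 CDEs) work out to about five rules per CDE, which is the kind of volume a template-driven approach is meant to handle. The sketch below shows one possible shape for a rule factory: a small set of parameterised templates expanded from configuration. It uses plain Python rather than PySpark so it runs anywhere, and every name in it (`Rule`, `TEMPLATES`, `build_rules`, the sample columns) is a hypothetical illustration, not the employer's design.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

@dataclass(frozen=True)
class Rule:
    rule_id: str
    cde: str
    dimension: str
    check: Callable[[Row], bool]  # returns True when the row passes

# Hypothetical templates: each takes a column plus parameters
# and returns a row-level check.
def not_null(column: str) -> Callable[[Row], bool]:
    return lambda row: row.get(column) is not None

def in_range(column: str, low: float, high: float) -> Callable[[Row], bool]:
    return lambda row: row.get(column) is not None and low <= row[column] <= high

# Template name -> (quality dimension, template function).
TEMPLATES = {
    "not_null": ("completeness", not_null),
    "in_range": ("validity", in_range),
}

def build_rules(config: List[Dict[str, Any]]) -> List[Rule]:
    """Expand configuration entries into concrete rules."""
    rules = []
    for entry in config:
        dimension, template = TEMPLATES[entry["template"]]
        check = template(entry["column"], **entry.get("params", {}))
        rule_id = f"{entry['column']}.{entry['template']}"
        rules.append(Rule(rule_id, entry["column"], dimension, check))
    return rules

# Made-up configuration for two CDEs.
config = [
    {"template": "not_null", "column": "customer_id"},
    {"template": "in_range", "column": "age", "params": {"low": 0, "high": 120}},
]
rules = build_rules(config)
row = {"customer_id": None, "age": 34}
failed = [r.rule_id for r in rules if not r.check(row)]
```

Adding a rule for another CDE then means adding a configuration entry rather than writing new code, which is the property the responsibilities above describe.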
Data Quality Controls and Pipeline Integration
Deploy quality rules as reusable Databricks Jobs integrated into Delta Lake processing pipelines.
Embed quality controls within Bronze, Silver, and Gold processing stages.
Implement automated quality gates preventing data progression where defined thresholds are not met.
Maintain rule traceability and execution history for audit and governance purposes.
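A quality gate of the kind described above can be pictured as a comparison between each rule's pass rate and a configured threshold, with the batch blocked when any rule falls short. The following is a minimal plain-Python sketch under that assumption; `QualityGateError`, `enforce_gate`, the stage name, and the 0.95 threshold are all hypothetical.

```python
from typing import Dict, List

class QualityGateError(Exception):
    """Raised when a batch must not progress to the next stage."""

def pass_rate(outcomes: List[bool]) -> float:
    # Assumption: an empty batch is treated as fully passing.
    return sum(outcomes) / len(outcomes) if outcomes else 1.0

def enforce_gate(stage: str, pass_rates: Dict[str, float],
                 thresholds: Dict[str, float]) -> None:
    """Block progression when any rule's pass rate is below its threshold."""
    breaches = {
        rule: rate
        for rule, rate in pass_rates.items()
        # Assumption: rules without a configured threshold must pass fully.
        if rate < thresholds.get(rule, 1.0)
    }
    if breaches:
        raise QualityGateError(f"{stage} gate blocked: {breaches}")

# Made-up outcomes: three of four rows pass, against a 95% threshold.
pass_rates = {"customer_id.not_null": pass_rate([True, True, True, False])}
thresholds = {"customer_id.not_null": 0.95}
try:
    enforce_gate("silver", pass_rates, thresholds)
    blocked = False
except QualityGateError:
    blocked = True
```

Raising an exception is one way to make a job task fail so that downstream tasks do not run; other designs might write a status flag instead.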
Data Cleansing and Quality Improvement
Develop automated remediation and cleansing pipelines using PySpark.
Implement:
Standardisation routines.
Data enrichment processes.
Deduplication logic.
Schema harmonisation controls.
Deploy machine learning models managed through MLflow for:
Anomaly detection.
Exact duplicate detection.
Fuzzy matching and duplicate identification.
Ensure all AI and ML recommendations are explainable, auditable, and routed through human-in-the-loop validation processes where required.
Exception Management and Reprocessing
Design and manage exception handling processes for failed quality records.
Implement quarantine Delta Lake tables serving as the Failed Record Register.
Capture and maintain:
Failure reason.
Associated CDE.
Rule reference.
Processing timestamp.
Resolution status.
Develop reprocessing workflows to support correction and controlled re-ingestion of remediated records.
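To make the Failed Record Register fields concrete, the sketch below splits a batch into passing rows and quarantined entries that carry the five attributes listed above. It is plain Python rather than a Delta Lake table, and the names `FailedRecord`, `split_failures`, the rule reference `DQ-EMAIL-001`, and the `OPEN` status value are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]

@dataclass
class FailedRecord:
    record: Row
    failure_reason: str
    cde: str
    rule_reference: str
    processing_timestamp: datetime
    resolution_status: str = "OPEN"  # assumed initial status

def split_failures(rows: List[Row], cde: str, rule_reference: str,
                   check: Callable[[Row], bool],
                   reason: str) -> Tuple[List[Row], List[FailedRecord]]:
    """Separate passing rows from rows destined for the quarantine register."""
    passed: List[Row] = []
    register: List[FailedRecord] = []
    for row in rows:
        if check(row):
            passed.append(row)
        else:
            register.append(FailedRecord(
                row, reason, cde, rule_reference, datetime.now(timezone.utc)))
    return passed, register

# Made-up batch: one valid row and one with a missing email.
rows = [{"email": "a@example.com"}, {"email": None}]
passed, register = split_failures(
    rows, "email", "DQ-EMAIL-001",
    lambda r: r["email"] is not None, "email is null")
```

A reprocessing workflow could then select entries whose status has been set to a resolved value and feed the corrected records back through the same checks.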
Data Quality Metrics and Reporting
Develop Delta Lake metric aggregation structures supporting enterprise quality reporting.
Calculate and publish:
Data Quality Index (DQI) scores.
Dimension-level quality metrics.
Rule pass/fail rates.
Dataset compliance scores.
SLA adherence indicators.
Provide curated outputs to support Power BI quality dashboards and executive reporting.
Monitoring, Alerting and Predictive Quality Management
Configure automated quality monitoring and alerting mechanisms.
Implement threshold-based notifications using:
Databricks SQL Alerts.
Azure Monitor integrations.
Develop predictive risk scoring models to identify datasets at risk of future quality degradation.
Support proactive quality management and operational intervention activities.
Root Cause Analysis and Remediation Support
Apply Databricks machine learning and pattern analysis techniques to profiling and rule execution outputs.
Support AI-assisted root cause analysis across established remediation categories.
Identify recurring quality issues, systemic defects, and process breakdowns.
Produce prioritised remediation datasets for business and operational stakeholders.
Export remediation outputs to Power BI and Excel to support:
Remediation Tiering Matrix.
Prioritisation Scoring Models.
Governance reporting processes.
Governance, Compliance and Stakeholder Engagement
Collaborate with Data Modellers and Data Catalogue Specialists to ensure quality controls align with authoritative data definitions and metadata standards.
Support DDA and DGE governance processes by producing required quality artefacts and evidence.
Maintain documentation, version control, and audit trails for all quality assets, rules, models, and processes.
Participate in quality reviews, governance forums, and stakeholder workshops.
Required Skills and Experience
Strong experience designing and implementing enterprise Data Quality frameworks.
Advanced Databricks engineering experience.
Strong PySpark development skills.
Experience with:
Delta Lake.
Unity Catalog.
Databricks Workflows and Jobs.
More from this employer
More jobs at Datamatics Technologies
