Senior Data Engineer

cander
Abu Dhabi, UAE
Full-time
Mid-Senior
Today
Big Data, ETL, Data Warehousing, Cloud Computing (AWS, Azure, GCP)

Full Job Posting

Overview

Headquartered in Abu Dhabi, United Arab Emirates, the company specializes in developing AI-powered supply chain solutions that integrate fragmented SAP, Ariba, and unstructured data into actionable intelligence.

The company focuses on architecting ultra-secure data lakehouses to deliver real-time analytics, secure high-performance pipelines, and enable advanced generative AI applications, supporting critical enterprise workflows and procurement automation in defense and technical sectors.

Job Summary

We are seeking a Senior Data Engineer to architect and develop the core data infrastructure that will underpin our AI-driven transformation.

In this pivotal role, you will move beyond traditional ETL processes to design and deploy high-security Data Lakehouse environments, establishing a single, authoritative source of truth for our AI systems.

You will lead the technical implementation of 'Workstream 1: Data & Platform Foundations,' a critical initiative for our flagship projects.

Your responsibilities will include integrating complex enterprise systems such as SAP S/4HANA and Ariba, as well as processing unstructured data like technical drawings and regulatory documents.

By collaborating across teams, you will build high-performance pipelines and systems that power Intelligent Supply Chain forecasting and generative AI tools.

Operating within a structured 'Sprint Zero' framework, you will ensure robust data lineage, security, and compliance with defense-grade standards.

This role demands expertise in both structured and unstructured data engineering, with a focus on creating scalable, secure, and high-performance data architectures.

Your work will directly enable AI agents to optimize procurement processes and support engineers in developing next-generation systems, making you instrumental in transforming raw data into strategic advantage for national defense capabilities.

Key Responsibilities

  • Design and deploy defense-grade Data Lakehouse architectures to serve as the Single Source of Truth for AI-driven Supply Chain Intelligence, ensuring high-performance pipelines and systems for Intelligent Supply Chain forecasting and generative AI tools.
  • Lead the technical execution of Workstream 1: Data & Platform Foundations, mapping rigid enterprise systems (e.g., SAP S/4HANA, Ariba) and collaborating with cross-functional teams to integrate and process complex unstructured data (technical drawings, regulatory text) for AI applications.
  • Architect and deploy ingestion pipelines to extract high-volume transactional data from ERP and related enterprise systems such as SAP S/4HANA, Ariba, and PLM, ensuring near real-time availability for forecasting models and AI agents.
  • Build connectors for external market intelligence feeds (e.g., S&P Global, Orbis, EcoVadis) to enrich internal procurement data with macroeconomic and geopolitical signals for enhanced decision-making.
  • Design and implement a standardized procurement data model and taxonomy across multiple entities, harmonizing fragmented datasets into a cohesive analytics layer.
  • Engineer pipelines to ingest, process, and transform unstructured technical data (PDF tender documents, CAD metadata, historical CONOPS) into vector-ready formats for Retrieval-Augmented Generation (RAG) applications.
  • Manage and optimize Vector Databases (e.g., Weaviate) to store embeddings of archival proposals and engineering snippets, ensuring high-speed retrieval for AI drafting assistants and generative tools.
  • Establish data lineage and traceability protocols to link requirements to physical components, supporting Model-Based Systems Engineering (MBSE) and the Digital Thread implementation.
  • Implement Role-Based Access Control (RBAC), audit logging, and data redaction policies to ensure compliance with export controls and strict on-premise security requirements.
  • Deploy automated data quality frameworks to validate Bill of Materials (BOM) completeness and cost data accuracy before ingestion into AI models.
  • Optimize data pipelines for on-premise GPU clusters and air-gapped environments, ensuring efficiency and performance within existing infrastructure constraints.
  • Operate within a structured Sprint Zero environment to ensure data lineage, security, and governance meet defense-grade standards.

Qualifications And Experience

  • 5+ years of experience in Data Engineering, with at least 2 years focused on building pipelines for Machine Learning or Generative AI applications in an enterprise setting.
  • Expert proficiency in Python, SQL, and modern data engineering frameworks including Apache Spark, Kafka, and Airflow.
  • Strong experience extracting data from complex ERP environments, specifically SAP S/4HANA and SAP Ariba; familiarity with SAP BTP is a plus.
  • Deep understanding of Data Lakehouse architectures (Databricks/Delta Lake), Relational Databases (PostgreSQL), and Vector Databases (Weaviate/Milvus).
  • Experience building pipelines for RAG solutions, Conversational AI agents, and classical ML models using tools such as dbt, Dagster, or Prefect.
  • Proficiency with containerization (Docker, Kubernetes) and CI/CD pipelines for deploying data workflows in secure environments.
  • Experience in Supply Chain, Manufacturing, or Defense sectors, with the ability to understand 'Bill of Materials' (BOM) structures and procurement lifecycles.
  • Ability to navigate the governance challenges between agile data work and rigid systems engineering requirements, ensuring data deliverables meet formal Stage Gate reviews.
  • Proven ability to collaborate with Data Scientists and Backend Engineers to define data schemas that support predictive modeling and AI agents.

Technical Skills

  • Expert proficiency in Python and SQL for data engineering tasks
  • Strong experience with modern data engineering frameworks, including Apache Spark, Kafka, and Airflow
  • Deep expertise in extracting and processing data from complex ERP environments, specifically SAP S/4HANA and SAP Ariba; familiarity with SAP BTP is a plus
  • Comprehensive understanding of Data Lakehouse architectures, such as Databricks and Delta Lake, for scalable and secure data storage
  • Experience with relational databases, including PostgreSQL, and vector databases like Weaviate and Milvus for AI-driven applications
  • Proven ability to develop data pipelines for Retrieval-Augmented Generation (RAG) solutions, conversational agents, and classical machine learning models using tools like dbt, Dagster, or Prefect
  • Proficiency in containerization technologies, including Docker and Kubernetes, and experience implementing CI/CD pipelines for secure data workflow deployment
  • Knowledge of defense-grade security protocols, including Role-Based Access Control (RBAC), audit logging, and data redaction policies for compliance with export controls and on-premise security requirements
  • Experience building and optimizing pipelines for high-performance GPU clusters and air-gapped environments
  • Familiarity with automated data quality frameworks to validate critical datasets such as Bill of Materials (BOM) and cost data for AI model accuracy
  • Ability to design and implement standardized procurement data models and taxonomies to harmonize fragmented datasets across multiple entities
  • Experience engineering pipelines for unstructured data ingestion, including PDF tender documents, CAD metadata, and historical CONOPS, transforming them into vector-ready formats for AI applications
  • Knowledge of Model-Based Systems Engineering (MBSE) principles and the ability to establish data lineage and traceability protocols for the Digital Thread

About Our Sister Company

This role is based within our sister company, a trusted and integral part of our broader organizational network.

As a key extension of our client, this sister company shares our mission, values, and commitment to excellence while operating as an independent entity to deliver specialized expertise and tailored solutions.

Together, we leverage collective strengths to drive innovation, efficiency, and growth across our global operations.

Role Overview

The client is seeking a Senior Data Engineer to design and construct the foundational data infrastructure for an AI-driven transformation.

This role extends beyond traditional ETL processes to develop defense-grade Data Lakehouse architectures that serve as the authoritative 'Single Source of Truth' for AI agents.

You will lead the technical execution of 'Workstream 1: Data & Platform Foundations,' a critical component of the organization’s flagship initiatives.

Your responsibilities include mapping rigid enterprise systems—such as SAP S/4HANA and Ariba—while collaborating with cross-functional teams to interpret complex unstructured data (e.g., technical drawings, regulatory documents).

The goal is to enhance high-performance pipelines and systems that power Intelligent Supply Chain forecasting and generative AI tools.

All work will be conducted within a structured 'Sprint Zero' environment, ensuring rigorous data lineage and security compliance with defense-grade standards.

Core Objectives

This role demands a strategic approach to integrating structured and unstructured data sources to enable advanced analytics and AI capabilities.

You will architect scalable solutions that harmonize fragmented datasets into a unified layer, support real-time decision-making, and ensure compliance with stringent governance and security protocols.

The focus is on bridging enterprise systems with cutting-edge AI applications, optimizing data workflows for performance, and maintaining traceability across procurement, manufacturing, and defense-related data assets.

Why Join This Role?

This role offers a unique opportunity to architect the foundational data infrastructure that underpins national defense capabilities.

Your contributions will directly empower AI-driven systems to negotiate procurement contracts and support engineers in developing cutting-edge defense technologies.

If you are passionate about transforming raw data into actionable strategic assets and ready to build the robust frameworks that define the future of defense innovation, this position provides the ideal platform to make a meaningful impact.

Join us in shaping the next generation of defense solutions.
