
Lead Data Engineer

Inception · Abu Dhabi, UAE · 3 weeks ago · Mid-Senior · Full-time
AWS · Azure · GCP · Scala · SEM · SQL · VAT
Via LinkedIn

About This Role

Inception, a G42 company, is the region’s leading innovator of AI-powered domain-specific as well as industry-agnostic products, built on a rich heritage of research and development. Within the G42 ecosystem, Inception functions as the core intelligence layer – transforming data and compute infrastructure into real-world, applied AI solutions. Beyond its commercial endeavors, Inception is committed to creating positive societal impact. For more information, please visit www.inceptionai.ai

Overview

Inception is seeking a highly skilled Lead Data Engineer to architect and build scalable, cloud-native data and AI pipelines that power enterprise LLM, RAG, and retrieval systems.

Responsibilities

  • Design, build, and optimize scalable data pipelines for AI/LLM workloads, including vectorization and embedding processing.
  • Develop and maintain ETL/ELT workflows for structured, unstructured, and streaming data.
  • Create and manage vector database indexing and similarity search pipelines using tools such as FAISS, Pinecone, Weaviate, Qdrant, or Chroma.
  • Build retrieval systems for RAG, semantic search, and enterprise knowledge retrieval.
  • Develop robust, reusable data orchestration pipelines using Airflow, Spark, or similar tools.
  • Architect and manage data pipelines across Azure (primary), AWS, and GCP environments.
  • Integrate and optimize storage and processing across SQL, NoSQL, and vector databases.
  • Contribute to the design and implementation of event-driven architectures.
  • Collaborate with AI teams to enable embedding generation, LLM integration, and model-serving pipelines.
  • Ensure end-to-end data quality, monitoring, reliability, and observability.
  • Lead or participate in system design for large-scale, distributed data and AI systems.

Required Skills

Programming & Data

  • Strong expertise in Python for data processing, APIs, automation, or distributed workloads.
  • Strong proficiency in SQL and knowledge of NoSQL databases (MongoDB, DynamoDB, Cosmos DB, etc.).
  • Experience with vector databases such as FAISS, Pinecone, Weaviate, Qdrant, or Chroma.
  • Strong knowledge of data modeling, pipeline development, and ETL/ELT frameworks.

AI/LLM Infrastructure

  • Solid understanding of vectorization, embeddings, and similarity search techniques.
  • Familiarity with LLMs, embedding models, and RAG pipeline concepts.
  • Experience integrating embedding-generation pipelines via Hugging Face, OpenAI, or other model providers.

Cloud & Distributed Systems

  • Proficiency with Azure (primary) and familiarity with AWS and GCP.
  • Experience with Docker and containerized development.
  • Understanding of Kubernetes is a strong plus.

Orchestration & Big Data

  • Expertise in Apache Airflow for scheduling and orchestration.
  • Experience with Apache Spark or equivalent distributed processing frameworks.

Architecture & Engineering Fundamentals

  • Strong system design fundamentals for scalable and distributed systems.
  • Knowledge of event-driven architecture and modern data platforms.
  • Strong understanding of DevOps, CI/CD, version control, and observability best practices.

Qualifications

  • 8+ years of progressive experience in data engineering, distributed systems, or AI/ML data infrastructure.
  • Experience building RAG pipelines in production.
  • Knowledge of graph databases or hybrid search systems.
  • Understanding of model deployment, inference optimization, and caching techniques for LLM workloads.
  • Familiarity with data governance, IAM, and security patterns across cloud ecosystems.

What We Look For

If you are a performance-driven, inquisitive mind with the agility to adapt to ambiguity, you will fit right in. You should be eager to build meaningful collaborations with stakeholders and aspire to create unique, customer-centric solutions. A bias for action and a passion for conquering new frontiers in the AI space are at the heart of the Inception community.

What Working At Inception Offers

Culture: An open, diverse and inclusive environment with a global vision that encourages personal growth and focuses on ground-breaking, industry-first innovations.

Career: Outstanding learning, development & growth opportunities via structured training programs and innovative, high-tech projects.

Rewards: A competitive remuneration package with a host of perks including healthcare, education support, leave benefits and more.

If you can confidently demonstrate that you meet the criteria above, please contact us as soon as possible.
