Lead Data Engineer
About This Role
Inception, a G42 company, is the region’s leading innovator of AI-powered domain-specific as well as industry-agnostic products, built on a rich heritage of research and development. Within the G42 ecosystem, Inception functions as the core intelligence layer – transforming data and compute infrastructure into real-world, applied AI solutions. Beyond its commercial endeavors, Inception is committed to creating positive societal impact. For more information, please visit www.inceptionai.ai
Overview
Inception is seeking a highly skilled Lead Data Engineer to architect and build scalable, cloud-native data and AI pipelines that power enterprise LLM, RAG, and retrieval systems.
Responsibilities
- Design, build, and optimize scalable data pipelines for AI/LLM workloads, including vectorization and embedding processing.
- Develop and maintain ETL/ELT workflows for structured, unstructured, and streaming data.
- Create and manage vector database indexing and similarity search pipelines using tools such as FAISS, Pinecone, Weaviate, Qdrant, or Chroma.
- Build retrieval systems for RAG, semantic search, and enterprise knowledge retrieval.
- Develop robust, reusable data orchestration pipelines using Airflow, Spark, or similar tools.
- Architect and manage data pipelines across Azure (primary), AWS, and GCP environments.
- Integrate and optimize storage and processing across SQL, NoSQL, and vector databases.
- Contribute to the design and implementation of event-driven architectures.
- Collaborate with AI teams to enable embedding generation, LLM integration, and model-serving pipelines.
- Ensure end-to-end data quality, monitoring, reliability, and observability.
- Lead or participate in system design for large-scale, distributed data and AI systems.
Required Skills
Programming & Data
- Strong expertise in Python for data processing, APIs, automation, or distributed workloads.
- Strong proficiency in SQL and knowledge of NoSQL databases (MongoDB, DynamoDB, Cosmos DB, etc.).
- Experience with vector databases such as FAISS, Pinecone, Weaviate, Qdrant, or Chroma.
- Strong knowledge of data modeling, pipeline development, and ETL/ELT frameworks.
AI/LLM Infrastructure
- Solid understanding of vectorization, embeddings, and similarity search techniques.
- Familiarity with LLMs, embedding models, and RAG pipeline concepts.
- Experience integrating embedding-generation pipelines via Hugging Face, OpenAI, or other model providers.
Cloud & Distributed Systems
- Proficiency with Azure (primary) and familiarity with AWS and GCP.
- Experience with Docker and containerized development.
- Understanding of Kubernetes is a strong plus.
Orchestration & Big Data
- Expertise in Apache Airflow for scheduling and orchestration.
- Experience with Apache Spark or equivalent distributed processing frameworks.
Architecture & Engineering Fundamentals
- Strong system design fundamentals for scalable and distributed systems.
- Knowledge of event-driven architecture and modern data platforms.
- Strong understanding of DevOps, CI/CD, version control, and observability best practices.
Qualifications
- 8+ years of progressive experience in data engineering, distributed systems, or AI/ML data infrastructure.
- Experience building RAG pipelines in production.
- Knowledge of graph databases or hybrid search systems.
- Understanding of model deployment, inference optimization, and caching techniques for LLM workloads.
- Familiarity with data governance, IAM, and security patterns across cloud ecosystems.
What We Look For
If you are a performance-driven, inquisitive mind with the agility to adapt to ambiguity, you will fit right in. You should be eager to build meaningful collaborations with stakeholders and aspire to create unique, customer-centric solutions. A bias for action and a passion for conquering new frontiers in AI are at the heart of the Inception community.
What Working At Inception Offers
Culture: An open, diverse and inclusive environment with a global vision that encourages personal growth and focuses on ground-breaking, industry-first innovations.
Career: Outstanding learning, development & growth opportunities via structured training programs and innovative, high-tech projects.
Rewards: A competitive remuneration package with a host of perks including healthcare, education support, leave benefits and more.
If you can confidently demonstrate that you meet the criteria above, please contact us as soon as possible.