AI Infrastructure Engineer

Dautom · Sharjah, UAE · 4 days ago
Mid-Senior · Full-time

Skills

Engineering, Design, Project Management

About This Role

Overview

The AI Infrastructure Engineer is a platform specialist responsible for architecting, building, and operating high-performance AI infrastructure to support advanced AI workloads, including LLMs, GenAI, Computer Vision, and MLOps.

This role will focus on managing GPU clusters (NVIDIA A100/H100), deploying and maintaining Red Hat OpenShift AI (RHODS), and ensuring secure, scalable, and cost-efficient AI platforms across SDD’s Sovereign Cloud and hybrid/multi-cloud environments.

The engineer will enable enterprise-grade AI adoption for 200+ government entities.

GPU & AI Platform Architecture

Design and implement GPU-based compute clusters.

Define reference architectures for LLM hosting, Vector Databases, MLOps, and high-performance storage/networking.

Deliverable: Fully operational GPU-based AI infrastructure.

KPI: GPU Cluster Uptime and Performance Utilization.

KPI: Reduction in Cost per Training/Inference Workload.

GPU Cluster Operations

Install, configure, and optimize core components: CUDA, cuDNN, NCCL, NVIDIA Drivers, and GPU Operators.

Implement GPU partitioning, scheduling, and performance tuning for high-end GPUs (e.g., A100/H100).

Deliverable: High-availability architecture for all AI workloads.

Deliverable: Complete documentation and runbooks.

OpenShift AI (RHODS) Management

Deploy, configure, and maintain the Red Hat OpenShift AI (RHODS) platform for multi-tenant use.

Manage the integration of the NVIDIA GPU Operator for efficient GPU scheduling, and support data scientists with notebooks, training, and inference endpoints.

Deliverable: Production-ready OpenShift AI (RHODS) platform.

KPI: AI Project Onboarding Speed.

LLM & Model Serving

Build and manage infrastructure for hosting and serving open-source LLMs (Llama, Falcon, Mistral), including support for RAG pipelines, LoRA adapters, and Vector Databases (Milvus, pgvector).

Deliverable: Multi-model LLM serving environment for entities.

KPI: MLOps Pipeline Success Rate and Deployment Frequency.

MLOps & Automation

Implement IaC (Terraform, Ansible) and GitOps for the automated lifecycle management of the AI platform (node onboarding, scaling, model rollout/rollback).

Build robust MLOps pipelines for data prep, training, evaluation, and monitoring (using tools like MLflow/Kubeflow).

Deliverable: Infrastructure automation via Terraform & Ansible.

KPI: Automation Coverage for AI Infrastructure.

& Experience

  • Experience: 7–12 years in Cloud Infrastructure, DevOps, ML Infrastructure, or Platform Engineering.
  • Deep Hands-On Expertise:
    • GPU Systems (NVIDIA A100/H100), Linux, Containers, and Kubernetes.
    • OpenShift AI (RHODS) or equivalent Kubernetes GPU orchestration.
    • LLM Hosting (Llama, Mistral, Falcon, etc.) and supporting Vector Databases/RAG systems.
  • Strong Experience In: TensorFlow, PyTorch, Hugging Face, Distributed Training (DDP, DeepSpeed), and MLOps Stacks (MLflow, Kubeflow).

Essential Skills & Competencies

  • Technical: Deep understanding of GPU compute, HPC architectures, and ML performance profiling. Strong skills in IaC (Terraform/Ansible), CI/CD, and OpenShift/Kubernetes operators.
  • Soft Skills: Strong troubleshooting, optimization, and performance engineering mindset. Excellent cross-functional collaboration and documentation skills.

Preferred Certifications

  • NVIDIA Deep Learning / AI Infrastructure Certification
  • Red Hat OpenShift AI specialization
  • Kubernetes CKA/CKAD
  • Azure AI or Oracle Cloud AI certifications
  • Terraform & Ansible certifications
