
DevOps Engineer - AI Infrastructure & GPU Orchestration

NEXUS AIDC INC · Dubai, UAE · 2 days ago · Mid-Senior · Full-time
Kubernetes · Linux · Scala · DevOps · VAT
Via LinkedIn

About This Role

Company Description

NEXUS is revolutionizing the data center industry with the first AI-native Data Center Operating System. Addressing the growing complexity of AI-driven workloads and infrastructure, our platform unifies DCIM, APM, FinOps, Kubernetes orchestration, AI workload management, and full-stack observability into one intelligent, real-time system. With cutting-edge predictive intelligence and automated remediation, the platform ensures optimized performance, cost efficiency, and seamless AI deployment. At NEXUS, we are shaping a future with autonomous infrastructure intelligence for smarter, more efficient decisions.

Role Description

This is a full-time hybrid role for a DevOps Engineer specializing in AI Infrastructure and GPU Orchestration. The DevOps Engineer will be responsible for building and maintaining scalable infrastructure, implementing infrastructure as code (IaC), developing automation scripts, streamlining continuous integration workflows, and managing Linux-based systems. The role also involves optimizing GPU clusters, collaborating with software developers, and ensuring high system performance to support innovative AI-driven workloads.

Key Responsibilities

  • GPU Workload Orchestration: Design and manage complex Kubernetes environments (EKS, AKS, GKE, or bare metal) specifically tuned for AI/ML workloads, including GPU scheduling, device plugins, and node affinity.
  • DCIM Integration: Build and maintain infrastructure pipelines that interface with Data Center Infrastructure Management (DCIM) systems to monitor power, cooling, and hardware health at the rack level.
  • Advanced APM & Telemetry: Implement deep Application Performance Monitoring (APM) and observability stacks (Prometheus, Grafana, Datadog) to track GPU utilization, memory bandwidth, and workload latency in real time.
  • Infrastructure as Code (IaC): Architect and deploy scalable, multi-cloud and hybrid environments using Terraform or equivalents, ensuring our platform can deploy rapidly into diverse enterprise environments.
  • CI/CD for AI Infrastructure: Own the CI/CD pipelines (GitHub Actions, GitLab CI) that deliver our orchestration software, ensuring zero-downtime deployments for mission-critical AI systems.
  • Performance Tuning: Work closely with the core engineering team to optimize network routing, storage I/O, and compute resource allocation for heavy AI training and inference workloads.

Qualifications

  • Minimum 3-5 years of professional experience in DevOps, SRE, or Infrastructure Engineering, with a strong focus on high-performance computing or AI infrastructure.
  • Expert-level skills in Terraform, Ansible, or similar technologies and CI/CD automation, coupled with strong scripting abilities in Python, Go, or Bash.
  • Strong knowledge of Continuous Integration tools (e.g., Jenkins, GitHub Actions, GitLab CI/CD).
  • Background in System Administration and expertise in managing multi-OS environments.
  • Understanding of GPU clusters and experience handling modern AI workloads.
  • Deep, hands-on experience with Kubernetes, specifically managing stateful workloads, custom resource definitions (CRDs), and GPU node provisioning.
  • Proven ability to design and implement comprehensive APM and telemetry solutions for complex, distributed systems.
  • Understanding of data center operations, including power, thermal management, and hardware-level monitoring.
  • Multi-cloud infrastructure experience is a plus.
  • Ability to troubleshoot and optimize performance across complex infrastructure.
  • Strong problem-solving abilities and a collaborative mindset.
