AI/HPC Level 1 Support Engineer

AIHostingHub

Dubai, UAE

Full-time

Entry

2 months ago

engineering, design, project management, maintenance, quality control, technical

Full Job Posting

Company Description

AIHostingHub is the UAE's leading provider of cutting-edge AI and High-Performance Computing (HPC) infrastructure.

We specialize in building large-scale AI data centers and delivering GPU-as-a-Service, from nimble deployments to massive clusters.

As a trusted professional services partner for industry giants like Supermicro and VAST Data in the GCC, we provide the technology, expertise, and support to fuel your most ambitious projects.

Our Services

AI/HPC Data Centers: Custom-built, scalable environments optimized for the most demanding AI workloads.
GPU as a Service: On-demand access to massive GPU clusters, from 2,048 GPUs to over 16,384 GPUs per cluster.
Cybersecurity MSSP: Fortinet and AttackIQ powered, 24/7 managed security to protect your critical infrastructure and data.
Expert Professional Services: End-to-end support from design and deployment to optimization, directly from GCC-based partners.
AIHostingHub prides itself on delivering customized security solutions, dedicated support, and strategic guidance, ensuring that clients can operate confidently in the digital landscape.
Explore the future of cybersecurity with AIHostingHub, where protection is the top priority.

Role Description

This is a full-time, on-site role based in Dubai for an AI/HPC Level 1 Support Engineer.

The role involves providing first-level troubleshooting, technical support, and customer support for AI and high-performance computing (HPC) infrastructure.

Responsibilities include monitoring system performance, resolving operational issues, assisting clients with inquiries, and maintaining operational documentation.

The engineer will also collaborate with internal teams and escalate issues to higher-level support when necessary.

Key Responsibilities

Monitor dashboards (Grafana, ticketing system) for GPU node health, InfiniBand link flaps, temperature, and power anomalies.
Log, categorize, and prioritize incidents (P1–P4) per SLA response times (1h for urgent, 2h for high).
Perform onsite smart‑hands tasks: cable patching, component replacement, fibre cleaning, visual inspections.
Execute post‑repair validation scripts (CUDA P2P, local NCCL tests, DCGMI, STREAM) after RMA replacements.
Coordinate with vendors (NVIDIA, Supermicro) for warranty replacements.
Escalate unresolved issues to Level 2 AI/HPC Engineers.
Maintain operational logs, asset records, and maintenance documentation.
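
As an illustration of the incident-prioritization duty above, here is a minimal sketch of how response deadlines could be tracked. It assumes that P1 corresponds to "urgent" (1h) and P2 to "high" (2h); the posting does not state that mapping, and it gives no targets for P3 or P4, so those are left undefined. The names `RESPONSE_TARGETS`, `response_deadline`, and `is_breached` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Response targets taken from the posting: 1h for urgent, 2h for high.
# Assumption: P1 = urgent and P2 = high. P3/P4 targets are not stated,
# so they are deliberately absent from this table.
RESPONSE_TARGETS = {
    "P1": timedelta(hours=1),
    "P2": timedelta(hours=2),
}


def response_deadline(priority: str, opened_at: datetime) -> Optional[datetime]:
    """Return the latest acceptable first-response time, or None if no target is known."""
    target = RESPONSE_TARGETS.get(priority)
    if target is None:
        return None
    return opened_at + target


def is_breached(priority: str, opened_at: datetime, now: datetime) -> bool:
    """True when a known response target has already passed."""
    deadline = response_deadline(priority, opened_at)
    return deadline is not None and now > deadline
```

A ticketing system such as Jira or ServiceNow would normally enforce these timers itself; the sketch only shows the arithmetic behind them.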

Required Qualifications

1–3 years in datacenter, NOC, or HPC support.
Familiarity with GPU servers, InfiniBand/Ethernet cabling, and fibre optics.
Basic Linux command line (dmesg, nvidia‑smi, grep, uptime).
Understanding of incident management and SLA targets (response/resolution times).
Ability to work rotating shifts in a 24/7 operation (including weekends).
Strong communication and documentation skills.
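
To give a sense of the command-line work listed above (dmesg, grep), the following sketch counts GPU error events in kernel log text. It assumes the NVIDIA driver's commonly seen line shape, `NVRM: Xid (PCI:<address>): <code>, ...`; the pattern and the function name `count_xid_events` are illustrative and may need adjusting for a given driver version.

```python
import re
from collections import Counter

# Assumed line shape: "NVRM: Xid (PCI:0000:3b:00): 79, ..." as commonly
# written by the NVIDIA kernel driver; adjust the pattern if your logs differ.
XID_PATTERN = re.compile(r"NVRM: Xid \((PCI:[0-9A-Fa-f:.]+)\): (\d+)")


def count_xid_events(kernel_log: str) -> Counter:
    """Count (PCI address, Xid code) pairs found in kernel log text."""
    events = Counter()
    for line in kernel_log.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            events[(match.group(1), int(match.group(2)))] += 1
    return events
```

The input text could come from saved `dmesg` output; reading the live kernel log typically requires elevated privileges. Repeated events on one PCI address are the kind of finding that would be recorded in the ticket and escalated to Level 2.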

Preferred

Experience with DCGM, Grafana, or ticketing systems (Jira/ServiceNow).
Knowledge of liquid cooling CDUs or Proxmox/Ceph is a plus.

We Offer

Structured career progression to L2/L3 roles.
Training on HGX platforms and AI validation frameworks.
