Senior Engineer - HPC Operations
About This Role
About Us
Core42, a leader in AI-powered cloud and digital infrastructure, is driving transformative technology solutions globally.
Leveraging advanced resources and partnerships, Core42 empowers clients to harness sovereign AI infrastructure, especially in sectors with stringent regulatory needs.
With a mission to redefine digital transformation, we combine sovereign capabilities with scalable, high-performance compute infrastructure, positioning ourselves at the forefront of AI innovation in the Middle East and beyond.
The opportunity
We are seeking a highly skilled Senior Engineer – HPC Operations to oversee the daily operations and support of high-performance computing clusters designed to power large-scale AI and ML workloads.
This role ensures stable, secure, and high-performing infrastructure leveraging technologies such as Slurm, Kubernetes, and modern MLOps platforms.
The ideal candidate will bring deep technical expertise in HPC and a strong operational mindset to drive continuous improvement and automation across globally distributed environments.
Responsibilities
The role also extends to collaborating with multidisciplinary teams, leading complex projects, implementing cutting-edge technologies, and mentoring operations engineers.
Your key responsibilities
- Lead the daily operational support of HPC infrastructure including compute, storage, networking, and scheduler components (Slurm, Kubernetes, etc.).
- Lead efforts to maximize the efficiency and performance of HPC systems, ensuring optimal resource utilization and minimal downtime.
- Act as the primary technical escalation point for L2 support teams and ensure prompt resolution of incidents and service requests.
- Monitor system health, performance, and utilization using advanced tools (e.g., Prometheus, Grafana, DCGM).
- Manage user environments for AI/ML workloads including container orchestration (e.g., Docker, Kubernetes) and workflow tools (e.g., MLflow, Kubeflow).
- Implement and manage job scheduling policies, priorities, and partitions within Slurm and/or Kubernetes environments to ensure fairness and efficiency.
- Lead root cause analysis (RCA) of operational issues and contribute to post-mortem documentation and continuous improvement efforts.
- Provide mentorship and guidance to junior engineers and participate in the on-call rotation if required.
- Ensure compliance with security and operational policies; assist in audits and documentation for change and incident management processes.
Qualifications
What we’re looking for
(a) Required skills / qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
- 7+ years of experience in HPC operations, systems engineering, or DevOps roles.
- Advanced knowledge and expertise in configuring, optimizing, and maintaining complex HPC environments, including hardware, software, and storage systems.
- Hands-on experience managing Slurm clusters and/or Kubernetes-based environments for AI/ML workloads.
- Expert knowledge of GPU resource management, workload schedulers, and performance tuning for AI/ML workloads.
- Experience with monitoring and observability frameworks such as Prometheus, Grafana, and DCGM.
- Strong scripting and automation skills (Python, Bash, Ansible, Terraform).
- In-depth understanding of Linux (RHEL/CentOS/Ubuntu), networking concepts (RDMA, InfiniBand, RoCE), and storage technologies (NFS, Lustre, Ceph).
What working at Core42 offers
With a diverse team of 1,100+ employees from 68 nationalities, we foster an inclusive, innovative and collaborative environment.
At Core42, we foster a culture grounded in trust, accountability and high performance.
We are united by our values: Grit, where we overcome challenges with resilience and determination; Passion, which drives us to pursue excellence in everything we do; and Impact, as we aim to inspire progress and create meaningful change.
Our team members thrive in an environment where each person’s contributions propel us forward, and together, we commit to achieving extraordinary results.
- Competitive Salary: We offer an attractive salary package based on your skills and experience.
- Yearly Bonus: In recognition of your contributions, you will receive a performance-based annual bonus.
- Exclusive Discount Cards: Access special benefits with Esaad and Fazaa cards, offering discounts across a wide range of services.
- Premium Family Insurance: We provide comprehensive health coverage, including dental, vision and life insurance, ensuring the well-being of you and your family.
- Learning & Development: We offer access to top-tier learning platforms to help you grow in your career. Learn at your own pace with unlimited access to premium courses.