AI / LLM Deployment Engineer

Walker Lovell
Abu Dhabi Emirate, UAE
Full-time
Mid-Senior
Today
engineering, design, project management, maintenance, quality control, technical

Full Job Posting

Location

Remote (GST time zone preferred) with occasional travel to Abu Dhabi if required

Travel

Occasional international travel

Compensation

Exceptional package reflecting seniority, technical expertise and impact

What's in it for you?

This isn't another AI application role.

You'll lead the deployment of large language models, including DeepSeek, Kimi and Qwen, into sovereign, air-gapped environments where GPU performance, inference optimisation and security are business-critical.

If you're passionate about high-performance AI infrastructure, this is an opportunity to solve problems that very few engineers get to tackle.

Package and Benefits

  • Exceptional package reflecting seniority and specialist expertise
  • Fully remote initially with flexibility for future relocation if desired
  • Visa sponsorship available where applicable
  • Work with cutting-edge open-weight LLMs and enterprise GPU infrastructure
  • Influence deployment architecture from the ground up

Why this business

Join a globally focused technology business developing sovereign AI and intelligence platforms for highly regulated environments across multiple international markets.

Working at the forefront of secure AI deployment, the organisation is investing heavily in advanced infrastructure and offers the opportunity to solve technically demanding challenges alongside a highly experienced engineering team.

What you'll be doing

  • Architect and deploy LLMs including DeepSeek, Kimi, Qwen and LLaMA into secure, air-gapped production environments
  • Configure and optimise NVIDIA H100/H200 GPU clusters, NVLink and InfiniBand infrastructure for high-performance inference
  • Apply GPTQ, AWQ and GGUF quantisation techniques to maximise deployment efficiency without compromising model performance
  • Deploy and optimise inference runtimes including vLLM, TGI and Ollama within Kubernetes environments, delivering target throughput and latency SLAs

What you'll bring

  • Proven commercial experience deploying production LLMs using vLLM, TGI, Ollama or equivalent inference platforms
  • Expert knowledge of Kubernetes, NVIDIA GPU infrastructure, GPU memory optimisation and high-performance computing
  • Hands-on experience with model quantisation techniques including GPTQ, AWQ or GGUF
  • Experience delivering on-premise or air-gapped AI deployments. Experience within government, defence, cyber security or other highly regulated environments would be advantageous.

Who this suits

You're an infrastructure engineer who thrives on solving complex deployment challenges rather than building AI applications.

You understand what it takes to run large language models reliably at scale, enjoy optimising GPU performance and want to work on technically demanding projects where security, performance and engineering excellence are non-negotiable.

Apply now for a confidential conversation with Walker Lovell.
