
AI Inference Infrastructure Engineer

Dizzaract FZ LLC
Abu Dhabi, UAE
Full-time
2 months ago

About Dizzaract

Dizzaract is a UAE-based game development studio founded in 2022, headquartered at Yas Creative Hub, Abu Dhabi.

We develop cutting-edge AI-powered games and systems, including our innovation R&D laboratory FAR labs, the upcoming hero shooter Farcana, and the AI gaming identity platform GAMED.

Our research and development team boasts over 100 peer-reviewed papers and more than 20 patents in AI-driven gameplay, digital ownership, and competitive design.

With a diverse team of more than 80 professionals from over 20 countries, we are committed to innovation, excellence, and building a culture that drives performance and results.

**The Mission:** We are building a highly optimized, decentralized AI inference network.

To beat the latency and throughput of established centralized players, we cannot rely on off-the-shelf wrappers.

You will be responsible for building the bare-metal, ultra-low-latency infrastructure that serves large language models and multimodal networks at unprecedented scale.

What You Will Do

  • **Core Engine Development:** Architect and write highly optimized, low-level code (primarily in Rust and C) to manage model loading, memory allocation, and request batching across a distributed fleet of GPUs/NPUs.
  • **Hardware-Aware Optimization:** Implement tensor mathematics optimizations and custom kernels (CUDA/Triton) to squeeze maximum FLOPS out of the hardware.
  • **Zero-Intervention Deployments:** Build rock-solid, fully packaged infrastructure pipelines. We operate with zero manual intervention: no ad-hoc scripts, no PowerShell band-aids. If a node fails, the network must heal autonomously.
  • **Decentralized Orchestration:** Design the peer-to-peer or decentralized routing logic that ensures high availability and optimal load balancing across geographically distributed nodes.
  • **Advanced Inference Techniques:** Implement and optimize techniques like continuous batching, speculative decoding, and paged attention (vLLM, TensorRT-LLM) customized for our specific network architecture.

What We Are Looking For

  • Deep expertise in systems programming (Rust, C) and a strong aversion to bloated, high-level abstractions where performance matters.
  • Proven experience with GPU programming (CUDA, ROCm) and low-level hardware architecture.
  • Strong understanding of deep learning architectures (Transformers, Mamba) and how tensor operations execute on silicon.
  • Experience building highly concurrent, distributed systems with sub-millisecond network latency requirements.
