ML/Inference Engineer
Overview
Hi! This is the Yandex Family community, and many great international teams trust us with their hiring.
We're looking for a Principal/Senior ML/Inference Engineer to join the Mirai team, fully remote, with a salary paid in foreign currency.
We're ready to offer from $150,000 gross per year.
Mirai makes AI models run fast and cheaply directly on devices rather than in the cloud. The team is building a high-performance inference engine for Apple Silicon and the infrastructure around it: profiling and benchmarks, runtime optimizations, and so on. The company has closed a $10M seed round from top VCs and angels.
The team is small and Russian-speaking, so you can strongly influence the product. The founders are the creators of Reface (200M+ users, backed by a16z) and Prisma (100M+).
About Us
Mirai builds the fastest on-device inference engine for Apple Silicon.
In under a year, a 14-person team built a full stack, from model optimization to a proprietary runtime, outperforming MLX and llama.cpp on supported models.
We’re making local inference practical, fast, and reliable for real products.
Why us?
Mirai is founded by proven entrepreneurs who built and scaled consumer AI leaders like Reface (200M+ users, backed by Andreessen Horowitz) and Prisma (100M+ users).
Our team is small (14 people), senior, and deeply technical.
We ship fast and own problems end-to-end.
We’re advised by a former Apple Distinguished Engineer who worked on MLX, and backed by leading AI-focused funds and individuals.
Responsibilities
- You'll work across our inference engine and model conversion toolkit, implementing new model architectures, supporting new modalities, writing optimized kernels, and building a wide range of features such as function calling and batch decoding.
- This role is ideal for someone who reads papers for fun, enjoys writing high-performance code, and gets excited about constant learning.
Requirements
- JAX / Equinox / Pallas stack.
- Rust systems programming with a focus on developer experience.
- Writing Metal / Vulkan kernels.
- Neural codecs and voice model architectures.
- Trellis-based quantization approaches.
- Advanced speculative decoding methods, such as EAGLE.
- Deep understanding of Transformer / SSM / Diffusion / Vision-language models.
- Benchmarking inference performance and model quality.
- Strong linear algebra, optimization methods, and probability theory.