Senior SRE Engineer (MLOps) - AI
About the Role
Elevate your career as a Senior SRE Engineer (MLOps) in Saudi Arabia, focusing on the operational excellence of AI and ML systems. This role emphasizes reliability, observability, and governance, ensuring that AI features operate securely and efficiently at scale.
Full Job Posting
Overview
Description
Salla is looking for a Senior SRE Engineer (MLOps) to join our Salla AI team.
This role focuses on running our AI and ML systems as real production systems, not side experiments — owning the operational layer around models, prompts, agents, inference services, and retrieval systems.
You will be responsible for enabling Agentic AI and Generative AI features to operate reliably, securely, and cost-effectively at scale within the Salla ecosystem.
This role is SRE- and platform-engineering-first, with a strong emphasis on reliability, observability, safe releases, cost, and governance, while collaborating closely with engineering, data, and AI teams to give every pod a fast, safe path to production.
It exists because AI systems fail differently from normal services — a prompt change can behave like a code change, an agent calling tools needs auditability, and latency, quality, and cost can move together in uncomfortable ways.
Key Responsibilities
Own reliability for ML and agentic AI services in production: SLOs, dashboards, alerts, runbooks, and incident follow-ups
Build observability across the AI stack: latency, errors, traces, tool calls, cost, and user impact
Design safe-release patterns for models, prompts, agents, tools, and configuration, including canary, rollback, feature-flag, and evaluation-gate strategies
Provide operational support for inference APIs, queues, retrieval layers, and AI workflows running on Kubernetes/EKS
Establish ownership, traceability, and guardrails around what agentic systems are allowed to do, including how they call internal tools
Defend agent tool-calling against prompt injection and untrusted-data risks: establish and enforce data-trust boundaries so that untrusted store/merchant content cannot manipulate agent decisions, tool calls, or actions
Drive AI cost governance: per-model and per-pod spend visibility, token-cost tracking, and anomaly alerting
Build automation and self-service paths so product teams have a known safe path to production instead of rebuilding it each time
Turn recurring operational pain into simple, reusable platform standards that other teams adopt
Participate in architecture discussions, code reviews, and technical decision-making
Requirements
4+ years in SRE, platform engineering, DevOps, or production infrastructure, operating distributed systems in production, not only in demos
Hands-on experience with Kubernetes and cloud-native systems in production
Familiarity with deploying ML projects
Strong command of CI/CD, GitOps, observability, and incident response
Solid experience with infrastructure-as-code, secrets management, and networking
Ability to write automation or platform tooling in Python or a similar language
Production judgment: knowing how to make systems measurable, debuggable, repeatable, and safe to change (you do not need to be a machine learning researcher)
Ability to work across teams, explain trade-offs clearly, and turn operational pain into standards engineers will actually use
Nice to have:
Experience with MLOps or ML platforms: model serving, registries, evaluation, feature/data dependencies, drift monitoring, or ML pipelines
Familiarity with LLM applications or agentic systems: RAG, vector databases, tool calling, workflow orchestration, memory, traces, guardrails, or evaluation pipelines
Exposure to tooling such as OpenTelemetry, Prometheus, Grafana, MLflow, KServe, Ray, LiteLLM, vLLM, LangGraph, Arize Phoenix, or LangSmith
Experience with Kafka consumers, GPU workloads, inference optimization, model routing, or AI cost governance
Experience working in cross-functional product teams involving AI, backend, and frontend engineers

