Site Reliability Engineer (SRE)
About the Role:
We are looking for a Site Reliability Engineer (SRE) with solid experience running production systems and working closely with development teams. The ideal candidate is comfortable with Linux, containers, Kubernetes, and CI/CD pipelines, and has a strong focus on reliability, monitoring, and incident handling. You will help keep our services stable, observable, and scalable while collaborating with engineers across the stack.
Responsibilities:
- Operate and maintain production systems with a focus on reliability, availability, and performance.
- Work with Docker and Kubernetes to deploy, update, and troubleshoot services.
- Configure and optimize Kubernetes resources (pods, deployments, services, ingress, config maps, secrets, etc.).
- Implement and maintain monitoring, logging, and alerting for applications and infrastructure.
- Build and improve CI/CD pipelines in collaboration with development and DevOps teams.
- Create and maintain dashboards for key service metrics (latency, error rate, throughput, resource usage).
- Participate in incident response: investigate issues, identify root cause, and propose fixes and improvements.
- Work closely with backend developers to improve service reliability, resilience, and observability.
- Contribute to capacity planning and performance tuning of services and infrastructure.
- Automate repetitive operational tasks using scripts or small tools.
- Document runbooks, procedures, and best practices for operating services in production.
Must-Have Qualifications:
- 3–5 years of professional experience in an SRE, DevOps, or infrastructure-focused engineering role.
- Strong understanding of Linux systems (shell, processes, networking, permissions, logs).
- Hands-on experience with Docker and Kubernetes in real environments.
- Practical experience with:
  o Kubernetes deployments, services, ingress, config maps, and secrets
  o Basic troubleshooting inside a cluster (pods failing, crashes, restarts, resource issues)
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK/EFK, Application Insights, or similar).
- Experience with CI/CD pipelines (Azure DevOps, GitHub Actions, GitLab CI, Jenkins, or similar).
- Ability to read and modify pipeline definitions and understand build, test, and deploy flows.
- Basic programming/scripting skills in at least one language (e.g., Python, Bash, PowerShell, or Go).
- Understanding of core reliability concepts such as SLIs, SLOs, uptime, latency, and availability.
- Experience troubleshooting production issues using logs, metrics, and dashboards.
- Good communication skills and ability to collaborate with developers, QA, and product teams.
Nice-to-Have:
- Experience with at least one major cloud platform (Azure, AWS, Alibaba Cloud, or GCP).
- Experience with infrastructure as code (Terraform, Bicep, Pulumi, Helm, etc.).
- Experience with ingress controllers, API gateways, or service mesh.
- Familiarity with security best practices (secrets management, TLS/certificates, RBAC on Kubernetes or cloud).
- Experience participating in on-call rotations and using incident management tools (PagerDuty, Opsgenie, etc.).
- Experience contributing to post-incident reviews and implementing follow-up improvements.
Experience:
3–5 years