Lead Site Reliability Engineer (SRE)

Pearl Autonomous Logistics · Doha, QAT · 1 month ago · Full-time
CI/CD · Git · Go · Kubernetes · Scala · Terraform
Via Indeed

About This Role

Role summary

Lead reliability, scalability, and operability of distributed systems by defining SRE strategy, building platform capabilities, and driving culture and processes that reduce toil and improve uptime.

Key responsibilities

  • Lead design, implementation, and operation of highly available, scalable production systems across cloud and on-prem environments.
  • Define and own SLOs/SLIs, error budgets, monitoring, and alerting strategies; drive SLI/SLO adoption across teams.
  • Lead incident response, post-incident reviews, root-cause analysis, and remediation; implement preventative measures.
  • Build and maintain observability stacks (metrics, logs, tracing) and dashboards (Prometheus, Grafana, ELK/EFK, OpenTelemetry).
  • Architect and operate CI/CD and deployment platforms (Argo CD, Spinnaker, GitHub Actions, GitLab CI) enabling safe, automated rollouts (canary, blue/green, feature flags).
  • Design, implement, and maintain self-service platform tooling for developers (Kubernetes/EKS/GKE/AKS, service meshes, operators).
  • Drive Infrastructure as Code practices (Terraform, Pulumi, CloudFormation) and manage infrastructure lifecycle, drift detection, and compliance.
  • Automate operational runbooks, remediation, capacity planning, and routine maintenance to minimize manual toil.
  • Own reliability-related security practices: secrets management, IAM, network policies, vulnerability scanning, and secure configurations.
  • Mentor and grow SRE and platform engineers; lead hiring, performance reviews, and career development.
  • Partner with engineering, product, and security teams to influence design decisions for fault tolerance and operability.
  • Manage on-call rotations and escalation policies, and ensure adequate coverage; coordinate across teams during major incidents.
  • Drive cost optimization, observability of cloud spend, and capacity forecasting.

Required qualifications

  • 7+ years in site reliability, platform, or DevOps engineering roles with progressive leadership responsibility.
  • Proven experience operating production distributed systems at scale on at least one major cloud provider (AWS, GCP, or Azure).
  • Deep expertise with Kubernetes and container ecosystems; experience running large clusters and multi-cluster environments.
  • Strong IaC experience (Terraform required; CloudFormation/Pulumi a plus).
  • Extensive experience with observability tooling (Prometheus, Grafana, ELK/EFK, OpenTelemetry) and incident management platforms (PagerDuty, Opsgenie).
  • Solid software engineering skills (Python, Go, or similar) for automation, tooling, and reliability engineering.
  • Demonstrated experience setting and enforcing SLOs/SLIs and reducing MTTR through engineering practices.
  • Experience with CI/CD systems and deployment strategies (Argo CD, Spinnaker, Flux, GitOps).
  • Strong systems, networking, and security fundamentals.
  • Excellent leadership, communication, and stakeholder management skills; proven ability to influence across orgs.
  • Experience mentoring engineers and leading cross-functional initiatives.

Job Types: Full-time, Permanent

Pay: QAR23.71 - QAR86.45 per hour

Expected hours: 40 per week

Work Location: In person
