Site Reliability Engineer
About the Role
Job Title: Site Reliability Engineer (SRE)
Company: Throne Solutions
Location: Riyadh, Saudi Arabia
Employment Type: Full-Time
Experience Required: 5–8 Years
About Throne Solutions
Throne Solutions is seeking an experienced and motivated Site Reliability Engineer (SRE) to join our growing technology team in Riyadh.
The ideal candidate will be responsible for ensuring the availability, scalability, performance, and reliability of enterprise production environments through automation, cloud-native technologies, proactive monitoring, and operational excellence.
This role requires strong expertise in AWS cloud infrastructure, Kubernetes, CI/CD, and incident management within large-scale enterprise or Cisco environments.
Role Summary
As a Site Reliability Engineer, you will bridge software engineering and IT operations by designing resilient cloud infrastructure, automating operational processes, improving system reliability, and minimizing downtime.
You will collaborate closely with development, infrastructure, and security teams to maintain highly available production systems while driving continuous improvement through automation and observability.
Key Responsibilities
- Design, build, and maintain highly available, scalable, and secure AWS cloud infrastructure.
- Provision and manage cloud resources using Infrastructure as Code (IaC) tools such as Terraform and AWS CloudFormation.
- Deploy, administer, and optimize Kubernetes clusters and Docker-based containerized applications.
- Develop automation scripts using Python, Bash, or Go to eliminate manual operational tasks and improve efficiency.
- Design and maintain CI/CD pipelines using Jenkins, GitLab CI/CD, or similar DevOps platforms.
- Implement and manage monitoring, logging, and observability solutions using Prometheus, Grafana, Splunk, Datadog, CloudWatch, or equivalent tools.
- Monitor application health, infrastructure performance, and service availability to proactively detect and resolve issues.
- Lead incident response activities, perform Root Cause Analysis (RCA), and implement preventive measures to minimize recurring incidents.
- Manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets to ensure service reliability.
- Participate in 24×7 on-call rotations and provide production support for critical systems.
- Optimize system performance, scalability, reliability, and Mean Time to Recovery (MTTR) through automation and continuous improvement initiatives.
- Develop and maintain operational runbooks, disaster recovery procedures, and technical documentation.
- Collaborate with DevOps, Development, Security, Infrastructure, and Network teams to support production deployments and operational readiness.
- Implement security best practices across cloud infrastructure, Kubernetes environments, and CI/CD pipelines.
- Ensure compliance with ITIL Incident, Problem, Change, and Release Management processes.
- Support enterprise production environments, including Cisco-based infrastructure where applicable.
Required Qualifications
- Bachelor's degree in Computer Science, Information Technology, Software Engineering, Computer Engineering, or a related discipline.
- 5–8 years of professional experience in Site Reliability Engineering (SRE), DevOps, Cloud Engineering, or Production Support.
- Proven experience supporting mission-critical enterprise production environments.
Amazon Web Services (AWS)
- EC2
- VPC
- IAM
- RDS
- S3
- ELB
- Auto Scaling
- Route 53
- CloudWatch
Infrastructure as Code (IaC)
- Terraform
- AWS CloudFormation
Containerization & Orchestration
- Kubernetes
- Docker
Programming & Automation
- Python
- Bash
- Go (Preferred)
CI/CD & DevOps
- Jenkins
- GitLab CI/CD
- Git
- GitHub
Monitoring & Observability
- Prometheus
- Grafana
- Splunk
- Datadog
- AWS CloudWatch
Incident & ITSM Tools
- ServiceNow
- Jira
Networking Fundamentals
- TCP/IP
- DNS
- HTTP/HTTPS
- Load Balancing
- VPN
- Firewalls
Preferred Skills
- Experience working in Cisco enterprise environments.
- Knowledge of cloud security tools and security best practices.
- Experience with Infrastructure Monitoring and Application Performance Monitoring (APM).
- Familiarity with container security, Kubernetes security, and DevSecOps practices.
- Experience with Helm, ArgoCD, or GitOps methodologies.
- Understanding of microservices architecture and distributed systems.
- Exposure to multi-cloud or hybrid cloud environments is an advantage.
Preferred Certifications
- AWS Certified Solutions Architect – Associate or Professional
- AWS Certified DevOps Engineer – Professional
Key Performance Outcomes
- Maintain high availability and reliability of production systems.
- Improve service uptime and overall platform resilience.
- Reduce Mean Time to Detect (MTTD) and Mean Time to Recovery (MTTR).
- Increase operational efficiency through automation.
- Enhance monitoring, observability, and incident response capabilities.
- Deliver scalable, secure, and cost-optimized cloud infrastructure.
- Ensure compliance with SLAs, SLOs, and operational best practices.
Required Competencies
- Strong analytical and problem-solving abilities.
- Excellent troubleshooting skills in complex production environments.
- Strong communication and stakeholder management skills.
- Ability to perform effectively under pressure during critical incidents.
- Automation-first mindset with a passion for operational excellence.
- Excellent documentation and knowledge-sharing skills.
- Strong collaboration across development, infrastructure, networking, and security teams.
- Self-motivated, proactive, and committed to continuous learning.
Why Join Throne Solutions?
- Opportunity to work on enterprise-scale cloud and infrastructure projects in Saudi Arabia.
- Exposure to cutting-edge AWS, Kubernetes, DevOps, and SRE technologies.
- Collaborative, innovation-driven, and high-performance work culture.
- Competitive compensation and professional development opportunities.
- Access to certification support, technical training, and career advancement.
- Work with modern cloud-native architectures, automation platforms, and enterprise production systems.