Site Reliability Engineer

Throne Solutions

Riyadh, KSA

Full-time

Entry

4 days ago

engineering, design, project management, maintenance, quality control, technical

About Throne Solutions

Throne Solutions is seeking an experienced and motivated Site Reliability Engineer (SRE) to join our growing technology team in Riyadh.

The ideal candidate will be responsible for ensuring the availability, scalability, performance, and reliability of enterprise production environments through automation, cloud-native technologies, proactive monitoring, and operational excellence.

This role requires strong expertise in AWS cloud infrastructure, Kubernetes, CI/CD, and incident management within large-scale enterprise or Cisco environments.

Role Summary

As a Site Reliability Engineer, you will bridge software engineering and IT operations by designing resilient cloud infrastructure, automating operational processes, improving system reliability, and minimizing downtime.

You will collaborate closely with development, infrastructure, and security teams to maintain highly available production systems while driving continuous improvement through automation and observability.

Key Responsibilities

Design, build, and maintain highly available, scalable, and secure AWS cloud infrastructure.
Provision and manage cloud resources using Infrastructure as Code (IaC) tools such as Terraform and AWS CloudFormation.
Deploy, administer, and optimize Kubernetes clusters and Docker-based containerized applications.
Develop automation scripts using Python, Bash, or Go to eliminate manual operational tasks and improve efficiency.
Design and maintain CI/CD pipelines using Jenkins, GitLab CI/CD, or similar DevOps platforms.
Implement and manage monitoring, logging, and observability solutions using Prometheus, Grafana, Splunk, Datadog, CloudWatch, or equivalent tools.
Monitor application health, infrastructure performance, and service availability to proactively detect and resolve issues.
Lead incident response activities, perform Root Cause Analysis (RCA), and implement preventive measures to minimize recurring incidents.
Manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets to ensure service reliability.
Participate in 24×7 on-call rotations and provide production support for critical systems.
Optimize system performance, scalability, reliability, and Mean Time to Recovery (MTTR) through automation and continuous improvement initiatives.
Develop and maintain operational runbooks, disaster recovery procedures, and technical documentation.
Collaborate with DevOps, Development, Security, Infrastructure, and Network teams to support production deployments and operational readiness.
Implement security best practices across cloud infrastructure, Kubernetes environments, and CI/CD pipelines.
Ensure compliance with ITIL Incident, Problem, Change, and Release Management processes.
Support enterprise production environments, including Cisco-based infrastructure where applicable.

Required Qualifications

Bachelor's degree in Computer Science, Information Technology, Software Engineering, Computer Engineering, or a related discipline.
5–8 years of professional experience in Site Reliability Engineering (SRE), DevOps, Cloud Engineering, or Production Support.
Proven experience supporting mission-critical enterprise production environments.

Amazon Web Services (AWS)

EC2
VPC
IAM
RDS
S3
ELB
Auto Scaling
Route 53
CloudWatch

Infrastructure as Code (IaC)

Terraform
AWS CloudFormation

Containerization & Orchestration

Kubernetes
Docker

Programming & Automation

Python
Bash
Go (Preferred)

CI/CD & DevOps

Jenkins
GitLab CI/CD
Git
GitHub

Monitoring & Observability

Prometheus
Grafana
Splunk
Datadog
AWS CloudWatch

Incident & ITSM Tools

ServiceNow
Jira

Networking Fundamentals

TCP/IP
DNS
HTTP/HTTPS
Load Balancing
VPN
Firewalls

Preferred Skills

Experience working in Cisco enterprise environments.
Knowledge of cloud security tools and security best practices.
Experience with Infrastructure Monitoring and Application Performance Monitoring (APM).
Familiarity with container security, Kubernetes security, and DevSecOps practices.
Experience with Helm, ArgoCD, or GitOps methodologies.
Understanding of microservices architecture and distributed systems.
Exposure to multi-cloud or hybrid cloud environments is an advantage.

Preferred Certifications

AWS Certified Solutions Architect – Associate or Professional
AWS Certified DevOps Engineer – Professional

Key Performance Outcomes

Maintain high availability and reliability of production systems.
Improve service uptime and overall platform resilience.
Reduce Mean Time to Detect (MTTD) and Mean Time to Recovery (MTTR).
Increase operational efficiency through automation.
Enhance monitoring, observability, and incident response capabilities.
Deliver scalable, secure, and cost-optimized cloud infrastructure.
Ensure compliance with SLAs, SLOs, and operational best practices.

Required Competencies

Strong analytical and problem-solving abilities.
Excellent troubleshooting skills in complex production environments.
Strong communication and stakeholder management skills.
Ability to perform effectively under pressure during critical incidents.
Automation-first mindset with a passion for operational excellence.
Excellent documentation and knowledge-sharing skills.
Strong collaboration across development, infrastructure, networking, and security teams.
Self-motivated, proactive, and committed to continuous learning.

Why Join Throne Solutions?

Opportunity to work on enterprise-scale cloud and infrastructure projects in Saudi Arabia.
Exposure to cutting-edge AWS, Kubernetes, DevOps, and SRE technologies.
Collaborative, innovation-driven, and high-performance work culture.
Competitive compensation and professional development opportunities.
Access to certification support, technical training, and career advancement.
Work with modern cloud-native architectures, automation platforms, and enterprise production systems.
