
Site Reliability Engineer (SRE) - Banking Industry

Fecundity Technologies
Abu Dhabi, UAE
Senior
AED 17,000/month
4 days ago
Tags: engineering, design, project management, maintenance, quality control, technical

Full Job Posting

Overview

**Role Overview:** We are seeking a Site Reliability Engineer to own the "Production Readiness" of our cloud-based AI solutions.

This hybrid role combines automated software testing and Site Reliability Engineering (SRE).

You will build the automated frameworks that validate our AI outputs and ensure the underlying Azure/AWS infrastructure is resilient, performant, and compliant with banking standards.

Key Responsibilities

  • **Resiliency Engineering (SRE):** Implement "Chaos Engineering" and load testing to ensure web/mobile backends can handle banking-scale traffic. Maintain high availability through automated recovery scripts.
  • **Automated Regression:** Build CI/CD-integrated test suites using **Python** that validate both the application logic and the infrastructure state (IaC validation).
  • **Observability & SLIs:** Define and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs). Set up advanced alerting in Azure Monitor or AWS CloudWatch to catch performance degradation before users do.
  • **Security & Compliance Testing:** Automate security scans and compliance checks to ensure all AI data handling meets strict banking data residency and privacy protocols.

Technical & Professional Requirements

  • **Automation Stack:** High proficiency in **Python** (for AI testing) and test automation frameworks (PyTest, Selenium, or Robot Framework).
  • **Cloud Infrastructure:** Strong hands-on experience with **Azure** or **AWS**, specifically regarding networking, scaling, and serverless reliability.
  • **AI/ML Understanding:** Understanding of prompt engineering and how to evaluate AI model outputs (RAG evaluation, ROUGE/BLEU scores, or custom LLM benchmarks).
  • **Monitoring Tools:** Experience with Grafana, Prometheus, or native cloud monitoring tools to build real-time reliability dashboards.
  • **FinOps Awareness:** Ability to identify "expensive" failing tests or inefficient cloud resource usage during the testing phase.

Recommended Skillset & Tools

  • **Languages:** Python (Mandatory), Bash scripting.
  • **Tools:** GitHub Actions (CI/CD), Terraform (reading/validating), K6 or JMeter (Performance).
  • **AI Frameworks:** DeepEval, Ragas, or LangSmith (for automated AI evaluation).

Pay: AED 17,000.00 - AED 21,000.00 per month

Application Question(s)

  • Do you have hands-on experience with cloud platforms?
  • Have you built or maintained automated test frameworks (PyTest, Selenium, Robot Framework, etc.)?
  • Have you implemented Chaos Engineering or resilience testing?
  • What tools have you used for performance/load testing?
  • Describe how you ensure high availability and automated recovery in production systems.
  • Which evaluation methods have you used for AI outputs?

Experience

  • Site Reliability Engineering or Production Support: 4 years (Preferred)
  • Python: 4 years (Preferred)
