Quality Assurance Engineer
About This Role
Role Overview
We are looking for a Senior QA Engineer to own the testing and quality engineering surface for AI-powered government products.
You will design, build, and operate the test automation frameworks, evaluation infrastructure, and quality practices that let the AI Factory ship reliably at government scale.
Testing AI-powered systems is genuinely different from testing traditional software.
The same input can produce different outputs, "correct" is often fuzzy, and failure modes include hallucination, drift, and degradation that no assertion library catches.
You will bring real automation engineering depth and combine it with the judgment to design quality systems for non-deterministic behaviour — across LLM-powered products, retrieval pipelines, agent workflows, backend services, web and mobile interfaces, and the data pipelines that feed all of them.
At senior level, we expect more than reliable execution.
You will own the architecture of major testing and evaluation surfaces end-to-end, raise the engineering bar around you, surface and resolve quality risks before they reach users, and contribute meaningfully to the standards set across the organisation.
You operate independently, mentor others, and partner with staff engineers to shape how quality engineering evolves at the AI Factory.
Core Responsibilities
- Design, build, and operate test automation frameworks that span web, mobile, API, backend services, and data pipelines — owning the architecture of the testing surfaces you build.
- Design and operate evaluation infrastructure for AI-powered products: systematic testing of LLM outputs, regression detection on model behaviour, golden-set management, automated grading where appropriate, and the practices that catch quality regressions before they reach production.
- Partner with AI Product Engineers on evaluation design — connecting their model-level evaluation work to the systematic, organisation-wide testing infrastructure you own.
- Build automated data validation across training data, evaluation datasets, and production data flows: schema validation, distribution checks, drift detection, and the practices that keep data trustworthy across consumers.
- Design and run performance testing across AI models, APIs, and user-facing applications: load profiling, latency under realistic conditions, capacity testing, and the practices that surface scalability limits before users do.
- Design and generate synthetic test data, adversarial inputs, and edge-case fixtures that exercise systems beyond the happy path — including inputs designed to probe AI failure modes such as prompt injection, jailbreaks, and grounding failure.
- Own the integration of automated testing into CI/CD pipelines: test selection, parallelisation, flake management, and the discipline that keeps test suites trustworthy as they grow.
- Own quality observability: test result analytics, failure pattern analysis, regression tracking, and the dashboards that make system quality transparent across teams.
- Contribute to non-functional testing across security, accessibility, and reliability — partnering with the relevant specialists to ensure these are systematically tested rather than checked off.
- Lead incident triage and post-mortem analysis for quality escapes: identify what testing should have caught the issue, build the test that does, and ensure systemic improvement rather than a one-off patch.
- Contribute to quality engineering standards across the AI Factory: testing conventions, framework choices, evaluation practices, and the technical bar that applies to anything shipped to production.
- Mentor engineers across the organisation on testing strategy, automation design, and AI evaluation practice. Champion a culture where engineers own the quality of what they ship.
Basic Qualifications
- 7+ years of QA and test automation experience, with a track record of designing and owning test automation frameworks at scale — not just writing tests within frameworks built by others.
- Strong programming foundation in at least one of Python, JavaScript / TypeScript, or Java — at the depth required to design test frameworks and evaluation tooling, not just script test cases.
- Strong expertise in modern test automation tooling across the relevant surfaces: Playwright, Cypress, Selenium, or equivalent for web; Appium, Detox, or equivalent for mobile; with the judgment to choose the right tool for the problem.
- Strong expertise in API testing and contract testing: REST, schema validation, contract verification, and the practices that catch integration failures early.
- Experience designing test strategy at the architectural level: deciding what to test, at what layer, with what coverage — and defending those decisions with evidence rather than habit.
- Strong CI/CD integration experience with GitHub Actions, GitLab CI, Jenkins, or equivalent: test selection, parallelisation, environment management, and flake reduction practices.
- Performance testing experience with k6, JMeter, Locust, or equivalent at production scale, including realistic load modelling and result interpretation.
- Demonstrated ability to think systematically about quality — including for non-deterministic systems where traditional pass/fail assertions are insufficient.
- Strong written and verbal communication — you can author test strategy documents, drive quality decisions in design reviews, and explain quality trade-offs clearly to engineers, product managers, and non-technical stakeholders.
- Proven ability to operate autonomously, take ownership, and deliver high-quality work end-to-end.
Preferred Qualifications
- Experience designing evaluation frameworks for AI / LLM-powered products: output quality assessment, retrieval and grounding evaluation, regression detection on model behaviour, and the practices that close the loop between measurement and product improvement.
- Hands-on experience with LLM evaluation tooling: ragas, DeepEval, Promptfoo, Braintrust, LangSmith, or equivalent — and the judgment to use them appropriately rather than ceremonially.
- Experience with LLM-as-judge evaluation patterns, including their limitations, calibration issues, and the practices that make automated grading trustworthy.
- Experience testing RAG systems specifically: retrieval quality, grounding correctness, citation accuracy, and end-to-end pipeline behaviour.
- Experience testing agent-based AI systems: tool use validation, multi-step reasoning evaluation, and failure recovery testing.
- Experience with data pipeline testing using Great Expectations, Soda, or equivalent — and with ML pipeline testing in Airflow, Kubeflow, MLflow, or equivalent.
- Experience with security testing practices: OWASP coverage, fuzzing, and adversarial testing — including AI-specific concerns such as prompt injection and jailbreak resistance.
- Experience with accessibility testing automation: axe-core, Pa11y, or equivalent, and the practices required for WCAG conformance at scale.
- Experience with chaos engineering or resilience testing in production-like environments.
- Experience with Arabic-language testing or right-to-left interface validation — relevant for government services in the UAE.
- Experience in a regulated or government-adjacent environment: audit trails, compliance frameworks, and the engineering discipline required to operate quality systems for sensitive data.
Our Stack
We use the tools best suited to each problem. Current defaults reflect what works well for our use cases, not a mandated standard. Candidates should be productive across most of these areas and willing to operate across them.
- **Languages:** Python, TypeScript / JavaScript, Java
- **Web and mobile testing:** Playwright, Cypress, Appium, Detox
- **API testing:** Postman, Newman, Pact or equivalent contract testing, schema validation tooling
- **AI evaluation:** Custom evaluation harnesses, LLM-as-judge patterns, golden sets, ragas / DeepEval / Promptfoo or equivalent
- **Data validation:** Great Expectations, Soda, or equivalent
- **Performance testing:** k6, JMeter, Locust
- **CI/CD:** GitHub Actions, GitLab CI, Jenkins
- **Observability:** Test result analytics, regression dashboards, structured logging across test infrastructure
- **Infrastructure:** Docker, Kubernetes, cloud platforms (Azure)
Technical Depth Expectations
Candidates will be expected to demonstrate genuine depth in at least three of the following areas. Conceptual familiarity is not sufficient.
- **Test automation architecture:** framework design across web, mobile, API, and backend; test selection and parallelisation; flake management; and the structural decisions that keep test infrastructure trustworthy as it grows.
- **AI and LLM evaluation:** evaluation harness design, golden set management, regression detection on non-deterministic outputs, LLM-as-judge patterns and their limitations, and the practices that make AI quality measurable in production.
- **Performance and load testing:** realistic load modelling, capacity testing, latency analysis, and the depth to surface scalability limits in AI services and user-facing applications.
- **Data validation and pipeline testing:** schema validation, distribution checks, drift detection, and the practices that keep data trustworthy across training, evaluation, and production.
- **Security and adversarial testing:** OWASP coverage, fuzzing, prompt injection and jailbreak testing, and the practices appropriate for high-value government applications.
- **CI/CD integration:** test selection, parallelisation, environment management, and the discipline that keeps automated test suites fast and reliable as they scale.
- **Quality observability:** test analytics, regression tracking, failure pattern analysis, and the practices that turn quality data into systemic improvement.
- **Accessibility and inclusive testing:** WCAG conformance, assistive technology validation, and the practices that make government services genuinely usable.