Machine Learning Engineer

Selby Jennings
Shanghai, UAE
Full-time
Mid-Senior
5 days ago
Python, TensorFlow, PyTorch, Scikit-learn, Deep Learning, Natural Language Processing (NLP)

Full Job Posting

Responsibilities

  • Architecting and developing the next generation of the company's machine learning research platform, with an emphasis on scalability, reliability, observability, and reproducibility
  • Building infrastructure that enables large-scale experimentation, model training, and simulation across on-premises HPC and multi-cloud environments
  • Partnering closely with quantitative researchers to understand evolving research workflows and translate them into robust platform capabilities
  • Designing and optimizing distributed training pipelines for high-throughput, GPU-accelerated workloads
  • Improving experiment management, model versioning, artifact tracking, and data lineage to ensure transparent and reproducible research
  • Developing tools and frameworks that streamline feature engineering, dataset generation, and large-scale backtesting
  • Leading initiatives to improve compute efficiency, resource scheduling, and workload isolation across heterogeneous environments
  • Enhancing platform observability, including metrics, logging, tracing, and debugging capabilities tailored to ML workloads
  • Supporting rapid iteration by implementing features and fixes on tight timelines while maintaining high engineering standards
  • Contributing to long-term architectural decisions that enable the platform to scale with increasing data volumes and model complexity

Qualifications

  • 2+ years of experience designing and building large-scale distributed systems, ideally in support of research or data-intensive workloads
  • Strong programming experience in Python, with a focus on writing clean, maintainable, and high-performance code
  • Experience developing and operating applications on Linux-based HPC clusters and/or cloud platforms
  • Solid understanding of distributed computing concepts, parallel processing, and resource management
  • Experience with GPU-based workloads and familiarity with modern ML frameworks (e.g., PyTorch, TensorFlow, JAX)
  • Experience optimizing data pipelines and handling large-scale structured and unstructured datasets
  • Strong troubleshooting skills with the ability to debug complex, cross-layer system issues
  • Ability to work independently in a fast-paced, research-driven environment
  • Strong communication skills and experience collaborating directly with researchers or data scientists
