
ML Research Platform Engineer (Distributed Training & HPC)

Company: QNT Partners
Location: Singapore, Singapore
Posted: June 09, 2026
Type: Full-Time


Job Description

Location: Singapore, Hong Kong or Shanghai


About the role

We are looking for a platform engineer to build the infrastructure that powers our next-generation machine learning research. Think: large-scale experimentation, distributed training, and reproducibility.


This is not an applied ML role. You will not be fine-tuning LLMs or building agents. Instead, you will build the systems that enable researchers to train models at scale.


What you will own

  • Distributed training pipelines for GPU-accelerated workloads (PyTorch, JAX)
  • Experiment management and model versioning
  • Resource scheduling on on-premise HPC clusters and cloud (Slurm, Kubernetes)
  • Observability and debugging for complex training jobs
  • Data lineage