💼 Full-Time Position

Senior LLM Deployment & Inference Optimization Engineer

🏢

Confidential

📍 Singapore, Singapore

📍

Location

Singapore, Singapore

📅

Posted

June 19, 2026

⏰

Type

Full-Time

🎯

Full-Time Opportunity: This is a permanent, full-time position with a competitive package and real career growth potential.

Job Description

We are looking for an experienced Senior LLM Deployment & Inference Optimization Engineer to build and operate self-hosted inference infrastructure for LLMs, multimodal models, ASR, and TTS systems in the cloud. Your mission is to deliver a stable, low-latency, and cost-efficient inference platform that powers real-time conversations and voice interactions in AI-driven English learning classrooms. This is a senior, cross-functional engineering role focused on deploying, optimizing, and operating open-source inference engines and GPU infrastructure at scale, rather than developing inference kernels from scratch.

Responsibilities
Design, deploy, and operate self-hosted cloud inference services for LLMs, multimodal models, ASR, and TTS systems, building highly available and elastically scalable inference infrastructure.
Optimize and productionize open-source inference framewor...
                    

Job Details

Job Type Full-Time

Location Singapore, Singapore

Country Singapore

Posted June 19, 2026

Deadline July 29, 2026

Experience As specified