Job Description
We are seeking a highly skilled Machine Learning Engineer to design and build a low-latency query understanding and intelligent routing system that operates without reliance on large language models. The role focuses on extracting intent, entities, application context, routing decisions, and supporting evidence from user queries in real time.
This is a full lifecycle role spanning data modeling, ML development, optimization, local deployment, and MLOps. The ideal candidate will have strong experience in applied NLP, lightweight model architectures, and production-grade ML systems, with a focus on sub-second inference, CPU-based execution, and scalable domain evolution.
Responsibilities
- Design and implement a query understanding pipeline to extract intent, routing decisions, entities, application mapping, and historical evidence from user queries and conversations.
- Defi...