💼 Full-Time Position

AI Inference Systems Engineer at NVIDIA

🏢
NVIDIA Group
📍 Toronto, ON, Canada
📍
Location
Toronto, Canada
📅
Posted
May 30, 2026
Type
Full-Time
🎯 Full-Time Opportunity: This is a permanent, full-time position with a competitive package and real career growth potential.

Job Description

NVIDIA is looking for a Senior Software Engineer focused on building high-performance AI inference systems. Leverage your expertise in GPU optimization and distributed systems in a dynamic, innovative environment.

This senior role calls for a software engineer with expertise in AI inference and systems design. You will contribute to the vLLM framework, optimize GPU kernels, and architect large-scale deployments across multi-cloud environments. Your work will drive industry benchmarks and involve collaboration with diverse teams across accelerated computing.

Key Responsibilities:
• Develop features for vLLM leveraging NVIDIA GPU hardware
• Optimize and benchmark GPU kernels using advanced techniques
• Define methodologies for inference benchmarking tools
• Architect scheduling for large-scale containerized inference deployments
• Conduct original research to enhance ML systems capabilities

Requirements:
• ...