💼 Full-Time Position

Research Intern – Multimodal Foundation Model for Vision

🏢 Sony UK Technology Centre
📍 Location: Manhattan, New York, United States
📅 Posted: June 10, 2026
Type: Full-Time
🎯 Full-Time Opportunity: This is a permanent, full-time position with a competitive package and real career growth potential.

Job Description

Research Intern – Multimodal Foundation Model for Vision

Sony AI is seeking research interns to join us. Our team conducts fundamental and applied research, with a focus on building next-generation foundation models for vision in a responsible manner. The role of a research intern is to develop efficient and effective methodologies and prototype solutions. You will work with a productive team of world-class scientists and engineers to tackle the most challenging problems in foundation models and generative AI, including low-cost yet powerful vision foundation models (VFM), vision-language models (VLM), unified models, automatic model compression, optimization, and deployment on cloud and edge. You will see your ideas not only published in papers but also improving the experience of billions of customers.

Roles and Responsibilities

  • Conduct fundamental and innovative development in low-cost yet powerful vision-language models (VLM), unified models, automatic model compression, optimiz...