💼 Full-Time Position

RLHF Specialist

🏢

Odixcity Consulting

📍 Spain

📍

Location

Spain

📅

Posted

June 03, 2026

⏰

Type

Full-Time

🎯

Full-Time Opportunity: This is a permanent, full-time position with a competitive package and real career growth potential.

Job Description

Job Title: RLHF Specialist

Location: Remote (Worldwide)

Job Summary

An RLHF Specialist is responsible for improving and aligning AI models using Reinforcement Learning from Human Feedback (RLHF) methodologies. This role focuses on designing, implementing, and optimizing feedback pipelines that enhance model performance, safety, factual accuracy, and alignment with human values.

Responsibilities

Generate high-quality preference data by comparing multiple model responses and ranking them based on criteria such as helpfulness, honesty, and harmlessness (HHH).

Design complex, multi-turn prompts to stress-test model behavior and expose weaknesses in reasoning or safety.

Write detailed “chain-of-thought” explanations and rationales to train reward models on why specific responses are superior.

Collaborate with Machine Learning Engineers to analyze model failure modes and identify dat...

Job Details

Job Type Full-Time

Location Spain

Country Spain

Posted June 03, 2026

Deadline July 13, 2026

Experience As specified