Full-Time Opportunity: This is a permanent, full-time position with a competitive package and real career growth potential.
Job Description
HPC Infrastructure Reliability Engineer - Tata Consultancy Services
Responsibilities
- Manage and optimize high-performance physical infrastructure (servers, GPUs, and advanced networking).
- Ensure availability, performance, and reliability of HPC and AI environments.
- Oversee the full hardware lifecycle (capacity planning, deployment, and decommissioning).
- Work with tools such as HPE OneView, Lenovo XClarity, and ServiceNow CMDB.
- Collaborate with R&D, science, and engineering teams to design optimal infrastructure solutions.
- Optimize resource utilization (CPU/GPU) and improve overall infrastructure efficiency.
Qualifications
- 5–7+ years of experience in Data Center Engineering, Bare Metal, or HPC Infrastructure.
- Strong expertise in enterprise hardware (HPE, Lenovo) and high-performance systems.
- Hands‑on experience with GPUs (NVIDIA) and AI/HPC environments. ...