AI with HPC Cluster Engineer

Company: Tata Consultancy Services
Location: Santa Clara
Posted on: February 12, 2024

Job Description:

Role: AI with HPC Cluster Engineer

Work Location: Santa Clara, CA or Austin, TX

Technical Skills:

Proficiency in RoCEv2, K8s, KVM, Ubuntu, Python, Shell, Go, Rust, GPU drivers, and cluster interconnects with 200G/400G networking. Experience managing GPU clusters and optimizing GPU-based services, tools, and software.

Roles & Responsibilities:

Develop, implement, and maintain GPU-based clusters of 10 to 1000 nodes, ensuring optimal performance and availability.
Administer ML/AI platforms (distributed ML services, LLMs, vector databases, and AI inferencing) by managing deployments, resource allocation, monitoring, and security.
Collaborate with cross-functional teams to address AI infrastructure requirements, support AI-related projects, and provide technical expertise.
Monitor and evaluate the performance of AI systems and clusters, ensuring that they adhere to industry best practices and meet company standards.
Compile reports, document procedures, and publish recommendations for improving AI infrastructure and solutions.
Use AI/ML to continuously improve the internal processes and tools used in the end-to-end delivery of the team's services.
