HPC Engineer

Tata Consultancy Services Limited
Delhi, 5-7 LPA, Posted 21 Mar 2025
FULL TIME
Cluster Analysis
HPC
Linux

Job Description

Responsibilities:

  • Run
  • Workload scheduler management
  • HPC tools and middleware management
  • HPC application troubleshooting from an infrastructure perspective
  • System and cluster monitoring
  • Management, monitoring and maintenance of the InfiniBand interconnect, cluster services and cluster hardware
  • Root cause analysis of HPC cluster build
  • System deployment using cluster management tools
  • OS repository management, along with a compatibility matrix for various device drivers
  • Configuration and maintenance of network services
  • Installation and configuration of HPC workload managers
  • HPC application integration with the job scheduler

Skills / Expertise

  • Operating systems: Linux (RHEL, Rocky, CentOS, SUSE), Windows
  • Schedulers & Resource Managers: PBS Pro, LSF, Slurm, Open Grid Scheduler (OGS)
  • Provisioning: HP-CMU, xCAT, Bright Cluster Manager
  • Monitoring: Ganglia, Nagios, Zabbix, Grafana
  • Configuration Management: Chef, Puppet, Ansible, CFEngine
  • HPC Applications: OpenFOAM, STAR-CCM+, Abaqus, Ansys, LS-DYNA and other CAE & CFD applications
  • Linux operating system fundamentals, architecture, administration, native service configuration and advanced debugging skills
  • Knowledge of x86 hardware, system software and system services
  • Experience in HPC cluster configuration, management, upgrade and migration
  • Knowledge of managing parallel file systems such as Lustre, BeeGFS and GPFS
  • Scripting and automation: Bash, Perl, Python
  • Knowledge of ITSM processes