Project Engineer - 4-8 Years' Experience

Name of the Post Project Engineer (4-8 years' experience)
Specialization/ Domain Application Support
No. of Requirement 8
Location Pune
Qualification First Class B.E./B.Tech. in Computer/IT/Electronics/Electronics & Telecommunication/Communication/Electrical/Electrical & Electronics
OR
First Class MCA
OR
M.E./M.Tech. in Computer/IT/Electronics/Electronics & Telecommunication/Communication/Electrical/Electrical & Electronics
OR
First Class M.Sc. in Computer/IT
Post Qualification relevant Experience. For BE/B. Tech/MCA - 4 years post qualification relevant experience

For ME/M. Tech - 1 year post qualification relevant experience

For M. Sc - 5 years post qualification relevant experience
Age 37 years as on last date of application
Skill Sets
  • Hands-on experience in operating large-scale compute infrastructure
  • Working knowledge of cluster configuration management tools
  • Experience with HPC cluster job schedulers such as SLURM, LSF
  • Understanding of container technologies like Docker, Singularity, Shifter etc.
  • Proficiency in Bash scripting; working experience in Python programming is desirable
  • Proficient in Linux Operating System
  • Strong understanding of Linux administration
  • Working knowledge of workflows that use MPI
  • Experience with InfiniBand based networking
  • Understanding of fast, distributed PFS based storage systems like Lustre and Spectrum Scale for HPC workloads.
  • Understanding of HDFS, Spark and Kubernetes
  • Understanding of HPC cluster and system networking
  • Understanding and working knowledge of NFS, DHCP, DNS, SSH/SCP, boot over network, Ganglia, Nagios
  • Understanding of GPU accelerators
  • Understanding and working knowledge of system and network security
  • Troubleshooting and problem-solving skills
  • Hands on experience with Linux operating system
  • Experience with parallel programming models - MPI, OpenMP, pthreads
  • Experience with GPGPU computing - OpenACC & CUDA programming
  • Experience with AI frameworks: TensorFlow, PyTorch
  • Experience with HPC & AI applications compilation, installation, configuration, tuning & optimization on Linux based clusters
  • Knowledge of programming languages: C/C++, Python
  • Knowledge of containers will be an added advantage
  • Understanding of code review, compilers, debugging tools including Intel Parallel Studio, GCC, GDB, TotalView
  • Excellent communication skills (verbal and written)
Job Profile
  • Monitoring, management and optimization of the facility including hardware and software
  • Enabling and management of application workflows using Docker containers
  • Development of plugins for integration with RT and monitoring tools
  • Automation of system administration tasks
  • Provide user support for technical issues, data management, etc.
  • System administration of dense GPU HPC-AI system, storage, network and associated infrastructure
  • Operational/scheduled maintenance of servers and systems
  • Troubleshooting of hardware-related issues
  • Troubleshooting of installed software, patch updates, customer application installation
  • Regular node health checks, including analysis of performance and temperature monitoring
  • InfiniBand and Ethernet troubleshooting, including cables, controllers, drivers, IP address clashes, reassignment etc.
  • Storage maintenance and backup policies
  • Documentation of the GPU-HPC environment as well as documenting system administration policies and procedures (Weekly Report Generation).
  • Asset management
  • Vendor co-ordination
  • Manage, deploy and support HPC and AI application/frameworks on GPU based distributed computing clusters.
  • Work with users to customize applications and configure software development, integration and production environments to specification
  • Tune applications to optimize performance and reliability of services across the High-Performance Computing (HPC) ecosystem
  • Diagnose application problems quickly and effectively
  • Automate administration procedures for routine and complex tasks
  • Provide backup HPC system administration support
  • Troubleshooting application execution on clusters managed by SLURM and K8s (Kubernetes)
  • Develop and maintain programs and scripts to aid in the operation and automation of administrative tasks and workflows using Bash and Python
CTC per Annum *As per the industry standards based on qualification, experience, expertise, role etc.

*C-DAC reserves the right to determine the pay offered to selected candidates, based on the norms of C-DAC.

Human Resource Department
Centre for Development of Advanced Computing (C-DAC)
Innovation Park 34/1, Panchavati, Pashan
Pune - 411 008