A Hierarchal Framework Offering Insights via Single view of HPC Systems Under NSM
Introduction:
Under the NSM (National Supercomputing Mission), national academic and R&D institutions will receive more than 70 HPC (High-Performance Computing) facilities.
"High Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business." From both a governance and an administration point of view, it will be extremely important to know how this large number of systems behaves, individually and collectively. Understanding that behavior will require collecting data from (possibly) every system component at a very fine granularity in time, that is, at short collection intervals.
The proposed framework will be instrumental in pursuing this goal. It will provide insight not only into the system workflow but also into job behavior on the systems. The outcome of this framework, i.e., a large amount of normalized data, will be crucial for decision-making when defining policies and strategy for future HPC systems in the country.
Purpose of the document:
The purpose of this document is to provide insight into our proposed framework so that readers can offer comments and suggestions. It explains the architecture, the categories of systems and sub-systems, and the request payload and response. Readers' comments and suggestions will give us input on the existing features and point out anything we have missed. Comments and suggestions are welcome on the mailing list npsfhelp@cdac.in, using the subject line given below. We request that readers not deviate from the provided subject line in their email.
Subject Line: A hierarchal framework offering insights via single view of HPC systems under NSM
How This Document Is Organized
- Terminologies being used - Describes the terminology used in HPC systems.
- Objective - Provides the overall objective of this document.
- Architecture - Explains the proposed framework's 3-tier architecture in detail.
- Present Framework at NPSF - Describes the current data collection system for PARAM Yuva II at NPSF.
- Deliverables - Details the tasks to be accomplished over the project life cycle.
- Development Model - Details the requests generated and the responses, with the required payload, returned by the adaptors.
- Annexure A - Lists all the categories and sub-categories of metrics that we are going to collect from the system.