
High Performance Computing

 
 



National Supercomputing Mission

The Cabinet Committee on Economic Affairs (CCEA) approved the project titled “National Supercomputing Mission (NSM): Building Capacity and Capability” on April 9, 2015. It is implemented jointly by the Ministry of Electronics and Information Technology (MeitY) and the Department of Science and Technology (DST), with the Indian Institute of Science, Bangalore and C-DAC as the executing agencies.

The National Supercomputing Mission envisages harmonizing the efforts of stakeholders involved in HPC R&D through nationwide centralized coordination. C-DAC is entrusted with building systems indigenously. NSM is divided into four verticals: Facilities and Infrastructure, Applications, Human Resource Development, and Research and Development. C-DAC has prepared the (Build Approach) RFP for Phase-I and Phase-II. Phase-I covers building and deploying three systems: at IIT Kharagpur (1.3 PetaFlops), IIT BHU, Varanasi (650 TeraFlops) and IISER Pune (650 TeraFlops). In Phase-II, multiple HPC systems with a cumulative compute power of 10 PetaFlops are planned.


HPC Systems and Facilities

PARAM Yuva - II (C-DAC's National Supercomputing Facility)

C-DAC upgraded its PARAM Yuva system (54 TF/s) to the PARAM Yuva II system (529 TF/s) through the use of Many Integrated Core (MIC) accelerator technology. The upgraded system uses the same amount of electric power as its predecessor. C-DAC's PARAM Yuva II HPC system had executed 2,70,732 jobs as of the end of March 2018. Utilization of PARAM Yuva II has always remained above 95%. The PARAM series has been acknowledged in 247 publications and 34 PhDs. More than 60 HPC applications from various science and engineering domains were ported and optimized for PARAM Yuva II.


Establishment of Supercomputing facility at IIT Guwahati:

C-DAC set up a centralized supercomputing facility titled PARAM-Ishaan, with a peak computing power of 240 TeraFlops and 300 TB of storage, at IIT Guwahati under the NE funding scheme of MeitY. Presently, 400 users from IITG use this system extensively for their research. C-DAC conducted two workshops and trained around 150 faculty members and research scholars from IITG in the area of HPC.



PARAM SHAVAK

C-DAC indigenously developed PARAM Shavak, a compact and energy-efficient supercomputing solution for academic and research institutes. More than 50 PARAM Shavak systems have been deployed so far, and they are used by the students and faculty of these institutes to carry out research in HPC and Deep Learning technologies. C-DAC conducted workshops on PARAM Shavak, in which 500 users were trained.



PARAM Bio-Blaze

PARAM Bio Blaze, another supercomputing system, was launched at C-DAC on February 18, 2014 to enhance research capabilities in bioinformatics and to enable research projects with scientific, academic and industrial collaboration. It is a blade-based system with a peak compute performance of 10.65 TF. It has 32 compute nodes, each with 16 Intel Xeon processor cores running at 2.6 GHz. Compute nodes communicate with each other over a 56 Gbps high-speed FDR InfiniBand interconnect. 20 TB of scratch storage is mounted on the nodes over the same 56 Gbps link so that disk input/output is fast. Among other applications in bioinformatics, PARAM Bio Blaze will also help capture the movement of molecules and the interaction between two molecules.
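The quoted peak figure can be cross-checked with the usual theoretical-peak formula (nodes x cores x clock x floating-point operations per cycle). The sketch below is only a plausibility check: the node count, core count and clock come from the description above, while the value of 8 double-precision operations per cycle per core is an assumption (typical of AVX-class Xeon processors) and is not stated in the source.

```python
def peak_tflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak in TFLOPS: nodes x cores x GHz x FLOPs/cycle / 1000."""
    return nodes * cores_per_node * clock_ghz * flops_per_cycle / 1000.0


# 32 nodes, 16 cores per node and 2.6 GHz are taken from the text.
# flops_per_cycle=8 is an ASSUMPTION, not a figure given by C-DAC.
bio_blaze_peak = peak_tflops(nodes=32, cores_per_node=16,
                             clock_ghz=2.6, flops_per_cycle=8)

print(f"Estimated peak: {bio_blaze_peak:.2f} TF")
```

Under that assumption the estimate agrees with the 10.65 TF stated for the system.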



HPC Sangam LAB

An HPC test bed cluster with advanced technologies for the development of indigenous tools and software has been created at C-DAC, Pune under the National Supercomputing Mission. An indigenous software stack based on open source software is deployed on the test bed cluster. Different packages have been customized to suit the requirements of HPC users.


HPC for Science & Engineering, Capacity Building and High End Computational Research at NIT, Sikkim

C-DAC established a Computational Resource Centre for capacity building and high-end computational research with advanced technologies at the National Institute of Technology (NIT), Sikkim. The Centre has a peak performance of 15 TF with 40 TB of storage. It was inaugurated by Shri Shriniwas Patil, Hon'ble Governor of Sikkim, on April 14, 2016 at NIT Sikkim, located at Ravangla. Four workshops on HPC and parallel processing were also conducted at NIT Sikkim.


Establishment of Supercomputing Centre at Tezpur University, Tezpur, Assam

C-DAC established a hybrid technology based HPC facility at Tezpur University. The facility has 12 TF of compute power using 2 PARAM Shavaks with advanced accelerator technologies. C-DAC also conducted two workshops on HPC at Tezpur University following inauguration of HPC facility in August 2016.


Development and Utilization of Bioinformatics Resources and Applications Facility

C-DAC has established the Bioinformatics Resources and Applications Facility (BRAF) to provide services in the areas of genome analysis, molecular modeling and systems biology, including maintenance of databases, software, and high-end computing resources with application software. The BRAF computing facility serves more than 150 active users from IITs, universities and government labs, among others. A semi-empirical code such as MOPAC was developed to meet cloud requirements and deployed on the cloud testbed as SaaS. BRAF has facilitated the promotion of basic and applied research in computational biology, as well as doctoral and post-doctoral fellowships. It is one of the initiatives supporting the growth of the bioinformatics industry in India, which has contributed to genome-based drug discovery in the Indian pharmaceutical sector.


High Speed Interconnect and Accelerator Technologies

Trinetra: Next Generation HPC network

C-DAC is developing a next generation indigenous HPC interconnect called “Trinetra” for efficient communication between compute nodes under the National Supercomputing Mission (NSM). The network is being designed for performance, power efficiency and support for large-scale systems. C-DAC has developed a Proof of Concept (PoC) platform called Trinetra-I, capable of supporting six 40 Gbps channels (240 Gbps full duplex switching performance), which will be used as a validation platform for experimenting with various architectural concepts.


Trinetra-I – PoC Platform for indigenous HPC Interconnect


Reconfigurable Computing System (RCS)

RCS is an FPGA (Field Programmable Gate Array) based high performance accelerator card for accelerating applications. This energy-efficient card supports the Linux and Windows operating systems. The FPGA-based RCS cards designed and developed by C-DAC have been incorporated as accelerator cards in a number of HPC systems commissioned by C-DAC in the country and are part of the PARAM-Bilim supercomputer deployed by C-DAC in Kazakhstan in July 2015.


HPC System Software

System Software Development for NSM Petascale Systems

A System Software Laboratory (NSM-SSL) is envisaged to be set up as part of the NSM project. The following software for NSM HPC clusters is under development:


System Software Stack for NSM HPC Systems


MLStack - A scalable machine learning framework on Heterogeneous HPC Clusters

C-DAC's Machine Learning Stack (MLStack) is an automated integration of state-of-the-art open source Machine Learning and Deep Learning tools and frameworks that simplifies deployment on modern computing infrastructure. It aims to enable novice users to leverage the full power of existing tools on the latest computing frameworks (Hadoop, Spark, MPI, OpenMP, CUDA etc.), which accelerates the path to informed decisions. MLStack would blend Big Data technologies onto heterogeneous HPC resources that are well suited for structured, unstructured and streaming data, with enhanced speed and flexibility for ad hoc data exploration, discovery and analysis. MLStack will be made available on heterogeneous HPC clusters in the form of APIs/libraries.


C-DAC Machine Learning Stack on Heterogeneous HPC Clusters


Integrated Cluster Solution (InClus)

InClus is cluster management and monitoring software developed by C-DAC that helps to seamlessly install, manage and monitor HPC clusters. It facilitates monitoring of HPC resources such as CPUs, storage, network and user jobs. The InClus web-based user interface is simple to use and helps in managing multiple Linux cluster systems from a centralized location. Key features include a development platform with parallel and serial libraries, compilers, debuggers and profilers, an industry standard resource manager and scheduler, policy-based accounting, and accelerator support.


InClus Framework


Hybrid IDE (HiPAD)

HiPAD is an Integrated Development Environment, developed by C-DAC, for writing hybrid codes on configurable heterogeneous clusters. It provides a single interface having all the functionality required for developing hybrid parallel programs. It includes a web based IDE that is compatible with different browsers and makes the target clusters accessible over internet to remote users.


Automatic OpenCL Program Generator (OpenCLGen)

OpenCLGen is a software service developed by C-DAC to automatically generate an OpenCL program from kernel code. The OpenCLGen service takes the kernel code and kernel parameters as input and provides the complete OpenCL program as output. It improves productivity by automatically generating complex OpenCL code.


Hybrid Cluster Monitoring Tool

Monitoring accelerator-based hybrid clusters is imperative for early detection of any service degradation so that it can be rectified immediately. The Hybrid Cluster Monitoring tool is a pluggable and customizable monitoring solution for heterogeneous multi-accelerator clusters, which can be used independently and can also integrate with third-party tools. It enables monitoring of CPU, GPU and FPGA accelerators, network, storage, user jobs and other relevant services of a heterogeneous cluster. It is extensible to monitor any new accelerator or device, provides a facility to analyse archived data, and raises alerts for faults or degradation of resources and services.


Hybrid Cluster Scheduler

The hybrid cluster scheduler is unique in that it considers all the accelerator resources (GPUs, FPGAs) along with the CPUs while allocating resources for a job. It takes into account the application's requirements for diverse computational resources and provides the best-fit match for its execution. The scheduling algorithm is designed to improve cluster utilization by considering multiple parameters when allocating resources: job type, job age, resource status (availability, load, memory), information from prior job executions, and the availability of alternate resources. In this manner, it offers better turnaround time for jobs and improved resource utilization.
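To make the idea of a multi-parameter best-fit match concrete, here is a minimal sketch in Python. It is not C-DAC's scheduling algorithm: the `Node` and `Job` classes, the field names and the scoring weights are all hypothetical, chosen only to show how availability, load, memory, prior run history and alternate resource types could be combined into one score.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str            # "cpu", "gpu" or "fpga"
    available: bool
    load: float          # 0.0 (idle) .. 1.0 (saturated)
    free_mem_gb: float


@dataclass
class Job:
    name: str
    acceptable: tuple    # resource kinds the job can use, best first
    mem_gb: float
    age_hours: float
    past_runtime_s: dict = field(default_factory=dict)  # kind -> seconds


def node_score(job, node):
    """Return a fitness score, or None if the node cannot run the job."""
    if not node.available or node.kind not in job.acceptable:
        return None
    if node.free_mem_gb < job.mem_gb:
        return None
    rank = job.acceptable.index(node.kind)       # 0 = preferred kind
    history = job.past_runtime_s.get(node.kind)  # prior execution info
    history_bonus = 100.0 / history if history else 0.0
    # Hypothetical weights: favour idle nodes, penalise alternate kinds.
    return (1.0 - node.load) - 0.25 * rank + history_bonus


def best_fit(job, nodes):
    """Pick the highest-scoring eligible node, or None if nothing fits."""
    scored = [(node_score(job, n), n) for n in nodes]
    scored = [(s, n) for s, n in scored if s is not None]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


def order_queue(jobs):
    """Serve older jobs first (job age as the queue priority)."""
    return sorted(jobs, key=lambda j: j.age_hours, reverse=True)


nodes = [
    Node("gpu0", "gpu", True, 0.9, 16.0),
    Node("fpga0", "fpga", True, 0.1, 8.0),
    Node("cpu0", "cpu", True, 0.2, 64.0),
]
job = Job("md-run", ("gpu", "cpu"), mem_gb=8.0, age_hours=3.0)
chosen = best_fit(job, nodes)
print(chosen.name if chosen else "no fit")
```

In this toy run the GPU is the preferred resource but is heavily loaded, so the lightly loaded CPU node scores higher and is selected as the alternate resource.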


HPC Applications

Forest Fire Spread Simulation in part of North Sikkim using HPC

C-DAC Pune is carrying out a project on Forest Fire Spread Simulation in the state of Sikkim jointly with IIT Kharagpur and DST Sikkim. The project is funded by Ministry of Electronics and Information Technology, Govt. of India.

C-DAC Pune carried out a live forest fire spread simulation exercise in the North Sikkim region, in coordination with the forest department, in the last week of January 2023.

On 27 January 2023, immediately after a discussion with the Forest Department, Sikkim about the fire spreading in the North Sikkim region, a fire spread simulation was carried out for 24 hours based on the latest fire alert location available from satellite. The WRF-SFIRE model was executed on the PARAM SEVA HPC system for the simulation. A total of 3 nodes (48 cores/node) were used and the computation time was 04:30 hrs.

The forecasted fire spread area polygon was shared with the Forest Department to indicate the potentially affected area. The forecasted burnt area was then compared with the actual burnt area when the satellite pass (Sentinel-2) became available on 29 January 2023. A good match between the actual and forecasted fire areas was observed (forecasted fire area of 1.5 km² vis-à-vis actual fire area of 1.07 km²).

This activity was a demonstration of the coordination between different teams (including user agency) to forecast and disseminate the forest fire spread area faster than real time.
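The figures reported for this exercise can be turned into a few simple ratios. The sketch below is illustrative arithmetic only: it reads "04:30 hrs" as 4 hours 30 minutes (an assumption), and the ratios it computes are not the validation metrics used by the project team.

```python
# Values taken from the description above.
forecast_hours = 24.0        # length of the simulated fire spread
compute_hours = 4.5          # "04:30 hrs", read as 4 h 30 min (assumption)
nodes, cores_per_node = 3, 48
forecast_area_km2 = 1.5
actual_area_km2 = 1.07

total_cores = nodes * cores_per_node
# How many simulated hours are produced per wall-clock hour.
speed_vs_real_time = forecast_hours / compute_hours
core_hours = total_cores * compute_hours
# Relative difference of the forecast area with respect to the observed area.
area_overestimate = (forecast_area_km2 - actual_area_km2) / actual_area_km2

print(f"cores used:            {total_cores}")
print(f"speed vs real time:    {speed_vs_real_time:.2f}x")
print(f"core-hours consumed:   {core_hours:.0f}")
print(f"forecast vs actual:    {area_overestimate:+.1%}")
```

A ratio above 1 for `speed_vs_real_time` is what "faster than real time" means here: the 24-hour forecast was available well before the period it describes had elapsed.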

Forest Fire Spread Simulation


Real Time Weather System (RTWS)

"Anuman" (http://rtws.cesgroup.in/) comprises daily operational weather products in real time. It provides high-resolution (12x4 km grids) weather simulations over Indian sub-continent along with daily and 6-hourly weather forecasts over nearly 50,000 locations. Real time operational forecasts have been carried out daily using C-DAC’s PARAM Yuva-II. Cyclone Roanu was formed on May 17, 2016 and dissipated on May 23, 2016. The case was simulated with Real Time Weather System (RTWS) data. The track forecast was simulated well by the model. Outputs from Anuman are also being used to provide micro climatic data which is very useful for farmers.


Tracking of Cyclone Roanu by Real Time Weather System


Seasonal Monsoon Forecast

Since 2005, C-DAC has been one of the stakeholders in the Extended Range Monsoon Prediction Program of DST. It has been issuing extended range predictions of the Indian summer monsoon using a National Center for Environmental Prediction (NCEP) T170/L42 global model. The seasonal summer monsoon forecast for the year 2016, based on May conditions, was shared with the India Meteorological Department for the official monsoon forecast.

Short Range Weather Information Services for Agro Ecological Units of Kerala State

C-DAC has automated short range real time weather forecasts for the Agro Ecological Units of all districts of Kerala state (https://www.rtws.cesgroup.in/kaalavastha). Daily weather forecasts are simulated on 640 cores of PARAM YUVA-II. An agro advisory information system is prepared with the help of agricultural scientists and meteorologists based on the forecasted weather and is made available for each district through the kalavastha portal.


Near Real Time Urban Flood Forecasting

In collaboration with IIT Bombay and with support from the Ministry of Earth Sciences (MoES), C-DAC has started development of an urban flood forecasting system for Mumbai using a regional weather model along with a hydrology modeling system. A sensitivity analysis of the WRF-UCM model was set up on PARAM Yuva-II and simulations of heavy rainfall cases were carried out.


Impact of Urbanization on Current and Future Heavy Rainfall over Urban Cities in India

This initiative is targeted at understanding the increasing urbanization effects on different meteorological disasters' frequency and intensity in the coming few years. A coupled model WRF-UCM is being adapted for PARAM YUVA-II to establish a tool for assessing the long term impacts of urbanization due to change of land use land cover over urban areas.


Panorama - GIS based Marine Visualization and Forecast System

C-DAC is developing software named "Marine Forecast and Visualisation System - Panorama" to provide naval vessels with high resolution weather forecasts for optimal voyage planning. The fully automated system provides real time data download from multiple sources, database management, state-of-the-art data compression, multi-parameter visualization, extreme event analysis, alerts and real time data dissemination. It can also be customized for land-based installations requiring such forecasts for various contingencies.


Glacier Lake Management and GLOF Early Warning System for Sikkim

C-DAC has developed a Glacier Lake Management and Glacial Lake Outburst Floods (GLOF) Early Warning System and deployed it in Sikkim. It gives timely warnings to the administration for evacuating people in case glacial lakes overflow. The ultrasonic level sensors that monitor the water level in glacial lakes in real time have been indigenously designed and developed by C-DAC. Currently, the sensors are deployed at Kuppup Chho and South Lhonak Chho. Sensor communication and data transmission have been established with the base station at Gangtok.


Land use-Land cover (LULC) estimation

C-DAC contributed to land-use and land-cover (LULC) estimation for the Western Ghats and Krishna river basins of India for three decades, i.e. 1985, 1995 and 2005, under the ISRO – Geosphere Biosphere Programme. The dataset, prepared using multi-temporal and variable (medium) resolution satellite imagery, is the first of its kind and forms a very strong basis for future scientific LULC simulation endeavors. Based on this initiative, LULC, socio-economic and climatic databases for the years 2005, 2010 and 2015 at taluka level, and future LULC for the years 2015 and 2025, were prepared for the Western Ghats and Krishna river basins.


Land Use Land Cover Dynamics for Western Ghats of India


UrbAir India

C-DAC enhanced its UrbAirIndia expert system, which deals with various components of air quality management, viz. air quality monitoring, emission inventory, dispersion and receptor modelling, and multiple scenario analysis. This web-based, GIS-enabled system, developed in collaboration with the Central Pollution Control Board (CPCB), provides useful inputs to policy makers, environmental researchers and the general public. Presently, the system is being used by IIT Bombay and the Maharashtra Pollution Control Board.


UrbAirIndia – A decision support system for Indian Urban Air Quality Management


Weather Forecast Applications

C-DAC carried out enhancements in the following weather forecast applications


RNAseq analysis of breast cancer data

C-DAC developed a pipeline for RNAseq data analysis for differential expression and carried out analysis that helped to identify the genes and pathways involved in hypoxia response in breast cancer. Samples of breast cancer from Tata Memorial Hospital were analysed to understand the effect of progesterone as a therapeutic agent. This case study aided in understanding the complexities in handling large volumes of data in HPC environment.


Modeling network of gene responses to abiotic stress in rice

Abiotic stresses are the major causes for lower productivity in rice and it accounts for 50% yield loss. In India, salinity and high temperature stress are two important abiotic stresses which need immediate attention. To overcome the computational challenges involved in analysis of high-throughput data sets of gene responses to abiotic stress in rice, C-DAC is developing GRN analysis algorithms using its Bioinformatics Resources and Applications Facility (BRAF).


Development of BCG Vaccine and Complementary Diagnostics for TB Control in Cattle

Tuberculosis infection in cattle remains a major problem in both developed and developing nations. In addition to being a cause of huge economic loss in livestock farming, Mycobacterium bovis infection can spread from infected cattle to humans by aerosol or by consumption of contaminated dairy products to cause zoonotic tuberculosis. This project in collaboration with University of Surrey, UK and TRPVB, Chennai aims to generate a synergistic vaccine and diagnostic approach using HPC systems. This will allow the vaccination of cows without interfering with the surveillance of bovine tuberculosis using advanced sequencing approaches.


OpenFOAM

OpenFOAM is an open-source general purpose software suite for Computational Fluid Dynamics (CFD) computation and is fully parallelized using MPI. It is widely used in academia, R&D institutes and industries. The code was ported and benchmarked on GPGPU as well as on MIC architecture to run it in native and symmetric modes. In native mode, it runs on the Xeon-Phi coprocessor, whereas in symmetric mode it runs on both the host processors as well as on Xeon-Phi co-processors.


The figure shows a parallel benchmark study of the icoFoam solver of OpenFOAM on a Xeon host processor for a lid-driven cavity (50 million cells).


Flow streams over Ahmed Body computed by OpenFOAM on GPGPU


OpenSEES

Open System for Earthquake Engineering Simulation (OpenSEES) software, an open-source software for geotechnical and earthquake engineering simulation, was ported on Xeon-Phi architecture and its performance was analysed. This analysis will help in making OpenSEES available on new hybrid HPC system and to carry out earthquake simulation studies for different structures.


Comparison of results of OpenSEES on Xeon and Xeon Phi


Parallel Signal Processing Software for Ooty Radio Telescope (ORT)

The Ooty Radio Telescope (ORT) operated by NCRA-TIFR was the first large radio telescope built in India. C-DAC collaborated with National Centre for Radio Astrophysics (NCRA), Pune and Raman Research Institute (RRI), Bangalore for developing its parallel signal processing software on hybrid architecture for upgradation of ORT.

The challenge addressed by C-DAC in this initiative was to handle the data generated at the rate of 62 GB per second. The data was reduced for easy storage and handling using FFTs and Correlation operations. Such a massive signal processing was realized through the use of C-DAC’s HPC systems. The system was used to calculate FFTs of 264 channels on host, offload correlations to MIC (Xeon-Phi) cards, and to understand the communication profile between host and MIC. The development of data transfer module between Xeon and Xeon Phi processors was successfully completed by C-DAC for this activity.
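The division of work described above (FFTs first, then correlations between channels) follows the general pattern of an "FX" correlator. The sketch below illustrates that pattern in plain Python with a naive DFT. It is not the ORT software: the function names are hypothetical, the data sizes are tiny, and it says nothing about how the work was actually split between host and Xeon Phi.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (the "F" step)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def cross_spectrum(spec_a, spec_b):
    """Multiply one spectrum by the conjugate of the other (the "X" step)."""
    return [a * b.conjugate() for a, b in zip(spec_a, spec_b)]


def correlate_all(channels):
    """Cross-correlate every channel pair, including auto-correlations."""
    spectra = [dft(ch) for ch in channels]
    products = {}
    for i in range(len(spectra)):
        for j in range(i, len(spectra)):
            products[(i, j)] = cross_spectrum(spectra[i], spectra[j])
    return products


def pair_count(n_channels):
    """Number of channel pairs, auto-correlations included."""
    return n_channels * (n_channels + 1) // 2


# Two identical test tones that complete 2 cycles over 8 samples.
tone = [math.cos(2 * math.pi * 2 * t / 8) for t in range(8)]
result = correlate_all([tone, tone])
print(abs(result[(0, 1)][2]))   # power concentrated in frequency bin 2
print(pair_count(264))          # pairs to correlate for 264 channels
```

The pair count grows quadratically with the number of channels, which is one reason the correlation step is a natural candidate for offloading to an accelerator.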


AcoMod on Xeon Phi Platform

AcoMod is a C-based acoustic modelling code using MPI and OpenMP. Various optimization techniques were applied to the code and profiling was done using the Intel VTune XE tool. Improvement in performance was achieved using the following levels of optimizations (on Sandy Bridge):

Performance of 4.65X was achieved w.r.t. the base line MPI code (which is 1092 seconds) and 1.55X w.r.t. OpenMP code (which is 364.63 seconds) using 15 threads on single node. Removing MPI from the code and using only OpenMP improved the performance further. Memory allocation issues were addressed to get memory bandwidth on both sockets resulting in speedup of 6.6X (w.r.t. only MPI baseline performance). Optimized AcoMod was successfully ported on Xeon Phi (in native execution mode). This reduced the execution time of the code from almost 11 hours to approximately 10 minutes. The code was executed using 60 cores on one Xeon Phi card.
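The speedup figures quoted for the single-node runs can be related to each other with the definition speedup = baseline time / optimized time. The sketch below only re-derives execution times from the numbers given above; the derived times are computed values, not measurements reported by C-DAC.

```python
def optimized_time(baseline_seconds, speedup):
    """Execution time implied by a speedup over a given baseline."""
    return baseline_seconds / speedup


mpi_baseline_s = 1092.0      # baseline MPI code (from the text)
openmp_baseline_s = 364.63   # OpenMP code (from the text)

# 4.65X over MPI and 1.55X over OpenMP should describe the same run.
from_mpi = optimized_time(mpi_baseline_s, 4.65)
from_openmp = optimized_time(openmp_baseline_s, 1.55)

# 6.6X over the MPI baseline after the memory allocation fixes.
after_memory_fix = optimized_time(mpi_baseline_s, 6.6)

print(f"implied time (4.65X vs MPI):    {from_mpi:.1f} s")
print(f"implied time (1.55X vs OpenMP): {from_openmp:.1f} s")
print(f"implied time (6.6X vs MPI):     {after_memory_fix:.1f} s")
```

Both derivations of the 15-thread result land at roughly 235 seconds, so the two quoted speedups are consistent with each other.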


Speedup of AcoMod on Sandy Bridge with different optimizing techniques

Execution of AcoMod in native execution mode on single Xeon Phi card


Gromacs on Xeon Phi Platform

Gromacs is a molecular dynamics code used for simulation of biochemical molecules such as proteins, lipids and nucleic acids that have many complicated bonded interactions. It simulates the Newtonian equations of motion for systems with hundreds to millions of particles. It is an open source code written primarily in C and parallelized using hybrid OpenMP and MPI models, with the compute-intensive parts written using Intel intrinsic instructions. As Intel intrinsic instructions are architecture specific, a performance drop was observed when running the code on the Xeon Phi platform (PARAM Yuva II). A version of Gromacs in which the hotspot was coded using 512-bit Intel intrinsic instructions for Xeon Phi was obtained and integrated into the existing version of Gromacs. The code was then recompiled in its entirety, as the input data generation steps use Gromacs tools, which are version specific. The compiled code was benchmarked on both the host and the Xeon Phi in native execution mode using the standard Gromacs benchmarking input.


WRF on Xeon Phi Platform

The Weather Research and Forecasting Model (WRF) is an atmospheric model that is widely used for numerical weather prediction. The model employs advanced physical parameterizations, which facilitate modeling of atmospheric processes from global to mesoscale, with grid spacing down to 100 metres. Better and faster weather forecasting requires a huge amount of computational resources. Recent computational accelerators such as Intel Xeon Phi and Nvidia GPGPUs, which have higher power efficiency, provide a better platform for weather forecasting applications.

As part of short-range weather forecasting research, the complexities involved in porting and executing a high resolution WRF model on the Intel Xeon Phi platform (PARAM Yuva II) were studied. Studies were done to evaluate the performance of high resolution nested (12 km and 4 km) and single (3 km) domain WRF model configurations in host, native and symmetric execution modes. Scalability studies using varying numbers of nodes and KNC threads were carried out. The execution environment was optimized for host and symmetric mode execution and the performance bottlenecks were identified. Single domain configurations were observed to be better suited to many-core Xeon Phi accelerators. This was the first operational full WRF implementation of its kind on Xeon Phi.


ROMS on Xeon Phi Platform

Ocean modelling is an inherently complex problem within the earth system framework and poses a challenge to computational scientists. The computational requirements for ocean state forecasting are high because of the wide spectrum of scales of motion. The Regional Ocean Modelling System (ROMS) is a terrain-following ocean model that solves the Reynolds-averaged Navier-Stokes equations. Early experiments to evaluate the performance of ROMS on the Intel Xeon Phi platform (PARAM Yuva II) have been performed. Model simulations were conducted in native and symmetric execution modes on Xeon Phi. Performance profiling of the code has been done to determine the bottlenecks, and possible improvements to achieve higher performance have been identified.


VASP on Xeon Phi Platform

The Vienna Ab initio Simulation Package (VASP) is a code for atomic scale materials modelling, e.g. electronic structure calculations and quantum-mechanical molecular dynamics, from first principles.

Native compilation and benchmarking of VASP 5.3 code on Xeon Phi platform (PARAM Yuva II) with gamma and All K points input files was done followed by explicit threading with OpenMP for routines taking most time. The code was run in native execution mode and best results were obtained with total of 240 MPI processes on 8 Xeon Phi cards.


HPC Solutions and Services

C-DAC deployed HPC solutions and offered HPC related services to various national and international agencies. The details regarding some of the deployments are given below:


Trainings/Workshops on HPC

Capacity building through Internship Scheme for students of NE India for strengthening R&D in HPC

C-DAC has established facilities in HPC in North-Eastern States including Assam (NIT Silchar, Assam University, Tezpur University, Assam Engineering College), Meghalaya (North Eastern Hill University, NIT Meghalaya), Sikkim (NIT Sikkim), Tripura (NIT Agartala) and imparted training in the area of parallel computing to enable the students in the North Eastern region for proper use of HPC system. More facilities in HPC are under deployment in various institutions in NE region.


Cloud Computing

SuMegha Cloud Builder

SuMegha Cloud Builder is a tool that automatically installs a cloud stack for building a private cloud. SuMegha Cloud Builder was enhanced with support for the OpenStack cloud middleware. The Mitaka version of OpenStack was installed on the Scicloud testbed machine using CentOS 7.2, and MPI and Hadoop clustering and job submission through the Job Submission Portal for Cloud (JSPC) were completed.


Sumegha Cloud Builder


Meghdoot - Software Suite for building Cloud Computing Environment

Meghdoot is a comprehensive software suite designed and developed by C-DAC for building a cloud computing environment. Key features include service provisioning and deployment, ease of management through web services, and enhanced security. It also provides simplified graphical installation and configuration of the cloud, exhaustive monitoring, metering, simplified management of resources and services, security features covering data in transit, data at rest, and multi-level authentication and authorization, high availability across all services and resources, and backup and disaster recovery solutions. The Eucalyptus-based Meghdoot cloud environment at the Tamil Nadu State Data Centre, Chennai was enhanced with an OpenStack-based Meghdoot cloud environment. Management, maintenance and support activities were also carried out for the existing cloud environment in the Green Mini Data Centre (GMDC), Sabarkantha District Panchayat, Himmatnagar.


Cloud Connect

Cloud Connect is an easy to use web interface for connecting clouds and simplifies the use of Infrastructure-as-a-Service (IaaS) feature of cloud. It abstracts creation of security group, management of network topology, creation of virtual machine, elastic block storage and snapshot, and automatic mounting of elastic block storage to virtual machine.


Cloud Vault

Cloud Vault is an enterprise-class cloud storage solution offered as Storage-as-a-Service. Users and organizations can use Cloud Vault to store large data efficiently, safely and cheaply. Its key features include Single Sign On (SSO) authentication using the user's mail ID, object-based storage, support for file and directory operations, data isolation, reliability, high availability and 2-way redundancy for data, and multiple client interfaces such as web, Java APIs and command line.


User Interface of Cloud Vault


Online NGS Tool for Scientific Cloud

Next Generation Sequencing (NGS) is used to analyze and process the data produced as a result of genome sequencing. Generally, the datasets produced are huge and require huge computation power and other resources. Analysis and processing of NGS data requires a work-flow where the results of one step need to be pipelined to the next step for further processing. Online NGS tool is a web-based pipeline for genome sequence analysis. It is hosted on C-DAC’s scientific cloud and is provided to the users via the Internet. Online NGS tool works on MPI-enabled virtual clusters to provide maximum computation using parallel approaches and provides storage for huge sequenced files. It comes integrated with tools for pre-processing, mapping/aligning and manipulating sequenced datasets.
Its key features are:


Indian Banking Community Cloud (IBCC)

Banking, Financial Services and Insurance (BFSI) sector benefits from cloud computing as it:


Multi-Site Disaster Recovery as a Service on Cloud

C-DAC is working on a solution for Multi-site Disaster Recovery-as-a-Service (DRaaS) on cloud for state and national data centres having cloud infrastructure. This leverages cloud technology to provide logically centralized and physically distributed "Disaster Recovery-as-a-Service" model for service continuity of e-governance applications.


OpenStack based Infrastructure as a Service (IaaS) Facility

A state-of-the-art computing and storage infrastructure for services and research was installed at Kolkata Centre that includes Tier II data centre with provision of 8 high density racks, 45 TB SAN storage and networking equipment including firewall, router with 10 Gbps backbone.


National Grid Computing initiative-GARUDA

GARUDA (Global Access to Resources Using Distributed Architecture) provides pan-India e-infrastructure to catalyze the research in science & engineering. Users belong to virtual organizations such as Bioinformatics, Computer Aided Engineering and Open Source Drug Discovery community, etc. Activities involved provisioning the trending architectural components such as the Science gateways, Visualization gateways and Data grid solutions which have largely contributed to overall utility of the grid infrastructure worldwide. Grid operational activities include constant monitoring and management of grid components along with user support.


Big Data

C-DAC's Big Data Software Suite (Desktop Version)

C-DAC's Big Data Software Suite (C-BDSS) (Desktop Version) is an open source platform that provides the processing and analysis capabilities needed to run Big Data applications in varied domains. It enables novice users to leverage the power of Hadoop and its ecosystem components, including Spark, and accelerates the path to informed decisions. It comes with a modest set of Big Data analysis tools chosen for ease of use and computational power. C-BDSS was launched at the National Conference on Parallel Computing Technologies, held on February 23-24, 2017.


Framework for Healthcare Analytics

C-DAC has developed a Big Data analytical framework that uses multiple inputs of healthcare data to derive metric-based insights. Such insights enable healthcare providers (hospitals, doctors) and funding agencies to standardize best practices on medication, improve the patient experience and institute preventive and corrective measures in healthcare. Key features include Open Refinery, Infection Control Registry, Heat Map, Symptom Based Registry, Emergency Patient Timeline and GTM Search Engine. A test deployment of the solution was carried out at AIIMS, New Delhi.


Big Data Analytics Framework for Rice Genome Analysis

C-DAC developed a Big Data Analytics Framework to analyze genetic variations (SNPs - Single Nucleotide Polymorphisms) across rice genome varieties originating from different countries and to visualize the level of genetic variation among them by applying machine learning techniques. The data was taken from the International Rice Informatics Consortium (IRIC), and the framework was developed using a combination of pre-processing, processing and post-processing tools. The pre-processing and post-processing components were developed in-house in Python, and processing is done using VariantSpark, a machine learning analysis framework for genomic data.
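
A pre-processing step of this kind often turns genotype calls into numbers before any machine learning is applied. The following is a minimal sketch of that idea, not C-DAC's actual code: the function names, the VCF-style genotype strings and the distance measure are all assumptions made for illustration, and VariantSpark itself is not called.

```python
import math

def encode_genotype(gt):
    """Count alternate alleles in a VCF-style genotype string such as '0/1'.

    Returns None when any allele is missing ('.'). Hypothetical helper.
    """
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return sum(1 for allele in alleles if allele != "0")

def genotype_distance(sample_a, sample_b):
    """Mean absolute difference in alternate-allele counts.

    Only SNPs called in both samples are compared (assumed measure).
    """
    pairs = [(x, y) for x, y in zip(sample_a, sample_b)
             if x is not None and y is not None]
    if not pairs:
        return math.nan
    return sum(abs(x - y) for x, y in pairs) / len(pairs)

# Two made-up rice varieties genotyped at the same four SNP positions.
variety_1 = [encode_genotype(g) for g in ["0/0", "0/1", "1/1", "./."]]
variety_2 = [encode_genotype(g) for g in ["0/0", "1/1", "0/0", "0/1"]]
distance = genotype_distance(variety_1, variety_2)
```

A matrix of such pairwise distances is one possible input for clustering or for a plot of how varieties differ.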


DPICT Visualizer – Tool to assist Drug Discovery

DPICT Visualizer is a standalone application developed by C-DAC to assist drug discovery. It helps researchers visualize data from multiple simulation trajectories quickly and efficiently. Its key feature is the ability to load multiple trajectory files simultaneously so that they can be viewed together and operated on. The application supports AMBER and GROMACS trajectory formats and can load PDB files to view molecular structures in ribbon, cartoon and wire renderings. It also offers various colour-coding schemes for the structures, selectable by the user.


NEtwork relationship Using causal ReasONing (NEURON)

NEURON is a tool developed by C-DAC that focuses on deriving gene regulatory networks. A gene regulatory network is a collection of genes/molecules and their interactions, which together control their functionality. NEURON provides an easy-to-use interface and helps researchers understand the process of identifying causality (the relationship of cause and effect) in gene regulation. The statistical significance of predictions has been tested using multinomial coefficients derived from randomized data sets. NEURON has been used to study 70,000 varieties of rice crops in collaboration with the Indian Council of Agricultural Research. DPICT Visualizer and NEURON were launched at the event "Accelerating Biology 2017: Delivering Precision", held in Pune on January 17-19, 2017.


MOlecular Structure GenerAtor In the Cloud (MOSAIC)

MOSAIC is an OpenStack cloud-based conformational search tool developed by C-DAC to explore the potential energy surface of biomolecules of interest in parallel using a semi-empirical method. The tool is useful for finding target drug ligands. The torsion-angle-driven conformational search method is useful in a range of chemical design applications, including drug discovery and the design of targeted chemical hosts. MOSAIC offers an easy-to-use interface to the bioinformatics community through a Software as a Service (SaaS) platform. MOSAIC was launched on the 30th Foundation Day of C-DAC, March 27, 2017, in Pune.

MOSAIC: Cloud-based Conformational Search Tool
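
As a rough illustration of a torsion-angle-driven search, the sketch below enumerates a grid of torsion angles and ranks the resulting conformers by energy. It is not MOSAIC's implementation. The function names, the 30-degree step and `toy_energy` (a threefold cosine term standing in for a semi-empirical energy calculation) are assumptions made for illustration.

```python
import heapq
import itertools
import math

def toy_energy(torsions_deg):
    """Hypothetical stand-in for a semi-empirical energy evaluation.

    Each torsion contributes 1 + cos(3 * angle), which is lowest at
    60, 180 and 300 degrees (staggered-like positions).
    """
    return sum(1.0 + math.cos(math.radians(3 * t)) for t in torsions_deg)

def conformational_search(n_torsions, step_deg=30, energy=toy_energy, keep=5):
    """Scan every combination of torsion angles on a regular grid.

    Returns the `keep` lowest-energy (energy, torsions) pairs.
    """
    grid = range(0, 360, step_deg)
    candidates = itertools.product(grid, repeat=n_torsions)
    scored = ((energy(c), c) for c in candidates)
    return heapq.nsmallest(keep, scored)

# Search a molecule with two rotatable bonds (assumed for the example).
lowest = conformational_search(n_torsions=2, keep=3)
```

Each grid point is evaluated independently of the others, which is what makes this kind of search suitable for parallel execution. The grid grows as (360 / step) to the power of the number of torsions, so a finer step or more rotatable bonds quickly raises the cost.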


A Big Data Platform for Graph-based Pharmacogenomics Data

Pharmacogenomics studies are widely adopted in clinical practice and help in understanding the effects of a drug and its dosage based on an individual's genetic makeup. C-DAC has developed a Big Data platform by integrating existing pharmacogenomics data from multiple sources. A web application with an easy-to-use interface has been developed for querying this integrated database and visualizing the results graphically.

Visualization of pharmacogenomics data using the Big Data platform


H-bond Bigdata Analysis Tool (H-BAT)

Molecular Dynamics (MD) is a computational technique that uses Newton's equations of motion to study the dynamics of biomolecules and is commonly used by structural biologists. There is a current need for advanced analytics platforms and algorithms that can analyze MD data faster and more efficiently. C-DAC implemented an algorithm within the MapReduce paradigm to calculate hydrogen bonding (including water-water interactions) in large trajectories. Benchmarking of the algorithm showed linear scalability with up to 5 TB of data.
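
The map and reduce steps of a hydrogen-bond calculation can be sketched in plain Python as follows. This is an illustrative sketch, not the H-BAT algorithm: the frame layout, the function names and the single 3.5 Å donor-acceptor distance cutoff are assumptions, and a realistic hydrogen-bond criterion would normally also test the bond angle.

```python
from collections import Counter
from functools import reduce
from math import dist

# Assumed donor-acceptor distance cutoff in angstroms.
CUTOFF = 3.5

def map_frame(frame):
    """Map step: count each donor-acceptor pair within CUTOFF in one frame."""
    counts = Counter()
    for donor, d_xyz in frame["donors"].items():
        for acceptor, a_xyz in frame["acceptors"].items():
            if donor != acceptor and dist(d_xyz, a_xyz) <= CUTOFF:
                counts[(donor, acceptor)] += 1
    return counts

def reduce_counts(left, right):
    """Reduce step: merge two partial results by summing counts per pair."""
    return left + right

def hbond_occupancy(frames):
    """Fraction of frames in which each donor-acceptor pair is bonded."""
    totals = reduce(reduce_counts, map(map_frame, frames), Counter())
    return {pair: n / len(frames) for pair, n in totals.items()}

# Two made-up trajectory frames: the pair is in range only in the first.
frames = [
    {"donors": {"D1": (0.0, 0.0, 0.0)}, "acceptors": {"A1": (3.0, 0.0, 0.0)}},
    {"donors": {"D1": (0.0, 0.0, 0.0)}, "acceptors": {"A1": (5.0, 0.0, 0.0)}},
]
occupancy = hbond_occupancy(frames)
```

Because `map_frame` looks at one frame at a time and `reduce_counts` is associative, frames can be split across workers and the partial counts merged in any order, which is the property a MapReduce framework relies on.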