In particular, the mission of the HPC department of Cineca is to accelerate scientific discovery by providing HPC resources, data management and storage systems, and tools. It also provides expertise in numerical simulation and data science within an Open Innovation paradigm. The department is a member of BDVA, ETP4HPC and PRACE, a core partner in EUDAT and Elixir, and a partner in Human Brain, Fortissimo2 and I4MS. In 2016 it supported 1140 research projects and directly participated in 31 EU research projects, 40 research agreements and 12 industry projects.
The IOP4HPDA is made available by the HPC department of Cineca as a Big Data environment for research and innovation.
The DBGroup is the research database group at the Department of Engineering “Enzo Ferrari” of the University of Modena and Reggio Emilia and it contributes to the IOP4HPDA with researchers and tools.
PLATFORM AND SERVICES INFORMATION
Software: Big Data Apache suite AAS; Deep Learning: Caffe, Theano and TensorFlow, optimized for the platform's hardware characteristics in collaboration with Intel and Nvidia; Data Analytics: R, H2O, Octave, math libraries, I/O libraries; an organized data repository for data deposit, retrieval and preservation; high-performance databases: PostgreSQL, MySQL, Neo4j; specific tools for bioinformatics (NGS pipelines) and for visualization.
- Exploitation of the Infrastructure
Open access to HPC/HPDA storage and computing resources; Cloud Computing; computing in batch, interactive and streaming modes.
- Advanced middleware and software tools
- Data management
Collection, preparation, annotation, curation, linking, security, access control, long-term preservation, post-processing.
- Data analytics
Predictive modeling, Supervised and unsupervised learning, Association rules, Sequential patterns, Link analysis, Recommenders, Natural Language Processing, Named Entities Recognition, Information Extraction, Automatic classification, Sentiment Analysis, Semantic metadata generation, Automatic annotation, Speaker segmentation, Automatic Speech Recognition, Video segmentation, Keyframes extraction, Semantic metadata generation from video items, Image recognition.
Remote visualization, Computer vision and visual computing, Computer Graphics, 3d modeling and rendering, Immersive device programming, Render farm service, Virtual Reality, Augmented Reality, Virtual museum and exhibition design.
- User support and Specialist support
Covering different scientific fields, technologies, programming languages, and techniques.
- Training and Education
Specialized training (workshops on massive data analysis and international summer schools on parallel computing, data analytics and computer graphics), Cooperation with universities (lab activity for master programs, post-doc programs), Knowledge transfer during the project life cycle.
- Technology transfer and consulting
Development of proofs of concept and innovation projects for businesses to demonstrate the added value and ROI.
SELECTED PROJECTS AND/OR SUCCESS STORIES
The algorithm that the Large Insurance Company was using took many hours and did not allow the company to calculate the risk measurement with a nested Monte Carlo approach. Nested Monte Carlo involves two stages, scenario generation (outer stage) and portfolio re-valuation (inner stage), which together produce millions of Monte Carlo trajectories to be executed for each of the millions of life policies. The simulation therefore quickly becomes a computational challenge. The Large Insurance Company contacted the HPC department of Cineca – IOP4HPDA for a PoC to demonstrate the efficiency gains obtainable through code parallelization and optimization. As a result, a nested Monte Carlo run with parameters 100000x100 could be completed for all 12 million policies. The Large Insurance Company then decided to establish a commercial contract with Cineca for the provision of the service.
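The two-stage structure described above can be illustrated with a minimal Python sketch. It is not the insurer's model or the code developed in the PoC: the toy liability (a guaranteed minimum payout on a fund), the lognormal dynamics, all parameter values, the 99.5% quantile used as the risk measure, and the function names revalue_policy, nested_monte_carlo and quantile are assumptions chosen only to show how the outer and inner loops nest.

```python
import math
import random
import statistics


def revalue_policy(rng, start_level, n_inner, horizon_years=10.0,
                   rate=0.02, vol=0.15, guarantee=1.0):
    """Inner stage: re-value one policy under one outer scenario.

    Hypothetical liability: the payout at maturity is max(guarantee, fund),
    valued as the discounted average over n_inner simulated fund paths.
    """
    discount = math.exp(-rate * horizon_years)
    drift = (rate - 0.5 * vol ** 2) * horizon_years
    spread = vol * math.sqrt(horizon_years)
    total = 0.0
    for _ in range(n_inner):
        terminal = start_level * math.exp(drift + spread * rng.gauss(0.0, 1.0))
        total += max(guarantee, terminal)
    return discount * total / n_inner


def nested_monte_carlo(n_outer, n_inner, seed=0,
                       one_year_drift=0.05, one_year_vol=0.15):
    """Outer stage: draw one-year scenarios and re-value the policy in each."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_outer):
        shock = rng.gauss(0.0, 1.0)
        # Fund level after one year in this outer scenario (assumed lognormal).
        level = math.exp(one_year_drift - 0.5 * one_year_vol ** 2
                         + one_year_vol * shock)
        values.append(revalue_policy(rng, level, n_inner))
    return values


def quantile(values, q):
    """Empirical quantile: the element at position floor(q * n), capped at n - 1."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    values = nested_monte_carlo(n_outer=2000, n_inner=100)
    print("mean liability value:", statistics.mean(values))
    print("99.5% quantile:", quantile(values, 0.995))
```

The cost grows as n_outer times n_inner per policy. If 100000x100 is read as outer scenarios times inner paths (an assumption, since the text does not say which parameter is which), that is 10 million inner paths per policy. Because outer scenarios and policies are independent of one another, the loops can in principle be distributed across many cores, which is the kind of parallelization the PoC relied on; the sketch itself runs serially.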
In the PRESERVE project, funded within the Fortissimo EU project, sensor data from TEXA on-board diagnostic tools were analyzed to identify driving habits on the one hand, and patterns of operating parameters that are predictive of failures and damage on the other.
The result is a portfolio of service prototypes that can predict failures, mechanical problems or damage at the component level and offer the manufacturer very detailed information to better redesign or upgrade spare parts or vehicles. The Return on Innovation Investment (ROI2) for TEXA from this project has been estimated as 2.72.
In the first experiment, a sample of 8,600 enterprises’ websites was “scraped” and the acquired texts (more than 200 million textual records) were processed in order to extract the same information that is provided by the standard questionnaire “Survey on ICT Usage and e-Commerce in Enterprises”. The results were encouraging, with a satisfactory predictive capability of the fitted models. Within this experiment, the HPC department of Cineca – IOP4HPDA developed novel methods for extracting information from unstructured data, and the use of supercomputers significantly reduced the computational time required. As a result, a more extensive application of the approach in a “census-like manner”, covering all Italian enterprises, has been performed, and ISTAT is now evaluating the production phase.
In this context, data from social media were collected and analyzed to decipher emotions, opinions and judgments and to provide the Caserta Royal Palace with a real-time reputation monitoring system that also supports interactive exploration of historical data and past events. The system tracks the topics being discussed and the sentiments being expressed, and can be used to assess the impact of events and communication strategies.
Among the scientific research projects that the HPC department of Cineca supports, many are both very successful and data intensive, e.g. EMODnet (European Marine Observation and Data Network) and SPHINX (Data Storage and Preservation of High resolution climate experiments).