WP3 Status

Fabio Vitello, Fabio Gargano

Missione 4 • Istruzione e Ricerca 

ICSC Italian Research Center on High-Performance Computing, Big Data and Quantum Computing

Spoke 3 Technical Workshop, Trieste, October 9-11, 2023

WP3 Objectives

This WP develops a prototype data analysis framework based on Machine Learning (ML) and Visualization tools, exploiting diverse computing platforms and combining them with exascale applications. The framework will be tailored to the ACO-S community, identifying the use cases best suited to high-performance visualization tools and ML techniques. It will also handle observational data coming from large, challenging experiments.

Objective 2. Big data processing and visualization, by adopting innovative approaches (e.g. Artificial Intelligence, inference via Bayesian statistics) for the analysis of large and complex data volumes and for their exploration (e.g. in-situ visualization), while efficiently exploiting HPC solutions. (WP3, WP4, WP5)

  • Action 2.2: To develop new methods and/or optimize existing prototype Machine Learning applications for the automated processing of large and complex data produced on Exascale systems by the ACO-S community.
  • Action 2.3: To develop new solutions and optimize existing ones for high-performance visualization, addressing on-site and remote visualization for Exascale and post-Exascale systems.
  • Action 2.4: To develop integrated solutions building on the outcome of the previous actions, in order to provide a single, optimized ecosystem on Exascale platforms for big and complex data sets.

WP3 Tasks

T3.1: Requirements from AAA community

This task will assess the requirements of the community and the corresponding solutions in terms of Machine Learning and Visualization, either exploiting the functionalities available in selected tools or developing new ones.

Duration of the Task: September 1, 2022 - August 31, 2023

Outcome: A collection of requirements in terms of Machine Learning and Visualization, exploiting the functionalities available in selected tools and/or the development of new functionalities.

These requirements will drive the development of use cases in T3.2 and T3.3.

https://docs.google.com/spreadsheets/d/1yhkCVYdK92WGq_iDOtNyLoxYd2bq_c5yGEIYPIfkP5A

Thematic groups

  • Time series: 1 use case;
  • Feature extraction: 4 use cases;
  • Bayesian inference: 3 use cases;
  • Deep learning: 20 use cases;
  • Visualization: 2 use cases;
  • Data-reduction & imaging: 1 use case;
  • Web Tools: 1 use case;
  • Not Classified: 4 use cases.
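
As a quick check on the breakdown above, the following minimal Python sketch tallies the collected use cases per thematic group and prints each group's share. It is not part of the original slides: the counts are copied from the list, and the names `use_cases` and `total` are purely illustrative. The tally gives 36 use cases in total, with deep learning accounting for more than half of them.

```python
# Use-case counts per thematic group, copied from the list above.
# Variable names are illustrative, not taken from any WP3 tool.
use_cases = {
    "Time series": 1,
    "Feature extraction": 4,
    "Bayesian inference": 3,
    "Deep learning": 20,
    "Visualization": 2,
    "Data-reduction & imaging": 1,
    "Web Tools": 1,
    "Not Classified": 4,
}

# Total number of collected use cases across all groups.
total = sum(use_cases.values())

# Print the groups from largest to smallest, with their share of the total.
for group, count in sorted(use_cases.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{group:<26}{count:>3}  {count / total:6.1%}")

print(f"{'Total':<26}{total:>3}")
```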

T3.2: Innovative Machine Learning

This task designs, implements and evaluates ML components for ACO-S pipelines. Targets include offline processes for the transformation and enrichment of data, such as classification, segmentation, reduction, and emulation. For each targeted component, the task will provide a suitable ML model together with an efficient and scalable implementation. Its performance will then be assessed, in terms of both task outcome and computational efficiency, to produce a final evaluation of the advantages and disadvantages of the resulting ML solutions. Together with industrial partners, the task will address a) pattern search using network analysis in very large, noisy datasets, b) anomaly detection, and c) privacy-preserving techniques deployed in federated learning services (Cybersecurity).

T3.3: HPC/Cloud Visualization Services

The main emphasis will be on fast, interactive visualization, using services at the HPC facilities close to the data while delivering results at high speed to the user’s desktop. These tools will be tested and integrated with existing open-source databases such as https://smart-turb.roma2.infn.it/, which will be further developed during this project to host a large quantity of geophysical and astrophysical datasets. The ultimate goal is to provide fully interactive visualization capabilities that can access and visualize large data sets held in a remote repository.

Criticalities (From Catania’s meeting)

  • A lot of declared effort, but what about actual participation?
  • Hardware: we need access to HPC resources to develop our codes, ASAP*
  • Hardware: we need precise specs to emulate the system that we will use, ASAP*

  • A lot of declared effort, but what about actual participation?

From just one talk at the Catania meeting to 13 talks!

Criticalities (Updated)

  • A lot of declared effort, but what about actual participation?
  • Hardware: we need access to HPC resources to develop our codes, ASAP*
  • Hardware: we need precise specs to emulate the system that we will use, ASAP*
  • Repository and instructions for deploying the developed software.
  • Do we need to implement CI/CD pipelines?