1 of 13

Learning Molecular Cloud emission with Neural nets

Giuseppe Puglisi (UniCT),

Avinash Anand, Marina Migliaccio, Domenico Marinucci (UniToV)

Missione 4 • Istruzione e Ricerca 

ICSC Italian Research Center on High-Performance Computing, Big Data and Quantum Computing

Spoke 3 General Meeting, Bologna, December 18, 2024

2 of 13

Scientific Rationale

The Milky Way acts as a foreground with respect to the Cosmic Microwave Background (CMB)

3 of 13

Scientific Rationale

  • Full-sky maps of Galactic emission are needed for cosmological observations.
  • Some regions have not been observed … yet.
  • CMB ground-based telescopes are observing those same areas…
  • Planck observed the full sky, BUT its data are noisy.

CMB

Unobserved

4 of 13

Cycle-GAN in a nutshell

= binary Cross-Entropy
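
For reference, the binary cross-entropy named above, and the way it typically enters a Cycle-GAN generator objective, can be sketched in plain Python. This is a minimal illustration and not the training code behind these slides: the function names, the L1 cycle-consistency term and the weight `lam` are assumptions taken from the standard Cycle-GAN formulation.

```python
import math


def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(targets)


def l1_distance(a, b):
    """Mean absolute difference, used here as the cycle-consistency term."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def generator_loss(disc_probs_on_fakes, original, reconstructed, lam=10.0):
    """Adversarial BCE plus lam times the cycle-consistency term.

    The generator wants its fakes scored as real, hence targets of 1.
    lam=10.0 is an assumed value, not one quoted in the slides.
    """
    ones = [1.0] * len(disc_probs_on_fakes)
    adversarial = binary_cross_entropy(ones, disc_probs_on_fakes)
    return adversarial + lam * l1_distance(original, reconstructed)
```

When the reconstruction equals the original, the cycle term vanishes and only the adversarial BCE remains.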

5 of 13

Technical Objectives, Methodologies and Solutions

  • Build the training set from available data (Planck, HI4PI)
  • Identify Galactic regions of bright emission and low noise contribution -> high SNR (> 8)
  • Create the training set from those areas
  • 10,488 (training), 1,166 (validation), 747 (test)
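
The selection and split described above can be sketched as follows. The SNR threshold and the three set sizes come from the slide; everything else (the dictionary layout of a patch, the function names, the seed) is hypothetical. How patches were actually assigned to each set is not stated, so the random shuffle here is only a stand-in.

```python
import random

SNR_THRESHOLD = 8.0                 # from the slide: keep patches with SNR > 8
SPLIT_SIZES = (10_488, 1_166, 747)  # training, validation, test (from the slide)


def select_high_snr(patches, threshold=SNR_THRESHOLD):
    """Keep only patches whose 'snr' entry exceeds the threshold."""
    return [p for p in patches if p["snr"] > threshold]


def split_dataset(patches, sizes=SPLIT_SIZES, seed=0):
    """Shuffle reproducibly, then cut into training/validation/test lists."""
    if sum(sizes) > len(patches):
        raise ValueError("not enough patches for the requested split")
    shuffled = list(patches)
    random.Random(seed).shuffle(shuffled)
    n_train, n_val, n_test = sizes
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:n_train + n_val + n_test]
    return train, val, test


def split_fractions(sizes=SPLIT_SIZES):
    """Fraction of the full sample that falls in each set."""
    total = sum(sizes)
    return [n / total for n in sizes]
```

With the quoted sizes the sample holds 12,401 patches in total, split roughly 85% / 9% / 6%.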

Training

Testing

Validation

6 of 13

Training Cycle-GAN

  • batch size = 16 (progressively increased to 128)
  • 2 input channels (dust and HI)
  • 2 targets (CO J:1-0 and J:2-1)
  • training performed on NVIDIA A100-SXM4-40GB (4 GPUs @ NERSC)
  • 3x3 deg² maps (128x128 pixels)
  • added random Gaussian noise (sigma = 0.3)
  • 14,000 epochs
  • 80% accuracy
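
Two of the settings above can be made concrete with a short sketch: the pixel size implied by a 3 deg map sampled on 128 pixels, and the Gaussian noise augmentation. The constants come from the slide; the function names are hypothetical, and the units or normalisation to which sigma = 0.3 applies are not stated, so the noise is simply added to whatever values the map holds.

```python
import random

MAP_SIDE_DEG = 3.0   # from the slide: 3x3 deg^2 maps
MAP_SIDE_PIX = 128   # from the slide: 128x128 pixels
NOISE_SIGMA = 0.3    # from the slide; units/normalisation are not stated


def pixel_size_arcmin(side_deg=MAP_SIDE_DEG, side_pix=MAP_SIDE_PIX):
    """Angular size of one pixel, in arcminutes."""
    return side_deg * 60.0 / side_pix


def add_gaussian_noise(image, sigma=NOISE_SIGMA, rng=None):
    """Return a new 2D list with independent zero-mean Gaussian noise per pixel."""
    if rng is None:
        rng = random.Random()
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in image]
```

For these map dimensions the pixel size works out to 180 / 128 = 1.40625 arcmin. Passing a seeded `random.Random` makes the augmentation reproducible.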

Methodologies

Cycle-GAN

J:2->1

J:1->0

7 of 13

Results on Test set

8 of 13

Results on Test set

9 of 13

Results on Test set

10 of 13

Quality of predictions - Figures of Merit

Power Spectra (2pt stat.)

Minkowski Functionals (high-order stat.)
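
Of the two figures of merit, the Minkowski functionals are the easier to illustrate without extra libraries. The sketch below computes only the first one, V0, which is the fraction of map area at or above a threshold, and compares the V0 curves of a reference map and a predicted map. The function names and the threshold grid are assumptions, and the higher functionals (boundary length and Euler characteristic) are omitted.

```python
def area_fraction(image, threshold):
    """V0: fraction of pixels whose value is at or above the threshold."""
    flat = [v for row in image for v in row]
    return sum(1 for v in flat if v >= threshold) / len(flat)


def v0_curve(image, thresholds):
    """V0 evaluated on a grid of thresholds."""
    return [area_fraction(image, t) for t in thresholds]


def max_v0_difference(reference, prediction, thresholds):
    """Largest absolute gap between the two V0 curves over the threshold grid."""
    ref = v0_curve(reference, thresholds)
    pred = v0_curve(prediction, thresholds)
    return max(abs(r - p) for r, p in zip(ref, pred))
```

A prediction that reproduces the morphology of the reference map gives V0 curves that nearly coincide, so the maximum difference stays close to zero.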

11 of 13

Building a new Galactic model

Predicting CO emission in regions where it has never been observed so far.

Reprojection of 50M pixels with MPI, 140k CPU hours @ Perlmutter (NERSC)

12 of 13

noise and angular resolution

synthetic scales injected

13 of 13

M7: Results with Res-UNet

Timescale, Milestones and KPIs

Res-UNet implementation, debugging, preprocessing training data

10/2022

07/2023

01/2023

03/2023

NN Architecture change: CycleGAN

09/2023

01/2024

10/2024

03/2025

Res-UNet released on GitHub: PICASSO repo

Paper

Analysis of Results

M9: Results with Cycle-GAN

04/2024

Dataset augmentation with de-noising

Figures of merit

08/2025

Validation and verification with observations

80%

M10: Release of Models

M8: Training Cycle-GAN

Implementation of CycleGAN