1 of 35

Unleashing the Potential of Machine Learning for Efficient Analysis of Solar Observations

Carlos José Díaz Baso

Rosseland Centre for Solar Physics, Institute of Theoretical Astrophysics, University of Oslo, N-0315 Oslo, Norway

carlos.diaz@astro.uio.no

2 of 35

Big Data?

  • IRIS (2013-now): ~61 TB (Level 2)
  • Hinode/SOT (2006-now): ~35 TB (Level 1 FG + Levels 1 & 2 SP)
  • DKIST, EST, SST: ~TB per hour per instrument

Numbers courtesy of Bart De Pontieu, Marc DeRosa, Ryan Timmons (10/2022)

3 of 35

Exploration and dimensionality reduction

Typical questions from someone who has just obtained a big dataset:

  • How diverse, rather than merely large, is my dataset?
  • How many of its features are actually informative, and how many carry only redundant information?

4 of 35

Dimensionality reduction: Principal Component Analysis

What is the basic idea behind PCA? Why is it useful in solar observations?

1.- Find the “principal components” basis.

2.- Project the data onto a truncated version of that basis.

The compressibility of solar spectra is what makes this truncation so effective: Martínez González et al. (2008a) (e.g. Asensio Ramos et al. 2007; Asensio Ramos & López Ariste 2010). A minimal code sketch follows below.
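A minimal sketch (not from the talk; array shapes and names are illustrative) of PCA compression of spectral profiles with scikit-learn:

```python
# Illustrative only: compress Stokes I profiles with PCA and reconstruct them
# from a truncated basis of "eigenprofiles".
import numpy as np
from sklearn.decomposition import PCA

# Assume `spectra` holds one profile per pixel: shape (n_pixels, n_wavelengths)
spectra = np.random.default_rng(0).normal(size=(10_000, 200))

pca = PCA(n_components=20)                     # keep only the first 20 components
coeffs = pca.fit_transform(spectra)            # compressed representation (n_pixels, 20)
reconstructed = pca.inverse_transform(coeffs)  # denoised/compressed profiles

print("explained variance:", pca.explained_variance_ratio_.sum())
```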

5 of 35

Dimensionality reduction: PCA applications

  • Denoising: Martínez González et al. (2008a,b)
  • Removal of fringes: Casini et al. (2012, 2021)

(e.g. Asensio Ramos et al. 2007; Asensio Ramos & López Ariste 2010; Paletou 2012; Pastor Yabar et al. 2018; Trelles Arjona et al. 2021)

6 of 35

Dimensionality reduction: PCA applications

  • PCA inversion: Socas-Navarro et al. (2001) (e.g. Rees et al. 2000; López Ariste & Casini 2002; Skumanich & López Ariste 2002; Casini et al. 2005, 2009, 2013; Sainz Dalda et al. 2019)
  • PCA deconvolution: Ruiz Cobo & Asensio Ramos (2013) (e.g. Quintero Noda et al. 2015, 2016; Felipe et al. 2016; Griñón-Marín 2021)

7 of 35

Finding patterns

Once we have processed our dataset …

  • Can I find a way to distinguish “groups” with similar properties?
  • The conclusions will then hold for every example within a group!
  • Where are these “groups” located on the solar surface?

8 of 35

Clustering: the K-means algorithm

1.- Choose the number of clusters K and initialize the centroids.

2.- Assign each point to the closest centroid (Euclidean distance).

3.- Update each centroid as the average of the points in its cluster.

Steps 2 and 3 are repeated until the assignments stop changing. A minimal code sketch follows below.

(e.g., Pietarila et al. 2007; Viticchié & Sánchez Almeida 2011; Panos et al. 2018; Sainz Dalda et al. 2019; Bose et al. 2019; Rouppe van der Voort et al. 2021; Robustini et al. 2019; Joshi & Rouppe van der Voort 2020b; Kuckein et al. 2020; Bose et al. 2021a,b; Barczynski et al. 2021; Nóbrega-Siverio et al. 2021; Kleint & Panos 2022; Joshi et al. 2022; Thoen Faber 2022)
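A minimal sketch (not from the talk; shapes and the choice K=8 are illustrative) of clustering spectral profiles with scikit-learn:

```python
# Illustrative only: group spectral profiles into K clusters and keep one
# representative profile (the centroid) per cluster.
import numpy as np
from sklearn.cluster import KMeans

# Assume `profiles` holds one spectrum per pixel: shape (n_pixels, n_wavelengths)
profiles = np.random.default_rng(1).normal(size=(50_000, 60))

km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(profiles)
labels = km.labels_              # cluster index per pixel, can be reshaped into a map
centroids = km.cluster_centers_  # one representative profile per cluster
```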

9 of 35

K-means applications

  • QS magnetic field: Viticchié & Sánchez Almeida (2011)
  • Type-II spicules: Bose et al. (2019, 2021a,b)

10 of 35

K-means applications

  • Panos et al. (2018) (e.g. Woods et al. 2021)
  • Nóbrega-Siverio et al. (2021)

11 of 35

PCA and k-means are not the only options (see the sketch after these lists).

Clustering techniques

  • Affinity Propagation
  • Agglomerative Hierarchical Clustering
  • BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
  • Gaussian Mixture Models (GMM)
  • K-Means
  • Mean Shift Clustering
  • Mini-Batch K-Means
  • OPTICS
  • Spectral Clustering

Dimensionality reduction

  • Feature selection
  • Principal Component Analysis (PCA)
  • Non-negative Matrix Factorization (NMF)
  • Linear Discriminant Analysis (LDA)
  • Generalized Discriminant Analysis (GDA)
  • Missing Values Ratio
  • Low Variance Filter
  • High Correlation Filter
  • Backward Feature Elimination
  • Forward Feature Construction
  • t-SNE (t-distributed Stochastic Neighbour Embedding)
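Many of these are available with a uniform API in standard libraries. A minimal sketch (scikit-learn assumed, as mentioned at the end of the talk; data are illustrative):

```python
# Illustrative only: a 2-D embedding plus two alternative clusterings of the same data.
import numpy as np
from sklearn.manifold import TSNE
from sklearn.mixture import GaussianMixture
from sklearn.cluster import DBSCAN

X = np.random.default_rng(2).normal(size=(2_000, 50))   # toy feature matrix

embedding = TSNE(n_components=2, perplexity=30).fit_transform(X)  # for visualization
gmm_labels = GaussianMixture(n_components=5).fit_predict(X)       # probabilistic clustering
db_labels = DBSCAN(eps=3.0, min_samples=10).fit_predict(X)        # density-based clustering
```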

12 of 35

Classification and prediction

Once we know which part of our dataset is interesting …

  • Can we find a recipe that links our data to some properties?
    - the properties may come from different sources,
    - or be very computationally expensive to obtain,
    - or be very difficult to derive manually.
  • Can it be general enough to be applied to future data?
  • Even more important: can we learn something from it?

13 of 35

Classification and prediction: Support Vector Machines

[Illustration: an SVM decision boundary separating two classes with the widest possible margin.]

  • Flare prediction: Bobra et al. (2015, 2016) (e.g. Yuan et al. 2010; Nishizuka et al. 2017; Florios et al. 2018)

A minimal code sketch follows below.
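A minimal sketch (not the setup of Bobra et al.; features and labels are toy placeholders) of a flare/no-flare classifier with a support vector machine in scikit-learn:

```python
# Illustrative only: RBF-kernel SVM on active-region features, evaluated by cross-validation.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(1_000, 25))     # e.g. magnetic-field summary features per active region
y = rng.integers(0, 2, size=1_000)   # 1 = flaring, 0 = quiet (random toy labels)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, class_weight="balanced"))
print(cross_val_score(clf, X, y, cv=5, scoring="f1").mean())
```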

14 of 35

Nonlinear modeling → neural networks

What happens if the relation we want to model is highly non-linear?

[Diagram: a neural network mapping an input A to an output B through layers of free parameters.]

A minimal code sketch follows below.
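A minimal sketch (illustrative toy data) of fitting a non-linear relation with a small fully connected network in scikit-learn:

```python
# Illustrative only: regress y = f(x) for a strongly non-linear f with an MLP.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
x = rng.uniform(-3, 3, size=(5_000, 1))
y = np.sin(3 * x[:, 0]) + 0.1 * rng.normal(size=5_000)   # toy non-linear mapping plus noise

net = MLPRegressor(hidden_layer_sizes=(64, 64), activation="tanh",
                   max_iter=2000, random_state=0).fit(x, y)
print(net.predict([[0.5]]))   # should be close to sin(1.5)
```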

15 of 35

NLTE radiative transfer calculations with neural networks

[Diagram: the full calculation goes solar model → non-LTE populations → intensity + polarization via radiative transfer; a neural network learns part of this mapping directly from the solar model.]

  • Chappell & Pereira (2021): 1D, predicting departure coefficients
  • Vicente Arévalo et al. (2021): 3D, mapping LTE → non-LTE populations

16 of 35

Spectropolarimetric inversions

[Diagram: forward modeling maps a solar model to intensity + polarization through radiative transfer; a neural network learns the inverse mapping, from intensity + polarization back to the solar model.]

  • Sainz Dalda et al. (2019): synthesis + inversions ~10³–10⁶ times faster
  • (e.g. Carroll et al. 2001, 2008; Socas-Navarro 2003, 2005; Sainz Dalda et al. 2019; Milić et al. 2020; Gafeira et al. 2021; Centeno et al. 2022)

17 of 35

Accelerating the inference …

  • … in large FOVs: Kianfar et al. (2019)
  • … in long time-series: Morosin et al. (2022), net radiative losses from Ca II 8542 Å observations with CRISP@SST

18 of 35

Autoencoders (the non-linear PCA)

Autoencoders (the non-linear PCA)

[Diagram: input data → encoder → encoded (latent) data → decoder → reconstructed data.]

  • Sparse representation: Flint & Milić (2021)
  • (e.g. Skumanich & López Ariste 2002; Sadykov et al. 2021; Ivanov et al. 2021)

A minimal code sketch follows below.
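A minimal sketch (PyTorch assumed; sizes are illustrative) of a small fully connected autoencoder compressing spectral profiles into a low-dimensional code:

```python
# Illustrative only: the encoder squeezes a profile into 8 numbers, the decoder reconstructs it.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_wav=200, n_latent=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_wav, 64), nn.ReLU(),
                                     nn.Linear(64, n_latent))
        self.decoder = nn.Sequential(nn.Linear(n_latent, 64), nn.ReLU(),
                                     nn.Linear(64, n_wav))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.randn(512, 200)                   # a batch of profiles
loss = nn.functional.mse_loss(model(x), x)  # reconstruction objective
loss.backward()
```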

19 of 35

Convolutional Neural Networks

If I want to analyze a megapixel image (10⁶ pixels), do I need a neural network with O(>10⁶) learnable parameters?

  • Translational equivariance: f(g(x)) = g(f(x))
  • Efficient parametrization
  • Uses the information of neighbouring pixels

(Animations: github.com/vdumoulin/ ; visual.cs.ucl.ac.uk/pubs/harmonicNets)

A minimal code sketch follows below.
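A minimal sketch (PyTorch assumed; architecture is illustrative) of why a CNN needs far fewer parameters than one weight per pixel:

```python
# Illustrative only: a few thousand shared convolutional weights process an image of
# any size, producing a per-pixel output map.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1),
)

image = torch.randn(1, 1, 1024, 1024)            # a one-megapixel image
output = cnn(image)                              # same spatial size as the input
print(sum(p.numel() for p in cnn.parameters()))  # only a few thousand parameters
```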

20 of 35

The image as a “whole”

  • Automatic catalogues: Armstrong & Fletcher (2019)
  • Automatic segmentation: Guo et al. (2022) (e.g. Ahmadzadeh et al. 2019; Zhu et al. 2019; G. Zhu et al. 2021; Illarionov & Tlatov 2018; Diercke et al. 2022)

21 of 35

Using the information from nearby pixels

  • Image deconvolution: Díaz Baso et al. (2018) (e.g. Asensio Ramos et al. 2018, 2021; Armstrong et al. 2021; Wang et al. 2021; Deng et al. 2021)
  • Hinode PSF-compensated Stokes inversions: Asensio Ramos & Díaz Baso (2019), replacing a pixel-by-pixel 1D inversion code with a convolutional network

22 of 35

CNN applications

  • Solar image denoising (Ca II 8542 Å): Díaz Baso et al. (2019) (e.g. Park et al. 2020)
  • Horizontal velocity fields: Tremblay et al. (2020, 2021) (e.g. Asensio Ramos et al. 2017; Ishikawa et al. 2022)

23 of 35

Other applications

  • Fibril orientation: Jiang et al. (2021)
  • (Gravitational) wave classification: Krastev (2020)

(e.g. Xia et al. 2020; Qiu et al. 2022)

24 of 35

Enhancing CNNs with temporal information

  • Far-side activity detection: Broock et al. (2022) (e.g. Felipe et al. 2019; Broock et al. 2021; Sun et al. 2022)

25 of 35

All that glitters is not gold: challenges and future directions

Interpretability

  • Why is this active region classified as a flare-producer? Sun et al. (2022) (e.g. Yi et al. 2021; Upendran et al. 2020)

To name a few methods (a sketch of one of them, pixel attribution, follows below):

  • Learned features
  • Pixel attribution (saliency maps)
  • Testing concepts
  • Adversarial examples
  • Influential instances
  • Symbolic regression
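A minimal sketch (PyTorch assumed; the model and input are toy placeholders) of a vanilla saliency map, i.e. the gradient of a class score with respect to the input pixels:

```python
# Illustrative only: pixels with a large input gradient are the ones that most
# influenced the classifier's decision.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))

image = torch.randn(1, 1, 128, 128, requires_grad=True)  # e.g. a magnetogram patch
score = model(image)[0, 1]                                # score of the "flare-producer" class
score.backward()
saliency = image.grad.abs().squeeze()                     # pixel-attribution map
```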

26 of 35

Uncertainty quantification

Does your method know when it doesn’t know? A minimal code sketch follows below.
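A minimal sketch (PyTorch assumed; toy network) of Monte-Carlo dropout, one simple way to let a network report how uncertain it is:

```python
# Illustrative only: keep dropout active at prediction time and treat the spread of
# repeated forward passes as a rough model (epistemic) uncertainty.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.2),
                    nn.Linear(64, 1))

x = torch.randn(1, 10)
net.train()                                           # keeps the dropout mask stochastic
samples = torch.stack([net(x) for _ in range(100)])
mean, std = samples.mean(dim=0), samples.std(dim=0)   # std as an uncertainty estimate
```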

27 of 35

A probabilistic perspective: conditional information

  • Mutual information: Panos et al. (2021a,b) (e.g. Snelling et al. 2020)
  • Designing observing sampling: Díaz Baso et al. (in prep) (e.g. Szenicer et al. 2019; Lim et al. 2021; Salvatelli et al. 2022)

[Illustration: pairs of observables with low, medium and high correlation.]

A minimal code sketch follows below.
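A minimal sketch (illustrative data) of estimating mutual information with scikit-learn's k-nearest-neighbour estimator:

```python
# Illustrative only: mutual information between one spectral point and all the others.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

profiles = np.random.default_rng(5).normal(size=(20_000, 60))  # (pixels, wavelengths)
reference = profiles[:, 30]                                    # one chosen spectral point

mi = mutual_info_regression(profiles, reference)
# High MI -> that wavelength is largely redundant with the reference point;
# low MI  -> it carries complementary information worth sampling.
print(mi.round(2))
```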

28 of 35

A probabilistic perspective: inverse problems

[Diagram: forward modeling maps physical parameters to observables; the inverse problem maps observables back to parameters, and is often ambiguous because different parameter sets can reproduce the same observation.]

29 of 35

This ambiguity is inherent in many problems:

  • Pose estimation (input image → 3D pose): Wehrbein et al. (2021)
  • Super-resolution: Lugmayr et al. (2020)

30 of 35

Normalizing flows

Normalizing flows

[Diagram: a standard neural network returns a single estimate of a parameter λ, whereas a normalizing flow returns its full (possibly multimodal) distribution.]

Rezende & Mohamed (2015), Dinh et al. (2016)

A minimal code sketch follows below.
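A minimal sketch (PyTorch assumed) of the change-of-variables idea behind normalizing flows: a simple base distribution is pushed through an invertible transform, and exact densities follow from the transform's log-det Jacobian:

```python
# Illustrative only: a fixed invertible transform on a Gaussian base. In practice the
# transforms are learnable (e.g. affine coupling layers, Dinh et al. 2016) and are
# conditioned on the observation to model p(parameters | observation).
import torch
from torch import distributions as D

base = D.Normal(torch.zeros(1), torch.ones(1))
flow = D.TransformedDistribution(base, [D.SigmoidTransform()])

x = flow.sample((5,))     # samples now live in (0, 1)
logp = flow.log_prob(x)   # exact log-density via the change-of-variables formula
print(x.squeeze(), logp.squeeze())
```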

31 of 35

Normalizing flows

  • Non-LTE inversion: Díaz Baso et al. (2021) (e.g. Osborne et al. 2019; Asensio Ramos et al. 2021)
  • Posterior distributions of the atmosphere as a function of height:
    - only using the Fe I 6301 Å line
    - also using the Ca II 8542 Å profile

32 of 35

From normalizing flows to diffusion models

Sohl-Dickstein et al. (2015), Song & Ermon (2019), Ho et al. (2020)

[Examples from Ramesh et al. (2022): images generated from text prompts such as “Grizzly bear taking a selfie on the Golden Gate bridge on a windy day”, “Irish Terrier riding a horse in Patagonia and playing the harmonica”, “Cat with a yellow hat going down the stairs under water”, “Panda mad scientist mixing sparkling chemicals, artstation”.]

→ a promising direction for complex inverse problems.

33 of 35

Summary and conclusions

  • Machine learning can help in many different ways: pattern exploration, image reconstruction, compression, denoising, parameter inference, classification, tracking, etc. Inference is extremely fast.
  • The question you want to address is as important as the method. Depending on the goal, there is no need to always "reinvent the wheel" (existing literature vs. your own design).
  • There are still many ways to improve these methods: incorporating physical constraints (e.g. symmetries, conservation laws), making them interpretable, quantifying uncertainty, enabling multimodal solutions, etc.

34 of 35

Summary and conclusions

Many of the techniques discussed here are available off the shelf (e.g. scikit-learn in Python).

35 of 35

Summary and conclusions

What a time to be alive!