1 of 10

A video-based CNN for flare forecasting

Sabrina Guastavino

Joint work with Michele Piana, Federico Benvenuto, Anna Maria Massone, Francesco Marchetti, Cristina Campi

Department of Mathematics, Università degli studi di Genova

MIDA group

The Solar Physics High Energy Research workshop

(SPHERE)

July 15, 2022

2 of 10

Solar flares originate from magnetically active regions (ARs), but not all ARs give rise to a flare.

Two approaches:

  • features, i.e. numerical approximations of physical parameters, are first extracted from HMI images and then passed to machine learning methods for prediction (e.g. SVM, RF, hybrid LASSO)

  • features are extracted automatically by deep learning methods such as Convolutional Neural Networks (CNNs).

3 of 10

Machine/deep learning in flare forecasting

  • Three ingredients:
    • a supervised algorithm for classification (e.g. a neural network)
    • a historical data set for its training and validation
    • a score for the assessment of performance

1) A neural network is a parametric function that approximates the map connecting data to the event probability

 

  • The input of a neural network can be:
    • point-in-time feature vectors
    • time series of feature vectors
    • point-in-time images of ARs
    • videos of images of ARs

4 of 10

[Equations not recovered in extraction; surviving labels: Loss function, Regularization, Confusion matrix]
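As a reminder of how a score is obtained from a confusion matrix, here is a minimal sketch for binary flare/no-flare predictions. The function names and the 0.5 threshold are illustrative assumptions, not taken from the talk; the score shown is the true skill statistic (TSS).

```python
def confusion_matrix(y_true, y_prob, threshold=0.5):
    """Count TP, FN, FP, TN for binary labels and predicted probabilities."""
    tp = fn = fp = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p > threshold else 0  # hard decision from the probability
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def true_skill_statistic(tp, fn, fp, tn):
    """TSS = recall - false alarm rate; ranges from -1 to 1."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return recall - false_alarm_rate
```

Because the hard threshold makes the entries piecewise constant in the network output, a score computed this way cannot be used directly as a differentiable training objective.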

5 of 10

Data generation

Definition of data samples:

  • X, M, C class
  • NO1, NO2, NO3, NO4

Uniform training and validation sets:

  • proportionality: each set contains the same proportion of every sample type
  • parsimony: each subset of samples is built from as few ARs as possible (i.e., samples belonging to the same AR fall into the same data set)

The algorithm for data set generation should guarantee that the resulting sets are uniform.

Guastavino, Marchetti, Benvenuto, Campi, Piana, Implementation paradigm for supervised flare forecasting studies: a deep learning application with video data (2022), Astronomy & Astrophysics, vol. 662, A105.
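The two requirements can be illustrated with a simplified greedy split. This is a sketch under stated assumptions, not the algorithm of the cited paper: `split_by_active_region`, the `(ar_id, sample_type)` tuple layout and the 0.8 training fraction are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def split_by_active_region(samples, train_fraction=0.8, seed=0):
    """Split (ar_id, sample_type) pairs into training and validation lists.

    Every AR is assigned as a whole to one set. ARs go to training while
    training is still below its target count for the AR's dominant type.
    """
    by_ar = defaultdict(list)
    for ar_id, sample_type in samples:
        by_ar[ar_id].append((ar_id, sample_type))

    totals = Counter(sample_type for _, sample_type in samples)
    ar_ids = sorted(by_ar)
    random.Random(seed).shuffle(ar_ids)  # reproducible random AR order

    train, validation = [], []
    train_counts = Counter()
    for ar_id in ar_ids:
        group = by_ar[ar_id]
        dominant = Counter(t for _, t in group).most_common(1)[0][0]
        if train_counts[dominant] < train_fraction * totals[dominant]:
            train.extend(group)
            train_counts.update(t for _, t in group)
        else:
            validation.extend(group)
    return train, validation
```

Changing `seed` yields a different random assignment of ARs, which is one way to produce several independent splits.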

6 of 10

Gap between loss minimization and score maximization

Two crucial points:

  • choice of the loss function to be minimized in the training phase
  • choice of the skill scores to evaluate the predictions

Score-Oriented Loss (SOL) functions [1]

Idea: define a loss function whose minimization maximizes the desired skill score.

Ingredients:

  • Define a probabilistic version of the confusion matrix.
  • Define scores over the entries of the probabilistic matrix.

Advantage: no need for a posteriori optimization of the desired skill score.

[1] Marchetti, Guastavino, Piana, Campi, Score-Oriented Loss (SOL) functions (2022), under review in Pattern Recognition
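A minimal sketch of the two ingredients, assuming the classification threshold is treated as a random variable uniformly distributed on [0, 1], so that each predicted probability enters the matrix directly as the expected value of the hard decision. The function names are illustrative, and TSS is used as the example score.

```python
def probabilistic_confusion_matrix(y_true, y_prob):
    """Expected confusion matrix entries under a uniform random threshold."""
    tp = sum(y * p for y, p in zip(y_true, y_prob))
    fn = sum(y * (1.0 - p) for y, p in zip(y_true, y_prob))
    fp = sum((1.0 - y) * p for y, p in zip(y_true, y_prob))
    tn = sum((1.0 - y) * (1.0 - p) for y, p in zip(y_true, y_prob))
    return tp, fn, fp, tn


def tss_loss(y_true, y_prob, eps=1e-12):
    """Negative TSS on the probabilistic matrix: minimizing it maximizes TSS."""
    tp, fn, fp, tn = probabilistic_confusion_matrix(y_true, y_prob)
    tss = tp / (tp + fn + eps) - fp / (fp + tn + eps)  # eps avoids division by zero
    return -tss
```

The entries are smooth in the predicted probabilities, so in an actual training loop the same expressions would be written with the tensor operations of a deep learning framework to obtain gradients.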

 

7 of 10

Deep neural network architecture

LRCN = CNN + LSTM (Long-term Recurrent Convolutional Network)

[Figure: each frame of the video, from time t-T to time t, is passed through a CNN (feature extraction); the resulting feature sequence is fed to a chain of LSTM units (analysis of the temporal aspect of feature sequences).]
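The data flow of a CNN followed by an LSTM can be sketched in plain NumPy. This is a toy forward pass with random weights, not the network used in the talk: the single convolutional layer, the layer sizes and all names (`cnn_features`, `lstm_step`, `lrcn_forward`, `init_params`) are assumptions chosen for brevity.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cnn_features(frame, kernels):
    """Toy per-frame extractor: 2-D convolution, ReLU, global average pooling."""
    k = kernels.shape[-1]
    # windows has shape (H-k+1, W-k+1, k, k)
    windows = np.lib.stride_tricks.sliding_window_view(frame, (k, k))
    maps = np.einsum("hwij,cij->chw", windows, kernels)
    return np.maximum(maps, 0.0).mean(axis=(1, 2))  # one feature per kernel


def lstm_step(x, h, c, W, U, b):
    """One LSTM update; gates are stacked as [input, forget, output, candidate]."""
    i, f, o, g = np.split(W @ x + U @ h + b, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new


def init_params(n_kernels=4, k=3, hidden=8, seed=0):
    rng = np.random.default_rng(seed)
    return {
        "kernels": rng.normal(scale=0.1, size=(n_kernels, k, k)),
        "W": rng.normal(scale=0.1, size=(4 * hidden, n_kernels)),
        "U": rng.normal(scale=0.1, size=(4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "w_out": rng.normal(scale=0.1, size=hidden),
        "b_out": 0.0,
    }


def lrcn_forward(video, params):
    """Map a video of shape (T, H, W) to a flare probability in (0, 1)."""
    hidden = params["U"].shape[1]
    h, c = np.zeros(hidden), np.zeros(hidden)
    for frame in video:  # frames ordered in time
        x = cnn_features(frame, params["kernels"])
        h, c = lstm_step(x, h, c, params["W"], params["U"], params["b"])
    return sigmoid(params["w_out"] @ h + params["b_out"])
```

For example, `lrcn_forward(np.random.default_rng(1).normal(size=(5, 16, 16)), init_params())` returns a single probability for a five-frame video; the same CNN weights are shared across all frames, and only the last LSTM state feeds the output.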

8 of 10

Data

SDO/HMI images recorded between September 2012 and September 2017.

  • For each AR, we take the associated HMI magnetogram frames and organize them into 24-hour-long time series.

  • The result is a collection of samples, each one a video of HMI magnetogram frames associated with an AR.

  • Data from the past: we know whether a flare occurred for each time series, so we label each time series with 1 if a flare occurred and with 0 otherwise.

  • We generate 10 training, validation and test sets with the data generation process in order to assess the statistical robustness of the results.

9 of 10

TSS on test sets

            Mean    Std     Min     25th perc   Median  75th perc   Max
C+ flares   0.55    0.049   0.457   0.521       0.537   0.597       0.613
M+ flares   0.683   0.089   0.546   0.614       0.691   0.724       0.821

[Figure panels: C+ flares prediction; M+ flares prediction]
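Summary statistics of this kind can be computed from the per-split scores with the Python standard library. This is only a sketch: `summarize_scores` is a hypothetical helper, and the use of the sample standard deviation and of the "inclusive" quantile method are assumptions, since the talk does not state which conventions were used.

```python
import statistics


def summarize_scores(scores):
    """Summarize one score value per test set (e.g. the TSS of each split)."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "mean": statistics.mean(scores),
        "std": statistics.stdev(scores),  # sample standard deviation (assumed)
        "min": min(scores),
        "25th perc": q1,
        "median": median,
        "75th perc": q3,
        "max": max(scores),
    }
```

Calling it once per flare class, on the list of scores from the independent test sets, gives one row of the table.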

10 of 10

Summary and conclusions

  • We proposed a general paradigm for the implementation and assessment of flare forecasting.

  • We proposed a new loss function that automatically optimizes a desired skill score.

  • We used a deep NN that takes videos of ARs as input.

  • We showed the importance of data generation for evaluating the performance of the method.

Thank you for your attention!