1 of 54

Bayesian Light Source Separator (BLISS): Probabilistic detection, deblending and measurement of astronomical light sources

Ismael Mendoza

Department of Physics, University of Michigan

C. Avestruz

Department of Physics, University of Michigan

J. Regier, D. Hansen, R. Liu, Z. Zhao, Z. Pang

Department of Statistics, University of Michigan

Probabilistic Machine Learning Group (UM)

2 of 54

Outline

The Blending Problem
Overview of BLISS Framework
Results: BLISS applied on simulated data
Future Directions and Conclusions

3 of 54

Motivation: The Blending Problem in Stage-IV surveys

Stage-IV cosmological surveys (such as LSST) will reach greater depths than ever before.
Greater depth ⇒ higher number density of galaxies ⇒ ~60% of them will visually overlap (blend)!

Precision cosmology needs accurate source detection and measurement (e.g. fluxes and photo-z’s, shapes and weak lensing).
⇒ We need accurate, robust, and fast deblenders.

Image Credit: Kamath 2020 Thesis

Simulated 5x5 arcsec LSST patch

4 of 54

Probabilistic cataloging and uncertainty

The number of sources can be highly uncertain for any given galaxy blend.
Flux uncertainty is also high, since different galaxies can account for different portions of the flux in the image.
⇒ We need a way to estimate uncertainty and propagate it to downstream analyses.

5 of 54

Overview of BLISS Framework

6 of 54

What is the Bayesian Light Source Separator (BLISS)?

A Bayesian framework for detection, deblending, and measurement of galaxies and stars.

7 of 54

BLISS at a glance

8 of 54

Not requiring centroids as additional input

9 of 54

Neural networks parametrize distributions over measured quantities

Number of sources
Locations
Classification boolean (star vs galaxy)
Star fluxes
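
As a rough illustration of what "parametrizing a distribution" can mean for a single tile, the sketch below draws catalog samples from a set of per-tile parameters. The parameter names, the numerical values, and the choice of distribution families (Bernoulli for the source count and the star/galaxy label, clipped Gaussian for the location, log-normal for the star flux) are assumptions made for this example, not a description of the actual BLISS networks.

```python
import math
import random

# Hypothetical variational parameters for one tile. In BLISS a neural
# network would output quantities like these; every name, value, and
# distribution family here is an illustrative assumption.
tile_params = {
    "p_source": 0.8,         # probability that the tile contains a source
    "loc_mean": (0.4, 0.6),  # mean location, in units of the tile side
    "loc_sd": (0.05, 0.05),  # location standard deviation
    "p_galaxy": 0.7,         # probability that the source is a galaxy
    "log_flux_mean": 8.0,    # mean of the log star flux
    "log_flux_sd": 0.3,      # standard deviation of the log star flux
}


def clip01(value):
    """Clip a coordinate to the unit interval so it stays inside the tile."""
    return min(max(value, 0.0), 1.0)


def sample_tile(params, rng):
    """Draw one catalog sample for a single tile, or None if it is empty."""
    # Source count: a Bernoulli draw (zero or one source in the tile).
    if rng.random() >= params["p_source"]:
        return None
    # Location: independent Gaussians per axis, clipped to the tile.
    loc = tuple(
        clip01(rng.gauss(mean, sd))
        for mean, sd in zip(params["loc_mean"], params["loc_sd"])
    )
    # Classification boolean: star vs galaxy.
    is_galaxy = rng.random() < params["p_galaxy"]
    sample = {"loc": loc, "is_galaxy": is_galaxy}
    # Star flux: log-normal, only drawn when the source is a star.
    if not is_galaxy:
        sample["star_flux"] = math.exp(
            rng.gauss(params["log_flux_mean"], params["log_flux_sd"])
        )
    return sample


rng = random.Random(0)
draws = [sample_tile(tile_params, rng) for _ in range(10_000)]
# Monte Carlo estimate of the detection probability for this tile.
detection_fraction = sum(d is not None for d in draws) / len(draws)
```

Repeated draws like these are one way to turn per-tile distributions into uncertainties on derived quantities, such as the number of sources in a region.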

10 of 54

BLISS predictions are done “per tile”

0 objects detected

1 object detected

⇒ Posterior on location/classification for this tile

Images of blended galaxies are split into tiles, and properties are predicted per tile.
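
A minimal sketch of the tiling step, assuming non-overlapping square tiles and an image whose sides are multiples of the tile size. The function name `split_into_tiles` and the 4-pixel tile size are hypothetical; the tile size and any padding used by BLISS are not given here.

```python
import numpy as np


def split_into_tiles(image: np.ndarray, tile_size: int) -> np.ndarray:
    """Split a 2D image into non-overlapping square tiles.

    Returns an array of shape (n_rows, n_cols, tile_size, tile_size).
    """
    height, width = image.shape
    if height % tile_size or width % tile_size:
        raise ValueError("image dimensions must be multiples of tile_size")
    n_rows, n_cols = height // tile_size, width // tile_size
    # Reshape to (n_rows, tile_size, n_cols, tile_size), then move the
    # two tile-grid axes to the front.
    return image.reshape(n_rows, tile_size, n_cols, tile_size).swapaxes(1, 2)


image = np.arange(64, dtype=float).reshape(8, 8)
tiles = split_into_tiles(image, tile_size=4)  # hypothetical tile size
```

Each `tiles[i, j]` can then be passed to a per-tile predictor independently of the others.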

11 of 54

The algorithm’s output on images provides a probabilistic interpretation of measured quantities and their corresponding uncertainties.

12 of 54

Results: BLISS applied on simulated data

Simulated data are produced with the galaxy simulator GalSim and used to create galaxy blends for both training and testing BLISS.
Galaxies are parametric “Bulge+Disk” galaxies.

Disk

Bulge

Bulge+Disk
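
For reference, a bulge+disk model is commonly written as the sum of a de Vaucouleurs bulge (Sérsic index n = 4) and an exponential disk (n = 1). The form below assumes that convention; the symbols F (total flux), f_b (bulge fraction), r_b and r_d (half-light radii) are introduced here for illustration and are not taken from the slides.

```latex
I(r) = F \left[ f_b \, \hat{I}_{4}(r; r_b) + (1 - f_b) \, \hat{I}_{1}(r; r_d) \right],
\qquad
\hat{I}_{n}(r; r_e) \propto \exp\!\left[ -b_n \left( \frac{r}{r_e} \right)^{1/n} \right]
```

Here each \hat{I}_n is normalized to unit flux, with b_4 ≈ 7.67 and b_1 ≈ 1.68 so that r_e is the half-light radius.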

13 of 54

Detection and Classification results on simulated blends

14 of 54

Example of blends with true centroids

15 of 54

BLISS point-estimate detection metrics on 10k blend dataset

Takeaway: BLISS can detect the majority of sources with SNR > 7.
BLISS maintains higher precision than recall for low-SNR sources.

Precision = # Matched predicted sources / # Predicted sources

Recall = # Matched predicted sources / # True sources

**Matches are detected (>50% detection probability) galaxy/star centroids less than 1 pixel away from a true centroid.
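
The definitions above can be sketched as follows. The greedy nearest-neighbor matching is an assumption of this example (the matching procedure actually used is not specified here), and the function name `detection_metrics` and the toy centroids are hypothetical.

```python
import math


def detection_metrics(pred, probs, truth, max_dist=1.0, threshold=0.5):
    """Precision and recall of detected centroids against true centroids.

    pred and truth are lists of (x, y) positions in pixels; probs holds
    the detection probability of each predicted centroid.
    """
    # Keep only predictions above the detection-probability threshold.
    detected = [p for p, q in zip(pred, probs) if q > threshold]
    unmatched = list(truth)
    n_matched = 0
    for px, py in detected:
        if not unmatched:
            break
        # Greedy step: nearest true centroid that is still unmatched.
        nearest = min(unmatched, key=lambda t: math.hypot(t[0] - px, t[1] - py))
        if math.hypot(nearest[0] - px, nearest[1] - py) < max_dist:
            unmatched.remove(nearest)
            n_matched += 1
    # Convention chosen here: return 0.0 when a ratio is undefined.
    precision = n_matched / len(detected) if detected else 0.0
    recall = n_matched / len(truth) if truth else 0.0
    return precision, recall


truth = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
pred = [(10.2, 9.9), (20.0, 22.5), (30.1, 30.1), (5.0, 5.0)]
probs = [0.9, 0.8, 0.7, 0.3]
precision, recall = detection_metrics(pred, probs, truth)
```

In this toy case three predictions pass the 50% threshold and two of them lie within 1 pixel of a true centroid, so both precision and recall are 2/3.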

16 of 54

BLISS case study on separation

Galaxy 1 has double the flux of Galaxy 2 and is 50% bigger.

17 of 54

Let’s try to understand BLISS’s predictions step by step at each separation

18 of 54

Detection probability as a function of distance

Initially, the centroids of both galaxies are in the same (central) tile.
BLISS assumes there is at most 1 object per tile, so only one prediction is made.
BLISS is highly confident that a galaxy is present in this tile.
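
To make the "same tile" situation concrete, the sketch below assigns centroids to tiles and checks when two galaxies stop sharing one. The 4-pixel tile size, the positions, and the helper name `tile_of` are hypothetical values chosen for illustration, not the settings of the case study.

```python
def tile_of(x: float, y: float, tile_size: float) -> tuple[int, int]:
    """Return the (row, column) index of the tile containing centroid (x, y)."""
    return int(y // tile_size), int(x // tile_size)


TILE_SIZE = 4.0          # hypothetical tile side, in pixels
galaxy_1 = (6.0, 6.0)    # fixed centroid inside the central tile

# Move galaxy 2 to the right of galaxy 1 in 1-pixel steps and record
# whether the two centroids still fall in the same tile.
shares_tile = []
for separation in (0.0, 1.0, 2.0, 3.0):
    galaxy_2 = (galaxy_1[0] + separation, galaxy_1[1])
    shares_tile.append(tile_of(*galaxy_1, TILE_SIZE) == tile_of(*galaxy_2, TILE_SIZE))
```

With these values the centroids share a tile at separations of 0 and 1 pixel, so under the one-object-per-tile assumption only one prediction is available there; from 2 pixels onward galaxy 2 falls in the adjacent tile.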

Prediction on tile 1

19 of 54

Detection probability as a function of distance

Once the second galaxy enters the adjacent tile, BLISS outputs a prediction for both tiles.
Initially, both galaxies are highly blended and BLISS does not detect a second galaxy.
The detection probability exceeds 50% at separations > 3 pixels.

Prediction on tile 1

Prediction on tile 2

20 of 54

Detection probability as a function of distance

As the galaxies get farther apart, detecting each individual light source becomes easier and both detection probabilities converge to 1.

Prediction on tile 1

Prediction on tile 2

21 of 54

Detection probabilities at tile boundaries

Detection probability has sharp valleys when a galaxy’s centroid lands close to a tile boundary.
This is an artifact of our loss not being “symmetric” w.r.t. tile boundaries.
We leave further investigation to future work.

22 of 54

Flux reconstruction residual as a function of distance

Initially, the centroids of both galaxies are in the same (central) tile.
The flux reconstructed at the central tile, which accounts for both galaxies, is 50% larger than Galaxy 1’s flux, as expected.
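
The 50% figure follows from the flux ratio stated for the case study (Galaxy 1 has double the flux of Galaxy 2). Writing F_1 and F_2 for the two true fluxes, symbols introduced here for illustration:

```latex
F_2 = \tfrac{1}{2} F_1
\quad\Longrightarrow\quad
F_1 + F_2 = \tfrac{3}{2} F_1,
\qquad
\frac{(F_1 + F_2) - F_1}{F_1} = 0.5
```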

Prediction on tile 1

23 of 54

Flux reconstruction residual as a function of distance

The flux residual predicted for Galaxy 1 (left) slowly drops to 0 as Galaxy 2 (right) moves away.
Galaxy 2’s flux prediction is initially 0, as BLISS predicts no source in the right tile. As Galaxy 2 moves further into the right tile, its flux is over-predicted due to high blending.

Prediction on tile 1

Prediction on tile 2

24 of 54

Flux reconstruction residual as a function of distance

As the galaxies get farther apart, flux residuals tend to 0.
Flux is under-estimated overall for moderate blending (6-10 pixel separation).

Prediction on tile 1

Prediction on tile 2

25 of 54

Future directions and Conclusions

26 of 54

Future Directions

Flexible probabilistic galaxy model: using a VAE, normalizing flows, or diffusion models.
Tile boundaries: further investigate peaks in probabilities/residuals arising from tile boundaries.
Real data application on SDSS, DES, and (future) LSST data.
Propagate uncertainties to downstream cosmological analysis.

27 of 54

Conclusions

BLISS splits astronomical images into tiles. Each tile is independently input to a neural network that outputs a distribution over light source parameters for that tile.
BLISS can output the number of sources, centroids, classifications, star fluxes, and galaxy fluxes and shapes.
BLISS can accurately detect, classify, and deblend simulated GalSim blends at relatively low SNR and high blendedness.
The case study suggests the probability estimates are reasonable, e.g. detection uncertainty increases in the high-blendedness regime.

Thank you!

28 of 54

Extra slides

29 of 54

BLISS reconstructions of blends at different degrees of separation

BLISS has difficulty reconstructing blends < 4 pixels apart.

Doesn’t preserve total flux for extreme blends.

As separation grows, residual improves, as expected.

30 of 54

BLISS point-estimate classification metrics on 10k blend dataset

Takeaway: BLISS can disambiguate stars and galaxies with high accuracy for sources with SNR > 7.

Precision = # Matched true galaxies classified as galaxies / # Matched sources classified as galaxies

Recall = # Matched true galaxies classified as galaxies / # True matched galaxies

**Matches are detected (>50% detection probability) galaxy/star centroids less than 1 pixel away from a true centroid.

31 of 54

10k Blends Histograms of matched objects

Residuals on measured properties are noisy for highly blended and low-SNR objects, because they are hard to detect and thus to match.

32 of 54

10k Blends measurement residuals on matched true galaxies:

Flux residuals

BLISS measured flux residuals on galaxy+star blends are consistent with 0 for matched objects over 7 SNR and 0.8 Blendedness

33 of 54

10k Blends measurement residuals on matched true galaxies:

Ellipticity residuals

BLISS residuals on ellipticities are consistent with 0 for matched objects (but low SNR and high B regions are noisy).

34 of 54

10k Blends measurement residuals on matched true galaxies:

Ellipticity residuals

BLISS residuals on ellipticities are consistent with 0 for matched objects (but low SNR and high B regions are noisy).

35 of 54

BLISS can capture isolated galaxies

Autoencoder galaxy model residuals are consistent with Gaussian noise for the majority of examples.

Random examples

“Worst” residuals

36 of 54

BLISS can capture isolated galaxy fluxes

Autoencoder galaxy model reasonably captures galaxy fluxes for SNR > 7.

Bias on flux residuals of lower SNR objects reflects choice of realistic flux prior.

37 of 54

BLISS can capture isolated galaxy shapes

Autoencoder model reasonably captures galaxy shapes.

38 of 54

Outline Walkthrough: Prediction

39 of 54

*Note: Tiles are much smaller in actuality.

45 of 54

Statistical Methods

46 of 54

FAVI for detection and classification

47 of 54

FAVI for detection and classification

48 of 54

Autoencoder for galaxy modeling and deblending

49 of 54

Simulated galaxy blend reconstruction

Galaxy

Star

50 of 54

Examples of BLISS output on real data

51 of 54

Fast inference on large survey scenes

Inference on a 1500x2100 SDSS frame (locations, galaxy latents, … ) takes ~20 s on a GPU after training.

52 of 54

Reconstruction of individual real galaxy blends from SDSS

Input

Residual

Preliminary Results

53 of 54

Extensible to large frames: Reconstruction of an SDSS frame

We are able to perform inference with BLISS on a 300x300 pixel chunk of an SDSS frame in ~1 s on a GPU.

Galaxy

Star

Input

Model

Residual

Preliminary Results
