Optimizing the Processing and Storage of Radio Astronomy Data

Scientific Achievement

The Square Kilometer Array (SKA), the world’s largest radio telescope, is expected to collect 710 PB of data per year, placing significant strain on I/O and storage. Using data from a precursor telescope, we investigate lossy compression with the MGARD and ADIOS2 libraries. We demonstrate the resulting improvements in I/O and storage and quantify the impact of lossy-compressed data on science quality.

Significance and Impact

  • Advances in radio telescopes present great opportunities for scientific discovery but also introduce challenges in data management and processing.
  • Our study demonstrates that error-controlled lossy compression can significantly reduce I/O and storage costs while ensuring data accuracy.
  • We quantified the impact of lossy compression and found that the SKA data can be compressed by 15x without impacting the integrity of scientific results.
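
To put the figures above in perspective, the following back-of-the-envelope sketch converts the 710 PB per year and 15x numbers into an annual stored volume and a sustained data rate. It assumes decimal petabytes and, purely for illustration, that the 15x ratio applies uniformly to the full annual volume; the study itself measured compression on calibrated grid data from a precursor telescope, so these numbers are an extrapolation, not a reported result.

```python
# Back-of-the-envelope storage arithmetic (illustrative assumptions only).
PB = 10**15                      # assumption: decimal petabytes
GB = 10**9

annual_bytes = 710 * PB          # anticipated SKA volume per year (from the text)
compression_ratio = 15           # ratio reported for the compressed grid data

# Assumption: the ratio applies uniformly to the whole annual volume.
compressed_pb = annual_bytes / compression_ratio / PB

seconds_per_year = 365 * 24 * 3600
sustained_gb_per_s = annual_bytes / seconds_per_year / GB

print(f"Stored volume at {compression_ratio}x: {compressed_pb:.1f} PB per year")
print(f"Average uncompressed ingest rate: {sustained_gb_per_s:.1f} GB/s")
```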

On the left are images produced by cdeconvoler from grid data compressed by MGARD at different error bounds. On the right are the absolute residuals between images produced from the lossy-compressed data and from the uncompressed equivalent. Errors in data compressed with a relative error bound <1e-4 or an absolute error bound <1e-3 are not detectable: they are less than 1σ of the source spectrum.
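
The distinction between absolute and relative error bounds can be illustrated with a minimal sketch. The code below is not MGARD; it substitutes a plain uniform quantizer, which is the simplest scheme that guarantees a pointwise absolute error bound. The synthetic `grid` signal, the function names, and the convention that a relative bound is scaled by the value range of the data are all assumptions made for illustration.

```python
import math
import random


def relative_to_absolute(values, rel_bound):
    """Convert a relative bound to an absolute one.

    Assumption: 'relative' means relative to the value range of the data.
    """
    return rel_bound * (max(values) - min(values))


def quantize(values, abs_bound):
    """Map each value to the index of the nearest multiple of 2 * abs_bound."""
    step = 2.0 * abs_bound
    return [round(v / step) for v in values]


def dequantize(codes, abs_bound):
    """Reconstruct approximate values from quantization indices."""
    step = 2.0 * abs_bound
    return [c * step for c in codes]


# Synthetic stand-in for one row of gridded data (hypothetical, not SKA data).
random.seed(0)
grid = [math.sin(0.01 * i) + 0.05 * random.gauss(0.0, 1.0) for i in range(10000)]

abs_bound = relative_to_absolute(grid, 1e-4)
restored = dequantize(quantize(grid, abs_bound), abs_bound)

# The absolute residual plays the role of the right-hand panels in the figure.
max_residual = max(abs(a - b) for a, b in zip(grid, restored))
print(f"absolute bound: {abs_bound:.3e}, max residual: {max_residual:.3e}")
```

Rounding to the nearest multiple of twice the bound keeps every reconstructed value within the bound of the original, which is the property an error-controlled compressor promises.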

Technical Approach

  • The raw SKA data was calibrated into a visibility grid, a PSF grid, and a PCF grid, then processed by the cdeconvoler application to generate final images.
  • We compressed the calibrated data using various error bounds and investigated the impact on the galaxy source after the imaging process. We also evaluated the runtime acceleration and storage reduction achieved through compression.
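
The error-bound sweep described above can be sketched in a few lines. This is a toy stand-in, not the MGARD and ADIOS2 pipeline used in the study: it quantizes a synthetic signal at several absolute error bounds, entropy-codes the result with zlib from the Python standard library, and reports the storage reduction relative to the raw 64-bit floats. The signal, the chosen bounds, and the helper name `quantize` are assumptions for illustration.

```python
import array
import math
import random
import zlib


def quantize(values, abs_bound):
    """Map each value to the index of the nearest multiple of 2 * abs_bound."""
    step = 2.0 * abs_bound
    return [round(v / step) for v in values]


# Synthetic stand-in for calibrated grid data (hypothetical, not SKA data).
random.seed(0)
grid = [math.sin(0.01 * i) + 0.05 * random.gauss(0.0, 1.0) for i in range(20000)]
raw_bytes = array.array("d", grid).tobytes()  # uncompressed 64-bit floats

ratios = {}
for abs_bound in (1e-5, 1e-4, 1e-3, 1e-2):
    codes = quantize(grid, abs_bound)
    # Store indices as 64-bit integers, then let zlib remove the redundancy.
    packed = zlib.compress(array.array("q", codes).tobytes(), 9)
    ratios[abs_bound] = len(raw_bytes) / len(packed)
    print(f"abs bound {abs_bound:g}: compression ratio {ratios[abs_bound]:.1f}x")
```

Looser bounds leave fewer distinct quantization levels, so the compressed size shrinks; the scientific question is how loose the bound can be before the imaged source is affected.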

PI(s)/Facility Lead(s): Scott Klasky; Oak Ridge National Laboratory

Collaborating Institutions: University of Western Australia, Pawsey Supercomputing Research Centre

ASCR Program: RAPIDS

ASCR PM: Kalyan Perumalla

Alexander Williamson, et al., “Optimising the Processing and Storage of Radio Astronomy Data,” Cray User Group Conference (CUG 2024)