DRP data flow
Yusra AlSayyad
DF meeting
Feb 11 2025
Vera C. Rubin Observatory | DF Workshop | 11 February 2025
Introduction
I know you just want me to give you a list of all the datasetType names that will be generated during DR1 with a categorization of which will be final, which will be intermediate.
But the DR1 pipeline freeze won’t be until after April 2026.
We don’t know what datasetTypes will exist in April 2026
So, I’m going to focus on the parts that won’t change as much
And show you how to find out the parts that will (the timeline for automating this is LSSTCam commissioning)
And we can check that we’re all on the same page with respect to assumptions
As of April 9 2024
Assuming that the categories of data to be transferred include:
Assumption: Data products are transferred as soon as they are produced.
As of Feb 10 2025
Assumes all DF assignment will be by continuous regions of tract
Example: https://github.com/lsst/drp_pipe/blob/main/pipelines/LSSTComCam/DRP.yaml as of w_2025_06
a sharding step
A “Stage”, a.k.a. “checkpoint step”
subset is the middleware name for a collection of Tasks
“Stages” are a conceptual collection of “steps”
Stages for humans:
Steps for robots
1-initial
2-recalibration
3-coadds
4-revisit
The plan for implementing this (DM-47320) lives on this Slack Canvas:
Each “stage” a.k.a. “checkpoint step” also includes any analysis tasks that are necessary to validate that step's outputs. Draft through stage 3 on ticket branch.
Steps will proliferate. Stages will consolidate.
Steps will be re-named as components of their stage. For example:
For a small dataset, middleware will handle launching the whole stage.
At full-DRP scale, which requires sharding (“groups”), cm-service will handle launching the whole stage.
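The launch rule above can be sketched as a small dispatch function. This is an illustration only: the `StageRun` class, the `n_groups` field, and the "more than one group means sharded" criterion are assumptions made for the example, not part of middleware or cm-service.

```python
from dataclasses import dataclass


@dataclass
class StageRun:
    """Hypothetical description of one stage to be launched."""
    stage: str      # stage name, e.g. "1-initial"
    n_groups: int   # number of shards ("groups"); 1 means no sharding


def launcher_for(run: StageRun) -> str:
    """Return the name of the system that launches the whole stage.

    Assumption: a run counts as "full-DRP scale" exactly when it has
    been split into more than one group.
    """
    if run.n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    return "cm-service" if run.n_groups > 1 else "middleware"


if __name__ == "__main__":
    for run in (StageRun("1-initial", 1), StageRun("3-coadds", 40)):
        print(run.stage, "->", launcher_for(run))
```

Either way the unit being launched is the whole stage; only the launcher changes with scale.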
Stage 0: Calibration Production
Inputs: flats/darks/bias
Output: combined flats/darks/biases
Calibration collections are going to evolve rapidly while we’re on sky
Should be possible to process a final set for DRPs at the USDF in < 2 weeks
Example transfers today: outputs (along with skymaps and refcats) are distributed to UKDF and FrDF for stage 1. This also includes things like skyFrames (for SkyCorr), fgcmLookUpTable, real-bogus models, and photo-z models.
Stage 1-initial: Single Frame Processing
Short term sharding steps:
1-initial:
Stage 2-recalibration: Global Recalibration
The last step in stage 1 is a global makeCcdVisitTable/makeVisitTable (which takes visitSummary today).
The first step in stage 2 is a global FGCM step, which will be run at the USDF.
Example Transfers Today:
Short term sharding steps:
Pause between stage 2 and stage 3 to check on the quality of the calibrations using the recalibrated preSources (matchedVisit and maybe visit-level metrics)
FGCM plots are already at USDF
matchedVisit plots/metrics need to be copied back.
Use the astrometric/photometric calibrations and final PSFs to do a pilot run of stage 3 and beyond with these calibrations and the new pipeline candidate.
We expect to be fixing bugs in the pipelines that affect stage 3 while e.g. stage 1 is running.
Stage 3-coadds: Coaddition and Coadd Processing
Input:
Output: Coadds, Diffim Templates, Object Tables (today those are spelled deepCoadd, goodSeeingCoadd, objectTable_tract)
Pause to assess Objects, Coadds, and Templates
Example transfers today:
Send data back to USDF for global aggregation:
Stage 4-revisit: Difference Imaging Analysis (DIA)
Input: images, finalVisitSummaries, diffIm templates, Objects
Output: DIASource, DIAObject, and ForcedSource Tables, PVIs, image differences.
Categories of Tasks include:
“Stages” are a conceptual collection of “steps”
Stages for humans:
Steps for robots
Some information about the inputs and outputs of steps is available now
Docs on working with pipeline graphs
$ pipetask build -p $DRP_PIPE_DIR/pipelines/LSSTComCam/DRP.yaml#step3a --show pipeline-graph
$ pipetask build -p $DRP_PIPE_DIR/pipelines/LSSTComCam/DRP.yaml#step3a --pipeline-mermaid
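To inspect several steps in one go, the first command above can be generated per step. The sketch below only builds the command strings and does not run them; the helper name `graph_command` is hypothetical, and any step label other than `step3a` is an assumption about what the pipeline file defines.

```python
from typing import Iterable, List

# Pipeline path as written in the commands above; $DRP_PIPE_DIR is left
# unexpanded so the shell resolves it when the command is run.
PIPELINE = "$DRP_PIPE_DIR/pipelines/LSSTComCam/DRP.yaml"


def graph_command(step: str) -> str:
    """Build the `pipetask build ... --show pipeline-graph` command for one step."""
    if not step or "#" in step or " " in step:
        raise ValueError(f"unexpected step label: {step!r}")
    return f"pipetask build -p {PIPELINE}#{step} --show pipeline-graph"


def graph_commands(steps: Iterable[str]) -> List[str]:
    """Build one command per step label, preserving order."""
    return [graph_command(step) for step in steps]


if __name__ == "__main__":
    # "step3a" appears in the slides; check other labels against the YAML.
    for command in graph_commands(["step3a"]):
        print(command)
```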
Pre-rendered, full versions of the pipeline graphs are available at tigress-web.princeton.edu/~lkelvin/pipelines
For example: HSC DRP-RC2, step1
Currently, we manually track dataset type retention info: which datasetTypes fulfill which final data products
For DP1 this mapping is in progress on:
DM-47725 - needed for LSSTCam scale, so expect March.
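As a rough picture of what such a retention mapping could look like once it is machine readable, here is a minimal sketch. The three product-to-datasetType pairs follow the order given on the stage 3 slide (“today those are spelled ...”) and are an assumption about that pairing; the dictionary layout, the `classify` helper, and the name `hypotheticalIntermediate` are invented for illustration and do not reflect the DM-47725 implementation.

```python
from typing import Dict, Iterable, List

# Final data product -> datasetTypes that fulfill it today.
# Assumed pairing, read off the stage 3 slide in the order listed there.
FINAL_PRODUCTS: Dict[str, List[str]] = {
    "Coadds": ["deepCoadd"],
    "Diffim Templates": ["goodSeeingCoadd"],
    "Object Tables": ["objectTable_tract"],
}


def classify(dataset_types: Iterable[str]) -> Dict[str, str]:
    """Label each datasetType as "final" or "intermediate".

    Anything not listed under a final data product is treated as
    intermediate, which is a simplification for this sketch.
    """
    final = {dt for dts in FINAL_PRODUCTS.values() for dt in dts}
    return {
        dt: ("final" if dt in final else "intermediate")
        for dt in dataset_types
    }


if __name__ == "__main__":
    # "hypotheticalIntermediate" is a made-up datasetType name.
    print(classify(["deepCoadd", "hypotheticalIntermediate"]))
```

Because the designation changes quickly, keeping it in one table like this means a rename only touches the mapping, not the code that decides what to retain or transfer.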
Sharding-dimension information can be stored with pipelines now, but hasn’t been yet
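To make the idea concrete, the sketch below shows how a per-step sharding dimension could drive grouping once it is recorded. Everything here is hypothetical: the `SHARDING_DIMENSION` table, the dimension assigned to each step, and the plain-dict data IDs are assumptions for illustration, not the format in which pipelines store this information.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Mapping

# Hypothetical per-step sharding dimensions; the real values have not
# been recorded with the pipelines yet.
SHARDING_DIMENSION: Dict[str, str] = {
    "step1": "visit",
    "step3a": "tract",
}

DataId = Mapping[str, Hashable]


def shard(step: str, data_ids: Iterable[DataId]) -> Dict[Hashable, List[DataId]]:
    """Group data IDs by the sharding dimension declared for `step`."""
    dimension = SHARDING_DIMENSION[step]
    groups: Dict[Hashable, List[DataId]] = defaultdict(list)
    for data_id in data_ids:
        groups[data_id[dimension]].append(data_id)
    return dict(groups)


if __name__ == "__main__":
    ids = [
        {"tract": 1, "patch": 0},
        {"tract": 1, "patch": 1},
        {"tract": 2, "patch": 0},
    ]
    for key, members in shard("step3a", ids).items():
        print(key, len(members))
```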
In summary
Whether a dataset type is designated as intermediate or final is information that changes quickly.
Much of that information (what inputs are needed for a step) is machine readable now. Well before DR1, all of it (including whether objectTable_tract is the Object Table) will be machine readable.
Implementation is still underway and requests welcome.
1-initial
2-recalibration
3-coadds
4-revisit
Appendix
DR3 Processing
DR2 Processing
DR2 Release
DR3 Preview