1 of 19

The CADC, CANFAR and an LSST IDAC

2 of 19

CADC: Canadian Astronomy Data Centre

  • The CADC archives and distributes data from all Canadian telescopes, including JWST, HST, and NEOSSat in space, and CFHT and Gemini on the ground
  • 216 telescopes and instruments
  • Current holdings:
    • 2.3 PB
    • 300 million files
  • Annual downloads:
    • 100 million files
    • 4.9 PB
  • Users are 15% Canadian, 85% international
  • Upcoming datasets
    • SKA (drives the majority of our expansion)
    • Euclid
    • LSST
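
As a rough cross-check of the holdings and download figures above, average file sizes can be derived from the totals. This is a sketch that assumes decimal units (1 PB = 10^15 bytes), which the slide does not specify; the variable names are illustrative.

```python
PB = 10**15  # assumption: decimal petabytes
MB = 10**6   # assumption: decimal megabytes

holdings_bytes = 2.3 * PB
holdings_files = 300e6
downloads_bytes = 4.9 * PB
downloads_files = 100e6

# Mean size of a file in the archive and of a file served to users
avg_held_mb = holdings_bytes / holdings_files / MB
avg_downloaded_mb = downloads_bytes / downloads_files / MB

# How many times the current holdings volume is downloaded per year
turnover = downloads_bytes / holdings_bytes

print(f"Average archived file:       {avg_held_mb:.1f} MB")
print(f"Average downloaded file:     {avg_downloaded_mb:.1f} MB")
print(f"Annual downloads / holdings: {turnover:.1f}x")
```

Under these assumptions, the annual download volume is roughly twice the size of the current holdings.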

NATIONAL RESEARCH COUNCIL CANADA

3 of 19

CANFAR Science Platform

  • Originally developed for ALMA processing
  • Rapidly adopted by other fields
  • Steady growth in users
  • May be adopted as a requirement for SKA Regional Centres (SRCs)
  • Portable:
    • Several other SRCs have stood up instances
    • A second instance will be stood up at the University of Waterloo
  • Currently 668 users
  • ~150 active users at any given time
  • ~60% of Canadian astronomers are users

4 of 19

Demo of container launching

The following slides are a fallback in case connectivity is a concern

5 of 19

Using CANFAR

  • Pick a session type

6 of 19

Using CANFAR

  • Pick a session type
  • Pick a software container (organized by project)

8 of 19

Using CANFAR

  • Pick a session type
  • Pick a software container (organized by project)
  • Pick resources (cores/memory)
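
The three choices above can be captured in a small request object. This is a minimal sketch: the class, its field names, the session type names, and the example image path are all hypothetical and are not the CANFAR API.

```python
from dataclasses import dataclass

# Assumption: session types named after the notebook, virtual desktop,
# and headless sessions mentioned in this talk.
SESSION_TYPES = ("notebook", "desktop", "headless")


@dataclass(frozen=True)
class SessionRequest:
    session_type: str
    image: str          # software container, written as "<project>/<name>:<tag>"
    cores: int = 2
    memory_gb: int = 8

    def __post_init__(self):
        # Reject requests that could not correspond to a valid selection
        if self.session_type not in SESSION_TYPES:
            raise ValueError(f"unknown session type: {self.session_type!r}")
        if "/" not in self.image:
            raise ValueError("image should be given as '<project>/<name>'")
        if self.cores < 1 or self.memory_gb < 1:
            raise ValueError("cores and memory_gb must be positive")

    @property
    def project(self) -> str:
        # Containers are organized by project, so the project is the image prefix
        return self.image.split("/", 1)[0]


request = SessionRequest("notebook", "myproject/analysis:latest", cores=4, memory_gb=16)
print(request.project, request.cores, request.memory_gb)
```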

10 of 19

Notebooks

11 of 19

Virtual desktop

12 of 19

Virtual desktop: Software runs in terminals with their own resources

16 of 19

Batch processing with headless jobs

Currently:

  • Job submission through a RESTful API
  • Has been integrated with the LSSTpipe graph system by at least one user
  • Users must be careful not to overload the system
  • Very successful, but relies heavily on users being well-trained
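
One way a user can avoid overloading the system is to cap the number of jobs they have active at once. The sketch below builds a POST request per job and submits them behind a simple client-side throttle. The URL, form field names, and token handling are placeholders rather than the real CANFAR API, and `FakeCluster` is a stand-in so the example runs without any network access.

```python
import time
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder URL for illustration only; not the real CANFAR endpoint.
SESSION_URL = "https://canfar.example/session"


def build_job_request(name, image, cmd, cores=1, ram_gb=4, token="<token>"):
    """Build, but do not send, a POST request describing one headless job."""
    form = {"name": name, "image": image, "type": "headless",
            "cmd": cmd, "cores": cores, "ram": ram_gb}  # hypothetical field names
    return Request(SESSION_URL, data=urlencode(form).encode("ascii"),
                   method="POST", headers={"Authorization": f"Bearer {token}"})


def submit_throttled(requests, send, count_running, max_running=10,
                     poll_seconds=30.0, sleep=time.sleep):
    """Send requests one at a time, never exceeding max_running active jobs."""
    sent = 0
    for request in requests:
        # Wait until the number of active jobs drops below the cap
        while count_running() >= max_running:
            sleep(poll_seconds)
        send(request)
        sent += 1
    return sent


class FakeCluster:
    """Stand-in for the real service: each wait finishes one running job."""

    def __init__(self):
        self.running = 0
        self.peak = 0

    def send(self, request):
        self.running += 1
        self.peak = max(self.peak, self.running)

    def count_running(self):
        return self.running

    def tick(self, _seconds):
        self.running = max(0, self.running - 1)


cluster = FakeCluster()
jobs = [build_job_request(f"job-{i}", "myproject/pipeline:latest", "run.sh")
        for i in range(25)]
sent = submit_throttled(jobs, cluster.send, cluster.count_running,
                        max_running=5, sleep=cluster.tick)
print(f"submitted {sent} jobs, peak concurrency {cluster.peak}")
```

In real use, `send` and `count_running` would wrap calls to the service, and `sleep` would be left at its default.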

Near future (currently in beta; production release expected within the next 6 weeks):

  • Job submission with a Python command-line client
  • User fairness enforced with Kueue
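
To illustrate what enforced fairness means for a shared queue, the toy scheduler below always admits the next job from the user with the fewest jobs admitted so far. It sketches the general fair-share idea only; it is not Kueue's algorithm or configuration, and every name in it is hypothetical.

```python
from collections import defaultdict, deque


def fair_order(submissions):
    """Reorder (user, job) pairs so that users take turns.

    Jobs keep their per-user order. At each step the user with the fewest
    jobs admitted so far goes next; ties go to whoever submitted first.
    """
    queues = defaultdict(deque)
    for user, job in submissions:
        queues[user].append(job)  # dict order records first submission per user

    admitted = defaultdict(int)
    order = []
    while any(queues.values()):
        waiting = [user for user in queues if queues[user]]
        # min() returns the first minimal element, which implements the tie-break
        user = min(waiting, key=lambda u: admitted[u])
        order.append((user, queues[user].popleft()))
        admitted[user] += 1
    return order


# One heavy user and two light users submit in a burst
submissions = [("alice", f"a{i}") for i in range(4)] + [("bob", "b0"), ("carol", "c0")]
for user, job in fair_order(submissions):
    print(user, job)
```

With first-come-first-served ordering, bob and carol would wait behind all four of alice's jobs; here each of them is admitted after alice's first job.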

17 of 19

User storage

Cavern / arc:

  • Suitable for processing
  • Each user gets a small home directory
  • Project directory on request, quota can be expanded
  • Persistent and stable, but not suitable for archival storage
  • Looks like a POSIX file system
  • Backed by CephFS

Vault / vos:

  • Suitable for data publication and long-term curation
  • Data mirrored at two sites
  • Backed by Ceph Object Store
  • Available by request
  • Self-serve DOIs also available

Both:

  • Accessible via
    • VOSpace API
    • Python Client
    • Web UI
  • Fine-grained access control
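
The split between the two storage services can be expressed as a small lookup. This is a sketch only: it assumes the `arc:` and `vos:` prefixes in the headings above act as URI schemes, and the table and function names are hypothetical rather than part of any CADC client.

```python
# Properties as listed on this slide; the scheme-to-service mapping is an assumption.
SERVICES = {
    "arc": {"name": "Cavern", "backend": "CephFS",
            "use": "processing"},
    "vos": {"name": "Vault", "backend": "Ceph Object Store",
            "use": "data publication and long-term curation"},
}


def describe(uri):
    """Return a one-line summary of which service a storage URI points at."""
    scheme, separator, path = uri.partition(":")
    if not separator or scheme not in SERVICES:
        raise ValueError(f"not an arc: or vos: URI: {uri!r}")
    service = SERVICES[scheme]
    return f"{path} -> {service['name']} ({service['backend']}), for {service['use']}"


# Hypothetical paths, for illustration
print(describe("arc:projects/myproject/scratch/run01"))
print(describe("vos:myproject/published/catalog.fits"))
```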

18 of 19

CADC/CANFAR and LSST

  • The CADC is building an IDAC-Lite
  • DP1 and DP2 will be available through CANFAR
  • For the data releases (DRs) we will host:
    • Coadded images
      • 2.7 PB
      • Retrievable via the existing set of CADC tools
      • Implementing Rubin tools (Data Butler)
    • ObjectLite catalog
      • Building out expanded database capacity
      • Adopting Rubin’s Firefly database interface
    • Working to implement HATS
    • Providing 3000 cores for user processing
    • Providing 2 PB of user storage
  • One set of resources for Rubin data-rights holders
  • One set of resources for the wider astronomical community when the data becomes public

19 of 19

Summary

The CANFAR Science Platform

  • Colocated with numerous other datasets accessible through the same tools
  • Notebooks
  • Browser-based VNC desktops
  • Data visualization
  • Ability to share containers (key!)
  • Batch processing available (key!)
  • Portable; currently there are multiple instances running in multiple countries (key!)
  • ~60% of Canadian astronomers are users
  • Will provide 3000 cores / 2 PB of user storage to Rubin data-rights holders
  • Will provide worldwide access when the data becomes public
  • Access for Rubin data-rights holders is available:
    • By the time of DR1, your Rubin login credentials will work on CANFAR
    • For now, please send an e-mail to support@canfar.net
