Data Management
Julian Borrill
Interim Data Management L2 Scientist
Range
Scope
Work Breakdown Structure
*interim
Design Drivers - Data Rates
Data rates set site bandwidth and local storage requirements.
Design: network bandwidth from Chile is sufficient for the required 1.1 Gbps; network bandwidth from the South Pole is insufficient for the required 0.7 Gbps.
Design: 1 month backup (382 TB) in Chile; 1 year backup (5.4 PB) at South Pole.
TELESCOPES | DETECTORS | SAMPLING FREQUENCY (Hz) | RAW DATA RATE (Samples/Sec) | COMPRESSED DATA RATE (Gbps) |
CHLAT | 243,520 | 400 | 9.7 x 10^7 | 1.09 |
SPLAT | 114,432 | 400 | 4.6 x 10^7 | 0.51 |
SAT | 153,232 | 100 | 1.5 x 10^7 | 0.17 |
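The raw rates in the table are the detector count multiplied by the sampling frequency, and the compressed rates convert directly into daily volumes and per-site link requirements. The following is a minimal Python sketch of that arithmetic, not project software. It assumes decimal units (1 TB = 10^12 bytes), an 86,400-second day, and that SPLAT and SAT both transmit from the South Pole; the function and variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400

# (detectors, sampling frequency in Hz, compressed rate in Gbps), copied from the table
TELESCOPES = {
    "CHLAT": (243_520, 400, 1.09),
    "SPLAT": (114_432, 400, 0.51),
    "SAT": (153_232, 100, 0.17),
}

# Assumed siting: CHLAT in Chile, SPLAT and SAT at the South Pole
SITES = {"Chile": ["CHLAT"], "South Pole": ["SPLAT", "SAT"]}


def raw_sample_rate(detectors, sampling_hz):
    """Raw samples per second: one sample per detector per sampling tick."""
    return detectors * sampling_hz


def daily_volume_tb(compressed_gbps):
    """Daily compressed volume in TB for a sustained rate given in Gbps."""
    bytes_per_second = compressed_gbps * 1e9 / 8
    return bytes_per_second * SECONDS_PER_DAY / 1e12


def site_rate_gbps(site):
    """Sum of the compressed rates of all telescopes assumed to be at a site."""
    return sum(TELESCOPES[name][2] for name in SITES[site])


for name, (ndet, hz, gbps) in TELESCOPES.items():
    print(f"{name}: {raw_sample_rate(ndet, hz):.2e} samples/s, "
          f"{daily_volume_tb(gbps):.1f} TB/day")

for site in SITES:
    print(f"{site}: {site_rate_gbps(site):.2f} Gbps")
```

Under these assumptions the site totals come out at 1.09 Gbps for Chile and 0.68 Gbps for the South Pole, in line with the 1.1 and 0.7 Gbps design figures. The SAT daily volume computes to about 1.8 TB rather than the 1.9 TB quoted in the data volume table, which is presumably a rounding effect in the quoted compressed rate.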
Design Drivers - Data Volumes
Data volumes set US data center storage requirements.
Design: 1 year of raw data + 7 years of science data (17 PB) spinning at each data center.
Design: 7 years of raw data (49 PB) archived at each data center.
TELESCOPES | DAILY DATA (TB) | SPINNING DATA (PB) | ARCHIVAL DATA (PB) |
CHLAT | 11.8 | 4.3 | 30.1 |
SPLAT | 5.5 | 2.0 | 14.2 |
SAT | 1.9 | 0.7 | 4.7 |
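The spinning and archival columns appear consistent with one year and seven years of the daily raw volume, respectively. The sketch below checks that reading. It assumes 365-day years and 1 PB = 1000 TB, and the names are illustrative rather than taken from any project code.

```python
DAYS_PER_YEAR = 365

# Daily raw data volume in TB, copied from the table
DAILY_TB = {"CHLAT": 11.8, "SPLAT": 5.5, "SAT": 1.9}


def volume_pb(daily_tb, years):
    """Accumulated volume in PB after the given number of years."""
    return daily_tb * DAYS_PER_YEAR * years / 1000


spinning = {name: volume_pb(tb, 1) for name, tb in DAILY_TB.items()}
archival = {name: volume_pb(tb, 7) for name, tb in DAILY_TB.items()}

for name in DAILY_TB:
    print(f"{name}: spinning {spinning[name]:.1f} PB, archival {archival[name]:.1f} PB")

print(f"Total spinning raw data: {sum(spinning.values()):.1f} PB")
print(f"Total archival raw data: {sum(archival.values()):.1f} PB")
```

Because the daily inputs are already rounded, individual entries can differ from the table by up to about 0.2 PB, but the archival total comes to roughly 49 PB, matching the design figure. The one-year raw total is about 7 PB, which would leave roughly 10 PB of the 17 PB spinning allocation for science data if that reading of the design line is correct.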
Design Drivers - Daily Data Processing
Daily data processing drives the fast-access computational requirements
Design: CHLAT processing in the US; SPLAT+SAT processing at South Pole
TELESCOPES | CYCLES (TFLOP) | PEAK MEMORY (TB) | SINGLE DAILY MAP DATA (GB) | TOTAL DAILY MAP DATA (TB) |
CHLAT | 0.84 | 11.5 | 54 | 138 |
SPLAT | 0.40 | 5.7 | 4.5 | 12 |
SAT | 0.13 | 1.2 | 0.02 | 0.05 |
Design Drivers - Bulk Data Processing
Simulating and reducing the entire dataset requires:
Design: bulk computational resources must be allocated at national computing centers
Design: balance having enough centers to accommodate downtimes and support diverse approaches (high-performance and high-throughput computing) against the per-center cost of maintaining and optimizing the software stack.
PBD Hardware Schematic
Allocated computational resources are planned, not confirmed.
PBD Software Schematic
Interfaces
All interfaces must be documented:
Data Challenges
Data Challenge Schedule/Process
Parallel Session
Presentation by each L3 team
Note also the “Design Validation: Technical to Measurement” theme, which will focus on Data Challenge 1 and on validating the Preliminary Baseline Design.
Open Questions