Data Platform of the Future
November 12, 2021
Rupert Berk, UWash
Satya Kunta, NYU
Ken Taylor, UIUC
Ashish Pandit, UCSD
Agenda
Shifting Technical Capabilities for Data Platforms
Event streams support real-time and near-real-time reporting and alerting, as well as incremental ETL in small batches.
HTTP APIs speed development, provide fast access, and enable efficient governance.
The rapid growth in AI/ML offerings from cloud service providers promises predictive and even prescriptive analytics, in addition to traditional descriptive and diagnostic analytics.
Data lakes promise faster experimentation and innovation by opening raw data to direct analysis by analysts and data scientists.
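The "incremental ETL (small batches)" idea in the first point can be sketched as follows. This is a minimal illustration under stated assumptions, not any presenter's implementation: the event shape (`id`, `value`), the in-memory `warehouse` dict, and the helper names `micro_batches` and `load_increment` are all hypothetical.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

Event = Dict[str, object]


def micro_batches(events: Iterable[Event], batch_size: int) -> Iterator[List[Event]]:
    """Group an event stream into small batches of at most batch_size events."""
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load_increment(target: Dict[str, object], batch: List[Event]) -> None:
    """Apply one small batch to the target as upserts keyed by event id."""
    for event in batch:
        target[str(event["id"])] = event["value"]


# Hypothetical stream: the second event for "a" supersedes the first.
events = [
    {"id": "a", "value": 1},
    {"id": "b", "value": 2},
    {"id": "a", "value": 3},
]

warehouse: Dict[str, object] = {}  # stand-in for a real target table
batch_sizes = []
for batch in micro_batches(events, batch_size=2):
    load_increment(warehouse, batch)
    batch_sizes.append(len(batch))
```

Each batch touches only the keys that changed, which is what separates this pattern from a full nightly reload.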
Modeling: Data Warehouse -> Data Lake
Persistence: RDBMS -> Object Storage, Document, Graph, In-Memory
Transformation: ETL -> ELT
Increments: Batch -> Events & Streams
Coding: Traditional Programming -> Machine Learning
Interfaces: Files & SQL -> APIs
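The ETL and ELT entries differ in where the transformation runs. A minimal sketch using Python's built-in `sqlite3` module is given here; the table names (`etl_target`, `raw_landing`, `elt_target`) and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical source rows: amounts arrive as untidy strings.
raw_rows = [("2021-11-12", " 42 "), ("2021-11-13", "17")]


def etl(conn: sqlite3.Connection, rows) -> None:
    """ETL: transform in application code first, then load the clean result."""
    cleaned = [(day, int(amount.strip())) for day, amount in rows]
    conn.execute("CREATE TABLE etl_target (day TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO etl_target VALUES (?, ?)", cleaned)


def elt(conn: sqlite3.Connection, rows) -> None:
    """ELT: load the raw data as-is, then transform inside the data store."""
    conn.execute("CREATE TABLE raw_landing (day TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_landing VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE elt_target AS "
        "SELECT day, CAST(TRIM(amount) AS INTEGER) AS amount FROM raw_landing"
    )


conn = sqlite3.connect(":memory:")
etl(conn, raw_rows)
elt(conn, raw_rows)

etl_result = conn.execute("SELECT day, amount FROM etl_target ORDER BY day").fetchall()
elt_result = conn.execute("SELECT day, amount FROM elt_target ORDER BY day").fetchall()
```

Both paths produce the same target rows. In the ELT path the untransformed data also stays available in `raw_landing` for later reprocessing.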
Towards a Unified Data Platform
Event Broker
Data Sources
Batch-Driven Apps
Event-Driven Apps
Data Persistence Service
Raw Zone
Curated Zone
Usage Optimized Zone
Stream Processing
Batch Processing
ML
ML Inference
Query API
Stream Query API
User Interfaces
Data Science Workbench
Data Visualization
Dashboards
CDC
Database
File
Event
IoT
Adapted from Trivadis Blueprint for Modern Data Platform (v4), Guido Schmutz
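The flow through the Raw, Curated, and Usage Optimized zones can be pictured with a small sketch. It is only an illustration of the zoning idea, not the Trivadis blueprint's implementation: the record fields (`student_id`, `credits`) and the function names are hypothetical, and plain Python lists stand in for the persistence service.

```python
from collections import defaultdict
from typing import Dict, List


def to_curated(raw_records: List[dict]) -> List[dict]:
    """Raw -> Curated: drop unusable records and normalize types."""
    curated = []
    for rec in raw_records:
        if rec.get("student_id") is None:
            continue  # skip records missing the key field
        curated.append(
            {
                "student_id": str(rec["student_id"]).strip(),
                "credits": int(rec["credits"]),
            }
        )
    return curated


def to_usage_optimized(curated: List[dict]) -> Dict[str, int]:
    """Curated -> Usage Optimized: pre-aggregate for a specific consumer."""
    totals: Dict[str, int] = defaultdict(int)
    for rec in curated:
        totals[rec["student_id"]] += rec["credits"]
    return dict(totals)


# Hypothetical raw zone contents, kept exactly as received from the source.
raw_zone = [
    {"student_id": " s1 ", "credits": "3"},
    {"student_id": "s2", "credits": "4"},
    {"student_id": None, "credits": "1"},
    {"student_id": "s1", "credits": "2"},
]

curated_zone = to_curated(raw_zone)
usage_zone = to_usage_optimized(curated_zone)
```

The raw zone is never modified, so the curated and usage-optimized zones can be rebuilt from it whenever the rules change.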
UCSD’s Data Analytics Platform
Hierarchy manager
<- Hierarchy slot ID + [attributes]
Hierarchy slot attributes ->
Curated views (CVs)
Machine learning platform (MLP)
Stream in ->
<- Message out
<- Model development ->
Source systems/devices
or
Base Views
Intermediate Viewlets
Curated Views
Final Curated Views
Curated Views (CVs)
iPaaS
Activity table (pile file)
<- Message out
Stream in ->
Activity Hub architecture
NYU’s Data Analytics Platform
Industry Tipping Points
Infinite Compute & Storage
Machine Learning/AI
FaaS (Function as a Service)/Serverless
Real-Time/Streaming
Velocity
Volume
Variety
The 3 V’s of Future Data Platforms
Conceptual Architecture
High Level Technical Architecture
Data Governance
Data Classification and Account Organizational Units
Discussion