Advances in High Performance Computing and Deep Learning: Data Engineering and Data Science
Digital Science Center
Interesting Changes in Fields and Communities
Remarks on the Convergence of Big Data, Simulation, and HPC
HPCforML: Similar Challenges in Parallelism for Big Data and Simulation. Complexity of Synchronization and Parallelization
[Figure: problem classes, ordered by increasing data and coupling complexity between simulations and Big Data]
- Pleasingly Parallel: often independent events; MapReduce as in scalable databases (the current major Big Data category); straightforward parallelism such as parameter-sweep simulations; loosely coupled, with user-performed parallelism; runs on commodity clouds.
- Structured Adaptive Sparse: regular simulations; regular coupling.
- Global Machine Learning (e.g. parallel clustering) and Deep Learning: linear algebra at the core (often not sparse); needs HPC clouds with accelerators and high-performance interconnects.
- Unstructured Adaptive Sparse: graph analytics (e.g. subgraph mining) and LDA; complex coupling, where memory access is also critical; needs HPC clouds or supercomputers.
[Figure: "ML Code" is one small box in a much larger system, from "Hidden Technical Debt in Machine Learning Systems", NIPS 2015: http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf]
This well-known paper points out that parallel high-performance machine learning is perhaps the most fun part, but only one part, of the system; we need to integrate the other data and orchestration components.
This integration is neither good nor easy today, partly because data management systems like Spark are JVM-based, which does not link cleanly to the C++/Python world of high-performance ML.
Twister2 and Cylon at IU address this.
HPCforML: Integration Challenges
High Performance Computing and Deep Learning
12/7/2019
Let’s look at ML for HPC
Between the INPUT and OUTPUT of a simulation, ML can be applied in three broad ways:
1. Improving simulation with configurations and integration of data
   1.1 MLAutotuningHPC: learn configurations
   1.2 MLAutotuningHPC: learn models from data
   1.3 MLaroundHPC: learning model details (ML-based data assimilation)
2. Learn structure, theory, and model for simulation
   2.1 MLAutotuningHPC: smart ensembles
   2.2 MLaroundHPC: learning model details (coarse graining, effective potentials)
   2.3 MLaroundHPC: improve model or theory
3. Learn surrogates for simulation
   3.1 MLaroundHPC: learning outputs from inputs (parameters)
   3.2 MLaroundHPC: learning outputs from inputs (fields)
Examples of ML for HPC (work with JCS Kadupitiya, Vikram Jadhao)
Speedup → 10^6 as Nlookup → ∞
The Learning Net
Direct simulation compared to Surrogates
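The MLaroundHPC surrogate pattern above (run the expensive simulation to generate training data, learn the input-to-output map, then answer new queries from the learned model) can be sketched minimally as follows. The one-dimensional "simulation" and the polynomial surrogate are stand-ins for illustration; the real work uses molecular dynamics codes and deep networks.

```python
import numpy as np

# Toy "simulation": an expensive function of input parameters (a stand-in;
# the real MLaroundHPC case would call an MD or PDE solver here).
def simulate(x):
    return np.sin(3 * x) + 0.5 * x**2

# Step 1: run the expensive simulation on a modest set of training inputs.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, 200)
y_train = simulate(x_train)

# Step 2: fit a cheap surrogate (a degree-7 polynomial via least squares;
# the slides use deep networks, but the pattern is identical).
coeffs = np.polyfit(x_train, y_train, deg=7)
surrogate = np.poly1d(coeffs)

# Step 3: answer new queries from the surrogate instead of the simulator.
x_new = np.linspace(-1, 1, 50)
err = np.max(np.abs(surrogate(x_new) - simulate(x_new)))
print(f"max surrogate error on new inputs: {err:.4f}")
```

The speedup comes from step 3: every surrogate evaluation avoids one full simulation, so the effective acceleration grows with the number of lookups.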
Up to two billion times acceleration of scientific simulations with deep neural architecture search
Insilico Medicine Used Creative AI to Design Potential Drugs in Just 21 Days
September 4, 2019 news item
Operator Formulation of Deep Learning Inference
Learn Newton’s laws with Recurrent Neural Networks
[Plot: RNN error² stays small up to step size dT = 4 and total time 10^6, compared with Verlet error² at dT = 0.01 and 0.1. The RNN time steps are 4000 times longer, and the network also learns the potential.]
JCS Kadupitiya, Vikram Jadhao
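For reference, the classical baseline in the comparison above is Verlet integration. This is a minimal velocity Verlet sketch on a harmonic oscillator; the oscillator and parameters are illustrative assumptions, not the potentials from the study.

```python
import numpy as np

# Velocity Verlet: the standard symplectic integrator that the RNN
# surrogate is compared against. 1D harmonic oscillator as a toy system.
def velocity_verlet(x0, v0, force, dt, n_steps):
    x, v = x0, v0
    a = force(x)
    traj = [x]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt**2   # position update
        a_new = force(x)                   # force at new position
        v = v + 0.5 * (a + a_new) * dt     # velocity update (averaged force)
        a = a_new
        traj.append(x)
    return np.array(traj)

# Harmonic force F = -k x with k = m = 1, so exactly x(t) = cos(t).
traj = velocity_verlet(1.0, 0.0, lambda x: -x, dt=0.01, n_steps=1000)
exact = np.cos(0.01 * np.arange(1001))
print(f"max |x - cos(t)| on [0, 10]: {np.max(np.abs(traj - exact)):.2e}")
```

The integrator's global error grows with dT², which is why its accuracy collapses at the large step sizes where the trained RNN still tracks the dynamics.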
Results on different potentials (one particle)
Classic Simulation Error
Multiple versions of MLforHPC used in Simulating Biological Organisms (with James Glazier @ IU)
Learning Model (Agent) Behavior: replace components by learned surrogates (Reaction Kinetics coupled ODEs)
Dynamic Data Assimilation
Theory to Instance
Smart Ensembles
All steps use MLAutotuning
Futures of ML for HPC
Looking at Covid Distributions
Time Series Represented by Deep Learning
Basic Spatial (bag) Time Series
[Figure: basic spatial (bag) time series. Axes are space x (different data sources, not necessarily nearby) and time t; the data analysis unit is the time sequence at one space point. Input properties are static (e.g. %Seniors) or dynamic (e.g. Covid cases per day). Tasks: predict now; forecast the future (any number of time units, any number of properties); or a seq2seq map, as in English to French or rainfall to runoff. For Natural Language Processing, the space points are different paragraphs or books, with a few sentences at each point; earthquake points are nearby.]
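Preparing a spatial bag of time series for such a forecaster reduces to sliding-window pair construction per space point. This sketch uses synthetic data with the city and day counts from the slides; the window lengths and static feature are illustrative assumptions.

```python
import numpy as np

# Build seq2seq training pairs from a "spatial bag" of time series:
# one dynamic series per space point (e.g. daily Covid cases per city)
# plus static per-point features (e.g. %Seniors). Synthetic data only.
n_points, n_days = 314, 205                  # cities x days, as in the slides
rng = np.random.default_rng(1)
dynamic = rng.poisson(20, size=(n_points, n_days)).astype(float)
static = rng.uniform(0, 1, size=(n_points, 1))   # e.g. %Seniors per city

past, future = 21, 14                        # look back 3 weeks, forecast 2
X, Y = [], []
for p in range(n_points):                    # each space point yields windows
    for t in range(n_days - past - future + 1):
        window = dynamic[p, t:t + past]
        # append the static features to every input window
        X.append(np.concatenate([window, static[p]]))
        Y.append(dynamic[p, t + past:t + past + future])
X, Y = np.array(X), np.array(Y)
print(X.shape, Y.shape)   # (points * windows, past + static), (..., future)
```

All cities share one model, so the sample count is the number of cities times the number of windows per city, which is what makes deep networks trainable on relatively short per-city histories.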
General Deep Learning Strategies
Two different models are used:
- A hybrid with a Transformer for the encoder and an LSTM for the decoder.
- A pure LSTM.
Attention for position i is Q(from i) K^T(from j) V(j), summed over j. Q, K, V are dense layers on the input, i.e. linear combinations plus activations of the inputs.
[Architecture diagrams: inputs feed an initial stage, stacked LSTM layers, an optional merge of the two branches, and a final stage producing outputs.]
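The attention step described above can be rendered in a few lines of NumPy: for each position i, softmaxed Q·Kᵀ scores over all j weight the values V(j), which are then summed. The projection weights here are random stand-ins, not trained parameters.

```python
import numpy as np

# Scaled dot-product attention, as described on the slide.
def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 10, 16, 8
x = rng.normal(size=(seq_len, d_model))          # one input sequence

# Q, K, V are dense layers on the input (linear combinations of inputs).
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv

scores = Q @ K.T / np.sqrt(d_k)                  # Q(from i) K^T(from j)
weights = softmax(scores, axis=1)                # normalize over j
attended = weights @ V                           # sum over j: attention for i

print(attended.shape)                            # (seq_len, d_k)
```

Each output row i is thus a data-dependent mixture of all positions j, which is what lets the Transformer encoder relate distant time steps directly.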
Pure LSTM description of 205-day, 314-city data
√N summed over cities, from fits to individual times/cities for daily data with 2-week prediction
Red is error
Hybrid Transformer with intrinsic error for 314 cities/counties and 159 days (7.11 secs/epoch)
Hybrid Transformer with larger intrinsic error for 110 cities/counties and 115 days
Particular Regions from Hybrid Transformer
New York City
Chicago (Cook County)
Particular Regions from Hybrid Transformer
Seattle (King)
Los Angeles
Comments I
Comments II
Collection of Time Series Machine Learning Algorithms (MLPerf)
Areas and Applications | Models | Data sets | Papers
Cars, taxis, freeway detectors | TT-RNN, BNN, LSTM | Taxi/Uber trips [2-5] | [6-8]
Wearables, medical instruments: EEG, ECG, ERP, patient data | LSTM, RNN | OPPORTUNITY [9-10] | [16-20]
Intrusion, traffic classification, anomaly detection | LSTM | GPL loop dataset [21], SherLock [22] | [21, 23-25]
Household electric use; economics, finance, demographics, industry | CNN, RNN | Household electric [26], M4 Competition [27] | [28-29]
Stock prices versus time | CNN, RNN | Available academically from Wharton [30] | [31]
Climate, Tokamak | Markov, RNN | USHCN climate [32] | [33-35]
Events | LSTM | Enterprise SW system [36] | [36-37]
Language and translation | Transformer [38] | Pre-trained data | [39-40], [41-42] Mesh TensorFlow
All-neural on-device speech recognizer | RNN-T | | [43]
IndyCar racing: real-time car and track detectors | HTM, LSTM | | [44]
Online clustering | | Available from Twitter | [45-46]
Xinyuan Huang from Cisco and MLPerf
Lots of Applications and DL Opportunities: Hydrology
https://eartharxiv.org/xs36g/
Time Series and Operators for Earthquakes
[Figures: predicted and true values, including results from the test set]
High Performance Computing and Deep Learning Benchmarks
MLPerf Consortium Deep Learning Benchmarks
Some Relevant Working Groups
MLPerf's mission is to build fair and useful benchmarks for measuring training and inference performance of ML hardware, software, and services.
Benchmark what the user sees
Used for purchasing decisions worth over $1B USD in a rapidly growing market (the ML chipset market in 2025 is projected at ~$60B)
Total 50 FTE
73 Companies; 10 universities
MLPerf Industry Machine Learning Site
Training v0.7
Images, Images, Images, Translation, Translation, Voice, Recommender, Play Go
https://mlperf.org/ (ALL deep learning, but it's MLPerf, not DLPerf)
2048 TPUs, or 1536 V100s with InfiniBand, are quite powerful.
Science Data WG in MLPerf
Science Data MLPerf working group
One aim is to provide a mechanism for assessing the capability of different ML models in addressing different scientific problems.
Build tutorials around benchmarks
Possible Initial MLPerf Science Benchmarks
https://github.com/stfc-sciml/sciml-benchmarks
Cloud Masking
Segmentation & Classification
SBI: Surrogate Benchmark Initiative. FAIR Surrogate Benchmarks Supporting AI and Simulation Research
PI: Geoffrey Fox, IU
Replacing traditional HPC computations with Deep Learning surrogates can improve the performance of simulations and make optimal use of diverse architectures
SBI collaborates with Industry and a leading machine learning benchmarking activity -- MLPerf
GOAL: Accelerate and better understand Deep Learning surrogate models that can replace all or part of traditional large-scale HPC computations with major performance increases.
Software Research: SBI will design and build general middleware to support the generation and the use of surrogates.
A Findable, Accessible, Interoperable, and Reusable (FAIR) data ecosystem for HPC surrogates
Application Benefits: SBI will also make it easier for general users to develop new surrogates and help make their major performance increases pervasive across DoE computational science.
Technology for Benchmarking
Call for Action!
Data Engineering and Deep Learning
Deep Learning Infrastructure with Cylon and Twister2
The small "DL Code" box implies we need deep learning plus general data engineering.
Deep Learning Workflow
The workflow often divides into two stages:
Data => Information: preprocessing (Hadoop, Spark, Twister2, Scikit-Learn)
Information => Knowledge: the compute-intensive step (Cylon-enhanced Spark, Twister2, PyTorch, and TensorFlow)
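The two-stage split can be sketched end to end in a few lines. NumPy stands in for both stages here (the real stack would use Spark/Twister2 for stage 1 and PyTorch/TensorFlow for stage 2), and the data and model are made up for illustration.

```python
import numpy as np

# Stage 1 (Data => Information): clean and standardize raw records.
rng = np.random.default_rng(0)
raw = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))
raw[::50, 0] = np.nan                       # simulate missing values

col_means = np.nanmean(raw, axis=0)         # impute missing entries
clean = np.where(np.isnan(raw), col_means, raw)
info = (clean - clean.mean(axis=0)) / clean.std(axis=0)

# Stage 2 (Information => Knowledge): a compute-intensive learning step,
# here a least-squares fit of a synthetic target from the features.
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = info @ true_w + 0.01 * rng.normal(size=1000)
w, *_ = np.linalg.lstsq(info, y, rcond=None)
print("recovered weights:", np.round(w, 2))
```

The point of the split is that the two stages have very different performance characters: stage 1 is I/O- and shuffle-bound, stage 2 is dense-compute-bound, so they suit different engines.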
Data Engineering
Two Ecosystems
Enterprise (Java): initial data engineering
Research labs and universities (Python): final deep learning
GOAL: High performance within each ecosystem, and high-performance integration between the two ecosystems.
Twister2
Big Data Processing Ecosystem
[Architecture: a Dataflow API sits on top of Twister2 and Cylon; Cylon provides distributed linear/relational algebra operators and distributed relational communication operations in C++, built on communication kernels.]
Twister2 is one of 5 possible engines for Apache Beam, which implements a rich data-processing (dataflow) workflow environment; the other engines are Spark, Flink, Samza, and Google Cloud Dataflow.
Twister2 Benchmarks with Big Data Processing Ecosystem
High Performance Data Engineering
Cylon Architecture
Builds on Apache Arrow to link the Python world (Jupyter, NumPy, Pandas, Modin) with the Java world (Spark, Twister2) and the C++/CUDA world (high-performance deep learning on PyTorch and TensorFlow).
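Cylon's core abstraction is a distributed data table supporting relational operators. This pandas sketch shows only the single-node analogue of a Cylon join, with made-up table contents; Cylon distributes the same operation across workers.

```python
import pandas as pd

# Single-node analogue of a distributed relational join on data tables.
left = pd.DataFrame({"id": [1, 2, 3, 4], "cases": [10, 20, 30, 40]})
right = pd.DataFrame({"id": [2, 3, 5], "pop": [100, 200, 300]})

# Inner join on the shared key column.
joined = left.merge(right, on="id", how="inner")
print(joined)
```

In the distributed setting, each worker holds a partition of both tables, shuffles rows by a hash of the join key so matching keys land on the same worker, and then runs this local join; Arrow's columnar format lets the result move between the Python, Java, and C++ worlds without serialization copies.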
Cylon: A High Performance Distributed Data Table
Performance Comparison with Other Frameworks
Cylon Performance with Language Bindings
Large Scale Experiments with PySpark and PyCylon
Jupyter Notebooks: Data Conversion and Usability
Future Cylon Work
Sound Bites for Cylon
Conclusions
Thank you!
External collaborators at Argonne National Lab, Arizona State University, Kansas, Rutgers, Stony Brook, UT Knoxville, the University of Virginia, and MLPerf
Indiana University Digital Science Center:
Faculty: David Crandall, James Glazier, Vikram Jadhao, Judy Qiu, and others
Staff: Josh Ballard, Gary Miksik, Fugang Wang, Chathura Widanage
Researchers: Gurhan Gunduz, Supun Kamburugamuve, Ahmet Uyar, Gregor von Laszewski
Students: Vibhatha Abeykoon, Bo Feng, JCS Kadupitiya, Niranda Perera, Pulasthi Wickramasinghe