AI for Science illustrated by Deep Learning for Geospatial Time Series
Geoffrey Fox, University of Virginia
John Rundle, UC Davis
Bo Feng, Indiana University
The IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC2022)
January 27, 2022
Especially earthquake nowcasting
UVA Biocomplexity/CS
Abstract
Operator Formulation of Prediction
Predicting the Future with Time Series
Learn Newton’s laws with Recurrent Neural Networks
are 4000 times longer and also learn potential.
RNN error^2 up to step size dT = 4 and total time 10^6
Verlet error^2 for dT = 0.01, 0.1
[Figure: error^2 plot; axis ticks 10^-5, 10^1, 10^23]
JCS Kadupitiya, Vikram Jadhao
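For reference, the Verlet baseline that the RNN is compared against can be sketched as follows. This is a minimal velocity Verlet integrator for a unit-mass harmonic oscillator; the potential, initial conditions and total time are illustrative assumptions and not the systems studied in the cited work. It only shows how the squared error grows when dT goes from 0.01 to 0.1.

```python
import math


def velocity_verlet(x, v, force, dt, steps):
    """Integrate dx/dt = v, dv/dt = force(x) for unit mass with velocity Verlet."""
    a = force(x)
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = force(x)
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
    return x, v


def harmonic_force(x):
    # Assumed potential V(x) = x^2 / 2, chosen because the exact solution is known.
    return -x


def squared_error(dt, total_time=10.0):
    """Squared position error against the exact solution x(t) = cos(t)."""
    steps = round(total_time / dt)
    x, _ = velocity_verlet(1.0, 0.0, harmonic_force, dt, steps)
    return (x - math.cos(steps * dt)) ** 2


err_small = squared_error(0.01)
err_large = squared_error(0.1)
print(f"error^2 at dT=0.01: {err_small:.3e}, at dT=0.1: {err_large:.3e}")
```

The error of a fixed-step integrator grows quickly with dT, which is why a learned propagator that stays accurate at much larger steps is attractive.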
Basic Spatial (bag) Time Series
Forecast the Future (any number of time units, any number of properties)
Predict Now, or Seq2Seq map as in English to French or rainfall to runoff
Input Properties
Static e.g. %Seniors
Dynamic e.g. Covid cases per day
Space x
(Different data sources, not necessarily nearby)
Forecast the Future
Time t
Seq 2 Seq
Data Analysis Unit
Time sequence at one space point
For Natural Language Processing, space points are different paragraphs or books. A few sentences at each point.
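The data analysis unit described above (a time sequence at one space point, carrying static and dynamic properties) can be sketched as a windowing step. This is an illustrative sketch only: the function name, the toy values and the single static property are assumptions, not the authors' preprocessing code.

```python
def make_windows(series_by_location, static_by_location, window, horizon):
    """Build (location, input window, target) samples.

    Each input time step holds the dynamic value followed by the location's
    static properties; the target is the next `horizon` dynamic values.
    """
    samples = []
    for location, series in series_by_location.items():
        static = static_by_location[location]
        for start in range(len(series) - window - horizon + 1):
            inputs = [[value] + static for value in series[start:start + window]]
            target = series[start + window:start + window + horizon]
            samples.append((location, inputs, target))
    return samples


# Toy data: one dynamic property per time step and one static property per location.
series = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "B": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
}
static = {"A": [0.15], "B": [0.22]}

samples = make_windows(series, static, window=3, horizon=2)
print(len(samples), samples[0])
```

Locations are treated as a bag: no sample uses neighbouring space points, matching the "not necessarily nearby" remark.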
1990-2019 dataset overview: (a) 444,589 events with magnitude >= 0; (b) 24,822 events with magnitude >= 2.5; (c) 2,489 events with magnitude >= 3.5; (d) 237 events with magnitude >= 4.5. The number of larger earthquakes is orders of magnitude smaller than the number of smaller ones.
Earthquakes and Deep Learning
Data-driven or Theory Driven approaches
Structure of the Data
Energy Weighted Quantities
Different observable functions versus energy averaged magnitude
Multiplicity v. mag
Multiplicity m>3.29 v. mag
E^0.25 v. mag
E^0.5 v. mag
Maximum of mag plot set to 50% other variable plotted
Choosing function f(x) of any input quantity x
LSTM Results - predict 2 weeks for magnitude
Magnitude replaced by Energy E^0.25
Errors are dominated by big spikes and the description is poor in quiescent regions: if the value is small, then the absolute error is small even if the fractional error is large. Energy or Energy^0.5 is worse!
Could change loss function
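To make the point about spikes concrete, the sketch below compares a mean squared error with a mean squared fractional error on toy numbers. The fractional loss is only one hypothetical alternative; the talk does not say which loss function would be chosen, and the `floor` guard and the sample values are assumptions.

```python
def mean_squared_error(observed, predicted):
    """Average of squared absolute errors."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mean_squared_fractional_error(observed, predicted, floor=1e-6):
    """Average of squared errors relative to the observed value.

    `floor` is a hypothetical guard against division by zero.
    """
    return sum(
        ((o - p) / max(abs(o), floor)) ** 2 for o, p in zip(observed, predicted)
    ) / len(observed)


# Three quiescent points that are 100% off and one spike that is 10% off.
observed = [0.1, 0.1, 0.1, 10.0]
predicted = [0.2, 0.2, 0.2, 9.0]

mse = mean_squared_error(observed, predicted)
msfe = mean_squared_fractional_error(observed, predicted)
# Fraction of the summed squared error contributed by the spike alone.
spike_share = (observed[-1] - predicted[-1]) ** 2 / (len(observed) * mse)
print(mse, msfe, spike_share)
```

With the absolute loss the single spike supplies almost all of the error, so the optimizer has little incentive to fit the quiescent points; the fractional loss reverses that weighting.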
Nash Sutcliffe Efficiency NSE
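NSE = 1 - \sum_t (Q_o^t - Q_m^t)^2 / \sum_t (Q_o^t - \bar{Q}_o)^2
Q_m^t is the model prediction for quantity Q at time t
Q_o^t is the observed value of Q at time t
\bar{Q}_o is the mean value of Q_o^t over time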
We use Normalized Nash–Sutcliffe Efficiency NNSE = 1/(2-NSE) as a measure of fit quality
See https://en.wikipedia.org/wiki/Nash%E2%80%93Sutcliffe_model_efficiency_coefficient
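The two measures can be computed directly from their definitions. The sketch below follows the standard Nash-Sutcliffe definition and the NNSE = 1/(2-NSE) normalization quoted above; function names and toy data are illustrative.

```python
def nse(observed, modeled):
    """Nash-Sutcliffe Efficiency: 1 - sum (Qo - Qm)^2 / sum (Qo - mean(Qo))^2."""
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    variance = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual / variance


def nnse(observed, modeled):
    """Normalized NSE, mapping NSE in (-inf, 1] onto (0, 1]."""
    return 1.0 / (2.0 - nse(observed, modeled))


observed = [1.0, 2.0, 3.0, 4.0]
mean_model = [2.5, 2.5, 2.5, 2.5]   # always predicts the observed mean

print(nnse(observed, observed))     # perfect model
print(nnse(observed, mean_model))   # no better than the mean
```

A perfect model gives NNSE = 1, a model no better than the observed mean gives NNSE = 0.5, and arbitrarily bad models approach 0.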
Depiction of Faults
36 Fault Groupings
Note region: 32 to 36 degrees latitude, -120 to -114 degrees longitude
Training and Validation Datasets
The 500 locations used - logEnergy
The 2400 0.1 by 0.1 degree subregions analyzed
500 of these regions were used in analysis after examination of number of quakes with M>3.29 in the region
400 (Red) for training
100 (Green) randomly chosen from 500 for validation
Remaining 1900 (black) not used
Locations of top 20 Earthquakes shown
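The counts above are consistent with a 0.1 degree grid over 32 to 36 degrees latitude and -120 to -114 degrees longitude (40 x 60 = 2400 cells). The sketch below reproduces the bookkeeping. The per-cell quake counts are synthetic placeholders, and ranking cells by count is an assumption: the slide only says the 500 were chosen after examining the number of M>3.29 quakes.

```python
import random

LAT_MIN, LAT_MAX = 32.0, 36.0
LON_MIN, LON_MAX = -120.0, -114.0
CELL = 0.1  # degrees
N_LAT = round((LAT_MAX - LAT_MIN) / CELL)
N_LON = round((LON_MAX - LON_MIN) / CELL)


def pixel_index(lat, lon):
    """Linear index of the 0.1 x 0.1 degree cell holding a point inside the region."""
    row = min(int((lat - LAT_MIN) / CELL), N_LAT - 1)
    col = min(int((lon - LON_MIN) / CELL), N_LON - 1)
    return row * N_LON + col


rng = random.Random(0)
# Synthetic placeholder counts of M>3.29 events per cell (not real data).
counts = {pixel: rng.randint(0, 50) for pixel in range(N_LAT * N_LON)}

# Assumed selection rule: keep the 500 cells with the most events.
selected = sorted(counts, key=counts.get, reverse=True)[:500]
validation = set(rng.sample(selected, 100))
training = [pixel for pixel in selected if pixel not in validation]
unused = N_LAT * N_LON - len(selected)

print(N_LAT * N_LON, len(training), len(validation), unused)
```

The validation cells are drawn at random from the same 500, so training and validation locations are interleaved in space rather than split by region.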
Properties and Predictions
Input and Output Variables
Known Inputs: Mathematical expansion functions
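The known inputs listed later in the talk are Legendre polynomials P_l(cosFull) for l = 0 to 4 plus cosine and sine terms for periods 8, 16, 32 and 64, giving 13 features. A minimal sketch of generating them follows. The definition of cosFull used here (a cosine that sweeps once over the full time range) and the use of time steps as the unit for the periods are assumptions; the slides do not define either.

```python
import math


def legendre(max_degree, x):
    """Return [P_0(x), ..., P_max_degree(x)] using Bonnet's recurrence."""
    values = [1.0, x]
    for n in range(1, max_degree):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: max_degree + 1]


def known_inputs(t, total_steps, periods=(8, 16, 32, 64)):
    """Known-in-advance features for time step t (0 <= t < total_steps)."""
    # Assumption: cosFull runs from +1 to -1 across the whole time range.
    cos_full = math.cos(math.pi * t / (total_steps - 1))
    features = legendre(4, cos_full)
    for period in periods:
        features.append(math.cos(2 * math.pi * t / period))
        features.append(math.sin(2 * math.pi * t / period))
    return features


features = known_inputs(t=10, total_steps=100)
print(len(features))
```

Because these functions depend only on time, they are available for future time steps too, which is what makes them "known" inputs for a decoder.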
LSTM/TFT Description of Covid Data (3142 Counties)
Uses Weekly property plus “top-down” Legendre Polynomials
500 most populous counties in the USA
Inputs and Outputs
LSTM and Science Transformer
Variable type | Description |
Static Known Inputs (5) | 4 space-filling curve labels of fault grouping, linear label of pixel |
Targets (24) | mbin (F:Δt,t) for Δt = 2, 4, 8, 14, 26, 52, 104, 208 weeks. Also for skip 52 weeks and predict next 52; skip 104 and predict next 104. With relative weight 0.25, all the Known inputs and linear label of pixel |
Dynamic Known Inputs (13) | P_l(cosFull) for l = 0 to 4; cos_period(t), sin_period(t) for period = 8, 16, 32, 64 |
Dynamic Unknown Inputs (9) | Energy-averaged Depth, Multiplicity, Multiplicity m>3.29 events; mbin (B:Δt,t) for Δt = 2, 4, 8, 14, 26, 52 weeks |
TFT
Variable type | Description |
Static Known Inputs (5) | 4 space-filling curve labels of fault grouping, linear label of pixel |
Targets (4) | mbin (F:Δt,t) for Δt = 2, 14, 26, 52 weeks. Calculated for t-52 to t for encoder and t to t+52 weeks for decoder in 2 week intervals. 104 predictions per sequence. |
Dynamic Known Inputs (13) | P_l(cosFull) for l = 0 to 4; cos_period(t), sin_period(t) for period = 8, 16, 32, 64 |
Dynamic Unknown Inputs (9) | Energy-averaged Depth, Multiplicity, Multiplicity m>3.29 events; mbin (B:Δt,t) for Δt = 2, 4, 8, 14, 26, 52 weeks |
TFT note: targets are restricted to the time period of the decoder LSTM, although predicting the next 26 2-week mbin does NOT allow one to predict the 52-week mbin, as adding and taking logs (roughly adding and taking maximum) do NOT commute
The LSTM/Science Transformer and TFT setups only differ in Targets
mbin (F:Δt,t) is energy averaged total magnitude over time Δt starting at time t
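The remark that adding and taking logs behaves roughly like taking a maximum can be illustrated numerically. The sketch assumes the standard energy-magnitude scaling log10(E) = 1.5 M + constant and reads mbin as the magnitude equivalent of the summed energy in a bin; the slides do not give the exact formula, so both are assumptions, and the event magnitudes are invented.

```python
import math


def energy(magnitude):
    # Assumed scaling log10(E) = 1.5 * M; the additive constant cancels in mbin.
    return 10.0 ** (1.5 * magnitude)


def mbin(magnitudes):
    """Magnitude equivalent to the total energy of the events in a non-empty bin."""
    total = sum(energy(m) for m in magnitudes)
    return math.log10(total) / 1.5


# Invented event magnitudes in four consecutive short bins.
short_bins = [[2.0, 2.5], [3.0], [4.5, 2.0], [2.8]]

per_bin = [mbin(events) for events in short_bins]
long_bin = mbin([m for events in short_bins for m in events])
mean_of_bins = sum(per_bin) / len(per_bin)

print(per_bin, long_bin, mean_of_bins)
```

The long-bin value sits just above the largest short-bin value and far from the average of the short-bin values, so a magnitude-like target for a long interval is not a simple sum or mean of the short-interval targets.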
Architectures of LSTM and Hybrid Science Transformer
b) Space-Time Transformer (for encoder) and LSTM (for decoder)
Merge
Final
Initial
LSTM Layer
LSTM Layer
Outputs
optional but best results if you do this
Layers, in order: Input | Dense Encoder with activation | LSTM-1 | LSTM-2 | Dense Decoder with activation | Dense Output |
Tensor shapes, in order: (B,W,InProp) | (B,W,InProp) | (B,W,128) | (B,W,48) | (B, 48) | (B, 128) | (B, OutPred) |
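A rough weight count for this LSTM stack can be worked out by hand. The sketch assumes the shapes are read as Dense Encoder to 128 units, LSTM-1 to 48, LSTM-2 to 48, Dense Decoder to 128, then Dense Output; it also assumes InProp = 27 (5 + 13 + 9 inputs) and OutPred = 24 targets. None of these readings is stated explicitly on the slide.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return (n_in + 1) * n_out


def lstm_params(n_in, n_hidden):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (n_hidden * (n_in + n_hidden) + n_hidden)


IN_PROP = 27   # assumption: 5 static + 13 dynamic known + 9 dynamic unknown inputs
OUT_PRED = 24  # assumption: the 24 targets of the LSTM setup

layers = [
    ("Dense Encoder", dense_params(IN_PROP, 128)),
    ("LSTM-1", lstm_params(128, 48)),
    ("LSTM-2", lstm_params(48, 48)),
    ("Dense Decoder", dense_params(48, 128)),
    ("Dense Output", dense_params(128, OUT_PRED)),
]
total = sum(count for _, count in layers)

for name, count in layers:
    print(f"{name:14s} {count:6d}")
print(f"{'Total':14s} {total:6d}")
```

Under these assumptions the total is 65,560, the same order as the 66K to 67K LSTM weights quoted elsewhere in the talk, though the agreement may be coincidental.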
c) TFT Architecture
Some details of Forecasting Models
Search Strategies in TFT and Science Transformer
Forecasting Models Sample Sizes
Static Context
LSTM | Modified Temporal Fusion Transformer | Science Transformer | AE-TCN Joint Model
Temporal Attention
Embedders
Output Mappers
2-layer LSTM as Forward decoder
2-layer LSTM as Backward encoder
Embedder
Merge
2-layer LSTM as decoder
Space-Time Attention
Embedder
Output Mapper
2-layer LSTM
AutoEncoder
Temporal Convolutional
Network
Image
Image
Prediction
Weights: LSTM 67K, Modified Temporal Fusion Transformer 8 Million, Science Transformer 2.3 Million
Lots of Weights to Train
LSTM TFT
LSTM and Transformer Results - predict 2 weeks
LSTM and Transformer Results - predict 6 months
LSTM and Transformer Results - predict 4 years
NNSE Summed over Locations
Normalized Nash Sutcliffe Efficiency NNSE
Time Period | LSTM Train | TFT Train | Science Transformer Train | LSTM Validation | TFT Validation | Science Transformer Validation |
2 weeks | 0.903 | 0.925 | 0.893 | 0.868 | 0.87 | 0.856 |
4 weeks | 0.895 | | 0.916 | 0.867 | | 0.884 |
8 weeks | 0.886 | | 0.913 | 0.866 | | 0.881 |
14 weeks | 0.924 | 0.982 | 0.919 | 0.893 | 0.899 | 0.881 |
26 weeks | 0.946 | 0.985 | 0.954 | 0.897 | 0.895 | 0.896 |
52 weeks | 0.919 | 0.988 | 0.955 | 0.861 | 0.88 | 0.876 |
104 weeks | 0.923 | | 0.937 | 0.853 | | 0.83 |
208 weeks | 0.935 | | 0.921 | 0.811 | | 0.77 |
Validation results similar between methods
Training quality: TFT > Science Transformer > LSTM
Training results reflect number of weights: TFT 8M > Science Transformer 2.3M >> LSTM 66K
Comments on Deep Learning for Earthquakes
Artificial Intelligence AI for Science
Computational Technologies to Assist, Augment and Automate Human activities
Statistics: Machine Learning ML and Probability
The probability and Bayesian approach provides the overall framework, and Machine Learning provides specific algorithms to analyze data, used “on its own” (data driven) or in conjunction with theoretical ideas to give models, which are learnt from training and used in inference.
Deep Learning
Builds flexible models requiring less a priori knowledge in terms of stacked layers of neural nets such as dense, recurrent, convolutional, graph.
Neural Networks
Problem Classes
Specific Tasks
Expert Systems
Sequence To Sequence maps
Forecasting
Knowledge reasoning
Anomaly Detection
Natural Language Processing
Recommendation systems
Simulation Surrogates
Vision and Perception
Regression
Classification
Clustering
Topic Modelling
Random Forests
Autoencoders
Generative Adversarial Networks
Reinforcement Learning
Transformers (Attention)
ML & Probability
DL
Theory Driven Data-Driven
Laws of Nature
Phenomenology
χ2
Model 🡪 Nowcasting
AI
Choices in Scientific Discovery
Building a model in 1978-1979
More Such Physics Models
π0X0
π0X
η0X
η0X0
-t
200 GeV
Experiments at Fermilab
E110, E260, E350
E260
E350
Model Field - Feynman-Fox
Model Regge Theory
MLCommons Benchmarks
12/7/2019
MLCommons (MLPerf) Consortium Deep Learning Benchmarks
Some Relevant Working Groups
MLCommons (MLPerf) Consortium Activity Areas
Science Research MLCommons working group
Science-based Metrics
Benchmark | Science | Task | Owner Institute | Specific Benchmark Issues |
CloudMask | Climate | Segmentation | RAL | Classify cloud pixels in images |
STEMDL | Material | Classification | ORNL | Classifying the space groups of materials from their electron diffraction patterns |
CANDLE-UNO | Medicine | Classification | ANL | Cancer prediction at cellular, molecular and population levels. |
TEvolOp Forecasting | Earthquake | Regression | Virginia | Predict Earthquake Activity from recorded event data |
ICF or Inertial Confinement Fusion | Plasma Physics | Simulation surrogate | LLNL | There are other possible LLNL benchmarks from collection of 10 |
Benchmark contains Datasets, Science Goals, Reference Implementations; hosted at SDSC or RAL
Specification of 4 Benchmarks https://drive.google.com/file/d/1BeefJTj4ZZL4Wa5c3zNz1l5nzQN-ktGR/view?usp=sharing
Current Science WG Benchmark Status
High Performance Data Engineering
Some Details on Cylon
12/7/2019
ML Code
NIPS 2015 http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
This well-known paper points out that parallel high-performance machine learning is perhaps the most fun but is just one part of the system. We need to integrate the other data and orchestration components.
This integration is neither very good nor easy, partly because data management systems like Spark are JVM-based, which doesn’t link cleanly to the C++ and Python world of high-performance ML
ML code module is itself built up hierarchically from Numpy and Pandas operations (if Python)
Need to assemble 10 large modules into full workflow and efficiently execute Numpy/Pandas etc. inside modules
Integrating Data Engineering and Data Science
Data Engineering versus Data Science I: Deep Learning Workflow
Workflows often divide into two:
Data => Information, the preprocessing step -- Hadoop, Spark, Twister2, Scikit-Learn
Information => Knowledge, the compute-intensive step -- Cylon enhanced Spark Twister2, PyTorch and TensorFlow
Post-Processing
Data
Data Engineering versus Data Science II
Data Engineering Data Science
Two ways
Lines of Open Source Code: Twister2 145000 (Java, Python)
Cylon 25000 (C++, Python, Java)
Data Collection
Pre-Processing
Model Training
Inference/Prediction
Big Data Frameworks
Deep Learning Frameworks
Twister2DL
Twister2, Spark, Flink, Hadoop, ...
PyTorch, TensorFlow, MXNet, Keras, ...
From Data Management (DM) to DL (Twister2); from Deep Learning (DL) to DM (Cylon)
Data Engineering
Must support Parallelism as Automatically as Possible
Some Intrinsically Parallel Operators
Cylon: A High Performance Distributed Data Table
Strong Scaling Comparison with Other Frameworks
Inner join
Union
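An inner join scales across workers because rows can be routed by a hash of the join key, after which each worker joins only its own partition. The sketch below is a generic, single-process illustration of that idea in plain Python; it is not Cylon's implementation or API, and all names are hypothetical.

```python
def partition(rows, key, n_workers):
    """Route each row to a worker by hashing its join key."""
    parts = [[] for _ in range(n_workers)]
    for row in rows:
        parts[hash(row[key]) % n_workers].append(row)
    return parts


def local_inner_join(left, right, key):
    """Hash join of two row lists that live on the same worker."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def partitioned_inner_join(left, right, key, n_workers=4):
    """Shuffle both tables by key, then join matching partitions independently."""
    left_parts = partition(left, key, n_workers)
    right_parts = partition(right, key, n_workers)
    joined = []
    for left_part, right_part in zip(left_parts, right_parts):
        # Each pair of partitions could run on a separate worker.
        joined.extend(local_inner_join(left_part, right_part, key))
    return joined


left = [{"id": i, "x": i * i} for i in range(6)]
right = [{"id": i, "y": -i} for i in (1, 3, 5, 7)]
joined = partitioned_inner_join(left, right, "id")
print(sorted(joined, key=lambda row: row["id"]))
```

Rows with equal keys always land in the same partition, so the per-partition joins need no further communication; the shuffle is the only all-to-all step.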
Large Scale Experiments with PySpark and Cylon
Strong Scaling Comparison with Other Frameworks
Aggregations
Group-by + aggregations
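Group-by aggregations parallelize by computing partial aggregates on each chunk of rows and then merging them. The sketch below shows this for a per-key mean, using (sum, count) pairs as the mergeable state. It is a generic plain-Python illustration with hypothetical names and toy rows, not the benchmarked frameworks' code.

```python
from collections import defaultdict


def partial_sums(rows, key, value):
    """Per-chunk partial aggregate: key -> [sum, count]."""
    acc = defaultdict(lambda: [0.0, 0])
    for row in rows:
        acc[row[key]][0] += row[value]
        acc[row[key]][1] += 1
    return acc


def merge_means(partials):
    """Combine partial [sum, count] pairs from all chunks into per-key means."""
    merged = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for group, (total, count) in part.items():
            merged[group][0] += total
            merged[group][1] += count
    return {group: total / count for group, (total, count) in merged.items()}


# Two chunks that could live on different workers.
chunk_a = [{"g": "a", "v": 1.0}, {"g": "b", "v": 10.0}, {"g": "a", "v": 3.0}]
chunk_b = [{"g": "b", "v": 30.0}, {"g": "a", "v": 5.0}]

means = merge_means([partial_sums(chunk, "g", "v") for chunk in (chunk_a, chunk_b)])
print(means)
```

Carrying sums and counts rather than per-chunk means keeps the merge exact when chunks hold different numbers of rows per key.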