Spot Virtual Machine Eviction Prediction in Microsoft Cloud
Fangkai Yang, Bowen Pang, Jue Zhang, Bo Qiao, Lu Wang
Microsoft Research
Microsoft 365
Microsoft Azure
Billing policies in public cloud
On-demand Instance
Reserved Instance
Spot Instance
AWS: Spot Instance
Azure: Spot Virtual Machines
Google Cloud: Spot Virtual Machines
What is a Spot Instance in Cloud Computing?
2023-03-13 DDPS Seminar, Presenter Kyunghwan Kim
Distributed Data Processing System Lab, KOOKMIN UNIVERSITY
How Do Spot Instances Work?
Spot instances run on the unused computing resources (capacity) of a public-cloud datacenter: when spare capacity is available, the Spot request is fulfilled.
When the used resources grow and spare capacity shrinks, the platform reclaims that capacity and the running Spot instance is interrupted.
Spot Instance Interruption Prediction
User
Cloud Vendor
Spot VM eviction predictions, available per region in the Azure Portal when deploying new VMs, help optimize capacity utilization planning and allocation management. For users, the eviction prediction informs deployment plans that increase the survivability of Spot VMs and reduce the likelihood of interruption.
Difficulty predicting spot instance interruption
Complex allocation policies and the scale of modern data centers make interruption difficult to predict.
Spot Instance Interruption Prediction
Cluster Level & Node Level Prediction
Cluster Level Prediction
Computing resources (capacity) in a public-cloud datacenter
[Figure: at time T, each cluster is assigned a predicted eviction rate, e.g. 60%, 78%, 50%, 18%, 30%, 60%, 10%, 98%]
Problem of Cluster Level Prediction
|                 | On-demand | Spot |
|-----------------|-----------|------|
| Instance Type A |           |      |
| Instance Type B |           |      |
| Instance Type C |           |      |
| Instance Type D |           |      |
[Figure: Clusters A and B show the same cluster capacity utilization across Nodes 1-3, yet the Azure VM Allocator interrupts Cluster A Node 3 while Cluster B Node 3 is not interrupted: cluster-level utilization alone cannot identify which node will be evicted.]
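The ambiguity can be sketched with toy numbers (the utilizations below are made up for illustration, not Azure data):

```python
import numpy as np

# Hypothetical per-node utilizations for two clusters.
cluster_a = np.array([0.95, 0.55, 0.90])  # Node 3 of Cluster A runs hot
cluster_b = np.array([0.80, 0.80, 0.80])  # Cluster B is evenly loaded

# A cluster-level predictor only sees the aggregate utilization ...
print(round(cluster_a.mean(), 6), round(cluster_b.mean(), 6))  # 0.8 0.8

# ... so it assigns both clusters the same eviction risk, even though
# the allocator may evict from the hot node in Cluster A first.
```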
Node Level Prediction
Overview of the node-level spatial-temporal prediction framework
Spatial-Temporal Transformer Framework
Transformer?
BERT: Bidirectional Encoder Representations from Transformers
GPT: Generative Pre-Training
Transformer Mechanism
[Figure: the Transformer architecture. Inputs and outputs (shifted right) pass through embeddings plus positional encoding; the encoder applies multi-head attention and a feed-forward network, each followed by add & norm; the decoder applies masked multi-head attention, encoder-decoder multi-head attention, and a feed-forward network, each followed by add & norm; a final linear layer and softmax produce the output probabilities.]
Attention Mechanism
[Figure: the same Transformer diagram, with the left stack labeled Encoder and the right stack labeled Decoder.]
1. Input Embedding
Each token of the input sentence "안녕 경환아 잘 지내?" ("Hi Kyunghwan, how are you?") is mapped to a vector, e.g.
안녕 ("Hi") = [0.1, 0.65, 0.29]
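An input embedding is just a learned lookup table; a minimal sketch (the vocabulary indices and all vectors except the slide's example are made up for illustration):

```python
import numpy as np

# Toy vocabulary and a 3-dimensional embedding table.
vocab = {"안녕": 0, "경환아": 1, "잘": 2, "지내?": 3}
embedding = np.array([
    [0.10, 0.65, 0.29],   # 안녕 ("Hi"), matching the slide's example vector
    [0.40, 0.12, 0.88],
    [0.33, 0.51, 0.07],
    [0.72, 0.09, 0.44],
])

tokens = ["안녕", "경환아", "잘", "지내?"]
vectors = embedding[[vocab[t] for t in tokens]]  # one row per token, shape (4, 3)
print(vectors[0])  # the slide's vector for 안녕
```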
2. Positional Encoding
Each time step (1, 2, 3, 4) receives a positional encoding, which is added to the input embedding to form the positional input embeddings.
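The original Transformer uses fixed sinusoidal encodings for this step; a sketch:

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encodings: sin on even dims, cos on odd dims."""
    pos = np.arange(seq_len)[:, None]       # time steps 0 .. seq_len-1
    i = np.arange(d_model)[None, :]         # embedding dimensions
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

# The encoding for each time step is simply added to the input embedding.
emb = np.random.randn(4, 8)                          # 4 tokens, d_model = 8
positional_input = emb + positional_encoding(4, 8)   # "positional input embeddings"
print(positional_input.shape)  # (4, 8)
```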
3-4. Encoder Layer
The positional input embeddings pass through multi-head attention, add & norm, a feed-forward network, and another add & norm, producing the encoder's representation of the input.
3. Multi-Head Attention
[Figure: self-attention over the tokens of "안녕 경환아 잘 지내?". Q, K, and V come from separate Linear projections: MatMul(Q, Kᵀ) → Scale → Softmax → MatMul with V; head outputs are concatenated and passed through a final Linear layer.]
Three separate Linear layers project the input into query, key, and value vectors.
Multiplying the queries by the keys (MatMul) produces the attention scores.
Attention energies for the example sentence (rows are query tokens, columns are key tokens):

|        | 안녕 | 경환아 | 잘 | 지내? |
|--------|------|--------|----|-------|
| 안녕   | 98   | 27     | 10 | 12    |
| 경환아 | 89   | 9      | 67 | 54    |
| 잘     | 91   | 92     | 54 | 67    |
| 지내?  | 9    | 10     | 27 | 12    |
Scale: the scores are divided by sqrt(d_k), the square root of the key dimension, giving the scaled scores.
[Figure: SoftMax converts the scaled scores into a 4×4 matrix of attention weights over the tokens 안녕, 경환아, 잘, 지내?, with each query's weights summing to 1.]
Finally, multiplying the attention weights by the values (MatMul) produces the attention output.
Each attention head has its own query, key, and value projections.
With N = 2, Self-Attention Head 1 and Head 2 run in parallel, each with its own query, key, and value projections.
The outputs of the N = 2 heads are concatenated and passed through a final Linear layer.
Multi-head attention thus maps the positional input embeddings to a set of output vectors.
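The per-head projections plus concat-and-linear can be sketched with N = 2 heads (all weight matrices here are random placeholders for illustration):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(x, Wq, Wk, Wv, Wo):
    """x: (seq, d_model); Wq/Wk/Wv: (heads, d_model, d_head); Wo: (heads*d_head, d_model)."""
    heads = []
    for Wq_h, Wk_h, Wv_h in zip(Wq, Wk, Wv):         # each head: its own Q/K/V
        Q, K, V = x @ Wq_h, x @ Wk_h, x @ Wv_h
        w = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # scaled dot-product attention
        heads.append(w @ V)
    return np.concatenate(heads, axis=-1) @ Wo       # concat, then final Linear

rng = np.random.default_rng(0)
seq, d_model, n_heads, d_head = 4, 8, 2, 4
x = rng.normal(size=(seq, d_model))
Wq, Wk, Wv = (rng.normal(size=(n_heads, d_model, d_head)) for _ in range(3))
Wo = rng.normal(size=(n_heads * d_head, d_model))
print(multi_head_attention(x, Wq, Wk, Wv, Wo).shape)  # (4, 8)
```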
4. Residual Connection, Layer Normalization & Pointwise Feed-Forward
The attention output goes through a residual add and LayerNorm, then a pointwise feed-forward network (Linear → ReLU → Linear), followed by another residual add and LayerNorm.
This Transformer encoder block is stacked N times.
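Putting step 4 together (residual add, LayerNorm, pointwise feed-forward) and stacking the block N times, in a simplified single-head sketch with random weights:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def encoder_layer(x, Wq, Wk, Wv, W1, W2):
    # Sublayer 1: (single-head) self-attention with residual add & LayerNorm.
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V
    x = layer_norm(x + attn)                       # Add & Norm
    # Sublayer 2: pointwise feed-forward, Linear -> ReLU -> Linear.
    ffn = np.maximum(x @ W1, 0.0) @ W2
    return layer_norm(x + ffn)                     # Add & Norm

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(4, d))
params = [tuple(rng.normal(size=(d, d)) for _ in range(5)) for _ in range(2)]
for layer in params:                               # the encoder block, stacked N = 2 times
    x = encoder_layer(x, *layer)
print(x.shape)  # (4, 8)
```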
Node Level Prediction
[Figure: the node-level spatial-temporal prediction framework, revisited.]
Experimental Evaluation
Setup, Baselines & Results
Experimental Setup
Experimental system: Intel(R) Xeon(R) CPU E5-2690 @ 2.6 GHz, 112 GB memory.
Training data, collected every 1 hour for 2 weeks across 12,000 nodes in 20 clusters:
On-demand and Spot VM counts
Node capacity (cores & memory) and node capacity utilization
Evicted cores/memory/rate/count of Spot VMs over the previous 3 hours; the prediction target is the next 3 hours
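The hourly features and 3-hour windows might be assembled along these lines (column names and telemetry values are hypothetical stand-ins, not the paper's internal schema):

```python
import numpy as np
import pandas as pd

# Hypothetical hourly telemetry for one node.
rng = np.random.default_rng(0)
hours = pd.date_range("2023-01-01", periods=24, freq="h")
df = pd.DataFrame({
    "spot_vm_count": rng.integers(0, 10, 24),
    "ondemand_vm_count": rng.integers(0, 20, 24),
    "core_util": rng.uniform(0.2, 0.95, 24),
    "evicted_count": rng.integers(0, 3, 24),
}, index=hours)

# Features: averages over the previous 3 hours (including the current hour).
feats = df.rolling(3).mean().add_suffix("_prev3h")
# Label: evictions summed over the next 3 hours (reverse, roll, reverse, shift).
label = df["evicted_count"][::-1].rolling(3).sum()[::-1].shift(-1).rename("evicted_next3h")
dataset = feats.join(label).dropna()
print(dataset.shape)  # (19, 5): edge hours lack a full feature or label window
```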
Baselines
Linear Regression (LR)
Support Vector Regression (SVR)
Random Forest (RF)
Gradient Boosting Decision Tree (GBDT)
Long Short-term Memory (LSTM)
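The non-deep baselines can be reproduced in outline with scikit-learn (synthetic data; the paper's features and exact metrics are not public, and the LSTM baseline, which needs a deep-learning framework, is omitted):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the node-level feature matrix and eviction label.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))
y = X[:, 0] * 0.5 + rng.normal(scale=0.1, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
baselines = {
    "LR": LinearRegression(),
    "SVR": SVR(),
    "RF": RandomForestRegressor(random_state=0),
    "GBDT": GradientBoostingRegressor(random_state=0),
}
for name, model in baselines.items():
    model.fit(X_tr, y_tr)
    print(name, mean_absolute_error(y_te, model.predict(X_te)))
```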
Experimental Results
Conclusion
This paper investigates Spot VM eviction prediction methods at both the cluster level and the node level.