Technology-Aware Program Optimization for Emerging Memory Technologies
Pu (Luke) Yi*, Anjiang Wei*, and Sara Achour
We developed a software stack for resistive RAM (ReRAM)
Resistive RAM
ReRAM cells store program data as analog resistances.
[Figure: ReRAM cell resistance range, 7.8kΩ–40kΩ]
The analog resistance is configured by applying voltage pulses to the cell.
To precisely write a particular resistance, practitioners apply voltage pulses until the resistance is within some write range.
Hardware Read-Write Algorithms
The write algorithm determines the schedule of pulses to tune the resistance to be within a certain range.
The resistance range for a given write operation is determined by the level allocation scheme.
Level Allocation Schemes
The level allocation scheme divides up the resistance range of a ReRAM cell into multiple levels.
Each level stores a value – values are typically binary bit strings.
Example: 4-level allocation that stores bitstrings 00,01,11,10
[Figure: the 7.8kΩ–40kΩ resistance range divided into four resistance ranges (0–3), each mapped to a 2-bit value, e.g., range 0 stores binary “00”]
Errors occur when the resistance drifts from one level to another. This naturally occurs due to thermal effects, process variation, and relaxation.
The error model for a level allocation is a matrix describing the probability that the memory drifts from level i to level j.
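To make this concrete, here is a minimal sketch (in Python, with made-up illustrative numbers rather than measured data) of what a 4-level drift-probability matrix looks like and the average drift probability it implies:

```python
import numpy as np

# Hypothetical error model for a 4-level allocation: entry [i, j] is the
# probability that a cell written to level i is later read back as level j.
# The numbers are made up for illustration; real values come from
# hardware characterization.
drift = np.array([
    [0.97, 0.03, 0.00, 0.00],
    [0.02, 0.95, 0.03, 0.00],
    [0.00, 0.04, 0.94, 0.02],
    [0.00, 0.00, 0.03, 0.97],
])
assert np.allclose(drift.sum(axis=1), 1.0)  # each row is a probability distribution

# Average level drift probability, assuming levels are written uniformly.
avg_drift = 1.0 - np.mean(np.diag(drift))
print(f"average level drift probability: {avg_drift:.2%}")  # 4.25%
```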
Goal: Co-optimize level allocation scheme and high-level program.
Application Domain
Previously, we developed a software stack for targeting numerical programs.
Last time, we introduced several approximate numerical representations that store approximable bits (e.g., mantissa bits) in MLC ReRAM.
[Figure: software stack – Resistive Memory → Hardware Characterization → Hardware Read-Write Algorithms → Level Allocation Schemes + Level-Based Error Models → Level Allocation / Read-Write Interface → Numerical Application Domain and Software Tooling]
We architected a program autotuner that figures out the most compact approximate numerical representation for each piece of data in the program.
The autotuner dynamically checks that the approximate program delivers an end-to-end quality metric; this is validated via simulation over test inputs.
This approach had several drawbacks…
Drawback 1: No accuracy guarantees.
Autotuner introduces approximation into programs by dynamically tuning numerical representations on sets of test inputs.
No guarantees that the approximate program will generally perform well on other inputs.
Can we trust the approximate program?
Drawback 2: Time-Consuming
The autotuner performs a search over the numerical representation parameter space.
For each parameter combination, the autotuner runs Monte Carlo simulation with injected errors to find good parameter combinations.
Drawback 3: Complex Architecture
Requires hardware support for separately storing precise and approximate data. Benefits from specialized logic that packs/unpacks custom numerical data representations.
Requires ECC or reliable last-level storage medium for precise data.
Question: Can we find an application domain that meets the following criteria?
Analysis-Amenable: We want to be able to statically (without execution) provide end-to-end accuracy guarantees, given the level error model for the ReRAM storage array.
We want to be able to efficiently derive these accuracy guarantees – no extensive simulation over test inputs!
We want to be able to implement and accelerate computation with a simple ReRAM-based architecture.
1. The computational model should be resilient to memory errors and should require little-to-no precise storage.
2. The computational operators should be amenable to acceleration and/or in-memory computation.
Vector symbolic architectures meet these conditions
Vector Symbolic Architectures
Vector Symbolic Architectures – A “Brain-inspired” computing model from the cognitive science community.
HD Computation is a special case of a Vector Symbolic Architecture (VSA) that has been explored by the hardware community.
What can you do with Vector Symbolic Architectures?
Can perform classification, implement complex data retrieval tasks, and implement content-addressable memory systems.
The information encoding/training algorithm can potentially be a “one-shot” procedure and doesn’t necessarily require iterative tuning.
VSAs can also implement many classical data structures.
Data structures:
Sets and Lists
Multisets, histograms, frequency lists
Graphs
Binary Trees
Stack
Finite Automata
Key-Value Stores & Databases
Why target this computational model instead of classical computation?
Let’s revisit our application domain wishlist…
This computational model is highly error resilient, and can tolerate a large number of memory errors.
Computational model uses data representations and computational operators that are amenable to acceleration.
E.g. element-wise XOR, majority
Some of the computational operators can even be implemented in memory.
E.g. Hamming distance
One criterion left…
Are vector symbolic architectures amenable to analysis? Can we statically derive accuracy guarantees for a given VSA program?
Yes! In this talk, we will present an approach for statically obtaining accuracy guarantees for a VSA program.
We will also showcase what kinds of program optimizations and analyses we can do with this technique.
First, though, we need to introduce the VSA computational model.
What are Vector Symbolic Architectures?
Data: Numerical Vectors “hypervectors”
VSAs compute over hypervectors. A hypervector is an N-dimensional vector of binary, integer, or real values.
Introduction to Hypervectors
Binary hypervector: [0,1,1,1,0,…,1,1,0]
Radix-6 hypervector: [0,3,2,5,4,…,1,4,0]
In VSAs, we are primarily interested in the distance between hypervectors.
Small distance between hypervectors → similar
Large distance between hypervectors → different or dissimilar
Hypervector 1: [0,1,0,1,0,…,1,1,0]
Hypervector 2: [0,1,1,1,0,…,1,1,0]
Hamming distance: dist(X, Y) = (1/N) ∑_i XOR(X_i, Y_i)
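As a quick illustration of these definitions, the following sketch (assuming the binary-hypervector case) generates random hypervectors and computes their normalized Hamming distance; independent random hypervectors land near distance 0.5:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000  # hypervector dimension

def random_hv(n: int) -> np.ndarray:
    """Random binary hypervector: each bit is 0 or 1 with probability 1/2."""
    return rng.integers(0, 2, size=n, dtype=np.uint8)

def dist(x: np.ndarray, y: np.ndarray) -> float:
    """Normalized Hamming distance: (1/N) * sum_i XOR(x_i, y_i)."""
    return float(np.mean(x ^ y))

A, B = random_hv(N), random_hv(N)
print(dist(A, A))  # 0.0  : identical hypervectors
print(dist(A, B))  # ~0.5 : independent random hypervectors are dissimilar
```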
Hypervectors are initially randomly generated and don’t encode any specific information – the individual bits of a hypervector have no meaning on their own.
These random hypervectors are dissimilar to one another and represent distinct concepts in the computational problem.
Example: we use a randomly generated hypervector for each letter in the alphabet. Each letter is distinct.
Codebook: Collection of distinct hypervectors that map to distinct concepts in the computation.
The VSA computation uses the hypervectors in the codebook to build data structures.
Example codebooks:
Letters in the alphabet (M=26+1): a codebook that can be used to create text.
Base-10 digits (M=10): a codebook that can be used to create numerical values.
Vertices (M=100): a codebook that can be used to create graphs with 100 vertices.
Countries (M=195): a codebook that can be used to encode geographical information.
Information is then embedded into hypervectors by performing computational operations over hypervectors.
Operators: Permutation (p), Bundling (+), Binding (*), and Distance Operators (dist).
Example: with codebook {A, B, C}, the computed hypervector Z = A + C is similar to the codebook hypervectors A and C.
A and C are random, but Z is related to both of them: Z is similar to A and to C, Z is dissimilar to B, and A, B, and C are mutually dissimilar.
Each vector symbolic operator can be thought of as an operator that builds a data structure.
Introduction to VSA Operators
VSA bundling operator (+) constructs a set of elements.
Bundling operators produce hypervectors that are similar to all of the input hypervectors.
Example: with codebook {A, B, C}, Z = A + C encodes the set {A, C}.
After constructing a set, we can query whether an element hypervector (e.g., A) belongs to the set hypervector (Z) by computing the distance between the element and the set.
A ∈ {A, C}?  →  dist(A, Z) < thresh?
B ∈ {A, C}?  →  dist(B, Z) < thresh?
Hypervector elements that are in the set will have a distance from the set vector that is below a certain threshold.
VSA binding operator (*) constructs a tuple of elements.
Binding operators produce hypervectors that are dissimilar to all of the input hypervectors.
Example: with codebook {A, B, C}, Z = A * C encodes the tuple ⟨A, C⟩.
VSA permutation operator (p) encodes positional information into an element.
The permutation operator is an invertible operator that deterministically transforms the hypervector.
Example: Z = A + p(A) encodes the sequence [A, A]. The sequence [A, A] can also be thought of as the set {A, p(A)}.
So far, we conceptually described hypervectors, distance metrics, and the permutation (p), bundling (+), binding (*), and distance operators (dist).
Example: Z = [C + p(A)] * C + C encodes the data structure {⟨[C, A], C⟩, C}.
There are many candidate VSA implementations.
Below: the HD computation implementation of a vector symbolic architecture.
HD Computation Implementation
Hypervectors: Binary Vectors
Distance (dist): Hamming Distance
Bundling(+): Majority Vote
Binding(*): XOR
Permutation(p): Cyclic Rotation
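A minimal sketch of this implementation (binary hypervectors; the majority vote assumes an odd number of inputs, a common convention for breaking ties), together with the set-membership query described earlier; the threshold below is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 10_000

def random_hv():
    return rng.integers(0, 2, size=N, dtype=np.uint8)

def dist(x, y):                  # distance (dist): normalized Hamming distance
    return float(np.mean(x ^ y))

def bundle(*hvs):                # bundling (+): element-wise majority vote
    assert len(hvs) % 2 == 1, "use an odd number of inputs to avoid ties"
    return (np.sum(hvs, axis=0) > len(hvs) // 2).astype(np.uint8)

def bind(x, y):                  # binding (*): element-wise XOR
    return x ^ y

def permute(x, shift=1):         # permutation (p): cyclic rotation
    return np.roll(x, shift)

# Codebook of random hypervectors for the symbols A, B, C.
A, B, C = random_hv(), random_hv(), random_hv()

Z = bundle(A, B, C)              # Z encodes the set {A, B, C}
D = random_hv()                  # a symbol that is not in the set
thresh = 0.4                     # illustrative threshold only
print(dist(A, Z) < thresh)       # True : A is in the set (dist ~0.25)
print(dist(D, Z) < thresh)       # False: D is not in the set (dist ~0.5)

pair = bind(A, C)                # encodes the tuple <A, C>; dissimilar to A and C
shifted = permute(A)             # position-tagged copy of A; dissimilar to A itself
print(round(dist(A, pair), 2), round(dist(A, shifted), 2))  # both ~0.5
```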
Benefits of Vector Symbolic Architectures
Why use VSA Computation?
Resilient to errors! All bits are equally important.
For a large hypervector, it will take many corruptions to change the distance value.
[Figure: in a hypervector, every bit is random and interchangeable; in a conventional feature vector, each bit carries a specific meaning (e.g., “is the word a verb”, “the word is a proper noun”)]
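A quick numerical illustration of this robustness (binary hypervectors, normalized Hamming distance): flipping k of d bits moves a distance by at most k/d, so isolated memory errors barely perturb similarity judgments:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 10_000
A = rng.integers(0, 2, size=d, dtype=np.uint8)   # stored hypervector
B = rng.integers(0, 2, size=d, dtype=np.uint8)   # an unrelated hypervector

def dist(x, y):
    return float(np.mean(x ^ y))

for k in (10, 100, 500):                         # number of corrupted bits
    corrupted = A.copy()
    idx = rng.choice(d, size=k, replace=False)
    corrupted[idx] ^= 1
    # The corrupted copy stays close to A; the unrelated vector stays ~0.5 away.
    print(k, round(dist(A, corrupted), 3), round(dist(B, corrupted), 3))
```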
There are many hypervector representations; data can potentially be stored in real-valued and arbitrary multi-level memories.
Nice fit for MLC / analog emerging memory technologies.
[Figure: radix-6 hypervector [0,3,2,5,4,…,1,4,0]; per-element levels 0–5, where nearby levels are similar (distance 1) and far-apart levels are dissimilar (distance 3)]
Amenable to analysis! Can potentially reason about error through permutation, bundling, and binding operators.
How do people currently reason about error in VSAs?
[Figure: probability distributions of Hamming distance for in-set vs. not-in-set elements]
Why develop analyses? (What do people do now?)
Element is in a set?
Goal: we want to identify whether elements A and D are in a hypervector HV that was constructed by bundling A, B, and C.
Codebook contains random hypervectors for A, B, C, and D
Codebook = A, B, C, D
Hypervector HV = A+B+C; HV represents the set {A,B,C}
Question 1: Is A ∈ {A, B, C}?  →  dist(A,HV) < threshold
Question 2: Is D ∈ {A, B, C}?  →  dist(D,HV) < threshold
Problem: We need to set a threshold distance for distinguishing between elements in a set, and elements not in a set.
We want to be able to correctly differentiate between elements in the set, and elements not in the set.
“Recall accuracy” - fraction of the time we correctly tell when items are in / not in a set.
How do we find a good threshold?
Approach 1: Empirically set threshold by doing a lot of simulations.
Con: time-consuming and requires test inputs. How do we know we have selected the best threshold?
Approach 2: Maximize the chances of finding a good threshold by using large (N=10,000) hypervectors.
Con: May unnecessarily use space.
Approach 3: Maximize the chances of finding a good threshold by executing multiple instantiations of the HD computation with different codebooks.
Con: May unnecessarily use space.
Approach 4: Winner-take-all – don’t set a threshold; find the lowest-distance hypervector.
Con: Restrictive (works mostly in associative-memory cases); cannot determine if an element isn’t contained in any hypervector.
Summary
Standard practice improves computational robustness through empirical approaches.
As we discussed earlier, empirical methods have a lot of drawbacks…
Can we more intelligently optimize HD computations?
Recall in Vector Symbolic Architectures
When picking our threshold in simulation, we considered one random instantiation of the codebook…
It could be that if we try more random instantiations of the codebook, the threshold doesn’t work so well.
Alternate approach: we model the distribution of distances across random codebook instantiations. We model one distribution for the distances of elements in the set, and one for elements not in the set.
We then set the threshold that minimizes misclassification error.
False positive: an element is reported as in the set when it is not.
False negative: an element is reported as not in the set when it is.
We derive symbolic normal distributions parametrized by the number of elements in the bundled hypervector (m) and the dimension of the hypervector (d).
Analytically set threshold to minimize overlap between parametrized distributions.
In-set recall accuracy: the area under the in-set distribution to the left of the threshold.
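The sketch below illustrates the flavor of such a model for the binary, majority-bundling implementation. It assumes the standard approximation that the normalized distance from a bundle of m items (m odd) to an in-set element is Binomial(d, p_in)/d with p_in = 1/2 - C(m-1, (m-1)/2)/2^m, and to an out-of-set element is Binomial(d, 1/2)/d, both treated as normals; the model used in this work may differ in its details.

```python
import numpy as np
from math import comb
from scipy.stats import norm

def distance_models(m: int, d: int):
    """Normal approximations of the in-set / out-of-set distance distributions.

    Assumes binary hypervectors, majority-vote bundling of m items (m odd),
    and normalized Hamming distance; p_in is the assumed per-bit mismatch
    probability between the bundle and one of its members.
    """
    p_in = 0.5 - comb(m - 1, (m - 1) // 2) / 2 ** m
    p_out = 0.5
    in_set = norm(loc=p_in, scale=np.sqrt(p_in * (1 - p_in) / d))
    out_set = norm(loc=p_out, scale=np.sqrt(p_out * (1 - p_out) / d))
    return in_set, out_set

def best_threshold(m: int, d: int):
    """Threshold minimizing false positives + false negatives (equal priors)."""
    in_set, out_set = distance_models(m, d)
    ts = np.linspace(0.0, 0.5, 2001)
    err = (1 - in_set.cdf(ts)) + out_set.cdf(ts)   # FN rate + FP rate
    t = ts[np.argmin(err)]
    return t, in_set.cdf(t)                        # threshold, in-set recall

def min_dimension(m: int, target_recall: float, step: int = 2):
    """Smallest d (to within `step`) whose best threshold meets the target recall."""
    for d in range(step, 20_000 + 1, step):
        _, recall = best_threshold(m, d)
        if recall >= target_recall:
            return d
    return None

t, recall = best_threshold(m=3, d=1000)
print(f"m=3, d=1000: threshold={t:.3f}, in-set recall={recall:.4f}")
print("smallest d for 99% recall with m=3:", min_dimension(3, 0.99))
```

Sweeping m at a fixed d with the same helpers answers the reverse question: the largest set a hypervector of a given size can represent at a target recall.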
Okay, so what can you do with this?
Statically compute the recall accuracy for a given n, m, and threshold without running simulations. This threshold will work well across a range of codebooks.
Automatically find a good threshold by studying the distributions. Having a threshold enables identifying which elements are in a set and which are not.
Given the number of elements in the hypervector set m and a recall accuracy x, find the smallest hypervector size (n) that gives you the recall accuracy x.
Given a hypervector size (n) and a recall accuracy x, find the largest set you can represent with that hypervector.
What else can you do with this model?
The model gives you probabilistic guarantees.
The probabilistic model is computationally efficient and easy to work with.
Incidentals
For the remainder of the talk, we consider the following VSA implementation.
Hypervector Model: Randomly generated 0-1 vectors
Computational Operators
Analytical Models: We developed a set of analytical models for VSAs.
Future Models:
Model: Cartesian product of sets.
Problem: given two hypervector sets s1 and s2, with a ∈ s1 and b ∈ s2, what is the recall accuracy of ⟨a,b⟩ ∈ s1 × s2?
Some of these models get quite complicated…
Case Study 1: Dollar Price of Mexico with 99% accuracy
Dollar Price of Mexico Problem
Analogical query: given the following facts, what is the dollar price of Mexico?
With VSAs, we can elegantly encode the problem of finding the equivalent of the dollar for Mexico.
The answer: the Peso!
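A minimal sketch of one classic way to encode this query (Kanerva-style, with binary hypervectors, XOR binding, and majority bundling); the encoding and dimension here are illustrative and may differ from the exact construction used in this work:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 10_000  # generous dimension, purely for illustration

hv = lambda: rng.integers(0, 2, size=N, dtype=np.uint8)
dist = lambda x, y: float(np.mean(x ^ y))
bind = lambda x, y: x ^ y                        # binding (*): XOR, self-inverse
def bundle(*hvs):                                # bundling (+): majority vote
    return (np.sum(hvs, axis=0) > len(hvs) // 2).astype(np.uint8)

# Codebook: role and filler symbols are independent random hypervectors.
NAME, CAPITAL, CURRENCY = hv(), hv(), hv()
USA, WASHINGTON, DOLLAR = hv(), hv(), hv()
MEXICO, MEXICO_CITY, PESO = hv(), hv(), hv()

# Each country record bundles its role-filler bindings.
usa_rec = bundle(bind(NAME, USA), bind(CAPITAL, WASHINGTON), bind(CURRENCY, DOLLAR))
mex_rec = bundle(bind(NAME, MEXICO), bind(CAPITAL, MEXICO_CITY), bind(CURRENCY, PESO))

# "What is the dollar of Mexico?": bind the two records, then unbind DOLLAR.
query = bind(bind(usa_rec, mex_rec), DOLLAR)

codebook = {"USA": USA, "Washington": WASHINGTON, "Dollar": DOLLAR,
            "Mexico": MEXICO, "Mexico City": MEXICO_CITY, "Peso": PESO}
answer = min(codebook, key=lambda name: dist(query, codebook[name]))
print(answer)  # "Peso": its distance (~0.375) is well below the ~0.5 of the rest
```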
How big of a hypervector do we need to return “Peso” with 99% recall accuracy?
With the analytical model, we deduce that we need a hypervector with d=334, and we also obtain the threshold thresh=0.437 to use for the comparison.
Significantly smaller than d=10,000!
Case Study 2: Testing if an element is in a set
How long should a vector be?
For a given recall of 0.99, the maximum number of supported elements m grows approximately linearly with the dimension of the hypervector.
Implications of this analysis:
Case Study 3: HD computation of finite automata with 99.7% accuracy
Finite Automata with HD Vectors
The HD representation of a finite automaton maintains a set of states the automaton may enter. What should the size of this hypervector be if we want a 99% string matching accuracy?
To obtain a 99% string matching accuracy, we analytically determine we need a 99.7% recall accuracy.
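(Presumably this follows from composing independent recall queries: if matching a string requires roughly three membership queries, each with recall r, the end-to-end accuracy is about r^3, so r ≈ 0.99^(1/3) ≈ 0.9967, i.e., 99.7%.)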
The HD representation of a finite automaton maintains a set of states the automaton may enter. What hypervector dimension and threshold do we need to obtain a 99.7% recall accuracy?
Initially, we need a hypervector with d=2126; we also derive a threshold of 0.471.
After one transition, we need a hypervector with d=820; we also derive a threshold of 0.453.
After two transitions, we need a hypervector with d=193; we also derive a threshold of 0.404.
We can discard/partially compute over the state hypervector as the computation progresses!
Conclusions
Our recall accuracy model enables practitioners to analytically compute the recall accuracy for a given VSA computation (subject to constraints on the implementation).
With this model, we can intelligently parametrize the VSA computation.
With this model, we can also craft new program optimizations.
These are some initial steps to identify if this direction is worth exploring.
What’s the goal here?
Goal: develop a memory-aware VSA computation compiler – optimize VSA computations for accelerators that use emerging memory technologies.
Memory-Aware Vector Symbolic Architecture Compiler
[Diagram: compiler inputs – a vector symbolic architecture computation, a target recall accuracy, and level error models plus the memory array architecture (#rows, #cols)]
Accepts as input a VSA computation (using binding, bundling, and permute operators), a target recall accuracy, and an error model of the emerging memory architecture.
Statically (at compile time) selects VSA operator implementations, a hypervector representation (e.g., binary/k-radix) and dimension, and a level allocation that (1) meets the target recall accuracy and (2) maximizes performance / minimizes memory utilization.
Optimization-Aware Vector Symbolic Architecture Compiler
[Diagram: compiler outputs – a VSA implementation (+, *, dist operator implementations), a hypervector representation, and a level allocation]
Automatically inserts compute- and memory-optimizing program transformations (e.g., data discard) when appropriate, while still delivering the desired recall accuracy.
What’s the broader vision?
Broad Vision
With this system, we will be able to effectively map VSA computations to approximate accelerators while still providing accuracy guarantees.
2. Can productively target accelerators that make use of promising, but unreliable, emerging technologies.
3. Can inspire new hardware optimization algorithms for VSA accelerators.
A Quick Update on the Level Allocation Algorithm from Last Time
Software Stack
Last time, we implemented a new level allocation algorithm, QAlloc.
Given K (# of levels) and hardware characterization data, QAlloc finds the lowest-error level allocation scheme for the storage medium.
The level allocation algorithm produces level allocation schemes by analyzing hardware characterization data.
Hardware characterization data captures resistance drift, process variation-induced resistance variations, and other effects.
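As an illustration of what working directly with characterization data can look like (a generic sketch, not the QAlloc algorithm itself), one can estimate the level-to-level drift matrix empirically from paired write/read-back resistance samples, with no distributional assumptions; the data below is synthetic:

```python
import numpy as np

def drift_matrix(written_r, readback_r, boundaries):
    """Empirical K x K drift matrix from characterization data.

    written_r / readback_r: paired resistance samples (write target, later read-back).
    boundaries: sorted interior level boundaries in ohms (K levels = len(boundaries)+1).
    No distribution is assumed; probabilities come straight from the samples.
    """
    K = len(boundaries) + 1
    w_lvl = np.searchsorted(boundaries, written_r)    # level each write targeted
    r_lvl = np.searchsorted(boundaries, readback_r)   # level each read landed in
    M = np.zeros((K, K))
    for i in range(K):
        mask = (w_lvl == i)
        if mask.any():
            M[i] = np.bincount(r_lvl[mask], minlength=K) / mask.sum()
    return M

# Tiny synthetic example (made-up numbers, in ohms): 4 levels on 7.8k-40k.
rng = np.random.default_rng(3)
targets = rng.choice([10_000, 17_000, 25_000, 35_000], size=5_000)
readback = targets * np.exp(rng.normal(0.0, 0.15, size=targets.size))  # drift + noise
boundaries = [13_000, 21_000, 30_000]

M = drift_matrix(targets, readback, boundaries)
avg_drift = 1.0 - np.mean(np.diag(M))
print(np.round(M, 3))
print(f"average level drift probability: {avg_drift:.2%}")
```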
Level Allocation
Our level allocation approach confers two key benefits over existing level allocation approaches.
Distribution-Agnostic: Doesn’t assume any particular distribution. Works with non-normally distributed analog resistance errors.
In contrast, existing techniques typically fit a distribution to the data and then work with the distribution parameters (e.g., variance).
We employ a search algorithm that intelligently explores a larger space of level allocations than existing techniques.
Last year, we obtained preliminary results that showed our level allocation approach produces lower-error level allocations than competing approaches.
8-level allocation results (average level drift probability): SOTA (SBA): 11.25%; our approach: 6.24%.
This year, we performed an ablation study to understand why.
Ablation study (8-level allocation, average level drift probability):
SOTA (SBA): 11.25%
N(0, sigma) (same distribution as SBA): 8.74%
N(mu, sigma): 10.35%
Distribution-agnostic (our approach): 6.24%
Conclusion 1: Our allocation algorithm searches over a larger space of level allocations.
As a result, it is sometimes able to find better level allocations than SBA, even when working with the same analog resistance model as SBA.
Conclusion 2: Our level allocation algorithm’s ability to model non-normal distributions is critical to finding better-performing allocations.
We find this is true for all the level allocations it produces.
Summary: We have developed a distribution-agnostic level allocation technique that works directly with characterization data.
This level allocation algorithm is generic, and can be applied (or adapted) to any resistance-based memory device that supports MLC storage.