Explorations in Static Pointer Analysis
Adaptive Scalability and Strong Guarantees
Ph.D. Defense
George Kastrinis
Advisor Prof. Yannis Smaragdakis
Athens – July 6th, 2020
HELLENIC REPUBLIC
National and Kapodistrian
University of Athens
Precise yet scalable static pointer analysis algorithms can be obtained by careful choice of different policies for different parts of the program
In a complementary fashion, analyses can be designed to offer (uniquely) strong guarantees on the soundness of results, but only for a part of the program
Static
Pointer Analysis
Naive Pointer Analysis.
B eat (B arg) {
return arg;
}
void relax (A a1, A a2) {
obj1 = new 🍦
obj2 = new 🍕
r1 = a1.eat(obj1);
r2 = a2.eat(obj2);
}
obj1 ↦ {🍦}
obj2 ↦ {🍕}
arg ↦ {🍦, 🍕}
arg ↦ {🍦}
r1 ↦ {🍦, 🍕}
r2 ↦ {🍦, 🍕}
♳
♴
4
Call-Site Sensitive Pointer Analysis .
B eat (B arg) {
return arg;
}
void relax (A a1, A a2) {
obj1 = new 🍦
obj2 = new 🍕
r1 = a1.eat(obj1);
r2 = a2.eat(obj2);
}
obj1 ↦ {🍦}
obj2 ↦ {🍕}
r1 ↦ {🍦}
r2 ↦ {🍕}
♳
♴
6
♳::arg ↦ {🍦}
♴::arg ↦ {🍕}
arg ↦ {🍦, 🍕}
Object Sensitive Pointer Analysis.
void relax (A a1, A a2) {
obj1 = new 🍦
obj2 = new 🍕
r1 = a1.eat(obj1);
r2 = a2.eat(obj2);
}
B eat (B arg) {
return arg;
}
obj1 ↦ {🍦}
obj2 ↦ {🍕}
a1↦ {??}
a2↦ {??}
{🌊,🎢,🏯,🏰,🎪}
7
Object Sensitive Pointer Analysis.
B eat (B arg) {
return arg;
}
void relax (A a1, A a2) {
obj1 = new 🍦
obj2 = new 🍕
r1 = a1.eat(obj1);
r2 = a2.eat(obj2);
}
obj1 ↦ {🍦}
obj2 ↦ {🍕}
🌊::arg ↦ {🍦}
🎪::arg ↦ {🍕}
a1↦ {🌊}
a2↦ {🎪}
r1 ↦ {🍦}
r2 ↦ {🍕}
8
Object Sensitive Pointer Analysis.
B eat (B arg) {
return arg;
}
void relax (A a1, A a2) {
obj1 = new 🍦
obj2 = new 🍕
r1 = a1.eat(obj1);
r2 = a2.eat(obj2);
}
🎪::arg ↦ {🍕}
🏯::arg ↦ {🍕}
obj1 ↦ {🍦}
obj2 ↦ {🍕}
🏰::arg ↦ {🍕}
a1↦ {🌊,🎢}
a2↦ {🎪,🏯,🏰}
r1 ↦ {🍦}
r2 ↦ {🍕}
🌊::arg ↦ {🍦}
🎢::arg ↦ {🍦}
9
Object Sensitive Pointer Analysis.
B eat (B arg) {
return arg;
}
void relax (A a1, A a2) {
obj1 = new 🍦
obj2 = new 🍕
r1 = a1.eat(obj1);
r2 = a2.eat(obj2);
}
🎪::arg ↦ {🍕}
obj1 ↦ {🍦}
obj2 ↦ {🍕}
🏰::arg ↦ {🍕}
a1↦ {🌊,🏰}
a2↦ {🎪,🏰}
r1 ↦ {🍦,🍕}
r2 ↦ {🍦,🍕}
🌊::arg ↦ {🍦}
🏰::arg ↦ {🍦}
10
1-Call-Site Sensitivity Revisited.
class D extends S {
void bar () {
a2 = new 🍕
b2 = id2(a2);
}
}
class C extends S {
void foo () {
a1 = new 🍦
b1 = id2(a1);
}
}
class S {
Object id2 (Object a) { return id(a); }
Object id (Object a) { return a; }
}
a1 ↦ {🍦}
a2 ↦ {🍕}
♳
♴
♳::a ↦ {🍦}
♴::a ↦ {🍕}
♵
♵::a ↦ {🍦, 🍕}
b1 ↦ {🍦, 🍕}
b2 ↦ {🍦, 🍕}
{♵,♳}::a ↦ {🍦}
{♵,♳}::a ↦ {🍦}
{♵,♴}::a ↦ {🍕}
{♵,♴}::a ↦ {🍕}
11
1-Object Sensitivity to the Rescue.
class S {
Object id2 (Object a) { return id(a); }
Object id (Object a) { return a; }
}
class D extends S {
void bar () {
a2 = new 🍕
b2 = id2(a2);
}
}
class C extends S {
void foo () {
a1 = new 🍦
b1 = id2(a1);
}
}
a1 ↦ {🍦}
a2 ↦ {🍕}
🎪::a ↦ {🍦}
🌊::a ↦ {🍕}
b1 ↦ {🍦}
b2 ↦ {🍕}
this ↦ {🎪}
this ↦ {🌊}
🎪::a ↦ {🍦}
🌊::a ↦ {🍕}
12
Datalog
Less Programming
More Specification
Datalog 101.
P(x, z), Q(y) ← R(x, y), S(y, z, w), T(w, x)
if
then
join variable
14
Pointer Analysis in Datalog.
a = new 🍦
b = new 🍕
c = new 🍬
a = b;
b = a;
c = b;
VarPointsTo (var, obj) ←
Alloc (var, obj)
VarPointsTo (to, obj) ←
Move (to, from),
VarPointsTo (from, obj)
VPT
(a, 🍕)
(b, 🍦)
(c, 🍕)
(c, 🍦)
Alloc
(a, 🍦)
(b, 🍕)
(c, 🍬)
Move
(a, b)
(b, a)
(c, b)
(a, 🍦)
(b, 🍕)
(c, 🍬)
recursion
15
Pointer Analysis in Datalog.
MERGE (obj, hCtx, invo, calleeCtx) = callerCtx,
CallGraphEdge (invo, calleeCtx, toMethod, callerCtx) ←
Call (base, sig, invo),
VarPointsTo (base, calleeCtx, obj, hCtx),
LookUp (obj, sig, toMethod)
invo: base.sig (...);
constructor
new context
16
Two Context Constructors / A Multitude of Analyses.
RECORD (obj, ctx) = hCtx
MERGE (obj, hCtx, invo, calleeCtx) = callerCtx
heap context ( HC )
calling context ( C )
17
1-call-site sensitive w/ a context sensitive heap.
RECORD (obj, ctx) = ctx
MERGE (obj, hCtx, invo, ctx) = invo
C = HC = Call
18
1-object sensitive w/ a context sensitive heap.
RECORD (obj, ctx) = ctx
MERGE (obj, hCtx, invo, ctx) = obj
C = HC = Alloc
19
2-object sensitive w/ a context sensitive heap.
RECORD (obj, ctx) = first(ctx)
MERGE (obj, hCtx, invo, ctx) = {obj, hCtx}
C = Alloc x Alloc
HC = Alloc
20
Hybrid Context-Sensitivity for Points-To Analysis (PLDI ‘13)
George Kastrinis, Yannis Smaragdakis
Introspective Analysis: Context-Sensitivity, Across the Board (PLDI ‘14)
Yannis Smaragdakis, George Kastrinis, George Balatsouras
A Datalog Model of Must-Alias Analysis (SOAP ’17)
George Balatsouras, Kostas Ferles, George Kastrinis, Yannis Smaragdakis
An Efficient Data Structure for Must-Alias Analysis (CC ’18)
George Kastrinis, George Balatsouras, Kostas Ferles, Nefeli Prokopaki-Kostopoulou, Yannis Smaragdakis
Defensive Points-To Analysis: Effective Soundness via Laziness (ECOOP ’18)
Yannis Smaragdakis, George Kastrinis
22
Achieving Scalability
Episode 1
Object
Sensitivity
Call-Site
Sensitivity
Hybrid Context
Object Sensitivity + Static Methods.
void relax (A a1) {
a1.eat (obj1);
a1.drink (obj2);
}
static
B Places::reserve (B arg) {
return arg;
}
void drink (B b2) {
r2 = Places.reserve (b2);
}
void eat (B b1) {
r1 = Places.reserve (b1);
}
??::arg ↦ {🍦}
a1 ↦ {🌊}
🌊::b1 ↦ {🍦}
🌊::arg ↦ {🍦, 🍺}
🌊::arg ↦ {🍦}
obj1 ↦ {🍦}
obj2 ↦ {🍺}
🌊::b2 ↦ {🍺}
r1 ↦ {🍦, 🍺}
r2 ↦ {🍦, 🍺}
27
A Hybrid Approach.
void relax (A a1) {
a1.eat (obj1);
…
a1.drink (obj2);
}
void drink (B b2) {
r2 = Places.reserve (b2);
}
void eat (B b1) {
r1 = Places.reserve (b1);
}
static
B Places::reserve (B arg) {
return arg;
}
??::arg ↦ {🍦}
♳::arg ↦ {🍦}
a1 ↦ {🌊}
🌊::b1 ↦ {🍦}
obj1 ↦ {🍦}
obj2 ↦ {🍺}
🌊::b2 ↦ {🍺}
r1 ↦ {🍦}
r2 ↦ {🍺}
♳
♴
♴::arg ↦ {🍺}
28
Two Three Context Constructors / A Multitude of Analyses.
RECORD (obj, ctx) = hCtx
MERGE (obj, hCtx, invo, calleeCtx) = callerCtx
MERGE STATIC (invo, calleeCtx) = callerCtx
29
Uniform 2-object sensitive w/ a context sensitive heap Hybrid.
RECORD (obj, ctx) = first(ctx)
MERGE (obj, hCtx, invo, ctx) = {obj, hCtx, invo}
MERGE STATIC (invo, ctx) = {first(ctx), second(ctx), invo}
C = Alloc x Alloc x Call
HC = Alloc
30
Selective 1-object-sensitive Hybrid A.
RECORD (obj, ctx) = ⭐
MERGE (obj, hCtx, invo, ctx) = obj
MERGE STATIC (invo, ctx) = invo
C = Alloc U Call
HC = ⭐
31
Selective 1-object-sensitive Hybrid B.
RECORD (obj, ctx) = ⭐
MERGE (obj, hCtx, invo, ctx) = {obj, ⭐}
MERGE STATIC (invo, ctx) = {first(ctx), invo}
C = Alloc x (Call U ⭐)
HC = ⭐
32
Selective 2-object-sensitive w/ context sensitive heap Hybrid.
RECORD (obj, ctx) = first(ctx)
MERGE (obj, hCtx, invo, ctx) = {obj, hCtx, ⭐}
MERGE STATIC (invo, ctx) = {first(ctx), invo, second(ctx)}
C = Alloc x (Alloc U Call) x (Alloc U Call U ⭐)
HC = Alloc
33
1-call
1-call+H
1-obj
2-obj+H
U-1-obj
S-2-obj+H
U-2-obj+H
antlr
SB-1-obj
SA-1-obj
34
1.53x
2-object-sensitive+Heap
1.12x
1-object-sensitive
35
Achieving Scalability
Episode 2
Introspective Analysis
Introspective Analysis.
Cheap
Analysis
refine YES/NO?
Expensive
Analysis
37
all in under 12 min
losing a fraction of precision
when-all-else-fails analysis
38
Achieving
Strong Guarantees
Episode 3
Must-Alias Analysis
Must-Alias Analysis.
Must-(Under-approximate)-Alias Analysis.
if (...) {
x = z
} else {
…
}
y = x
y ~ x
y ~? z
b,c,d
a1
a2
member
member
owner
Datalog Model
&
Specialized Graph
x ~? z
a1.member ~ b
a1.member ~ c.owner.member
40
equivalence class
~2 orders of magnitude speedup
41
Achieving
Strong Guarantees
Episode 4
Defensive Points-To Analysis
Defensive Points-To Analysis.
Defensive (Sound) Points-To Analysis.
if (...) {
x = new 🍦
} else if (...) {
x = new 🍕
} else {
...
}
y = x
x.fld = new 🍦
y.fld = new 🍕
x ↦ {🍦, 🍕}
x ↦ {⭐}
when unsure
infer everything
x.fld ↦ {🍦}
y.fld ↦ {🍕}
y ~? x
x.fld ↦ {🍦,🍕}
but be lazy
43
45.6% vars with
significant points-to sets
5-call-site
44.8%
devirtualized calls
44
Precise yet scalable static pointer analysis algorithms can be obtained by careful choice of different policies for different parts of the program
In a complementary fashion, analyses can be designed to offer (uniquely) strong guarantees on the soundness of results, but only for a part of the program
Hybrid Context-Sensitivity for Points-To Analysis (PLDI ‘13)
George Kastrinis, Yannis Smaragdakis
Introspective Analysis: Context-Sensitivity, Across the Board (PLDI ‘14)
Yannis Smaragdakis, George Kastrinis, George Balatsouras
An Efficient Data Structure for Must-Alias Analysis (CC ’18)
George Kastrinis, George Balatsouras, Kostas Ferles, Nefeli Prokopaki-Kostopoulou, Yannis Smaragdakis
A Datalog Model of Must-Alias Analysis (SOAP ’17)
George Balatsouras, Kostas Ferles, George Kastrinis, Yannis Smaragdakis
More Sound Static Handling of Java Reflection (APLAS ‘15)
Yannis Smaragdakis, George Balatsouras, George Kastrinis, Martin Bravenboer
Set-Based Pre-Processing for Points-To Analysis (OOPSLA ‘13)
Yannis Smaragdakis, George Balatsouras, George Kastrinis
Efficient Reflection String Analysis via Graph Coloring (ECOOP ‘18)
Neville Grech, George Kastrinis, Yannis Smaragdakis
Defensive Points-To Analysis: Effective Soundness via Laziness (ECOOP ’18)
Yannis Smaragdakis, George Kastrinis
Static Analysis of Java Dynamic Proxies (ISSTA ‘18)
George Fourtounis, George Kastrinis, Yannis Smaragdakis
Efficient and Effective Handling of Exceptions in Java Points-To Analysis (CC ‘13)
George Kastrinis, Yannis Smaragdakis
Explorations in Static Pointer Analysis: Adaptive Scalability and Strong Guarantees