The Impact of Compiler-Based Performance Optimizations on Security
7 February 2020
Michael D. Brown
Software Assurance Associate Branch Head
Research Scientist, GTRI-CIPHER
Introduction
Modern compilers focus on two goals when converting source code into executable binaries: functional correctness and performance.
Security is typically not a design goal. This has led to a number of security weaknesses that are introduced by compiler optimizations.
In this talk, we will explore recent research on security concerns related to optimizing compilers.
Agenda
What are Compiler Performance Optimizations?
Modern compilers are tasked with producing executable binaries from source code that are both functionally correct and performant.
Source code is first converted into an Intermediate Representation (IR) (similar to Assembly language).
While in IR, software analysis and software transformation techniques are used to alter the code to improve performance without altering functional semantics.
Optimized code is then used to generate a machine executable binary.
[Figure: architecture of an optimizing compiler, from [1]]
Example: Constant Propagation (and Dead Code Elimination)
Optimizes code by carrying forward constant values or performing static computations at compile time, simplifying the program.
Simplifications may enable later optimizations, such as Dead Code Elimination.
Original source:

    int foo(int z){
        int a = 1, b = 2;
        int c = 1 + b;
        int d = 100;
        if(c > 0){
            d = d - z;
        } else {
            d = a + z;
        }
        return d;
    }

After one round of constant propagation (the values of a, b, and d are carried forward):

    int foo(int z){
        int a = 1, b = 2;
        int c = 1 + 2;
        int d = 100;
        if(c > 0){
            d = 100 - z;
        } else {
            d = 1 + z;
        }
        return d;
    }

After constant folding and propagation of c:

    int foo(int z){
        int a = 1, b = 2;
        int c = 3;
        int d = 100;
        if(3 > 0){
            d = 100 - z;
        } else {
            d = 1 + z;
        }
        return d;
    }

After dead code elimination (the condition is always true, and a, b, and c are no longer used):

    int foo(int z){
        int d = 100;
        if(true){
            d = 100 - z;
        }
        return d;
    }

Final simplified function:

    int foo(int z){
        return 100 - z;
    }
Example: Function Cloning via Interprocedural Constant Propagation
Creates optimized variants of functions based on interprocedural constant propagation. This trades larger code size for faster execution of some function calls.
Original code:

    int bar(int z) {
        int b = foo(z, z);
        int c = foo(1, z);
        return b + c;
    }

    int foo(int x, int y)
    {
        if(x > 0){
            y = y * 2;
        }
        else{
            y = y + 8;
        }
        return y;
    }

After cloning foo for the call site that passes the constant x = 1:

    int bar(int z) {
        int b = foo(z, z);
        int c = foo_1(z);
        return b + c;
    }

    int foo(int x, int y)
    {
        ...
    }

    int foo_1(int y)
    {
        if(1 > 0){
            y = y * 2;
        }
        else{
            y = y + 8;
        }
        return y;
    }

After constant propagation and dead code elimination inside the clone:

    int bar(int z) {
        int b = foo(z, z);
        int c = foo_1(z);
        return b + c;
    }

    int foo(int x, int y)
    {
        if(x > 0){
            y = y * 2;
        }
        else{
            y = y + 8;
        }
        return y;
    }

    int foo_1(int y)
    {
        return y * 2;
    }
Performance Optimizations in Modern Compilers
Modern compilers implement dozens of individual optimizations, many of which are interdependent.
To simplify the process of selecting which optimizations to apply, optimization levels are used. For example, GCC [2] defines several optimization levels: -O0 (no optimization), -O1, -O2, and -O3 (progressively more aggressive performance optimization), -Os (optimize for size), and -Og (optimize for debuggability).
Agenda
Security Weaknesses Introduced By Performance Optimizations
Problem: Despite years of research on formally proving compilers correct with respect to the source code’s functional semantics, numerous instances have arisen in which compilers violate the security guarantees implemented in source code.
Recent work in 2015 by D’Silva et al. [1] studied this situation and coined the term “Correctness-Security Gap” to describe it.
The underlying problem is that compilers were never designed to satisfy security criteria. It is the subject of open debate whether or not violation of these criteria by a compiler constitutes buggy behavior.
Instances of the correctness-security gap fall into three categories: Elimination of Security-Oriented Code, Information Leaks, and Side Channel Introduction.
Example: Elimination of Security-Oriented Code
A common example occurs as a result of the Dead Store Elimination optimization.
This optimization eliminates writes to memory locations that will not be read afterward.
Data sanitization code that blanks out sensitive data in memory is exactly such a dead store from the optimizer’s point of view, so the security-oriented code is removed (a minimal sketch of this pattern appears below).
This is also an example of information leakage, as sensitive data might later be recovered by an attacker.
[Figure: vulnerable security-oriented function, from [1]]
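A minimal C sketch of this pattern (hypothetical code, not the exact function from [1]): because key is never read after the final memset, the compiler may treat that write as a dead store and delete it, leaving the key bytes behind on the stack.

    #include <string.h>

    /* Hypothetical helpers standing in for real key-derivation and crypto code. */
    void derive_key(const char *password, char *key, size_t len);
    void encrypt_buffer(const char *key, size_t len);

    void handle_request(const char *password)
    {
        char key[32];

        derive_key(password, key, sizeof(key));   /* key now holds sensitive material */
        encrypt_buffer(key, sizeof(key));

        /* Intended sanitization: key is never read after this point, so dead
         * store elimination may silently remove this call. */
        memset(key, 0, sizeof(key));
    }

Typical workarounds are C11’s optional memset_s, explicit_bzero (glibc and the BSDs), or writing the zeros through a volatile pointer so the compiler cannot prove the store is dead.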
Example: Side Channel Introduction
A common example occurs as a result of the Common Subexpression Elimination optimization.
This optimization eliminates redundant computations of subexpressions, reducing execution time.
This optimization can break side-channel defenses that are designed to ensure that both sides of a secret-dependent conditional branch take the same time to execute (a sketch appears below).
[Figure: vulnerable security-oriented function before and after CSE optimization, from [1]]
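A sketch of the kind of countermeasure this can undo (hypothetical code, not the exact example from [1]): the programmer deliberately recomputes (a * b) % n on the "cheap" path so that both sides of the secret-dependent branch perform one multiply-and-reduce. CSE can recognize that this value is already available in t and rewrite the else branch to simply r = t;, so the two paths no longer take the same time and secret_bit can leak through timing.

    #include <stdint.h>

    /* Hypothetical balanced-branch countermeasure (not the code from [1]). */
    uint64_t select_balanced(int secret_bit, uint64_t a, uint64_t b,
                             uint64_t c, uint64_t n)
    {
        uint64_t t = (a * b) % n;        /* always computed before the branch */
        uint64_t r;

        if (secret_bit) {
            r = (t * c) % n;             /* real work on the secret path */
        } else {
            r = (a * b) % n;             /* deliberate recomputation to match the
                                          * cost of the other path; CSE may replace
                                          * it with the already-computed t */
        }
        return r;
    }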
Agenda
Gadget-Based Code Reuse Attacks (CRAs): Return Oriented Programming (ROP)
[Figure: ROP attack illustration, from [6]]
Gadget-Based CRAs: Jump Oriented Programming (JOP)
[Figure: JOP attack illustration, from [6]]
Gadget Types
Functional gadgets perform computational tasks. They are used like individual instructions in an ISA.
Special-purpose gadgets are used as scaffolding to create gadget chains. They are mostly used to create JOP and COP attacks; however, syscall gadgets are useful in all attack models.
Attackers find these gadgets by searching for gadget-producing instructions: returns, indirect calls, and indirect jumps.
x86 and x86-64 have variable-length instructions, so it is possible to decode instructions from an offset other than the original instruction boundary. Gadgets found this way are called unintended gadgets (a sketch of how one can arise follows below).
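A sketch of how an unintended gadget can arise (hypothetical code, not taken from the study’s benchmarks): compiled for x86-64 with optimization, the function below typically becomes mov eax, 0x00c35b5d (bytes b8 5d 5b c3 00) followed by ret (c3). Decoding the mov from its second byte instead yields pop rbp; pop rbx; ret, a usable ROP gadget that appears nowhere in the source code.

    /* Hypothetical example: the constant's little-endian encoding (5d 5b c3 00)
     * decodes, starting one byte into the instruction, as pop rbp; pop rbx; ret. */
    unsigned int get_marker(void)
    {
        return 0x00c35b5du;
    }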
Motivation
Our recent work on the effects of software debloating (another type of software transformation) has shown that even small changes to software can result in large changes to code reuse gadget populations, including the introduction of new gadgets [3].
This led us to ask what effect compiler optimizations have on code reuse gadget populations: do they introduce weaknesses that can be exploited in code reuse attacks?
We set out to answer these motivating questions:
Study Methodology
We conducted our study in a top-down manner to answer our motivating questions.
Using two different production compilers, GCC v7.4.0 and Clang v8.0.1, we built unoptimized (-O0) and optimized (-O1, -O2, -O3) variants of 21 different programs and analyzed them to learn how gadget populations differ at various optimization levels.
We then drilled deeper and built over 900 different “single-optimization” variants and analyzed them to learn how gadget populations are influenced by individual optimizations. We flagged particularly interesting optimizations by looking for outliers in the resulting data set.
Study Methodology - Benchmarks
We selected 21 different software packages that varied in size, complexity, and functionality.
Seven of the benchmarks are commonly used Linux packages:
Fourteen of the packages are from SPEC 2006:
Study Methodology - Metrics
We use previously established metrics for analyzing code reuse gadget sets in program binaries [3], which are calculated using a static analysis tool, GSA (GadgetSetAnalyzer) [3, 4].
Results – Gadget Introduction
Our prior work indicates that software transformations can have drastic effects on code reuse gadget populations.
We expect that some compiler optimizations will remove gadgets, but others will likely introduce new ones; specifically, the space-speed tradeoff optimizations at -O3 that duplicate code.
Results – Gadget Introduction
Results – Gadget Introduction
Key Observations on Gadget Counts:
Key Observations on Gadget Introduction Rate:
Results – Coarse-Grained Qualitative Analysis
Given the high rates of gadget introduction observed, we expect that we will see interesting qualitative gadget set results based on lessons learned in prior work [3].
We start with a coarse-grained qualitative analysis in which we observe changes in our gadget set metrics at different optimization levels.
Results – Coarse-Grained Qualitative Analysis (GCC)
Results – Coarse-Grained Qualitative Analysis (Clang)
Results – Coarse-Grained Qualitative Analysis
Key Observations for Gadget Set Expressivity:
Key Observations for Special Purpose Gadget Availability:
Results – Fine-Grained Qualitative Analysis
The results of our coarse-grained analysis exposed non-linear behavior. This is unexpected, because increasing the optimization level typically means performing all lower-level optimizations in addition to a new set of optimizations.
Since some optimizations can create new optimization targets for others, interaction between individual optimizations is a partial cause of this non-linearity.
The unpredictability of unintended gadgets remains another major factor.
We need to dig deeper and look at individual optimizations. In our fine-grained analysis, we generated over nine hundred different variants and compared them to the appropriate baseline:
Results – Fine-Grained Qualitative Analysis
Results – Fine-Grained Qualitative Analysis
Rather than going number by number, let’s just talk about the trends we observed.
Results – Fine-Grained Qualitative Analysis
Key Observations:
Is there signal in this noise?
Root Causes - Noise
We performed differential binary analysis on the variant binaries with IDA [9] and BinDiff [10] to identify the root causes of the effects we observed.
In general, the noise we observed was the result of “churn” that occurs during software transformation.
We observed that increases in both qualitative metrics could largely be explained by the introduction of new, short unintended gadgets, typically arising from data embedded in the code, such as branch offsets and constants.
This is most prevalent for special-purpose gadget introduction: syscall and intra-stack pivot gadgets were disproportionately affected because they are fewer than four bytes long (syscall, for example, is encoded in just two bytes, 0f 05).
Root Causes – Duplication of Gadget Producing Instructions
A significant amount of gadget introduction was caused by the duplication of gadget-producing instructions. This occurs, logically enough, as a result of IPA-CP Function Cloning.
Before cloning, foo ends in a single return, i.e., one gadget-producing ret instruction in the binary:

    int foo(int x, int y)
    {
        if(x > 0){
            y = y * 2;
        }
        else{
            y = y + 8;
        }
        return y;
    }

After IPA-CP Function Cloning, the original and the clone each end in their own return, duplicating the gadget-producing instruction:

    int foo(int x, int y)
    {
        if(x > 0){
            y = y * 2;
        }
        else{
            y = y + 8;
        }
        return y;
    }

    int foo_1(int y)
    {
        return y * 2;
    }
Root Causes – Duplication of Gadget Producing Instructions
A significant amount of gadget introduction was caused by the duplication of gadget-producing instructions. This happens in both simple and complex transformations.
Root Causes – Special Purpose Gadgets as Optimizations
In one optimization, Tail Call Elimination as implemented in Clang, we observed the intentional creation of instruction sequences that can be used as JOP Dataloader gadgets (a sketch of how tail calls are lowered follows below).
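A sketch of how tail call elimination produces such a sequence (hypothetical code, not taken from the study): when a call in tail position is eliminated, the callee is reached with a jump instead of a call/ret pair, and for an indirect tail call that jump is an indirect jump, one of the gadget-producing instructions listed earlier.

    /* Hypothetical example of an indirect tail call. With sibling/tail call
     * optimization enabled (e.g., at -O2 in GCC and Clang), the call below is
     * typically lowered to an indirect jmp through the function pointer rather
     * than a call followed by ret. */
    int dispatch(int (*handler)(int), int x)
    {
        return handler(x + 1);   /* tail position */
    }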
Agenda
Solutions - Noise
A number of static software transformation techniques that can eliminate unintended gadgets have been proposed by Onarlioglu et al. [11].
For example, a constant in a program whose encoding contains the byte pattern of a return instruction can be rewritten fairly easily (e.g., by building the constant in two steps) so that the unintended gadget terminating at that location disappears.
These techniques are typically not implemented in modern compilers because they are machine-specific and cannot be expressed in IR; most compiler back-ends are not architected to host this kind of optimization pass.
Solutions – Duplicated Gadget Producing Instructions
Compiler optimizations in GCC can be patched to avoid duplicating returns, indirect jumps, and indirect calls. This preserves most of the optimization’s benefit without introducing new gadgets.
Alternatively, a machine-specific back-end pass could merge gadget-producing instructions intra-procedurally.
Solutions – Special Purpose Gadgets as Optimizations
JOP Dataloader gadgets introduced by Tail Call Elimination cannot be fixed through static software transformation without violating the program’s functional semantics.
Dynamic gadget protection methods that incur runtime overhead could be used.
Alternatively, we can choose to disable optimizations such as Tail Call Elimination (e.g., with -fno-optimize-sibling-calls in GCC and Clang) when the performance benefit is not worth introducing this type of gadget.
This could be the case when a program doesn’t otherwise contain these gadgets and tail call elimination would only optimize code that runs infrequently.
Some Closing Notes on Performance Implications
The performance benefits of compiler optimizations are quite significant.
Solutions to mitigate the introduction of high-quality gadgets all carry a cost, which must be considered before use.
Conclusion - Summary of Key Takeaways
Questions?
References