1 of 34

Generating Test-Cases for

ML Compilers

Jiawei Liu

@JiaweiLiu_

https://jiawei-site.github.io/

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

2 of 34

Agenda

  • (ASPLOS’23) NNSmith: Generating Diverse and Valid Test Cases for DL Compilers
  • (FSE’23) NeuRI: Diversifying DNN Generation via Inductive Rule Inference
  • Tutorial of NNSmith package

2

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

3 of 34

DL system correctness is crucial

3

AI Applications

High-level AI Frameworks

Optimized Libraries

Hardware

AI Compilers & Optimizers

OpenAI Triton

NVIDIA GPUs

Google TPUs

CPUs

AMD GPUs

Apple Silicon

Safety

Functionality

User experience

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

4 of 34

DL systems drive DNNs

A DL model (DNN) = a program

  • Data structure: Tensors (multi-dim arrays)
  • Algorithm: “Tensor-flow” over operators
    • Operators transform tensors to tensors

4

def classify(img: Tensor):

y = resize(img, dsize)

z = conv2d(y, ksize)

...

DL Model = Program

Data (Image)

“Bear”

DL system

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

5 of 34

Test-case: DNN model + inputs

5

Test Generator

DL Computation

DL Compiler

Optimized

Non-opt.

Crash

Inconsistency

Oracle

How to Generate

Valid Models & Inputs?

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

6 of 34

Generating valid models

Only valid models can be compiled and executed.

6

Parser

Graph passes

Low-level passes

Codegen

Runtime

Code path

Invalid models

Valid models

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

7 of 34

Generating valid models

Model validity: each operator must be legally constructed

  • Operator(tensors, attributes) -> tensors
    • Different operators have different constraints
    • Constraints are on tensor shapes and attributes

7

Invalid Model

x = ... # shape=[1,3,32,32]

y = avg_pool(x, ksize=33)

💥 Constraint Violation 💥

Kernel size (33) must be no larger (<=) than the last two dimensions (32)!

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

8 of 34

NNSmith: Solver-aided Generation

  • Define constraint rule for each operator
  • Accumulate model-wise constraints
  • Solve the constraints to materialize a real model

8

x= input() # [x0,x1,x2]

y= relu(x) # [y0,y1,y2]

z= pool(y, ksize,stride)

[x0,x1,x2] =x.shape >0

[y0,y1,y2] =y.shape =x.shape

(y1-ksize)//stride > 0

(y2-ksize)//stride > 0

{x0=1, x1=8, x2=8, ksize=3,...}

Collect

Constraints

Solve Constraints

Concrete DNN

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

9 of 34

Minimalist abstraction

  • Need to model hundreds/thousands of operators!
  • What’s the minimalist abstraction for each operator?
    • in order to perform valid model generation

9

Operator rule: a minimal set of properties for model validity

  • Rule#1: Input constraints
  • Rule#2: Type propagation (tensor type: data type + shape)
  • Programmable as Python DSL

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

10 of 34

Rule#1: Input constraints

A requires function that produces a set of predicates over input shape dims and attributes.

10

class Pool2d(OpBase):

# Declare attributes

def __init__(s, kh, kw, pad_h, ...):

s.kh = kh # height of kernel size

...

# Declare constraintsdef requires(s, itensors: List[ATensor]):

ih, iw = itensors[0].shape[-2:]

return [

0 < s.kh < 2 * s.padh + ih,

0 < s.kw < 2 * s.padw + iw,

0 < s.stride, ]

Example of Pool2D:

  • Input dims: o0 = [c0,h0,w0]
  • Attributes: k{h,w}, pad{h,w}, …
  • Constraints:
    • 0 < kh ≤ 2 x padh + h0
    • 0 < kw ≤ 2 x padw + w0
    • n0,c0,h0,w0,… > 0

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

11 of 34

Rule#2: Type (Shape) propagation

  • Input constraints require knowing the input shapes

11

Input Layer

Layer 1

def requires(s, itensors):

...

output shape [c, h, w]

No problem!

[c, h, w] is known

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

12 of 34

Rule#2: Type (Shape) propagation

  • Input constraints requires knowing the input shapes
  • Input shapes needs to be deduced!

12

Input Layer

Layer 1

Layer 2

def requires(s, itensors):

...

output shape [c, h, w]

output shape ???

Unknown…

Needs to deduce the shapes!

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

13 of 34

Rule#2: Type (Shape) propagation

13

Prior layers.

class Pool2d(OpBase):

...

# Declare constraints� def requires(s, itensors: List[ATensor]):

...

# Declare propagation of shapes & dtypes

def type_transfer(s, itensors: List[ATensor])

List[ATensor]:

c, ih, iw = itensors[0].shape

oh = (ih + 2*s.padh - s.kh) // s.stride + 1

ow = (iw + 2*s.padw - s.kw) // s.stride + 1

return [ ATensor(

shape=(c, oh, ow),

dtype=itensors[0].dtype) ]

Pool2d

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

14 of 34

Building a DNN = inserting next Op.

14

Input1

3x3 conv

Add

x,y,z > 0

c,h,w > 0

h-2 > 0

w-2 > 0

- - - - - -

x == c

y == h-2

z == w-2

[c,h,w]

[c,h-2,w-2]

Input0

[x,y,z]

x == c

y == h-2

z == w-2

def requires(s, itensors)

New Constraints

Global Constraints

✅SAT!

1. Randomly pick an operator type (Add)

2. Randomly pick its producers (Input & ReLU)

3. Collect and stack push new constraints

4. Constraint solving

    • ❌UNSAT: Pop the constraint stack & operator
    • ✅SAT: Run “type_transfer”

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

15 of 34

Building a DNN = inserting next Op.

15

Input1

3x3 conv

Add

[c,h,w]

[c,h-2,w-2]

Input0

[x,y,z]

def type_transfer(s, itensors)

[x,y,z]

1. Randomly pick an operator type (Add)

2. Randomly pick its producers (Input & ReLU)

3. Collect and stack push new constraints

4. Constraint solving

    • ❌UNSAT: Pop the constraint stack & operator
    • ✅SAT: Run “type_transfer

5. Goto #1

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

16 of 34

Generating valid computational input

Model validity is NOT enough

  • If we random init X; W
    • ~60% model output NaNs/Infinities
  • Because of input domain violation
    • e.g., Log(X) where X < 0
    • e.g., Arcsin(X) where |X| > 1

16

T = {M, (X; W), O}

Test-case

✅ Model

❓ Computational Inputs

(Inputs & weights)

Expected Output

🚨 False Alarm 🚨

# Undefined behavior!

cast(NaN, to=int) => 🗑️

👻 False Negative 👻

# Normal input: 💥bug found

🐛BUGGY_ADD(1, 1) => 3

# NaN/Inf: 👻bug swallowed

🐛BUGGY_ADD(Inf, 1)=> Inf

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

17 of 34

NNSmith: Input Search with Gradient

Goal: Search (X*; W*) s. t. M(X*; W*) doesn’t incur NaN/Inf

17

Gradient-guided Input Search:

  1. Randomly init (X*; W*)
  2. Execute M(X*; W*) layer by layer
    1. 💥 If the operator is Fi(Xi) produces NaN/Inf
      1. Say Fi = Log
      2. Apply a loss function max(-X, 0).sum() over Xi
      3. Backward propagation to push out-of-domain values to in-domain
    2. NaN/Inf not found; exit!
  3. Redo #2 until timeout

Log(X)

  • Domain: X > 0
  • Goal: Smaller X, larger loss
  • Loss: max(-X, 0).sum()

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

18 of 34

Evaluation Setup

18

Luo et al.

ICSE 21

LEMON

FSE 20

Tzer

OOPSLA 22

Systems under Test

Model-Level Fuzzer

TensorIR-Level Fuzzer

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

19 of 34

NNSmith result highlights

  • 72 new bugs with 51 fixed (paper ver.)
    • 17 semantic bugs (silent wrong output)
    • 43 logic bugs in compiler passes
  • Improve SOTA [1] by 80% coverage on ONNXRT
  • Gradient guide finds valid inputs for 98% of models, with only 3.5ms overhead each

19

Bug reports

[1] Luo et al. 2021. Graph-based Fuzz Testing for Deep Learning Inference Engines. (ICSE 2021)

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

20 of 34

Diversifying Valid Models

  • Model diversity is determined by operator diversity
  • NNSmith manually supports ~60 operator rules

20

Model Diversity

Operator Diversity

Manual and thus unscalable…

Can we automatically synthesize operator rules?

Automated Rule Inference

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

21 of 34

NeuRI: Neural Net Rule Inference 🌠

21

Rule Synthesis

Test Suite

Model Hub

Filter

Simplify

Augment

Invocations

Trace Tensor APIs

Instrumentation

Simplified

Examples

Expr Prune

Rule Reuse

Deduplicate

Rule Inference

Shape

Propagation

Input

Constraints

Rules

Invocations

Forward

Backward

Hybrid DNN Generation

Concolic Op Insertion

GraphIR

Models

Interpreter

Compiler

Inconsistent

Results

Runtime

Error

Sanitizer

Error

Oracle Checking

Bug Reports

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

22 of 34

Instrumenting Operator Invocation

  • Instrument operator invocations from regression tests
  • Simplify the layout of invocations
  • Create more records via mutation

22

avgpool( , ksize=2, …) =>

avgpool, [3,3,3], {ksize=2,…} => [3,2,2]

Simplify

API

Input Shapes

Attributes

Output Shapes

Invocation

Example

Record

Record

Example

avgpool, [3,3,3], {ksize=3, } => [3,1,1]

avgpool, [2,3,3], {ksize=2, } => [2,2,2]

avgpool, [3,3,4], {ksize=2,…} => [3,2,3]

Mutate & Validate

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

23 of 34

Infer Operator Rules from Records

Each type (e.g., operator) of records has 3 sets of symbols

23

avgpool, [3,3,3], {ksize=3, } => [3,1,1]

avgpool, [2,3,3], {ksize=2, } => [2,2,2]

avgpool, [3,3,4], {ksize=2,…} => [3,2,3]

  • #1 Shape propagation: {o = f(A U I); o ∈ O}
    • {O0=I0, O1=(I1-ksize)//stride+1, O2=(I2-ksize)//stride+1}
    • #constraints = #output dimensions (|O|)
  • #2 Input constraints: {0 =/< f(A U I); …}
    • {ksize>0, stride>0, O*>0, ...}
    • #constraints is variadic/unknown

Input Shapes I

Attributes A

Output Shapes O

Question: How to infer f(A U I)?

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

24 of 34

Inductive Rule Inference

Let f(A U I) be an expression under arithmetic grammar

24

⟨expr⟩ ::= ⟨op⟩ ⟨expr⟩⟨expr⟩ | ⟨item⟩

⟨item⟩ ::= ⟨symbol⟩ | ⟨literal⟩

⟨op⟩ ::= + | - | × | ÷ | min | max | mod

⟨symbol⟩ ::= Symbols from A U I

⟨literal⟩ ::= Constant integers

Search-based Inductive Synthesis: Enumerate all terms of the grammar s.t. ∃ expr satisfies all collected examples

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

25 of 34

Optimizations

  • Pruning search space
    • Bounded search
    • Prune over equivalence & rarity

25

  • Rule reusing
    • Insight: Rules of different operators can be similar
    • Try inferred rules as a shortcut before inductive synthesis
  • Post-deduplication
    • Insight: The set of inferred constraints can be redundant
    • Simplify the constraint set for readability & faster online solving

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

26 of 34

Model Generation

  • Some rules are inferred and others are not
  • NNSmith only works for symbolic operator (rule inferred)

26

Concrete

Operator Information

Symbolic

Operator Information

Records

Rule inferred

Rule not found

Works for me

Can we also make use of this?

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

27 of 34

Concolic Model Generation

Using both concrete + symbolic (concolic) information

  • Inserting a symbolic operator <- Same as NNSmith
  • Inserting a concrete operator with exact shape match

27

x= input() # [3,16,16]

y= pool(x,...) # [3,14,14]

Concrete

Operator Information

Query invocation w/

input shape [3,14,14]

reshape(y,[588])

x= input() # [3,16,16]

y= pool(x,...) # [3,14,14]

z= reshape(y,[588]) # inserted

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

28 of 34

Evaluation Setup

28

  • XLA
  • TensorFlow Lite

  • Torch Inductor
  • Torch JIT

NNSmith

ASPLOS 23

Muffin

ICSE 22

DeepREL

FSE 22

Variants of NeuRI

Systems under Test

Model-Level Fuzzer

Op-Level Fuzzer

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

29 of 34

4-month bug finding with NeuRI

🔥 100 new bugs with 57 bugs fixed

    • 27 semantic bugs

🔥 9 bugs are marked as PyTorch

    • constituting 10% of all PyTorch high-priority bugs.

🔥 1 security vulnerability

29

Bug reports

“… the bugs you've reported are high quality … don't look like specially fuzzed set that's impossible to see in practice. They did reveal a few common themes that are easy to encounter in practice …”

-- PyTorch Developer (#93357)

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

30 of 34

👩‍💻 pip install nnsmith

  • https://github.com/ise-uiuc/nnsmith 100 GitHub ⭐
    • 4.5k PyPI downloads; 185 bugs found with 126 bugs fixed
    • Supports 3 model formats and 7 compilers
  • Fuzzing infrastructure:
    • Internal IR to simplify generation
    • Customizable oracle
    • Automated report synthesis

30

NNSmith Tutorial

Bug Reports

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

31 of 34

👩‍💻Generate a PyTorch model

nnsmith.model_gen model.type=torch debug.viz=true

31

Add more options:

# A larger model

mgen.max_nodes=16

# Single input & single output

mgen.method="single-io-cinit"

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

32 of 34

👩‍💻 Fuzzing PyTorch 2.0 compiler

nnsmith.fuzz fuzz.time=1m model.type=torch \

backend.type=pt2

32

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

33 of 34

👩‍💻 Automating bug reports

When finding a bug, NNSmith turns it into a bug report.

33

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io

34 of 34

Thank you!

  • NNSmith: Valid model & input generation (ASPLOS 2023)
  • NeuRI: Automatically inferring operator rules (FSE 2023)
  • https://github.com/ise-uiuc/nnsmith

34

Homepage: https://jiawei-site.github.io/ or jw-liu.xyz/

Contact: jiawei6@illinois.edu @JiaweiLiu_

jiawei6@illinois.edu

University of Illinois, Urbana-Champaign

jiawei-site.github.io