Free Lunch for Testing: Fuzzing Deep-Learning Libraries from Open Source
Anjiang Wei, Yinlin Deng, Chenyuan Yang, Lingming Zhang
CCF-2131943
CCF-2141474
Deep-Learning Libraries
Model Definition
class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = nn.Conv2d(32, 16, 3)
        self.l2 = nn.MaxPool2d((3, 2), 2)
        …
    def forward(self, x):
        x = self.l1(x)
        x = self.l2(x)
        return F.relu(x)
Loading Dataset
class MyDataset(Dataset):
    def __getitem__(self, idx):
        image = read_image(…)
        image = normalize(image)
        label = read_label(…)
        return image, label
Training / Inference
net = MyNet()
for data, label in MyDataset():
    out = net(data)
    loss = criterion(out, label)
    loss.backward()
    …
[Figure: DL library stack – user code in Python on top of the library's C++ core (ATen, CuDNN), running on CPU, GPU, and mobile backends]
Prior Work
[1] Pham et al., "CRADLE: cross-backend validation to detect and localize bugs in deep learning libraries", ICSE 2019.
[Figure: CRADLE's workflow – existing models are run through a high-level library and checked via differential testing across backends]
Prior Work
[2] Wang et al., "Deep learning library testing via effective model generation", FSE 2020.
We also acknowledge using their slides to illustrate the model-level mutation rules below.
Layer Switch (LS)
Layer Copy (LC)
Layer Addition (LA)
Layer Removal (LR)
Activation Function Removal (AFRm)
Activation Function Replace (AFRp)
Multi-Layers Addition (MLA)
Example LEMON model-level mutation rules
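To make the model-level rules above concrete, here is a minimal sketch of a Layer Removal (LR)-style mutation. It is written in PyTorch for consistency with the rest of this deck, although LEMON itself mutates Keras models; the layer_removal helper and its shape-preservation check are illustrative assumptions, not LEMON's implementation.

import copy
import random
import torch
import torch.nn as nn

def layer_removal(model: nn.Sequential, example_input: torch.Tensor) -> nn.Sequential:
    layers = list(model)
    # A layer is only removable if dropping it keeps tensor shapes unchanged,
    # i.e. its output shape equals its input shape.
    removable = []
    x = example_input
    for i, layer in enumerate(layers):
        y = layer(x)
        if y.shape == x.shape:
            removable.append(i)
        x = y
    if not removable:
        return copy.deepcopy(model)
    drop = random.choice(removable)
    kept = [copy.deepcopy(layer) for i, layer in enumerate(layers) if i != drop]
    return nn.Sequential(*kept)

# Example: removing a ReLU (shape-preserving) from a tiny model.
net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 8, 3))
mutant = layer_removal(net, torch.randn(1, 3, 32, 32))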
Motivation
Challenge of Fuzzing APIs
...
m = torch.nn.Conv2d(16, 33, 3)
input = torch.randn(20, 16, 50, 100)
output = m(input)
...
- in_channels (int) – Number of channels in the input image
- stride (int or tuple, optional) – Stride of convolution. Default: 1
- groups (int) – Controls the connections between inputs and outputs. Constraints: in_channels and out_channels must both be divisible by groups
Code Snippets from Doc
Documentation for torch.nn.Conv2d
FreeFuzz Overview
Doc Code
Lib Tests
DL Models
m = torch.nn.Conv2d(16,33,(3,5),…)
input = torch.randn(20,16,50,100)
output = m(input)
def test_conv():
    sizes = [(1, 256, 109, 175),
             (1, 256, 80, 128), …]
    conv = torch.nn.Conv2d(1, 256, …)
    for size in sizes:
        x = torch.randn(size, …)
        out = conv(x)
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv3d(32, …)
        self.conv2 = nn.Conv3d(64, …)
        …

net = MyModel()
for data in dataset:
    net(data)
Code Collection
Instrumentation
API Value Space
Argument Value Space
torch.nn.Conv2d:
entry1: in_channels=16,out_channels=33,
kernel_size=(3,5), …
Input_tensor_shape=(20,16,50,100)
Input_tensor_dtype=float32
entry2: in_channels=1,…
Customized Type
(3,5): (int, int)
input: Tensor<4, float32>
in_channels, int:
torch.nn.Conv2d: 16, 1,…
torch.nn.Conv3d: 32, 64,…
torch.nn.Conv3d:
entry1: in_channels=32,…
out_channels, int:
torch.nn.Conv2d: 33, 16,…
torch.nn.Conv3d: …
Mutation
Type Mutation
Tensor<4, float32> → Tensor<4, float16>
Random Value Mutation
in_channels = random_int()
l = torch.nn.Conv2d(in_channels,…)
input = torch.randn(…,dtype=float16)
Database Value Mutation
# similar_API = torch.nn.Conv3d
in_channels = db.sample_from(Conv3d)
# in_channels -> 32, 64, …
l = torch.nn.Conv2d(in_channels, …)
input = torch.randn(…, dtype=float16)
Oracle
[Figure: three test oracles – wrong-computation via differential testing (are CPU and GPU result tensors equal?), performance via metamorphic testing (is float16 faster than float32?), and crash detection]
Instrumentation: Type Monitoring System FuzzType
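FuzzType is only named on this slide, so here is a hedged sketch of what such a type monitor could look like: it abstracts a concrete argument value into the categories used on the following slides (Tensor<n, dtype>, primitive types, tuples, lists). The fuzz_type function and its tuple-based representation are assumptions for illustration, not FreeFuzz's actual code.

import torch

def fuzz_type(value):
    # Tensors are abstracted as Tensor<num_dims, dtype>.
    if isinstance(value, torch.Tensor):
        return ("Tensor", value.dim(), str(value.dtype))
    # bool must be checked before int (bool is a subclass of int in Python).
    if isinstance(value, bool):
        return ("bool",)
    if isinstance(value, (int, float, str)):
        return (type(value).__name__,)
    if isinstance(value, tuple):
        return ("tuple", tuple(fuzz_type(v) for v in value))
    if isinstance(value, list):
        return ("list", tuple(fuzz_type(v) for v in value))
    return ("unknown", type(value).__name__)

print(fuzz_type((3, 5)))                        # ('tuple', (('int',), ('int',)))
print(fuzz_type(torch.randn(20, 16, 50, 100)))  # ('Tensor', 4, 'torch.float32')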
Instrumentation: API Value Space
API Value Space
torch.nn.Conv2d:
Entry1:
in_channels=16,
out_channels=33,
kernel_size=(3,5),
…
tensor_shape=(20,16,50,100)
tensor_dtype=float32
Code Execution
m = torch.nn.Conv2d(16, 33, (3,5),…)
input = torch.randn(20, 16, 50, 100)
output = m(input)
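A hedged sketch of how this dynamic instrumentation could work: wrap an API entry point so that every real invocation (from documentation snippets, library tests, or model code) records its concrete argument values into the API value space. The VALUE_SPACE dict and record_call helper are illustrative names; FreeFuzz's actual hooking and storage differ in detail.

import functools
import torch

VALUE_SPACE = {}   # API name -> list of recorded argument entries

def record_call(api_name, original):
    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        # Record the concrete argument values before running the real API.
        VALUE_SPACE.setdefault(api_name, []).append(
            {"args": args, "kwargs": kwargs})
        return original(self, *args, **kwargs)
    return wrapper

# Hook torch.nn.Conv2d's constructor so every instantiation is logged.
torch.nn.Conv2d.__init__ = record_call("torch.nn.Conv2d", torch.nn.Conv2d.__init__)

m = torch.nn.Conv2d(16, 33, (3, 5))               # recorded as an entry
print(VALUE_SPACE["torch.nn.Conv2d"][0]["args"])  # (16, 33, (3, 5))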
Instrumentation: Argument Value Space
Argument Value Space
torch.nn.Conv2d:
entry1: in_channels=16, out_channels=33, …
entry2: in_channels=1, out_channels=5, …
in_channels, int:
torch.nn.Conv2d: 16, 1, …
torch.nn.Conv3d: 32, …
torch.nn.Conv3d:
entry1: in_channels=32,out_channels=64
out_channels, int:
torch.nn.Conv2d: 33, 5,…
torch.nn.Conv3d: 64, …
API Value Space
Mutation: Overview
See paper for details
Simplified Algorithm:
Randomly sample one entry in API Value Space
For each argument in the entry:
    if no_mutation():  # random boolean
        continue
    type = FuzzType(argument)
    if do_type_mutation():  # random boolean
        type = TypeMutation(type)
    if select_rand_over_db():  # random boolean
        argument = RandValueMutation(type, argument)
    else:
        argument = DBValueMutation(type, argument)
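Below is an executable toy version of the simplified algorithm, with stand-ins for FuzzType, TypeMutation, RandValueMutation, and DBValueMutation; the real FreeFuzz operators are richer (see the paper). All probabilities and value pools are illustrative assumptions.

import random

def no_mutation():          return random.random() < 0.3
def do_type_mutation():     return random.random() < 0.5
def select_rand_over_db():  return random.random() < 0.5

def fuzz_type(v):
    return type(v).__name__

def type_mutation(t):
    return random.choice([x for x in ["int", "float", "bool", "str"] if x != t])

def rand_value_mutation(t, v):
    # Draw a fresh random value of the (possibly mutated) type.
    return {"int": random.randint(-1024, 1024),
            "float": random.uniform(-1e3, 1e3),
            "bool": random.choice([True, False]),
            "str": "fuzz"}.get(t, v)

def db_value_mutation(t, v, db):
    # Sample a recorded value of the same type from a toy value database.
    candidates = [x for x in db if fuzz_type(x) == t]
    return random.choice(candidates) if candidates else v

def mutate_entry(entry, db):
    mutated = dict(entry)
    for name, value in entry.items():
        if no_mutation():
            continue
        t = fuzz_type(value)
        if do_type_mutation():
            t = type_mutation(t)
        if select_rand_over_db():
            mutated[name] = rand_value_mutation(t, value)
        else:
            mutated[name] = db_value_mutation(t, value, db)
    return mutated

entry = {"in_channels": 16, "out_channels": 33, "kernel_size": (3, 5)}
print(mutate_entry(entry, db=[1, 32, 64, 0.5, True]))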
Mutation: Type Mutation
| Mutation Strategies   | T1                           | T2                                        |
| Tensor Dim Mutation   | Tensor<n1, DT>               | Tensor<n2, DT>, n2 ≠ n1                   |
| Tensor Dtype Mutation | Tensor<n, DT1>               | Tensor<n, DT2>, DT2 ≠ DT1                 |
| Primitive Mutation    | T1 ∈ {int, bool, float, str} | T2 ∈ {int, bool, float, str}, T2 ≠ T1     |
| Tuple Mutation        | (T_1, …, T_n)                | (type_mutate(T_1), …, type_mutate(T_n))   |
| List Mutation         | [T_1, …, T_n]                | [type_mutate(T_1), …, type_mutate(T_n)]   |
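As a sketch of the two tensor-level strategies in the table above, the helpers below mutate a tensor's dimension while keeping its dtype, or its dtype while keeping its shape. The dtype pool and the way the fresh tensor is materialized are assumptions for illustration.

import random
import torch

DTYPES = [torch.float16, torch.float32, torch.float64, torch.int32, torch.int64]

def tensor_dim_mutation(t):
    # Tensor<n1, DT> -> Tensor<n2, DT> with n2 != n1: build a fresh random
    # tensor with the mutated rank but the original dtype.
    n2 = random.choice([n for n in range(7) if n != t.dim()])
    shape = tuple(random.randint(1, 8) for _ in range(n2))
    return torch.randint(0, 8, shape).to(t.dtype)

def tensor_dtype_mutation(t):
    # Tensor<n, DT1> -> Tensor<n, DT2> with DT2 != DT1.
    dt2 = random.choice([d for d in DTYPES if d != t.dtype])
    return t.to(dt2)

x = torch.randn(20, 16, 50, 100)
print(tensor_dim_mutation(x).shape, tensor_dtype_mutation(x).dtype)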
Mutation: Value Mutation
in_channels = random_int()
l = torch.nn.Conv2d(in_channels,…)
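The snippet above shows random value mutation. For contrast, here is a hedged sketch of database value mutation, which samples in_channels from values recorded for similar APIs in the argument value space shown earlier; the argument_db dict and db_value_mutation helper are illustrative stand-ins, not FreeFuzz's API.

import random
import torch

# Values previously recorded for the argument name `in_channels` (toy data).
argument_db = {
    ("torch.nn.Conv2d", "in_channels"): [16, 1],
    ("torch.nn.Conv3d", "in_channels"): [32, 64],
}

def db_value_mutation(arg_name):
    # Pool the values recorded for this argument name across all APIs,
    # so Conv2d can borrow realistic values observed for Conv3d.
    pool = [v for (api, name), values in argument_db.items()
            if name == arg_name for v in values]
    return random.choice(pool)

in_channels = db_value_mutation("in_channels")
l = torch.nn.Conv2d(in_channels, 33, 3)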
Test Oracle
import torch
m = torch.nn.Conv2d(64, 128, 1, 2).cuda()
tensor = torch.rand(1, 64, 32, 32).cuda()
torch.backends.cudnn.enabled = True
output1 = m(tensor) # with CuDNN enabled
torch.backends.cudnn.enabled = False
output2 = m(tensor) # with CuDNN disabled
print(output1.sum(), output2.sum()) # debugging
assert torch.allclose(output1, output2) # fail
Buggy code #1
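Buggy code #1 exercises the wrong-computation oracle (differential testing with CuDNN enabled vs. disabled). The performance oracle from the overview can be sketched as a metamorphic check: running the same API in lower precision should not be noticeably slower than in higher precision. The shapes, iteration count, and 1.5x threshold below are illustrative assumptions, not FreeFuzz's exact implementation.

import time
import torch

def run(dtype, iters=50):
    # Time the same Conv2d workload under a given precision on GPU.
    m = torch.nn.Conv2d(64, 128, 3).cuda().to(dtype)
    x = torch.randn(8, 64, 56, 56, device="cuda", dtype=dtype)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        m(x)
    torch.cuda.synchronize()
    return time.perf_counter() - start

if torch.cuda.is_available():
    t16, t32 = run(torch.float16), run(torch.float32)
    # Flag a potential performance bug if float16 is markedly slower than float32.
    if t16 > 1.5 * t32:
        print("possible performance bug: float16 slower than float32")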
RQ1: Input Source Study
RQ2&3: Coverage Trend & Ablation Study
Coverage trend analysis for PyTorch
RQ4: Comparison with Prior Work
Comparison on input coverage
|           | FreeFuzz (tf1.14) | LEMON | CRADLE |
| # API     | 313               | 30    | 59     |
| Line Cov. | 33389             | 29489 | 28967  |

Comparison with LEMON on mutation
|           | FreeFuzz (tf1.14) | LEMON |
| # API     | 313               | 35    |
| Line Cov. | 35473             | 29766 |
| Time      | 7h                | 25h   |
RQ5: Detected Bugs
|            | FreeFuzz | FreeFuzz -TypeMu | FreeFuzz -RandMu | FreeFuzz -DBMu | FreeFuzz -AllMu | Confirmed (Fixed) |
| PyTorch    | 28       | 13               | 24               | 26             | 5               | 23 (7)            |
| TensorFlow | 21       | 20               | 5                | 20             | 2               | 15 (14)           |
More Bug Examples
import torch
from torch.nn import Conv3d
x = torch.rand(2, 3, 3, 3, 3)
Conv3d(3, 4, 3, padding_mode='reflect')(x) # Crash
Documentation
torch.nn.Conv3d
padding_mode (string, optional) – Supported values: 'zeros', 'reflect', 'replicate' or 'circular'
Buggy code #2
More Bug Examples
import torch
m_gpu = torch.nn.MaxUnpool2d(2, stride=2).cuda()
m_cpu = torch.nn.MaxUnpool2d(2, stride=2)
tensor = torch.rand(1, 1, 2, 2)
indices = torch.randint(-32768, 32768, (1, 1, 2, 2))
gpu_result = m_gpu(tensor.cuda(), indices.cuda())
cpu_result = m_cpu(tensor, indices) # Exception on CPU
Buggy code #3
GPU produces a wrong result silently without throwing any error!
Conclusion
Questions? Email: Anjiang Wei <anjiang@stanford.edu>
Backup Slides
Code Collection & Instrumentation