1 of 14

Bugsplainer: Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation

Parvez Mahbub

Dalhousie University

parvezmrobin@dal.ca

Ohiduzzaman Shuvo

Dalhousie University

oh599627@dal.ca

Mohammad Masudur Rahman

Dalhousie University

masud.rahman@dal.ca

2 of 14

Why should we care?

Explaining Software Bugs 🪲

[1] T. Roehm, R. Tiarks, R. Koschke, and W. Maalej, “How do professional developers comprehend software?” ICSE 2012 

Numerous approaches exist to automatically locate a bug

These approaches flag certain parts of the code as buggy but offer no meaningful explanation

Developers spend ≈ 50% of their time comprehending code during software maintenance [1]

3 of 14

We present

Bugsplainer — a novel transformer-based generative model

Can leverage code structures

Trained using both buggy and fixed source code

4 of 14

How does it work?

Training Bugsplainer

  • In: Bug-fix Commit
  • Generate diffSBT
  • Discriminatory Pre-train (diffSBT of Bug-free Nodes, diffSBT of Buggy Nodes)
  • Fine-tune (Commit Message)
  • Out: Fine-tuned Model

Explanation Generation

  • In: Buggy Code, Line Numbers to Explain
  • Generate diffSBT
  • Fine-tuned Model
  • Out: Generated Explanation

5 of 14

diffSBT
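
The slide itself does not define diffSBT, so the following is only a minimal sketch under stated assumptions: it applies the well-known structure-based traversal (SBT), which serializes an AST as "(Label children )Label", and restricts it to the simple statements that overlap the requested line numbers. The function names `sbt` and `sbt_for_lines` are hypothetical, and the real diffSBT may differ in how it selects and labels nodes.

```python
import ast


def sbt(node):
    """Serialize an AST node as '(Label children )Label' (structure-based traversal)."""
    label = type(node).__name__
    # Assumption: attach identifier and literal text to the node label.
    if isinstance(node, ast.Name):
        label += "_" + node.id
    elif isinstance(node, ast.Constant):
        label += "_" + repr(node.value)
    children = [
        sbt(child)
        for child in ast.iter_child_nodes(node)
        if not isinstance(child, ast.expr_context)  # skip Load/Store markers
    ]
    if children:
        return f"({label} {' '.join(children)} ){label}"
    return f"({label} ){label}"


def sbt_for_lines(source, lines):
    """Emit SBT only for simple statements that overlap the given line numbers."""
    wanted = set(lines)
    selected = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.stmt):
            continue
        # Keep only statements that contain no nested statements.
        if any(isinstance(c, ast.stmt) for c in ast.iter_child_nodes(node)):
            continue
        if wanted.intersection(range(node.lineno, node.end_lineno + 1)):
            selected.append((node.lineno, sbt(node)))
    return " ".join(text for _, text in sorted(selected))


if __name__ == "__main__":
    code = "x = 1\ny = x + 2\n"
    print(sbt_for_lines(code, [2]))
```

Restricting the traversal to the selected lines mirrors the "Line Numbers to Explain" input on the previous slide; whether Bugsplainer filters at statement level is an assumption of this sketch.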

6 of 14

Experiments

Automatic Metrics & Human Evaluation

7 of 14

Experimental Design

Dataset

  • 10,000 repositories
  • 150,000 bug-fix commits
  • 110,000 commits for training

Model

  • RoBERTa tokenizer
  • T5 architecture
  • 60M parameters
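
As a rough sanity check on the 60M figure, the parameter count of a T5 encoder-decoder can be tallied by hand. The sketch below assumes the standard T5-small hyperparameters (6 layers, d_model 512, d_ff 2048, 8 heads, vocabulary of 32,128, tied embeddings, no bias terms); the slide does not state these values, and the vocabulary of the RoBERTa-style tokenizer used by Bugsplainer may differ slightly.

```python
def t5_param_count(vocab=32128, d_model=512, d_ff=2048,
                   n_layers=6, n_heads=8, d_kv=64, rel_buckets=32):
    """Count parameters of a T5-style model (assumed T5-small defaults)."""
    inner = n_heads * d_kv
    attention = 4 * d_model * inner      # q, k, v, o projections, no biases
    feed_forward = 2 * d_model * d_ff    # input and output projections
    layer_norm = d_model                 # scale only, no bias

    encoder_layer = attention + feed_forward + 2 * layer_norm
    decoder_layer = 2 * attention + feed_forward + 3 * layer_norm
    rel_bias = rel_buckets * n_heads     # relative position bias, first layer only

    encoder = n_layers * encoder_layer + rel_bias + layer_norm
    decoder = n_layers * decoder_layer + rel_bias + layer_norm
    embedding = vocab * d_model          # shared with the output head (tied)
    return embedding + encoder + decoder


if __name__ == "__main__":
    print(f"{t5_param_count():,} parameters")
```

Under these assumptions the total lands just above 60 million, consistent with the figure on the slide.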

8 of 14

Evaluation using Metrics

Model                BLEU     Semantic Similarity    Exact Match
pyflakes              0.49     5.68                   0.00
CommitGen             9.94    35.39                   1.04
NNGen                24.16    47.33                  14.17
Fine-tuned CodeT5    26.19    54.52                   8.85
Bugsplainer          32.90    55.22                  18.14
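
To make two of the metrics concrete, here is a minimal sketch of corpus-level BLEU and Exact Match, both reported on a 0-100 scale. The slides do not say which BLEU variant, tokenization, or text normalization was used, so the whitespace tokenization, lowercasing, and lack of smoothing below are assumptions. Semantic Similarity is omitted because it requires a sentence-embedding model.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Unsmoothed corpus BLEU with one reference per hypothesis (0-100)."""
    matches, totals = [0] * max_n, [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        r, h = ref.lower().split(), hyp.lower().split()
        ref_len += len(r)
        hyp_len += len(h)
        for n in range(1, max_n + 1):
            ref_counts, hyp_counts = ngrams(r, n), ngrams(h, n)
            # Clipped matches: each n-gram counts at most as often as in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)


def exact_match(references, hypotheses):
    """Percentage of hypotheses identical to their reference (case-insensitive)."""
    hits = sum(r.strip().lower() == h.strip().lower()
               for r, h in zip(references, hypotheses))
    return 100 * hits / len(references)


if __name__ == "__main__":
    refs = ["fix crash when lyrics are not found", "remove unused import"]
    hyps = ["fix crash when lyrics not found", "remove unused import"]
    print(corpus_bleu(refs, hyps), exact_match(refs, hyps))
```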

9 of 14

Human Evaluation

# Developers    # Countries    Programming Experience    Bugfix Experience
20              6              1-10 years                1-7 years

10 of 14

What does it look like?

Technique            Generated Explanation
Ground Truth         Fix a bug where the lyricswiki fetcher would try to unescape an empty (None) response and crash
CommitGen            Small bug fix for error handling
NNGen                fix UnicodeDecodeError with non-ASCII text
Fine-tuned CodeT5    Don’t try to get lyrics if we are licensed
pyflakes             no error found
Bugsplainer          fix crash when lyrics not found
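
The buggy code itself is not shown on the slide. The sketch below only illustrates the kind of bug the ground truth describes: unescaping a None response crashes, and a guard fixes it. The function names and the use of `html.unescape` are hypothetical stand-ins, not the project's actual code.

```python
import html


def clean_lyrics_buggy(response):
    # Hypothetical buggy version: raises TypeError when the fetcher
    # returns None because no lyrics were found.
    return html.unescape(response).strip()


def clean_lyrics_fixed(response):
    # Hypothetical fix: guard against an empty (None) response first.
    if not response:
        return None
    return html.unescape(response).strip()


if __name__ == "__main__":
    print(clean_lyrics_fixed("Rock &amp; Roll "))
    print(clean_lyrics_fixed(None))
```

A static checker such as pyflakes has nothing to report here, since the crash depends on a runtime value.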

11 of 14

Bugsplainer Meets ChatGPT

Ground Truth

Fix a bug where the lyricswiki fetcher would try to unescape an empty (None) response and crash

12 of 14

Take-Home Messages

  • Software bugs not only claim precious development time but also cost billions every year
  • We propose Bugsplainer, a novel technique that generates explanations for buggy code segments
  • Bugsplainer outperforms all baselines in BLEU, Semantic Similarity, and Exact Match
  • This work was supported by Mitacs Accelerate International Program and our industry partner — Metabob Inc.

13 of 14

Thank You! Questions?

PARVEZMROBIN.COM
