1 of 84

CirFix: Automated Hardware Repair and its�Real-World Applications

Priscila Santiesteban, Yu Huang, Westley Weimer, Hammad Ahmad

Preliminary Exam Presentation

January 20^th, 2023

Title Slide 2

2 of 84

Debugging = Expensive & Time-Consuming

2

“Debugging, on average, has grown to consume more than 60% of today’s ASIC and SoC verification effort.”

-Harry Foster, Mentor Graphics Corporation

3 of 84

What is a Good Solution to Debugging?

3

1. Fix Bugs

2. Localize bugs

3. But also be liked and useful to users!

As established, we want a solution that will help reduce time and costs associated with debugging bugs.

We may seek the following properties to our solution :

1. A solution that can fix bugs – this can reduce costs associated with finding bugs later or debugging time

2. A solution that can find bugs – this may reduce debugging time

3. Whatever solution we come up with should we liked and useful for programmers – a solution that isn’t liked by users won’t be deployed and thus we won’t reap the benefits we desires in terms of solving the issue of debugging.

These properties are not something we came up out of the blue but are inspired by the software domain. The software side has developed solutions to debugging that have 1 or 2 of these properties.

Not only that, but they have shown to be successful and influential

Intuition may ask us: can we transfer these proposed software-based solutions to hardware since they claim to be good solutions?

In the next few slides I will discuss what solutions the software domain have considered that deploy these properties and why these solutions may not be currently applicable to the hardware domain.

4 of 84

(1) Motivation for Automated Fixing�Automatic Program Repair

4

Definition: Automatic Program Repair (or APR) is a technique that identifies a patch for a given bug

5 of 84

(1) Motivation for Automated Fixing�Automatic Program Repair

5

Definition: Automatic Program Repair (or APR) is a technique that identifies a patch for a given bug

Faulty program w/ bug(s)

Test suite w/ at least one failing test

Fault localizing

Patch

Validation

Repaired program

No Repairs Found

OR

“Generate and Validate”

Evaluating

6 of 84

(1) Motivation for Automated Fixing�Automatic Program Repair

6

Definition: Automatic Program Repair (or APR) is a technique that identifies a patch for a given bug

7 of 84

(1) Motivation for Automated Fixing�Automatic Program Repair

7

Definition: Automatic Program Repair (or APR) is a technique that identifies a patch for a given bug

Facebook

8 of 84

(1) Motivation for Automated Fixing�Automatic Program Repair

8

Definition: Automatic Program Repair (or APR) is a technique that identifies a patch for a given bug

9 of 84

(1) Motivation for Automated Fixing�Automatic Program Repair

9

Definition: Automatic Program Repair (or APR) is a technique that identifies a patch for a given bug

However, APR does not work for hardware as it currently stands

10 of 84

What is a Good Solution to Debugging?

10

1. Fix Bugs

2. Localize bugs

3. But also be liked and useful to programmers!

11 of 84

(2) Motivation for Localizing Bugs�Fault Localization

11

Definition: Fault Localization (FL) is a technique used to help determine which lines in the code could be responsible for a given bug

12 of 84

(2) Motivation for Localizing Bugs�Fault Localization

12

Definition: Fault Localization (FL) is a technique used to help determine which lines in the code could be responsible for a given bug

13 of 84

(2) Motivation for Localizing Bugs�Fault Localization

13

Definition: Fault Localization (FL) is a technique used to help determine which lines in the code could be responsible for a given bug

14 of 84

(2) Motivation for Localizing Bugs�Fault Localization

14

Definition: Fault Localization (FL) is a technique used to help determine which lines in the code could be responsible for a given bug

However, FL does not work for hardware as it currently stands

15 of 84

What is a Good Solution to Debugging?

15

1. Fix Bugs

2. Localize bugs

3. But also be liked and useful to users!

16 of 84

(3) Motivation for Human Relevance�Users Subjective Judgment

APR tools need to be trusted to even be deployed!

Literature gives evidence of programmers not trusting automated repairs

APR tools are hard to use in many settings (more prevalently in the classroom)

Humans sometimes care about more complicated fixes (2+ fixes)

APR may only help produce one fix

16

FIRSTLY,

In the software domain, we have found that automatic repair tools need to be trusted to be deployed. The literature has given us evidence that humans have a hard time trusted automatic generated patches, which can result in a fail in part of the tool and its purpose

Besides needing to be trusted, we find that automatic program repair tools are limited in the settings they can be used in.

For example, although APR could be deployed to help industrial programmers fix bugs, it could also help the very students that will become industrial programmers how to fix bugs!

Expanding the use case for APR could be beneficial

LASTLY

In the software domain, we learn that humans sometimes care about fixing more complicated bugs. Why does this matter?

Well, most automatic program repair tools only produce single-edit fixes. Meaning Most Automatic program repair tools can fix a bug in a program if it only needs to do a single-edit to fix it.

It may not be able to help programmers that want to fix a more complicated bug.

17 of 84

Summary of Motivation

A good solution may help….

17

1. Fix Bugs

2. Localize bugs

3. But also be liked and useful for users!

18 of 84

What are Some Desired Properties?

Need Automated Program Repair for circuit design

Issue – APR does not work for circuits, but we need to alter it to guide the search

Need good fault localization for circuit designs

Issue – Existing methods cannot handle parallelism and rely on knowing which lines were executed

Need a tool humans enjoy using and is useful to them

Issue - Automated test case approaches do not tell us how humans feel

Different tools are useful in many contexts

18

19 of 84

Circuit Design Example:�Gate Level Modeling in Verilog

19

20 of 84

Circuit Design Example:�Gate Level Modeling in Verilog

20

21 of 84

Circuit Design Example:�Gate Level Modeling in Verilog

21

22 of 84

Proposed Solution

Propose a genetic programming-based APR algorithm to guide the search

Genetic Programming (GP) : Technique that iteratively evolves programs until we reach desired functionality

Propose a data-flow based fault localization algorithm that works on circuits

Data-flow approach propagates values along control-flow and data dependencies
It does not rely on which lines are visited and handles circuit style concurrency

Propose conducting a focused human study

We need a human study to evaluate subjective judgment of the specific tool

22

23 of 84

Introducing CirFix: High-level

Presents a novel approach to guide the search for a hardware design repair using the existing hardware design processes
Presents a novel dataflow-based fault localization approach for hardware designs to implicate faulty design code
Presents a human study to evaluate the objective performances and subjective judgment of designers as they interact with CirFix’s fault localization

23

CirFix: A hardware-design focused automated repair algorithm based on genetic programming (GP)

24 of 84

Introducing CirFix: High-level

Presents a novel approach to guide the search for a hardware design repair using the existing hardware verification process
Proposes a novel dataflow-based fault localization approach for hardware designs to implicate faulty design code
Proposes a human study to evaluate the objective performances and subjective judgment of designers as they interact with CirFix’s fault localization

24

CirFix: A hardware-design focused automated repair algorithm based on genetic programming (GP)

25 of 84

Fitness Function in APR

25

Test suite with two failing tests

tc0: pass tc4: fail

tc1: pass tc5: pass

tc2: fail tc6: pass

tc3: pass tc7: pass

Fitness = 0.75

(6 passing, 2 failing tests)

26 of 84

Fitness Function in APR

Compiler version N-2017.12-SP2-1_Full64; Runtime version N-2017.12-SP2-1_Full64;

time, clk, reset, enable, count_out, overflow_out

0, 0, 0, 0, x, x

5, 1, 0, 0, x, x

...

250, 0, 0, 1, 5, 1

255, 1, 0, 1, 5, 1

256, 1, 0, 1, 6, 1

$finish called from file "first_counter_tb_t3.v", line 70.

$finish at simulation time 258

26

Test suite with two failing tests

tc0: pass tc4: fail

tc1: pass tc5: pass

tc2: fail tc6: pass

tc3: pass tc7: pass

Fitness = 0.75

(6 passing, 2 failing tests)

Fitness = ???

27 of 84

Fitness Function Modified for Hardware

Fitness scores to evaluate candidate repairs
Testbench instrumentation to record the values of wires and registers at specified timesteps
Bit-level comparison of instrumented wires and registers against expected behavior (oracle)

Details on the bit-level comparison detailed in the paper!

27

28 of 84

Introducing CirFix: High-level

Presents a novel approach to guide the search for a hardware design repair using the existing hardware design process
Proposes a novel dataflow-based fault localization approach for hardware designs to implicate faulty design code
Proposes a human study to evaluate the objective performances and subjective judgment of designers as they interact with CirFix’s fault localization

28

CirFix: A hardware-design focused automated repair algorithm based on genetic programming (GP)

29 of 84

Fault localization for Hardware

Produces a set of implicated design code for a faulty circuit description

29

AST for circuit design, simulation output, circuit oracle

Comparison of output wire values between simulation and oracle to get identifier names with output mismatch

Fixed point analysis of assignments to output wires and registers

Set of implicated AST nodes

As a high-level: our fault localization approach produces a set of implicated design code for a faulty circuit description.

The approach requires a AST for the circuit design, simulation output and a circuit oracle

It will then conduct a comparison between the simulation outputs of the wires/registers to their expected behavior (similar to what we saw in the fitness function) .

If the outputs do not match the expected output, our fault localization will determine those wires/registers are suspicious and collect their identifier names under a set called a “MISMATCH”

This is important as the fault localization will use these mismatch wires to implicate lines of circuit code that use them! I will go into further detail on the next slide.

Finally, our fault localization outputs the set of implicated AST nodes.

30 of 84

Fixed Point Analysis of Assignments

30

Returns a set of wire/register names that have output mismatch

Two ways to be implicated:

Assignments: assigned variable in the mismatch set

(e.g., count <= 1’b1;)

Conditionals: conditional includes a mismatched variable

(e.g., if(reset==1’b1) count <= 1’b0;)

Fault Localization Algorithm (Fixed Point Analysis)

On this slide, you’ll see pseudo code for our fault localization embedded with fixed point analysis. (understanding how exactly the fault localization works due to the interest times)

As mentioned on the previous slide, the set of wires and register that have values that don’t match our expected output are collected on the set MISMATCH PRIME.

The purple box represents the bulk of the fixed-point analysis. In short, we utilize our mismatch set to check for implicated lines of code, particularly nodes in a abstract syntax tree.

There are two ways a node could be implicated and that’s either by data dependencies (or assignments) or control dependencies (or conditionals).

FINALLY, This box transitively captures the data and control dependencies by bringing in implicated wires/registers that aren’t in the mismatch set, is now part of the mismatch set.

31 of 84

32 of 84

Introducing CirFix: High-level

Presents a novel approach to guide the search for a hardware design repair using the existing hardware design process
Proposes a novel dataflow-based fault localization approach for hardware designs to implicate faulty design code
Proposes a human study to evaluate the objective performances and subjective judgment of designers as they interact with CirFix’s fault localization

32

CirFix: A hardware-design focused automated repair algorithm based on genetic programming (GP)

33 of 84

Experimental Setup

RQ #1. What fraction of defect scenarios can CirFix repair?

RQ #2. Does the fault localization scale?

RQ #3.a. Does the fault localization algorithm improve designer’s objective performance?

RQ #3.b. In what context do designers find the fault localization helpfull?

33

Experimental Setup

Research Questions of Focus:

Can we fix bugs?

Can we find bugs?

Do designers like it?

34 of 84

Experimental Setup

RQ #1. What fraction of defect scenarios can CirFix repair?

RQ #2. Does the fault localization scale?

RQ #3.a. Does the fault localization algorithm improve designer’s objective performance?

RQ #3.b. In what context do designers find the fault localization helpfull?

34

Experimental Setup

Research Questions of Focus:

Can we fix bugs?

Can we find bugs?

Do designers like it?

35 of 84

35

A defect scenario consists of :

A Verilog circuit design
An instrumented testbench for the design
A developer-provided oracle for circuit behavior
A design defect for the circuit

Benchmark Suite of Defect Scenario

36 of 84

Benchmark Suite of Defect Scenario

36

A defect scenario consists of :

A Verilog circuit design
An instrumented testbench for the design
A developer-provided oracle for circuit behavior
A design defect for the circuit
A seeded defect for the circuit by hardware experts

37 of 84

Benchmark suite: Hardware Projects

37

Project	Description	LOC
decoder_3_to_8	3-to-8 decoder	25
counter	4-bit counter with an overflow bit	56
flip_flop	T-flip flop	16
fsm_full	Finite state machine	115
lshift_reg	8-bit left shift register	30
mux_4_1	4-to-1 multiplexer	19
i2c	Two-wire, bidirectional serial bus for data exchange	2018
sha3	Cryptographic hash function	499
tate_pairing	Core for running the Tate bilinear pairing algorithm for elliptic curves	2206
reed_solomon_decoder	Core for Reed-Solomon error correction	4366
sdram_controller	Synchronous DRAM (SDRAM) memory controller	420

Introductory-level VLSI course

Projects

OpenCores projects

Open-source GitHub project

38 of 84

Benchmark Suite Defect Seeding

38

Recruited three hardware experts to seed defects into circuits

Two categories of defects

Category 1 (i.e., “easy”)
Category 2 (i.e., “hard”)

32 defect scenarios in benchmark suite

19 Category 1 defects
13 Category 2 defects

39 of 84

RQ1: What fraction of defects can CirFix Repair?

2.05 hours average wall-clock time to find a repair

39

Result: CirFix found 21/32 (65.6%) plausible repairs

16/32 (50.0%) correct repairs (via manual inspection)

40 of 84

Experimental Setup

RQ #1. What fraction of defect scenarios can CirFix repair

RQ #2. Does the fault localization scale?

RQ #3.a. Does the fault localization algorithm improve designer’s objective performance?

RQ #3.b. In what context do designers find the fault localization helpfull?

40

Experimental Setup

Research Questions of Focus:

Can we fix bugs?

Can we find bugs?

Do designers like it?

41 of 84

Why do we care about the scalability of the fault localization?

One Threat to Validity in Our CirFix’s Experiments:

Open-Source projects available are not indicative of real-world applications could have thousands of wires and multiple dependences

41

42 of 84

Why do we care about the scalability of the fault localization?

One Threat to Validity in Our CirFix’s Experiments:

Open-Source projects available are not indicative of real-world applications could have thousands of wires and multiple dependences

42

Pointer/Alias Analysis

43 of 84

Why do we care about the scalability of the fault localization?

One Threat to Validity in Our CirFix’s Experiments:

Open-Source projects available are not indicative of real-world applications could have thousands of wires and multiple dependences

43

Pointer/Alias Analysis

However, the more we increase the number of pointers but keep the number of address the same, the more likely we end up with all the pointers (even indirectly) point to the same address. Now imagine those pointers are the wires in a circuit and the addresses are more wires directing output.

We would expect this behavior from larger hardware projects.

Assume we run our fault localization algorithm on such project.

When we search for additional wires/registers that are transitively mismatched, we may end up with most of these wires being implicated by our fault localization, leading to a less precise fault localization.

We could leverage this insight to approximate scalability of our solutions to real-world applications, by running the benchmarks we do have but under a less precise fault localization.

Let me run through an example.

44 of 84

Review on CirFix’s Fault Localization

44

Fixed Point Analysis Assignment

Returns a set of wire/register names that have output mismatch

Two ways to be implicated:

Assignments: assigned variable in the mismatch set
Conditionals: conditional includes a mismatched variable

Example Initial Mismatch List :

{wire A, wire B}

Example Final Mismatch List :

{wire A, wire B, wire C, register A}

45 of 84

Introducing Noise to the Fault Localization

45

“R” – the probability a matching simulated wire/register and oracle will be added to the Initial mismatch

Example Initial Mismatch List (R = 0) :

{wire A, wire B}

46 of 84

Introducing Noise to the Fault Localization

46

“R” – the probability a matching simulated wire/register and oracle will be added to the Initial mismatch

NEW Example Initial Mismatch List (R >0):

{wire A, wire B,

wire C, register f}

47 of 84

Introducing Noise to the Fault Localization

47

“R” – the probability a matching simulated wire/register and oracle will be added to the Initial mismatch

'F', 'reset’}

NEW Example Initial Mismatch List (R >0):

{wire A, wire B,

wire C, register f}

Example Final Mismatch List (R = 0) :

{wire A, wire B, wire C, register A}

48 of 84

Introducing Noise to the Fault Localization

48

“R” – the probability a matching simulated wire/register and oracle will be added to the Initial mismatch

'F', 'reset’}

NEW Example Final Mismatch List (R >0) :

{wire A, wire B, wire C, register A, register f}

NEW Example Initial Mismatch List (R >0) :

{wire A, wire B,

wire C, register f}

49 of 84

RQ2: Does CirFix Fault Localization Scale?

49

Answer: CirFix may scale to larger real-world projects

Why?

21/21 (100%) plausible repairs at 25% additional noise

19/21 (90.4%) plausible repairs at 50% additional noise

20/21 (95.2%) plausible repairs at 75% additional noise

50 of 84

Experimental Setup

RQ #1. What fraction of defect scenarios can CirFix repair?

RQ #2. Does the fault localization scale?

RQ #3.a. Does the fault localization algorithm improve designers’ objective performance?

RQ #3.b. In what context do designers find the fault localization helpfull?

50

Experimental Setup

Research Questions of Focus:

Can we fix bugs?

Can we find bugs?

Do designers like it?

51 of 84

Human Study Experiment Setup

51

A debugging task consists of :

A sample of defect scenarios paired with debugging hints

Debugging hints were either given in full, partial, or none

Questions to assess participant objective performance and subjective judgement

UM IRB- HUM00199335

52 of 84

Experimental Design: Debugging task

52

Defect Scenario Snippet

Design Description

Debugging Hints

(Full, Partial, None)

UM IRB- HUM00199335

53 of 84

Experimental Design: Debugging task

53

Defect Scenario Snippet

Design Description

Debugging Hints

(Full, Partial, None)

UM IRB- HUM00199335

54 of 84

Experimental Design: Debugging task

54

Defect Scenario Snippet

Design Description

Debugging Hints

(Full, Partial, None)

UM IRB- HUM00199335

55 of 84

Experimental Design: Survey Questions

55

What line(s) in the circuit design are responsible for the bug?

How would you fix the bug? Indicate the line number you are changing

How accurate did you find the debugging information for this bug?

How useful did you find the debugging information for this bug?

Objective Performance

(Precision, Recall, F-Score)

Subjective Judgment

(Likert Scale Ratings)

UM IRB- HUM00199335

56 of 84

Experimental Design: Recruitment

Recruited participants (n=33) with the following characteristics:

Undergraduate or Graduate Student
Range of experiences in hardware language design

(less than a month – 2+ years, median = 4 months – 1year )

56

UM IRB- HUM00199335

57 of 84

RQ3.a: Does the FL algorithm improve designer’s objective performance?�

57

Answer: CirFix’s FL produced no significant improvement in designer’s objective performance

Why?

No statistical difference in time taken to complete debugging task
Objective F-Score increased as debugging hints increased, but not statistically significant (p = 0.12)

58 of 84

RQ3.b: In what context do designers find the FL helpful?

58

Answer: Designers rate CirFix’s FL helpful for debugging multi-line defects in pedagogical contexts

Why?

Participants rated debugging hints for multi-line defects as more useful and accurate than for single-line defects (p = 0.01 ,p = 0.002)
Participants rated debugging hints for student projects as more useful and accurate than for OpenCore Projects (p = 0.02 ,p = 0.001)

59 of 84

Conclusion

Problem: Debugging is Expensive and Time-Consuming

59

60 of 84

Conclusion

Problem: Debugging is Expensive and Time-Consuming

60

A good solutions consists of:

Fix bugs
Find bugs
Be liked and useful to designers

61 of 84

Conclusion

Problem: Debugging is Expensive and Time-Consuming

61

A good solutions consists of:

Fix bugs
Find bugs
Be liked and useful to designers

State of the art in Software:

Automatic Program Repair
Fault Localization
Humans want tools to be useful

62 of 84

Conclusion

Problem: Debugging is Expensive and Time-Consuming

62

A good solutions consists of:

Fix bugs
Find bugs
Be liked and useful to designers

Issues with State of the art:

Automatic Program Repair
Fault Localization
Humans want tools to be useful

Not Applicable to Hardware

63 of 84

Conclusion

Problem: Debugging is Expensive and Time-Consuming

63

A good solutions consists of:

Fix bugs
Find bugs
Be liked and useful to designers

Proposed Solution:

Hardware-Based APR (CirFix)
Data-Flow Based FL
Human Study to Evaluate Designer Relevance

64 of 84

Summary of Results

Does our solution have the properties of a good solution?

Does our solution fix bugs in hardware?

CirFix produces 50% correct repairs (comparable to indicative APR)

Does our solution find bugs in hardware?

CirFix is insensitive to noise on fault localization, and is thus likely to apply to more complex circuit designs

Do humans like our solution and do they find it useful?

As a debugging aid, humans find CirFix’s fault localization helpful in debugging multi-line defects for pedagogical contexts

64

CirFix: Automated Hardware Repair and its Real-World Applications

Presented by: Priscila Santiesteban

65 of 84

Bonus Slides

66 of 84

Fitness Function: Comparison

Bit-level comparison of instrumented wires and registers against an oracle to assess functional correctness of candidate repairs:

where and correspond to the b^th bit for time step c_i in the simulated circuit and the oracle respectively

Oracle: a developer-provided information for circuit behavior

66

Simulation output

Oracle

Bit values of 0 or 1

67 of 84

Fitness Function: Comparison

Bit-level comparison of instrumented wires and registers against an oracle to assess functional correctness of candidate repairs

Oracle: a developer-provided information for circuit behavior

67

Oracle time, output A, output B 5 , x , x 15 , 0 , 0 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 0 , 0	Simulation Result time, output A, output B 5 , x , x 15 , x , x 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 1 , 1

Fitness Score (sum/total_possible) 4/4 = 1.0

Fitness Function: Comparison

68 of 84

Fitness Function: Comparison

Bit-level comparison of instrumented wires and registers against an oracle to assess functional correctness of candidate repairs

Oracle: a developer-provided information for circuit behavior

68

Oracle time, output A, output B 5 , x , x 15 , 0 , 0 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 0 , 0	Simulation Result time, output A, output B 5 , x , x 15 , x , x 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 1 , 1

Fitness Score (sum/total_possible) 4/4 = 1.0 0/4 = 0.0

Fitness Function: Comparison

69 of 84

Fitness Function: Comparison

Bit-level comparison of instrumented wires and registers against an oracle to assess functional correctness of candidate repairs

Oracle: a developer-provided information for circuit behavior

69

Oracle time, output A, output B 5 , x , x 15 , 0 , 0 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 0 , 0	Simulation Result time, output A, output B 5 , x , x 15 , x , x 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 1 , 1

Fitness Score (sum/total_possible) 4/4 = 1.0 0/8 = 0.0 2/10 = 0.2

Fitness Function: Comparison

70 of 84

Fitness Function: Comparison

Bit-level comparison of instrumented wires and registers against an oracle to assess functional correctness of candidate repairs

Oracle: a developer-provided information for circuit behavior

70

Oracle time, output A, output B 5 , x , x 15 , 0 , 0 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 0 , 0	Simulation Result time, output A, output B 5 , x , x 15 , x , x 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 1 , 1

Fitness Score (sum/total_possible) 4/4 = 1.0 0/8 = 0.0 2/10 = 0.2 4/12 = 0.33

Fitness Function: Comparison

71 of 84

Fitness Function: Comparison

Bit-level comparison of instrumented wires and registers against an oracle to assess functional correctness of candidate repairs

Oracle: a developer-provided information for circuit behavior

71

Oracle time, output A, output B 5 , x , x 15 , 0 , 0 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 0 , 0	Simulation Result time, output A, output B 5 , x , x 15 , x , x 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 1 , 1

Fitness Score (sum/total_possible) 4/4 = 1.0 0/8 = 0.0 2/10 = 0.2 4/12 = 0.33 6/14 = 0.43

Fitness Function: Comparison

72 of 84

Fitness Function: Comparison

Bit-level comparison of instrumented wires and registers against an oracle to assess functional correctness of candidate repairs

Oracle: a developer-provided information for circuit behavior

72

Oracle time, output A, output B 5 , x , x 15 , 0 , 0 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 0 , 0	Simulation Result time, output A, output B 5 , x , x 15 , x , x 25 , 0 , 1 35 , 1 , 0 45 , 1 , 1 55 , 1 , 1

Fitness Score (sum/total_possible) 4/4 = 1.0 0/8 = 0.0 2/10 = 0.2 4/12 = 0.33 6/14 = 0.43 4/16 = 0.25

Fitness Function: Comparison

73 of 84

Implication for Fault Localization

A node in the AST is implicated when:

Either node represent an assignment and the left child of the node corresponds to an identifier in the mismatch set (e.g., count <= 1’b1;)
Or the node represents a conditional statement and an identifier in the statement belongs to the mismatch set (e.g., if(reset==1’b1) count <= 1’b0;)

73

74 of 84

Repair Operators

Genetic Operators: “borrowing code from elsewhere in the design”

Mutation
Crossover

74

1

0

1

0

1

0

1

0

1

0

1

0

1

0

1

0

1

0

1

0

1

0

1

0

1

Parents

Children

Mutation

1

0

1

0

1

0

1

0

1

0

1

0

Crossover

1

75 of 84

Repair Operators

Repair Templates: “introducing new design code”

Flip conditionals (e.g., reset==1'b1 to reset!=1'b1)
Address off-by-one-errors (e.g., counter <= counter+2 to counter <= (counter+2)-1)
Two more detailed in the paper text…

75

76 of 84

Fix Patterns

Pre-identified templates that can be applied to aspects of design code to fix a defect

76

Defect Category	Fix Pattern Description
Incorrect conditional	Negate the condition of a conditional block (e.g., reset == 1’b1 to reset != 1’b1)
Incorrect sensitivity list	Trigger an always block on either the falling edge of a signal (e.g., always@(negedge clk)), or the rising edge of a signal (e.g., always@(posedge clk)), or when a signal is level (e.g., always@(clk)), or on any change to a variable within the block (e.g., always@*)
Incorrect assignment	Change a blocking assignment to a non-blocking one (e.g., overflow = 1'b1; to overflow <= 1'b1;) and vice versa
Off-by-one error	Increment or decrement the value of an identifier by 1 (e.g., counter <= #1 counter + 2; to counter <= #1 (counter + 2) - 1;)

77 of 84

Fix Localization

Guidelines for “how to perform mutations to code”

Reduces the number of mutations resulting in syntactically invalid code

77

Insertion	Replacement
Program statements (e.g., assignments, if-statements) are the only sources of inserted code Such statements are inserted into begin … end blocks only	An element in the design code replaced by another element of the same type (e.g. reset == 1’b1 to enable == 1’b1), or by an element sharing the same abstract type (e.g., counter + 1 to counter - 1)

78 of 84

Abstract Syntax Trees (AST)

AST: a program represented as a tree data structure

78

block

counter_out

if-then-else

equals

reset

1’b1

assignment

4’b0000

if-then-else

#1

overflow_out

assignment

4’b1

#1

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79 of 84

CirFix fitness function is highly effective at capturing incremental changes to design code to guide the search for repairs

Exhibits high fitness distance correlation
Capable of increasing testing prowess without added testing logic to the testbench

79

Reported by original, off-the-shelf testbench

Reported by instrumented testbench and fitness function

RQ3. How effective is CirFix’s Fitness Function?

80 of 84

Generate and Validate: Genetic Algorithms

Goal: Evolve populations of individuals to meet a certain goal

80

Population initialization

Fitness calculation

Selection

Reproduction

Termination criteria met?

Return best individual

Y

N

81 of 84

Selection: Fitness-based Tournaments

The process of choosing a parent for reproduction to produce a child for the next generation

81

Fitness Value

Candidate

Select K candidates at random

Pick the best as parent

82 of 84

82

83 of 84

Software Programs vs. Hardware Descriptions

83

Software Program	Hardware Description
Support design concepts like expressions, assignments, control structures, etc.
Compiled to executable binaries	Simulated using software for testing, synthesized into actual hardware for production
Typically based around a serial execution model (i.e., one line of code executes before the next)	Inherently parallel in nature (think: parts of hardware can operate simultaneously)
Use test suites to evaluate functional correctness, where individual test cases may pass or fail	Use testbenches to instantiate a DUT and manipulate input signals to simulate behavior and read output

84 of 84

Fitness Function: Comparison

Bit-level comparison of instrumented wires and registers against an oracle to assess functional correctness of candidate repairs:

where and correspond to the b^th bit for time step c_i in the simulated circuit and the oracle respectively

Oracle: a developer-provided information for circuit behavior

84

Simulation output

Oracle

Bit values of 0 or 1