1 of 200

Please do not redistribute these slides without prior written permission

1

2 of 200

CS 5500

Foundations of Software Engineering

Dr. Mike Shah

2

3 of 200

Pre-Class Warmup (1/4)

  • How many lines of code can a programmer write in a full work day?

3

4 of 200

Pre-Class Warmup (2/4)

  • How many lines of code can a programmer write in a full work day?
  • According to the book pictured to the right (written in 1975 by Fred Brooks at IBM), only about 10 lines of code per day actually stay in the project.
    • Why does that number seem so low?
    • How can we get that number higher?

4


5 of 200

Pre-Class Warmup (3/4)

  • How much time (as a percentage of their day) do you think programmers spend debugging their code?
    • (next slide)

5

6 of 200

Pre-Class Warmup (4/4)

  • How much time (as a percentage of their day) do you think programmers spend debugging their code?

6

7 of 200

Note to self: Start audio recording of lecture :) (Someone remind me if I forget!)

7

8 of 200

“The Daily Scrum”

8

9 of 200

Last Class (“What did I do yesterday”) (1/2)

  • Review of how the D programming language works
    • Understanding pointers
    • Stack, Heap, and static memory
  • Programming Paradigms
    • Procedural Programming
    • Functional Style (map, filter, reduce)
    • Object-Oriented Programming
      • Structs - Value Type
      • Classes - Reference Type
        • Classes allow for inheritance, structs do not
        • Classes can implement an interface
  • Brief exercise in class design with inheritance
    • (And not to forget about composition)

9

10 of 200

Last Class (“What did I do yesterday”) (2/2)

  • Questions or comments on the lab?

10

11 of 200

Course Logistics (“What are we going to do today”)

  • Deliverables
    • Assignment 2 due shortly
    • Assignment 3 available from Github, do a ‘git pull’
  • Today we are going to learn about debugging and program analysis

11

12 of 200

Blocking (“What is stopping forward progress”)

  • Other open questions?

12

13 of 200

Goal(s) for today

  • Fundamentals of Program Analysis for writing better code
  • Understand Debugging Strategies
  • Introduction to GDB and/or LLDB

Today is all about giving you tools to fix bad code...for the 1% of us who create bugs ;)

13

14 of 200

What Causes Bad Code to be Written? (1/3)

  • Question to the audience:

14

15 of 200

What Causes Bad Code to be Written? (2/3)

  • Question to the audience:
    • Could it be the software methodology?
      • Mismanagement?
        • Perhaps the wrong software development methodology was used (e.g. Agile versus something else)?
        • (No one having the big picture in mind?)
    • Lack of experience?
      • Programmers using the wrong data structures/algorithms or otherwise structuring code poorly?
      • Code that appears to work, but perhaps hastily put together (or left untested) -- brittle code.
    • Perhaps the programming language you are using?
      • e.g.,
        • C and C++ (and other languages) allow for memory errors
          • (The freedom to do *almost* anything must be used responsibly)
        • JavaScript, dynamic typing
        • In most languages, ‘casting’ could be a code smell
    • Active sabotage? Yikes!

15

16 of 200

What Causes Bad Code to be Written? (3/3)

  • Question to the audience:
    • Could it be the software methodology?
      • Mismanagement?
        • Perhaps the wrong software development methodology was used (e.g. Agile versus something else)?
        • (No one having the big picture in mind?)
    • Lack of experience?
      • Programmers using the wrong data structures/algorithms or otherwise structuring code poorly?
      • Code that appears to work, but perhaps hastily put together (or left untested) -- brittle code.
    • Perhaps the programming language you are using?
      • e.g.,
        • C and C++ (and other languages) allow for memory errors
          • (The freedom to do *almost* anything must be used responsibly)
        • JavaScript, dynamic typing
        • In most languages, ‘casting’ could be a code smell
    • Active sabotage? Yikes!

16

I want to start off by touching on this one -- and then seeing what we can do to mitigate this circumstance.

17 of 200

Code Smells

17

18 of 200

Question to Audience: What is a Code Smell? (1/3)

  • (Answer next slide)

18

19 of 200

Question to Audience: What is a Code Smell? (2/3)

  • A code smell is a hint that something has gone wrong somewhere in your code.
    • Definition source: https://wiki.c2.com/?CodeSmell
  • Instructor definition: A code smell is a sort of ‘anti-pattern’ or ‘complexity’ that might make code brittle and hard to maintain, or perhaps leave edge cases open to bugs.
  • Question to the Audience:
    • What would be a (code smell) bad piece of code or bad practice?

19

20 of 200

Question to Audience: What is a Code Smell? (3/3)

  • Question to the Audience:
    • What would be a (code smell) bad piece of code or bad practice?
      • Some that we may have heard of, or that have been mentioned before:
        • Global Variables (scope is the entire program)
        • Functions that are too big (or that have too many ‘jobs/responsibilities’)
        • Maybe too few comments explaining ‘why’ code is as it is.
        • Single letter variables with non-local scope.
        • Too many nested-loops (Complexity or bad data access pattern for performance)
        • Too many nested if/else conditionals? (Complexity)
        • Casting data (Losing type safety, possible error if data is not correct format)
      • These are not laws however -- sometimes it’s appropriate to break the rules.

20

21 of 200

How to write good code in DLang

21

22 of 200

  • The official D style guide presents a collection of rules.
    • Some of these have to do with code formatting
    • Some of these rules have to do with best practices with the language
    • It’s a non-exhaustive list, but I think a good reference to start from.
  • As always, style guides are a convention
    • “We are not necessarily recommending that all code follow these rules. They're likely to be controversial in any discussion on coding standards. However, they are required in submissions to Phobos and other official D source code.”

22

23 of 200

(Aside) Inspiration from other languages

  • Generally speaking we want to follow D Language conventions.
    • It’s useful to know that the D style guide exists, but other programming languages might offer other explanations to rules.
  • I think the core guidelines in C++ are a nice example of conventions to source. (e.g. when to use an exception, try/catch, etc.)

23

24 of 200

DLang Avoiding Code Smells

Walter Bright on Avoiding Code Smells (https://www.youtube.com/watch?v=lbp6vwdnE0k)

24

25 of 200

Nice Formatting

  • Format your code nicely :)
    • Use automated formatting tools (indent, your IDE, etc.)
  • A tool called dfmt (D Format) is available to perform automatic formatting for us.

25

26 of 200

  • Fetch and build it with (this also prints the help text):
    • dub run dfmt -- -h
  • Then run with:
    • dub run dfmt -- --inplace --space_after_cast=false --max_line_length=80 --soft_max_line_length=70 --brace_style=otbs formatting.d
    • This automatically formats the file ‘formatting.d’ with the given parameters.
  • See my video tutorial here for more:
    • https://youtu.be/r4uRUMkN6_k
  • Note: We’ll talk about dub later on in the course -- which is a package management tool.

26

27 of 200

(short) Global Variables

  • Sometimes we need global state
    • The problem is if we use short variable names for globals
      • It’s not clear what they mean when they show up in a different part of the program
      • ‘Variable shadowing’ rules in languages may make it unclear which ‘char c’ is being used.
        • i.e. You can sometimes have a global variable named ‘c’ and a local variable named ‘c’ in languages.
  • Rule of thumb
    • The more ‘global’ the variable, the longer the variable name should be
    • The more ‘local’ a variable, the more reasonable it may be to have a shorter name.
      • Conventions like ‘i’ or ‘idx’ for an index are common for locals

27

28 of 200

(short) Global Variables

  • Example of how globals can cause confusion
    • The ‘writeln’ here are purely to help you understand which global is being used.
    • The loop at the bottom is a good example of using ‘i’ within a small scope.
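
A minimal sketch of the idea on this slide (not the code pictured). The names `c`, `totalRequestsServed`, `whichC`, and `globalC` are made up for illustration:

```d
import std.stdio : writeln;

// Hypothetical globals, for illustration only.
char c = 'g';                  // short global name: unclear far away from here
int totalRequestsServed = 0;   // longer name for a long-lived global

char whichC()
{
    char c = 'l';              // local 'c' shadows the module-level 'c'
    return c;                  // refers to the local
}

char globalC()
{
    return .c;                 // leading dot selects the module-level 'c'
}

void main()
{
    // The writeln calls only show which 'c' each function picked.
    writeln("whichC:  ", whichC());
    writeln("globalC: ", globalC());

    // A one-letter name is reasonable in a scope this small.
    foreach (i; 0 .. 3)
        totalRequestsServed += i;
    writeln(totalRequestsServed);
}
```

Renaming the global to something like `currentInputChar` would remove the ambiguity entirely.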

28

29 of 200

Style

  • The public part of your interface should appear towards the top of the file (i.e. visible before you scroll).

29

30 of 200

Don’t reuse variables

  • Leaves room for mistakes when you re-assign variables, especially if you have to look down later.
    • Note: Compilers use something called SSA (Static Single Assignment) behind the scenes, using ‘1 unique register’ per assignment to help ease the analysis. (bottom-right image)
    • Note: In the bottom-left example we can use ‘const’
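
A small sketch of the same idea, assuming a made-up calculation (the function names are hypothetical and this is not the code pictured on the slide):

```d
import std.stdio : writeln;

// Bad: 'x' means three different things over its lifetime.
double averageOfSquaresReused(double a, double b)
{
    double x = a * a;   // x is "a squared"
    x = x + b * b;      // now x is a sum
    x = x / 2;          // now x is an average
    return x;
}

// Better: one name per value, and 'const' so none can be reassigned.
double averageOfSquares(double a, double b)
{
    const aSquared = a * a;
    const bSquared = b * b;
    const total = aSquared + bSquared;
    return total / 2;
}

void main()
{
    writeln(averageOfSquaresReused(3, 4));
    writeln(averageOfSquares(3, 4));
}
```

The second version is close to what SSA form looks like: every name is assigned exactly once.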

30

Bad!

Good

Best

What compilers do

31 of 200

Don’t reuse variables

  • Prefer example on the right
    • Again, minimize scope for your variables

31

Bad!

Good

32 of 200

Side Channel Globals

  • Prefer (as shown on the right side) passing in globals as a parameter rather than direct usage.
    • Can easily see what is coming into function and what is coming out of function.
    • Would have to measure -- but in some cases this might actually be more efficient code as well!
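
A hedged sketch of the difference, using a hypothetical global `retryLimit` (not from the slide):

```d
import std.stdio : writeln;

int retryLimit = 3;   // hypothetical global, for illustration only

// Bad: the global arrives through a side channel; the signature hides it.
bool shouldRetryHidden(int attempts)
{
    return attempts < retryLimit;
}

// Good: everything the function depends on is in the parameter list.
// 'pure' asks the compiler to reject any access to mutable globals.
bool shouldRetry(int attempts, int limit) pure nothrow @nogc @safe
{
    return attempts < limit;
}

void main()
{
    writeln(shouldRetryHidden(2));
    writeln(shouldRetry(2, retryLimit));   // the global is passed in explicitly
}
```

Marking the first function `pure` would not compile, which is one way to let the compiler find side-channel globals for you.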

32

Bad!

Good

33 of 200

Aggregate Globals

  • When you must use global variables, put them into a ‘struct’ so that they are aggregated.
  • Improves searchability -- can just search “MyGlobals” for example.
    • Can also improve ‘plasticity of code’ (i.e. the ability to change our code)
      • e.g. Can make MyGlobals heap or stack allocated as necessary as well depending on how many or what data you have.

33

Bad!

Better

34 of 200

Aggregate Globals

  • Example showing lines 6-8 where we are able to wrap our global variables.
  • Line 11 creates our globals.
  • Note: We will learn later about ‘static this()’ and ‘shared static this()’ which are ways to run code before main
    • The use case would be if we wanted to allocate ‘myGlobals’ on the heap for instance.
    • https://dlang.org/spec/module.html#staticorder
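
A minimal sketch of aggregating globals, assuming some made-up fields (the slide's actual struct is not reproduced here):

```d
import std.stdio : writeln;

// Hypothetical fields, for illustration only.
struct MyGlobals
{
    int windowWidth = 640;
    int windowHeight = 480;
    string title = "untitled";
}

// One instance to search for. Module-level variables in D are
// thread-local by default.
MyGlobals myGlobals;

int windowArea()
{
    return myGlobals.windowWidth * myGlobals.windowHeight;
}

void main()
{
    myGlobals.title = "demo";
    writeln(myGlobals.title, ": ", windowArea());
}
```

Every use now reads `myGlobals.something`, so a single text search finds all global state.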

34

35 of 200

Leaky Abstractions

  • Top view shows how we have ‘leaked’ our abstraction that this is a linked list
    • We have failed to abstract.
  • Prefer instead the bottom-right -- creating a range with an iterator
    • Allows us to much more easily substitute data structure later on if we choose.
    • Bottom-left shows another example in D
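
One possible way to hide the linked list behind a range, as a sketch. `IntList`, `pushFront`, and `items` are hypothetical names, not the code from the slide:

```d
import std.algorithm : sum;
import std.stdio : writeln;

// Hypothetical singly linked list of ints, for illustration only.
struct IntList
{
    private static struct Node
    {
        int value;
        Node* next;
    }

    private Node* head;

    void pushFront(int value)
    {
        auto node = new Node;
        node.value = value;
        node.next = head;
        head = node;
    }

    // Input range over the stored values. Callers only see
    // empty/front/popFront, never Node or 'next'.
    static struct Range
    {
        private Node* current;

        @property bool empty() const { return current is null; }
        @property int front() const { return current.value; }
        void popFront() { current = current.next; }
    }

    Range items() { return Range(head); }
}

void main()
{
    IntList list;
    foreach (v; [3, 2, 1])
        list.pushFront(v);

    foreach (v; list.items)   // no 'node = node.next' in user code
        writeln(v);
    writeln(list.items.sum);  // std.algorithm works on the range too
}
```

If the storage later becomes an array, only `IntList` changes; the loops in `main` stay the same.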

35

Bad!

Better

Usage

36 of 200

Leaky Abstractions bad (left) -- fixed on the right

36

37 of 200

Supporting multiple platforms

  • The important part is at the bottom (assert(0))
    • Need to be able to fail quickly when you’re on a platform that is not supported.
    • Ideally unit tests will help with the discoverability here.
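
A sketch of the pattern, assuming a made-up function `pathSeparator`. The slide mentions `assert(0)`; this version uses `static assert(0, ...)` so the failure happens at compile time, which is an assumption on my part rather than the slide's exact code:

```d
import std.stdio : writeln;

// Hypothetical example: pick a value per platform, fail fast otherwise.
string pathSeparator()
{
    version (Windows)
    {
        return "\\";
    }
    else version (Posix)
    {
        return "/";
    }
    else
    {
        // Unsupported platform: stop the build instead of guessing.
        static assert(0, "pathSeparator: platform not supported yet");
    }
}

void main()
{
    writeln(pathSeparator());
}
```

A runtime `assert(0)` in the final branch would instead fail when the function is first called, which is where unit tests help with discovery.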

37

38 of 200

Pull Requests

  • See the right
    • This is very important advice especially when working in teams
      • Make your changes before a pull request small!
      • Minimize how many files you are modifying in each pull request.

38

39 of 200

We can go on...

  • In fact, I recommend you watch Walter’s talk (https://www.youtube.com/watch?v=lbp6vwdnE0k)
    • The reality is, we cannot memorize all of the possible code smells.
  • They come from experience and talking with your teammates.
    • I encourage you to talk with your teammates, review code, and plan ahead on the code that you are going to write.

39

40 of 200

  • Another nice talk (by Sandi Metz) highlighting various ‘code smells’ (Ruby language used)
  • One nice takeaway is the set of categories she groups them into.

40

41 of 200

(Aside) More on Best Practices https://www.youtube.com/watch?v=nqfgOCU_Do4

  • Talk by Jason Turner on C++ Best Practices
    • Examples are in C++, but perhaps also useful to observe.

41

42 of 200

Potential Code Smell - Construction Separate (1/2)

  • What does this code do? How to improve it?

42

43 of 200

Potential Code Smell - Construction Separate (2/2)

  • Better to always initialize things as you declare them.
    • Often can lead to insights on immutable or const for the type as well.
    • Assume immutable or const unless otherwise needed.
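
A small sketch of the difference, using a hypothetical `describe` function (not the code pictured):

```d
import std.conv : to;
import std.stdio : writeln;

// Smell: declared in one place, given its real value somewhere later.
string describeSeparate(int count)
{
    string label;
    if (count == 1)
        label = "item";
    else
        label = "items";
    return count.to!string ~ " " ~ label;
}

// Better: initialized where it is declared, which also shows
// that it never needs to change, so it can be immutable.
string describe(int count)
{
    immutable label = (count == 1) ? "item" : "items";
    return count.to!string ~ " " ~ label;
}

void main()
{
    writeln(describeSeparate(1));
    writeln(describe(2));
}
```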

43

44 of 200

Code Smell - Out Variables (1/3)

  • What does this code do? How to improve it?

44

45 of 200

Code Smell - Out Variables (2/3)

  • We can be explicit in our parameters
    • This is sometimes useful if we’re working with a C-library from the D programming language.

45

Note: In DLang we can explicitly specify ‘out’ which fixes this problem somewhat

46 of 200

Code Smell - Out Variables (3/3)

  • Better yet, return some aggregate type (e.g. a tuple, or otherwise some struct perhaps)
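
A sketch of both options, assuming a made-up division example (the names `divideOut` and `divide` are hypothetical):

```d
import std.stdio : writeln;
import std.typecons : Tuple, tuple;

// With 'out', the signature at least says which parameters are results.
void divideOut(int dividend, int divisor, out int quotient, out int remainder)
{
    quotient = dividend / divisor;
    remainder = dividend % divisor;
}

// Better yet: return one aggregate value with named fields.
Tuple!(int, "quotient", int, "remainder") divide(int dividend, int divisor)
{
    return tuple!("quotient", "remainder")(dividend / divisor, dividend % divisor);
}

void main()
{
    int q, r;
    divideOut(17, 5, q, r);
    writeln(q, " ", r);

    const result = divide(17, 5);
    writeln(result.quotient, " ", result.remainder);
}
```

With the tuple version, the caller cannot forget to declare a result variable and can keep the result `const`.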

46

47 of 200

Code Smell - Raw Loops (1/3)

  • What does this code do? How to improve it?

47

48 of 200

Code Smell - Raw Loops (2/3)

  • Solution: Use std.algorithm
    • ‘all’ is exactly what we want.

48

49 of 200

Code Smell - Raw Loops (3/3)

  • Solution: Use std.algorithm
    • ‘all’ is exactly what we want.

49

This is a templated function that takes in a predicate (a function that evaluates to true or false).

We’ll talk more about this as we proceed further.
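
A minimal sketch of replacing a raw loop with `all` from std.algorithm, assuming the question being asked is "are all values positive?" (my example, not necessarily the one pictured):

```d
import std.algorithm : all;
import std.stdio : writeln;

// Raw loop: the reader has to trace the control flow to see the intent.
bool allPositiveRaw(const(int)[] values)
{
    foreach (v; values)
    {
        if (v <= 0)
            return false;
    }
    return true;
}

// std.algorithm: the name 'all' states the intent, the lambda is the predicate.
bool allPositive(const(int)[] values)
{
    return values.all!(v => v > 0);
}

void main()
{
    writeln(allPositiveRaw([1, 2, 3]));
    writeln(allPositive([1, -2, 3]));
}
```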

50 of 200

Code Smell - Multi-Step Functions (1/3)

  • What does this code do? How to improve it?

50

51 of 200

Code Smell - Multi-Step Functions (2/3)

  • Why is it bad?
    • Harder to reason about
    • Limited to data type ‘int’ (what if we have a database with more than 2 billion things?)

51

52 of 200

Code Smell - Multi-Step Functions (3/3)

  • Solution: Decompose into smaller functions
  • Then you can just read the return value and understand the algorithm
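
A sketch of the decomposition idea with a made-up `Order` type (hypothetical, not the slide's code). Note `size_t` for the id instead of `int`:

```d
import std.algorithm : filter, map, sum;
import std.stdio : writeln;

// Hypothetical record, for illustration only.
struct Order
{
    size_t id;
    long priceInCents;
    bool shipped;
}

// Each small function does one job and has a name that says what it is.
bool isShipped(Order order) { return order.shipped; }
long price(Order order) { return order.priceInCents; }

// The return statement now reads like a description of the algorithm.
long shippedRevenueInCents(Order[] orders)
{
    return orders.filter!isShipped.map!price.sum;
}

void main()
{
    auto orders = [Order(1, 500, true), Order(2, 250, false), Order(3, 100, true)];
    writeln(shippedRevenueInCents(orders));
}
```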

52

53 of 200

How to learn ALL the rules? (1/2)

  • It takes practice to learn many of these rules
  • Your mental model of programming also evolves as you proceed.
  • That said however -- tools help with writing better code.
    • Tools can be run on the command line, our IDEs, etc. to help us write better code, or otherwise correct mistakes we make if we’re coding too many hours late at night!

53

54 of 200

How to learn ALL the rules? (2/2)

  • It takes practice to learn many of these rules
  • Your mental model of programming also evolves as you proceed.
  • That said however -- tools help with writing better code.
    • Tools can be run on the command line, our IDEs, etc. to help us write better code, or otherwise correct mistakes we make if we’re coding too many hours late at night!

54

Let’s take a look at some tools now!

55 of 200

Introduction to Program Analysis

(Static Analysis & Dynamic Analysis)

55

56 of 200

The field of Program Analysis (1/2)

  • Simply put--program analysis is the study and practice of building automated tools to understand and measure specific aspects of software.
    • Some categories include:
      • Finding where correctness bugs exist
      • Finding where security loopholes exist
      • Finding performance problems
      • Finding stylistic changes that need to be made to code
      • Potentially visualizing software
        • i.e. using mediums other than text like graphs or charts

56

57 of 200

The field of Program Analysis (2/2)

  • Simply put--program analysis is the study and practice of building automated tools to understand and measure specific aspects of software.
    • Some categories include:
      • Finding where correctness bugs exist
      • Finding where security loopholes exist
      • Finding performance problems
      • Finding stylistic changes that need to be made to code
      • Potentially visualizing software
        • i.e. using mediums other than text like graphs or charts

57

So what are our tools for figuring out these properties?

58 of 200

Tools for Program Analysis

The two most common tools for program analysis are:

  • Static analysis
  • Dynamic analysis

58

*There are also hybrid analyses that combine static and dynamic analysis

59 of 200

Static Analysis

Analyzing program source and/or data before they run

59

60 of 200

Static Analysis (1/3)

  • Static analysis is looking at a program’s source code to learn facts about the program.

60

61 of 200

Static Analysis (2/3)

  • Static analysis is looking at a program’s source code to learn facts about the program.
  • This analysis happens before you run a program
    • Notice the phrasing “could this program”
    • (Perhaps even before compile-time!)
      • (Though running an analysis on a program with syntax errors is probably not too useful...)

61

Fact Check: Could this program ever divide by 0?

62 of 200

Static Analysis (3/3)

  • Static analysis is looking at a program’s source code to learn facts about the program.
  • This analysis happens before you run a program
    • Notice the phrasing “could this program”
    • (Perhaps even before compile-time!)
      • (Though running an analysis on a program with syntax errors is probably not too useful...)
      • Oops, looks like it can!

62

Fact Check: Could this program ever divide by 0?
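
A sketch of the kind of program this fact check is about, using hypothetical functions (the pictured program is not reproduced here). Nothing in the first function rules out `people == 0`, so the answer to "could this ever divide by 0?" is yes:

```d
import std.exception : enforce;
import std.stdio : writeln;

// Could this ever divide by 0? Yes: any caller may pass people == 0.
int slicesPerPerson(int slices, int people)
{
    return slices / people;
}

// One possible fix: make the precondition explicit and checked.
int slicesPerPersonChecked(int slices, int people)
{
    enforce(people != 0, "people must be non-zero");
    return slices / people;
}

void main()
{
    writeln(slicesPerPerson(8, 4));
    writeln(slicesPerPersonChecked(8, 4));
}
```

The reasoning needs no run of the program: reading the source and asking which values `people` could hold is enough.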

63 of 200

How Does Static Analysis Work?

63

64 of 200

Example #1 Static Analysis in the Compiler

  • Dead code elimination
      • We can search the whole program and see that this variable is defined but never read afterwards--and thus get rid of it
      • i.e. there is no ‘use’ of a ‘defined’ variable named ‘unusedVariable’.

64

65 of 200

Example #2 Static Analysis in the Compiler

  • Compilers parse our code for errors during compilation and can provide some analysis (e.g. using clang’s ‘--analyze’ option).
    • User friendly compilers even output and point to the error they have found!

65

66 of 200

Example #2 Static Analysis in the Compiler (GDC)

66

67 of 200

How do these analyses work? CFG Abstraction (1/2)

  • Typically we model a program as a Control Flow Graph (CFG)
    • That is, your program is broken into Basic Blocks (B1,...,BN)
    • And then we can start gathering facts about the program
      • e.g. Does a variable declared in B1 never show up again? If so, then it is safe to eliminate it

67

68 of 200

How do these analyses work? CFG Abstraction (2/2)

  • The nodes (i.e. basic blocks) are the computation
  • The edges are the control flow
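
To make the nodes and edges concrete, here is a small hypothetical function with its basic blocks marked in comments. The block labels B1 to B4 are my own annotation, not compiler output:

```d
import std.stdio : writeln;

int absoluteDifference(int a, int b)
{
    // B1: entry block, ends at the branch
    int result;
    if (a > b)
        result = a - b;   // B2: taken when a > b
    else
        result = b - a;   // B3: taken otherwise
    // B4: join block, reached from both B2 and B3
    return result;
}

// Edges (control flow): B1 -> B2, B1 -> B3, B2 -> B4, B3 -> B4

void main()
{
    writeln(absoluteDifference(3, 5));
    writeln(absoluteDifference(9, 4));
}
```

An analysis can now ask questions per block, for example "is `result` assigned on every path that reaches B4?"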

68

69 of 200

Program Correctness and Program Optimization

  • The two previous use cases are where we would use program analysis
    • Find Optimizations (i.e. eliminating dead code)
    • Enforce program correctness (i.e., detect syntactical and some classes of logical errors in our code)

69

70 of 200

Another Static Analysis -- Indent tool (1/3)

  • A tool may look at how our source code is formatted
    • This happens before run-time, and is an analysis of our code.
  • Here’s an example with the `indent` tool on unix
    • Code on the left has been analyzed (checking how nested code is)
    • Code on the right appropriately adjusted

70

71 of 200

Another Static Analysis -- Indent tool (2/3)

  • A tool may look at how our source code is formatted
    • This happens before run-time, and is an analysis of our code.
  • Here’s an example with the `indent` tool on unix
    • Code on the left has been analyzed (checking how nested code is)
    • Code on the right appropriately adjusted

71

Question to the audience: Is this a static analysis?

72 of 200

Another Static Analysis (sort of) -- Indent tool (3/3)

  • A tool may look at how our source code is formatted
    • This happens before run-time, and is an analysis of our code.
  • Here’s an example with the `indent` tool on unix
    • Code on the left has been analyzed (checking how nested code is)
    • Code on the right appropriately adjusted

72

  • It is a bit of a gray area whether this is a static analysis.
  • Indentation may give us some understanding of developer intent, but is not very well-defined.
  • However, I consider bad indentation a stylistic error, so I count it :)

Question to the audience: Is this a static analysis?

73 of 200

DLang Static Analysis Tools -- dscanner

  • For languages like DLang (and really any language) static analysis tools have great value
    • The primary tools used currently are the compiler (DMD) and DScanner
    • https://github.com/dlang-community/D-Scanner
    • Run: dub fetch dscanner
    • Then: dub run dscanner
      • e.g. dub run dscanner -- --report unusedVariable.d
      • hint use:
        • --reportFile file.json
        • to get json output

73

74 of 200

DLang Static Analysis Tools -- dscanner

  • Example of a program with unused and unmodified variables
  • Run:
    • dub run dscanner -- --report unusedVariable.d
  • Observe the report
    • Note: IDE/Text Editors may integrate this information more easily
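
As a sketch, a file like unusedVariable.d might contain something along these lines. This is my guess at a suitable input, not the file from the slide, and the exact wording of the report may differ between D-Scanner versions:

```d
import std.stdio : writeln;

int compute()
{
    int unusedVariable = 42;   // defined, but never read anywhere
    int neverModified = 7;     // never changed: could be const or immutable
    return neverModified * 2;
}

void main()
{
    writeln(compute());
}
```

The program compiles and runs fine, which is the point: these are findings a static analysis tool can report without executing anything.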

74

75 of 200

2. Dynamic Analysis

75

76 of 200

Dynamic Analysis (1/2)

  • Gathering facts about a program that is running
    • Typically this is done by logging and storing information in an internal data structure.

76

77 of 200

Dynamic Analysis (2/2)

  • Gathering facts about a program that is running
    • Typically this is done by logging and storing information in an internal data structure.

77

Fact Check: What are all of the values of ‘x’ that a user input in this run of the program?
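
A minimal sketch of answering that kind of question at runtime. The names `observedX` and `square` are hypothetical, and fixed values stand in for user input so the example is repeatable:

```d
import std.stdio : writeln;

// Internal data structure that stores the facts we gather while running.
int[] observedX;

int square(int x)
{
    observedX ~= x;   // instrumentation: record every value of x we see
    return x * x;
}

void main()
{
    // Stand-in for user input.
    foreach (x; [3, -1, 3])
        square(x);

    // Report what happened in this one execution.
    writeln("values of x seen in this run: ", observedX);
}
```

The log only describes this run; a different set of inputs would produce a different list.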

78 of 200

Example #1 of Dynamic Analysis - Profilers (man perf)

  • Profiling tools run with our programs, gathering performance information about a program.
    • Typically gathering execution time at different granularities (e.g. function) within the program
    • A log of the linux ‘perf’ tool is shown below.

78

Call Tree

Entries Sorted By Time Spent

79 of 200

Profiling

  • Built into the D compiler is a way to add instrumentation at a function level to tell us how much time is spent in each function.
    • This can give us good intuition into where to spend our efforts optimizing our program.
  • Secondly, we also have the ability to instrument memory allocations.
    • This can tell you if you’re unnecessarily allocating on the heap.

79

80 of 200

  • So highlighted above is the ‘-profile’ flag being used.
  • Below is the summary of the profile (trace.log )
    • Note the summary is found at the bottom of trace.log
    • See full talk on profiling here: https://www.youtube.com/watch?v=MFhTRiobWfU
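
A small program to try the flag on, as a sketch. The file name `profile_demo.d` and both functions are hypothetical:

```d
// Build with function-level instrumentation (assumed file name):
//   dmd -profile profile_demo.d
// Running the program then writes trace.log.
// For allocation profiling, build with -profile=gc instead.
import std.stdio : writeln;

// Deliberately slow: loops over every number below n.
ulong slowSum(ulong n)
{
    ulong total = 0;
    foreach (i; 0 .. n)
        total += i;
    return total;
}

// Same result from the closed-form formula.
ulong fastSum(ulong n)
{
    return n * (n - 1) / 2;
}

void main()
{
    writeln(slowSum(10_000_000));
    writeln(fastSum(10_000_000));
}
```

In the summary at the bottom of trace.log, one would expect `slowSum` to account for most of the time, which tells us where optimization effort is worth spending.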

80

81 of 200

Example #2 of Dynamic Analysis - Valgrind (pronounced val--grinn)

  • e.g.

81

82 of 200

(Repeat) Program Correctness and Program Optimization

  • The two previous use cases are where we would use program analysis
    • Find Optimizations
    • Enforce program correctness

82

83 of 200

Static and Dynamic Analysis

(Digging a little deeper)

83

84 of 200

Static Analysis and Dynamic Analysis (1/3)

  • In both analyses we are asking about, or are interested in, some property of a program.
  • That means we are asking some question:
    • Static Analysis
      • “Given my program P, I am interested in knowing property A. Does P exhibit A in any possible execution?”
    • Dynamic Analysis
      • “Given my program P, I want to monitor property A. Log all occurrences of A in a single execution of P”

84

85 of 200

Static Analysis and Dynamic Analysis (2/3)

  • In both analyses we are asking about, or are interested in, some property of a program.
  • That means we are asking some question:
    • Static Analysis
      • “Given my program P, I am interested in knowing property A. Does P exhibit A in any possible execution?”
    • Dynamic Analysis
      • “Given my program P, I want to monitor property A. Log all occurrences of A in a single execution of P”

85

  • Correctness
  • Performance
  • Maybe Style
  • etc.

86 of 200

Static Analysis and Dynamic Analysis (3/3)

  • In both analyses we are asking about, or are interested in, some property of a program.
  • That means we are asking some question:
    • Static Analysis
      • “Given my program P, I am interested in knowing property A. Does P exhibit A in any possible execution?”
    • Dynamic Analysis
      • “Given my program P, I want to monitor property A. Log all occurrences of A in a single execution of P”

86

Notice the key difference here between static and dynamic analysis?

87 of 200

Static Analysis - A little more rigorous (1/2)

  • Static analysis, however, can be a bit more rigorous than just looking at the code.
    • We can actually build up some sort of model of the source code.
    • Generally this allows us to ask questions about program behavior
      • i.e. Something more complex than indentation of source code, but about the execution of the software

87

88 of 200

Static Analysis - A little more rigorous (2/2)

  • Static analysis, however, can be a bit more rigorous than just looking at the code.
    • We can actually build up some sort of model of the source code.
    • Generally this allows us to ask questions about program behavior
      • i.e. Something more complex than indentation of source code, but about the execution of the software

88

Static program analysis allows us to ask and answer questions about program behavior

89 of 200

Static Analysis

  • Because a static analysis looks at any possible execution, it is an over-approximation of program behavior.
    • i.e. You are being very conservative in thinking about what can happen.

89

Execution Space

All possibilities of what your program could do

What your program actually does

90 of 200

Dynamic Analysis

  • With dynamic analysis you are looking at one possible execution at a time each time you run your software.
    • Thus test your program with many inputs, to push it down several different execution paths in order to get better testing coverage (whether for performance or correctness).

90

Execution Space

All possibilities of what your program could do

An actual execution

An actual execution

An actual execution

An actual execution

91 of 200

Which type of analysis is better?

  • One is not better per se--but each can uncover bugs in different areas.
  • Use both!

91

92 of 200

Static Analysis at Google Scale

  • https://dl.acm.org/doi/10.1145/3188720
  • A nice read on how static analysis is used in industry
    • (Work done by a former lab mate)
    • It’s also good motivation for monorepos (like we use)
  • It’s becoming more and more important to use static analysis tools to manage large software infrastructure

92

93 of 200

Short 5 minute break

  • 3 hours and 15 minutes is a long time.
  • I will try to never lecture for more than half of that time without some sort of ‘break’ or transition to an in-class activity/lab.
  • Use this time to stretch, check your phones, eat/drink something, etc.

93

94 of 200

Onwards to Debugging

94

95 of 200

What you’ll learn today -- the metaphor (1/3)

  • For those familiar with the board game Monopoly [wiki], there’s a part of the game where you can ‘go to jail’
    • Generally, that’s a bad thing in the game
    • But if you know how to use a debugger, ... (next slide)

95

96 of 200

What you’ll learn today -- the metaphor (2/3)

  • For those familiar with the board game Monopoly [wiki], there’s a part of the game where you can ‘go to jail’
    • Generally, that’s a bad thing in the game
    • But if you know how to use a debugger, it’s kind of like having one of these

96

97 of 200

What you’ll learn today -- the metaphor

  • For those familiar with the board game Monopoly [wiki], there’s a part of the game where you can ‘go to jail’
    • Generally, that’s a bad thing in the game
    • But if you know how to use a debugger, it’s kind of like having one of these
    • In fact, if you know how to use your debugging tools, it’s like having a lot of these ‘get out of jail free’ cards, that help you get out of tricky situations!

97

98 of 200

What is a bug?

A good place to start

98

Some images today from the wonderful movie ‘A Bug’s Life’ by Disney Pixar.

Apologies for any spoilers! It is a great movie! :)

99 of 200

What is a Software Bug?

  • A software bug is a defect in the logic, correctness, or performance of a software system
    • It is a fault that we want to fix so that the software matches our expectations or a technical specification.
      • (logic) A program that compiles, but at runtime does not do what the developer expects
      • (correctness) Program executes path as expected but produces the wrong result
      • (performance) Performance bugs may be dependent on workload on your system or an external system (e.g. a server)
      • (nondeterministic correctness and logic) Heisenbugs for example are bugs that occur in concurrent code and are sporadically observable
  • Software bugs can sometimes go undetected for long periods of time and be difficult to find, depending on the class of the bug
    • Let’s take a moment to look at some infamous software bugs... (next slide)
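
As a tiny illustration of a program that compiles but does not do what the developer expects, here is a hypothetical off-by-one bug (my example, not from the slides):

```d
import std.stdio : writeln;

// Intended: sum of 1 through n inclusive.
// Bug: '1 .. n' excludes n, so the last term is silently dropped.
int sumToBuggy(int n)
{
    int total = 0;
    foreach (i; 1 .. n)
        total += i;
    return total;
}

// Fixed: the upper bound of a D range is exclusive, so use n + 1.
int sumTo(int n)
{
    int total = 0;
    foreach (i; 1 .. n + 1)
        total += i;
    return total;
}

void main()
{
    writeln(sumToBuggy(4));
    writeln(sumTo(4));
}
```

Both versions compile without complaint; only comparing the result against the specification exposes the defect.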

99

100 of 200

Debugging

The task of removing faults (i.e., defects or bugs) from code using tools and techniques

100

A Bug's Life was a Disney Pixar film in 1998 -- It is not important that you watch it to understand the topic of today :)

101 of 200

Infamous Software Bugs

101

This image is from the American game show “Jeopardy!”, in which contestants give their answers in the form of a question to earn money

Famous Bugs

102 of 200

The First Software Bug - September, 1947

  • Admiral Grace Murray Hopper (Ph.D.) logs the first computer bug in her logbook
  • “First actual case of bug being found”
    • The term ‘bug’ was popularized by Hopper, but has earlier origins from radio operators using the term.
  • Link to full story

102

103 of 200

Mars Climate Orbiter - 1998

  • Did they mean to put the units in feet or meters?
    • Software calculations were in meters...
    • The team controlling the probe entered parameters in imperial units
  • The probe’s position was off by about 100km and it was destroyed
  • Link to story

103

104 of 200

Win 98 Blue Screen of Death (~1998)

  • This next one is a correctness bug you can see in action!
    • This happened in front of a live audience
    • https://www.youtube.com/watch?v=yeUyxjLhAxU (41 seconds)
    • (The gentleman to the right was not a programmer but in marketing, and later Chief Marketing Officer)
  • Link to story

104

105 of 200

Win 98 Blue Screen of Death (~1998)

  • This next one is a correctness bug you can see in action!
    • This happened in front of a live audience
    • https://www.youtube.com/watch?v=yeUyxjLhAxU (41 seconds)
    • (The gentleman to the right was not a programmer but in marketing, and later Chief Marketing Officer)
  • Link to story

105

And I don’t mean to embarrass this gentleman on the right -- we know developing software can be tricky!

106 of 200

Y2K Bug - 1999

  • Software developers did not think ahead about code that would last into the new millennium, thus abbreviating 1999 to “99”
    • Banks worried ‘00’ would be interpreted as ‘1900’ and mess up interest rate calculations
    • Media thought there would be disasters (and the bug was real), though we survived.
  • Link to story

106

107 of 200

More bugs (Costly bugs!) [source]

  • 1962
    • Mariner 1 spacecraft is destroyed shortly after launch, due in part to a software error ($18 million in 1962 dollars)
      • A missing ‘hyphen’ in the coded computer instructions was 1 of 2 major errors [source]
  • 1988
    • The Morris worm spreads wildly out of control causing an estimated $100 million in damages
      • Error was in the worm’s ‘replication logic’ [source]
  • 1994
    • Intel’s popular Pentium processor had a math error in the FDIV operation costing them $475 million in recalls. [source]
  • 2010
    • Bitcoin Hack lost about 850,000 bitcoins [source]
  • And many more...(the list doesn’t stop at 2010...)

107

108 of 200

More bugs (Costly bugs!) [source]

  • 1962
    • Mariner 1 spacecraft is destroyed shortly after launch, due in part to a software error ($18 million in 1962 dollars)
      • A missing ‘hyphen’ in the coded computer instructions was 1 of 2 major errors [source]
  • 1988
    • The Morris worm spreads wildly out of control causing an estimated $100 million in damages
      • Error was in the worms ‘replication logic’ [source]
  • 1994
    • Intel’s popular pentium processor had a math error in the fdiv operation costing them $475 million in recalls. [source]
  • 2010
    • Bitcoin Hack lost about 850,000 bitcoins [source]
  • And many more...(the list doesn’t start stop at 2010)

108

BUGS!

109 of 200

Why are we creating bugs?

What’s the difficulty?

109

110 of 200

Why is it hard to get software correct? (1/2)

  • Question to the audience: Why is it hard to write correct software? Your thoughts?

110

111 of 200

Why is it hard to get software correct? (2/2)

  • Question to the audience: Why is it hard to write correct software? Your thoughts?
    • (Some of my thoughts)
      • Software changes frequently!
      • Lots of programmers and managers work on a project
        • Programmers rely on building a mental model (some approximation) of the software to reason about behavior
        • Likely this mental model will differ amongst some number of programmers and managers
      • Pressure between meeting tight deadlines and economic costs
        • (i.e., technical debt accrues and makes it hard to write correct software)
      • Poor documentation of APIs -- and sometimes APIs are broken!
      • Lack of testing (unit tests, behavior tests, etc.)
      • Unanticipated inputs (bad user input) or unexpected system events (network down)
    • The reality is, we are humans and will make mistakes!

111

112 of 200

Reality of Software Development (1/2)

  • The reality is you cannot plan for everything
    • We are human and will make mistakes
  • Where today’s lecture fits in: I want to fill a gap in computer science education. We are often not well equipped to debug software, and that’s inevitably where we spend a lot of our time!

112

113 of 200

Unfortunately, today’s topic...is too much of a mystery

113

114 of 200

Unfortunately, today’s topic...is too much of a mystery

114

Learn some debugging techniques

Today’s Goal

“Although computer science education devotes a lot of time to teaching algorithms and fundamentals, it appears that not much of this time is spent applying them to general problems. Debugging is not taught as a specific course in universities. Despite decades of literature suggesting such courses be taught, no strong models exist for teaching debugging.” [ACM Queue The Debugging Mindset 2017]

115 of 200

Daily Wisdom

(Everyday wisdom)

115

116 of 200

From Chuck Norris

  • Some words of wisdom from a famous programmer...

116

https://memegenerator.net/img/instances/45924170.jpg

117 of 200

From Chuck Norris

  • Some words of wisdom from a famous programmer...

117

https://memegenerator.net/img/instances/45924170.jpg

Sorry...that advice is not going to fly!

Do use your debugger!

118 of 200

Some wisdom from Dr. Greg Law

118

119 of 200

Debugging versus testing

  • Debugging is closely related to testing, and both are necessary skills to learn as software engineers
    • Testing means we are checking for the presence of a bug (given an input, test an expected output)
    • Debugging is the process of locating and removing the cause of an observed fault in our software
    • We might test again after debugging to confirm the bug has been fixed
      • And we will likely add a unit test to a test suite after debugging
  • We’ll have a separate module on testing in the future!

119

120 of 200

Debugging Techniques

This is interactive--see if you can spot the bug!

120

121 of 200

#1 Scan and Look Debugging

121

...kind of

122 of 200

Common Strategy - Scan and look (1/5)

  • If you’re familiar with the software, sometimes you can just ‘find it’
    • This is called the ‘scan and look’ strategy for bug finding
    • Let’s try it out below

122

123 of 200

Common Strategy - Scan and look (2/5)

123

Hmm, do you spot the logic bug? (Either in the code or output)

124 of 200

Common Strategy - Scan and look (3/5)

124

The bug has been spotted!

Logical error/typo by the programmer.

Did not provide the correct type.

The lesson here--even if code compiles, it does not imply correctness!

125 of 200

Common Strategy - Scan and look (4/5)

125

Here’s the corrected code using the correct type

126 of 200

Common Strategy - Scan and look (5/5)

126

Note: The D compiler can be quite handy and prevent some of these ‘implicit conversions’ from happening.

It’s generally wrong for the compiler to assume that 3.1415 should be truncated to an integer, so the compiler reports an error instead.
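
The code from these slides is not reproduced here, so the following is a hypothetical sketch of the same kind of mistake. averageBuggy compiles cleanly but computes the wrong value because of the types involved, while the static assert shows that D rejects an implicit narrowing such as assigning 3.1415 to an int.

```d
import std.stdio;

// Hypothetical example: the intent is a fractional average, but (a + b) / 2
// is integer division, so the fraction is discarded before the result is
// widened to double. This compiles cleanly and is still wrong.
double averageBuggy(int a, int b)
{
    return (a + b) / 2;
}

// Corrected version: dividing by 2.0 forces floating-point division.
double average(int a, int b)
{
    return (a + b) / 2.0;
}

// D refuses to narrow a floating-point literal to int implicitly, so this
// class of mistake is reported at compile time rather than at run time.
static assert(!__traits(compiles, { int x = 3.1415; }));

void main()
{
    writeln("buggy:   ", averageBuggy(1, 2));
    writeln("correct: ", average(1, 2));
}
```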

127 of 200

Tradeoffs - Scan and Look (1/2)

  • Pros
    • Anyone can use this strategy, and there are no external tools needed.
  • Cons
    • Not reliable, in some sense you are guessing where the error is
      • (See Where’s Waldo image on the right)
    • This strategy typically does not scale well
      • e.g. Code you did not write is hard to scan
      • e.g. This strategy is likely to be tedious on even small projects (< 1,000 lines of code (LOC)).

127

Where’s Waldo is a children's book where you try to find the main character

https://images-na.ssl-images-amazon.com/images/I/A1auIg-I7WL.jpg

128 of 200

Tradeoffs - Scan and Look (2/2)

128

The good news is, we have a tool that can apply this strategy automatically for us.

The compiler can help!

129 of 200

Scan and look (with the compiler’s help)

  • Using the scan and look strategy can be exhausting
    • So we can improve this solution by using our compiler (it sees all of our code!)
    • Passing the -w flag to the compiler enables warnings (reported as errors), which helps catch additional snippets of code that may be problematic.

129

130 of 200

Tradeoffs - Scan and look (with the compiler’s help)

  • Pros
    • Our compiler scales--meaning it can report on errors at compile-time for large programs
    • Over time, compilers tend to get better at finding more errors
  • Cons
    • Only works at compile-time (no bugs found at runtime)
    • We can only fix the kinds of problems that the compiler reports.
    • What if we don’t have the source code?
      • (i.e., libraries that we link in)
      • We cannot fix those warnings!

130

131 of 200

#2 printf debugging

A technique for helping us debug and retrieve values at run-time

131

132 of 200

Common Strategy - printf debugging (1/5)

  • printf is the ‘C’ function for displaying text on the console
    • (The equivalent in Dlang is writeln)
  • The idea of printf debugging is that we can print a value at a particular point in our source code to discover the state of our program.

132

Try to find the bug!

133 of 200

Common Strategy - printf debugging (2/5)

133

Try to find the bug!

No warnings this time

But see if you can spot the bug (Don’t say anything yet!)

134 of 200

Common Strategy - printf debugging (3/5)

134

Try to find the bug!

Depending on how much code I put on the screen--this bug can be harder to find!

Let’s try to help ourselves out with some output (i.e. writeln) statements

(Bug shown on the next slide)

135 of 200

Common Strategy - printf debugging (4/5)

135

Try to find the bug!

Some well-placed output statements anywhere state can change (i.e. a value can be generated or a variable mutated) reveal the value of square(5).

We observe the incorrect value, and confirm we never enter the branch and see ‘output 2’

136 of 200

Common Strategy - printf debugging (5/5)

136

Try to find the bug!

Oops, an error in our function’s return value: it should be (a*a)
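
Since the slide code is an image, here is a small reconstruction of the idea as a sketch. The exact mistake in the original is not shown in the text, so the a + a in squareBuggy is an assumed stand-in; the ‘output’ labels mirror the ones mentioned above.

```d
import std.stdio;

// Hypothetical buggy version: the slides only say the return value should
// have been (a*a); the a + a here is an assumed stand-in for the mistake.
int squareBuggy(int a)
{
    return a + a;
}

// Corrected version.
int square(int a)
{
    return a * a;
}

void main()
{
    int result = squareBuggy(5);
    writeln("output 1: result = ", result); // reveals the unexpected value

    if (result == 25)
    {
        // With the bug present this branch is never entered.
        writeln("output 2: entered the branch");
    }

    writeln("output 3: fixed square(5) = ", square(5));
}
```

Printing right after the value is produced and again inside the branch shows both the wrong value and the path that was never taken.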

137 of 200

Tradeoffs - printf Debugging

  • Pros
    • Can help narrow down where bugs occur
    • You can observe values at run-time
    • You get an idea of where execution is.
    • Can ‘pretty print’ or otherwise format your data output.
  • Cons
    • You may need to make many educated guesses in long running programs
    • You are also modifying the source code directly, and need to remember to remove your output statements
    • It requires you to rebuild your software
      • Recompilation for every small change can be expensive in terms of time
    • It requires you to build additional infrastructure which may or may not be needed
      • Meaning: Not every object has (or needs) a way to be printed, but to print one you will need to write a textual representation of that object

137

138 of 200

#3 Delta Debugging

(A technique to help us narrow our search space for where a bug occurs)

138

139 of 200

Strategy for debugging - Delta Debugging (1/3)

  • With the printf debugging strategy, you are trying to shrink your delta of where an error could occur.
    • This is called Delta debugging

139

140 of 200

Strategy for debugging - Delta Debugging (2/3)

140

Potential bug could be anywhere

141 of 200

Strategy for debugging - Delta Debugging (3/3)

141

Our search space for the bug is somewhere in this range -- that’s where we can gather information.

(Note square function is included in our delta because it is in our search space of where we put the writeln statements.)

142 of 200

Reminder: Turn on warnings (-w)

  • Revisiting our note about warnings: the D compiler is smart enough to statically determine that line 20 can never be reached.
    • We’ll treat that warning as an error with -w
      • (or we can treat it as a true warning with -wi if we wanted to proceed, but I generally recommend -w)
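
As a rough illustration of code the compiler can flag, the sketch below contains a statement that can never run. The function classify and the file name app.d are hypothetical, and whether a given compiler version reports the unreachable statement (and with what wording) is an assumption to verify locally.

```d
import std.stdio;

int classify(int value)
{
    if (value < 0)
    {
        return -1;
    }
    return 1;
    // Dead code: both paths above have already returned.
    writeln("classified ", value);
}

void main()
{
    writeln(classify(5));
}

// Plain build, no warnings requested:   dmd app.d
// Warnings reported as errors:          dmd -w app.d
// Warnings reported, build continues:   dmd -wi app.d
```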

142

143 of 200

#4 printf debugging revisited

Improving our printf debugging using our programming language

143

144 of 200

(Aside) printf Debugging in C and C++ with preprocessor

  • There are some programming techniques you can use to help you find and report bugs conditionally
  • In other programming languages like C or C++ you utilize something called ‘the preprocessor’ which does textual replacement before compiling our code
    • The preprocessor allows us to:
      • Choose at compile-time whether our printf statements are included in the program
      • Write a Macro (a textual replacement function)

144

Check out this video for C and C++ and notes on the preprocessor

https://www.youtube.com/watch?v=ksJ9bdSX5Yo&list=PLvv0ScY6vfd8YRjgGvXKJRAMZQAxNypcH&index=4

145 of 200

Conditional Compilation in DLang (-debug flag)

  • In the D programming language we can pass the ‘-debug’ flag to the compiler.
    • The -debug flag compiles in code marked with ‘debug’; without the flag, that additional error checking code is left out.
      • This is useful while in development.
      • Observe the two different outputs
  • Learn more:
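
A minimal sketch of a debug statement, assuming a hypothetical sum function and a source file named app.d. The writeln marked with debug only exists in the program when it is built with -debug.

```d
import std.stdio;

int sum(const int[] values)
{
    int total = 0;
    foreach (v; values)
    {
        total += v;
        // Compiled in only when the program is built with -debug.
        debug writeln("[debug] added ", v, ", running total = ", total);
    }
    return total;
}

void main()
{
    writeln("sum = ", sum([1, 2, 3]));
}

// Build without the extra output:  dmd app.d
// Build with the extra output:     dmd -debug app.d
```

The result of sum is the same in both builds; only the diagnostic output differs.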

145

146 of 200

Conditional Compilation in DLang (version flag)

  • Additionally, you may want to have specific code for ‘version’ identifiers that you define, or are standard
    • (e.g. the operating system or architecture are standard version defines)
  • Again, this is a slightly cleaner way to support platform-specific code or APIs
  • Learn more:
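
A minimal sketch of version blocks. Windows, linux, and OSX are predefined version identifiers; VerboseLogging and the file name app.d are hypothetical names chosen for this example.

```d
import std.stdio;

// Returns a label for the platform the program was compiled for.
string platformName()
{
    version (Windows)
        return "Windows";
    else version (linux)
        return "Linux";
    else version (OSX)
        return "macOS";
    else
        return "another platform";
}

void main()
{
    writeln("Compiled for: ", platformName());

    // VerboseLogging is a user-defined identifier; enable it with:
    //   dmd -version=VerboseLogging app.d
    version (VerboseLogging)
    {
        writeln("verbose logging enabled");
    }
}
```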

146

147 of 200

Tradeoffs - Conditional Compilation to Debug

  • Pros
    • Can make the code slightly cleaner with identified blocks
      • Syntactically cleaner (my opinion) than C or C++ #ifdef blocks
    • Encourage error checking
      • No run-time cost when the checks are compiled out, and you catch the errors before deployment
    • Can easily turn on and off
  • Cons
    • Still requires source modification
    • Another layer of information flow that may make things difficult to understand.
    • Code could expand beyond expectations with version or debug and make compile times increasingly long
    • Again, adds visual ‘clutter’ to the programmer’s mental model of how code actually executes

147

148 of 200

Breaking Old Habits

  • Here are some techniques we have seen:
    • 1. scan and look
    • 2. utilizing our compilers
    • 3. delta debugging
    • 4. printf debugging
    • 5. printf with conditional compilation
    • 6. printf with macros
  • However, while in practice they are valid--I want to break some old habits
  • **I want your first resort to be an interactive debugger the next time you encounter a bug.**
    • (i.e. not scatter little print statements in your program)

148

149 of 200

Short 5 minute break

  • 3 hours and 15 minutes is a long time.
  • I will try to never lecture for more than half of that time without some sort of ‘break’ or transition to an in-class activity/lab.
  • Use this time to stretch, check your phones, eat/drink something, etc.

149

150 of 200

Interactive Debuggers

Tools that allow introspection into code at run-time e.g., GDB

150

Yes....you will have a part of this -- debuggers save you time!

151 of 200

Interactive Debugger

  • Interactive debuggers allow us to inspect our program without source modification
    • (They can sometimes however be a form of dynamic binary instrumentation)
  • Today I want to show you how to use an interactive debugger so you can resolve your DLang bugs
    • Using GDB (or the debugger associated with your operating system/IDE) will be your first line of defense!

151

152 of 200

How Debuggers Work

  • Debuggers work by attaching to a running process
    • (This means we debug at run-time)
    • Typically debuggers use special system calls in the operating system to handle events that take place within the specific process they are attached to.
  • For Linux users, you can investigate ptrace
    • For other operating systems there is an equivalent system call you can further look at.

152

153 of 200

Compiling with Debugging Symbols to help GDB

  • Adding debugging symbols when compiling your program provides more information to the debugger when you execute your program
    • Information like source file and line number become more clear
    • Typically you can recover symbols for variable and function names in your source files as well
    • (Extra debugging information is stored typically in a ‘symbol table’ or other auxiliary data structure)
  • Takeaway:
    • When compiling, use ‘-g’ to get debugging symbols
      • Note: With DLang you may have to use both: `-g -gf` (On Linux) and `-g -gc` (on Mac with lldb) to get all of the debugging symbols.

153

154 of 200

Running your program with GDB

  • Most often, when you execute your program, you are going to execute it within gdb.
    • GDB provides you a command-line interface to interactively explore and execute your program
  • Starting GDB
    • From within GDB you can type ‘run’ or ‘r’ to start executing your program
    • Or alternatively ‘start’ which will pause your program (using a breakpoint) at the main function.

154

155 of 200

GDB Live Code

Sample code available in course repository at: https://github.com/MikeShah/C-SoftwareEngineering/tree/master/3 (gdb.d)

155

156 of 200

GNU Debugger GDB (1/3)

  • I am going to teach you how to use the GNU GDB Debugger today
  • This is a free debugger available on Windows, Linux, and Intel-based Mac

156

157 of 200

GNU Debugger GDB (2/3)

157

You can use whatever debugger you like, but I will show examples in GDB for you to follow along with.

Most IDEs have the same functionality and methodology that I will show, perhaps a different workflow

Mac on M1/M2 (Apple Silicon) should use LDC2 with lldb which is similar

158 of 200

GNU Debugger GDB (3/3)

158

One other note is that you’ll occasionally see me press ‘Ctrl+L’ to quickly clear the screen of my terminal window.

Typing ‘refresh’ should also work.

159 of 200

Let’s dive in

  • I want to spend some time looking at a simple piece of code
  • Starting with a simple example is a good way to start!
    • (Here’s what we’ll cover)
      • Compiling with debug symbols
        • -g -gf (linux)
        • -g -gc (ldc2 on mac)
      • Running GDB
        • start or run
      • Starting a program
      • Executing each line one at a time
      • Listing the source code
      • Setting some breakpoints

159

160 of 200

Round 2 - GDB TUI (Text User Interface)

  • Many folks do not know--GDB provides a textual user interface
  • You can use Control-x 1 (or Control-x 2) to enable it.
    • Note: It can take a little practice to switch into the TUI Mode
    • I prefer to just launch with tui: gdb ./prog --tui
  • Ctrl-x o will cycle you through the windows in the tui mode
  • You can additionally type ‘list’ if you want to see the source code you are looking at.
    • list linenumber (e.g. list 10)

160

161 of 200

Breakpoints and stepping through code

  • The basic workflow when debugging is to set a ‘breakpoint’ ‘br’ at a specific function or line in your program.
  • This pauses execution until you decide to resume.
    • You can
      • ‘continue’ - Continues execution until the next breakpoint
      • ‘step’ - execute the next line of code, stepping into any function it calls
      • ‘next’ - execute the next line of code, stepping over function calls
  • After you set a breakpoint you can:
    • display them: ‘info breakpoints’
    • remove them ‘delete breakpoint 1’ (e.g. deletes first breakpoint)
    • save breakpoints filename
      • source filename (loads the breakpoints)
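
To practice the workflow above, a small program like the following can be used. This is a sketch: the file name app.d, the binary name prog, and the function addOne are assumptions, and the GDB commands in the trailing comments are the ones described on this slide.

```d
import std.stdio;

int addOne(int x)
{
    int result = x + 1;
    return result;
}

void main()
{
    int total = 0;
    foreach (i; 0 .. 3)
    {
        total = addOne(total);
    }
    writeln("total = ", total);
}

// A possible session:
//   dmd -g -gf app.d -of=prog
//   gdb ./prog
//   (gdb) break _Dmain        pause at the start of the D main function
//   (gdb) run
//   (gdb) next                run the current line, stepping over calls
//   (gdb) step                run the current line, stepping into addOne
//   (gdb) info breakpoints    list the breakpoints set so far
//   (gdb) continue            resume until the next breakpoint or exit
```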

161

162 of 200

Breakpoints and stepping through code

  • Note: In D -- you’ll put a breakpoint at _Dmain to reach the start of your program

162

163 of 200

print

  • When you are at a breakpoint, you want to observe a value
    • From now on--you do not have to litter your code with ‘writeln’ statements.
  • The ‘print’ command allows you to do that.
    • print variable
      • (or in hex: print/x variable)
    • You can also dereference the variable if it’s a pointer
      • e.g., print *variable
    • And you can also print the address of a variable
      • print &variable

163

164 of 200

Watchpoints

  • You can use ‘watch’ to set a watchpoint, which pauses your program every time a variable’s value is modified.
    • e.g.,
      • ‘watch i’ in a loop
      • (Then use ‘continue’ to continue execution)

164

165 of 200

Conditional Breakpoints

  • If you want to monitor variables in a loop, you can set conditional breakpoints that watch for a particular condition
  • e.g.
    • br gdb.d:37 if i>20
    • The above line sets a breakpoint at line 37 of gdb.d that only pauses execution when i > 20
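
Here is a small loop to try a conditional breakpoint and a watchpoint on. It is a sketch, not the course’s gdb.d: the file name app.d and the function accumulate are assumptions, and the line number in the comment assumes the file begins with the import line.

```d
import std.stdio;

int accumulate(int limit)
{
    int sum = 0;
    for (int i = 0; i < limit; ++i)
    {
        sum += i; // line 8: candidate for a conditional breakpoint
    }
    return sum;
}

void main()
{
    writeln(accumulate(30));
}

// A possible session after building with debug symbols:
//   (gdb) break app.d:8 if i > 20   pause on line 8 only once i exceeds 20
//   (gdb) run
//   (gdb) print i                   inspect the loop counter
//   (gdb) watch sum                 pause whenever sum is modified
//   (gdb) continue
```

The condition keeps the debugger from stopping on the first twenty-one iterations, which is the main advantage over a plain breakpoint inside a loop.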

165

166 of 200

Backtrace (retrieving the call stack)

  • Segmentation faults can be one of the more common errors you encounter, and often you’ll have to trace changes of state over time.
  • You can use the ‘backtrace’ command to see which functions were called to get you to that location.
    • You can use the command ‘bt’ to review where the program crashed by retrieving the call stack
    • Then use ‘info args’ or ‘info frame’ to inspect the arguments or details of the selected stack frame

166

167 of 200

GDB - Attach to a running process

  • (using gdb2.d)
  • Graphics applications like we have been working on run in infinite loops
  • If you have already started executing a program, you can attach a debugger to it
    • ps -elf | grep program_name
    • look for the Process ID (PID)
    • sudo gdb -p {PID number}
      • Usually you’ll need root privileges
    • Helpful hint: Use ‘finish’ to execute until a function is finished in case you are in some library of code when you attach to a process.
      • (Or otherwise use ‘up’ to move up the call stack)

167

168 of 200

Slightly More Advanced Example (time travel)

  • (gdb3.d example)
  • More advanced debuggers allow for ‘time travel’ and reverse debugging
    • target record-full
      • next
      • reverse-next
      • reverse-step

168

169 of 200

Slightly More Advanced Example (polymorphism)

  • (gdb3.d example)
  • Many IDEs do not support some of the more advanced features
  • How do we know object types?
    • whatis object_name
  • How do we know how an object is behaving?
    • info vtbl object_name

169

170 of 200

More debugging resources

  • Check out the following tutorial specifically in DLang with GDB
    • (Again from me)

170

171 of 200

More debugging resources

  • Mac folks on M1
  • Here’s something on lldb to get you started on DLang.
  • Important Note:
    • Generally speaking, lldb debugger works with the LDC2 compiler for DLang
    • Mixing and matching gdb with code from LDC2 on Mac M1/M2’s may not work.

171

172 of 200

More debugging resources - DDD (1/2)

You are welcome to explore more tools and use them in this course

  • A visual debugger like DDD may be helpful.
    • This debugger visualizes data structures
    • https://www.gnu.org/software/ddd/
  • Tools like Sourcetrail may additionally help you investigate and learn about your codebase.

172

173 of 200

More debugging resources - DDD (2/2)

  • Here’s an example of DDD in practice
    • Launched with: ddd ./gdbexample2
  • Uses the same gdb commands we learned, but also provides a GUI
    • This tool works on Linux
    • The point of showing you this is that other IDEs you use (CLion, Visual Studio, Xcode) provide nice interfaces as well.

173

174 of 200

Text Editors and IDEs

  • There are many text editors and IDEs
    • As you know, I mostly work in Vim and the terminal
    • If you’d like, here’s a resource for getting started with VSCode
    • I may do more in the future, but for now I won’t cover editor integrations much. You are welcome to explore and integrate GDB, d-scanner, etc.

174

175 of 200

(Aside) Debugging on Mac

  • Mac users with Intel-based Macs should be able to use gdb
  • Otherwise -- newer Macs (on Apple Silicon) should prefer LLDB
    • There is some limited ‘text user interface’ by typing in:
      • lldb
      • then type ‘gui’

175

176 of 200

Debugging Summary

  • Debugging Techniques
    • Use your debugging tools!
    • Compile with ‘-g’ while developing
    • Treat warnings as errors that need to be fixed (-w with DMD; -Werror is the GCC/Clang equivalent).
    • Use -w -wi -wo
      • Produce warning messages
    • Use two compilers
    • GDB will help you solve your problems much quicker than guessing and recompiling.

176

177 of 200

Debug and Release Builds

Other considerations to be careful of when distributing software to the masses

177

178 of 200

Debug and release builds (1/2)

  • Recall that we used -debug
  • Just a note that we typically call this a ‘debug build’
  • When we do not include debug symbols, we call that the release build.
    • Question to audience: Why might we not want to give to consumers a ‘debug build’

178

179 of 200

Debug and release builds (2/2)

  • Recall that we used -debug
  • Just a note that we typically call this a ‘debug build’
  • When we do not include debug symbols, we call that the release build.
    • Question to audience: Why might we not want to give to consumers a ‘debug build’
      • Hackers can see extra information!
      • Note: You can use various tools (strip on Linux for example) to remove debugging information.

179

180 of 200

Some General Tips on Code Writing and Debugging

180

181 of 200

List of Tips to write better software and ease debugging

  • Use defensive programming techniques to strengthen your code
    • use assert and static assert (we’ll learn more soon!)
    • Break your program into smaller pieces (modularize as necessary)
      • Take advantage of different programming paradigms
  • Do write tests (-unittest)
  • Do think a little bit before writing code
    • Explain to someone else, draw a picture, etc.
  • Make very small changes to programs, then proceed to add more
  • Take breaks
    • Walk away, and revisit the problem a little later when your mind is fresh

181

182 of 200

Closing Thoughts (1/2)

  • Question to audience: Are there any weaknesses to debugging?

182

183 of 200

Closing Thoughts (2/2)

  • Question to audience: Are there any weaknesses to debugging?
    • One thing to consider is ‘code coverage’, which goes hand in hand with testing
      • We’ll only be able to use our debugging skills on portions of the code that actually execute
    • There’s also some difficulty of debugging optimized builds
      • Some debuggers support this better than others

183

184 of 200

Retrospective

  • Anything unclear from the lecture before we move forward?

184

185 of 200

In-Class Activity

185

186 of 200

In-Class Activity

  • Complete the in-class activity from the schedule
    • (Do this during class, not before :) )
  • Please take 2-5 minutes to do so
  • These make up a total of 5% of your grade
    • We will review the answers shortly

186

187 of 200

Extra

187

188 of 200

A Code Smell

  • You will be working on refactoring some of your previous code into your project
    • (Refactoring a larger architectural design can often be quite expensive in a large project)
  • Some patterns (like the singleton) or anti-patterns can be detected automatically for us using static and dynamic analysis tools
    • But otherwise, experience is the best teacher for bad code... i.e.
    • Introducing...the code smell!

188

189 of 200

Refactoring Code

189

(Reference from Stranger Things--a great Netflix series!)

190 of 200

Code Smell - Non-Canonical Operators (1/2)

  • What does this code do? How to improve it?

190

191 of 200

Code Smell - Non-Canonical Operators (2/2)

  • Solution:
    • Function should be ‘const’
    • Parameter passed in should be ‘const’
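
The code on the slide is an image, so the following is only a guess at the shape of the fix, written in D. Vec2 is a hypothetical type; the point is that operators which merely read their operands take a const parameter and are themselves marked const.

```d
import std.stdio;

struct Vec2
{
    int x;
    int y;

    // Both the parameter and the method are const: comparing two values
    // must not modify either of them.
    bool opEquals(const Vec2 rhs) const
    {
        return x == rhs.x && y == rhs.y;
    }

    // Addition returns a new value and leaves both operands untouched.
    Vec2 opBinary(string op : "+")(const Vec2 rhs) const
    {
        return Vec2(x + rhs.x, y + rhs.y);
    }
}

void main()
{
    const a = Vec2(1, 2);
    const b = Vec2(3, 4);
    writeln(a + b == Vec2(4, 6));
}
```

Because the operators are const, they can be called on const values such as a and b, which is what callers of a canonical operator expect.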

191

192 of 200

192

193 of 200

Linux Debugging Example

  • dmd -g -debug -gf test.d -of=prog
  • info functions passByRef...
    • Find all the names
    • demangle _D4test22passReferenceTypeByRef
  • Or
    • Don’t forget to do
      • br test.passReference...

193

194 of 200

  • -debug: Compile in debug code
  • -debug=level: Compile in debug level <= level
  • -debug=ident: Compile in debug identifier ident
  • -debuglib=name: Link in libname as the default library when compiling for symbolic debugging instead of libphobos2.a. If libname is not supplied, then no default library is linked in.

194

195 of 200

One ‘teaser’ C++ Design Idiom

(With a fun name--we’re going to observe this later on in the semester)

195

196 of 200

PIMPL Idiom (“Pointer to Implementation”)

  • It’s an interesting experiment to think about ‘hiding’ your code.
  • One way we can hide implementation in our interfaces, is to put the member variables behind a pointer
    • This is where the name, pointer to implementation comes from.
    • Question to audience: What do you think this helps you do? And do you think you can hide information from the debugger?
  • We’re going to review this pattern later on, but you can read more here: https://en.cppreference.com/w/cpp/language/pimpl

196

197 of 200

Liskov Substitution Principle (LSP)

197

198 of 200

INVEST Acronym

198

199 of 200

Open Closed Principle

199

200 of 200

List of debugging papers from Prof. Guyer: https://www.cs.tufts.edu/comp/150BUGS/#schedule

  • Memory Safety
  • Sept 11 Memory safety, GC (No paper)
  • Sept 13 Valgrind [PDF] How to Shadow Every Byte of Memory Used by a Program
  • Sept 18 Safe-C [PDF] Efficient detection of all pointer and array access errors
  • Sept 20 SoftBound [PDF] SoftBound: highly compatible and complete spatial memory safety for C
  • Sept 25 Delta Debugging [PDF] Yesterday, my program worked. Today, it does not. Why?
  • [PDF] Isolating cause-effect chains from computer programs
  • Sept 27 LCLint [PDF] Static detection of dynamic memory errors
  • Oct 2 Marple [PDF] Marple: a demand-driven path-sensitive buffer overflow detector
  • Oct 4 Cyclone [PDF] Experience with safe manual memory-management in cyclone
  • Oct 11 DieHard [PDF] DieHard: probabilistic memory safety for unsafe languages
  • Exterminator [PDF] Exterminator: Automatically correcting memory errors with high probability
  • Bonus paper GC vs malloc [PDF] Quantifying the performance of garbage collection vs. explicit memory management
  • Oct 16 [PDF] Enhancing server availability and security through failure-oblivious computing
  • Information Flow
  • Oct 18 TaintCheck [PDF] Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software
  • Oct 29 Bolt [PDF] Bolt: On-Demand Infinite Loop Escape in Unmodified Binaries
  • Oct 31 Terminator [PDF] Termination proofs for systems code
  • Nov 6 Terminator [PDF] Principles of Program Termination
  • Nov 8 Type qualifiers [PDF] Detecting format string vulnerabilities with type qualifiers
  • Nov 13 More type qualifiers [PDF] Flow-sensitive type qualifiers
  • [PDF] Using CQUAL for Static Analysis of Authorization Hook Placement
  • Nov 15 SqlCheck [PDF] The essence of command injection attacks in web applications
  • Nov 20 [PDF] Secure program execution via dynamic information flow tracking
  • Typestate
  • Nov 27 QVM [PDF] QVM: An Efficient Runtime for Detecting Defects in Deployed Systems
  • Grab bag
  • Nov 29 GenProg [PDF] Automatic Program Repair with Evolutionary Computation
  • [PDF] GenProg: A Generic Method for Automatic Software Repair
  • Dec 4 Bug Isolation [PDF] Scalable Statistical Bug Isolation
  • Dec 6 Last paper [PDF] Software Needs Seatbelts and Airbags
  • Extra Papers
  • Date FastTrack [PDF] FastTrack
  • Date RacerX [PDF] RacerX

200