1 of 37

Lecture 5: Big O and

Case Analysis

CSE 373: Data Structures and Algorithms

Answer the Warm Up:

PollEv.com/champk

1

CSE 373 24WI

CSE 373 24WI

1

2 of 37

Warm Up

Which of the following statements are TRUE?

Select all options that apply:

  • A Big-Θ bound will exist for every function
  • If a runtime function is O(n²) it can’t also be Ω(n²)
  • A piece of code whose runtime function is f(n) = 3n + 6 is in the Big-O of O(n2)
  • If a runtime function is in Θ(n) then it could be in O(logn)

Answer the Warm Up:

PollEv.com/champk

CSE 373 24WI

2

3 of 37

Announcements

  • Project 0 – 143 Review turn in closes Sunday Jan 14th
  • Project 1 - Deques released yesterday
    • Due Thursday Jan 18th
  • Exercise 1 out - Due Tuesday Jan 16th

CSE 373 24WI

3

4 of 37

P1 Deques

CSE 373 24WI

4

5 of 37

P1: Deques

  • Deque ADT: a double-ended queue
    • Add/remove from both ends, get in middle

  • This project builds on ADTs vs. Data Structure Implementations, Queues, and a little bit of Asymptotic Analysis
    • Practice the techniques and analysis covered in LEC 02 & LEC 03!

  • 2 components:
    • Debug ArrayDeque implementation
    • Implement LinkedDeque

ArrayDeque<E>

DEQUEUE ADT

State

Collection of ordered items

Count of items

Behavior

addFirst(item) add to front

addLast(item) add to end

removeFirst() remove from front

removeLast() remove from end

size() count of items

isEmpty() count is 0?

get(index) get 0-indexed element

LinkedDeque<E>

CSE 373 24WI

5

6 of 37

P1: Sentinel Nodes

Reduce code complexity & bugs

Tradeoff: a tiny amount of extra storage space for more reliable, easier-to-develop code

Tired of running into these?

Find yourself writing case after case in your linked node code?

Client View:

Implementation:

[3, 9]

Introducing

Sentinel Nodes

CSE 373 24WI

6

7 of 37

P1: Gradescope & Testing

  • From this project onward, we’ll have some Gradescope-only tests
    • Run & give feedback when you submit, but only give a general name
  • The practice of reasoning about your code and writing your own tests is crucial
    • Use Gradescope tests as a double-check that your tests are thorough
    • To debug Gradescope failures, your first step should be writing your own test case
  • You can submit as many times as you want on Gradescope (we’ll only grade the last active submission)
    • If you’re submitting a lot (more than ~6 times/hr) it will ask you to wait a bit
    • Intention is not to get in your way: to give server a break, and guess/check is not usually an effective way to learn the concepts in these assignments

1. Write Implementation

2. Think about edge cases, Write your own tests

3. Run your own tests

4. Run Gradescope tests as a double-check

CSE 373 24WI

7

8 of 37

P1: Working with a Partner

  • P1 Instructions talk about collaborating with your partner
    • Adding each other to your GitLab repos
  • Recommendations for partner work:
    • Pair programming! Talk through and write the code together
      • Two heads are better than one, especially when spotting edge cases ☺
    • Meet in real-time! Consider screen-sharing via Zoom
    • Be kind! Collaborating in our online quarter can be especially difficult, so please be patient and understanding – partner projects are usually an awesome experience if we’re all respectful
  • We expect you to understand the full projects, not just half
    • Please don’t just split the projects in half and only do part
    • Please don’t come to OH and say “my partner wrote this code, I don’t understand it, can you help me debug it?”

CSE 373 24WI

8

9 of 37

Questions?

CSE 373 24WI

9

10 of 37

Big O Proofs

CSE 373 24WI

10

11 of 37

O, and Omega, and Theta [oh my?]

Big-O is an upper bound

  • My code takes at most this long to run

Big-Theta

f(n) is Θ(g(n)) if

f(n) is O(g(n)) and f(n) is Ω(g(n)).

In other words, there exist positive constants c1, c2, nsuch that for all nn

c₁ · g(n) f(n)c₂ · g(n)

Big-Omega

f(n) is Ω(g(n)) if there exist positive constants c, n, such that for all nn, f(n) c · g(n)

Big-O

f(n) is O(g(n)) if there exist positive constants c, n, such that for all nn, f(n)c · g(n)

Big-Omega is a lower bound

  • My code takes at least this long to run

Big Theta is “equal to”

  • My code takes “exactly”* this long to run
  • *Except for constant factors and lower order terms

CSE 373 24WI

11

12 of 37

Examples

  • 4n2 Ω(1)
  • true
  • 4n2 Ω(n)
  • true
  • 4n2 Ω(n2)
  • true
  • 4n2 Ω(n3)
  • false
  • 4n2 Ω(n4)
  • false
  • 4n2 O(1)
  • false
  • 4n2 O(n)
  • false
  • 4n2 O(n2)
  • true
  • 4n2 O(n3)
  • true
  • 4n2 O(n4)
  • true

Big-Theta

f(n) is Θ(g(n)) if

f(n) is O(g(n)) and f(n) is Ω(g(n)).

In other words, there exist positive constants c1, c2, nsuch that for all nn

c₁ · g(n) f(n)c₂ · g(n)

Big-Omega

f(n) is Ω(g(n)) if there exist positive constants c, n, such that for all nn, f(n) c · g(n)

Big-O

f(n) is O(g(n)) if there exist positive constants c, n, such that for all nn, f(n)c · g(n)

CSE 373 24WI

12

13 of 37

Comparing Complexity Classes

CSE 373 24WI

13

14 of 37

Big-O is an upper-bound, not a fit

What do we want to look for on a plot to determine if one function is in the big-O of the other?

You can sanity check that your g(n) function (the dominating one) overtakes or is equal to your f(n) function after some point and continues that greater-than-or-equal-to trend towards infinity

If we want the most-informative upper bound, we’ll ask you for a simplified, tight big-O bound.

O(n^2 ) is the tight bound for the function f(n) = 10n2+15n. See the graph below – the tight big-O bound is the smallest upper bound within the definition of big-O.

If you zoom out a bunch, the your tight bound and your function will be overlapping compared to other complexity classes.

 

 

n3

n5

n4

10n2 + 15n

 

 

n2

10n2 + 15n

CSE 373 24WI

14

15 of 37

Big-Omega definition Plots

  •  

 

 

2n3

n2

n

1

n3

CSE 373 24WI

15

16 of 37

Simplified, tight big-O

In this course, we’ll essentially use:

  • Polynomials eg: nc where c is a constant: eg n, n3, sqrt(n), 1
  • Logarithms eg: logn
  • Exponents eg: cn where c is a constant: eg 2n, 3n
  • Combinations of theses eg: log(log(n)), nlogn, (log(n)2

For this course:

  • a “tight big-O” is the slowest growing upper bound
  • a “tight big-Ω” is the fastest growing lower bound
  • A Θ is always “tight” by definition as it is a fit
  • A “simplified” big-O, big-Ω or big-Θ
    • does not have any non-dominant terms
    • does not have any coefficients
    • does not have any constants factors

CSE 373 24WI

16

17 of 37

Questions?

CSE 373 24WI

17

18 of 37

Where are we?

  • Now we can use complexity classes to understand the range of runtime growth rate as size of data scales upward

for (i = 0; i < n; i++) {

a[i] = 1;

b[i] = 2;

}

O(n)

f(n) = 2n

CODE

COMPLEXITY

CLASS

Code Modeling

RUNTIME

FUNCTION

Asymptotic Analysis

1

2

CSE 373 24WI

18

19 of 37

Our Upgraded Tool: Asymptotic Analysis

TIGHT

BIG-O

RUNTIME

FUNCTION

Asymptotic

Analysis

2

O(n2)

TIGHT

BIG-OMEGA

BIG-THETA

Θ(n2)

f(n) = 10n2 + 13n + 2

We’ve upgraded our Asymptotic Analysis tool to convey more useful information! Having 3 different types of bounds means we can still characterize the function in simple terms, but describe it more thoroughly than just Big-Oh.

Ω(n2)

CSE 373 24WI

19

20 of 37

Our Upgraded Tool: Asymptotic Analysis

TIGHT

BIG-OH

RUNTIME

FUNCTION

Asymptotic

Analysis

2

O(n)

 

TIGHT

BIG-OMEGA

BIG-THETA

Does not exist for this function

isPrime()

Big-Theta doesn’t always exist for every function! But the information that Big-Theta doesn’t exist can itself be a useful characterization of the function.

Ω(1)

CSE 373 24WI

20

21 of 37

Algorithmic Analysis Roadmap

CODE

Code Modeling

RUNTIME

FUNCTION

1

for (i = 0; i < n; i++) {

a[i] = 1;

b[i] = 2;

}

f(n) = 2n

TIGHT

BIG-OH

Asymptotic

Analysis

2

TIGHT

BIG-OMEGA

BIG-THETA

O(n)

Θ(n)

We just finished building this tool to characterize a function in terms of some useful bounds!

Now, let’s look at this tool in more depth. How exactly are we coming up with that function?

Ω(n)

CSE 373 24WI

21

22 of 37

Case Analysis

CSE 373 24WI

22

23 of 37

Case Study: Linear Search

int linearSearch(int[] arr, int toFind) {

for (int i = 0; i < arr.length; i++) {

if (arr[i] == toFind)

return i;

}

return -1;

}

2

3

9

4

5

arr

toFind

2

2

3

9

4

5

toFind

8

i

arr

i

i

i

i

i

The number of operations doesn’t only depend on n.

Even once the since of the data is fixed there are different behaviors based on the organization of the data.

  • If toFind is in arr[0] we only need one iteration
    • f(n) = 4O(1)
  • If toFind is not in arr, we’ll need n iterations
    • f(n) = 3n + 1O(n)

CSE 373 24WI

23

24 of 37

Best Case

Worst Case

On Lucky Earth

On Unlucky Earth (where it’s 2020 every year)

2

3

9

4

5

arr

toFind

2

i

2

3

9

4

5

arr

toFind

8

i

f(n) = 3n + 1

f(n) = 2

O(1)

 

Θ(1)

O(n)

 

Θ(n)

After asymptotic analysis:

After asymptotic analysis:

CSE 373 24WI

24

25 of 37

Case Analysis

Case: a description of inputs/state for an algorithm that is specific enough to build a code model (runtime function) whose only parameter is the input size

  • Case Analysis is our tool for reasoning about all variation other than n!
  • Occurs during the code 🡪 function step instead of function 🡪 O/Ω/Θ step!
  • Best Case = fastest that our code could finish on input of size n
  • Worst Case = slowest that our code could finish on input of size n.
  • Importantly, any position of toFind in arr could be its own case!
    • For this simple example, probably don’t care (they all still have bound O(n))
    • But intermediate cases will be important later

Worst

Best

Other Cases

CSE 373 24WI

25

26 of 37

Caution: Common Misunderstanding

Best/Worst case is based on all variation other than value of n

Incorrect - based on specific values of n

  • “The best case is when n=1, worst is when n=infinity”
  • “The best case is when front is null”
  • “The best case is when overallRoot is null”
  • “The best case is when n is an even number”

Correct - based on state of data structure regardless of n

  • “The best case is when the node I’m looking for is at front, the worst is when it’s not in the list”
  • “Insertion Sort’s best case is when the collection is already sorted, its worst case is when its reverse sorted”
  • “For a method that removes duplicates from a List, the best case is when there are no duplicates and the worst case is when the whole List is duplicates”
  • “The best case is when the BST is perfectly balanced, the worst is when it’s a single line of nodes”

CSE 373 24WI

26

27 of 37

Other cases

“Assume X won’t happen” case

  • Assuming our array won’t need to resize is the most common example

“Average” case

  • Assuming your input is random
  • Need to specify what the possible inputs are and how likely they are
  • f(n) is now the average number of steps on a random input of size n

“In-practice” case

  • This isn’t a real term (I just made it up)
  • Making some reasonable assumptions about how the real-world is probably going to work
  • We’ll tell you the assumptions and won’t ask you to come up with these assumptions on your own
  • Then do worst-case analysis under those assumptions

All of these can be combined with any of O, Ω, and θ

CSE 373 24WI

27

28 of 37

How to do case analysis

  1. Look at the code, understand how thing could change depending on the state of input
  2. How can you exit loops early?
  3. Can you return (exit the method) early?
  4. Are some if/else branches much slower than others?
  5. Figure out what input values can cause you to hit the (best/worst) parts of the code.
  6. not to be confused with number of inputs
  7. Now do the analysis like normal!

CSE 373 24WI

28

29 of 37

Algorithmic Analysis Roadmap

TIGHT

BIG-OMEGA

BIG-THETA

TIGHT

BIG-O

Asymptotic

Analysis

OTHER CASE

FUNCTION

WORST CASE

FUNCTION

BEST CASE

FUNCTION

Case Analysis

CODE

2

for (i = 0; i < n; i++) {

if (arr[i] == toFind) {

return i;

}

}

return -1;

f(n) = 2

O(n)

Θ(n)

f(n) = 3n+1

Ω(n)

1

CSE 373 24WI

29

30 of 37

Review Algorithmic Analysis Roadmap

for (i = 0; i < n; i++) {

if (arr[i] == toFind) {

return i;

}

}

return -1;

 

TIGHT

BIG-OMEGA

BIG-THETA

TIGHT

BIG-O

Asymptotic

Analysis

OTHER CASE

FUNCTION

WORST CASE

FUNCTION

BEST CASE

FUNCTION

Case Analysis

CODE

2

1

f(n) = 2

f(n) = 3n+1

O(1)

Θ(1)

Ω(1)

CSE 373 24WI

30

31 of 37

When to do Case Analysis?

  • Imagine a 3-dimensional plot
    • Which case we’re considering is one dimension
    • Choosing a case lets us take a “slice” of the other dimensions: n and f(n)
    • We do asymptotic analysis on each slice in step 2

f(n)

n

toFind position

At front

(Best Case)

Not present

(Worst Case)

CSE 373 24WI

31

32 of 37

Appendix

CSE 373 24WI

32

33 of 37

Applying Big O Definition

 

 

Show that

is

Apply definition term by term

 

 

Add up all your truths

 

 

Big-O

f(n) is O(g(n)) if there exist positive constants c, n, such that for all nn, f(n)c · g(n)

CSE 373 24WI

33

34 of 37

Exercise: Proving Big O

  •  

Big-O

f(n) is O(g(n)) if there exist positive constants c, n, such that for all nn, f(n)c · g(n)

CSE 373 24WI

34

35 of 37

Writing Big-O Proofs

  •  

CSE 373 24WI

35

36 of 37

Big-O as an upper bound

  •  

 

 

 

Big-O is just an upper bound. It doesn’t have to be a good upper bound

If we want the best upper bound, we’ll ask you for a simplified, tight big-O bound. O(n²) is the tight bound for this example.

It is (almost always) technically correct to say your code runs in time O(n!).

DO NOT TRY TO PULL THIS TRICK IN AN INTERVIEW (or exam)!

CSE 373 24WI

36

37 of 37

Big-O is an upper-bound, not a fit

  •  

 

 

 

CSE 373 24WI

37