1 of 23

Algorithm Analysis

Runtime analysis as a process: comprehending programs, modeling the number of steps, and formulating an answer.

1

Kevin Lin, with thanks to many others.

2 of 23

Runtime Analysis Process

Comprehending. Understanding the implementation details of a program.

Modeling. Counting the number of steps in terms of N, the size of the input.

Case Analysis. How certain conditions affect the program execution.

Asymptotic Analysis. Describing what happens for very large N, as N→∞.

Formalizing. Summarizing the final result in precise English or math notation.

2

boolean dup1(int[] A)

Consider every pair

Array contains a duplicate at front

Array contains no duplicate items

Constant time

Quadratic time

Θ(1)

Θ(N2)

Overall Asymptotic Runtime Bound

Worst case

Best case

3 of 23

Asymptotic Analysis

What happens for very large N, as N→∞.

Simulating billions of particles.

Social network with billions of users.

Logging billions of transactions.

Encoding billions of bytes of video data.

Linear-time algorithms scale better than quadratic-time algorithms (parabolas).

3

Algorithms (Robert Sedgewick, Kevin Wayne/Princeton)

4 of 23

Orders of Growth

4

Algorithm Design (Jon Kleinberg, Éva Tardos/Pearson Education)

5 of 23

Asymptotic Analysis vs. Case Analysis

For a very large array with billions of elements (asymptotic analysis), how is it possible for dup1 to execute only 2 less-than (<) operations?

5

Operation

dup1: Quadratic/Parabolic

dup2: Linear

i = 0

1

1

less-than (<)

2 to (N2 + 3N + 2) / 2

1 to N

increment (+= 1)

0 to (N2 + N) / 2

0 to N - 1

equality (==)

1 to (N2 - N) / 2

1 to N - 1

array accesses

2 to N2 - N

2 to 2N - 2

Q

6 of 23

Asymptotic Analysis vs. Case Analysis

First two items are the duplicates. That makes sense for large inputs.

  • Contents of the array.
  • Even when working with a very large input, all we’ve said is we have a large array.

Case analysis coexists with asymptotic analysis.

In the case of 0 items in the array, there’s only one less than. That seems fast.

  • Why is this thinking wrong for the problem?
  • Asymptotic analysis cares about big inputs: large N.

6

A

7 of 23

Runtime Analysis Process

Comprehending. Understanding the implementation details of a program.

Modeling. Counting the number of steps in terms of N, the size of the input.

Case Analysis. How certain conditions affect the program execution.

Asymptotic Analysis. Describing what happens for very large N, as N→∞.

Formalizing. Summarizing the final result in precise English or math notation.

7

boolean dup1(int[] A)

Consider every pair

Array contains a duplicate at front

Array contains no duplicate items

Constant time

Quadratic time

Θ(1)

Θ(N2)

Overall Asymptotic Runtime Bound

Worst case

Best case

8 of 23

Simplified Modeling Process

Rather than building the entire table of operation counts, we can instead:

  1. Choose our cost model (representative operation).
  2. Figure out the order of growth for the count of the cost model by either:
    • Making an exact count and then discarding the unnecessary pieces.
    • Or, using intuition/inspection to determine order of growth. (Needs practice!)

Let’s redo our analysis of dup1 with this Simplified Modeling Process.

This time, we’ll show all our work.

8

9 of 23

Worst Case Order of Growth: Exact Count of == Operations

The worst case order of growth of the runtime for dup1 is N2.

9

int N = A.length; // N == 6

for (int i = 0; i < N; i += 1)

for (int j = i + 1; j < N; j += 1)

if (A[i] == A[j])

return true;

return false;

==

==

==

==

==

==

==

==

==

==

==

==

==

==

==

0

1

2

3

4

5

0 1 2 3 4 5

i

j

10 of 23

Worst Case Order of Growth: Geometric Argument

The worst case order of growth of the runtime for dup1 is N2.

10

int N = A.length; // N == 6

for (int i = 0; i < N; i += 1)

for (int j = i + 1; j < N; j += 1)

if (A[i] == A[j])

return true;

return false;

==

==

==

==

==

==

==

==

==

==

==

==

==

==

==

0

1

2

3

4

5

0 1 2 3 4 5

i

j

Area of right triangle of side length N - 1.

Order of growth of area is N2.

11 of 23

Runtime Analysis Process

Comprehending. Understanding the implementation details of a program.

Modeling. Counting the number of steps in terms of N, the size of the input.

Case Analysis. How certain conditions affect the program execution.

Asymptotic Analysis. Describing what happens for very large N, as N→∞.

Formalizing. Summarizing the final result in precise English or math notation.

11

boolean dup1(int[] A)

Consider every pair

Array contains a duplicate at front

Array contains no duplicate items

Constant time

Quadratic time

Θ(1)

Θ(N2)

Overall Asymptotic Runtime Bound

Worst case

Best case

“The worst case order of growth of the runtime for dup1 is N2.”

12 of 23

Order of Growth Exercise

What is the order of growth of each function?

(Informally, what is the shape of each function for very large N?)

12

Q

Function

Order of Growth

N3 + 3N4

N4

(1 / N) + N3

N3

(1 / N) + 5

1

NeN + N

NeN

40 sin(N) + 4N2

N2

Function

Order of Growth

N3 + 3N4

(1 / N) + N3

(1 / N) + 5

NeN + N

40 sin(N) + 4N2

13 of 23

Big-Theta Notation

Suppose we have a function R(N) with order of growth f(N). In Big-Theta, we write this as:

13

Function

Big-Theta

N3 + 3N4

Θ(N4)

(1 / N) + N3

Θ(N3)

(1 / N) + 5

Θ(1)

NeN + N

Θ(NeN)

40 sin(N) + 4N2

Θ(N2)

14 of 23

Big-Theta Definition

means there exist positive constants k1 and k2 such that

for all values of N greater than some N0.

14

“Very large N”

15 of 23

Big-Theta Challenge

Find a simple f(N) and corresponding k1 and k2.

means there exist positive constants k1 and k2 such that

for all values of N greater than some N0.

15

Q

16 of 23

Big-O Notation

Whereas Big-Theta can informally be thought of as something like “equals”, Big-O can be thought of as “less than or equal”.

All of the following are true.

16

17 of 23

Big-O Definition

means there exists a positive constant k2 such that

for all values of N greater than some N0.

17

“Very large N”

18 of 23

Big-Omega Definition

means there exists a positive constant k1 such that

for all values of N greater than some N0.

18

“Very large N”

19 of 23

Overall Asymptotic Runtime Bound for dup1

Give an overall asymptotic runtime bound for R as a combination of Θ, O, and/or Ω notation. Take into account both the best and the worst case runtimes (Rbest and Rworst). N is the length of array.

Then, give a few other valid runtime bounds for Rbest , Rworst , and R using asymptotic notation.

19

Q

20 of 23

Tight Asymptotic Runtime Bounds for dup1

Best case: Ω(1) and O(1), therefore Θ(1).

Worst case: Ω(N2) and O(N2), therefore Θ(N2).

Overall: Ω(1) and O(N2).

Because the Ω and O bounds do not agree, Θ bound does not exist.

20

A

21 of 23

Valid but Less-Descriptive Runtime Bounds for dup1

Best case: Ω(1) and O(1), therefore Θ(1).

  • O(N4)
  • Lowest order of growth is constant.

Worst case: Ω(N2) and O(N2), therefore Θ(N2).

  • O(N4)

Overall: Ω(1) and O(N2).

Because the Ω and O bounds do not agree, Θ bound does not exist.

  • O(N4)

21

A

22 of 23

Mystery

Give a tight asymptotic runtime bound for mystery as a function of N, the length of the array, in the best case, worst case, and overall.

22

Q

boolean mystery(int[] a, int target) {

int N = a.length;

for (int i = 0; i < N; i += 1)

if (a[i] == target)

return true;

return false;

}

23 of 23

Mystery: Big-Ω, Big-O, and Big-Θ

Best case: First element in array is the target.

  • Omega(1), O(1), Theta(1)

Worst case: Target is the last item in array.

  • O(N), O(N2), Omega(N), Theta(N)
  • Can you write Omega(N - 1)? Yes, but it’s not a “simple” function.

Overall:

  • No big-theta bound.
  • O(N), Omega(1)

23

A