1 of 32

Algorithm Analysis I

Runtime analysis as a process: comprehending programs, modeling the number of steps, and formulating an answer.

1

Kevin Lin, with thanks to many others.

2 of 32

Feedback from the Reading Quiz

Counting operations. Let’s look at an example where N = 6 instead of N = 10000.

Confusing question. What part of dup1 and dup2 required us to specify a range for the number of times an operation was executed?

Parabolas vs. lines.

2

Demo

3 of 32

Runtime Analysis Process

Comprehending. Understanding the implementation details of a program.

Modeling. Counting the number of steps in terms of N, the size of the input.

Case Analysis. How certain conditions affect the program execution.

Asymptotic Analysis. Describing what happens for very large N, as N→∞.

Formalizing. Summarizing the final result in precise English or math notation.

3

boolean dup1(int[] A)

Consider every pair

Array contains a duplicate at front

Array contains no duplicate items

Constant time

Quadratic time

Θ(1)

Θ(N2)

Overall Asymptotic Runtime Bound

Worst case

Best case

4 of 32

Asymptotic Analysis

What happens for very large N, as N→∞.

Simulating billions of particles.

Social network with billions of users.

Logging billions of transactions.

Encoding billions of bytes of video data.

Linear-time algorithms scale better than quadratic-time algorithms (parabolas).

4

5 of 32

Orders of Growth

5

Algorithm Design (Jon Kleinberg, Éva Tardos/Pearson Education)

6 of 32

Asymptotic Analysis and Case Analysis

For a very large array with billions of elements (i.e. asymptotic analysis), is it possible for dup1 to execute only 2 less-than (<) operations?

6

Operation

dup1: Quadratic/Parabolic

dup2: Linear

i = 0

1

1

less-than (<)

2 to (N2 + 3N + 2) / 2

1 to N

increment (+= 1)

0 to (N2 + N) / 2

0 to N - 1

equality (==)

1 to (N2 - N) / 2

1 to N - 1

array accesses

2 to N2 - N

2 to 2N - 2

Q

7 of 32

For a very large array, is it possible for dup1 to execute only 2 less-than (<) operations?

7

8 of 32

Duplicate Finding

Our goal is to somehow characterize the runtimes of the functions below.

  • Characterization should be simple and mathematically rigorous.
  • Characterization should demonstrate superiority of dup2 over dup1.

8

Operation

dup1: Quadratic/Parabolic

dup2: Linear

i = 0

1

1

less-than (<)

2 to (N2 + 3N + 2) / 2

1 to N

increment (+= 1)

0 to (N2 + N) / 2

0 to N - 1

equality (==)

1 to (N2 - N) / 2

1 to N - 1

array accesses

2 to N2 - N

2 to 2N - 2

9 of 32

Worst Case Order of Growth

9

10 of 32

Simplification 1: Consider Only the Worst Case

Provides a runtime guarantee for any input of size N.

We often only care about worst case, but there are many exceptions.

10

Operation

Worst Case: dup1

i = 0

1

j = i + 1

1 to 10000

less-than (<)

2 to (N2 + 3N + 2) / 2

increment (+= 1)

0 to (N2 + N) / 2

equality (==)

1 to (N2 - N) / 2

array accesses

2 to N2 - N

public static boolean dup1(int[] A) {

for (int i = 0; i < A.length; i += 1) {

for (int j = i + 1; j < A.length; j += 1) {

if (A[i] == A[j]) {

return true;

}

}

}

return false;

}

11 of 32

Identifying Orders of Growth

Consider the algorithm step counts below.

What do you expect will be the order of growth of the runtime for the algorithm?

  1. N [linear]
  2. N2 [quadratic]
  3. N3 [cubic]
  4. N6 [sextic]

11

Q

Operation

Count

less-than (<)

100N2 + 3N

greater-than (>)

2N3 + 1

and (&&)

5,000

12 of 32

What do you expect will be the order of growth?

12

13 of 32

Simplification 2: Restrict Attention to One Operation

Pick some representative operation to act as a proxy for the overall runtime.

Good choices:

  • Less-than (<)
  • Increment (+= 1)
  • Equality (==)
  • Array accesses

Poor choices: assignment of i = 0 or j = i + 1.

We call our choice the cost model.

13

Operation

Worst Case: dup1

i = 0

1

j = i + 1

10000

less-than (<)

(N2 + 3N + 2) / 2

increment (+= 1)

(N2 + N) / 2

equality (==)

(N2 - N) / 2

array accesses

N2 - N

14 of 32

Simplification 3: Eliminate Lower-Order Terms

Ignore lower-order terms.

14

Operation

Worst Case: dup1

increment (+= 1)

(N2 + N) / 2

public static boolean dup1(int[] A) {

for (int i = 0; i < A.length; i += 1) {

for (int j = i + 1; j < A.length; j += 1) {

if (A[i] == A[j]) {

return true;

}

}

}

return false;

}

15 of 32

Simplification 4: Eliminate Multiplicative Constants

Ignore multiplicative constants.

We already threw away the meaningful constant when we chose a single proxy operation.

15

Operation

Worst Case: dup1

increment (+= 1)

N2 / 2

public static boolean dup1(int[] A) {

for (int i = 0; i < A.length; i += 1) {

for (int j = i + 1; j < A.length; j += 1) {

if (A[i] == A[j]) {

return true;

}

}

}

return false;

}

16 of 32

Simplification Summary

  1. Only consider the worst case.
  2. Pick a representative operation (cost model).
  3. Ignore lower order terms.
  4. Ignore multiplicative constants.

The worst case order of growth of the runtime for dup1 is N2.

16

Operation

dup1

i = 0

1

j = i + 1

1 to 10000

less-than (<)

2 to (N2 + 3N + 2) / 2

increment (+= 1)

0 to (N2 + N) / 2

equality (==)

1 to (N2 - N) / 2

array accesses

2 to N2 - N

Order of growth

Operation

Worst Case Growth

increment (+= 1)

N2

17 of 32

Your Turn: Worst Case Order of Growth for dup2

  • Only consider the worst case.
  • Pick a representative operation (cost model).
  • Ignore lower order terms.
  • Ignore multiplicative constants.

The worst case order of growth of the runtime for dup2 is …

17

Operation

dup2

i = 0

1

less-than (<)

1 to N

increment (+= 1)

0 to N - 1

equality (==)

1 to N - 1

array accesses

2 to 2N - 2

Order of growth

Operation

Worst Case Growth

array accesses

N

Operation

Worst Case Growth

Q

18 of 32

Simplified Modeling Process

Rather than building the entire table, we can instead:

  1. Choose a representative operation to count (cost model).
  2. Figure out the order of growth for the count of the representative operation by either:
    • Making an exact count and then discarding the unnecessary pieces.
    • After lots of practice, using inspection to determine order of growth.

Let’s redo our analysis of dup1 with this new process.

This time, we’ll show all our work.

18

19 of 32

Worst Case Order of Growth: Exact Count of == Operations

The worst case order of growth of the runtime for dup1 is N2.

19

int N = A.length; // N == 6

for (int i = 0; i < N; i += 1)

for (int j = i + 1; j < N; j += 1)

if (A[i] == A[j])

return true;

return false;

==

==

==

==

==

==

==

==

==

==

==

==

==

==

==

0

1

2

3

4

5

0 1 2 3 4 5

i

j

20 of 32

Worst Case Order of Growth: Geometric Argument

The worst case order of growth of the runtime for dup1 is N2.

20

int N = A.length; // N == 6

for (int i = 0; i < N; i += 1)

for (int j = i + 1; j < N; j += 1)

if (A[i] == A[j])

return true;

return false;

==

==

==

==

==

==

==

==

==

==

==

==

==

==

==

0

1

2

3

4

5

0 1 2 3 4 5

i

j

Area of right triangle of side length N - 1.

Order of growth of area is N2.

21 of 32

Runtime Analysis Process

Comprehending. Understanding the implementation details of a program.

Modeling. Counting the number of steps in terms of N, the size of the input.

Case Analysis. How certain conditions affect the program execution.

Asymptotic Analysis. Describing what happens for very large N, as N→∞.

Formalizing. Summarizing the final result in precise English or math notation.

The worst case order of growth of the runtime for dup1 is N2.

21

boolean dup1(int[] A)

Consider every pair

Array contains a duplicate at front

Array contains no duplicate items

Constant time

Quadratic time

Θ(1)

Θ(N2)

Overall Asymptotic Runtime Bound

Worst case

Best case

22 of 32

Runtime Analysis Process

Comprehending. Understanding the implementation details of a program.

Modeling. Counting the number of steps in terms of N, the size of the input.

Case Analysis. How certain conditions affect the program execution.

Asymptotic Analysis. Describing what happens for very large N, as N→∞.

Formalizing. Summarizing the final result in precise English or math notation.

The worst case order of growth of the runtime for dup1 is N2.

22

boolean dup1(int[] A)

Consider every pair

Array contains a duplicate at front

Array contains no duplicate items

Constant time

Quadratic time

Θ(1)

Θ(N2)

Overall Asymptotic Runtime Bound

Worst case

Best case

23 of 32

Order of Growth Exercise

Informally, what is the shape of each function for very large N?

In other words, what is the order of growth of each function?

23

Q

Function

Order of Growth

N3 + 3N4

N4

(1 / N) + N3

N3

(1 / N) + 5

1

NeN + N

NeN

40 sin(N) + 4N2

N2

Function

Order of Growth

N3 + 3N4

(1 / N) + N3

(1 / N) + 5

NeN + N

40 sin(N) + 4N2

24 of 32

Big-Theta Notation

Suppose we have a function R(N) with order of growth f(N). In Big-Theta, we write this as:

24

Function

Big-Theta

N3 + 3N4

Θ(N4)

(1 / N) + N3

Θ(N3)

(1 / N) + 5

Θ(1)

NeN + N

Θ(NeN)

40 sin(N) + 4N2

Θ(N2)

25 of 32

Big-Theta Definition

means there exist positive constants k1 and k2 such that

for all values of N greater than some N0.

25

“Very large N”

26 of 32

Big-Theta Challenge

Find a simple f(N) and corresponding k1 and k2.

means there exist positive constants k1 and k2 such that

for all values of N greater than some N0.

26

Q

Demo

27 of 32

Big-O Notation

Whereas Big-Theta can informally be thought of as something like “equals”, Big-O can be thought of as “less than or equal”.

All of the following are true.

27

28 of 32

Big-O Definition

means there exists a positive constant k2 such that

for all values of N greater than some N0.

28

“Very large N”

29 of 32

Big-Omega Definition

means there exists a positive constant k1 such that

for all values of N greater than some N0.

29

“Very large N”

30 of 32

Runtime Analysis Process

Comprehending. Understanding the implementation details of a program.

Modeling. Counting the number of steps in terms of N, the size of the input.

Case Analysis. How certain conditions affect the program execution.

Asymptotic Analysis. Describing what happens for very large N, as N→∞.

Formalizing. Summarizing the final result in precise English or math notation.

The overall order of growth of the runtime for dup1 is

30

boolean dup1(int[] A)

Consider every pair

Array contains a duplicate at front

Array contains no duplicate items

Constant time

Quadratic time

Θ(1)

Θ(N2)

Overall Asymptotic Runtime Bound

Worst case

Best case

31 of 32

Asymptotic Analysis and Case Analysis

For a very large array with billions of elements (i.e. asymptotic analysis), is it possible for dup1 to execute only 2 less-than (<) operations?

31

Operation

dup1: Quadratic/Parabolic

dup2: Linear

i = 0

1

1

less-than (<)

2 to (N2 + 3N + 2) / 2

1 to N

increment (+= 1)

0 to (N2 + N) / 2

0 to N - 1

equality (==)

1 to (N2 - N) / 2

1 to N - 1

array accesses

2 to N2 - N

2 to 2N - 2

32 of 32

Overall Asymptotic Runtime Bound for dup1

Give an overall asymptotic runtime bound for R as a combination of Θ, O, and/or Ω notation. Take into account both the best and the worst case runtimes (Rbest and Rworst).

32

Q

Demo