
Lecture 3:

Computational Complexity & Searching

Sookyung Kim

2 of 47

Algorithms & Complexity


What is an Algorithm?

  • Computational procedure to solve a problem


Efficiency of an Algorithm

Do you prefer a fast, efficient computer program, or a slow one?

Of course, we always expect our computers to do their jobs as efficiently as possible!


Computational Complexity

  • Cost of an algorithm = sum of its operation costs
  • Model of computation specifies
    • What operations an algorithm is allowed to use
    • Cost (time, space) of each operation
  • Execution costs
    • Time complexity of a program: how much time?
    • Space complexity of a program: how much memory?


Measuring Time Complexity

  • Measure execution time in seconds using a client program (e.g., Python's time module)
    • (+) Easy to measure
    • (+) Gives actual time
    • (-) May require a large amount of time to run
    • (-) Results depend on many factors (machine, compiler, data, …)
  • Count the number of operations as a function of the input size N
    • (+) Machine independent
    • (+) Shows the algorithm's scalability
    • (-) Tedious to compute
    • (-) Does not give actual time
  • Fortunately, we care only about asymptotic behavior (with a very large N – Big Data!)
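The first approach can be sketched in a few lines of Python. This is a minimal illustration with names of our own choosing (`linear_search`, a million-element test list); `time.perf_counter` is the standard wall-clock timer.

```python
import time

def linear_search(lst, value):
    # Scan left to right; return the index of value, or -1 if absent.
    for i in range(len(lst)):
        if lst[i] == value:
            return i
    return -1

data = list(range(1_000_000))

start = time.perf_counter()      # wall-clock time, in seconds
linear_search(data, -1)          # worst case: the value is absent
elapsed = time.perf_counter() - start

print(f"elapsed: {elapsed:.4f} s")   # varies with machine, load, data
```

The number printed varies from run to run and machine to machine, which is exactly why we also want a machine-independent operation count.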


Elementary School Algorithm

Example: Integer multiplication

  • Input: two N-digit numbers x, y
  • Output: product of x and y
  • Primitive operations allowed:
    • Add 2 single-digit numbers
    • Multiply 2 single-digit numbers

   5678
 × 1234
 ------
  22712
 17034
11356
5678
-------
7006652

(carry digits omitted)


Elementary School Algorithm

Example: Integer multiplication

  • Input: two N-digit numbers x, y
  • Output: product of x and y
  • Primitive operations allowed:
    • Add 2 single-digit numbers
    • Multiply 2 single-digit numbers
  • How many primitive operations used?

For each row: N single-digit multiplications and (up to) N-1 additions, i.e., up to 2N-1 operations.
With N rows, that is (up to) N(2N-1) operations for the partial products, plus (up to) N² additions to sum the rows.

Total operations ≤ 3N²

   5678
 × 1234
 ------
  22712
 17034
11356
5678
-------
7006652
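The grade-school procedure can be implemented directly, counting single-digit multiplications as it goes. A sketch under our own naming (additions and carries are folded into Python's arithmetic, so only multiplications are counted here):

```python
def school_multiply(x, y):
    """Grade-school multiplication of non-negative integers x and y,
    returning (product, number of single-digit multiplications)."""
    xs = [int(d) for d in str(x)][::-1]   # digits, least significant first
    ys = [int(d) for d in str(y)][::-1]
    ops = 0
    total = 0
    for j, dy in enumerate(ys):           # one partial product per row
        row = 0
        for i, dx in enumerate(xs):
            row += dx * dy * 10 ** i      # one single-digit multiplication
            ops += 1
        total += row * 10 ** j            # shift the row, then add it in
    return total, ops

print(school_multiply(5678, 1234))   # (7006652, 16): N² = 16 multiplications for N = 4
```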


Software Engineer’s Example

Let’s denote len(list) as N:

def linear_search(list, value):
    for i in range(len(list)):
        if list[i] == value:
            return i
    return -1

Operation       Count
==              1 to N

def selection_sort(list):
    for i in range(len(list)):
        smallest = i
        for j in range(i+1, len(list)):
            if list[j] < list[smallest]:
                smallest = j
        list[i], list[smallest] = list[smallest], list[i]

Operation       Count
smallest = i    N
<               (N² - N)/2
smallest = j    0 to (N² - N)/2
swap            N

Big O Notation

How can we characterize time complexity more formally and simply?


Simplification of Time Complexity

1. We care only about the worst-case performance!

← because we do not know in advance what input data we will get.

Operation       Count
smallest = i    N
<               (N² - N)/2
smallest = j    0 to (N² - N)/2
swap            N


Simplification of Time Complexity

2. Focus only on a single operation with the highest order of growth (=most expensive).

  • There may be multiple good choices. Then, just choose any of them.

Operation       Count
smallest = i    N
<               (N² - N)/2
smallest = j    (N² - N)/2
swap            N


Simplification of Time Complexity

3. Remove lower order terms.

← They do not really matter, since the higher order one dominates.

Operation       Count
smallest = i    Ignored
<               Ignored
smallest = j    (N² - N)/2
swap            Ignored


Simplification of Time Complexity

4. Remove constants.

← We have already thrown away information at step 2. At this stage, constants are not meaningful.

Worst-case order of growth: N²

Operation       Count
smallest = i    Ignored
<               Ignored
smallest = j    N²/2
swap            Ignored


Formal Definition

  • If a function T(N) has order of growth less than or equal to f(N), we write T(N) ∈ O(f(N)), where O is called Big-O notation.
  • More mathematically, T(N) ∈ O(f(N)) means that there exist a positive constant c and a threshold N₀ such that T(N) ≤ c·f(N) for all N > N₀ (i.e., for large N).


Example

  • T(N) = N² + 5N⁵
  • T(N) ∈ O(N⁵)?
    • That is, is there a positive constant c such that N² + 5N⁵ ≤ c·N⁵ for large N?
    • Yes!
      • N² + 5N⁵ ≤ N⁵ + 5N⁵ = 6N⁵ for N ≥ 1
      • With c = 6, it holds.
  • T(N) ∈ O(N⁷)?
    • That is, is there a positive constant c such that N² + 5N⁵ ≤ c·N⁷ for large N?
    • Yes!
      • N² + 5N⁵ ≤ N⁵ + 5N⁵ = 6N⁵ ≤ 6N⁷
      • With c = 6, it holds.
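The claim for c = 6 can also be sanity-checked numerically. This is a spot check over a finite range, not a proof:

```python
def T(N):
    return N**2 + 5 * N**5

# c = 6 should satisfy T(N) <= 6·N^5 for all N >= 1,
# since N^2 <= N^5 whenever N >= 1.
ok = all(T(N) <= 6 * N**5 for N in range(1, 1001))
print(ok)   # True
```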


Example (cont'd)

  • T(N) = N² + 5N⁵
  • T(N) ∈ O(N⁴)?
    • That is, is there a positive constant c such that N² + 5N⁵ ≤ c·N⁴ for large N?
    • No 😕
      • Even for a very large c, whenever N > c we have 5N⁵ > N⁵ > c·N⁴.
  • We are usually interested in the tightest order; in this example, O(N⁵).


Exercise

  • T(N) = 2N² + 3N → O(N²)
  • T(N) = 1/N + 100 → O(1)
  • T(N) = 100·cos(N) + 50N² → O(N²)
  • T(N) = log N + 2N → O(N)
  • T(N) = 2^N + N² → O(2^N)


What is Important for Asymptotic Analysis?

  • Compare the two algorithms below:
    • Algo1 requires 2N² operations, while
    • Algo2 requires 500N operations.
    • Algo1 is faster than Algo2 for small N, but becomes much slower for very large N.
    • What is important? How fast the function grows!
  • Order of growth: [plot of 500N vs. 2N²; 2N² overtakes 500N once N > 250]


Asymptotic Analysis


Homework – Problem 1


Homework – Problem 2


Searching


Searching Problem

Given a list of objects and a single target, locate it if it exists.

Example:

Where is 4? → [6].

Where is 17? → It does not exist.

How to solve it?

What is the most efficient way?

Index:  0   1   2   3   4   5   6   7   8   9
Value: -7  15   2   6  -1   5   4  10  -4  21


Linear Search

  • Let’s try the simplest algorithm: checking one by one!
    • Check [0]. If it contains the target value, return 0. Otherwise, move to the next.
    • Check [1]. If it contains the target value, return 1. Otherwise, move to the next.
    • …
    • Check [N-1]. If it contains the target value, return N-1. Otherwise, return -1.

def linear_search(list, value):
    for i in range(len(list)):
        if list[i] == value:
            return i
    return -1

Time complexity? O(N)

Can we do better?
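Running linear search on the example list confirms both behaviors. A quick sketch:

```python
def linear_search(lst, value):
    # Check each position in turn; return its index or -1.
    for i in range(len(lst)):
        if lst[i] == value:
            return i
    return -1

values = [-7, 15, 2, 6, -1, 5, 4, 10, -4, 21]   # the example list above
print(linear_search(values, 4))    # 6  (found at index 6)
print(linear_search(values, 17))   # -1 (not present)
```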


Searching Problem

Now, let’s consider an additional condition:

Given a sorted list of objects and a single target, locate it if it exists.

Example:

Where is 4? → [4].

Where is 17? → It does not exist.

Can we use the linear search method to solve this version?

Can we solve it more efficiently?

Index:  0   1   2   3   4   5   6   7   8   9
Value: -7  -4  -1   2   4   5   6  10  15  21

Yes!


Analogy to Dictionary Search

  • Suppose you are searching for a word in an English dictionary.
  • If you use the “linear search” method, how many words do you expect to check before you find the word?
  • Do you really do this? Is there a better way?

About N/2, where N is the number of words in the dictionary!


Binary Search

  • Linear search does work for a sorted list, but does NOT take advantage of the fact that it is sorted.
  • Main idea: Check what value is stored in the middle. If it is smaller than the target, we can ignore all values before this. Otherwise, we can ignore all values after this.
  • For instance, target = 15:

15 should appear on the right side of this!
→ We can safely ignore all values on the left.

Index:  0   1   2   3   4   5   6   7   8   9
Value: -7  -4  -1   2   4   5   6  10  15  21


Binary Search

  • In the next step, we repeat the same procedure with the reduced list:

Index:  5   6   7   8   9
Value:  5   6  10  15  21

Again, 15 should appear on the right side of the middle value!
→ We can safely ignore all values on the left.

  • Repeat with the reduced list:

Index:  8   9
Value: 15  21

Okay, we found it!


Binary Search

  • What if we are searching for a non-existent value? For instance, 14?

Index:  8   9
Value: 15  21

14 should be on the left of the middle value, 15.
→ We can safely ignore all values on the right.

  • Removing them gives an empty list. We confirm that there is no 14.


Binary Search

Example: value = 7? (first/last/mid at each iteration)

iter   first   last   mid
  1      0       9     4
  2      5       9     7
  3      5       6     5
  4      6       6     6
  5      7       6     -    (first > last: loop ends, return -1)

def binary_search(list, value):
    first = 0
    last = len(list) - 1
    while first <= last:
        mid = (first + last) // 2
        if list[mid] == value:
            return mid
        elif list[mid] < value:
            first = mid + 1
        else:
            last = mid - 1
    return -1

Index:  0   1   2   3   4   5   6   7   8   9
Value: -7  -4  -1   2   4   5   6  10  15  21
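The iteration table can be reproduced by adding a print to the loop. A sketch (the tracing print is our own addition; the search logic is the slide's):

```python
def binary_search_trace(lst, value):
    """Binary search over a sorted list, printing each probe."""
    first, last = 0, len(lst) - 1
    while first <= last:
        mid = (first + last) // 2
        print(f"first={first} last={last} mid={mid}")
        if lst[mid] == value:
            return mid
        elif lst[mid] < value:
            first = mid + 1     # target is to the right of mid
        else:
            last = mid - 1      # target is to the left of mid
    return -1                   # range is empty: value not present

sorted_values = [-7, -4, -1, 2, 4, 5, 6, 10, 15, 21]
print(binary_search_trace(sorted_values, 7))   # -1 after four probes
```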


Time Complexity

  • Linear search
    • Applicable to any list
    • Time complexity: O(N)
  • Binary search
    • Applicable only to a sorted list
    • Time complexity: O(log N), since each step halves the remaining search range


Additional Problems

  • Does binary search work when the list contains duplicates?
    • To make it return any one of the duplicates?
    • To make it return the leftmost one?
    • To make it return the rightmost one?
  • Find the first element greater than or equal to the query.
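One way to approach the leftmost-occurrence variant (and the "first element ≥ query" problem) is to keep shrinking toward the left boundary instead of returning on the first match; Python's standard `bisect` module implements exactly this behavior. A sketch using it (the `leftmost` wrapper is our own):

```python
import bisect

def leftmost(lst, value):
    """Index of the leftmost occurrence of value in sorted lst, or -1."""
    i = bisect.bisect_left(lst, value)   # first index i with lst[i] >= value
    if i < len(lst) and lst[i] == value:
        return i
    return -1

data = [1, 2, 2, 2, 3, 5]
print(leftmost(data, 2))              # 1: the first of the three 2s
print(leftmost(data, 4))              # -1: 4 is absent
print(bisect.bisect_left(data, 4))    # 5: first element >= 4 sits at index 5
```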


Recursion


Recursive Algorithm

  • An algorithm that calls itself on subproblems.
  • The subproblems are of exactly the same type as the original problem, just at a smaller scale.
  • “Complex problems can be viewed much more simply.”
  • Examples:
    • Division in elementary school
    • Binary search
    • Sorting (selection sort, merge sort, …)
    • Tree traversal


Key Components in Recursive Algorithm

  • Base case: the simplest case, where the algorithm can trivially conclude without any further recursive call.
    • E.g., an empty list in searching, a single element in sorting, …
    • Usually this case is solved in O(1).
  • General case: the algorithm makes one or more recursive calls to solve exactly the same problem at a smaller scale.
    • Each recursive call must be on a strictly smaller input; otherwise the algorithm may never terminate.
    • Important! Make sure the same subproblem is not computed repeatedly.
  • Keep in mind: your algorithm must finish with the correct answer for all possible inputs. Then make it as efficient as possible.
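Both components are easiest to see in a minimal example. Factorial is our own illustration here, not one of the examples above:

```python
def factorial(n):
    if n == 0:                        # base case: solved trivially in O(1)
        return 1
    return n * factorial(n - 1)       # general case: strictly smaller input

print(factorial(5))   # 120
```

Each call reduces n by one, so the base case n == 0 is always reached and the recursion terminates.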


Binary Search: Recursive Version

“Same problem of a different size”: each recursive call searches half of the previous range.

  • Up to O(log N) recursive calls.
  • Each call takes O(1): computing mid, comparing with value, returning the output.

def binary_search(list, value, first, last):
    if first > last:
        return -1
    else:
        mid = (first + last) // 2
        if list[mid] == value:
            return mid
        elif list[mid] < value:
            return binary_search(list, value, mid+1, last)
        else:
            return binary_search(list, value, first, mid-1)

binary_search(list, value, 0, len(list) - 1)

Time complexity? O(log N)


Reversing a String

  • Print a string in reverse order.
    • E.g., ACTGCC → CCGTCA

Non-recursive version:

def reverse(str):
    output = ""
    for i in range(len(str)):
        output += str[-i-1]
    return output

Recursive version (two equivalent choices; pick one):

def reverse(str):
    if not str:
        return str
    else:
        return str[-1] + reverse(str[:-1])
        # or equivalently: return reverse(str[1:]) + str[0]


Combination

  • Combination: a selection of r items from a set of n distinct members, such that the order of selection does not matter.
  • Recursive definition:
    • C(n, r) = 1 if r = 0 or r = n
    • C(n, r) = 0 if r > n
    • C(n, r) = C(n - 1, r - 1) + C(n - 1, r) otherwise

def comb(n, r):
    if r == 0 or r == n:
        return 1
    elif r > n:
        return 0
    else:
        return comb(n-1, r-1) + comb(n-1, r)

This implementation is not efficient. Why?
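The recursion solves the same (n, r) subproblems over and over. One standard fix is to cache results; a sketch using `functools.lru_cache` (the standard library's `math.comb` is another option):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def comb_memo(n, r):
    """Same recursion as comb, but each (n, r) pair is computed only once."""
    if r == 0 or r == n:
        return 1
    elif r > n:
        return 0
    else:
        return comb_memo(n - 1, r - 1) + comb_memo(n - 1, r)

print(comb_memo(30, 15))   # 155117520, computed almost instantly
```

With caching, the number of distinct calls is O(n·r) rather than exponential.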


Combination

def comb(n, r):
    if r == 0 or r == n:
        return 1
    elif r > n:
        return 0
    else:
        return comb(n-1, r-1) + comb(n-1, r)

E.g., comb(n-2, r-1) is reached from both comb(n-1, r-1) and comb(n-1, r): the same subproblems are recomputed over and over, so the call tree grows exponentially.


Recursion Summary

  • It provides a simpler way to interpret the problem.
  • It becomes extremely inefficient if there is duplicated computation.
  • Make sure that your algorithm always reaches a base case so that it terminates.


Homework 1 – Binary Search


Homework 2 – Binary Search


Homework 3 – Binary Search


Homework 4 - Recursion


Homework 5 - Recursion


Homework 6 - Recursion
