CS 161 - Section 3
CA : [Name of CA]
Section 3 agenda
Sorting and Selecting
Main idea: Divide and Conquer
[Diagram: a big problem divided into two smaller problems, each divided into yet smaller problems]
SELECT
Base Case: If len(A) = O(1), then any sorting algorithm runs in time O(1), so just sort and return the k’th element.
Case 1: We got lucky and the pivot is exactly the k’th smallest value!
Case 2: The k’th smallest value is in the first part of the list, so recurse on that part.
Case 3: The k’th smallest value is in the second part of the list, so recurse there with k adjusted.
Partitioning
6
[Example array: 9, 8, 3, 6, 1, 4, 2, with one element marked as the pivot]
L = array with things smaller than A[pivot]
R = array with things larger than A[pivot]
Say we want to find SELECT(A, k)
Partitioning like this takes time O(n) since we don’t care about sorting each half.
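This O(n) partition can be sketched in Python (a minimal sketch assuming distinct elements; the function name and signature are illustrative):

```python
def partition(A, pivot_idx):
    """Split A into L (elements smaller than the pivot) and R (elements
    larger), with a single O(n) pass -- neither half gets sorted.
    Assumes distinct elements for simplicity."""
    pivot = A[pivot_idx]
    L = [x for x in A if x < pivot]
    R = [x for x in A if x > pivot]
    return L, pivot, R

# On the example array, with the pivot at index 3 (value 6):
L, p, R = partition([9, 8, 3, 6, 1, 4, 2], 3)
# L = [3, 1, 4, 2], p = 6, R = [9, 8]
```

Note that each pass compares every element to the pivot once, which is where the O(n) bound comes from.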
Choosing the pivot
[Example: 25 elements split into groups of 5:
5, 9, 1, 3, 4
1, 8, 9, 3, 15
12, 2, 1, 5, 20
15, 13, 2, 4, 6
12, 1, 15, 22, 3]
Finding the median of each group takes O(1) time, since each group has size 5. With m = n/5 groups, that’s O(m) = O(n) total for the for loop.
Medians of the groups: 8, 4, 5, 6, 12
Pivot is SELECT(medians, 3) = 6
PARTITION around that 6:
L = the elements smaller than 6; R = the elements larger than 6.
R is almost the same size as L.
getPivot(A):
  Split A into m = ceil(n/5) groups of at most 5 elements each
  Find the median of each group
  Return SELECT(medians, ceil(m/2))
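A hedged Python sketch of this median-of-medians pivot (the name `get_pivot` mirrors the slide’s getPivot; for clarity the final median here is taken by sorting the short list of medians, where the real algorithm would call SELECT recursively):

```python
def get_pivot(A):
    """Median-of-medians pivot sketch."""
    # Split into groups of at most 5; each group costs O(1) to handle.
    groups = [A[i:i + 5] for i in range(0, len(A), 5)]
    # Median of each group: sort the (size <= 5) group and take the middle.
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # The full algorithm computes this via a recursive SELECT call;
    # sorting the short medians list stands in for that here.
    return sorted(medians)[(len(medians) - 1) // 2]
```

On the slide’s 25-element example this returns 6, matching SELECT(medians, 3) = 6 above.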
SELECT
Base Case: If len(A) = O(1), then any sorting algorithm runs in time O(1).
Case 1: We got lucky and found exactly the k’th smallest value!
Case 2: The k’th smallest value is in the first part of the list
Case 3: The k’th smallest value is in the second part of the list
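The base case and three cases above can be sketched as a recursive quickselect (a minimal sketch assuming distinct elements and 1-indexed k; the middle-element pivot here is a placeholder for getPivot):

```python
def select(A, k):
    """Return the k'th smallest element of A (1-indexed).
    Sketch assuming distinct elements."""
    if len(A) <= 5:                      # base case: O(1)-size input, just sort
        return sorted(A)[k - 1]
    pivot = A[len(A) // 2]               # placeholder pivot choice, not getPivot
    L = [x for x in A if x < pivot]      # first part: smaller than the pivot
    R = [x for x in A if x > pivot]      # second part: larger than the pivot
    if k == len(L) + 1:                  # Case 1: the pivot IS the k'th smallest
        return pivot
    elif k <= len(L):                    # Case 2: answer lies in the first part
        return select(L, k)
    else:                                # Case 3: answer lies in the second part
        return select(R, k - len(L) - 1)
```

With the median-of-medians pivot, both L and R are guaranteed to be a constant fraction of A, which is what makes the worst-case runtime O(n).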
Quick Sort
https://www.youtube.com/watch?v=ywWBy6J5gz8&ab_channel=AlgoRythmics
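A hedged sketch of the basic (out-of-place) QuickSort with a random pivot, before turning to the in-place version; expected runtime O(n log n):

```python
import random

def quicksort(A):
    """Out-of-place QuickSort sketch with a random pivot
    (not the in-place version discussed next)."""
    if len(A) <= 1:
        return A
    pivot = random.choice(A)
    L = [x for x in A if x < pivot]
    E = [x for x in A if x == pivot]   # elements equal to the pivot
    R = [x for x in A if x > pivot]
    return quicksort(L) + E + quicksort(R)
```

This version is easy to read but copies the array at every level; the in-place partition below avoids that extra memory.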
In-Place Quick Sort [O(1) extra memory for partitioning!]
A better way to do Partition
[In-place partition trace on 8, 7, 1, 3, 5, 6, 4 with pivot 4 at the end:
8, 7, 1, 3, 5, 6, 4
1, 7, 8, 3, 5, 6, 4  (swap 1 and 8)
1, 3, 8, 7, 5, 6, 4  (swap 3 and 7)
1, 3, 4, 7, 5, 6, 8  (swap the pivot 4 into place)]
Pivot: choose it randomly, then swap it with the last element so it sits at the end.
Initialize both bars at the start of the array.
Step the scanning bar forward.
When it sees something smaller than the pivot, swap the elements just ahead of the two bars and increment both bars.
Repeat till the end, then put the pivot in the right place.
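These steps correspond to a Lomuto-style in-place partition; a minimal sketch (function and variable names are illustrative):

```python
import random

def partition_in_place(A, lo, hi):
    """Partition A[lo..hi] in place around a random pivot using O(1)
    extra memory. Returns the pivot's final index."""
    p = random.randint(lo, hi)
    A[p], A[hi] = A[hi], A[p]        # move the pivot to the end
    pivot = A[hi]
    bar = lo                         # boundary bar: A[lo..bar-1] < pivot
    for scan in range(lo, hi):       # scanning bar steps forward
        if A[scan] < pivot:          # saw something smaller than the pivot:
            A[bar], A[scan] = A[scan], A[bar]  # swap ahead of the bars
            bar += 1                 # ...and advance the boundary bar
    A[bar], A[hi] = A[hi], A[bar]    # finally, put the pivot in its place
    return bar
```

Everything left of the returned index is smaller than the pivot and everything right of it is larger, so QuickSort can recurse on the two halves without any auxiliary arrays.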
Quick Sort vs Merge Sort
| | QuickSort (random pivot) | MergeSort (deterministic) |
| Running time | Expected O(n log(n)), worst case O(n^2) | Worst case O(n log(n)) |
| In-Place? (With O(log(n)) extra memory) | Yes, can be implemented in-place (relatively) easily | Not as easily, since you’d have to sacrifice stability and runtime, but it can be done |
| Stable? | No | Yes |
Stable sorting algorithms sort identical elements in the same order as they appear in the input.
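A small illustration of why stability matters when records carry more than a key (Python’s built-in sort is stable, like MergeSort; the record values here are made up for the example):

```python
# Records: (name, grade). Sort by grade only; a stable sort keeps the
# original name order among equal grades. An unstable sort (like a
# plain QuickSort) makes no such guarantee.
records = [("Alice", 2), ("Bob", 1), ("Carol", 2), ("Dave", 1)]
by_grade = sorted(records, key=lambda r: r[1])
# Bob stays ahead of Dave, and Alice ahead of Carol:
# [('Bob', 1), ('Dave', 1), ('Alice', 2), ('Carol', 2)]
```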
Which one would you use for a small array?
Which one would you use for an array with millions of elements?
Which one would you use for an array of unknown size?
Quick Sort vs Merge Sort
Given the small size, it mostly does not matter. You could still argue for QuickSort, since its expected runtime is O(n log(n)) and in practice it is often faster than that; even if we get unlucky, the worst case is O(n^2), and at this scale the difference will be on the order of milliseconds.
For large n, the law of large numbers kicks in, so we can reasonably expect both algorithms to run in O(n log(n)) time. The choice then comes down to priorities: stability versus memory.
This might be a bit of a personal choice, but for unknown inputs (assuming you don’t even know the expected range of sizes), we usually value the predictability and consistency of a deterministic algorithm, so MergeSort would be preferred. Randomized algorithms can be quite challenging to debug.
Runtime for Randomized Algorithms: expected runtime vs. worst-case runtime
Thank You!