1 of 54

Comparison Based Sorting Algorithm Analysis

4th Semester CMSA

By

Sumana Bandyopadhyay

2 of 54

How fast a comparison-based sort can work

  • In a decision tree, each vertex represents a key comparison and the branches indicate the possible results.
  • A path through a decision tree represents a possible sequence of comparisons that an algorithm could perform.
  • When sorting n elements there are n! possible outcomes, so a decision tree must have n! leaves.
  • As a decision tree is a binary tree, its height must be at least log2 n! ≈ (n/2) log2 (n/2); hence no comparison sort can do better than O(n log n) comparisons in the worst case.

3 of 54

Quick Sort

  • The sorting algorithm Quick sort was invented by Hoare and is based on the principle of Divide and Conquer.
  • It works in two phases: first, partition the list with respect to a pivot element, such that all elements left of the pivot are less than or equal to the pivot, and all elements larger than the pivot are to its right.
  • Second, recursively invoke Quick sort on both the left part and the right part of the list.
  • The dividing process continues till the size of a sublist reduces to 1.

4 of 54

Quick Sort Algorithm

  • Quick sort(A, p, r) // sorts array A[], whose lowest index is p and highest index is r

1. If (p < r) then

1.1 q <- partition(A, p, r) // partition returns the index q of the pivot element

1.2 Quick sort(A, p, q-1)

1.3 Quick sort(A, q+1, r)

2. End of If

3. End

5 of 54

Quick sort algorithm (contd.)

Partition

Input: data type A[ ] , p, r

Output: pivot position as index

  1. x <- A[r]
  2. i <- p-1
  3. Repeat for j <- p to r-1

3.1 if (A[j] ≤ x) then // keep every element smaller than or equal to the chosen pivot in the left portion of the array

3.1.1 i <- i+1

3.1.2 swap(A, i, j)

3.2 end of if

4. swap(A, i+1, r) // place the pivot just to the right of the last element found to be less than or equal to it

5. return (i+1)
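As a concrete sketch, the two routines above translate into the following Python (our 0-based rendering of the slides' pseudocode; the names `quick_sort` and `partition` mirror the slides):

```python
def partition(A, p, r):
    """Partition A[p..r] around the pivot x = A[r]; return the pivot's final index."""
    x = A[r]
    i = p - 1
    for j in range(p, r):                 # step 3: j runs from p to r-1
        if A[j] <= x:                     # step 3.1: A[j] belongs left of the pivot
            i += 1
            A[i], A[j] = A[j], A[i]       # step 3.1.2: swap(A, i, j)
    A[i + 1], A[r] = A[r], A[i + 1]       # step 4: place the pivot at i+1
    return i + 1                          # step 5

def quick_sort(A, p, r):
    if p < r:                             # step 1
        q = partition(A, p, r)            # step 1.1
        quick_sort(A, p, q - 1)           # step 1.2: left part
        quick_sort(A, q + 1, r)           # step 1.3: right part
```

For example, `quick_sort(data, 0, len(data) - 1)` sorts `data` in place.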

6 of 54

Performance Analysis

  • To analyse the performance of Quick sort, we have to determine the number of comparisons and the number of swaps.
  • We characterize the performance of any sorting algorithm by its behaviour in the worst case, average case and best case.
  • To sort an n-element list, let us assume C(n) comparisons and S(n) swaps are required.
  • Partition compares every key with the pivot element exactly once, hence makes (n – 1) comparisons.
  • After one execution of Partition, assume the sublist left of the pivot has size t and the sublist right of the pivot has size (n – t – 1).

So, C(n) = (n – 1) + C(t) + C(n – t – 1)

7 of 54

Performance Analysis: Worst Case (Contd.)

Suppose that, according to the ordering of the elements of A[], the rightmost element chosen as pivot is either less than all the remaining elements or greater than all of them. In such a partitioning, one sublist has 0 elements and the other has (n-1) elements.

So, C(0) = 0, C(n) = (n-1) + C(n-1)

This is a recurrence relation:

C(1) = 0, C(2) = 1 + C(1), C(3) = 2 + C(2), C(4) = 3 + C(3), ……,

C(n) = (n-1) + C(n-1)

Expanding,

C(n) = (n-1) + (n-2) + (n-3) + ….. + 3 + 2 + 1

= ½ × n × (n-1)

= 0.5 × n² − 0.5 × n, i.e., O(n²)
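This worst-case count can be checked empirically. The following Python sketch (ours, not from the slides) instruments the partition routine described earlier and confirms that an already-sorted input costs exactly n(n−1)/2 comparisons:

```python
def quicksort_comparisons(A, p, r):
    """Sort A[p..r] with last-element pivot; return the number of key comparisons."""
    if p >= r:
        return 0
    x, i, c = A[r], p - 1, 0
    for j in range(p, r):
        c += 1                            # one comparison of A[j] with the pivot
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    q = i + 1
    return (c + quicksort_comparisons(A, p, q - 1)
              + quicksort_comparisons(A, q + 1, r))

n = 10
# sorted input: every pivot is the maximum, so C(n) = n(n-1)/2
assert quicksort_comparisons(list(range(n)), 0, n - 1) == n * (n - 1) // 2
```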

8 of 54

Average case analysis

  • One call of Partition makes (n-1) comparisons.
  • Partition breaks A[] into the subarrays A[1:p-1] and A[p+1:n], where p is the final pivot position.
  • Hence, C(n, p) = (n-1) + C(p-1) + C(n-p), where p ranges from 1 to n.
  • Averaging over the n equally likely pivot positions:

C(n) = n-1 + (2/n)×(C(0)+C(1)+C(2)+…..+C(n-1))….(1)

C(n-1) = n-2 + (2/(n-1))×(C(0)+C(1)+…+C(n-2))……(2)

Multiplying (1) by n and (2) by (n-1) and subtracting:

n×C(n) − (n-1)×C(n-1) = 2×(n-1) + 2×C(n-1)…………(3)

so n×C(n) = (n+1)×C(n-1) + 2×(n-1); dividing by n×(n+1):

C(n)/(n+1) = C(n-1)/n + 2×(n-1)/(n×(n+1)) ≤ C(n-1)/n + 2/n….(4)

9 of 54

Average case analysis(Contd.)

  • Expanding (4) by telescoping,

C(n)/(n+1) ≤ 2×{1/n + 1/(n-1) + 1/(n-2) + ….. + 1/3} + C(2)/3…(5)

Now, the Harmonic series

Hn = {1 + 1/2 + 1/3 + ….. + 1/n} can be approximated by the area under the curve {1/x} from ½ to n+½, i.e.,

Hn = ∫ from ½ to n+½ of {1/x} dx ≈ ln n + 0.7,

Hence, C(n)/(n+1) ≈ 2×ln n………………………..(6)

Finally, C(n) ≈ 2×n×ln n ≈ 1.39×n×lg n, i.e., C(n) = O(n lg n)

10 of 54

Logic behind Quick Sort

  • The performance of Quick sort heavily depends on the value chosen as pivot.
  • If the pivot fails to divide the list into nearly equal sublists, performance degrades.
  • Hence, if a sorted list is given as input and the pivot is chosen as the first or the last element, the algorithm degenerates to its worst case.
  • For this reason, the pivot index is sometimes chosen using a random function.
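A randomized pivot choice can be sketched as below (a Python illustration of the idea, reusing the Lomuto-style partition from the earlier slides; the names are ours):

```python
import random

def randomized_partition(A, p, r):
    """Move a randomly chosen element to A[r], then partition as usual."""
    k = random.randint(p, r)              # random pivot index in [p, r]
    A[k], A[r] = A[r], A[k]
    x, i = A[r], p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_quick_sort(A, p, r):
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quick_sort(A, p, q - 1)
        randomized_quick_sort(A, q + 1, r)
```

With this change a sorted input is no longer a guaranteed worst case: the expected running time is O(n lg n) for every input ordering.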

11 of 54

Logic of Merge Sort

  • In the terminology of the Divide and Conquer paradigm, Merge Sort works as follows:
  • Divide: divide the n-element sequence to be sorted into two subsequences of n/2 elements each.
  • Conquer: sort the two subsequences recursively using merge sort.
  • Combine: merge the two sorted subsequences to produce the sorted answer.

12 of 54

Merge Sort: Contiguous Storage Version

Merge Sort

Input: A[ ], l, h

Output: A[l: h] in sorted sequence

  1. If (l<h) then

1.1 mid<- ⎣(l+h)/2⎦

1.2 Merge Sort(A[ ], l, mid)

1.3 Merge Sort(A[ ], mid+1, h)

1.4 Merge (A[ ], l, mid, h)

2. End of If

3. End Merge Sort

13 of 54

Merge Sort: Contiguous Version(contd.)

Merge

Input: A[ ], l, mid, h, B[ ]

Output: A[l:h] merged in sorted order

  1. k<-l, i<- l, j<- mid +1
  2. Repeat while ((k ≤ mid) &&(j ≤ h)) do

2.1 if (A[k] ≤ A[j]) then

2.1.1 B[i]<- A[k]

2.1.2 k<- k+1

2.2 else

2.2.1 B[i]<- A[j]

2.2.2 j <- j+1

2.3 End of if

2.4 i<- i +1

3. End of while

14 of 54

Merge Sort: Contiguous Version(contd.)

4. if (k > mid) then

4.1 Repeat for p = j to h do

4.1.1 B[i] <- A[p]

4.1.2 i <- i + 1

4.2 End of for

5. else

5.1 Repeat for p = k to mid do

5.1.1 B[i] <- A[p]

5.1.2 i <- i + 1

5.2 End of for

6. End of if

7. Repeat for i = l to h do

7.1 A[i] <- B[i]

8. End of Repeat

9. End of Algorithm
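The contiguous-storage pseudocode above can be sketched in Python as follows (our 0-based rendering; the function names mirror the slides, and a single auxiliary array B of size n is shared across all recursive calls):

```python
def merge(A, l, mid, h, B):
    """Merge the sorted runs A[l..mid] and A[mid+1..h] using auxiliary array B."""
    k, i, j = l, l, mid + 1               # step 1
    while k <= mid and j <= h:            # step 2
        if A[k] <= A[j]:
            B[i] = A[k]; k += 1
        else:
            B[i] = A[j]; j += 1
        i += 1
    if k > mid:                           # step 4: left run exhausted
        for p in range(j, h + 1):
            B[i] = A[p]; i += 1
    else:                                 # step 5: right run exhausted
        for p in range(k, mid + 1):
            B[i] = A[p]; i += 1
    for i in range(l, h + 1):             # step 7: copy back into A
        A[i] = B[i]

def merge_sort(A, l, h, B=None):
    if B is None:
        B = [0] * len(A)                  # extra space proportional to n
    if l < h:
        mid = (l + h) // 2
        merge_sort(A, l, mid, B)
        merge_sort(A, mid + 1, h, B)
        merge(A, l, mid, h, B)
```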

15 of 54

Analysis of Contiguous Version

  • The problems with the contiguous version are its requirements of: 1. extra space, 2. computer time, and 3. programming effort.
  • In the previous slides we resolved the problem using an auxiliary array large enough to hold the combined list, copying the entries into it as the lists are merged. This extra space is proportional to n.
  • In another approach, the sublists to be merged can be placed next to each other, ignoring the order they already have, and an insertion-sort-like algorithm used to put the combined list into order. The merging phase then consumes O(n²) time.
  • Lastly, algorithms are available that merge sublists into a combined list in O(n) time with a constant amount of space, but the logic is slightly complicated. Reference: B.-C. Huang and M. A. Langston, "Practical in-place merging", Communications of the ACM, vol. 31, pp. 348–352, 1988.

16 of 54

Merge Sort for Linked List

Merge Sort

Input: List_type * p

Output: List_type * head

  1. head <- p
  2. if (p != Null && p->next != Null) then

2.1 q <- Divide(p)

2.2 p<- Merge Sort (p)

2.3 q<- Merge Sort(q)

2.4 head <- Merge (p, q)

3. End of if

4. End of Merge Sort

17 of 54

Merge Sort for Linked List (contd.)

Divide

Input: List_type * p

Output: List_type *r

  1. r <- (p->next)->next
  2. q <- p
  3. Repeat while (r != Null) do

3.1 r <- (r->next)

3.2 q <- (q->next)

3.3 if (r != Null) then

3.3.1 r <- (r->next)

3.4 end of if

  4. End while
  5. r <- (q->next)
  6. (q->next) <- Null
  7. Return r

18 of 54

Merge Sort for Linked List(contd)

Merge

Input: List_type *p, *q

Output: List_type *head

  1. if (!p || !q) then

1.1 write("error: empty list can't be merged")

1.2 exit

  2. end of if
  3. if (p->data < q->data) then

3.1 head <- p

3.2 p <- (p->next)

4. else

4.1 head <- q

4.2 q <- (q->next)

5. end of if

19 of 54

Merge Sort for Linked List(contd)

6. r <- head

7. Repeat while (p != Null && q != Null) do

7.1 if (p->data ≤ q->data) then

7.1.1 (r->next) <- p

7.1.2 r <- p

7.1.3 p <- (p->next)

7.2 else

7.2.1 (r->next) <- q

7.2.2 r <- q

7.2.3 q <- (q->next)

7.3 end of if

8. end of while

9. if (p != Null) then

9.1 (r->next) <- p

10. else

10.1 (r->next) <- q

11. End of if

12. Return head
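Put together, the linked-list version above can be sketched in Python like this (our rendering; the Node class and the list-building helpers are assumptions added so the sketch is self-contained):

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def divide(p):
    """Split the list at its middle (slow/fast pointers); return head of second half."""
    r, q = p.next.next, p                # r moves two steps per one step of q
    while r is not None:
        r = r.next
        q = q.next
        if r is not None:
            r = r.next
    r, q.next = q.next, None             # cut the list just after q
    return r

def merge(p, q):
    """Merge two non-empty sorted lists; return the merged head."""
    if p is None or q is None:
        raise ValueError("empty list can't be merged")
    if p.data < q.data:
        head, p = p, p.next
    else:
        head, q = q, q.next
    r = head                             # r trails the end of the merged list
    while p is not None and q is not None:
        if p.data <= q.data:
            r.next, r, p = p, p, p.next  # link p in, then advance
        else:
            r.next, r, q = q, q, q.next  # link q in, then advance
    r.next = p if p is not None else q   # append the leftover tail
    return head

def merge_sort(p):
    if p is not None and p.next is not None:
        q = divide(p)
        return merge(merge_sort(p), merge_sort(q))
    return p

def from_list(xs):                       # helper (ours) to build a list
    head = None
    for x in reversed(xs):
        head = Node(x, head)
    return head

def to_list(p):                          # helper (ours) to read a list back
    out = []
    while p is not None:
        out.append(p.data); p = p.next
    return out
```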

20 of 54

Complexity Analyses of Merge Sort

  • Keys are compared in only one place, i.e., within the loop of the Merge function. After each comparison, one of the keys goes to the output list. To find the exact number of comparisons, let us think of the recursion tree of the algorithm.

21 of 54

Complexity Analyses of Merge Sort(contd.)

  • According to the recursion tree of the algorithm, the total number of elements at each level is precisely n, i.e., every item is treated once per level. The number of levels (excluding the leaves, which contain the single elements of the original list) is ⎡lg n⎤. Hence the total number of comparisons is bounded by O(n lg n).

[Figure: recursion tree for n = 8 — the root holds 8 elements, its two children 4 each, the next level four nodes of 2 each, and the leaves the single elements 1–8.]
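This n⎡lg n⎤ bound can be checked empirically. The following Python sketch (ours, not from the slides) counts key comparisons in a simple list-based merge sort and verifies they stay below n·⎡lg n⎤:

```python
import math

def merge_sort_count(xs):
    """Return (sorted copy of xs, number of key comparisons made while merging)."""
    if len(xs) <= 1:
        return list(xs), 0
    mid = len(xs) // 2
    left, cl = merge_sort_count(xs[:mid])
    right, cr = merge_sort_count(xs[mid:])
    out, i, j, c = [], 0, 0, 0
    while i < len(left) and j < len(right):
        c += 1                            # the single comparison site
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:]); out.extend(right[j:])
    return out, cl + cr + c

n = 1024
out, c = merge_sort_count(list(range(n, 0, -1)))
assert out == list(range(1, n + 1))
assert c <= n * math.ceil(math.log2(n))   # at most n·⎡lg n⎤ comparisons
```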

22 of 54

A[]={4, 1, 3, 2, 16, 9, 10, 14, 8, 7}


To make array A a max heap, consider its length 10, i.e., n = 10. The last internal node is at index ⎣n/2⎦ = 5, whose value is 16. 16 is greater than its left child 7, so no swap is needed.

23 of 54


A[]={4, 1, 3, 14, 16, 9, 10, 2, 8, 7}

The next internal node is at index 4 and its value is 2. 2 is less than its children; the greater child is 14. So, swap 14 and 2.


24 of 54


A[]={4, 1, 10, 14, 16, 9, 3, 2, 8, 7}

The next internal node is at index 3 and its value is 3. 3 is less than its children; the greater child is 10. So, swap 10 and 3.


25 of 54


A[]={4, 16, 10, 14, 1, 9, 3, 2, 8, 7}

The next internal node is at index 2 and its value is 1. 1 is less than its children; the greater child is 16. So, swap 16 and 1.


26 of 54


A[]={4, 16, 10, 14, 7, 9, 3, 2, 8, 1}

Now the value 1 has sifted down to index 5, and it is less than its child 7 at index 10. So, swap 1 and 7; 7 is now at index 5.


27 of 54


A[]={16, 4, 10, 14, 7, 9, 3, 2, 8, 1}

Lastly, the root is at index 1 and its value is 4. 4 is less than its children; the greater child is 16. So, swap 16 and 4.


28 of 54


A[]={16, 14, 10, 4, 7, 9, 3, 2, 8, 1}

Now 4 is at index 2 and the greater of its children is 14. 4 is less than both children, so swap 14 and 4; 4 moves to index 4.


29 of 54


A[]={16, 14, 10, 8, 7, 9, 3, 2, 4, 1}

However, 4 is still not at its proper index: its right child, 8 at index 9, is greater than 4. So, swap 4 and 8; 4 is now at index 9 and the whole array is a max heap.


30 of 54

What is Heap?

  • A heap is a complete (or nearly complete) binary tree
  • in which the key at the root is larger than or equal to the keys of its children,
  • and the left and right subtrees are again heaps.
  • Such a heap is called a Max Heap.

31 of 54

Max Heap

  • A complete binary tree can be represented by an array, say A[].
  • The root is at A[1].
  • The left child of the root is at A[2×1].
  • The right child of the root is at A[2×1+1].
  • In general, for index i, the left child is at 2×i and the right child at 2×i+1.
  • The parent of the i-th element is at ⎣i/2⎦, where the ⎣ ⎦ symbol, called floor, represents the largest integer less than or equal to its argument.
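In Python these index rules are one-liners (a sketch, using the slide's 1-based indexing; integer division plays the role of the floor):

```python
def left(i):
    return 2 * i            # left child of node i

def right(i):
    return 2 * i + 1        # right child of node i

def parent(i):
    return i // 2           # ⎣i/2⎦ via integer division

# e.g., node 5's children sit at indices 10 and 11, and its parent at index 2
assert left(5) == 10 and right(5) == 11 and parent(5) == 2
```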

32 of 54

A[]={16, 14, 10, 8, 7, 9, 3, 2, 4, 1}

[Figure: the same max heap drawn as a tree, with the array indices 1–10 written beside the nodes.]

  • A max heap shown both in array A[] and in tree representation
  • In the tree, the corresponding array index is shown beside each node

33 of 54

Max Heap

  • A theorem to keep in mind: in a binary tree, if n0 is the number of leaves and n2 is the number of internal nodes of degree 2, then n0 = n2 + 1.
  • In a heap represented by array A[1:n], where n is length(A), the elements A[⎣n/2⎦+1], A[⎣n/2⎦+2], …, A[n] are the leaves of the heap,
  • and the elements A[1], A[2], …, A[⎣n/2⎦] are the internal nodes of the heap.

34 of 54

Heap Sort

1. Given an arbitrary array, Heap sort first calls Build_Heap to construct a max heap.

2. Swap A[1] and A[n]; A[n] now holds the maximum element.

3. The heap size is now (n-1).

4. Invoke Max_heapify on A[1:n-1] to restore the heap property at the root.

5. Swap A[1] and A[n-1].

6. The heap size is now (n-2).

7. Repeat steps 4 to 6 till the heap size is reduced to 1.
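The steps above can be sketched in Python as follows (our rendering, with index 0 left unused so that the indices match the slides; Max_heapify and Build_Heap follow the usual sift-down formulation):

```python
def max_heapify(A, i, n):
    """Sift A[i] down within the heap A[1..n] until the max-heap property holds."""
    largest = i
    l, r = 2 * i, 2 * i + 1
    if l <= n and A[l] > A[largest]:
        largest = l
    if r <= n and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, n)

def build_heap(A, n):
    for i in range(n // 2, 0, -1):   # last internal node down to the root
        max_heapify(A, i, n)

def heap_sort(xs):
    A = [None] + list(xs)            # shift to 1-based indexing
    n = len(xs)
    build_heap(A, n)                 # step 1
    for k in range(n, 1, -1):
        A[1], A[k] = A[k], A[1]      # step 2/5: move current maximum to the end
        max_heapify(A, 1, k - 1)     # step 4: restore the heap on the shrunk prefix
    return A[1:]
```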

35 of 54

How much work Heap Sort does in the worst case

  • Max_heapify is a recursive function. In a single invocation from Build_heap it performs two comparisons and at most one swap at each node along a path from index i towards a leaf. With each recursive invocation the index doubles (from an internal node it moves to its left or right child in the array).
  • Hence the number of comparisons is at most 2 × lg(n/i) and the number of swaps at most lg(n/i).

36 of 54

  • Now, Build_heap invokes Max_heapify ⎣n/2⎦ times, once for each internal node i.
  • So, the total number of comparisons is:

2 × ∑i=1 to ⎣n/2⎦ lg(n/i) = 2 × (⎣n/2⎦ lg n − lg (⎣n/2⎦)!)

By Stirling's approximation, lg (⎣n/2⎦)! ≈ ⎣n/2⎦ lg ⎣n/2⎦ − ⎣n/2⎦ lg e ≈ ⎣n/2⎦ lg n − 2.5 × ⎣n/2⎦,

so the total is about 2 × 2.5 × ⎣n/2⎦ = 5 × ⎣n/2⎦ ≈ 2.5 × n, i.e., Build_heap needs only O(n) comparisons.

37 of 54

  • Now, in Heapsort, after Build_heap completes, the repeated invocations of Max_heapify require at most 2 × ∑i=1 to n lg i = 2 × lg n! ≈ 2 × n lg n − 3 × n comparisons (again by Stirling's approximation).
  • As this term dominates, the total number of comparisons, as well as swaps, is bounded by O(n lg n).

38 of 54

Representing Every Comparison Sort

Algorithm must “find” the right answer among n! possible answers

Starts “knowing nothing” and gains information with each comparison

  • Intuition is that each comparison can, at best, eliminate half of the remaining possibilities

Can represent this process as a decision tree

  • Nodes contain “remaining possibilities”
  • Edges are “answers from a comparison”
  • This is not a data structure but what our proof uses to represent “the most any algorithm could know”

CSE 332 Data Abstractions, Summer 2012


July 16, 2012

39 of 54

Decision Tree for n = 3


[Figure: decision tree for sorting three keys a, b, c. The root comparison is a ? b; every internal node is labelled with the orderings still possible (starting from all six: a<b<c, a<c<b, b<a<c, b<c<a, c<a<b, c<b<a), and each edge with a comparison outcome (a<b, a>b, a<c, b<c, …) until a single ordering remains at each leaf.]

The leaves contain all the possible orderings of a, b, c

40 of 54

What the Decision Tree Tells Us

Is a binary tree because

  • Each comparison has 2 outcomes
  • There are no duplicate elements
  • Assumes algorithm does not ask redundant questions

Because any data is possible, any algorithm needs to ask enough questions to decide among all n! answers

  • Every answer is a leaf (no more questions to ask)
  • So the tree must be big enough to have n! leaves
  • Running any algorithm on any input will at best �correspond to one root-to-leaf path in the decision tree
  • So no algorithm can have worst-case running time �better than the height of the decision tree


41 of 54

Decision Tree for n = 3


[Figure: the same decision tree for n = 3, annotated to show that internal nodes hold the "possible orders" and each leaf the "actual order" the algorithm has determined.]

42 of 54

Where are We

Proven: No comparison sort can have a worst-case running time better than the height of a binary tree with n! leaves

  • Turns out the average case is asymptotically the same
  • So how tall is a binary tree with n! leaves?

Now: Show a binary tree with n! leaves has height Ω(n log n)

  • n log n is the lower bound; the height must be at least this
  • It could be more (in other words, a comparison sorting algorithm could take longer but cannot be faster)
  • The factorial function grows very quickly

Conclude that: (Comparison) Sorting is Ω(n log n)

  • This is an amazing computer-science result: proves all the clever programming in the world can’t sort in linear time!

CSE 332 Data Abstractions, Summer 2012

42

July 16, 2012

43 of 54

Lower Bound on Height

  • The height of a binary tree with L leaves is at least log2 L

  • So, for the height h of our decision tree:

h ≥ log2 (n!) property of binary trees

= log2 (n·(n-1)·(n-2)···(2)(1)) definition of factorial

= log2 n + log2 (n-1) + … + log2 1 property of logarithms

≥ log2 n + log2 (n-1) + … + log2 (n/2) keep first n/2 terms

≥ (n/2) log2 (n/2) each of the n/2 terms kept is ≥ log2 (n/2)

= (n/2)(log2 n - log2 2) property of logarithms

= (1/2) n log2 n - (1/2) n arithmetic

= Ω(n log n)
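The inequality chain can be sanity-checked numerically. This small Python sketch (ours, not from the slides) compares the exact value of log2(n!) with the (n/2)·log2(n/2) bound:

```python
import math

# exact lg(n!) versus the (n/2)·lg(n/2) lower bound used above
for n in (4, 16, 100, 1000):
    lg_factorial = sum(math.log2(k) for k in range(1, n + 1))
    assert lg_factorial >= (n / 2) * math.log2(n / 2)
```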

CSE 332 Data Abstractions, Summer 2012

43

July 16, 2012


45 of 54

Insertion Sort Algorithm

  • For each element from the 2nd (nextPos = 1) to the last:
    • Insert the element where it belongs within the sorted subarray
    • This increases the sorted subarray size by 1

  • To make room:
    • Hold the nextPos value in a variable
    • Shuffle elements to the right until a gap opens at the right place
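These bullets correspond line-for-line to the following Python sketch (ours; `next_pos` follows the slide's naming):

```python
def insertion_sort(A):
    for next_pos in range(1, len(A)):     # each element from the 2nd to the last
        key = A[next_pos]                 # hold the nextPos value in a variable
        j = next_pos
        while j > 0 and A[j - 1] > key:   # shuffle larger elements to the right
            A[j] = A[j - 1]
            j -= 1
        A[j] = key                        # drop the held value into the gap
    return A
```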

Chapter 10: Sorting


46 of 54

Let's analyze Insertion sort

  • The time taken to sort depends on how many numbers we are sorting.
  • The time may also change depending on whether the array is almost sorted (notice that if the array were already sorted, there would be very little work to do).
  • So, we need to define precisely what we mean by input size and running time.

47 of 54

Input Size

  • The notion of input size depends on the problem we are studying.

  • Consider sorting n numbers: the input size is the number of items being sorted, n.

  • Consider multiplying two integers: the input size is the total number of bits required to represent the numbers.

  • Sometimes, instead of a single number, we represent the input size by two numbers, e.g. in graph algorithms, where the input size is given by both the number of edges (E) and the number of vertices (V).

48 of 54

Running Time

  • The running time is proportional to the number of primitive operations or steps performed.
  • Assume that in the pseudo-code a constant amount of time is required for each line.
  • Assume that the i-th line requires time ci, where ci is a constant.
  • Keep in mind the RAM model, which assumes there is no concurrency.

49 of 54

Run Time of Insertion Sort

In the RAM model the total time required is the sum of the time for each statement:

[Table: each step of the insertion sort pseudocode, its cost ci, and the number of times it executes.]

50 of 54

Best Case

  • If the array is already sorted:
    • The while loop sees in one check that A[i] ≤ key, so the loop terminates immediately. Thus tj = 1 and we have:

The run time is thus a linear function of n

51 of 54

Worst Case: The algorithm cannot run slower!

  • If the array is arranged in reverse sorted order:
    • The while loop must perform the comparisons with A[j-1] down to A[0], that is, tj = j

The run time is thus a quadratic function of n

52 of 54

Average Case

  • Instead of an input of a particular type (as in the best or worst case), all inputs of the given size are taken to be equally probable in such an analysis.
    • E.g., coming back to our insertion sort: if the elements in the subarray A[0..j-1] are randomly chosen, we can assume that on average half of them are greater than A[j] and half are less. Thus, on the average, tj = j/2. Plugging this value into T(n) still leaves it quadratic, so in this case the average case is asymptotically equivalent to the worst case.
    • Does this always occur? No. The average case may tilt towards the best case as well.

53 of 54

Bubble Sort Algorithm, Refined

  1. do
  2. Initialize exchanges to false
  3. for each pair of adjacent array elements
  4. if values are out of order
  5. Exchange the values
  6. Set exchanges to true
  7. while exchanges
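The refined algorithm above maps directly onto this Python sketch (ours), where the exchanges flag stops the outer loop one pass after the array becomes sorted:

```python
def bubble_sort(A):
    exchanges = True
    while exchanges:                      # step 1/7: repeat while a pass swapped
        exchanges = False                 # step 2
        for i in range(len(A) - 1):       # step 3: each adjacent pair
            if A[i] > A[i + 1]:           # step 4: values out of order
                A[i], A[i + 1] = A[i + 1], A[i]   # step 5: exchange
                exchanges = True          # step 6
    return A
```

On an already-sorted array the for loop makes one comparison pass and no exchanges, giving the O(n) best case noted on the next slide of the chapter.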


54 of 54

Analysis of Bubble Sort

  • Excellent performance in some cases
    • But very poor performance in others!
  • Works best when array is nearly sorted to begin with
  • Worst case number of comparisons: O(n2)
  • Worst case number of exchanges: O(n2)
  • Best case occurs when the array is already sorted:
    • O(n) comparisons
    • O(1) exchanges (none actually)
