Introduction to Algorithms 6.046 Lecture 6 Prof. Shafi Goldwasser September 24, 2001 L5.1 How fast can we sort? All the sorting algorithms we have seen so far are comparison sorts: only use comparisons to determine the relative order of elements. E.g., insertion sort, merge sort, quicksort, heapsort. The best worst-case running time that weve seen for comparison sorting is O(n lg n) . Q:Is O(n lg n) the best we can do? A: Yes, as long as we use comparison sorts TODAY: Prove any comparison sort has (n long) worst case running time The Time Complexity of a Problem The minimum time needed by an algorithm to solve it. Upper Bound: Problem P is solvable in time Tupper(n) if there is an algorithm A which

outputs the correct answer in this much time A, I, A(I)=P(I) and Time(A,I) Tupper(|I|) Eg: Sorting computable in Tupper(n) = O(n2) time. The Time Complexity of a Problem The minimum time needed by an algorithm to solve it. Lower Bound: Time Tlower(n) is a lower bound for problem if no algorithm solve the problem faster. There may be algorithms that give the correct answer or run quickly on some inputs instance. The Time Complexity of a Problem The minimum time needed by an algorithm to solve it. Lower Bound: Time Tupper(n) is a lower bound for problem P if no algorithm solve the problem faster. But for every algorithm, there is at least one instance I for which either the algorithm gives the wrong answer or it runs in too much time. A, I, A(I) P(I) or Time(A,I) Tlower(|I|) Eg: No algorithm can sort N values in Tlower = sqrt(N) time. The Time Complexity of a Problem The minimum time needed by an algorithm to solve it. Upper Bound:

A, I, A(I)=P(I) and Time(A,I) Tupper(|I|) Lower Bound: A, I, A(I) P(I) or Time(A,I) Tlower(|I|) There is and there isnt a faster algorithm are almost negations of each other. Prover-Adversary Game Upper Bound: A, I, A(I)=P(I) and Time(A,I) Tupper(|I|) I have an algorithm A that I claim works and is fast. Oh yeah, I have an input I for which it does not. I win if A on input I gives the correct output in the allotted time. What we have been doing all along. Prover-Adversary Game Lower Bound: A, I, [ A(I) P(I) or Time(A,I) Tlower(|I|)]

I have an algorithm A that I claim works and is fast. Oh yeah, I have an input I for which it does not . I win if A on input I gives the wrong output or runs slow. Proof by contradiction. Prover-Adversary Game Lower Bound: A, I, [ A(I) P(I) or Time(A,I) Tlower(|I|)] I have an algorithm A that I claim works and is fast. Lower bounds are very hard to prove, because I must consider every algorithm no matter how strange. Today: Prove a Lower Bound for any comparison based algorithm for the Sorting Problem

How? Decision trees help us. Decision-tree example Sort a1, a2, , an 1:2 2:3 123 1:3 213 1:3 132 312 2:3 231 321 Each internal node is labeled i:j for i, j {1, 2,, n}. The left subtree shows subsequent comparisons if ai aj. The right subtree shows subsequent comparisons if ai

aj. Decision-tree example Sort a1, a2, a3 9, 4, 6 1:2 94 2:3 123 1:3 213 1:3 132 312 2:3 231 321

Each internal node is labeled i:j for i, j {1, 2,, n}. The left subtree shows subsequent comparisons if ai aj. The right subtree shows subsequent comparisons if ai aj. Decision-tree example Sort a1, a2, a3 9, 4, 6 1:2 2:3 123 1:3 213 1:3 132 312 96 2:3

231 321 Each internal node is labeled i:j for i, j {1, 2,, n}. The left subtree shows subsequent comparisons if ai aj. The right subtree shows subsequent comparisons if ai aj. Decision-tree example Sort a1, a2, a3 9, 4, 6 1:2 2:3 123 1:3 213 1:3 132 312

4 6 2:3 231 321 Each internal node is labeled i:j for i, j {1, 2,, n}. The left subtree shows subsequent comparisons if ai aj. The right subtree shows subsequent comparisons if ai aj. Decision-tree example Sort a1, a2, a3 9, 4, 6 1:2 2:3 123 1:3 213 1:3 132 312

2:3 231 321 469 Each leaf contains a permutation , ,, (n) to indicate that the ordering a(1) a(2) a(n) has been established. Decision-tree model A decision tree can model the execution of any comparison sort: One tree for each input size n. View the algorithm as splitting whenever it compares two elements. The tree contains the comparisons along all possible instruction traces. The running time of the algorithm = the length of the path taken. Worst-case running time = height of tree. Any comparison sort Can be turned into a Decision tree class InsertionSortAlgorithm {

1:2 for (int i = 1; i < a.length; i++) { int j = i; while ((j > 0) && (a[j-1] > a[i])) { 2:3 1:3 a[j] = a[j-1]; j--; } 123 213 1:3 2:3 a[j] = B; }} 132 312

231 321 Lower bound for decisiontree sorting Theorem. Any decision tree that can sort n elements must have height (n lg n) . Proof. The tree must contain n! leaves, since there are n! possible permutations. A height-h binary tree has 2h leaves. Thus, n! h 2 h . lg(n!) (lg is mono. increasing) lg ((n/e)n) (Stirlings formula) = n lg n n lg e = (n lg n) . Lower bound for comparison sorting Corollary. Heapsort and merge sort are asymptotically optimal comparison sorting algorithms. Sorting

Lower Bound Is there a faster algorithm? If different model of computation? class InsertionSortAlgorithm { for (int i = 1; i < a.length; i++) { int j = i; while ((j > 0) && (a[j-1] > a[i])) { a[j] = a[j-1]; j--; } a[j] = B; }} Sorting in linear time Counting sort: No comparisons between elements. Input: A[1 . . n], where A[ j]{1, 2, , k} . Output: B[1 . . n], sorted. Auxiliary storage: C[1 . . k] . Counting sort for i 1 to k do C[i] 0 for j 1 to n do C[A[ j]] C[A[ j]] + 1 C[i] = |{key = i}| for i 2 to k do C[i] C[i] + C[i1] C[i] = |{key i}|

for j n downto 1 doB[C[A[ j]]] A[ j] C[A[ j]] C[A[ j]] 1 Counting-sort example A: B: 1 2 3 4 5 4 1 3 4 3

1 C: 2 3 4 Loop 1 A: 1 2 3 4 5 4 1

3 4 3 B: for i 1 to k do C[i] 0 C: 1 2 3 4 0 0 0

0 Loop 2 A: 1 2 3 4 5 4 1 3 4 3 C:

1 2 3 4 0 0 0 1 B: for j 1 to n do C[A[ j]] C[A[ j]] + 1 i}| C[i] = |{key = Loop 2 A: 1

2 3 4 5 4 1 3 4 3 C: 1 2 3

4 1 0 0 1 B: for j 1 to n do C[A[ j]] C[A[ j]] + 1 i}| C[i] = |{key = Loop 2 A: 1 2 3 4

5 4 1 3 4 3 C: 1 2 3 4 1 0

1 1 B: for j 1 to n do C[A[ j]] C[A[ j]] + 1 i}| C[i] = |{key = Loop 2 A: 1 2 3 4 5 4 1

3 4 3 C: 1 2 3 4 1 0 1 2 B: for j 1 to n

do C[A[ j]] C[A[ j]] + 1 i}| C[i] = |{key = Loop 2 A: 1 2 3 4 5 4 1 3 4 3

C: 1 2 3 4 1 0 2 2 B: for j 1 to n do C[A[ j]] C[A[ j]] + 1 i}| C[i] = |{key = Loop 3

A: B: 1 2 3 4 5 4 1 3 4 3 1 2

3 4 C: 1 0 2 2 C': 1 1 2 2 for i 2 to k do C[i] C[i] + C[i1] C[i] = |{key i}|

Loop 3 A: B: 1 2 3 4 5 4 1 3 4 3 1 2

3 4 C: 1 0 2 2 C': 1 1 3 2 for i 2 to k do C[i] C[i] + C[i1] C[i] = |{key i}|

Loop 3 A: B: 1 2 3 4 5 4 1 3 4 3 1

2 3 4 C: 1 0 2 2 C': 1 1 3 5 for i 2 to k

do C[i] C[i] + C[i1] C[i] = |{key i}| Loop 4 A: B: 1 2 3 4 5 4 1 3 4 3 3

for j n downto 1 doB[C[A[ j]]] A[ j] C[A[ j]] C[A[ j]] 1 1 2 3 4 C: 1 1 3 5 C': 1

1 2 5 Loop 4 A: B: 1 2 3 4 5 4 1 3 4

3 4 3 for j n downto 1 doB[C[A[ j]]] A[ j] C[A[ j]] C[A[ j]] 1 1 2 3 4 C: 1 1 2 5

C': 1 1 2 4 Loop 4 A: B: 1 2 3 4 5 4

1 3 4 3 3 3 4 for j n downto 1 doB[C[A[ j]]] A[ j] C[A[ j]] C[A[ j]] 1 1 2 3 4 C:

1 1 2 4 C': 1 1 1 4 Loop 4 1 2 3 4

5 A: 4 1 3 4 3 B: 1 3 3 4 for j n downto 1 doB[C[A[ j]]] A[ j]

C[A[ j]] C[A[ j]] 1 1 2 3 4 C: 1 1 1 4 C': 0 1 1

4 Loop 4 1 2 3 4 5 A: 4 1 3 4 3 B:

1 3 3 4 4 for j n downto 1 doB[C[A[ j]]] A[ j] C[A[ j]] C[A[ j]] 1 1 2 3 4 C: 0

1 1 4 C': 0 1 1 3 Analysis (k) (n) (k) (n) (n + k) for i 1 to k do C[i] 0 for j 1 to n do C[A[ j]] C[A[ j]] + 1

for i 2 to k do C[i] C[i] + C[i1] for j n downto 1 do B[C[A[ j]]] A[ j] C[A[ j]] C[A[ j]] 1 Running time If k = O(n), then counting sort takes (n) time. But, sorting takes (n lg n) time! Wheres the fallacy? Answer: Comparison sorting takes (n lg n) time. Counting sort is not a comparison sort. In fact, not a single comparison between elements occurs! Stable sorting Counting sort is a stable sort: it preserves the input order among equal elements. A: 4 1 3

4 3 B: 1 3 3 4 4 Exercise: What other sorts have this property? Radix sort Origin: Herman Holleriths card-sorting machine for the 1890 U.S. Census. (See Appendix .) Digit-by-digit sort. Holleriths original (bad) idea: sort on most-significant digit first. Good idea: Sort on least-significant digit

first with auxiliary stable sort. Modern IBM card One character per column. Produced by the WWW Virtual PunchCard Server. So, thats why text windows have 80 columns! Operation of radix sort 329 457 657 839 436 720 355 720 355 436 457 657 329 839

720 329 436 839 355 457 657 329 355 436 457 657 720 839 Correctness of radix sort Induction on digit position Assume that the numbers are sorted by their low-order t 1 digits. Sort on digit t 720 329 436

839 355 457 657 329 355 436 457 657 720 839 Correctness of radix sort Induction on digit position Assume that the numbers are sorted by their low-order t 1 digits. Sort on digit t Two numbers that differ in digit t are correctly sorted. 720 329 436 839

355 457 657 329 355 436 457 657 720 839 Correctness of radix sort Induction on digit position Assume that the numbers are sorted by their low-order t 1 digits. Sort on digit t Two numbers that differ in digit t are correctly sorted. Two numbers equal in digit t are put in the same order as the input correct order. 720

329 436 839 355 457 657 329 355 436 457 657 720 839 Analysis of radix sort Assume counting sort is the auxiliary stable sort. Sort n computer words of b bits each. r b/r base-2 Each word can be viewed8 as having 8 8 8

digits. 32-bit word Example: r = 8 b/r = 4 passes of counting sort on base-28 digits; or r = 16 b/r = 2 passes of counting sort on base-216 digits. How many passes should we make? Analysis (continued) Recall: Counting sort takes (n + k) time to sort n numbers in the range from 0 to k 1. If each b-bit word is broken into r-bit pieces, each pass of counting sort takes (n + 2r) time. Since there are b/r passes, we have . Choose r to minimize T(n, b): Increasing r means fewer passes, but as r >> lg n, the time grows exponentially. = ? Choosing r Minimize T(n, b) by differentiating and setting to 0. Or, just observe that we dont want 2r >> n, and theres no harm asymptotically in choosing r as large as possible subject to this constraint.

Choosing r = lg n implies T(n, b) = (b n/lg n) . For numbers in the range from 0 to n d 1, we have b = d lg n radix sort runs in (d n) time. Conclusions In practice, radix sort is fast for large inputs, as well as simple to code and maintain. Example (32-bit numbers): At most 3 passes when sorting 2000 numbers. Merge sort and quicksort do at least lg 2000 = 11 passes. Cant sort in place using counting Downside: sort. Also, Unlike quicksort, radix sort displays little locality of reference, and thus a well-tuned quicksort fares better sometimes on modern processors, with steep memory hierarchies. Appendix: Punched-card technology Herman Hollerith (1860-1929) Punched cards Holleriths tabulating system Operation of the sorter Origin of radix sort

Modern IBM card Web resources on punchedcard technology Return to last slide viewed. Herman Hollerith (1860-1929) The 1880 U.S. Census took almost 10 years to process. While a lecturer at MIT, Hollerith prototyped punched-card technology. His machines, including a card sorter, allowed the 1890 census total to be reported in 6 weeks. He founded the Tabulating Machine Company in 1911, which merged with other companies in 1924 to form International Business Machines. Punched cards Punched card = data record. Hole = value. Algorithm = machine + human operator. Replica of punch card from the 1900 U.S. census. [Howells 2000]

Holleriths tabulating system Pantograph card punch Hand-press reader Dial counters Sorting box Figure from [Howells 2000]. Origin of radix sort Holleriths original 1889 patent alludes to a most-significant-digit-first radix sort: The most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations, then reassorting each group according to the second item entering into the combination, and so on, and finally counting on a few counters the last item of the combination for each group of cards. Least-significant-digit-first radix sort seems to be

a folk invention originated by machine operators. Web resources on punchedcard technology Doug Joness punched card index Biography of Herman Hollerith The 1890 U.S. Census Early history of IBM Pictures of Holleriths inventions Holleriths patent application (borrowed from Gordon Bells CyberMuseum) Impact of punched cards on U.S. history Operation of the sorter An operator inserts a card into the press. Pins on the press reach through the punched holes to make electrical contact with mercuryfilled cups beneath the card. Whenever a particular digit value is punched, the lid of the corresponding sorting bin lifts. The operator deposits the card Hollerith Tabulator, Pantograph, Press, and Sorter into the bin and closes the lid. When all cards have been processed, the front panel is opened, and the cards are collected in order, yielding one pass of a stable sort.