# Merge Sort and Quick Sort

---

CS 137 // 2021-09-29

## Administrivia

- You should have turned in:
    + Your reflection for daily exercise 6

# Questions

## ...about anything?

# Heap Sort

## Heaps

- A **heap** is a data structure that efficiently supports the following operations:
    1. `extract_min`: removes the minimum element
    2. `insert`: adds a new element
- A heap can be thought of as a **binary tree** that satisfies the following property:
    + For every node `n`, the value of `n` is less than the value of all of its descendants

## Heaps
*(Figure: an example min-heap. The root is 100, with children 120 and 200; 120 has children 125 and 500; 200 has children 250 and 300.)*
---

- The minimum element is always the root

## Heaps

- We can implement heaps using an **array**
- Inserting an element is $O(\log n)$
    + Add it at the bottom and "bubble it up"
- Removing the min is $O(\log n)$
    + Move the last item to the root and "bubble it down" (a sketch of both operations follows below)
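The bullets above describe the array encoding only in words. Below is a minimal Python sketch of the two operations, assuming a list-backed min-heap where node `i` has children at indices `2*i + 1` and `2*i + 2`; the function names are illustrative, not the interface used in class.

```py
def heap_insert(heap, value):
    """Add value at the bottom, then bubble it up: O(log n)."""
    heap.append(value)
    i = len(heap) - 1
    # Swap upward while the parent is larger than the new value
    while i > 0 and heap[(i - 1) // 2] > heap[i]:
        heap[i], heap[(i - 1) // 2] = heap[(i - 1) // 2], heap[i]
        i = (i - 1) // 2

def heap_extract_min(heap):
    """Remove the root, move the last item up, then bubble it down: O(log n).

    Assumes the heap is non-empty.
    """
    minimum = heap[0]
    heap[0] = heap[-1]
    heap.pop()
    i = 0
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        # Find the smallest among the node and its children
        if left < len(heap) and heap[left] < heap[smallest]:
            smallest = left
        if right < len(heap) and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest
    return minimum
```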
## Heap Sort

```java
// Load every element into a min-heap, then repeatedly
// extract the minimum back into the array.
void heap_sort(E[] elements) {
    Heap<E> heap = make_heap(elements);
    for (int i = 0; i < elements.length; i++) {
        elements[i] = extract_min(heap);
    }
}
```

- What's the runtime complexity?
    + $O(n \log n)$

## Heap Sort: Debrief

- Heap sort is literally the same algorithm as selection sort with a clever use of a data structure
- **Takeaway**: Using the right data structure in the right place can yield **huge** runtime improvements!
- Knowing about common data structures and how to employ them is a big component of this course

# Merge Sort

## Divide and Conquer

![split big problem into subproblems](/teaching/2021f/cs137/assets/images/divide-and-conquer.png)

## Merge Sort

- **Key idea:**
    + Divide the list into two halves
    + Recursively sort each half independently
    + Merge the two sorted halves together

![merge sort animation](https://upload.wikimedia.org/wikipedia/commons/c/cc/Merge-sort-example-300px.gif)

## Implementation

```py
def merge_sort(elements):
    if len(elements) > 1:
        mid = len(elements) // 2
        left_half = elements[:mid]
        right_half = elements[mid:]
        merge_sort(left_half)
        merge_sort(right_half)
        merge(left_half, right_half, elements)
```

## Merging

```py
def merge(src1, src2, dst):
    i1, i2, i3 = 0, 0, 0
    # While both lists still have at least one element remaining
    while i1 < len(src1) and i2 < len(src2):
        if src1[i1] < src2[i2]:
            dst[i3] = src1[i1]
            i1 = i1 + 1
            i3 = i3 + 1
        else:
            dst[i3] = src2[i2]
            i2 = i2 + 1
            i3 = i3 + 1
    # Copy the remaining elements into the destination
    dst[i3:] = src1[i1:] + src2[i2:]
```
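As a quick sanity check of `merge_sort` above, here is a hypothetical run; the input values are arbitrary, and the list is rearranged in place:

```py
data = [38, 27, 43, 3, 9, 82, 10]
merge_sort(data)
print(data)  # [3, 9, 10, 27, 38, 43, 82]
```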
## Visualization of Merge Sort

# Quick Sort

## Quick Sort

- **Key idea**:
    + Select a *pivot* element
    + Partition the array so that elements $\le$ the pivot are on the left and elements $>$ the pivot are on the right
    + Recursively sort each part

![quick sort animation](https://upload.wikimedia.org/wikipedia/commons/9/9c/Quicksort-example.gif)

## Implementation

```py
def quick_sort(arr, left, right):
    if left < right:
        mid = partition(arr, left, right)
        quick_sort(arr, left, mid - 1)
        quick_sort(arr, mid + 1, right)

def partition(arr, left, right):
    # Use the rightmost element as the pivot
    pivot = arr[right]
    i = left - 1
    for j in range(left, right + 1):
        if arr[j] <= pivot:
            i = i + 1
            arr[i], arr[j] = arr[j], arr[i]
    # The pivot now sits at index i, with smaller-or-equal elements
    # to its left and strictly larger elements to its right
    return i
```
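Likewise, a hypothetical call to `quick_sort` with arbitrary values; the array is sorted in place between the two indices:

```py
data = [33, 15, 10, 70, 42, 7]
quick_sort(data, 0, len(data) - 1)
print(data)  # [7, 10, 15, 33, 42, 70]
```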
## Visualization of Quick Sort

## Quick Sort Analysis

- What is the worst-case running time of quick sort?
    + $O(n^2)$
- If we **randomly** select a pivot, with high probability it will be "near" the center
    + Expected runtime becomes $O(n \log n)$ (a randomized-pivot sketch appears at the end of these notes)
- **Takeaway:** Randomness is another tool that can speed up algorithms significantly

## Comparison of Sorts

- Merge sort is a great **distributed** sorting algorithm
    + Can be parallelized
    + Works well even on massive datasets distributed across many servers
- Quick sort performs amazingly well on datasets that can fit into RAM in one giant array
    + Makes great use of the cache
    + Does not use much extra memory
    + Usually 2-3 times faster in these situations
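The randomized-pivot idea from the analysis slide can be layered on top of the `partition` function shown earlier. This is a minimal sketch under that assumption: swap a randomly chosen element into the pivot slot before partitioning. The name `randomized_quick_sort` is illustrative, not code from the course.

```py
import random

def randomized_quick_sort(arr, left, right):
    """Quick sort with a randomly chosen pivot (sketch)."""
    if left < right:
        # Swap a random element into the pivot position,
        # then reuse the same partition routine as before.
        p = random.randint(left, right)
        arr[p], arr[right] = arr[right], arr[p]
        mid = partition(arr, left, right)
        randomized_quick_sort(arr, left, mid - 1)
        randomized_quick_sort(arr, mid + 1, right)
```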