Merge Sort
A few weeks ago I wrote a post about the Quadratic Sorting Algorithms, Bubble Sort, Selection Sort, and Insertion Sort. These are typically the first algorithms that developers will encounter when learning about sorting algorithms. They're great at teaching the fundamentals of programmatically moving data around, but beyond setting an educational foundation they're not super useful in real world programming. They lack the sophistication needed to efficiently sort large amounts of data in a short amount of time.
In this post I want to explore a more advanced sorting algorithm called Merge Sort. With an average time complexity of O(n log n)
, Merge Sort is capable of improving the time complexity of a sorting operation well past what any of the Quadratic Sorting Algorithms can accomplish, which average O(n²)
. However, that improved efficiency comes with a tradeoff: it's not as simple to understand this algorithm, and is a bit more complicated to implement. I'm going to do my best to break down the concepts behind what makes Merge Sort work the way it does, and hopefully that will help the code implementation at the end of this post make some sense.
How Does Merge Sort Work?
Merge Sort exploits the fact that an array that is either empty or only has one value is always sorted, and that it is extremely fast to combine two sorted arrays while maintaining the sorting order. At a high level, Merge Sort will decompose one array into smaller arrays of 0 or 1 elements, then build up a newly sorted array.
Say we have an array of 8 elements that we want to sort:
[8, 3, 5, 4, 7, 6, 1, 2];
We start by splitting this into two arrays of 4 elements:
[8, 3, 5, 4][(7, 6, 1, 2)];
Then we further split into four arrays of 2 elements:
[8, 3][(5, 4)][(7, 6)][(1, 2)];
Finally we split it into eight arrays of 1 element:
[8][3][5][4][7][6][1][2];
Then we start merging these back together. The single-element arrays are sorted, and merging sorted arrays is much easier than merging unsorted arrays:
[3, 8][(4, 5)][(6, 7)][(1, 2)];
Then we repeat the process:
[3, 4, 5, 8][(1, 2, 6, 7)];
Finally we repeat the merging process again:
[1, 2, 3, 4, 5, 6, 7, 8];
Merging Two Sorted Arrays
It should come as no surprise that the second half of this entire process, where individual sorted arrays are compared and then merged is where most of the complexity lives. It's worthwhile to break out this functionality into its own function, which we'll call merge
, that takes in two sorted* arrays, then combines them into a single sorted array.
*the two arrays must both be sorted in the same direction: both ascending or both descending. For the purposes of this blog post we are sorting in ascending order.
function merge(arr1, arr2) { let results = []; let i = 0; let j = 0; while (i < arr1.length && j < arr2.length) { if (arr2[j] > arr1[i]) { results.push(arr1[i]); i++; } else { results.push(arr2[j]); j++; } } while (i < arr1.length) { results.push(arr1[i]); i++; } while (j < arr2.length) { results.push(arr2[j]); j++; } return results;}
The function merge
accepts two arguments, arr1
and arr2
, which are each sorted arrays. We loop through and assign pointers, called i
and j
at the start of each array. An empty array, results
is created, which will be populated with the merged, sorted values of both arays. The pointers are compared, and the lower of the two values is pushed into result
.
Let's break it down and look at what happens when we use the function above to merge and sort two sorted arrays:
merge([3, 4, 5, 8], [1, 2, 6, 7]);
Merge Step 1
result: [];arr1: [3, 4, 5, 8]; iarr2: [1, 2, 6, 7]; jj < i, so j gets pushed into result
result: [1];
Merge Step 2
result: [1];arr1: [3, 4, 5, 8]; iarr2: [1, 2, 6, 7]; jj < i, so j gets pushed into result
result: [1, 2];
Merge Step 3
result: [1, 2];arr1: [3, 4, 5, 8]; iarr2: [1, 2, 6, 7]; ji < j, so i gets pushed into resultresult: [1, 2, 3];
Merge Step 4
result: [1, 2, 3];arr1: [3, 4, 5, 8]; iarr2: [1, 2, 6, 7]; ji < j, so i gets pushed into resultresult: [1, 2, 3, 4];
Merge Step 5
result: [1, 2, 3, 4];arr1: [3, 4, 5, 8]; iarr2: [1, 2, 6, 7]; ji < j, so i gets pushed into resultresult: [1, 2, 3, 4, 5];
Merge Step 6
result: [1, 2, 3, 4, 5];arr1: [3, 4, 5, 8]; iarr2: [1, 2, 6, 7]; jj < i, so j gets pushed into resultresult: [1, 2, 3, 4, 5, 6];
Merge Step 7
result: [1, 2, 3, 4, 5, 6];arr1: [3, 4, 5, 8]; iarr2: [1, 2, 6, 7]; jj < i, so j gets pushed into resultresult: [1, 2, 3, 4, 5, 6, 7];
Merge Step 8
result: [1, 2, 3, 4, 5, 6, 7];arr1: [3, 4, 5, 8]; iarr2: [1, 2, 6, 7]; jj is now beyond the bounds of arr2, meaning we have reached the end of arr2.
Knowing that the rest of arr1 is sorted, we simply push the remainder of arr1 into resultresult: [1, 2, 3, 4, 5, 6, 7, 8];
Merge Step 9
result: [1, 2, 3, 4, 5, 6, 7, 8];result is now a sorted array containing the contents of both arr1 and arr2, we can return result
The above operation will work, no matter how large or small the sorted arrays we give to the merge
function are. But we've kind of been working backwards. To get to the point where we can use the merge
function, we first need to write the function that takes a single array and breaks it into smaller arrays. That looks like this:
function mergeSort(arr) { if (arr.length <= 1) { return arr; } let mid = Math.floor(arr.length / 2); let left = mergeSort(arr.slice(0, mid)); let right = mergeSort(arr.slice(mid)); var counter = 1; return merge(left, right);}
This code works recursively. We use Math.floor(arr.length / 2)
to find the middle point of the array, which tells us where to subdivide. Then we take that new, subdivided array, find its middle, and subdivide again, and keep subdividing the array until we get down to arrays of 0 or 1 elements. From there we use the merge
function as described above.
This looks something like the following:
[8, 3, 5, 4, 7, 6, 1, 2][(8, 3, 5, 4)][(8, 3)][8][3][ // [8] and [3] get sorted and merged here, resulting in [3, 8] (5, 4)][5][4][ // [5] and [4] get sorted and merged here, resulting in [4, 5] // [3, 8] and [4, 5] get sorted and merged here, resulting in [3, 4, 5, 8] (7, 6, 1, 2)][(7, 6)][7][6][ // [7] and [6] get sorted and merged here, resulting in [6, 7] (1, 2)][1][2];// [1] and [2] get sorted and merged here, resulting in [1, 2]// [6, 7] and [1, 2] get sorted and merged here, resulting in [1, 2, 6, 7]// [3, 4, 5, 8] and [1, 2, 6, 7] get sorted and merged here, resulting in [1, 2, 3, 4, 5, 6, 7, 8]
After the first, unsorted array is broken into its smallest elements, the sorting and merging happens at each step up until a fully sorted array gets returned.
Merge Sort Implementation
Here is the full implementation of merge sort. To use it, you simply have to pass mergeSort()
an array, like mergeSort([8, 3, 5, 4, 7, 6, 1, 2])
.
function merge(arr1, arr2) { let results = []; let i = 0; let j = 0; while (i < arr1.length && j < arr2.length) { if (arr2[j] > arr1[i]) { results.push(arr1[i]); i++; } else { results.push(arr2[j]); j++; } } while (i < arr1.length) { results.push(arr1[i]); i++; } while (j < arr2.length) { results.push(arr2[j]); j++; } return results;}
function mergeSort(arr) { if (arr.length <= 1) { return arr; } let mid = Math.floor(arr.length / 2); let left = mergeSort(arr.slice(0, mid)); let right = mergeSort(arr.slice(mid)); var counter = 1; return merge(left, right);}
Merge Sort Big O
Complexity | Big O |
---|---|
Time (Best) | O(n log n) |
Time (Average) | O(n log n) |
Time (Worst) | O(n log n) |
Space | O(n) |
Time Complexity
Say we start with an array of 8 elements, like the example above. To break it into single-element arrays, we have to do 3 split operations:
[3, 4, 5, 8, 1, 2, 6, 7][(3, 4, 5, 8)][(3, 4)][3][4]; // split! // split! // split!
If we have an array of 16 items, we need to do 4 splits. For a an array of 32 items, we need to do 5 splits. As n
grows, O
grows to log n
, which gives us the log n
part of n log n
. However, we also have to account for the number of comparison operations. If we're still looking at the 8-element array, there are always 8 comparisons per decomposition. That number of comparisons is always a linear relationship to the size of the data set, which gives us the first n
in O(n log n)
.
Space Complexity
With space complexity, as n
grows we have to store more and more individual arrays. If space is a consideration, Merge Sort may not be as space-efficient as Bubble Sort or the other algorithms.