The Merge Sort Algorithm
The merge sort algorithm works by breaking down the numbers to be sorted into groups and then sorting the groups.
Say we want to sort the following list of numbers:
Firstly we break the numbers into pairs, then we sort each pair of numbers into order. The numbers are sorted into order by comparing one number with the other. If the second number is smaller than the first, then the two numbers are swapped over, otherwise they remain in the same positions.
Now we have four groups of two numbers in sorted order. We split these groups into two pairs of two numbers. Now we merge each of the pairs of two together into a single group of 4 numbers sorted into order. Let’s take a look at how this is done in detail, using the first pair as an example:
- The first pair contains the numbers 4 and 10 in one group and 7 and 99 in the other.
- We begin by comparing the first number from the first group with the first number of the second group, so we are comparing 4 against 7. The smallest of the two is moved into the first position in the block of four. So in this case we decide 4 is less than 7 and it is moved down to become the first number in the sorted block.
- The first pair now only has one number left which is 10. We compare this with the first number from the second group, which is 7.
- As 7 is smaller than 10, we move it down to become the 2nd number in our block of sorted numbers.
- Now we have only 10 remaining from the first pair and 99 remaining from the second pair. We compare the two numbers. As 10 is smaller, it is moved down to become the 3rd number in the block.
- When we run out of numbers in one of the blocks, we simply move down any numbers in the remaining block in sequence from left to right because they are already in sorted order. Only one number remains, 99, and so it is moved down into fourth position.
- The procedure is repeated for pair 2
Finally, we combine the two groups of four numbers into a single group of eight numbers by repeating the merging procedure as was used above but with larger groups. So the comparisons that happen are:
- 1 < 4. Move down 1.
- 4 < 8. Move down 4.
- 7 < 8. Move down 7.
- 8 < 10. Move down 8.
- 10 < 16. Move down 10.
- 16 < 99. Move down 16.
- 19 < 99. Move down 19.
- Only 99 remains, so move it down.
A merge sort is often faster than alternatives such as insertion sort or a bubble sort but has the disadvantage that it can use more memory to hold the groups temporarily whilst sorting is in progress.
An Implementation of Merge Sort in Python
This implementation works exactly as above but handles the special situation where the size of the list of numbers to be sorted is not even. The number of groups being sorted must always be a multiple of 2 because, as shown above, the groups are merged in pairs. The way this problem is handled is if the number of groups is odd, one group is removed from the list and held back as a “remainder” group, making the list even in size. When the list of groups subsequently has an odd length, the remainder group is added back in again to make the length of the list to be sorted even.
For example if we start with 11 numbers, this is a odd length of list and we cannot make pairs out of it. We hold back the 11th number as a “remainder” and pair up the remaining 10 numbers to be merged into 5 groups of two numbers. Now we have an odd number of groups again and cannot pair them up for merging. So we add in the remainder group that was previously held back to make 6 groups. Now we have an even set that can be paired up for merging.
There must always be a situation where we end up with an odd length of groups because eventually we end up with a single sorted list (which is one group). Hence, in any list of numbers there is always an opportunity to add back in the remainder group.
#Merge two groups of numbers that each must be already sorted into a single group
#The two groups do not have to be the same size
def merge_two_sorted_groups(group_a, group_b):
index_group_a = 0
index_group_b = 0
merged_group = []
#Merge sort two groups of numbers as per the algorithm explained above
while index_group_a < len(group_a) and index_group_b < len(group_b):
if group_a[ index_group_a ] < group_b[ index_group_b ]:
merged_group.append( group_a[ index_group_a ] )
index_group_a +=1
else:
merged_group.append( group_b[ index_group_b ] )
index_group_b +=1
#If either of the groups still has numbers remaining, tack them on the end of the sorted group
#They must already be in order because it is a requirement of this method that each group to be merged is pre-sorted
#At this stage only one of the two groups can have numbers remaining and the other must be empty
merged_group.extend( group_a[ index_group_a: ] )
merged_group.extend( group_b[ index_group_b: ] )
return merged_group
#Do a merge sort on a list of numbers
def merge_sort( num_list ):
#If there is only one or zero numbers, then the list is sorted already
if len( num_list ) < 2:
return num_list
groups_of_numbers = []
remainder_group = None
#Break the list of numbers into an array of arrays, each containing one number
#e.g. if num_list = [ 1 , 2, 3, 4 ], then this results in [ [1], [2], [3], [4] ]
#However, if the original num_list contains an odd number of numbers, then take the last one off the list
#and store it in the "remainder_group" variable, hence resulting in an even sized listed of numbers
#This is because we can only merge even sets of groups as we merge groups in pairs
if len( num_list ) % 2 == 0:
array_end = len( num_list )
else:
array_end = len( num_list ) - 1
remainder_group = [ num_list[-1] ]
for i in range(0,array_end):
groups_of_numbers.append( [ num_list[i] ] )
#Now start sorting
#Loop until sorting is done
#We've finished sorting when we only have one group left
while len( groups_of_numbers ) > 1:
temp_groups_of_numbers = []
#Take each adjacent pair of groups and merge them together into a single group
#Therefore we end up with half the number of groups we started with
for i in range(0,len( groups_of_numbers ),2):
merged_pair = merge_two_sorted_groups( groups_of_numbers[i], groups_of_numbers[i+1] )
temp_groups_of_numbers.append( merged_pair )
#If the resulting number of groups is not even, we can't subsequently merge it the next time around the loop
#We can only merge pairs of groups
#If we have an odd number of groups, this is resolved by tacking on the remainder group (if there is one), therefore making an even number of groups
#If there is no remainder group, we remove one group from the end of the odd sized list (making the list even in size) and make it the remainder group
if len(temp_groups_of_numbers) % 2 != 0:
#If there is a remainder group, then tack it on the end
if remainder_group is not None:
temp_groups_of_numbers.append( remainder_group )
remainder_group = None
else:
#If we've only got one group left and no remainder group, then the list is sorted
if len( temp_groups_of_numbers ) > 1:
#Otherwise remove one group from the end of the list and make it the remainder group
remainder_group = temp_groups_of_numbers.pop()
groups_of_numbers = temp_groups_of_numbers
return groups_of_numbers
if __name__ == "__main__":
number_list_to_sort = [ 10,4,99,7,19,1,8,16,100 ]
sorted_list = merge_sort( number_list_to_sort )
print(sorted_list)