Welcome back! In the previous blog, we saw two of our simplest sorting algorithms: selection sort and bubble sort. Though they were simple, they take large time to sort (O(n^2)). For large data set this means a huge amount of time and thus people go for other sorting algorithms like merge sort, quick sort etc. Today, we will see how merge sort and quick sort happens and then compare them.
Another point, time is not the only parameter to judge the performance of an algorithm - another parameter is how much space the algorithm takes. Although you have lots and lots Bytes of memory in your PC and laptop to adjust space for the algorithm, but think of your phone. If we run our algorithm in your phone then there would not have that much of space. In that situation space complexity comes into picture and algorithms have to be compared not only with respect to time but also with respect to space / memory requirement. Thus, space complexity becomes critical in case of embedded systems and cache performance in PCs.
Merge Sort: Before jumping into merge sort algorithm let us discuss how to merge two arrays. You have two sorted arrays and you have to make a single array maintaining order. It's pretty straight forward using extra space.
C function to merge two arrays:
1st array: array[l1:h1], 2nd array: array[l2:h2]
void merge (int* array, int l1, int h1, int l2, int h2, int n)
{
int i=l1,j=l2,k=l1;
int* temp = (int*) malloc(sizeof(int)*n);
for(;;)
{
if (i>h1 || j>h2)
break;
if (array[i] < array[j])
temp[k++] = array[i++];
else
temp[k++] = array[j++];
}
while (i<=h1)
temp[k++] = array[i++];
while (j<=h2)
temp[k++] = array[j++];
for(i=l1;i<k;i++)
array[i] = temp[i];
free(temp);
}
void quick_sort(int* array, int s, int n)
{
int pivot = partition(array, s, n);
if (s < pivot)
{
quick_sort(array, s, pivot);
}
if (pivot+1 < n)
{
quick_sort(array, pivot+1, n);
}
}
Time Complexity:
A quantification of time complexity is how many comparisons is to make for sorting. For quick sort algorithm, number of comparison is:
Worst case: k=1 or k=n
C(n) = n + C(n−1)
= n + n−1 + C(n−2)
= n + n−1 + n−2 + C(n−3)
= n + n−1 + n−2 + ... + 3 + 2 + C(1)
= (n + n−1 + n−2 + ... + 3 + 2 + 1) − 1
= n(n+1)/2 − 1
≈ n^2 / 2
Best case: Each sub-array would be of same size (n/2).
C(n) = n + 2C(n/2). This is same as merge sort recursion. So the solution is,
C(n) = nlog n
Avg case: Here all the values of k are equally likely. We would not be doing the calculation here. The result is:
C(n) = 1.39 n log n, i.e. just 1.39 times than the best case
Pivot selection methods:
Pivot selection can be done in many ways:
1. First / last element in the array
2. Median of the array
3. Largest number in the array
4. Middle element in the array
5. Randomly selected element
Selecting largest element in the array results in worst case time complexity. As the partition would always be 1 and n-1 elements. Selecting median as pivot is the best case as the array is divided always into two equal parts: n/2. For calculating the median of an array in linear time we can use median of median's algorithm. I'm planning discuss this algorithm in detail in later blog. All the other methods gives more or less same performance as they are not exploring any significant property of the data set.
Comparison of Merge sort and Quick sort:
Time taken:
Generally time wise quick sort is faster than merge sort for randomly ordered array.
Space required: For merge sort, the additional space requirement is n or minimum as n/2. However, the recursion depth is log n. For quick sort, partitioning is in-place. Partitioning takes O(1) time however for best case the recursion depth is log n. Thus, the space complexity is O(log n).
Quick sort is not stable. So, if you need a stable sorting for your problem then you better use merge sort.
Next time we'll discuss some popular interview questions in the domain of sorting algorithm... See you...
Another point, time is not the only parameter to judge the performance of an algorithm - another parameter is how much space the algorithm takes. Although you have lots and lots Bytes of memory in your PC and laptop to adjust space for the algorithm, but think of your phone. If we run our algorithm in your phone then there would not have that much of space. In that situation space complexity comes into picture and algorithms have to be compared not only with respect to time but also with respect to space / memory requirement. Thus, space complexity becomes critical in case of embedded systems and cache performance in PCs.
Merge Sort: Before jumping into merge sort algorithm let us discuss how to merge two arrays. You have two sorted arrays and you have to make a single array maintaining order. It's pretty straight forward using extra space.
C function to merge two arrays:
1st array: array[l1:h1], 2nd array: array[l2:h2]
void merge (int* array, int l1, int h1, int l2, int h2, int n)
{
int i=l1,j=l2,k=l1;
int* temp = (int*) malloc(sizeof(int)*n);
for(;;)
{
if (i>h1 || j>h2)
break;
if (array[i] < array[j])
temp[k++] = array[i++];
else
temp[k++] = array[j++];
}
while (i<=h1)
temp[k++] = array[i++];
while (j<=h2)
temp[k++] = array[j++];
for(i=l1;i<k;i++)
array[i] = temp[i];
free(temp);
}
To sort a array in merge sort algorithm, we will break the array in smallest possible unit: a single element. And then we'll keep on merging the parts. Merge sort algorithm goes as follows:
void merge_sort(int* array, int l, int h, int n)
{
int mid;
if (l >= h)
return;
mid = (l+h)/2;
merge_sort(array, l, mid, n);
merge_sort(array, mid+1, h, n);
merge(array, l, mid, mid+1, h, n);
}
You can find a pictorial description of the algorithm here.
Let us see the time complexity graph for merge sort in best case, average case and worst case:
Time Complexity:
Worst case complexity: If the array size is 1 then the merging takes constant time. This forms the basis of the recursion. For merging two arrays each of length n/2, we need n comparisons. To sort n element array using merge sort if we need T(n) time then,
T(n) = 2T(n/2) + n, T(0) = T(1) = 0
Now, let us solve this recursion.
T(n) = 2T(n/2) + n
= 2[2T(n/4) + n/2] + n
= 4T(n/4) + 2n
= 4[2T(n/8) + n/4] + 2n
= 8T(n/8) + 3n
..................
= nT(n/n) + n * log(n)
= n log(n)
Best case complexity: If the first element of the right hand side array is greater than the highest element in the left hand side array (for ascending sort) then the merge takes n/2 comparisons only. In this situation, the time complexity comes to:
T(n) = 2T(n/2) + n/2
Solving the recursion we get:
T(n) = n/2 * log(n) ~ O(n log n)
So, both the best case and worst case complexity of merge sort is O(n log n).
Space Complexity:
Typical merging is not in place operation(as our example here). It needs additional memory of n. And the algorithm has a recursion tree of depth of 2n-1. Thus, this has worst case space complexity of O(n).
Quick Sort:
Quick sorting is based on how to partition an array into sub-arrays such that each element in the left hand sub-array will be smaller than every element in the right hand sub-array. For partitioning the array we choose an element and then compare other elements to it and finally decides which are the elements that are smaller than this element (left sub-array) and which are greater than this (right hand sub-array). This critical element is called as 'pivot'. Time complexity of quick sort depends on the selection process of the pivot element as well.
Partition Example (Pivot as the first element of the array):
Code:
int partition(int* array, int s, int n)
{
int pivot_val = array[s];
int pivot;
int i=s, j=n-1;
while(i<j && i<n-1 && j>=0)
{
while (array[j] >= pivot_val && i<j) j--;
while (array[i] <= pivot_val) i++;
if (i<j)
{
swap(array,i,j);
}
}
swap(array,j,s);
return j;
}
Typical merging is not in place operation(as our example here). It needs additional memory of n. And the algorithm has a recursion tree of depth of 2n-1. Thus, this has worst case space complexity of O(n).
Quick Sort:
Quick sorting is based on how to partition an array into sub-arrays such that each element in the left hand sub-array will be smaller than every element in the right hand sub-array. For partitioning the array we choose an element and then compare other elements to it and finally decides which are the elements that are smaller than this element (left sub-array) and which are greater than this (right hand sub-array). This critical element is called as 'pivot'. Time complexity of quick sort depends on the selection process of the pivot element as well.
Partition Example (Pivot as the first element of the array):
Partition algorithm example
Code:
int partition(int* array, int s, int n)
{
int pivot_val = array[s];
int pivot;
int i=s, j=n-1;
while(i<j && i<n-1 && j>=0)
{
while (array[j] >= pivot_val && i<j) j--;
while (array[i] <= pivot_val) i++;
if (i<j)
{
swap(array,i,j);
}
}
swap(array,j,s);
return j;
}
void quick_sort(int* array, int s, int n)
{
int pivot = partition(array, s, n);
if (s < pivot)
{
quick_sort(array, s, pivot);
}
if (pivot+1 < n)
{
quick_sort(array, pivot+1, n);
}
}
Time Complexity:
A quantification of time complexity is how many comparisons is to make for sorting. For quick sort algorithm, number of comparison is:
C(n) = n + C(n-k) + C(k-1), C(0)=C(1) = 0
Where the final position of the pivot is k.Worst case: k=1 or k=n
C(n) = n + C(n−1)
= n + n−1 + C(n−2)
= n + n−1 + n−2 + C(n−3)
= n + n−1 + n−2 + ... + 3 + 2 + C(1)
= (n + n−1 + n−2 + ... + 3 + 2 + 1) − 1
= n(n+1)/2 − 1
≈ n^2 / 2
Best case: Each sub-array would be of same size (n/2).
C(n) = n + 2C(n/2). This is same as merge sort recursion. So the solution is,
C(n) = nlog n
Avg case: Here all the values of k are equally likely. We would not be doing the calculation here. The result is:
C(n) = 1.39 n log n, i.e. just 1.39 times than the best case
Pivot selection methods:
Pivot selection can be done in many ways:
1. First / last element in the array
2. Median of the array
3. Largest number in the array
4. Middle element in the array
5. Randomly selected element
Selecting largest element in the array results in worst case time complexity. As the partition would always be 1 and n-1 elements. Selecting median as pivot is the best case as the array is divided always into two equal parts: n/2. For calculating the median of an array in linear time we can use median of median's algorithm. I'm planning discuss this algorithm in detail in later blog. All the other methods gives more or less same performance as they are not exploring any significant property of the data set.
Comparison of Merge sort and Quick sort:
Time taken:
Generally time wise quick sort is faster than merge sort for randomly ordered array.
Space required: For merge sort, the additional space requirement is n or minimum as n/2. However, the recursion depth is log n. For quick sort, partitioning is in-place. Partitioning takes O(1) time however for best case the recursion depth is log n. Thus, the space complexity is O(log n).
Quick sort is not stable. So, if you need a stable sorting for your problem then you better use merge sort.
Next time we'll discuss some popular interview questions in the domain of sorting algorithm... See you...


