MergeSort and counting inversions

Below is an implementation of MergeSort and which also counts the number of Inversions. I’m assuming you are already familiar with Mergesort. Definition of an inversion is that If i < j and A[i] > A[j] then we have one inversion.

The trick to calculate the number of inversions in O(nlogn) time is piggy back it on merge sort. Each time the element from the rightArray is copied over to the sorted array in the merge step, we will count and add the number of remaining elements in the left array. This will give us the total number of inversions.

Another key point to keep in mind is that merge sort takes O(n) space. In-place merge sort is possible but not any faster than using an auxiliary array. If you are doing an in-place merge sort remember that when you copy over the smaller element from the right array over to the sorted array, you have to shift once all the existing elements in the sorted array to the right. Simply copying over the element from the right array over to the sorted array and incrementing the rightIndex does not work! Bottom line is that memory is cheap, use it.

Running time of mergesort is O(nlogn).
1. The time required to split the input array into two halves is constant.
2. The time required for the divide step is 2*T(n/2).
3. The time required for the merge step of n elements is O(n)since we have to go through the entire array once.

How much time is taken by the divide step 2*T(n/2)? With the recursive call the work is divided into two sub problems, each of size (n/2). Merging a sub problem of (n/2) elements takes O(n/2) time. But since there are two sub problems at this level, the total time taken will be 2 * (Time taken for merging each step), i.e. 2 * O(n/2) = n.

So at each level of the tree the time taken for the merge step is O(n). The number of levels that we have is say ‘L’. The total number of levels with respect to input size is denoted as L = log(n) + 1. We can ignore the constant term 1, thus the total number of levels in terms of input size is log(n).
Time taken by 2*T(n/2) i.e Step 2 = L * (Time taken for merge step at each level)
= log(n) * n = O(nlogn)

Thus time taken by merge sort = Step 1 + Step 2 + Step 3

= 1 + O(nlogn) + O(n)

=O(nlogn)  [Removing the lower order terms]

(Here is a link that has a diagram along with a detailed explanation.)

Below is my implementation of counting inversions from an input file that has 100,000 elements. The path to the input file is the only run time argument that the main method needs.

GitHub link for below code is here

MergeSort.java

package mergesort;

import java.util.Arrays;

public class MergeSort {

public static long mergesort(int[] numList) {

long countInversions = 0;
if (numList.length &amp;amp;gt; 1) {
int middleIndex = numList.length / 2;
int[] leftArray = Arrays.copyOfRange(numList, 0, middleIndex);
int[] rightArray = Arrays.copyOfRange(numList, middleIndex, numList.length);
countInversions += mergesort(leftArray);
countInversions += mergesort(rightArray);
countInversions += merge(numList, leftArray, rightArray);
}
return countInversions;
}

private static long merge(int[] numList, int[] leftArray, int[] rightArray) {

int leftIndex, rightIndex;
leftIndex = rightIndex = 0;
int i = 0;
long countInversions = 0;
while (leftIndex &amp;amp;lt; leftArray.length &amp;amp;amp;&amp;amp;amp; rightIndex &amp;amp;lt; rightArray.length) {
if (leftArray[leftIndex] &amp;amp;lt;= rightArray[rightIndex]) {
numList[i] = leftArray[leftIndex++];
} else {
numList[i] = rightArray[rightIndex++];
int numRemainingElementsInLeftArray = leftArray.length - leftIndex;
countInversions = countInversions + numRemainingElementsInLeftArray;
}
i++;
}
while (leftIndex &amp;amp;lt; leftArray.length) {
numList[i++] = leftArray[leftIndex++];
}
while (rightIndex &amp;amp;lt; rightArray.length) {
numList[i++] = rightArray[rightIndex++];
}
return countInversions;
}

public static long countInversions(int[] numList) {

// Definition of Inversion: If i A[j] then we have one inversion
// We will piggyback on merge sort and each time the right element is
// copied over to the sorted array in the merge step, we will count and
// add the number of remaining elements in the left array. This will
// give us the total number of inversions.
long numberOfInversions = mergesort(numList);
return numberOfInversions;
}
}

Main.java

package mergesort;

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Main {

public static void main(String args[]) {

System.out.println("The first argument is the input file path, currently it is" + args[0]);
Path inputFilePath = Paths.get(args[0]);
int[] numList = null;
numList = new int[100000];
int index = 0;
try {
BufferedReader reader = Files.newBufferedReader(inputFilePath);
String line = null;
while ((line = reader.readLine()) != null) {
numList[index] = Integer.parseInt(line);
index++;
}
} catch (IOException ex) {
ex.printStackTrace();
}
// We have now all the input integers from file stored in array numList
long countInversions = MergeSort.countInversions(numList);
System.out.println("Number of inversions are: " + countInversions);
}
}

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s