Friday, April 20, 2012

4.1 Maximum Sub Array

I studied the maximum sub array problem from <Introduction to Algorithms>. I'm gonna write down the notion so that I can recall it in the future easily.

The problem
I won't describe the whole problem such as buying or selling a stock. Instead, I'm going to describe the the problem in another words. For a given integer array, which contains some positive or negative values, try to find a continuous sub array that the sum is the maximum.

Here is an example:
Array = {13, -3, -25, 20, -3, -16, -23, 18, 20, -7, 12, -5, -22, 15, -4, 7}
From the given array, we can easily find out the sub array, which is {18, 20, -7, 12}. We can get it because we are good at mathematics. But how to resolve it in programming? That's the maximum sub array problem.

The traditional solution
I do not want to speak so much about the traditional solution coz the most of programmer can tough it when he/she really think about it. But I still need to describe them because we can analyze it and compare with the Divide-Conquer-Combine solution. Okay, here we go!

The easily way to resolve this problem is adding these items one by one, comparing the sum to get the maximum sum, iterating all array items. After that, we will get the maximum sum sub array. Here is the pseudocode:

GetMaxSum(int currentIndex)
{
    sum = -∞;
    max = -∞
    right = -1;
    for(int i = currentIndex; i < array.Length; i++)
    {
        sum += array[i];
        if(sum > max)
        {
            max = sum;
            right = i;
        }
    }
    return (currentIndex, right, max);
}

This is the core code, and we still need a calling procedure, which will iterate the whole array and compare the result one by one. Such as:
First value 13: -3, -25 .......
Second Value -3: -25, 20, -3 .......

We can easily analyze the running times. In the for loop in the GetMaxSum method, it costs array.Length - current index times. For the calling procedure, because we will iterate the whole array (except the final one), which length is array.Length - 1. So the final running times should be ((array.Length -1) + 1)*(array.Length - 1)/2. In another words, it's (n-1) + 1)*(n-1)/2. So far, we can easily get the Θ(n2).

The Divide-Conquer-Combine solution
For the divide-conquer-combine solution, we can divide the array to two parts, such as [first, mid] and [mid + 1, last]. Then, we can try to find the sub array. But before that, it looks like we miss one important criterion that the sub array can cross the mid, which is [i, j]. i first < i < mid, mid < j < last. That's an important one. Okay, now we get three criteria:
1. sub array are in [first, mid]
2. sub array are in [mid + 1, last]
3. sub array are in [i, j], i first < i < mid, mid < j < last

Let me show the pesudocode for the third one first:
GetCrossingSubArray(int mid)
{
    left = -1
    leftMax = -∞;
    sum = -∞;
    for(int i = mid; i >= first; i--)
    {
        sum += array[i];
        if(sum > leftMax)
        {
            left = i;
            leftMax = sum;
        }
    }

    right = -1

    rightMax = -∞;
    sum = -∞;
    for(int i = mid; i <= last; i++)
    {
        sum += array[i];
        if(sum >  rightMax)
        {
             right = i;
             rightMax = sum;
        }
    }
    return (left, right, leftMax + rightMax);
}

Also, we still need the call procedure. Absolutely, it contains the criteria 1 and 2. That is:
GetSubArray(int first, int last)
{
    if(first == last)
        return (first, last, array[first]) //only one element
    else
    {
        int mid = (first + last)/2;
        leftMax = GetSubArray(first, mid);
        rightMax = GetSubArray(mid + 1, last);
        crossingMax = GetCrossingSubArray(mid);
        if(leftMax > rightMax && leftMax > crossingMax)
            return leftMax;
        else if(rightMax > leftMax && rightMax > corssingMax)
            return rightMax;
        else
            return crossingMax;
    }
}

For analyzing this algorithm, we need to calculate the running time separately. For the GetCrossingSubArray, the for loop is a constants time, which is n. For the GetSubArray, the running time is T(n) = 2T(n/2) + GetCrossingSubArray, which is T(n) = 2T(n/2) + n. Therefore, the final running times are: Θ(nlgn) + Θ(n). (Considering the GetSubArray as a tree, each level's sum is n. The tree high is lgn, which is the same with merge sort)

No comments:

Post a Comment