Sherlock and Array

See the original problem on HackerRank.

Warning: HackerRank has recently changed the constraints of this problem but it did not update the challenge itself. Now they allow zeros to be part of the input. In particular this can cause empty sub-arrays to be a valid solution.

For example this was not valid input before the update:

1
2 0 0 0

Now it is. A solution is the position 0 since the left sub-array is empty.

In the rest of this post we'll discuss how to fix some snippets which have not automatically accommodated such a new requirement.

Solutions

This challenge can be solved in many ways. This problem is also known as “Equilibrium index of an array”, because we have to find an index that makes the array “balanced”.

Inefficient solution \(O(N^2)\)

For completeness, if we run a loop for each element of the array and we calculate left and right sum for every iteration, we find a solution but at a very high cost.

Let's explore alternatives.

Linear solutions with no extra space

We can come up with similar approaches to solve this problem linearly and without allocating extra space.

One idea is to get total sum of array first (one scan). Then Iterate through the array and keep updating the left sum which is initialized as zero. In the loop, we can get right sum by subtracting the elements one by one:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
bool equilibrium(const vector<int>& v)
{    
    auto total = accumulate(begin(v), end(v), 0);
    auto leftSum = 0;
    
    for (auto i : v)
    {
        total -= i;
        if (total == leftSum)
        {
            return true;
        }
        leftSum += i;
    }
    
    return false;
}

This solution does not suffer zeros in the input.

Some variants of this solution based on pre-calculating the total sum exist. One, for example, consists in running both leftSum and rightSum until they cross or the length of the array is reached.

An alternative clever solution that is worth sharing avoids iterating on the array to calculate the total sum first, resulting in a single scan (by Roberto Melis and Gianluca Maini). It uses a variation of the two pointer technique that we have seen several times in the past Coding Gym labs:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
bool equilibrium(const vector<int>& v)
{
    auto left = 0u;
    auto right = v.size()-1;
    auto sum = 0;
    
    while (left <= right)
    {
        if (sum == 0 && left == right)
            return true;
        
        if (sum >= 0)
        {
            sum -= v[right--];
        }
        else
        {
            sum += v[left++];
        }
    }
    return false;
}

The solution above might not work with zeros in the inuput. For example, for this test it returns “NO”:

1
0 0 2 0

C# by Matteo Furlan

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
static String balancedSums(List<Integer> arr) {
  int left = 0;
  int right = 0;

  for(int i=1; i < arr.size(); i++) {
    right += arr.get(i);
  }

  for(int i=0; i < arr.size(); i++) {
    if (left == right) {
      return "YES";
    }

    left += arr.get(i);

    if (i + 1 < arr.size()) {
      right -= arr.get(i + 1);
    }
  }
  return "NO";
}

C by Michele Liberi

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
int main() {
  int t, n, i, *A, L, R;
  scanf("%d\n", &t);
  while (t--) {
    scanf("%d\n", &n);
    A = malloc(n * sizeof(int));
    L = R = 0;
    for (i = 0; i < n; ++i)
      scanf("%d", A + i), R += A[i];
    for (i = 0; i < n; ++i) {
      R -= A[i];
      if (L == R) {
        printf("YES\n");
        break;
      }
      L += A[i];
      if (L > R) {
        printf("NO\n");
        break;
      }
    }
    free(A);
  }
  return 0;
}

This variation of the two pointer solution works by checking if an element is between two equal-sum halves:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
bool equilibrium(const vector<int>& v) 
{
        if(v.size() == 1) 
            return true;
        
        auto lptr = 0;
        auto rptr = v.size() - 1;
        auto lsum = v[lptr];
        auto rsum = v[rptr];
        
        while(rptr > lptr) 
        {
            if(lsum < rsum) 
            {
                lsum += v[++lptr];
            } 
            else if(rsum < lsum) 
            {
                rsum += v[--rptr];
            } 
            else if(lsum == rsum)
            {
                // the value in the middle is the pivot
                if(rptr - lptr == 2)
                    return true;
                lsum += v[++lptr];
            }
        }
        if(rptr == lptr && lsum == rsum) 
            return true;
        
        return false;
    }

Clearly, this solution is particularly good because it runs in a single scan. However, there is some state shared through all the iterations. This makes parallelization a bit more complicated.

Read on for alternatives.

Linear solution with prefix sum

This class of solutions employ an additional array storing the prefix sum of the elements. Actually, since the data are not really needed, we can allocate only the prefix sum array. In a more realistic environment, this decision could be not feasible.

The prefix sum is the incremental sum of all the elements:

\(pre[i] = A[0] + A[1] + … + A[i]\)

Then \(pre[i]\) stores the sum of all the elements from \(0\) to \(i\).

The prefix sum exhibits an interesting property:

\(sum(A[l]-A[r]) = pre[r] - pre[l-1]\)

That is: the sum of the elements of \(A\) from index \(l\) to index \(r\) is \(pre[r] - pref[l-1]\). For instance:

1
2
arr = 1 2 3 4 8 1 9
pre = 1 3 6 10 18 19 28

The sum of \(arr\) from index \(1\) to index \(3\) is \(9\), or \(pre[3] - pre[0] = 10 - 1\).

We also know that the total sum of the elements is \(pre[n-1]\).

Traversing over \(i\), we can easily check whether left sum is equal to the right sum this way:

\(pre[i] == pre[n-1] - pre[i-1]\)

Basically, at each step we verify if \(i-1\) is the pivot index.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
copy_n(istream_iterator<int>(cin), N, begin(pre));
partial_sum(begin(pre), end(pre), begin(pre));

bool found = N==1;
for (auto i=1; i<N; ++i)
{
	if (pre[i] == pre[N-1]-pre[i-1])
	{
		found = true;
		break;
	}
}
cout << (found ? "YES" : "NO") << "\n";

C++ has the wonderful adjacent_find to express what we are doing in the for loop more easily:

1
2
3
4
5
6
7
8
9
partial_sum(begin(pre), end(pre), begin(pre));
        
auto equalPoint = adjacent_find(
	begin(pre), 
	end(pre), [ub=pre[N-1]](auto left, auto right) {
   		return left == ub - right;
});

cout << (equalPoint == end(pre) && N!=1 ? "NO" : "YES") << "\n";

The two previous solutions suffer zeros in the input with:

1
2 0 0 0

To fix it, just add a fake 0 at the beginning of the array - to emulate an “empty subarray”:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
cin >> N;
vector<int> v(N);
copy_n(istream_iterator<int>(cin), N, begin(v));
N+=1; // <-- NEW
partial_sum(begin(v), end(v), begin(v));
v.insert(begin(v), 0); << // <-- NEW

bool found = N==1;
for (auto i=1; i<N; ++i)
{
    if (v[i] == v[N-1]-v[i-1])
    {
	    found = true;
	    break;
    }
}
cout << (found ? "YES" : "NO") << "\n";

And:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
cin >> N;
vector<int> v(N);
copy_n(istream_iterator<int>(cin), N, begin(v));
partial_sum(begin(v), end(v), begin(v));
v.insert(begin(v), 0); // <-- NEW

auto equalPoint = adjacent_find(begin(v), end(v), [ub=v.back()](auto left, auto right) {
   return left == ub - right;
});

cout << (equalPoint == end(v) && N!=1 ? "NO" : "YES") << "\n";

Python

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
def balancedSums(arr):
    psum = 0
    sum_acc = []
    for x in arr:
        psum += x
        sum_acc.append(psum)

    for i, x in enumerate(arr):
        if i == 0:
            sum_left = 0
        else:
            sum_left = sum_acc[i-1]
        sum_right = sum_acc[-1] - sum_acc[i]
        if sum_left == sum_right:
            return "YES"
    return "NO"

Haskell

1
2
3
4
5
6
7
balancedSum :: [Int] -> Bool
balancedSum [_] = True
balancedSum arr = any match $ zip prefixSum $ tail prefixSum
  where
    match (v, v') = v' == s - v
    prefixSum = scanl1 (+) (0:arr)
    s = last prefixSum

Another version of this approach, just for fun, consists in using two prefix sums - left and right, and finding the element in common at the same index:

1
2
3
4
5
6
copy_n(istream_iterator<int>(cin), N, begin(v));
vector<int> left(N), right(N);
partial_sum(begin(v), end(v), begin(left));
partial_sum(rbegin(v), rend(v), rbegin(right));
auto equalPoint = mismatch(begin(left), end(left), begin(right), not_equal_to<>{}).first;
cout << (equalPoint == end(left) ? "NO" : "YES") << "\n";

mismatch runs a linear loop over i and returns the first occurrence of !pred(left[i], right[i]. In our case pred is not_equal_to<>, so !pred is like equal_to. It's a bit muddled, isn't it?!

This solution does not suffer zeros in the input.

Why you should care about prefix sum solutions

Prefix sums have a huge advantage compared to the other solutions. They can be very easily parallelized because the resulting algorithm is completely “stateless” (iterations have no inter-dependencies).

In C++17 we could simply add tag as first parameter of our standard algorithms to enable parallel execution. For instance:

1
2
3
4
5
6
7
partial_sum(std::execution::par, begin(pre), end(pre), begin(pre));
        
auto equalPoint = adjacent_find(std::execution::par, begin(pre), end(pre), [ub=pre[N-1]](auto left, auto right) {
   return left == ub - right;
});

cout << (equalPoint == end(pre) && N!=1 ? "NO" : "YES") << "\n";

Clearly, balance this with the number of elements you have or the algorhtm will result in worse performance.

We've worked on this challenge in these gyms: modena  turin 
comments powered by Disqus