New Year Chaos

See the original problem on HackerRank.

Solutions

We can see, intuitively, that the number of people a person \(p\) has bribed corresponds to the number of people on her right with value less than \(p\). For instance:

1
1 2 3 5 4

In this case, \(5\) has bribed \(4)) and took her position.

Another example:

1
2 1 5 3 4

We have three bribes:

  • \(2\) has bribed \(1\)
  • \(5\) has bribed \(4\)
  • \(5\) has bribed \(3\)

This problem is equivalent to calculating the number of inversions of the array, that is the number of elements that are out of the natural order of a sequence.

A bruce force approach consists in running two nested loops and counting how many times \(A_i > A_j\), with \(1 \leq i \leq n\) and \(i \le j \leq n\). Clearly, this is quadratic.

Linear approach

There are many ways to calculate the inversion count in Computer Science. One approach consists in using Merge Sort, another one on Self-Balancing Trees and a less known is based on Binary Indexed Tree.

Such methods are generic and solve the problem in more than linear time - usually, function of \(N \cdot log(N)\).

Actually, we can do better for this particular problem by taking advantage of a constraint: the maximum allowed bribes for each person is \(2\). Let’s elaborate a bit more.

If we look at the last element \(A_{last}\), we have two possibilities:

  • \(A_{last} == last\)
  • \(A_{last} \neq last\)

\(last\) is the “expected element”, or the element that would have been there if the array was perfectly sorted.

If we fall into the second case, the expected element moved towards left because it has bribed its neighbor. However, it can move at most by two positions because of the constraint. So we can check where it is an act accordingly:

  • if last is in A[last-1], we swap A[last] with A[last-1]. We add \(1\) to the inversion count;
  • else, if last is in A[last-2] then we shift A[last-1] to A[last-2], A[last] to A[last-1] and set A[last] to last. We add \(2\) to the inversion count;
  • else, the sequence is too chaotic.

We can do this for each value of \(last\) down to \(0\).

For example:

1
2 1 5 3 4

We start from the last element. The expected element is \(5\) but we find \(4\). Where is \(5\)? Is \(2\) positions towards left. So we shift the last three elements:

1
2 1 3 4 5

And we add \(2\) to the inversion count (that is 2 now).

Then we find 4 and 3 in the right positions.

Next, we find that 2 is not in the right position. It’s just one position away on the left. So we swap \(2\) and \(1\) and add 1 to the inversion count.

We have found that 3 is the number of inversions!

On the other hand, this example:

1
1 5 2 3 4

is too chaotic because 5 moved to far towards left.

Here is a possible implementation in C++:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
void solve(vector<int>& v)
{
    int N = v.size();
    auto cnt = 0;
    for (auto i=N-1; i>=0; --i)
    {
        auto expected = i + 1;           
        if (v[i] != expected)
        {
            if (i >= 1 && expected == v[i-1])
            {
                ++cnt;
                swap(v[i], v[i-1]);
            }
            else if (i >= 2 && expected == v[i-2])
            {
                cnt += 2;
                v[i-2] = v[i-1];
                v[i-1] = v[i];
                v[i] = i;
            }
            else
            {
                cout << "Too chaotic\n";
                return;
            }
        }
    }
    cout << cnt << "\n";
}

It’s interesting to see that we can turn the shift into an application of std::rotate:

1
rotate(begin(v)+i-2, begin(v)+i-1, begin(v)+i+1);

This solution is linear only because of the constraint. Otherwise, it would be function of the “maximum distance” \(d\): \(O(d*N)\). In the most geneal form, \(d\) is \(N\) and this solution is quadratic.

Clearly, we also need to take into account that this solution changes the input.

An alternative

We might dive into the known implementations we have referenced above but we would like to show you a (less efficient) alternative.

The solution is divided in two steps:

  • check if the sequence is too chaotic;
  • count the number of inversions.

The first step can be solved linearly and for any constraint on the maximum distance (2, for our problem):

1
2
3
4
5
6
7
bool excess = false;
const int d = 2; // can be any number
for (auto i=0u; !excess && i<N; ++i)
{
    auto expPos = v[i]-1;
    excess = (expPos > i) && (expPos-i > d);
}  

Then we could calculate the number of inversions with any technique we like.

Even though it’s not the most efficient way (better ways are discussed in the links we provided above), let us show you an interesting pattern.

Suppose we copy the elements from our input sequence into another sequence so that the resulting sequence is sorted.

If the input sequence is already sorted, for any element \(A_i\), we always insert at the end of the destination sequence.

Otherwise, we should find where to insert \(A_i\) in the sorted sequence.

For example:

1
1 2 4 3 5

Suppose we have already copied 1 2 4 and we are about to copy 3. Since the sequence is not sorted, we should insert 3 between 2 and 4:

1
2
1 2   4
    ^

So, the number of elements on the right of the arrow are the out of order elements up to that point.

How to locate such element quickly (better than linear time)? It’s an upper bound (actually can be both upper bound or lower bound because the sequence does not have duplicates)!

So, the distance between the upper bound and the end of the destination sequence will be added to the final count.

Let’s do a full example:

1
2 1 5 3 4

We start from an empty destination.

We read 2 and we insert it straight into the destination (the upper bound is “nul”).

Then we read 1 and we get the upper bound, that is 2. So we do +1 to our counter.

We read 5. The upper bound is “nul” so we do not add anything to our counter.

Destination at this point:

1
1 2 5

We read 3. The upper bound is 5. We add +1 to the counter (that becomes 2).

We read 4. The upper bound is still 5 and we add +1 to the counter (3).

The answer is, indeed, 3.

In the following example:

1
2 3 1 4 5

When we read 1 the destination sequence is:

1
2 3

So upper bound(1) is 2 and 2 elements are on the right:

1
2
  2 3
^ 

The destination might be a sorted set. Here is an implementation in C++:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
#include <set>
#include <vector>
#include <iostream>
#include <algorithm>
#include <iterator>

using namespace std;

size_t getInvCount(const vector<int>& arr)
{
    set<int> set1;   
    size_t invcount = 0;    
    for (int i = 0; i < arr.size(); i++)
    {
        auto itset = set1.upper_bound(arr[i]);
        set1.insert(itset, arr[i]);       
        invcount += distance(itset, set1.end());
    }
    return invcount;
}

int main(){
    int T, N;
    cin >> T;
    vector<int> v; 
    v.reserve(100000);
    while (T--)
    {
        cin >> N;
        v.resize(N);
        copy_n(istream_iterator<int>(cin), N, begin(v));
        bool excess = false;
        for (auto i=0u; !excess && i<N; ++i)
        {
            auto expPos = v[i]-1;
            excess = (expPos > i) && (expPos-i>2);
        }    
        cout << (excess ? "Too chaotic" : to_string(getInvCount(v))) << '\n';
    }
}

Two facts:

  • we do not pay a *double lookup” because insert is called properly with a valid hint;
  • distance on the set iterators is amortized linear. However, since of the initial constraint, this results constant in our case.

The solution uses extra space but it does not change the input.

The overall complexity is \(O(N \cdot log(N))\).

What it’s interesting about this exercise:

  • we found a nice application of upper bound;
  • we exploited and understood the constraints of the problem;
  • we can start from here to delve into the topic of counting inversion, studying and applying more generic (and efficient) solutions.
We've worked on this challenge in these gyms: modena 
comments powered by Disqus