Secret repetition

See the original problem on HackerRank.

Solutions

This problem is very easy but, as we’ll see in a minute, it can be solved in a very clever way.

The easiest approach uses extra space, a frequency table. The core of the algorithm in C++:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
int N; cin >> N;
vector<int> A(2*N);
copy_n(istream_iterator<int>(cin), 2*N, begin(A));
vector<int> freq(1'000'000 + 1);
for (auto i : A)
{
    freq[i]++;
    if (freq[i] > 1)
    {
       cout << i;
       break;
    }
}

This solution is linear.

Other data structures work as well, such as static arrays (when the domain is smaller) and hash maps.

Clearly, the domain plays a crucial role: here, the extra space used is constant because the domain is fixed and does not vary with the input. It’s a trade-off.

Another working solution avoids allocating extra space by exploiting two observations over the input format:

  1. since we have \(N+1\) unique elements in a \(2 \cdot N\) array and the repeating one is present \(N\) times, this means that there can be no other element repeating
  2. if half of the array is repeated, it doesn’t matter how we shuffle it, for one of the repeated elements there has to be another repeated element at least 3 positions away, otherwise the array isn’t \(2 \cdot N\).

Thus, we only have to compare elements with their neighbors that are distance 1, 2, or 3 away:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
for (int k = 1; k <= 3; ++k)
{
   for (auto i = 0u; i < A.size() - k; ++i)
   {
      if (A[i] == A[i + k])
      {
         cout << A[i];
         return;
      }
   }
}

Practically speaking, this solution is a bit slower than the other one (mostly because it has an inner loop) but it’s very interesting to find.

As we have seen, both the solutions exploit some data of the input domain.

Another similar solution that iterates on the array only once:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
if (A.front() == A.back())
    cout << A.front();
else
{
	for (auto i = 2u; i < A.size(); ++i)
	{
		if (A[i] == A[i - 1] || A[i] == A[i - 2]) 
		{
			cout << A[i];
			return;
		}
	}
}

The only case where we need the explicit if condition above is when N=2. For example:

1
2 1 3 2

In this case, the result is the last number, so we just return it in the end:

1
2
3
4
5
6
7
8
9
for (auto i = 0; i < A.size() - 2; ++i)
{
   if (A[i] == A[i + 1] || A[i] == A[i + 2])
   {
      cout << A[i];
      return;
   }
}
cout << A.back(); 

By the way, Python allows us to remove the condition due to the behavior of the slice operator (e.g. A[-1] returns the last element):

1
2
3
for i in range(len(A)):
   if A[i - 1] == A[i] or A[i - 2] == A[i]:
      return A[i]

An esotheric solution consists in observing that if you pick two numbers randomly, there is a 25% chance you bump into the repeated number. So, in average, we will find the answer in 4 attempts, thus O(4) runtime!

Taken from here:

1
2
3
4
5
void repeatedNTimes(vector<int>& A, int i = 0, int j = 0) 
{
  while (A[i = rand() % A.size()] != A[j = rand() % A.size()] || i == j);
  cout << A[i];
}
We've worked on this challenge in these gyms: modena 
comments powered by Disqus