Find the Missing Value

See the original problem on HackerRank.

Given an unsorted array of \( n-1 \) elements, it is missing exactly one element from the sequence of \( 1 \) to \( n\). Find the missing element.

Input Format

An integer \( n \) followed by \( n-1 \) space separated numeric values.

Constraints

\( 1<n<10^6 \)

Output Format

The missing value.

Solutions

This is an example of problems that we call manifold at Coding Gym, since they give us room to find several alternative solutions.

We sort the input array and we linearly scan the sequence to find the missing value. For example: In C++:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
#include <iterator>
#include <cstdio>
#include <vector>
#include <iostream>
#include <algorithm>
using namespace std;

int main() 
{
    int n; cin >> n;
    vector<int> v(n-1);
    copy_n(istream_iterator<int>(cin), n-1, begin(v));
    sort(begin(v), end(v));
    auto missing = n;
    for (auto i=1; i<n; ++i)
    {
        if (i != v[i-1])
        {
            missing = i;
            break;
        }
    }
    cout << missing;
}

In Haskell:

1
2
3
import Data.List (sort)
findTheMissingValue (_:xs) = head [ i | (x, i) <- zip (sort xs) [1..], i /= x ]
main = interact $ show . findTheMissingValue . fmap read . words

Pros:

  • easy to extend to a more generic domain (e.g. if we have a baseline array, we can easily adapt this solution to locate the element missing in the second array)

Cons:

  • we pay sort - \( O(N \cdot logN) \)

Notes:

  • we change the original array (what if not permitted?)

Variant: use binary search to locate the missing value, since the array is sorted.

Additional Storage

We could keep track of elements into a support storage (e.g. a flag array). This is just an example:

1
2
3
4
5
vector<bool> hits(n, false); // C++ addicted will disapprove this!
for (auto val : v)
    hits[val-1] = true;
    
cout << (distance(begin(hits), find(begin(hits), end(hits), false))+1);

Pros:

  • it's linear
  • we do not modify the input array
  • easy to extend to a more generic domain and also to duplicates (e.g. turn the vector into a frequency array)

Cons:

  • we pay additional storage

Mathematics

We can exploit the problem domain to the limit.

  • calculate the sum of all the numbers from \( 1 \) to \( N \). Call it \( expected \)
  • calculate the sum of input numbers. Call if \( sum \)
  • the missing value is just \( expected - sum \)

It's interesting to note that the sum from \( 1 \) to \( N \) is the Gauss Formula:

\( \dfrac{n(n+1)}{2} \)

In C++:

1
2
3
auto sum = accumulate(begin(v), end(v), 0ull); // aka: reduce
auto expected = n*(n+1)/2;
cout << (expected - sum);

Pros:

  • efficient
  • no extra space
  • we do not modify the input array
  • the same idea (summation, not Gauss) can be reused to solve the same problem in more generic domains

Cons:

  • what if \( \dfrac{n(n+1)}{2} \) overflows?

In Haskell there is no overflow, because the type Integer is unbounded (also known as bignum).

1
2
3
findTheMissingValue :: [Integer] -> Integer
findTheMissingValue (n:xs) = (n * (n + 1) `div` 2) - sum xs
main = interact $ show . findTheMissingValue . fmap read . words

XOR

The following solution is very similar to the previous one but it never overflows.

It's based on XOR property:

  • \( A \oplus B = C \)
  • \( A \oplus C = B \)
  • \( B \oplus C = A \)

That's the idea:

  • suppose \( A \) is the result of XORing all the input elements
  • suppose \( B \) is the missing value
  • the result of XORing \( A \oplus B \) is \( C \)
  • then \( C \) is the result of XORing all the elements from 1 to N
  • we can easily find \( B \) as \( A \oplus C \)

Talk is cheap, let's look at some code:

1
2
3
4
5
6
7
8
auto xorVals = accumulate(begin(v), end(v), 0, bit_xor<>{}); // aka: reduce
auto expected = 0;
// naive approach, for an efficient method see, for instance,
// here https://www.geeksforgeeks.org/calculate-xor-1-n/
for (auto i=1; i<=n; ++i)
    expected ^= i;
    
cout << (expected ^ xorVals);

In Haskell, but reducing with xor on a single list: the concatenation of the original list plus numbers from 1 to n.

1
2
3
4
5
import Data.Bits (xor)
findTheMissingValue :: [Integer] -> Integer
findTheMissingValue (n:xs) = foldl xor 0 ([1..n] ++ xs)
main = interact $ show . findTheMissingValue . fmap read . words

Pros:

  • never overflows
  • the same idea can be reused to solve the same problem in more generic domains

Cons:

  • tricky ;)
We've worked on this challenge in these gyms: modena  padua  milan  barcelona  turin  bari  rome  polimi 
comments powered by Disqus