See the original problem on HackerRank.
Given an unsorted array of \( n-1 \) elements, it is missing exactly one element from the sequence of \( 1 \) to \( n\). Find the missing element.
Input Format
An integer \( n \) followed by \( n-1 \) space separated numeric values.
Constraints
\( 1<n<10^6 \)
Output Format
The missing value.
Solutions
This is an example of problems that we call manifold at Coding Gym, since they give us room to find several alternative solutions.
Sort + Search
We sort the input array and we linearly scan the sequence to find the missing value. For example: In C++:
|
|
In Haskell:
|
|
Pros:
- easy to extend to a more generic domain (e.g. if we have a baseline array, we can easily adapt this solution to locate the element missing in the second array)
Cons:
- we pay sort - \( O(N \cdot logN) \)
Notes:
- we change the original array (what if not permitted?)
Variant: use binary search to locate the missing value, since the array is sorted.
Additional Storage
We could keep track of elements into a support storage (e.g. a flag array). This is just an example:
|
|
Pros:
- it’s linear
- we do not modify the input array
- easy to extend to a more generic domain and also to duplicates (e.g. turn the vector into a frequency array)
Cons:
- we pay additional storage
Mathematics
We can exploit the problem domain to the limit.
- calculate the sum of all the numbers from \( 1 \) to \( N \). Call it \( expected \)
- calculate the sum of input numbers. Call if \( sum \)
- the missing value is just \( expected - sum \)
It’s interesting to note that the sum from \( 1 \) to \( N \) is the Gauss Formula:
\( \dfrac{n(n+1)}{2} \)
In C++:
|
|
Pros:
- efficient
- no extra space
- we do not modify the input array
- the same idea (summation, not Gauss) can be reused to solve the same problem in more generic domains
Cons:
- what if \( \dfrac{n(n+1)}{2} \) overflows?
In Haskell there is no overflow, because the type Integer
is unbounded (also
known as bignum).
|
|
XOR
The following solution is very similar to the previous one but it never overflows.
It’s based on XOR property:
- \( A \oplus B = C \)
- \( A \oplus C = B \)
- \( B \oplus C = A \)
That’s the idea:
- suppose \( A \) is the result of XORing all the input elements
- suppose \( B \) is the missing value
- the result of XORing \( A \oplus B \) is \( C \)
- then \( C \) is the result of XORing all the elements from 1 to N
- we can easily find \( B \) as \( A \oplus C \)
Talk is cheap, let’s look at some code:
|
|
In Haskell, but reducing with xor on a single list: the concatenation of the
original list plus numbers from 1 to n
.
|
|
Pros:
- never overflows
- the same idea can be reused to solve the same problem in more generic domains
Cons:
- tricky ;)