See the original problem on HackerRank.
Given an unsorted array of \( n1 \) elements, it is missing exactly one element from the sequence of \( 1 \) to \( n\). Find the missing element.
Input Format
An integer \( n \) followed by \( n1 \) space separated numeric values.
Constraints
\( 1<n<10^6 \)
Output Format
The missing value.
Solutions
This is an example of problems that we call manifold at Coding Gym, since they give us room to find several alternative solutions.
Sort + Search
We sort the input array and we linearly scan the sequence to find the missing value. For example: In C++:


In Haskell:


Pros:
 easy to extend to a more generic domain (e.g. if we have a baseline array, we can easily adapt this solution to locate the element missing in the second array)
 with a small effort, can work with duplicates
Cons:
 we pay sort  \( O(N \cdot logN) \)
 requires modifying the array (what if not permitted?)
Variant: use binary search to locate the missing value, since the array is sorted.
Optimized sort + search
Since the domain is \( [1N] \), we can actually sort in \( O(N) \) by applying Cycle sort (we can even sort with Counting sort or such, however those require some additional storage):


Pros:
 it’s linear
 minimal number of element writes
Cons:
 we do modify the array
 if the domain changes, cycle sort might be inconvenient (in general, it’s quadratic)
Additional Storage
We could keep track of elements into a support storage (e.g. a flag array). This is just an example:


Pros:
 it’s linear
 we do not modify the input array
 easy to extend to a more generic domain and also to duplicates (e.g. turn the vector into a frequency array or map)
Cons:
 we pay additional storage
Note: since the domain is relatively small, we can even use constant storage:


Mathematics
We can exploit the problem domain to the limit.
 calculate the sum of all the numbers from \( 1 \) to \( N \). Call it \( expected \)
 calculate the sum of input numbers. Call if \( sum \)
 the missing value is just \( expected  sum \)
It’s interesting to note that the sum from \( 1 \) to \( N \) is the Gauss Formula:
\( \dfrac{n(n+1)}{2} \)
In C++:


Pros:
 efficient
 no extra space
 we do not modify the input array
 the same idea (summation, not Gauss) can be reused to solve the same problem in more generic domains
Cons:
 what if \( \dfrac{n(n+1)}{2} \) overflows?
In Haskell there is no overflow, because the type Integer
is unbounded (also known as bignum).


A C++ variation that does not suffer for integer overflow:


XOR
The following solution is very similar to the previous one but it never overflows.
It’s based on XOR property:
 \( A \oplus B = C \)
 \( A \oplus C = B \)
 \( B \oplus C = A \)
That’s the idea:
 suppose \( A \) is the result of XORing all the input elements
 suppose \( B \) is the missing value
 the result of XORing \( A \oplus B \) is \( C \)
 then \( C \) is the result of XORing all the elements from 1 to N
 we can easily find \( B \) as \( A \oplus C \)
Talk is cheap, let’s look at some code:


In Haskell, but reducing with xor on a single list: the concatenation of the
original list plus numbers from 1 to n
.


Pros:
 never overflows
 the same idea can be reused to solve the same problem in more generic domains
Cons:
 tricky ;)
Marking array elements
Another  a bit hackish  solution to solve the problem consists in marking array elements as negative by using array elements as indexes.
For example:


The first pass of the algorithm marks as negative those numbers whose indexes are elements of the array (they are not missing elements):
5
can’t lead to any numbers marked because it would go out of bounds1
leads to marking5
as5
(because5
is at1
’s position that is0
)3
leads to marking3
as3
(because3
is in the right position)2
leads to marking1
as1
(because1
is at2
’s position that is1
)
So, after this first pass, the array will look like:


The second part of the algorithm checks if any of the numbers is not negative. If any, the missing element is just the next index (because the domain starts from 1
). In this case above, the missing element is 4
because 2
is positive and lies at index 3
(so the solution is the number 3+1
).
The only exception is when, after the first pass, all the elements get negative. Here, the missing element is just N
:


After the first pass:


The missing element is 5
(N=5
).
Talk is cheap, show me the code:


Side note: we can restore the original array by turning negative elements positive.
Pros:
 still linear
 can’t overflow
Cons:
 not easily generalizable