See the original problem on HackerRank.
Solutions
A brute force solution involves building the binary string one character at a time, starting from an empty string. At each step, it tries adding a 0
or a 1
and continues until the string reaches length n
. When a full-length string is formed, it checks if it’s already in the given list. If it’s not, the function prints it and stops. This approach guarantees that the first missing string found is among the smallest in lexicographical order.
Here is a recursive implementation:
|
|
In the worst case, the time complexity is \(O(N^2)\), since hashing a string of length \(N\) takes \(O(N)\) and \(N\) strings may need to be checked. The space complexity is \(O(N^2)\) for storing \(N\) strings of length \(N\), plus \(O(N^2)\) for the recursion stack (as we store a string up to length \(N\) at each level).
A similar approach is iterative, for example using a stack:
|
|
A linear solution
An efficient solution relies on a simple observation: since we have \(N\) binary strings of length \(N\), we can think of them as forming an \(N \cdot N\) grid, one string per row. The diagonal of this grid, made by taking the i-th character from the i-th row, forms a new string of length \(N\). If we flip each bit of this diagonal, the resulting string is guaranteed to differ from every original string in at least one position (specifically, at index i
for the i-th string), so it cannot be part of the input list.
For example, suppose we’re given the following 5 binary strings:
|
|
If we read the diagonal (the 0th character of the 0th string, the 1st of the 1st, and so on) we get:
|
|
Flipping each bit of this diagonal string gives:
|
|
This new string is guaranteed not to appear in the input, because it differs from the i-th string at position i
.
This solution is linear in time and constant in space.
This applies Cantor’s diagonal argument, a classical idea from set theory. Originally, Cantor used it to show that some infinite sets, like the real numbers, are uncountable (they can’t be matched one-to-one with the natural numbers). This concept introduced the notion of different sizes of infinity and laid the foundation for cardinality theory. In our setting, we use the same diagonal trick.
Here is the code:
|
|
And this, with ranges:
|
|
Playing with small values of N
A useful variation of the problem arises when working with values of \(N\) that can fit into standard integer types (like 32 or 64 bits). Since we’re given \(N\) binary strings of length \(N\), and there are \(N+1\) possible integers in the range \([0, N]\), at least one of them is guaranteed to be missing from the input.
The approach is to convert each binary string into its decimal representation and store these values in a set. Then, by scanning from 0
to N
, we can efficiently find the first number that isn’t in the set. This missing number corresponds to a valid binary string that does not appear in the input.
Here is a code example in Python:
|
|