See the original problem on HackerRank.
A professor calls out student IDs of students one by one while marking attendance. He notices that the number of students recorded in the attendance sheet is far more than the number of students who are actually present in the classes.
Hence, he decides to use a robot which can record the students’ voices and keep track of which students have responded to attendance calls.
At the end of each session, the robot outputs the student IDs of the students who have responded to attendance calls. With this information, the professor needs your help to find out which students were absent.
Complete the function findTheAbsentStudents which takes in an integer array \( a \) denoting the student IDs recorded by the robot and returns the list of student IDs of the students which were absent in increasing order.
The first line of input contains a single integer \( n \) denoting the number of students.
The second line contains \( n \) space-separated integers \( a_1,a_2,...,a_n \) denoting the student IDs recorded by the robot. The students have IDs from \( 1 \) to \( n \), inclusive.
- \( 1 \leq n \leq 100 \)
- \( 1 \leq a_i \leq n \)
- There is at least one absent student.
Print a single line containing the student IDs of the students which were absent, space-separated and in increasing order.
Since the input of this problem is quite little, an \( O(N^2) \) solution works. Basically, we could loop from 1 to \( N \) and check if each element appears in the input. If not, we print that number.
We are at Coding Gym and we could do much better than this.
At Coding Gym, we tag this kind of challenges as “Manifold”, meaning that many different solutions exist.
First of all, it’s worth noticing that this exercise is a specialization of a much more general problem: given two list of elements, find which elements from the first list are missing in the second. Usually, to solve such a problem hash tables or set difference are enough.
A very first solution to our challenge consists in creating a set from \( 1 \) to \( N \) and simply calculating the difference with the input, turned into a set too:
- \( O(N) \) extra space for the support range
- turning the input sequence into a sorted set is \( O(N \cdot LogN) \)
- set difference is linear
- absent students set must be sorted for printing
Hashing is discussed bit at the end.
Anyway, for our problem such extra space is not really a problem and we could even generate an incremental range on the fly.
What makes our problem even more interesting is that we can exploit its structure and constraints.
We need to check numbers from \( 1 \) to \( N \). This means we can sort the input and just check linearly:
We can decide to remove duplicates or not depending on the size of the input: since we know that elements spans from \( 1 \) to \( 100 \), it’s not hard to imagine that when \( N \) is bigger that 100, duplicates are certain.
A similar solution consists in using a set (or any other sorted data structure) instead of array + sort.
Sometimes we are allowed to modify the input, this means we could not remove duplicate elements. In this case it’s worth mentioning this variant based on lower bound:
All the solutions above are affected by sorting, meaning they are order of \( O(N \cdot LogN) \).
Is it possible to solve this problem linearly?
Since the solutions above are dominated by the sort cost, we have to think of another way to obtain the same information that sort gives without actually sorting.
We could use some extra space to mark each element we read. Actually, if we don’t really need the input, we won’t have any extra space at all!
Since the input is little, we could have a constant extra space:
In general, we could use a hash table:
This last snippet could adapted to solve the more general problem of two arbitrary ranges.
Javscript Solution (alepez)