See the original problem on HackerRank.
The naive solution to this problem consists in sorting the array in ascending order, subtracting the minimum value from all the sticks and printing the count of non-zero sticks.
We can do better and avoid updating the stick lengths at every step (that makes the solution quadratic).
One way consists in visiting each element only once:
We start from index 0 and we print the distance between index
0 and index
8 (the end of the array), that is
Then we find the first element greater than
1 (value at the first index). It’s
2, at position
2. We print the span from
2, that is
Basically, we count the number of elements we cut off at every step. This is given by the span of the “smallest” values along the way (indeed,
N - distance(it, end(sticks)) gives back the number of smallest elements).
Since the sequence is sorted, that particular application of
find_if should suggest you another pattern:
upper_bound. Basically, a binary search for the element greater than a given one:
upper_bound may actually do more iterations than
find_if. It’s interesting to think about that: if the array has unique elements (or a few duplicates),
upper_bound unnecessarily visits more elements (it splits the search space in two and does check elements more time than
find_if). On the other hand, if the array has a lot of duplicates, it can give some advantage since it can skip many of them (instead,
find_if visits all of them linearly).
When we search for the first element greater than
find_if, we just go through all ones first.
On the other hand,
upper_bound cuts the space in two, like:
a[mid] is equal than the wanted element (
1), it moves to the right hand side:
The same happens here then at the next iteration it finds
As you see, we visited less elements than
However, when we need to find the next element greater than
2, you can see that we visit more.
Anyway, this is just pure speculation. It should be measured since other “variables” ride on this topic (e.g. caching, CPU instructions).
This solution costs \(N \cdot log(N)\) overall, since the cost of sorting prevails.
Can we do better?
A linear solution can be found just by using a frequency table. Actually, we can obtain the same cost by using a linear sorting algorithm like counting or radix sort, using additional space.
In this case the domain of the problem is quite small (just 1000 elements) so setting up such an additional space costs - practically speaking - nothing.
The idea is very similar to the others discussed above:
- we fill up the frequency table with the number of occurrences of each input value
- we start with
count = N(we print it) and
idx = 0
- at every iteration, we search the table for the next (from
idx) non-zero value and we subtract such a value from
- we print count and move on
Basically, the frequency table turns the problem of “finding the next greater element” into “finding the next non-zero element”:
This solution is linear in time and constant in space. If the domain grows up, we should decide if paying the price of allocating more memory is affordable for our scenario.