Pair picker

See the original problem on HackerRank.

Solutions

The brute force solution has a quadratic time complexity and involves iterating through each value while checking if it forms a matching pair.

For example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
int N; cin >> N;
vector<int> cards(N);
copy_n(istream_iterator<int>(cin), N, begin(cards));
auto res = std::numeric_limits<int>::max();

for(auto i=0; i<N; ++i)
{
    for(auto j=i+1; j<N; ++j)
    {
        if(cards[i] == cards[j])
        {
            res = min(res, j-i+1);
        }
    }
}

cout << (res == std::numeric_limits<int>::max() ? -1 : res);

A more optimized, linear, solution involves keeping track of the positions of the last occurrence of each card value. As we iterate through the array, we calculate the minimum number of picks needed to form a matching pair. The interesting part of this solution lies in the choice of data structures used to memorize the last positions of card values, as this directly impacts the efficiency and simplicity of the solution.

For example, we might use a map (either tree-based or hash-based), such as:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
int N; cin >> N;
vector<int> cards(N);
copy_n(istream_iterator<int>(cin), N, begin(cards));

map<int, int> lastPos;
auto res = std::numeric_limits<int>::max();;
for (auto i=0; i<N; ++i)
{
    const auto card = cards[i];
    if (auto it = lastPos.find(card); it != end(lastPos))
    {
        res = min(res, i - it->second + 1);
    }
    lastPos[card] = i;
}

cout << (res == std::numeric_limits<int>::max() ? -1 : res);

Or directly an array, since the domain is quite limited (the card values span from \(0\) to \(10^6\)):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
int N; cin >> N;
 
array<int, 1000001> freq{};
auto res = std::numeric_limits<int>::max();

for(auto i=0; i<N; ++i)
{
    int card; cin >> card;
    if(freq[card])
    {
        res = std::min(res, i - freq[card] +  2);
    } 
    freq[card] = i+1;
}

cout << (res == std::numeric_limits<int>::max() ? -1 : res);

Regardless of the data structure choice, the solution is online, meaning it processes the input one element at a time without needing to store the entire card deck in memory. The space complexity only depends on the domain size, thus is “constant” - specifically, it corresponds to the number of unique card values that can appear in the input (\(10^6 + 1\)).

Here is a simple benchmark showing performance differences of the three main data structures proposed. Feel free to experiment with different input sizes and variations to gain deeper insights into the performance of each of the three solutions.

We've worked on this challenge in these gyms: modena 
comments powered by Disqus