Solutions

This problem can be solved in several ways, each carrying its own pros, cons and tradeoffs.

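All the solutions are based on this observation: the answer is always \(\max(6-n, 4-d)\), where \(n\) is the length of the string and \(d\) is the number of character types already present in the input password. Each added character can supply a missing type and count toward the minimum length at the same time, so the larger of the two shortfalls is enough.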
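As a minimal sketch of this observation, the formula can be written as a one-line function. The name MinimumAdditions is an assumption made for illustration; it does not appear in the solutions below.

```cpp
#include <algorithm>

// Hypothetical helper: n is the password length, d is the number of
// character types already present (0 to 4).
int MinimumAdditions(int n, int d)
{
    // one new char per missing type; each new char also adds to the length
    return std::max(6 - n, 4 - d);
}
```

For instance, "Ab1" has n = 3 and d = 3, so the result is max(3, 1) = 3. "#HackerRank" has n = 11 and d = 3 (no digit), so the result is max(-5, 1) = 1.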

Preamble

We call a character family the group of characters belonging to the same requirement. The problem requires checking against four families: digits, lowercase letters, uppercase letters, and special chars.

As usual, we can assume the problem won’t feed the input with other characters. This is the basic tradeoff.

One parameter of the problem that could change in a real environment is the number of characters required for each family. In this basic form of the problem it is always 1. This leads to solutions that very likely just need to set a boolean flag for each family. Flags could be turned into counters if such requirements were extended.

Another point is how to check whether a character belongs to a family: in this problem we could even use the standard functions isdigit, isupper, and islower because we know they cover all the cases except the special chars. Again, in a real environment this simplification may not hold and must be handled accordingly.
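A possible sketch of such a check built on the standard functions follows. The name CharFamilyStd and the numeric ids (0 for digits, 1 for lowercase, 2 for uppercase, 3 for special) are assumptions made for illustration, and the classification relies on the default "C" locale.

```cpp
#include <cctype>

// Hypothetical family lookup based on <cctype>.
int CharFamilyStd(char c)
{
    // the cast avoids undefined behavior for negative char values
    const auto uc = static_cast<unsigned char>(c);
    if (std::isdigit(uc)) return 0;
    if (std::islower(uc)) return 1;
    if (std::isupper(uc)) return 2;
    return 3; // assumption: valid input, so anything else is a special char
}
```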

Variations discussed

changing the content of a family at runtime (e.g. reading it from file)
adding/removing families at runtime (e.g. “less-strong password” not requiring special_characters)
different number of occurrences per family (e.g. 2 lowercase, 3 special_characters)

Range check + bit mask

Very likely, the most efficient solution fully exploits the data and structure of the problem.

Considerations:

we don’t need to dynamically add families,
we don’t need to dynamically change families,
digits, lowercase chars and uppercase chars are contiguous in the ASCII table.

Given that, we can easily map a char to its family this way:

int CharFamily(char c) 
{
   if(c >= '0' && c <= '9') return 0;
   if(c >= 'a' && c <= 'z') return 1;
   if(c >= 'A' && c <= 'Z') return 2;
   return 3; // the input is assumed valid, no need to check
}

Tradeoff: we are exploiting the structure of the ASCII table because we just need to check contiguous chars. In addition, we know that if the char is not a digit, a lowercase letter, or an uppercase letter, then it must be a special char.

The main point of this solution is the contract between CharFamily and the bitmask.

Having such mapping from chars to families, we can set a flag whenever a char of a certain family is found. We can use a bit mask. In C++ we can exploit the simplicity of bitset:

int n; cin >> n; // password length, given before the password itself
char c;
bitset<4> mask;
while (cin >> c)
{
	mask.set(CharFamily(c));
}
cout << max(6-n, 4-(int)mask.count());

Pros:

fast
compact
tiny and constant extra space (bitset)

Cons (futurability/extensibility warnings):

cannot change an existing family at runtime
cannot add/remove families at runtime
need to handle if we change the number of required chars (e.g. 2 lowercase, 3 uppercase)

The first con is easy to handle: we need to turn CharFamily into a more flexible function, or we can design a proper class resolving the family id. Such a customization point can be implemented in terms of other methods, such as:

library functions (e.g. isdigit, islower, isupper)
regular expressions (generally slower)
linear (or binary) search on fixed strings (as we’ll do later)
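As an illustration of the last option, here is a minimal sketch of a family resolver based on binary search. The name ResolveFamily, the use of the vector index as family id, and the -1 result for unknown characters are assumptions. Each family string must be sorted, so the special characters would be stored as "!#$%&()*+-@^".

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical resolver: the family id is the index of the first
// (sorted) family string that contains the character.
int ResolveFamily(char c, const std::vector<std::string>& sortedFamilies)
{
    for (std::size_t id = 0; id < sortedFamilies.size(); ++id)
    {
        const auto& family = sortedFamilies[id];
        // binary_search requires the family to be sorted
        if (std::binary_search(family.begin(), family.end(), c))
            return static_cast<int>(id);
    }
    return -1; // the character belongs to no known family
}
```

Since the families are plain data, they could be changed at runtime without touching the function.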

The final con can be mitigated, for instance, by using an array which stores the frequency of each char family. The final check should be changed accordingly.

Instead of a bitmask, we can use an array of integers, even if we are just flagging them. In Rust this makes the final sum straightforward, since booleans cannot be summed directly.

use std::cmp::max;

fn strong_password(password: &str) -> usize {
    let missing_families = password
        .chars()
        .fold([1, 1, 1, 1], |[dig, low, upp, spe], c| match c {
            '0'..='9' => [dig & 0, low, upp, spe],
            'a'..='z' => [dig, low & 0, upp, spe],
            'A'..='Z' => [dig, low, upp & 0, spe],
            _ => [dig, low, upp, spe & 0],
        })
        .iter()
        .sum::<i32>();

    max(6 - (password.len() as i32), missing_families) as usize
}

Dynamic bitmask

Instead of using a bitset, we can use an integer (e.g. a 32-bit or 64-bit int or uint) to handle more families dynamically. For instance, a 32-bit int supports up to 32 families. The rest of the code should be adapted accordingly: family ids will be mapped to powers of two.

In this case we have to change CharFamily accordingly.

If we need more than 64 families (64bit ints), we could use a dynamic bitset and, as before, count how many flags are set. In C++, although vector<bool> has many issues, it would be ok here. In other languages we have proper implementations, like C#’s BitArray.
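A minimal sketch of the integer-based variant could look like the following. The names CharFamilyMask and MissingFamilies are assumptions made for illustration, and std::bitset is used only as a portable way to count the bits that are set.

```cpp
#include <bitset>
#include <cstdint>
#include <string>

// Hypothetical variant of CharFamily returning one power of two per family.
std::uint32_t CharFamilyMask(char c)
{
    if (c >= '0' && c <= '9') return 1u << 0;
    if (c >= 'a' && c <= 'z') return 1u << 1;
    if (c >= 'A' && c <= 'Z') return 1u << 2;
    return 1u << 3; // assumption: valid input, so anything else is special
}

// Returns how many of the `families` families do not occur in the password.
int MissingFamilies(const std::string& password, int families)
{
    std::uint32_t mask = 0;
    for (char c : password)
        mask |= CharFamilyMask(c); // flag the family of each char
    const int present = static_cast<int>(std::bitset<32>(mask).count());
    return families - present;
}
```

With this shape, adding a family means reserving one more bit and passing a larger family count.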

Array-based flexible number of chars per family

In this solution we still have fixed families (e.g. lower_case is fixed), but we support configuring the number of expected occurrences for each family by using a vector called expectedFreq, whose i-th element is the number of chars that the i-th family requires:

const int families = 4;
vector<int> freq(families);
vector<int> expectedFreq = {1, 1, 1, 1};

int n; cin >> n; // password length, given before the password itself
char c;
while (cin >> c)
{
	freq[CharFamily(c)]++;
}

auto charsToAdd = 0;
for (auto i=0; i<families; ++i)
{
	auto diff = expectedFreq[i] - freq[i];
	if (diff > 0)
		charsToAdd += diff;
}

cout << max(6-n, charsToAdd);

The astute reader will turn the second loop into a zip-map-filter-reduce combination.

This code could read expectedFreq dynamically. In addition, it’s independent from each family: the contract is on the “family id”, that has to be contiguously mapped to expectedFreq.
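Going back to the second loop, one way to express it as a zip-map-reduce is std::inner_product with custom operations. This is only a sketch: the function name CharsToAdd is an assumption, and freq is assumed to have at least as many elements as expectedFreq.

```cpp
#include <algorithm>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: sums, over all families, the occurrences still missing.
int CharsToAdd(const std::vector<int>& expectedFreq, const std::vector<int>& freq)
{
    // inner_product walks the two ranges in lockstep (zip), maps each pair
    // to its shortfall clamped at zero, and reduces the results with a sum
    return std::inner_product(
        expectedFreq.begin(), expectedFreq.end(), freq.begin(), 0,
        std::plus<>{},
        [](int expected, int actual) { return std::max(0, expected - actual); });
}
```

Clamping at zero plays the role of the filter step: families that already meet their requirement contribute nothing.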

Pros:

customizable

Cons:

CharFamily has to output contiguous, increasing family ids (strong coupling with expectedFreq)

The following solutions will take into account the problem of managing families dynamically.

A bit of flexible families

We can easily elaborate a solution where customizing a family is quick, even at runtime:

int len; cin >> len;
string password; cin >> password;

// imagine reading the following from somewhere
// and possibly changing them dynamically
string numbers {"0123456789"}; 
string lower_case = {"abcdefghijklmnopqrstuvwxyz"};
string upper_case = {"ABCDEFGHIJKLMNOPQRSTUVWXYZ"};
string special_characters = {"!@#$%^&*()-+"};

auto d = 
	(string::npos==password.find_first_of(numbers)) + 
	(string::npos==password.find_first_of(lower_case)) +
	(string::npos==password.find_first_of(upper_case)) +
	(string::npos==password.find_first_of(special_characters));
    
cout << max(6-len, d);

(We could even use smarter search methods, e.g. Boyer-Moore.)

Pros:

compact
not using family id
small (but not as small as before) extra space (families)
can use (fixed) dynamic families (e.g. customization of “special_characters”)

Cons (futurability/extensibility warnings):

cannot add/remove families at runtime
need to handle if we change the number of required chars (e.g. 2 lowercase, 3 uppercase)

Tackling the former con is easy: we could put the families into an array and perform “find_first_of” on each element (we can even read such an array dynamically).
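A minimal sketch of this idea, assuming the families are kept in a vector of strings (the function name CountMissingFamilies is hypothetical):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: counts the families with no character in the password.
int CountMissingFamilies(const std::string& password,
                         const std::vector<std::string>& families)
{
    return static_cast<int>(std::count_if(
        families.begin(), families.end(),
        [&](const std::string& family) {
            // npos means no char of this family occurs in the password
            return password.find_first_of(family) == std::string::npos;
        }));
}
```

The answer would then be the maximum between this count and 6 minus the password length, and the vector can be filled or changed at runtime.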

To accommodate the second possible requirement we should pretty much rewrite the algorithm because “find_first_of” does not fit anymore. In that case we have to count the occurrences of chars for each family (e.g. turning the families into a set and performing an intersection or a count operation).
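One possible shape for the counting variant is sketched below. It assumes each family is paired with its minimum number of occurrences; the function name CharsMissing is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns how many characters are still missing
// overall, given (family, minimum occurrences) pairs.
int CharsMissing(const std::string& password,
                 const std::vector<std::pair<std::string, int>>& families)
{
    int missing = 0;
    for (const auto& family : families)
    {
        const std::string& chars = family.first;
        // count the password characters that belong to this family
        const auto present = std::count_if(
            password.begin(), password.end(),
            [&chars](char c) { return chars.find(c) != std::string::npos; });
        missing += std::max(0, family.second - static_cast<int>(present));
    }
    return missing;
}
```

For example, with "ab1" and requirements of 1 digit, 3 lowercase and 2 uppercase chars, the shortfalls are 0, 1 and 2, so 3 characters are missing.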

If we really need a lot of flexibility, the next solution is able to handle all the previously discussed new possible requirements.

Fine-grained flexible families

In this solution, we can control families at char-level and we have full flexibility over the family ids (e.g. they are not coupled with an array index).

The basic idea is to put every char into a map with its family id. Also, it gives us control over the number of occurrences for each family, by using another map.

string numbers = "0123456789";
string lower_case = "abcdefghijklmnopqrstuvwxyz";
string upper_case = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
string special = "!@#$%^&*()-+";

map<char, int> charToFamilyId;

// we can even fill this map from a configuration
for(auto c: numbers)
  charToFamilyId[c] = 0;
for(auto c: lower_case)
  charToFamilyId[c] = 1;
for(auto c: upper_case)
  charToFamilyId[c] = 2;
for(auto c: special)
  charToFamilyId[c] = 3;
  
// another customization point
map<int, int> familyMinimumOcc = 
{
	{0, 1},
	{1, 1},
	{2, 1},
	{3, 1},
};

int n; cin >> n;
char c;
map<int, int> famOcc;
while (cin >> c)
  famOcc[charToFamilyId[c]]++;

auto charsToAdd = 0;
for (const auto& expected : familyMinimumOcc)
{
	auto actualOcc = famOcc[expected.first];
	if (actualOcc < expected.second)
		charsToAdd += expected.second - actualOcc; // only the missing part
}

cout << max(6 - n, charsToAdd);

Pros:

highly customizable
family-id can be anything (not necessarily an increasing index)

Cons:

more complex
slower than other solutions (just a little)
requires more space (but it’s constant)

Another similar approach consists in combining each family with the expected minimum number of occurrences:

int passLen; cin >> passLen;
string password; cin >> password;

// each family combined with the expected minimum number of occurrences
// this is a customization point
vector<std::pair<string, int>> families = {
    {"0123456789", 1},
    {"abcdefghijklmnopqrstuvwxyz", 1},
    {"ABCDEFGHIJKLMNOPQRSTUVWXYZ", 1},
    {"!@#$%^&*()-+", 1}
};

for (auto c : password)
{
    // which family does the character belong to?
    const auto charFamilyIt = find_if(begin(families), end(families), [=](const auto& family){
           return family.first.find(c) != string::npos; 
    });
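    // skip characters that belong to no family instead of dereferencing end()
    if (charFamilyIt != end(families))
        charFamilyIt->second--;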
}

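// sum the occurrences still missing (std::accumulate is in <numeric>)
const int toAddCount = std::accumulate(begin(families), end(families), 0, [](int acc, const auto& family){
        return acc + std::max(0, family.second);
});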
cout << max(toAddCount, 6-passLen);

Pros:

highly customizable
simple

Cons:

slower than other solutions, as each character must be searched for in each family, linearly
requires more space (but it’s constant)

Set intersection

Another interesting solution worth sharing:

from string import digits, ascii_lowercase, ascii_uppercase
input()  # the password length is not needed
password = input()
print(max(6 - len(password), sum(not (set(st) & set(password)) for st in (digits, ascii_lowercase, ascii_uppercase, "!@#$%^&*()-+"))))

We accumulate how many times the intersection between a family and the password is empty.

Accommodating possible dynamic requirements discussed previously is trivial except for configuring how many characters are needed for each family (that requires using a strategy for counting occurrences, like one of those shown previously).

Conclusions

This problem has opened up interesting discussions on customization points that can be added to the algorithm. The spirit of such exercises is to understand pros, cons and tradeoffs of the different implementations.
