LeetCode 75 - Part 5 - Hash Map / Set
Intent
Confession: I’ve never been comfortable with DSA, Coding Problems or Competitive Programming.
And so this is one more attempt to change that.
LeetCode 75. Two problems a day, starting April 25, 2024.
To further motivate consistency and completion of this, I’ve decided I will do write-ups for all of these problems and perhaps publish them on my blog. Let’s see.
This is Part 5 - Hash Map / Set.
5.1 Find the Difference of Two Arrays
Question
Given two arrays, find both set differences of the two arrays (A-B and B-A).
Solution
We can find a solution for A-B, and use the same one for B-A.
A naive way of finding set difference is to search through the subtrahend B for each element in the minuend A (are these the correct terms to use here?). If we don’t find a match, that element from A is part of the result set A-B. Note that we need to return a set as the result, so it cannot have duplicates, so we need to deduplicate the result we find from the naive search too (sorting or hashset). This would have a time complexity of O(N*M)
.
We can use a hash set to store the unique occurences of B, and then iterate through A, checking for a match in B’s hash set. If we don’t find a match, we add it to the result hash set A-B. This gives us an O(N+M)
time solution with O(M)
auxiliary space complexity (A-B is part of the result).
We can also sort B, and then iterate through A to find matches in a sorted B using binary search. Each search would be O(logM)
, so overall time complexity would be O(N*logM + M*logM) = O((N+M)logM)
(NlogM
for searching A’s elements, and MlogM
for sorting B). Again deduplication would be required (or use hash set of A to begin with).
|
|
Fancy C++:
|
|
We can also use a procedure like the merge in Merge-sort. Good for practice with pointers and procedurals :dizzy_face: but i don’t think this version is very readable:
|
|
In other languages:
Python
|
|
Lol
JavaScript
|
|
Apparently the difference
method of JS’s Set
has limited availability.
C#
|
|
Java
|
|
5.2 Unique Occurrences
Question
Given an array of integers, check whether number of occurences of each value in the array is unique or not.
Solution
We need to check if the frequencies of each value are unique or not. To find the frequencies, we can use a hash map, and then to find if all these frequencies are unique or not, we can use a hash set and check if the size of this hash set of frequencies is the same as the frequency hash map. We can also sort the frequencies and then check if each one is unique or not. This would save some space, but will not change the space complexity as we’re already creating a hash map of frequencies (O(N)
). Also, just cleaner?
|
|
5.3 Determine if Two Strings Are Close
Question
From LeetCode:
Two strings are considered close if you can attain one from the other using the following operations:
- Operation 1: Swap any two existing characters.
- For example,
abcde -> aecdb
- Operation 2: Transform every occurrence of one existing character into another existing character, and do the same with the other character.
- For example,
aacabb -> bbcbaa
(alla
's turn intob
's, and allb
's turn intoa
's) You can use the operations on either string as many times as necessary.
Given two strings,word1
andword2
, returntrue
ifword1
andword2
are close, andfalse
otherwise.
Solution
After trying out some examples, we can see that this problem can be reduced to finding whether the list of frequencies of characters in the two strings are the same and if every character present in one string is present in the other. Note that the frequency of a particular character in both strings need not match - rather it is the list of frequencies of the characters (decoupled from the characters themselves (since they can be transformed into others)) that must match. We can use a hash map to find frequencies in each string (here a 26-length array is fine since only lowercase alphabets are present in the strings).
|
|
5.4 Equal Row and Column Pairs
Question
Given an $N \times N$ matrix, find the number of pairs (row, column) where the row == column (element-wise).
Solution
The brute-force solution will involve checking each column for each row and comparing the two. A comparison would take O(N)
time and to iterate through each candidate pair would take O(N^2)
time, bringing the overall time complexity of this approach to O(N^3)
.
|
|
We need to somehow store the knowledge of a row in a data structure so that we can check if a column matches with it. This is a use case for hashing. We need to hash each row and store its hash in a hash map (the value being the frequency of such a row), and then iterate through each column, check whether a column’s hash exists in the map, and add to the count of result pairs accordingly. Now the question is - how to hash a row (or in general, an array or vector)?
Some solutions suggest using serialization to string or JSON for the row - then this serialized string becomes the hash key. The calculation and comparison of these hashes is O(N)
, so for N
rows, creating the hash map would be O(N^2)
and checking for N columns would again be O(N^2)
. This is better than the naive O(N^3)
solution, and takes O(N^2)
space (since each row could be unique, so N serialized arrays of size N would be stored as keys).
Hashing strings should hint towards tries! Instead of storing each array by serializing it, we could instead use a trie, where each node corresponds to an element in the arrays. The root node denotes an empty array and each node has a children
hash map which points to its children using the elements as keys. In the leaf nodes, we store the count of such arrays (number of rows). Then we try finding each column in the trie, and if we do find a match till the leaf node, we update the result count of matching row-column pairs with the count at the leaf node. This has the same time complexity O(N^2)
and space complexity O(N^2)
but is still more space-efficient than the hash map using serialization (lot of subarrays can be repeated in the keys).
|
|