src.group_anagrams

Classes

Solution()

class src.group_anagrams.Solution[source][source]
groupAnagrams(strs: list[str]) list[list[str]][source]

Thought process

  • res stores word frequency -> list of words

  • defaultdict(list) is a dict that initalizes to list, avoids edge case of key not in dict

  • We iterate through the words

  • We create a count of the frequency of each character in the word

  • We convert the count to a tuple to make it hashable (potentially unecessary in other langs)

  • ord(c) - ord('a') maps the ascii value of the character to a number between 0-25, this gives us the index to store the count

  • count[ord(c) - ord('a')] += 1 increments the count of the character

Notes

  • time complexity: \(O(m\times n)\) where \(m\) is the number of words and \(n\) is average the length of the word

  • space complexity: \(O(m)\)

    Interestingly enough, this solution in practice actually appears to be slower than the solution using sorting() honestly im not too sure why. One reason a friend of mine come up with is that for short words the complexity reduces to \(O(n\log n)\) though in testing it seems even with considerably large words sorting seems to be faster, additionally this seems to hold for some other languages. For python the algorithm used past 3.11 is nearly optimal mergesort. If anyone knows why its seemingly faster than counting do tell.

groupAnagramsSorting(strs: list[str]) list[list[str]][source]

Thought process

  • Similarly for the basic anagram problem, if we sort the words we can group them together

  • We can use a dictionary to store the sorted word -> list of words

Notes

  • time complexity: \(O(m\times n\log n)\) where \(m\) is the number of words and \(n\) is average the length of the word

  • space complexity: \(O(m)\)

groupAnagramsFast(strs: list[str]) list[list[str]][source]

Thought process

  • We can cache the key creation function to avoid recomputing the key

  • Rest of the code is the same as before