diofeher

Solving anagrams problem using python

Hey! I've been studying algorithms this week and I found the problem below:

Devise a function that gets one parameter 'w' and returns all the anagrams for 'w' from the file wl.txt. anagrams("horse") should return: ['heros', 'horse', 'shore']

First let's start with "Anagram" concept: An anagram is a type of word play, the result of rearranging the letters of a word or phrase to produce a new word or phrase, using all the original letters exactly once; for example orchestra can be rearranged into carthorse.

To parse the file list I used:


with open('wl.txt', 'r') as f:
    word_list = [word.strip('\n') for word in f.readlines()]

My first thought was to use itertools.permutations to generate all the possibilities and check if they were in word_list.


import itertools
def anagrams(w):
    result = []
    for i in itertools.permutations(w):
        # itertools.permutations returns a list with tuples of all the 
        # combination [('h', 'o', 'r', 's', 'e'), ('h', 'o', 'r', 'e', 's')...]
        new_word = ''.join(i)
        if new_word in word_list:
            results.append(new_word)
    return results

anagrams('horse')

The code worked fine but it was still slow. So I decided to ask to some friends and one came with "Anagram is an ordering difference! With some insight you can see that, so you have to sort both words and compare.". Man, that was exactly the point I was missing. I made some benchmarks with the new code and cut 50% of the time spent. Here is the solution:


def anagrams(w):
    result = []
    word_letters = sorted(list(w))
    for word in word_list:
        if sorted(word) == word_letters:
            result.append(word)
    return result

anagrams('horse')
comments powered by Disqus