As a part of my Python learning endeavor I have tossed up this fun script. It finds all anagrams in given dictionary in O(nlogn) time. The algorithm goes like this:
- Read dictionary from file.
- Create a copy of all words and store them in 2D array, along with originals.
- Sort letters in these copies of words.
- Sort 2D array by sorted letters of words. This is the O(nlogn) part.
- Words that consist of the same letters, anagrams, should now be adjacent.
- Traverse the array. If the same sorted letter “word” appears, the original word is anagram of it and it has been found.
Beside the algorithm, the script also features command line option parsing for telling it where is the dictionary file and where should it store anagrams. It does output a lot of words with just one switched letter. It could therefore be made better, if hamming distance was attached to them. You could sort them again in decreasing order of hamming distance, which would make more interesting anagrams come first in the list. I didn’t go there. Here is the basic script:
import argparse
import codecs
import sys
from operator import itemgetter
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file", help="dictionary file")
parser.add_argument("-o", "--output", help="output file")
args = parser.parse_args()
def append_to_file(filename,text):
if filename is None or filename == "":
return
file = codecs.open(filename, "a", "utf8")
file.write("{0}".format(text))
file.close()
def order (line):
chars = list(line)
chars.sort(key=itemgetter(0))
ordered = "".join(chars)
return ordered
# read from file and prepare
words = []
with open(args.file, "r") as f:
for line in f:
word = []
word.append(line.lower())
word.append(order(word[0]))
words.append(word)
# sort
words.sort(key=itemgetter(1))
anagram = False
for i in range(1,len(words)):
if (words[i-1][1] == words[i][1]) and (words[i-1][0] != words[i][0]): # if duplicates in dictionary
if anagram == False:
anagram = True
append_to_file(args.output,"----\n")
append_to_file(args.output,words[i-1][0])
append_to_file(args.output,words[i][0])
else:
anagram = False
I had some fun with Slovene dictionary. I made a whole sentence almost exclusively from anagrams:
Raketirajva akvaterarij zaradi anektiranja Karantanije. /Let’s rocket the aquaterrarium for annexing Carantania.
And a poem exclusively from anagrams! But frankly, it only vaguely means something:
Lagodni dolgani godalni dognali,
dolgina nadlogi gladino odgnali.
I don’t even dare to translate it. I’d be happy if anyone posted some of their interesting/funny/stupid results in comments ;)
Related articles
- Link Of The Week- Anagrams, They Never Lie! (musetracks.wordpress.com)
- PyCon 2013 Coding Challenge Roundup (thumbtack.com)
- C++ – Anagram Checker (bradenlenz.com)
- How to check if two String are Anagram in Java (javarevisited.blogspot.com)
- Meet The Master Anagrammist (mentalfloss.com)

Reddit
LinkedIn
RSS
Twitter
Google