How to find anagrams in Python

As a part of my Python learning endeavor I have tossed up this fun script. It finds all anagrams in given dictionary in O(nlogn) time. The algorithm goes like this:

  1. Read dictionary from file.
  2. Create a copy of all words and store them in 2D array, along with originals.
  3. Sort letters in these copies of words.
  4. Sort 2D array by sorted letters of words. This is the O(nlogn) part. Words that consist of the same letters, anagrams, should now be adjacent.
  5. Traverse the array. If the same sorted letter "word" appears, the original word is anagram of it and it has been found.

Beside the algorithm, the script also features command line option parsing for telling it where is the dictionary file and where should it store anagrams. It does output a lot of words with just one switched letter. It could therefore be made better, if hamming distance was attached to them. You could sort them again in decreasing order of hamming distance, which would make more interesting anagrams come first in the list. I didn't go there. Here is the basic script:


import argparse
import codecs
import sys
from operator import itemgetter

parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file", help="dictionary file")
parser.add_argument("-o", "--output", help="output file")
args = parser.parse_args()

def append_to_file(filename,text):
  if filename is None or filename == "":
    return
  file = codecs.open(filename, "a", "utf8")
  file.write("{0}".format(text))
  file.close()

def order (line):
  chars = list(line)
  chars.sort(key=itemgetter(0))
  ordered = "".join(chars)
  return ordered

# read from file and prepare
words = []
with open(args.file, "r") as f:
for line in f:
  word = []
  word.append(line.lower())
  word.append(order(word[0]))
  words.append(word)

# sort
words.sort(key=itemgetter(1))
anagram = False
for i in range(1,len(words)):
  if (words[i-1][1] == words[i][1]) and (words[i-1][0] != words[i][0]): # if duplicates in dictionary
    if anagram == False:
      anagram = True
    append_to_file(args.output,"----\n")
    append_to_file(args.output,words[i-1][0])
    append_to_file(args.output,words[i][0])
  else:
    anagram = False


Previous: Beginner's view on Python
Next: Cosmic ray threat?