This page has forms you can use to submit a word to a Java servlet and get back a list of words that might have a similar pronunciation. This list might be useful if you are looking for alternate spellings. I originally used this algorithm in an application that helped people transcribing legal documents find consistent spellings for the names of people mentioned in those documents.
The original metaphone algorithm was published by Lawrence Philips in an article entitled "Hanging on the Metaphone" in the journal Computer Language v7 n12, December 1990, pp39-43. His algorithm - translated into Java, and with minor tweaks - is what we are using here. Naturally, a phonetic encoding system has to assume a particular language and culture. Here we are using essentially American English. Click here to download the encoding class.
The Apache Software Foundation "Commons" project has implemented metaphone and also an improved version called "double metaphone." You can download the Codec toolkit from this site.
The word lists come from the "Moby Words" project - any search engine will locate more information on this project. Although the people and place names lists contain many thousands of names, there are some surprising omissions.
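One plausible way to turn a word list into a "sounds like" lookup is to group the words by their phonetic code. The sketch below is only an illustration, not the servlet's actual code: the class name SoundAlikeIndex and its methods are hypothetical, and the encoder is passed in as a function so that any metaphone implementation could be plugged in.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: groups words by phonetic code so that a query
// word can be matched against every listed word with the same code.
final class SoundAlikeIndex {
    private final Map<String, List<String>> wordsByCode = new HashMap<>();
    private final Function<String, String> encoder;

    // encoder is assumed to map a word to its phonetic code.
    SoundAlikeIndex(Function<String, String> encoder, Collection<String> words) {
        this.encoder = encoder;
        for (String word : words) {
            wordsByCode
                .computeIfAbsent(encoder.apply(word), code -> new ArrayList<>())
                .add(word);
        }
    }

    // Returns the listed words whose code equals the query's code,
    // in the order they were added; empty if there is no match.
    List<String> soundAlikes(String query) {
        List<String> matches =
            wordsByCode.getOrDefault(encoder.apply(query), Collections.emptyList());
        return Collections.unmodifiableList(matches);
    }
}
```

With a real encoder, a query for one spelling of a name would return every listed spelling that shares its code.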
Metaphone codes use these 16 consonant symbols:
B X S K J T F H L M N P R 0 W Y
The "0" is a zero, not the letter O. It represents the 'th' sound.
Transformations
Metaphone uses the following transformation rules:
Doubled letters except "c" -> drop 2nd letter.
Vowels are only kept when they are the first letter.
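As a rough illustration of these two general rules, here is a minimal Java sketch. The class and method names are made up for this example and are not taken from the downloadable encoding class. Note that the vowel rule is shown as a test rather than a filter, because the consonant rules still need to see the vowels that follow each letter.

```java
import java.util.Locale;

// Hypothetical helpers for the two general rules.
final class MetaphonePrePass {
    // Drops any letter that repeats the letter before it, unless it is C.
    static String dropDoubledLetters(String word) {
        String upper = word.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder(upper.length());
        for (int i = 0; i < upper.length(); i++) {
            char c = upper.charAt(i);
            boolean repeatsPrevious = i > 0 && upper.charAt(i - 1) == c;
            if (repeatsPrevious && c != 'C') {
                continue; // skip the second letter of the pair
            }
            out.append(c);
        }
        return out.toString();
    }

    // True only when the letter at index is a vowel in first position,
    // which is the only case where a vowel appears in the code.
    static boolean isKeptVowel(String word, int index) {
        char c = Character.toUpperCase(word.charAt(index));
        return index == 0 && "AEIOU".indexOf(c) >= 0;
    }
}
```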
B -> B unless at the end of a word after "m" as in "dumb"
C -> X (sh) if -cia- or -ch-
     S if -ci-, -ce- or -cy-
     K otherwise, including -sch-
D -> J if in -dge-, -dgy- or -dgi-
     T otherwise
F -> F
G -> silent if in -gh- and not at end or before a vowel,
     or in -gn- or -gned- (also see -dge- etc. above)
     J if before i or e or y if not double gg
     K otherwise
H -> silent if after vowel and no vowel follows
     H otherwise
J -> J
K -> silent if after "c"
     K otherwise
L -> L
M -> M
N -> N
P -> F if before "h"
     P otherwise
Q -> K
R -> R
S -> X (sh) if before "h" or in -sio- or -sia-
     S otherwise
T -> X (sh) if -tia- or -tio-
     0 (th) if before "h"
     silent if in -tch-
     T otherwise
V -> F
W -> silent if not followed by a vowel
     W if followed by a vowel
X -> KS
Y -> silent if not followed by a vowel
     Y if followed by a vowel
Z -> S
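Most of these rules depend on the letters around the one being encoded, so the order in which the cases are tested matters. As an example, here is a minimal sketch of the rule for C alone. The names encodeC and matchesAt are hypothetical, the word is assumed to be upper case already, and a complete encoder would also have to skip the letters that a matched pattern consumes.

```java
// Hypothetical helper showing the C rule in isolation.
final class MetaphoneCRule {
    // True when pattern occurs in word starting at position start.
    private static boolean matchesAt(String word, int start, String pattern) {
        return start >= 0 && word.startsWith(pattern, start);
    }

    // Returns the code for the C at index i of an upper-case word.
    static char encodeC(String word, int i) {
        // -sch- is tested first because it overrides the -ch- case.
        if (matchesAt(word, i - 1, "SCH")) {
            return 'K';
        }
        // -cia- is tested before -ci- for the same reason.
        if (matchesAt(word, i, "CIA") || matchesAt(word, i, "CH")) {
            return 'X'; // the "sh" sound
        }
        if (matchesAt(word, i, "CI") || matchesAt(word, i, "CE") || matchesAt(word, i, "CY")) {
            return 'S';
        }
        return 'K';
    }
}
```

The other context-sensitive letters (D, G, H, K, P, S, T, W and Y) could be handled in the same style, one small method per letter.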
Initial Letter Exceptions
Initial kn-, gn-, pn-, ae- or wr- -> drop first letter
Initial x- -> change to "s"
Initial wh- -> change to "w"
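The initial letter exceptions can be applied once, before the main transformation loop runs. The following is a minimal sketch under that assumption. The method name applyInitialExceptions is hypothetical, and the drop-first-letter prefixes are taken to be kn-, gn-, pn-, ae- and wr-, as in Philips's published rules.

```java
import java.util.Locale;

// Hypothetical helper for the initial letter exceptions.
final class MetaphoneInitial {
    private static final String[] DROP_FIRST = {"KN", "GN", "PN", "AE", "WR"};

    // Returns the upper-cased word with its first letters adjusted.
    static String applyInitialExceptions(String word) {
        String w = word.toUpperCase(Locale.ROOT);
        for (String prefix : DROP_FIRST) {
            if (w.startsWith(prefix)) {
                return w.substring(1); // drop only the first letter
            }
        }
        if (w.startsWith("X")) {
            return "S" + w.substring(1); // initial x- becomes s-
        }
        if (w.startsWith("WH")) {
            return "W" + w.substring(2); // initial wh- becomes w-
        }
        return w;
    }
}
```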
The code is truncated at 4 characters in this example, but more could be used.
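Truncation is simply a cut at a maximum length, which could be made a parameter. Here is a minimal sketch. The method name truncate and the maxLength parameter are assumptions for illustration only.

```java
// Hypothetical helper: limits a phonetic code to maxLength characters.
final class MetaphoneTruncate {
    static String truncate(String code, int maxLength) {
        if (maxLength < 0) {
            throw new IllegalArgumentException("maxLength must not be negative");
        }
        // Codes that are already short enough are returned unchanged.
        return code.length() <= maxLength ? code : code.substring(0, maxLength);
    }
}
```

A longer limit keeps more of each word's sound and so matches fewer words. A shorter limit groups more words together.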