A Better Phonetic Lookup

This applet uses the "Metaphone" phonetic code algorithm described by Lawrence Philips in the December 1990 issue of Computer Language. This algorithm produces better matches than the Soundex algorithm. An input word is reduced to a 1 to 4 character code using relatively simple phonetic rules for typical spoken English.
Type a word into the test word field and press return or click on the Calculate button to see the resulting phonetic code.
To test phonetic lookup based on this code, choose one of the Word Sources. This causes a file of words to be read from the server. Each word is placed in a lookup class that calculates its phonetic code on the fly and uses it as the key. Once a word source is loaded, any words with the same phonetic code as the word typed in the test field are displayed in the Matches text area.
The women's names, men's names, and place names files are from Grady Ward's "Moby Words" collection, which he has placed in the public domain. These files and many more are available at:
http://fortis.speech.su.oz.au/comp.speech/Section1/Lexical/moby.html
http://www.dcs.shef.ac.uk/research/ilash/Moby/
ftp://ftp.dcs.shef.ac.uk/share/ilash/Moby/

The Metaphone Rules

Metaphone reduces the alphabet to 16 consonant sounds:
B X S K J T F H L M N P R 0 W Y
The 0 is a zero, not the letter O; it represents the 'th' sound.

Transformations

Metaphone uses the following transformation rules:
Doubled letters except "c" -> drop 2nd letter. Vowels are only kept when they are the first letter.
B -> B   unless at the end of a word after "m" as in "dumb"
C -> X   (sh) if in -cia- or -ch-
     S   if -ci-, -ce- or -cy-
     K   otherwise, including -sch-
D -> J   if in -dge-, -dgy- or -dgi-
     T   otherwise
F -> F
G ->     silent if in -gh- and not at the end of the word or before a vowel
         silent in -gn- or -gned- (also see -dge- etc. under D above)
     J   if before "i", "e" or "y", unless in a doubled "gg"
     K   otherwise
H ->     silent if after vowel and no vowel follows
     H   otherwise
J -> J
K ->     silent if after "c"
     K   otherwise
L -> L   
M -> M
N -> N
P -> F   if before "h"
     P   otherwise
Q -> K
R -> R
S -> X   (sh) if before "h" or in -sio- or -sia-
     S   otherwise
T -> X   (sh) if -tia- or -tio-
     0   (th) if before "h"
         silent if in -tch-
     T   otherwise
V -> F
W ->     silent if not followed by a vowel
     W   if followed by a vowel
X -> KS
Y ->     silent if not followed by a vowel
     Y   if followed by a vowel
Z -> S 
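
As a rough illustration, the per-letter rules above could be sketched in Java as follows. This is one reading of the table, not the applet's source. The names MetaphoneSketch and transform are made up for this example. The -gh- rule is read as "silent unless the gh ends the word or precedes a vowel", and H is assumed to be silent inside ch, gh, ph, sh and th, which the table does not state explicitly. Initial-letter exceptions and truncation are not applied here.

```java
final class MetaphoneSketch {
    private static boolean isVowel(char c) {
        return "AEIOU".indexOf(c) >= 0;
    }

    // Returns the character at index i, or '\0' when i is out of range.
    private static char at(String w, int i) {
        return (i >= 0 && i < w.length()) ? w.charAt(i) : '\0';
    }

    /** Applies the per-letter rules only (hypothetical helper, not the applet's code). */
    static String transform(String word) {
        String w = word.toUpperCase();
        int n = w.length();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < n; i++) {
            char c = w.charAt(i), prev = at(w, i - 1), next = at(w, i + 1);
            if (c == prev && c != 'C') continue;           // doubled letter: drop the 2nd
            if (isVowel(c)) {                              // vowels kept only when first
                if (i == 0) out.append(c);
                continue;
            }
            boolean soft = next == 'I' || next == 'E' || next == 'Y';
            switch (c) {
                case 'B':
                    if (!(prev == 'M' && i == n - 1)) out.append('B');
                    break;
                case 'C':
                    if (prev == 'S' && next == 'H') out.append('K');      // -sch-
                    else if (w.startsWith("CIA", i) || next == 'H') out.append('X');
                    else if (soft) out.append('S');
                    else out.append('K');
                    break;
                case 'D':
                    if (next == 'G' && "EYI".indexOf(at(w, i + 2)) >= 0) out.append('J');
                    else out.append('T');
                    break;
                case 'G':
                    if (next == 'H') {
                        // Assumed reading: -gh- is kept only at the end or before a vowel.
                        if (i + 2 == n || isVowel(at(w, i + 2))) out.append('K');
                    } else if (next == 'N') {
                        // silent in -gn- and -gned-
                    } else if (prev == 'D' && soft) {
                        // already coded as J by the -dge- rule under D
                    } else if (soft) out.append('J');
                    else out.append('K');
                    break;
                case 'H':
                    if ("CGPST".indexOf(prev) >= 0) break;       // assumption: silent in digraphs
                    if (isVowel(prev) && !isVowel(next)) break;  // after vowel, no vowel follows
                    out.append('H');
                    break;
                case 'K': if (prev != 'C') out.append('K'); break;
                case 'P': out.append(next == 'H' ? 'F' : 'P'); break;
                case 'Q': out.append('K'); break;
                case 'S':
                    if (next == 'H' || w.startsWith("SIO", i) || w.startsWith("SIA", i)) out.append('X');
                    else out.append('S');
                    break;
                case 'T':
                    if (w.startsWith("TIA", i) || w.startsWith("TIO", i)) out.append('X');
                    else if (next == 'H') out.append('0');       // zero stands for 'th'
                    else if (!w.startsWith("TCH", i)) out.append('T');
                    break;
                case 'V': out.append('F'); break;
                case 'W': case 'Y':
                    if (isVowel(next)) out.append(c);
                    break;
                case 'X': out.append("KS"); break;
                case 'Z': out.append('S'); break;
                default:
                    if ("FJLMNR".indexOf(c) >= 0) out.append(c); // letters that map to themselves
            }
        }
        return out.toString();
    }
}
```

Under this reading, MetaphoneSketch.transform("dumb") gives TM, transform("church") gives XRX, transform("night") gives NT and transform("accept") gives AKSPT.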

Initial Letter Exceptions

Initial  kn-, gn-, pn-, ae- or wr-    -> drop first letter
Initial  x-                           -> change to "s"
Initial  wh-                          -> change to "w"
The applet truncates the code at 4 characters, but a longer code could be used.
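
The initial-letter exceptions and the truncation step can be sketched as two small helpers. The class and method names (MetaphonePrefix, applyInitialExceptions, truncate) are hypothetical and are not taken from the applet. The sketch assumes the exceptions are applied to the raw word before the per-letter rules, and that truncation is applied to the finished code.

```java
final class MetaphonePrefix {
    /** Rewrites the start of a word according to the initial-letter exceptions. */
    static String applyInitialExceptions(String word) {
        String w = word.toUpperCase();
        for (String prefix : new String[] {"KN", "GN", "PN", "AE", "WR"}) {
            if (w.startsWith(prefix)) return w.substring(1);   // drop first letter
        }
        if (w.startsWith("X")) return "S" + w.substring(1);    // x- becomes s
        if (w.startsWith("WH")) return "W" + w.substring(2);   // wh- becomes w
        return w;
    }

    /** Truncates a finished code to maxLength characters (4 in the applet). */
    static String truncate(String code, int maxLength) {
        return code.length() <= maxLength ? code : code.substring(0, maxLength);
    }
}
```

For example, applyInitialExceptions("knight") returns NIGHT, applyInitialExceptions("which") returns WICH, and truncate("ABCDEF", 4) returns ABCD.
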
Lawrence Philips, "Hanging on the Metaphone", Computer Language v7 n12, December 1990, pp39-43.
A good source for further information is this group at Sourceforge: http://aspell.sourceforge.net/metaphone/

Java PhoneticList Class

I have implemented the Metaphone code as part of a class called PhoneticList. As the name indicates, this class tracks lists of objects by the Metaphone code derived from a key string. Operation is similar to a Hashtable except that any number of Objects can have the same code and an Object array is returned by the lookup function. In the example applet, the Objects are Strings but they could be anything.
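
The actual PhoneticList source is not reproduced here, but the idea of a Hashtable-like structure that keeps several Objects per phonetic code might look roughly like the sketch below. Everything in it is an assumption made for illustration: the class name PhoneticIndexSketch, the put and get methods, and the pluggable encoder function that stands in for the Metaphone calculation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class PhoneticIndexSketch {
    // Hypothetical stand-in for the Metaphone calculation: key string -> phonetic code.
    private final Function<String, String> encoder;
    // Each phonetic code maps to every object stored under a key with that code.
    private final Map<String, List<Object>> byCode = new HashMap<>();

    PhoneticIndexSketch(Function<String, String> encoder) {
        this.encoder = encoder;
    }

    /** Stores value under the phonetic code computed from key. */
    void put(String key, Object value) {
        byCode.computeIfAbsent(encoder.apply(key), k -> new ArrayList<>()).add(value);
    }

    /** Returns all objects whose keys share the phonetic code of key (empty if none). */
    Object[] get(String key) {
        List<Object> matches = byCode.get(encoder.apply(key));
        return matches == null ? new Object[0] : matches.toArray();
    }
}
```

Any function from a word to a code can be passed as the encoder. With a crude stand-in that upper-cases the word and strips the vowels A, E, I, O and U, the keys "Jon" and "Jan" both map to JN, so get("Jen") returns both stored objects.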
Source code is available for free, but I would appreciate knowing how you plan to use it. Please contact William Brogden.