You could find someone else's double metaphone implementation, run it on the same long list of words, and compare the results to your own.
For long lists of words, I like infochimps. They have lots of word lists, like this one of 350,000 English words or this one of place names, and many more.
Here are implementations you can compare your results against. Here is an online example, but it tests only one word at a time, so you'll have to download and run one of the scripts to test a large list of words.
For each word, two codes are returned (a primary and a secondary); you'll probably want to check that both codes match the ones returned by the other implementation. You probably know that the reference implementation is here with full source code here, but I'm including the links anyway for others' benefit.
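To make the comparison concrete, here is a rough sketch of a test harness in Python. It assumes each implementation can be wrapped as a function that takes a word and returns a `(primary, secondary)` pair; the names `compare_implementations`, `mine`, and `reference` are made up for illustration and don't come from any particular library.

```python
from typing import Callable, Iterable, List, Tuple

Codes = Tuple[str, str]  # (primary, secondary) codes for one word


def compare_implementations(
    words: Iterable[str],
    mine: Callable[[str], Codes],       # hypothetical wrapper around your implementation
    reference: Callable[[str], Codes],  # hypothetical wrapper around the one you trust
) -> List[Tuple[str, Codes, Codes]]:
    """Return (word, my_codes, reference_codes) for every word where either code differs."""
    mismatches = []
    for word in words:
        word = word.strip()
        if not word:
            continue  # skip blank lines in the word list
        got = tuple(mine(word))
        expected = tuple(reference(word))
        if got != expected:  # compares both the primary and the secondary code
            mismatches.append((word, got, expected))
    return mismatches


# Example usage (file name and both functions are placeholders):
#
#     with open("words.txt", encoding="utf-8") as f:
#         for word, got, expected in compare_implementations(f, my_dm, ref_dm):
#             print(f"{word}: got {got}, expected {expected}")
```

Collecting the mismatches instead of stopping at the first one makes it easier to spot patterns, since a single faulty rule usually shows up across many words.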