My large (120gb) music collection contains many duplicate songs, and I've been trying to fingerprint tracks in the hopes of detecting duplicates. And since I'm a CS Major I'm very curious as to what is done out there? Nothing I do has nearly the accuracy of something like Shazam or Lala.com. How do they "hash" tracks? I have run a standard MD5 hash on all my files (26,000 files) and I found hundreds of equal hashes on different tracks, so that doesn't work.
I'm more interested in Lala.com since they work with full files, unlike Shazam, but I'm assuming both use a similar technique. Can anyone explain how to generate unique identifiers for music?