I'm trying to write a Python script that finds duplicate MP3/MP4 files by comparing the audio data itself. I have many MP3/MP4 files with similar file names but different ID3 tags. My first attempt looped through the files and compared their MD5 hashes (ignoring file names). This, of course, didn't work when the ID3 tags didn't match, because the tags are part of the file contents being hashed.
As a result, I'm looking for a way to extract only the audio data from an MP3/MP4 file so that I can run it through MD5 and find any duplicates. What is the best way to go about this?
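To make the idea concrete, here is a minimal sketch of hashing an MP3 with its ID3 tags skipped, using only the standard library. It assumes the only metadata is an ID3v2 tag at the start of the file and/or a 128-byte ID3v1 tag at the end. The helper names `strip_id3` and `audio_md5` are made up for illustration. It does not handle MP4 files (which keep metadata inside the container rather than in ID3 tags), other tag formats such as APEv2, or copies of a song that were encoded separately.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return data without a leading ID3v2 tag or a trailing ID3v1 tag."""
    start, end = 0, len(data)

    # ID3v2 header: b"ID3", 2 version bytes, 1 flags byte, 4-byte synchsafe size.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            # Synchsafe integer: only the low 7 bits of each byte are used.
            size = (size << 7) | (byte & 0x7F)
        start = 10 + size
        if data[5] & 0x10:  # footer-present flag (ID3v2.4) adds 10 more bytes
            start += 10

    # ID3v1: fixed 128-byte block at the very end, starting with b"TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]


def audio_md5(path: str) -> str:
    """MD5 of the file contents with ID3 tags skipped (hypothetical helper)."""
    with open(path, "rb") as f:
        return hashlib.md5(strip_id3(f.read())).hexdigest()
```

Two files whose `audio_md5` values match would then be candidates for duplicates, under the assumptions above. Reading each file fully into memory keeps the sketch short; a very large collection might call for reading in chunks instead.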