Is there a Ruby library that will allow me to either calculate the checksum of an MP3 file's audio data (minus the metadata) or allow me to read in an MP3's audio data to calculate the checksum myself?


I'm looking for something like this:

mp3 = Mp3Lib::MP3.new('/path/to/song.mp3')
mp3.audio.sha1sum # => the sha1 checksum of _only_ the audio, minus the metadata

I found Mp3Info, but it seems a bit tedious. When initializing an Mp3Info object, you can get the frames where the actual audio data begins and ends.

+1  A: 

Isn't the ID3 tag stored either at the end of the file in a 128-byte block (ID3v1), or in a block at the start of the file (ID3v2.3 and v2.4)? (id3.org)

You could use the audio_content method from Mp3Info, and read that much data from the file, though it's probably not much more complicated to have a look in the file yourself and work out where the headers aren't.

Nick Dixon
A: 

Extracting the MP3 audio without its metadata is fairly easy to do yourself.

ID3v1

The metadata occupies the last 128 bytes of the file. If it exists, it always begins with the 3 bytes "TAG". Just ignore these last 128 bytes.
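
A minimal sketch of that check in plain Ruby, assuming the whole file has already been read into a binary string. The helper name id3v1_size is made up for illustration and is not part of Mp3Info or any other library.

```ruby
# Hypothetical helper: returns 128 if `data` ends with an ID3v1 tag, else 0.
# `data` is the whole file as a binary string, e.g. from File.binread.
def id3v1_size(data)
  return 0 if data.bytesize < 128
  # An ID3v1 tag is exactly 128 bytes long and starts with "TAG".
  data.byteslice(-128, 3) == "TAG".b ? 128 : 0
end

# In-memory stand-in for a file: 5 bytes of "audio" plus an empty ID3v1 tag.
sample = "audio".b + "TAG".b + ("\x00".b * 125)
audio  = sample.byteslice(0, sample.bytesize - id3v1_size(sample))
```

Here audio holds only the bytes in front of the tag. For a real file, sample would come from File.binread('/path/to/song.mp3').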

ID3v2

The metadata can be stored at the beginning or the end of the file. Most implementations only support the beginning. ID3v2 has a header in which the size is stored. The header is always located at the beginning of the metadata. There is an optional footer, which is a copy of the header at the end of the metadata. If the metadata is at the end of the file, the footer is required.

The header has the following form

ID3v2/file identifier      "ID3"
ID3v2 version              $04 00
ID3v2 flags                %abcd0000
ID3v2 size             4 * %0xxxxxxx

The footer has the following form

ID3v2/file identifier      "3DI"
ID3v2 version              $04 00
ID3v2 flags                %abcd0000
ID3v2 size             4 * %0xxxxxxx

The d bit says whether the footer is present. The size is measured without header and footer. Every byte of the size always has its highest bit cleared (zero), so only 28 of the actual 32 bits represent the size.
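
As a rough sketch, the size field can be decoded by taking 7 bits from each of the 4 bytes, following the %0xxxxxxx layout shown above. The helper names synchsafe_to_i and id3v2_size_at_start are invented for this example, and it assumes the tag sits at the very start of the data.

```ruby
# Hypothetical helper: decodes the 4-byte size field, 7 useful bits per byte.
def synchsafe_to_i(bytes)
  bytes.inject(0) { |acc, b| (acc << 7) | (b & 0x7f) }
end

# Hypothetical helper: total size in bytes of an ID3v2 tag at the start of
# `data` (10-byte header + body + optional 10-byte footer), or 0 if absent.
def id3v2_size_at_start(data)
  return 0 unless data.bytesize >= 10 && data.byteslice(0, 3) == "ID3".b
  flags  = data.getbyte(5)                           # %abcd0000
  body   = synchsafe_to_i(data.byteslice(6, 4).bytes)
  footer = (flags & 0x10).zero? ? 0 : 10             # d bit: footer present
  10 + body + footer
end

# Header with size bytes 00 00 02 01, i.e. (2 << 7) | 1 = 257 body bytes.
header = "ID3\x04\x00\x00\x00\x00\x02\x01".b
total  = id3v2_size_at_start(header)                 # 10 + 257 = 267
```

The audio then starts at byte offset total, provided the file is not truncated.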

Just compute which part of the file is not metadata, and use that for your hashing.

Be aware that if both ID3v1 and ID3v2 are located at the end of the file, ID3v1 is located behind ID3v2.
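
Putting the pieces together, something along these lines should give a checksum of the audio alone. This is only a sketch under the assumptions above: audio_range and audio_sha1 are made-up names, the whole file is held in memory, and only the ID3 layouts described in this answer are recognised.

```ruby
require "digest"

# Hypothetical helper: decodes the 4-byte size field, 7 useful bits per byte.
def synchsafe_to_i(bytes)
  bytes.inject(0) { |acc, b| (acc << 7) | (b & 0x7f) }
end

# Hypothetical helper: returns [offset, length] of what is left of `data`
# (a binary string) once the ID3 tags have been skipped.
def audio_range(data)
  first = 0
  last  = data.bytesize # exclusive end of the audio

  # ID3v2 at the start: 10-byte header, body, optional 10-byte footer.
  if last >= 10 && data.byteslice(0, 3) == "ID3".b
    footer = (data.getbyte(5) & 0x10).zero? ? 0 : 10
    first  = 10 + synchsafe_to_i(data.byteslice(6, 4).bytes) + footer
    first  = [first, last].min # guard against truncated files
  end

  # ID3v1 occupies the last 128 bytes and comes behind any appended ID3v2.
  if last - first >= 128 && data.byteslice(last - 128, 3) == "TAG".b
    last -= 128
  end

  # ID3v2 at the end: found through its mandatory "3DI" footer.
  if last - first >= 20 && data.byteslice(last - 10, 3) == "3DI".b
    body = synchsafe_to_i(data.byteslice(last - 4, 4).bytes)
    last = [last - (20 + body), first].max
  end

  [first, last - first]
end

# SHA1 of the audio bytes only.
def audio_sha1(data)
  offset, length = audio_range(data)
  Digest::SHA1.hexdigest(data.byteslice(offset, length))
end

# In-memory stand-in for a tagged file; for a real file use
# File.binread('/path/to/song.mp3') instead.
audio  = [0xFF, 0xFB].pack("C*") + "frame data".b
v2     = "ID3\x04\x00\x00\x00\x00\x00\x03".b + "abc".b
v1     = "TAG".b + ("\x00".b * 125)
tagged = v2 + audio + v1
same   = audio_sha1(tagged) == Digest::SHA1.hexdigest(audio)
```

With this sample data, same is true: the tagged and untagged versions hash identically. For large files it would be kinder to memory to seek to the offset and feed the digest in chunks rather than slurping the whole file.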

The spec can be found at http://www.id3.org/id3v2.4.0-structure.

johannes