Similar to this question, I am trying to read in an ID3v2 tag header and am having trouble figuring out how to get individual bytes in python.

I first read all ten bytes into a string. I then want to parse out the individual pieces of information.

I can grab the two version number chars in the string, but then I have no idea how to take those two chars and get an integer out of them.

The struct package seems to be what I want, but I can't get it to work.

Here is my code so far (I am very new to Python, by the way, so take it easy on me):

def __init__(self, ten_byte_string):
        self.whole_string = ten_byte_string
        self.file_identifier = self.whole_string[:3]
        self.major_version = struct.pack('x', self.whole_string[3:4]) #this 
        self.minor_version = struct.pack('x', self.whole_string[4:5]) # and this
        self.flags = self.whole_string[5:6]
        self.len = self.whole_string[6:10]

Printing out any value except the file identifier gives garbage, because the values are not formatted correctly.

+2  A: 

I was going to recommend the struct package but then you said you had tried it. Try this:

self.major_version = struct.unpack('H', self.whole_string[3:5])

The pack() function converts Python values to packed bytes, and the unpack() function converts packed bytes back to Python values.
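
As a rough illustration of that round trip, here is a minimal sketch. It assumes Python 3, where the packed data is a bytes object rather than a str; the value 2 is just a made-up example.

```python
import struct

# pack(): Python int -> 2 bytes, big-endian unsigned short
packed = struct.pack('>H', 2)

# unpack(): 2 bytes -> a 1-tuple holding the int again
(value,) = struct.unpack('>H', packed)

print(packed, value)
```

Note that unpack() always returns a tuple, even for a single value, which is why the result is unpacked with `(value,)`.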

Greg Hewgill
for 'H', you'll need to use a 2-byte slice.
Brian
You're right, I overlooked that. I'll fix my example so it works, but yours is a better answer anyway.
Greg Hewgill
+1  A: 

Can you post an example of the string you need to convert?

Lucas S.
+12  A: 

If you have a string, with 2 bytes that you wish to interpret as a 16 bit integer, you can do so by:

>>> s = '\0\x02'
>>> struct.unpack('>H', s)
(2,)

Note that the > is for big-endian (the most significant byte of the integer comes first). This is the format ID3 tags use.
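
To see what the byte-order prefix changes, the same two bytes can be read both ways. This is only a sketch, assuming Python 3 bytes literals:

```python
import struct

data = b'\x00\x02'

# '>' = big-endian: the first byte is the most significant
(big,) = struct.unpack('>H', data)

# '<' = little-endian: the first byte is the least significant
(little,) = struct.unpack('<H', data)

print(big, little)
```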

For other sizes of integer, you use different format codes, e.g. "i" for a signed 32-bit integer. See help(struct) for details.
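
For instance, the same four bytes give different results depending on the format code. A small sketch, assuming Python 3 and a made-up byte string:

```python
import struct

raw = b'\xff\xff\xff\xfe'

(signed,) = struct.unpack('>i', raw)          # signed 32-bit integer
(unsigned,) = struct.unpack('>I', raw)        # unsigned 32-bit integer
(first_byte,) = struct.unpack('>B', raw[:1])  # unsigned 8-bit integer

print(signed, unsigned, first_byte)
```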

You can also unpack several elements at once, e.g. for 2 unsigned shorts followed by a signed 32-bit value:

>>> a,b,c = struct.unpack('>HHi', some_string)
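
A concrete version of that call might look like the following. The contents of `some_string` are invented for illustration, and Python 3 bytes are assumed:

```python
import struct

# 2 + 2 + 4 = 8 bytes: two unsigned shorts, then a signed 32-bit value
some_string = b'\x00\x01\x00\x02\xff\xff\xff\xfe'

a, b, c = struct.unpack('>HHi', some_string)

print(a, b, c)
```

The input must be exactly as long as the format requires; otherwise unpack() raises struct.error.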

Going by your code, you are looking for (in order):

  • a 3 char string
  • 2 single byte values (major and minor version)
  • a 1 byte flags variable
  • a 32 bit length quantity

The format string for this would be:

ident, major, minor, flags, length = struct.unpack('>3sBBBI', ten_byte_string)
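
Putting it together, here is a hedged sketch of a header parser. The function name `parse_id3v2_header` and the sample header bytes are hypothetical, and Python 3 is assumed. One caveat: the ID3v2 specification stores the tag size as a "synchsafe" integer (four bytes, only the low 7 bits of each are used), so the plain 32-bit value from 'I' matches the real size only for small tags. This sketch therefore keeps the size as four raw bytes ('4s') and decodes them by hand.

```python
import struct

HEADER_FORMAT = '>3sBBB4s'  # ident, major, minor, flags, 4 raw size bytes


def parse_id3v2_header(header):
    """Split a 10-byte ID3v2 header into its fields (illustrative helper)."""
    if len(header) != struct.calcsize(HEADER_FORMAT):
        raise ValueError('expected exactly 10 bytes')
    ident, major, minor, flags, raw_size = struct.unpack(HEADER_FORMAT, header)
    # Synchsafe decoding: each byte contributes its low 7 bits.
    size = 0
    for byte in raw_size:  # iterating over bytes yields ints in Python 3
        size = (size << 7) | (byte & 0x7F)
    return ident, major, minor, flags, size


# Made-up header: "ID3", version 3.0, no flags, size bytes 00 00 02 01
sample = b'ID3\x03\x00\x00\x00\x00\x02\x01'
fields = parse_id3v2_header(sample)
print(fields)
```

For this sample, reading the last four bytes with 'I' would give 513, while the synchsafe decoding gives 257.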
Brian
+4  A: 

Why write your own? (Assuming you haven't checked out these other options.) There are a couple of options out there for reading ID3 tag info from MP3s in Python. Check out my answer over at this question.

Owen
I did see them. But this is actually for a project for school and we decided that we would write our own parser.
jjnguy
+2  A: 

I am trying to read in an ID3v2 tag header

FWIW, there's already a module for this.

fivebells