Hi, I am using Python's base64 module to decode a base64-encoded XML file. I find each piece of data (there are thousands of them; for example, in "ABC....", the "ABC..." is the base64-encoded data) and add it to a string, say s, then call base64.b64decode(s) to get the result. I am not sure what the result of the decoding is: a string, or bytes? In addition, how should I convert the decoded data from so-called "network byte order" to "host byte order"? Thanks!

+2  A: 

Each base64-encoded string should be decoded separately: you can't concatenate encoded strings and expect a correct decoding.
The result of the decode is a string, or byte buffer; in Python 2 they're equivalent (in Python 3, b64decode returns a bytes object).
Regarding network/host order: a sequence of bytes has no such 'order' (or endianness). It only matters when interpreting those bytes as words or ints wider than 8 bits.
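
As a rough illustration of where byte order does come in, here is a minimal Python 3 sketch that interprets decoded bytes as 32-bit floats. It assumes that precision="32" means IEEE 754 single-precision values, that byteOrder="network" means big-endian, and that pairOrder="m/z-int" means the values alternate between m/z and intensity; these are assumptions about the file format, not something confirmed in this thread. The helper name decode_peaks is hypothetical, and the sample string is just the first 32 characters of the peaks data quoted in the comments.

```python
import base64
import struct

def decode_peaks(encoded_text):
    """Decode one base64 peak list into (m/z, intensity) pairs.

    Assumes big-endian ("network" order) 32-bit floats, alternating
    m/z and intensity values.
    """
    raw = base64.b64decode(encoded_text)   # bytes in Python 3
    count = len(raw) // 4                  # number of complete 4-byte floats
    # ">" tells struct to read big-endian values and return native Python
    # floats, so no separate network-to-host conversion step is needed.
    values = struct.unpack(">%df" % count, raw[:count * 4])
    return list(zip(values[0::2], values[1::2]))

# First 32 base64 characters of the quoted sample (24 bytes, 3 pairs).
sample = "Q5YACgAAAABDlgAbAAAAAEOWAC0AAAAA"
pairs = decode_peaks(sample)
for mz, intensity in pairs:
    print(mz, intensity)
```

The first m/z value comes out at roughly 300.0003, which lines up with lowMz="300" in the scan attributes, so the big-endian float interpretation looks plausible for this data.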

adamk
Thanks! Yes, I think my process is: find one piece of encoded data, add it to an empty string, decode it, and store the result in a container, say a list; then empty the string and find the next piece of encoded data. Does that make sense?
ligwin
This seems reasonable, although if you post some more information (e.g. some of your code, what your XML file looks like, etc.) you'll get better answers.
adamk
<scan num="1" msLevel="1" peaksCount="2064" polarity="+" scanType="Full" retentionTime="PT0.5746S" lowMz="300" highMz="1600" basePeakMz="355.07" basePeakIntensity="7959.72" totIonCurrent="85631" msInstrumentID="0"> <peaks precision="32" byteOrder="network" pairOrder="m/z-int">Q5YACgAAAABDlgAbAAAAAEOWAC0AAAAAQ5YAPwAAAABDlgdNAAAAAEOWB18AAAAAQ5YHcAAAAABDlgeCAAAAAEOWB5QAAAAAQ5YHpkNx8H9Dlge4REqBx0OWB8pEpZ10Q5YH3ES2lxFDlgfuRIuPbEOWByAAAAAETIADEAAAAA</peaks> </scan>
ligwin
I don't know how to format it properly. The data is too large to post in full here, but I kept the original format. There are typically thousands of "scan" elements like this in the XML file; the data is "Q5YACgAAAA....."
ligwin
+2  A: 

Base64 data, encoded or not, is stored in strings. Byte order is only an issue if you're dealing with non-character types (C's int, short, long, float, etc.), and I'm not sure how that relates to this issue. Also, I don't think concatenating base64-encoded strings is valid.

>>> from base64 import *
>>> b64encode( "abcdefg" )
'YWJjZGVmZw=='
>>> b64decode( "YWJjZGVmZw==" )
'abcdefg'
>>> b64encode( "hijklmn" )
'aGlqa2xtbg=='
>>> b64decode( "aGlqa2xtbg==" )
'hijklmn'
>>> b64decode( "YWJjZGVmZw==aGlqa2xtbg==" )
'abcdefg'
>>> b64decode( "YWJjZGVmZwaGlqa2xtbg==" )
'abcdefg\x06\x86\x96\xa6\xb6\xc6\xd6\xe0'
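
The session above is from Python 2, where encoded and decoded values are both plain strings. For comparison, a minimal Python 3 sketch of decoding each string separately and joining the results afterwards might look like this (the variable names are only illustrative):

```python
import base64

# The two encoded strings from the session above.
chunks = ["YWJjZGVmZw==", "aGlqa2xtbg=="]

# Decode each chunk on its own; in Python 3 every result is a bytes object.
decoded = [base64.b64decode(chunk) for chunk in chunks]

# Concatenate only after decoding, never before.
combined = b"".join(decoded)

print(decoded)
print(combined)
```

Joining the decoded bytes is safe because padding and bit alignment have already been resolved for each chunk individually.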
robert
Yes, I see from your example that concatenating the strings would not work, since each encoded string has its own format. I copied one complete piece of data (there are thousands of them in one XML file) into a text file and decoded it with a web base64 decoder; the result is a binary file. What should I do if I would like to know what is inside this encoded data?
ligwin