I've fallen into a situation where it would be advantageous to store both ASCII and binary data within a tab-delimited file. My initial attempts were horrendous. Is this even worth pursuing? Any advice? I'll need to be able to parse the resulting tab-delimited file cleanly. Downstream, this data is going into a MySQL database, and it would be nice to have the binary data stored within the db.

+7  A: 

Base64-encode your binary data. Maybe prefix it with base64: or something if that helps. Then it's just an ASCII file and you can easily parse it as such.
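
A minimal sketch of that idea in Python, using only the standard library. The `base64:` prefix follows the suggestion above; the function names and the two-column layout are just assumptions for illustration:

```python
import base64

PREFIX = "base64:"  # marker suggested above; any unambiguous tag would do


def encode_field(blob: bytes) -> str:
    """Turn raw bytes into a tab-safe ASCII field."""
    return PREFIX + base64.b64encode(blob).decode("ascii")


def decode_field(field: str) -> bytes:
    """Recover the raw bytes from a field written by encode_field."""
    if not field.startswith(PREFIX):
        raise ValueError("field is not tagged as base64")
    return base64.b64decode(field[len(PREFIX):])


def write_row(name: str, blob: bytes) -> str:
    """Build one tab-delimited line: an ASCII column plus the encoded blob."""
    return "\t".join([name, encode_field(blob)]) + "\n"


# Round trip: the blob contains a tab, a newline and a NUL byte,
# none of which survive into the encoded line.
blob = b"\x00\t\n\xff binary"
line = write_row("example", blob)
name, field = line.rstrip("\n").split("\t")
assert decode_field(field) == blob
```

The standard Base64 alphabet contains no tab or newline characters, so an ordinary split on tabs keeps working. The cost is that the encoded data is roughly a third larger than the original.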

singpolyma
A: 

Have you thought about using a different format as opposed to tab-delimited?

Since binary data might contain bytes that are identical to a tab or newline character, this isn't a trivial task.

Ben S
A: 

Maybe store the binary data in hex-blob format? That's at least supported by the MySQL tool chain.
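
A rough sketch of the hex approach in Python. The answer doesn't say which tool it has in mind; `mysqldump` has a `--hex-blob` option and MySQL has an `UNHEX()` function, so this assumes the hex text would be converted back on the database side. The function names are made up for illustration:

```python
def blob_to_hex(blob: bytes) -> str:
    """Encode raw bytes as hex digits, which are safe in a tab-delimited file."""
    return blob.hex()


def hex_to_blob(text: str) -> bytes:
    """Decode a hex string back to the original bytes."""
    return bytes.fromhex(text)


blob = b"\x00\t\xff"
field = blob_to_hex(blob)          # "0009ff"
assert hex_to_blob(field) == blob

# On the MySQL side, the hex column could then be converted during import,
# for example by reading it into a user variable and applying UNHEX() to it
# in the SET clause of LOAD DATA (untested here, shown only as a direction).
```

Hex doubles the size of the data, so it is bulkier than Base64, but it is trivial to inspect by eye.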

MarkusQ
A: 

Though I am strongly against this method, you can store it directly in the file as long as you know the exact length in bytes of the binary data. Write that length as its own field, then start reading the blob immediately after the tab character that follows the length value. After reading the specified number of bytes, the next byte should be another tab character or a newline; if it isn't, the record is corrupt.

An example:

ASCII 1 ASCII 2 BinaryLength Blob
this    is horrible 18 ®##]-û¢?#ý¯#d  ­ú2
please  don't 48 Þ­¾ï¥Zߨ}è¨Ùب©×ÚX©©x©†Ú…zŠWG©j ‡­˜zǘǰ˜y|‰}—
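
For what it's worth, the length-prefixed layout above could be read and written along these lines in Python. This is only a sketch under assumptions: two ASCII columns, then the length, then the blob, with a newline ending each record, and the file opened in binary mode. The function names are hypothetical:

```python
import io


def read_text_field(stream):
    """Read ASCII bytes up to the next tab; return None at a clean end of file."""
    buf = bytearray()
    while True:
        byte = stream.read(1)
        if byte == b"":
            if buf:
                raise ValueError("file ends in the middle of a field")
            return None
        if byte == b"\t":
            return buf.decode("ascii")
        buf += byte


def read_record(stream):
    """Read one record: two ASCII fields, a length, then exactly that many bytes."""
    first = read_text_field(stream)
    if first is None:
        return None  # no more records
    second = read_text_field(stream)
    length_text = read_text_field(stream)
    if second is None or length_text is None:
        raise ValueError("truncated record")
    length = int(length_text)
    blob = stream.read(length)  # tabs and newlines in here are just data
    if len(blob) != length:
        raise ValueError("truncated blob")
    if stream.read(1) != b"\n":
        raise ValueError("expected newline after blob")
    return first, second, blob


def write_record(stream, first, second, blob):
    """Write one record in the same layout."""
    header = f"{first}\t{second}\t{len(blob)}\t".encode("ascii")
    stream.write(header + blob + b"\n")


# Round trip through an in-memory file.
buf = io.BytesIO()
write_record(buf, "this", "is horrible", b"\x00\t\n\xff")
buf.seek(0)
assert read_record(buf) == ("this", "is horrible", b"\x00\t\n\xff")
assert read_record(buf) is None
```

Note that a file like this can no longer be handled by ordinary line-based or tab-splitting tools, which is a large part of why the approach is discouraged.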

You should really Base64 encode the binary data, though.

John Rasch