views: 4806
answers: 12

I have two applications written in Java that communicate with each other using XML messages over the network. I'm using a SAX parser at the receiving end to get the data back out of the messages. One of the requirements is to embed binary data in an XML message, but SAX doesn't like this. Does anyone know how to do this?

UPDATE: I got this working with the Base64 class from the Apache Commons Codec library, in case anyone else is trying something similar.
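
For anyone following along, here is a minimal sketch of the encoding step. It uses `java.util.Base64` from the JDK (Java 8 and later) rather than the Commons Codec class mentioned above, purely so that it runs without extra dependencies. The class name `Base64Message` and the element names `message` and `payload` are made up for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64Message {
    // Wrap raw bytes in a hypothetical <payload> element as Base64 text.
    // The Base64 alphabet contains no '<' or '&', so no XML escaping is needed.
    static String toXml(byte[] data) {
        String encoded = Base64.getEncoder().encodeToString(data);
        return "<message><payload>" + encoded + "</payload></message>";
    }

    // Recover the bytes from the Base64 text taken out of the element.
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text.trim());
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, 2, (byte) 0xFF, '<', '&'};
        String xml = toXml(original);
        System.out.println(xml);

        // Pull the text back out by hand here; a real receiver would use its parser.
        String inner = xml.substring(xml.indexOf("<payload>") + 9, xml.indexOf("</payload>"));
        System.out.println(Arrays.equals(original, fromBase64(inner)));
    }
}
```

Bytes that would be illegal in XML text, such as `0x00`, or markup characters like `<` and `&`, never reach the parser in raw form, which is why the SAX side stops complaining.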

+2  A: 

Maybe encode the bytes into a known character set; Base64 is a popular choice.

mercutio
+3  A: 

Try Base64 encoding/decoding your binary data. Also look into CDATA sections.

basszero
+30  A: 

You could encode the binary data using Base64 and put it into a Base64 element; the article below is a pretty good one on the subject.

Handling Binary Data in XML Documents
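
On the receiving end, which the question says is a SAX parser, a minimal sketch could look like the following. It assumes the element is literally named `Base64` as suggested above; the class name `Base64SaxReader` and the wrapper element `msg` are hypothetical, and `java.util.Base64` stands in for whichever Base64 library is actually in use.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class Base64SaxReader extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private boolean inPayload;
    private byte[] decoded;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("Base64".equals(qName)) {
            inPayload = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one element's text in several chunks, so append.
        if (inPayload) text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Base64".equals(qName)) {
            inPayload = false;
            // The MIME decoder ignores line breaks and indentation around the data.
            decoded = Base64.getMimeDecoder().decode(text.toString());
        }
    }

    // Parse a whole message and return the decoded bytes (null if no Base64 element).
    static byte[] parse(String xml) throws Exception {
        Base64SaxReader handler = new Base64SaxReader();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.decoded;
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = parse("<msg><Base64>\n  AAEC/w==\n</Base64></msg>");
        System.out.println(Arrays.toString(bytes));
    }
}
```

Collecting the text in a `StringBuilder` and decoding only in `endElement` matters for large payloads, because `characters` is not guaranteed to be called just once per element.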

Greg Hurlman
+2  A: 

I usually encode the binary data with MIME Base64 or URL encoding.

Anders Sandvig
+24  A: 

XML is so versatile...

<DATA>
  <BINARY>
    <BIT index="0">0</BIT>
    <BIT index="1">0</BIT>
    <BIT index="2">1</BIT>
    ...
    <BIT index="n">1</BIT>
  </BINARY>
</DATA>

XML is like violence - If it doesn't solve your problem, you're not using enough of it.

EDIT:

BTW: Base64 + CDATA is probably the best solution

(EDIT2:
Whoever upmods me, please also upmod the real answer. We don't want any poor soul to come here and actually implement my method because it was the highest ranked on SO, right?)

Mo
I just repeated that quote to my friend, and after he laughed, he said "and it is painful if directed at you" :)
kaybenleroll
This is nothing less than an utterly disgraceful use of XML if you're serious. And if you're not, how would beginners who don't write-high-level-think-low-level know?
Jenko
Jeremy...for a young 23 year old lad you're awfully serious/literal...you clearly haven't worked long enough in the industry to see why this is an amusing answer with a cautionary tale for the brave between the lines.
Kev
I would presume they would know by 1) how different this answer is from the big green one above with double the votes, and 2) by reading the rest of the thread where others point out how funny the joke is.
Mike Powell
@Mike - you woulda thought that....SO is rapidly becoming a breeding ground for egomaniacal humourless young pedants.
Kev
I think it's funny. But yes, once again, using the actual base64 datatype is the way to go. CData is too generic.
Omniwombat
+1 for putting up a mirror for all those consultants who think XML = golden hammer, lol. Btw, I LOVE XML, but only if used correctly.
Turing Complete
Laughed out loud, showed to all my friends. -1 to the haters !
Dean Radcliffe
+3  A: 

@Mo - XML humour. Nice one. Modded up. :-)

Kev
Come on you humourless gits....lighten up. This was an early answer before we all learned SO etiquette... :)
Kev
Yeah, but it doesn't deserve to rank as the '2nd-best' answer. That's misleading.
Jenko
So why downvote me?
Kev
And it's hardly offensive...please re-boot humour/life module.
Kev
+3  A: 

@(whoever downmodded Mo): Lighten up a little, that was hilarious (both the code and the quip). +1

Mike Powell
Yeah right, so we like to rank jokes as the best answers? Where's our sense of logic gone?
Jenko
Where's our sense of humour evaporated to? This was a closed beta days question....some leeway was allowed back then. Please re-insert humour module.
Kev
+5  A: 

Base64 is indeed the right answer, but CDATA is not. CDATA basically says "this could be anything", whereas the content must not be just anything: it has to be Base64-encoded binary data. XML Schema defines base64Binary as a primitive datatype, which you can use in your XSD.
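
A minimal sketch of what that buys you, using the `javax.xml.validation` API that ships with the JDK. The one-element schema, the element name `payload`, and the class name `Base64SchemaCheck` are assumptions made for illustration only.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class Base64SchemaCheck {
    // Hypothetical schema: a single <payload> element typed as xs:base64Binary.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='payload' type='xs:base64Binary'/>"
      + "</xs:schema>";

    // Returns true if the document validates against the schema above.
    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no custom ErrorHandler set, validation errors are thrown.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<payload>AAEC/w==</payload>"));
        System.out.println(isValid("<payload>*** not Base64 ***</payload>"));
    }
}
```

With the type declared in the schema, malformed payloads are rejected at validation time instead of surfacing later as a decoding error.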

Boris Terzic
+2  A: 

You can also uuencode your original binary data. This format is a bit older, but it does the same job as Base64 encoding.

Andrei Savu
+1  A: 

Any binary-to-text encoding will do the trick. I use something like this:

<data encoding="yEnc">
<![CDATA[ encoded binary data ]]>
</data>
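
A reader for this layout might be sketched as below. The JDK has no yEnc codec, so the sketch assumes `encoding="base64"` and rejects anything else; the class name `EncodedDataReader` is hypothetical.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EncodedDataReader extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private String encoding; // value of the encoding attribute on <data>
    private byte[] result;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("data".equals(qName)) {
            encoding = atts.getValue("encoding");
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // CDATA content arrives through the same callback as ordinary text.
        if (encoding != null) text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (!"data".equals(qName)) return;
        if ("base64".equalsIgnoreCase(encoding)) {
            // The MIME decoder skips the spaces just inside the CDATA markers.
            result = Base64.getMimeDecoder().decode(text.toString());
        } else {
            throw new SAXException("Unsupported encoding: " + encoding);
        }
        encoding = null;
    }

    static byte[] parse(String xml) throws Exception {
        EncodedDataReader handler = new EncodedDataReader();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<data encoding=\"base64\"><![CDATA[ AAEC/w== ]]></data>";
        System.out.println(Arrays.toString(parse(xml)));
    }
}
```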
Jarek Przygódzki
A: 

I had this problem just last week. I had to serialize a PDF file and send it, inside an XML file, to a server.

If you're using .NET, you can convert a binary file directly to a base64 string and stick it inside an XML element.

string base64 = Convert.ToBase64String(File.ReadAllBytes(fileName));

Or, there is a method built right into the XmlWriter object. In my particular case, I had to include Microsoft's datatype namespace:

StringBuilder sb = new StringBuilder();
System.Xml.XmlWriter xw = XmlWriter.Create(sb);
xw.WriteStartElement("doc");
xw.WriteStartElement("serialized_binary");
xw.WriteAttributeString("types", "dt", "urn:schemas-microsoft-com:datatypes", "bin.base64");
byte[] b = File.ReadAllBytes(fileName);
xw.WriteBase64(b, 0, b.Length);
xw.WriteEndElement();
xw.WriteEndElement();
xw.Flush(); // push buffered output into the StringBuilder before reading it
string abc = sb.ToString();

The string abc looks something like this:

<?xml version="1.0" encoding="utf-16"?>
<doc>
    <serialized_binary types:dt="bin.base64" xmlns:types="urn:schemas-microsoft-com:datatypes">
        JVBERi0xLjMKJaqrrK0KNCAwIG9iago8PCAvVHlwZSAvSW5mbw...(plus lots more)
    </serialized_binary>
</doc>
Brian Travis
A: 

Here's a good example of how to proceed: XEP-0239

PS: don't forget to read Mo's answer.

PS2: read the NOTICE section on the XEP.

mrrtnn