The W3C's EXI (Efficient XML Interchange) format is going to be standardized. It claims to be "the last binary standard".

It is a standard for storing XML data in a form optimized for processing and storage, and it is bundled with XML Schema (making the data strongly typed and strongly structured). There are a lot of claimed advantages; I was most impressed by the processing and memory-efficiency measurements.

I am wondering what is going to happen to all the established XML APIs.

There is this paragraph related to my question:

4.2 Existing XML Processing APIs

As EXI is an encoding of the XML Infoset, an EXI implementation can support any of the commonly-used XML APIs for XML processing, so EXI has no immediate impact on existing XML APIs. However, using an existing XML API also requires that all names and text appearing in the EXI document be converted into strings. In the future, more efficiency might be achievable if the higher layers could directly use these data as typed values appearing in the EXI document. For instance, if a higher layer needs typed data, going through its string form can produce a performance penalty, so an extended API that supports typed data directly could improve performance when used with EXI.

from: http://www.w3.org/TR/exi-impacts/

I understand it as follows: "Using EXI with existing APIs? No performance gain! (Unless you rewrite them all)"

Let's take the Java ecosystem as an example:

We have plenty of XML APIs in the latest JDK 6 (more of them have been added with each major JDK release). As far as I can judge, most (if not all) of them use either in-memory DOM trees or a serialized ("textual") representation to transform/process/validate/... XML data.

What do you think is going to happen to these APIs with the introduction of EXI?

Thank you all for your opinions.

For those who don't know EXI: http://www.w3.org/XML/EXI/

A: 

I'd personally rather not use EXI at all. It seems like it's taking all the clunky, bad things about XML, and cramming them into a binary format, which basically removes the saving grace of XML (plain text format).

It seems like the general trend of the industry is moving towards more lightweight data transfer models (HTTP REST for example), and moving away from heavy-weight models like SOAP. Personally, I'm not super excited about the idea of binary XML.

Anything that claims to be "the last binary standard" is probably wrong.

Andy White
Yeah, I also don't understand the point of EXI. XML is used, even though it is bloated, because it is human-readable. If you take that away, XML has nothing over any other standard.
Andrew Marsh
A: 

Think of EXI as a "better GZIP for XML". FYI, it has no impact on the APIs, as you can still use all of them (DOM, SAX, StAX, JAXB, ...). The only difference is that, in order to get EXI, you need a stream writer that writes it or a stream reader that reads it.

The most efficient way to process EXI is StAX. It is true that new APIs might arise because of EXI, but who said DOM is efficient and well designed for modern languages? ;-)
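To illustrate the point about APIs, here is a minimal sketch of StAX code that depends only on the `XMLStreamReader` interface. The class and method names are made up for this example. It runs against the JDK's ordinary text XML parser; the assumption (not shown here) is that an EXI library would supply its own `XMLStreamReader` implementation, in which case `elementNames` would not need to change.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxElementNames {

    // Collects element names in document order. This method only sees the
    // XMLStreamReader interface, so it does not care how the underlying
    // bytes were encoded.
    static List<String> elementNames(XMLStreamReader reader) throws XMLStreamException {
        List<String> names = new ArrayList<String>();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        return names;
    }

    // Convenience wrapper using the JDK's built-in text XML parser.
    // An EXI-backed reader (hypothetical here) would replace only this part.
    static List<String> elementNamesFromText(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            return elementNames(reader);
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<order><item>book</item><item>pen</item></order>";
        System.out.println(elementNamesFromText(xml)); // prints [order, item, item]
    }
}
```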

If you are handling big XML files (I have some that are a few hundred MB), you definitely know why you need EXI: it saves tons of space and a huge amount of memory and processing time.

This is no different from the purpose of HTTP Content-Encoding: you are not required to use it, but if both parties understand it, it is a much more efficient way to perform the exchange.

By the way, IMHO EXI will become the preferred way to content-encode any XML over HTTP because of SOAP bloat ;-) As soon as EXI settles in the browsers, it could also benefit any end user: faster transfer and faster parsing mean a better experience on the same machine!

EXI does not deprecate the string representation; it only makes it a bit different. Oh, and by the way, when using a UTF encoding (think of the default UTF-8, for instance), you are already using a "compression encoding" for the 32-bit Unicode code points. This means that the data on the wire is already not the same as the real data ;-)
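A small sketch of the UTF-8 point, using only the standard library (the class and method names are made up for this example): the same code point takes a different number of bytes on the wire depending on its value.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Length {

    // Number of bytes UTF-8 uses to encode a single Unicode code point.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length(0x41));    // 'A': prints 1
        System.out.println(utf8Length(0x20AC));  // euro sign: prints 3
        System.out.println(utf8Length(0x1F600)); // emoji outside the BMP: prints 4
    }
}
```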

Regards, TM

A: 

Ivan,

Sorry I didn't notice your question earlier. Usually Google Alerts helps me find questions like yours, but it missed this one for some reason. My apologies for the delayed response.

You don't need any new APIs to get the performance gains of EXI. All the EXI testing and performance measurements the W3C has conducted use the standard SAX APIs built into the JDK. For the latest tests, see http://www.w3.org/TR/exi-evaluation/#processing-results. EXI parsing was on average 14.5 times faster than XML in these tests without any special APIs.

One day, if people think it's worthwhile, we may see some typed XML APIs emerge. If and when that happens, you will get even better performance from EXI. However, this is not required to get excellent performance like that reported by the W3C.
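As a rough illustration (not the W3C test harness), here is a sketch of plain SAX code written against the `XMLReader` interface. The class names and the `<qty>` element are made up for this example, and it uses the JDK's text XML parser; the assumption is that an EXI implementation would hand you its own `XMLReader` to pass into `sumQuantities`. Note how the handler receives the number as characters and has to parse it back into a `long`: that round trip through strings is what a typed API could avoid.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxQuantitySum {

    // Sums the integer content of every <qty> element.
    static class QuantityHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inQty = false;
        long total = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("qty".equals(qName)) {
                inQty = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX always delivers content as characters, whatever the wire format was.
            if (inQty) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("qty".equals(qName)) {
                // String-to-number conversion happens here, in application code.
                total += Long.parseLong(text.toString().trim());
                inQty = false;
            }
        }
    }

    // Works with any XMLReader; only the caller decides which parser is used.
    static long sumQuantities(XMLReader reader, InputSource source) throws Exception {
        QuantityHandler handler = new QuantityHandler();
        reader.setContentHandler(handler);
        reader.parse(source);
        return handler.total;
    }

    // Convenience wrapper using the JDK's built-in text XML parser.
    static long sumQuantitiesFromText(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        return sumQuantities(reader, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><qty>2</qty><qty>40</qty></order>";
        System.out.println(sumQuantitiesFromText(xml)); // prints 42
    }
}
```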

I hope this helps,

John
John Schneider