views: 314 | answers: 10

Background
I'm researching the efficiency of messaging within contemporary web applications, examining the use of alternatives to XML. This is a university project whose results will be released publicly; the greater the community's participation, the greater the value of the results given back to it.

I need as many real-life examples of XML in use as possible so as to:

  • fully understand the uses to which XML is put when host A talks to host B
    I can certainly imagine how XML should/may be used. The reality may be quite different.
     
  • perform tests on actual, not hypothetical, data
    How XML performs compared to Technology X on sets of real-life data is just as important as how XML compares to Technology X on an arbitrary set of data.
     
  • identify and measure any patterns of use of XML
     e.g. elements-only, elements plus some attributes, or minimal elements with heavy attribute usage (the two extremes are sketched below)
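
For illustration, here is a minimal sketch of the two extremes of that spectrum, using a hypothetical record (not drawn from any submitted example) and Python's standard library:

    # Sketch: the same hypothetical "book" record expressed in the two extreme
    # styles: elements-only versus attribute-heavy XML.
    import xml.etree.ElementTree as ET

    # Elements-only: every value is the text of a child element.
    elements_only = ET.Element("book")
    ET.SubElement(elements_only, "title").text = "Example"
    ET.SubElement(elements_only, "author").text = "A. N. Other"
    ET.SubElement(elements_only, "year").text = "2008"

    # Attribute-heavy: every value is an attribute on a single element.
    attribute_heavy = ET.Element(
        "book", {"title": "Example", "author": "A. N. Other", "year": "2008"}
    )

    print(ET.tostring(elements_only, encoding="unicode"))    # <book><title>Example</title>...</book>
    print(ET.tostring(attribute_heavy, encoding="unicode"))  # <book title="Example" author="A. N. Other" year="2008" />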

The Question

How do you use XML within the world of web applications?

When Host B returns XML-structured data to Host A over HTTP, what comes back? This may be a server returning data in an AJAX environment or one server collating data from one or more other servers.
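
As a purely hypothetical sketch of the kind of exchange in question (the URL, element names and values below are invented, not a real service):

    # Hypothetical sketch of Host A consuming an XML response body from Host B.
    # The URL and element names are invented for illustration only.
    import urllib.request
    import xml.etree.ElementTree as ET

    with urllib.request.urlopen("http://hostb.example.com/orders/42") as response:
        body = response.read()              # the raw XML response body

    order = ET.fromstring(body)             # e.g. <order id="42"><status>shipped</status></order>
    print(order.findtext("status"))         # "shipped"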

Ideal answers would include:

  • A real-life example of XML within an HTTP response
  • The URL, where relevant, to request the above
  • An explanation, if needed, of what the data represents
  • An explanation, if not obvious, of why such messages are being exchanged (e.g. to fulfil a user request; host X returning a health status report to host Y)

I'd prefer examples from applications/services that you've made, developed or worked on, although any examples are welcome. Anything from a 5-line XML document to a 10,000-line monster would be great.

Your own opinions on the use of XML in your example would also be wonderful (e.g. we implemented XML-structured responses because of Requirement X/Person Y even though I thought JSON would have been better because ...; or, we use XML to do this because [really good reason] and it's just the best choice for the job).

Update
I very much appreciate all answers on the topic of XML in general; however, what I'm really looking for is real-life examples of HTTP response bodies containing XML.

I'm already fairly aware of the history of XML, of the common alternatives that exist and of how they compare in features and suitability for given scenarios.

What would be of greater benefit is an impression of how XML is currently used in the exchange of data between HTTP hosts, regardless of whether any current usage is correct or suitable. Examples of cases where XML is misapplied are just as valuable as cases where it is correctly applied.

+1  A: 

I suggest you also study JSON, which is an alternative to XML and is widely used for its compactness.

fasih.ahmed
I'm comparing XML to JSON, YAML and Google Protocol Buffers. At present I'm just trying to collect XML usage data.
Jon Cram
Well I work on a product known as Semotus HipLink and we use JSON extensively for AJAX calls.
fasih.ahmed
+2  A: 

Probably not the answer you want, but I never use XML; it's too complicated (for my simple needs, anyway). Even if my needs were complicated, XML is too complicated a beast: it scares me to use it on a complicated problem.

hasen j
On the contrary, best answer so far.
Jon Cram
If XML is complicated, what is simple? ASN.1+BER?
bortzmeyer
Suppose I have four data fields. If they are sent as XML, what would the parsing code look like? Complicated tree traversal just to get a couple of values. JSON is much better for that.
hasen j
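
To make that contrast concrete, here is a small sketch (hypothetical field names, not taken from any answer) of pulling four fields out of each format with Python's standard library:

    # Hypothetical payload with four fields, once as XML and once as JSON.
    import json
    import xml.etree.ElementTree as ET

    xml_payload = "<user><id>7</id><name>Ann</name><email>a@example.com</email><age>30</age></user>"
    json_payload = '{"id": 7, "name": "Ann", "email": "a@example.com", "age": 30}'

    # XML: parse, then pull each value out of the element tree by path.
    root = ET.fromstring(xml_payload)
    user_from_xml = {field: root.findtext(field) for field in ("id", "name", "email", "age")}

    # JSON: the parse result is already the dictionary you wanted.
    user_from_json = json.loads(json_payload)

    print(user_from_xml)   # every value arrives as a string and still needs converting
    print(user_from_json)  # numbers arrive as numbers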
A: 

Eucaris is a web application to retrieve car registration data. The backend uses XSD-typed XML data for request and response messages.

devio
+1  A: 

Unfortunately, I can't give you any real data for business/legal reasons.

In my experience, XML has been the standard format for 90+% of the back-end, server-to-server communications that I have worked on in recent years, purely because of the prevalence of tools for working with it, and the fact that most developers have some experience with it.

Something like Google's Protocol Buffers may well be more efficient for many tasks, but the convenience and safety of a format that most programmers with "enterprisey" experience already know how to use is hard to make a business case against.

If you are selling a service to the outside world, it's much easier to sell if you offer an XML-based interface: the CIO reads "XML-based web service" and says, "fine, my team knows that..."

XML is not always (some would argue never) the best tool for the job, but its ubiquity, and the amount of existing code and skill sets (good, bad and mediocre) for working with it, often push it to the head of the candidate queue.

seanb
+1  A: 

My advice is to skip XML and look at something simpler like JSON. XML provides only two things:

1) a "standardized" way to serialize complex data
2) a way to verify (via DTD) the correctness of the above serialization

Notice that "standardized" is in quotes. The only thing standardized is the way of formatting the tags; what the tags mean is not standard at all. In the end, the only thing XML gives you is a good parser that you don't have to write yourself.

If the data you're passing around can be represented as a simple string, array, or associative array (or hash), XML is overkill.
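
One easily measurable aspect of that "overkill" point is raw payload size. A rough sketch with hypothetical data, using only Python's standard library:

    # Rough, hypothetical size comparison of the same associative array
    # serialised as JSON and as an elements-only XML document.
    import json
    import xml.etree.ElementTree as ET

    data = {"id": "7", "name": "Ann", "email": "a@example.com", "age": "30"}

    json_bytes = json.dumps(data).encode("utf-8")

    root = ET.Element("user")
    for key, value in data.items():
        ET.SubElement(root, key).text = value
    xml_bytes = ET.tostring(root)

    # XML repeats every field name twice, so it is larger; the gap grows with field count.
    print(len(json_bytes), len(xml_bytes))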

Yes, that touches on what I'm looking into. XML is believed to be overkill. In what /precise/ and /measurable/ respects are the alternatives (JSON/YAML) better? Do they offer performance gains? In what ways can such gains be reflected in business terms?
Jon Cram
I agree, but I wouldn't underestimate the value of having a good parser to hand. In my experience most people cannot write good parsers, they end up being very brittle.
+1  A: 

I try not to use it more than I have to. It definitely has its place as a transmission protocol in an architecture where the client and the server do not know about each other and are implemented independently - or an API is being developed independently of any clients. It also has a place in persistence where the same reasoning applies, and I object to it far less in that domain.

However, if the client and server are implemented by the same team then it makes little sense to translate back and forth between the two in a human-readable form, and there is almost always a faster, cheaper (in terms of processing) alternative, even if the client and server technologies are different.

Concentrating my remarks on transmission protocols: back before XML arrived, in the "bad" old client/server days when bandwidth and processing power were precious, it was the architects' job to design a protocol (normally binary) whose sole purpose was efficiency and speed, and in which packet size was minimised. The obvious limitation was that the handshake was very specific and the binary dialect was unintelligible unless it was published. The upside was that it was extremely efficient and could be highly optimised for the application at hand. Very often the binary formats were published (consider the old Excel BIFF specification: not a protocol, but an example of a published binary format).

XML over HTTP, i.e. SOAP, broke that. The rationale was very sane: have a universally understood protocol for the handshake, a sort of computer Esperanto, so that you could separate your client and server architectures and decide on their pace of development and internals completely independently. What's more, you future-proof yourself against possible client requirements with the promise that switching clients is just a matter of implementing a new one, and you allow any Joe with an XML parser to consume your API. All great stuff, and it has led to a mushrooming of very well demarcated architectures, which is wholly good.

So to quite a large degree the power of this proposition has been borne out, and there are clear advantages. However, I think that a) this requirement is often overstated and b) XML protocols are often implemented very sloppily, with scant regard for the processing cost they imply. What's more, the originally sane reasoning has given way to cases of extremist religion (I bet I get voted down), and I have seen code passing XML between function calls within the same classes, using exactly the future-proofing rationale and functional-separation arguments, which is clearly bonkers.

So my mantra is to make the communication efficient and effective. If that means providing a generalised API and protocol for arbitrary and unknown consumers, then XML is a very good choice. If it means building lightning-fast, scalable client/server (i.e. web) architectures, then I try to use a binary protocol, often rolling my own.

The emergence of JSON is testimony to the fact that the XML bandwagon had a few too many layers. JSON is an attempt to shorten the structural elements while maintaining the generality and thereby get the benefits of smaller packets. Protocols like Adobe's AMF are generally much more compact, being almost entirely binary.
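
As a rough illustration of the size gap between a hand-rolled binary framing and the text formats discussed above (the fields and layout here are invented, not AMF or any real protocol):

    # Hypothetical fixed-layout binary message versus the same values as XML.
    # Not AMF and not any real protocol; just an illustration of the size gap.
    import struct
    import xml.etree.ElementTree as ET

    user_id, balance, flags = 7, 1234.56, 3

    # Binary: 4-byte int, 8-byte double, 1-byte flags, packed big-endian.
    binary_message = struct.pack(">idB", user_id, balance, flags)   # 13 bytes

    # XML: the same three values as an elements-only document.
    root = ET.Element("account")
    ET.SubElement(root, "userId").text = str(user_id)
    ET.SubElement(root, "balance").text = str(balance)
    ET.SubElement(root, "flags").text = str(flags)
    xml_message = ET.tostring(root)

    print(len(binary_message), len(xml_message))  # 13 bytes versus roughly 80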

And that's where I think the future probably lies. I am certain that it will be possible to keep all of the upsides that XML represents for the publication of interfaces, but to trim it dramatically and make it less processor- and bandwidth-intensive; at least, that's my mission as a developer and architect.

Imagine if your average client/server request was 1/10th of the size and there was no text parsing at either end, but you retained the generality of the interface. I don't know any developer who wouldn't take that.

Simon
A: 

Like many others, at one point I experimented with SOAP and XML-RPC, but found browser support so weak that I needed to "fall back" on an ad-hoc parser when MSXML barfed on input. Early versions of my netMail application used XML, and MSIE simply wasn't fast enough with the XML parsing. I still have the XML implementation if you're really interested in seeing it.

Two real world examples spring to mind immediately as ones I've had to deal with in the last few months:

In dealing with Ingram-Micro's XML ordering interface, the messages depend on the order of all elements and are very sensitive to encoding problems. There was simply no way to use standard XML-processing tools to interact with it. An ad-hoc solution would've been better, because then there would be no question about what order the elements came in. The exchanges are performed by both push and pull methods: our server POSTs data to IM-XML's endpoint, and their server POSTs data back.

MRIS's XML feeds consist of a line like <Data Separator="~"> and then a bunch of ~-delimited data. The feeds are many megabytes in size, and simply taking the approach of line-oriented reads and splits, instead of treating the data as "XML", gets the job done faster and in less memory. The "XML" data is downloaded via HTTP GET periodically.
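
A minimal sketch of that line-oriented approach (the field layout and file name are invented, not the actual MRIS format):

    # Sketch of reading a large <Data Separator="~"> style feed line by line
    # instead of parsing it as XML. The feed file and field layout are invented.
    def read_delimited_feed(path, separator="~"):
        with open(path, "r", encoding="utf-8") as feed:
            for line in feed:
                line = line.strip()
                if not line or line.startswith("<"):   # skip the XML-ish wrapper lines
                    continue
                yield line.split(separator)            # one record's fields

    # Memory use stays flat no matter how many megabytes the feed is.
    record_count = 0
    for fields in read_delimited_feed("listings.txt"):
        record_count += 1
    print(record_count)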

I never use XML anymore; always ad-hoc parsers. I view the decision to use XML as an incredibly shortsighted one: evidence of naivete at best, and of downright stupidity the rest of the time.

Most often, I find I use raw JavaScript expressions (often called JSON) when a browser is involved (simply because eval is "as fast as possible") and S-expressions otherwise.

I'm sorry I can't help you with any good XML examples on the web; I simply don't think there are any.

geocar
A: 

I don't think XML is a byte-efficient format, but that's not what it's for. What XML provides is a good infrastructure onto which protocols can be built. In the case of the product I work on, we use SOAP to send business data to, and receive it from, external systems over which we have no control but which accept SOAP as a sound, common messaging protocol. Similarly, we use SAML assertions to exchange authorization data between systems.

Alohci
+1  A: 

I have used XML in web applications several times. Every time, it has been through SOAP web services. This is because I program in Visual Studio, which has great built-in support for SOAP web services: it automatically generates OOP wrappers that allow easy use of the service both from AJAX (client end) and from .NET (server end, for server-to-server communications).

I don't think that I can post any examples, but then I don't think it changes much anyway.
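
For readers who have only seen the generated wrappers, here is a rough sketch of what such a SOAP exchange can look like on the wire (the endpoint, action and element names below are invented, not from any real service):

    # Hypothetical sketch of the raw HTTP/XML traffic that SOAP tooling normally hides.
    # The endpoint, SOAPAction and element names are invented.
    import urllib.request

    soap_body = """<?xml version="1.0" encoding="utf-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <GetOrderStatus xmlns="http://example.com/orders">
          <orderId>42</orderId>
        </GetOrderStatus>
      </soap:Body>
    </soap:Envelope>"""

    request = urllib.request.Request(
        "http://example.com/OrderService.asmx",
        data=soap_body.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": "http://example.com/orders/GetOrderStatus",
        },
    )
    with urllib.request.urlopen(request) as response:
        print(response.read().decode("utf-8"))   # the response body is another SOAP envelope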

Vilx-
+1  A: 

I'll give you two examples of needs that we satisfied using XML:

  1. We needed to communicate data collected from many UNIX servers about file allocations, sending the details to a Windows server for analysis. Both the detail and summaries are graphically displayed through a web application.

  2. We needed to store multiple formats of form responses in a single repository for later searching and "playback". The forms are generated, stored, searched and played back within a web application.

In both cases we needed the ability to convey loosely-structured data in a self-defining format. In both cases we invented a generic XML structure that was easy for the sending process to generate, easy for the receiving process to store (essentially as a single long string), search and decode, and easy for humans to read and understand, both now and after we're all long gone. We could have invented a syntax other than XML, but nobody could think of anything better at the time, and it has served us well. I can't share specific examples because the data is considered proprietary.
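
As a guess at what such a generic, self-defining structure might look like (this is an invented sketch, not the actual format described above):

    # Invented sketch of a generic name/value XML envelope for loosely
    # structured data; not the actual format described in the answer above.
    import xml.etree.ElementTree as ET

    def to_generic_xml(record_type, fields):
        """Wrap an arbitrary dict of fields in a self-describing XML record."""
        record = ET.Element("record", {"type": record_type})
        for name, value in fields.items():
            ET.SubElement(record, "field", {"name": name}).text = str(value)
        return ET.tostring(record, encoding="unicode")

    print(to_generic_xml("fileAllocation", {"host": "unix01", "path": "/var", "usedMb": 512}))
    # <record type="fileAllocation"><field name="host">unix01</field>...</record>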

Ken Paul