(Not sure if this should be CW or not, you're welcome to comment if you think it should be).

At my workplace, we have many different file formats for all kinds of purposes. Most, if not all, of them are plain text, with no consistency between them. I'm only a student working part-time, and I have no experience with using XML in production, but it seems to me that using XML would improve productivity, as we often need to parse, check and compare these outputs.

So my questions are: given that I can only control one small application and its output (output only: the inputs are formats that are used in other applications as well), is it worth trying to change the output to be XML-based? If so, what are the best known ways to do that in C++ (i.e., XML parsers/writers, etc.)? Also, should I provide a plain-text output as well, to make it easier for the users (who are also programmers) to get used to XML? Should I provide a script to translate between XML and plain text? What are your experiences with this subject?

Thanks.

+8  A: 

Don't just use XML because it's XML.

Use XML because:

  • other applications (that only accept XML) are going to read your output
  • you have a hierarchical data structure that lends itself perfectly to XML
  • you want to transform the data to other formats using XSL (e.g. to HTML)

EDIT:

A nice personal experience:

Customer: your application MUST be able to read XML.

Me: Er, OK, I will adapt my application so it can read XML.

Same customer (a few days later): your application MUST be able to read fixed width files, because we just realized our mainframe cannot generate XML.

Patrick
+0.99... not sure how i feel about "perfectly" ;)
Cogwheel - Matthew Orlando
@Patrick regarding the first point: shouldn't change start in increments? Obviously not everyone is going to move to XML right away, but isn't this the way to do it? Also, isn't the fact that plain text can be a bitch to parse a good enough reason?
Amir Rachum
@Amir, I agree that change should start in increments. But I don't always want to be the first to move to it. Start using a technology if it is mature enough (which is the case for XML) and you get clear advantages from it (not necessarily the case for this question). If your data is 2-dimensional (relational data, Excel-like data, ...) then a normal fixed-width or tab-separated file can be sufficient, and is very easy to parse, even easier than XML. For hierarchical data I agree that XML may be easier.
Patrick
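To illustrate the point about flat, tab-separated data being easy to parse, here is a minimal sketch in standard C++. The function name split_fields is made up for this example, and it assumes fields never contain the delimiter (no quoting or escaping is handled).

```cpp
#include <string>
#include <vector>

// Split one line into fields on a single-character delimiter (tab by default).
// Empty fields are preserved; quoting and escaping are NOT handled.
// split_fields is a hypothetical helper name, used only for this sketch.
std::vector<std::string> split_fields(const std::string& line, char delim = '\t') {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type pos = line.find(delim, start);
        if (pos == std::string::npos) {
            // Last field: everything from start to the end of the line.
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;  // skip past the delimiter
    }
    return fields;
}
```

Reading a whole file is then a std::getline loop that calls split_fields on each line. Once the data becomes nested, this approach stops being enough, which is where XML starts to pay off.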
"Don't just use XML because it's XML." Dude... you deserve a medal. (I.e. using a technology just for the sake of using that technology is pointless).
SigTerm
+4  A: 

Amir, to parse XML you can use TinyXML, which is incredibly easy to get started with. Check its documentation for a quick overview, and read the "what it does not do" section carefully. I've been using it for reading, and all I can say is that this tiny library does the job very well.

As for writing - if your XML files aren't complex you might build them manually with a string object. "Aren't complex" for me means that you're only going to store text at most.
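A minimal sketch of what building such a file by hand might look like, using only the standard library. The Measurement struct and the element and attribute names are invented for illustration. The one thing that is easy to forget when writing XML manually is escaping the special characters, so the sketch does that explicitly.

```cpp
#include <sstream>
#include <string>

// Escape the five characters that are special in XML text and attribute values.
std::string escape_xml(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;
        }
    }
    return out;
}

// Hypothetical record type, for illustration only.
struct Measurement {
    std::string name;
    std::string value;
};

// Serialize one record as a single element with a text body.
std::string to_xml(const Measurement& m) {
    std::ostringstream os;
    os << "<measurement name=\"" << escape_xml(m.name) << "\">"
       << escape_xml(m.value) << "</measurement>";
    return os.str();
}
```

Writing the result to std::cout or an std::ofstream gives you the output file. This does not deal with encodings or invalid control characters, so it only suits the simple "text at most" case described above.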

For more complex XML reading/writing you'd better check Xerces, which is heavier than TinyXML. I haven't used it myself, but I've seen it in production and it does deliver.

Poni
Yes to all of this. I have used TinyXML over and over again in a variety of environments. Use TinyXML for parsing; but for writing, use your own stuff, based on cout (or printf if that suits you, but since you mention best practices then never mind). Speaking for myself, I would strongly resist producing multiple output types; instead I'd show my users how to view XML files, say in a browser.
Detmar
I like libxml for parsing. Granted it's not C++.
Craig W. Wright
A: 

You can try using the boost::property_tree class.

http://www.boost.org/doc/libs/1_43_0/doc/html/property_tree.html
http://www.boost.org/doc/libs/1_43_0/doc/html/boost_propertytree/tutorial.html
http://www.boost.org/doc/libs/1_43_0/doc/html/boost_propertytree/parsers.html#boost_propertytree.parsers.xml_parser

It's pretty easy to use, but the page does warn that it doesn't support the XML format completely. If you do use this though, it gives you the freedom to easily use XML, INI, JSON, or INFO files while changing little more than the read_xml line.

If you want that ability though, you should avoid XML attributes. To use an attribute, you have to look at the key <xmlattr>, which won't transfer between file types (although you can manually create your own subnodes).

That said, using TinyXML is probably better. I've seen it used in a couple of projects I've worked on, but I don't have any experience with it myself.

Jonathan Sternberg
A: 

Another approach to handling XML in your application is to use a data binding tool, such as CodeSynthesis XSD. Such a tool will generate C++ classes that hide all the gory details of parsing/serializing XML -- all that you see are objects corresponding to your XML vocabulary and functions that you can call to get/set the data, for example:

Person p = person ("person.xml");

cout << p.name ();

p.name ("John");
p.age (30);

ofstream ofs ("person.xml");
person (ofs, p);
Boris Kolpackov
A: 

BTW, before you decide on an XML parser, you may want to make sure that it will actually be able to parse all XML documents instead of just the "simple" ones, as discussed in this article:

Are you using a real XML parser?

Boris Kolpackov