I have an application written in C++ using MFC and the Stingray libraries. It works with a wide variety of large data types, all of which are currently serialized through MFC's Document/View Serialize mechanism. I have also added XML serialization options based on the Stingray libraries, which implement DOM via the Microsoft XML SDK. While this was easy to implement, the performance is terrible, to the extent that it is unusable on anything other than very small documents.

What other XML serialization tools would you folks recommend for this scenario? I don't want DOM, as it seems to be a memory hog and I'm already dealing with large in-memory data. Ideally, I'd like a streaming parser that is fast and easy to use with MFC. My current front runner is expat, which is fast and simple but would require a lot of class-by-class serialization code to be added. Are there any other efficient, easier-to-implement alternatives that people would recommend?
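
To make the "class-by-class" cost concrete, here is a rough sketch of what the writing half of a streaming approach tends to look like without a DOM. It uses only the standard library; XmlStreamWriter, Item and serializeItems are hypothetical names for illustration and are not part of MFC, Stingray or expat. Reading the data back with a streaming parser would need a matching event handler per class, which is not shown.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical minimal streaming writer: emits XML straight to an ostream and
// keeps only the stack of open element names in memory (no DOM tree).
class XmlStreamWriter {
public:
    explicit XmlStreamWriter(std::ostream& out) : out_(out) {}

    void open(const std::string& name) {
        out_ << '<' << name << '>';
        open_.push_back(name);
    }

    void close() {
        assert(!open_.empty() && "close() without matching open()");
        out_ << "</" << open_.back() << '>';
        open_.pop_back();
    }

    // Writes <name>text</name>, escaping the characters XML reserves.
    void element(const std::string& name, const std::string& text) {
        open(name);
        out_ << escape(text);
        close();
    }

    static std::string escape(const std::string& s) {
        std::string r;
        r.reserve(s.size());
        for (char c : s) {
            switch (c) {
                case '&': r += "&amp;"; break;
                case '<': r += "&lt;"; break;
                case '>': r += "&gt;"; break;
                case '"': r += "&quot;"; break;
                default:  r += c;
            }
        }
        return r;
    }

private:
    std::ostream& out_;
    std::vector<std::string> open_;
};

// Hypothetical stand-in for one of the application's data types.
// Every such class needs its own writeXml, which is the per-class cost.
struct Item {
    std::string name;
    int id;

    void writeXml(XmlStreamWriter& w) const {
        w.open("item");
        w.element("name", name);
        w.element("id", std::to_string(id));
        w.close();
    }
};

// Streams a whole collection; memory use does not grow with document size
// beyond the output buffer itself.
std::string serializeItems(const std::vector<Item>& items) {
    std::ostringstream out;
    XmlStreamWriter w(out);
    w.open("items");
    for (const auto& item : items) item.writeXml(w);
    w.close();
    return out.str();
}
```

In a real application the std::ostringstream would presumably be replaced by a file stream, so that nothing but the current element is held in memory.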

+2  A: 

A good solution would be libxml. It provides lightweight SAX parsing and data structures for XML processing. There are several DOM libraries which are built on top of libxml.

Unfortunately it is a C library, but C++ wrappers are available.

A few years ago I switched from MSXML to libxml because of the performance issues you mentioned.

If you decide to use libxml, you should also take a look at libxslt.

DR
LibXML all the way! Don't forget its sister component, LibXSLT.
spoulson
Good point, I mentioned it in my answer.
DR
+2  A: 

We use Xerces-C++. It was easy to set up, and performance is good enough that we haven't needed to think about changing. However, we aren't XML heavy.

I did listen to a podcast by Scott Hanselman (Hanselminutes) in which they discussed the performance of MSXML and XSLT.

graham.reeds
+4  A: 

The Boost Serialization library supports XML. Its design basically comes down to:

  1. Start from the principles of MFC serialization and take all the good things it provides.
  2. Solve every single issue of MFC serialization!

Among the improvements over MFC is support for XML. Note that you don't necessarily control the XML schema of this serialization; it uses its own.

Serge - appTranslator
Boost Serialization looks like a good fit, and might well be a good first step towards moving away from MFC.
Shane MacLaughlin
A: 

We use TinyXML for all our XML needs, be it MFC or straight C++.

http://sourceforge.net/projects/tinyxml

Rob
Seems to use a DOM-style approach, which probably isn't suitable for this particular app. Thanks for the link anyway.
Shane MacLaughlin
+1  A: 

What about RapidXML? I am using it in an MFC app, with some modifications to support UTF-16 with std::string. I am quite satisfied with it so far.

std::string should be std::wstring, sorry for the typo. -tomgee
+1  A: 

This is an age-old problem. I was the team lead of the development team with the most critical path dependencies on the largest software project in the world during 1999 and 2000, and this very issue was the focus of my work during that time. I am convinced that the wheel was invented by multiple engineers who were unaware that others had already invented it. The same is true of XML Data Binding in C++. I invented it too, and I've been perfecting it for over 10 years on various projects. I have a solution that addresses the issues noted here and some additional issues that repeatedly arise:

  1. XML Updates. This is the ability to re-apply a subset of XML into an existing object model. In many cases the XML is bound to indexed objects and we cannot afford to re-index for each update.

  2. COM and CORBA interface management. In the same respect that the XML Data Binding can be automated through object oriented practices - so can the instances of interface objects that provide that data to the application layer.

  3. State Tracking. The application often needs to distinguish between an empty value vs. a missing value - both create an empty string. This provides the validation along with Data Binding.
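
As a rough illustration of points 1 and 3 only, the sketch below shows one way a bound field could remember whether its element was present, and how a partial update could be applied without touching other members. BoundString, Customer and applyUpdate are hypothetical names; this is not taken from the XMLFoundation source and makes no claim about how that library implements these features.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical bound field: stores the text plus a flag recording whether the
// element appeared in the XML at all.
struct BoundString {
    bool present;
    std::string value;

    BoundString() : present(false) {}

    void set(const std::string& v) { present = true; value = v; }
    bool isMissing() const { return !present; }
    bool isEmpty() const { return present && value.empty(); }
};

// Hypothetical bound object with two fields.
struct Customer {
    BoundString name;
    BoundString phone;

    // Applies only the elements found in `update` (element name -> text) and
    // leaves every other member as it was, so nothing has to be rebuilt.
    void applyUpdate(const std::map<std::string, std::string>& update) {
        std::map<std::string, std::string>::const_iterator it = update.find("name");
        if (it != update.end()) name.set(it->second);
        it = update.find("phone");
        if (it != update.end()) phone.set(it->second);
    }
};

// Usage: the update carries an empty <phone/> and never mentions <name>.
void demoPartialUpdate() {
    Customer c;
    c.name.set("Ada");
    assert(c.phone.isMissing());      // never seen: value is "" but flagged missing

    c.applyUpdate({{"phone", ""}});
    assert(c.name.value == "Ada");    // untouched by the partial update
    assert(c.phone.isEmpty());        // seen, and explicitly empty
    assert(!c.phone.isMissing());
}
```

Both states leave the string itself empty, which is why a separate flag (or an optional type) is needed to tell them apart.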

The source code uses the least restrictive kind of license - less restrictive than the GPL. The project is supported and managed from here:

http://www.codeproject.com/KB/XML/XMLFoundation.aspx

Now that it's the year 2010, I believe that nobody else will attempt to reinvent the wheel because there are a few to choose from. IMHO - this wheel is the most polished and well rounded implementation available.

Enjoy.