This is not really a question, because I have already found a solution. It took me a lot of time to find, which is why I want to explain it here.

MSXML is based on COM, so it is not easy to use from C++, even with helper classes that deal with memory allocation issues. But writing a new XML parser would be much more difficult, so I wanted to use MSXML.

The problem:

I was able to find enough examples on the internet to use MSXML with the help of CComPtr (a smart pointer that avoids having to call Release() manually for each IXMLDOMNode), CComBSTR (to convert C++ strings to the COM string format) and CComVariant. These three helper classes are ATL classes and need an #include <atlbase.h>.

Problem: Visual Studio 2008 Express (the free version) doesn't include ATL.

Solution:

Use comutil.h and comdef.h, which include some simple helper classes:

  • _bstr_t more or less replaces CComBSTR
  • _variant_t more or less replaces CComVariant
  • _com_ptr_t indirectly replaces CComPtr, through the use of _COM_SMARTPTR_TYPEDEF
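
To make it concrete what such a smart pointer saves you from, here is a toy sketch of the idea in portable C++. It is not the real _com_ptr_t or CComPtr: FakeNode, ToyComPtr and demo_refcounts are hypothetical names invented for this illustration, and the real classes do much more (QueryInterface, conversions, error handling). Only the name Attach mirrors a method that the real classes also offer for taking over an existing reference.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object: it only counts references.
struct FakeNode {
    int refs = 1;        // a freshly created object starts with one reference
    static int alive;    // number of FakeNode objects not yet destroyed
    FakeNode() { ++alive; }
    ~FakeNode() { --alive; }
    void AddRef() { ++refs; }
    void Release() { if (--refs == 0) delete this; }
};
int FakeNode::alive = 0;

// Toy smart pointer (an assumption for illustration, not a real API):
// it calls Release() in its destructor so the caller never has to.
template <typename T>
class ToyComPtr {
public:
    ToyComPtr() : p_(nullptr) {}
    // Copying shares the object, so it needs one more reference.
    ToyComPtr(const ToyComPtr& other) : p_(other.p_) { if (p_) p_->AddRef(); }
    ToyComPtr& operator=(const ToyComPtr&) = delete;  // kept out for brevity
    ~ToyComPtr() { if (p_) p_->Release(); }

    // Take over a reference the caller already holds (no AddRef).
    void Attach(T* p) { if (p_) p_->Release(); p_ = p; }
    T* operator->() const { return p_; }

private:
    T* p_;
};

// Returns the highest reference count seen while both pointers were alive.
int demo_refcounts() {
    int peak = 0;
    {
        ToyComPtr<FakeNode> a;
        a.Attach(new FakeNode);     // refs == 1
        ToyComPtr<FakeNode> b = a;  // copy: refs == 2
        peak = a->refs;
    }   // b, then a, are destroyed: refs goes 2 -> 1 -> 0, object is deleted
    return peak;
}
```

Calling demo_refcounts() never mentions Release() at the call site, and FakeNode::alive is back to zero afterwards. That bookkeeping is what the helper classes take over.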

Small example:

#include <msxml.h>
#include <comdef.h>
#include <comutil.h>
#include <string>

// Define some smart pointers for MSXML
_COM_SMARTPTR_TYPEDEF(IXMLDOMDocument,     __uuidof(IXMLDOMDocument));     // IXMLDOMDocumentPtr
_COM_SMARTPTR_TYPEDEF(IXMLDOMElement,      __uuidof(IXMLDOMElement));      // IXMLDOMElementPtr
_COM_SMARTPTR_TYPEDEF(IXMLDOMNodeList,     __uuidof(IXMLDOMNodeList));     // IXMLDOMNodeListPtr
_COM_SMARTPTR_TYPEDEF(IXMLDOMNamedNodeMap, __uuidof(IXMLDOMNamedNodeMap)); // IXMLDOMNamedNodeMapPtr
_COM_SMARTPTR_TYPEDEF(IXMLDOMNode,         __uuidof(IXMLDOMNode));         // IXMLDOMNodePtr

void test_msxml()
{
 // This program will use COM
 if (FAILED(CoInitializeEx(NULL, COINIT_MULTITHREADED)))
  return;

 {
  // Create parser
  IXMLDOMDocumentPtr pXMLDoc;
  HRESULT hr = CoCreateInstance(__uuidof(DOMDocument), NULL, CLSCTX_INPROC_SERVER, IID_IXMLDOMDocument, (void**)&pXMLDoc);
  if (SUCCEEDED(hr))
  {
   pXMLDoc->put_validateOnParse(VARIANT_FALSE);
   pXMLDoc->put_resolveExternals(VARIANT_FALSE);
   pXMLDoc->put_preserveWhiteSpace(VARIANT_FALSE);

   // Open file (bLoadOk stays VARIANT_FALSE if loading fails)
   VARIANT_BOOL bLoadOk = VARIANT_FALSE;
   std::wstring sfilename = L"testfile.xml";
   hr = pXMLDoc->load(_variant_t(sfilename.c_str()), &bLoadOk);

   // Search for node <testtag> (pNode stays NULL if nothing matches)
   IXMLDOMNodePtr pNode;
   if (SUCCEEDED(hr) && bLoadOk == VARIANT_TRUE)
    hr = pXMLDoc->selectSingleNode(_bstr_t(L"testtag"), &pNode);

   // Read something
   if (SUCCEEDED(hr) && pNode.GetInterfacePtr() != NULL)
   {
    _bstr_t bstrText;
    hr = pNode->get_text(bstrText.GetAddress());
    std::string sSomething;
    if (SUCCEEDED(hr) && bstrText.length() > 0)
     sSomething = static_cast<const char*>(bstrText); // converts to a narrow string
   }
  }
 }

 // I'm finished with COM
 // (Don't call before all IXMLDOMNodePtr are out of scope)
 CoUninitialize();
}
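
The extra pair of braces in test_msxml() exists only so that every smart pointer is destroyed before CoUninitialize() runs. The same ordering can be had from C++ itself, because local objects are destroyed in reverse order of declaration. Below is a portable sketch of that design choice. ToyComRuntime, ToyNodePtr, g_events and parse_with_guard are hypothetical names made up for the illustration; a real guard would call CoInitializeEx in its constructor and CoUninitialize in its destructor, and only uninitialize if the initialization had succeeded.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Event log so the order of construction and destruction can be inspected.
std::vector<std::string> g_events;

// Hypothetical guard standing in for CoInitializeEx / CoUninitialize.
struct ToyComRuntime {
    ToyComRuntime()  { g_events.push_back("init"); }
    ~ToyComRuntime() { g_events.push_back("uninit"); }
};

// Hypothetical stand-in for a smart pointer such as IXMLDOMNodePtr.
struct ToyNodePtr {
    ToyNodePtr()  { g_events.push_back("acquire"); }
    ~ToyNodePtr() { g_events.push_back("release"); }
};

void parse_with_guard() {
    ToyComRuntime com;  // declared first, therefore destroyed last
    ToyNodePtr node;    // declared second, therefore destroyed first
    // ... work with node here ...
}   // node is released here, and only then is the runtime shut down
```

After parse_with_guard() returns, g_events holds init, acquire, release, uninit in that order, so no extra block is needed as long as the guard is the first local variable.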
+1  A: 

Another option would be to use an existing XML parser, such as eXpat. It avoids both ATL and the complexities of COM, and is far easier than implementing your own. I suggest this only because you've stated that the reason you're looking at MSXML is that you don't want to implement your own parser.

ctacke
I had already thought about using eXpat, but as MSXML was already on the computer, I thought that using it could help make the program smaller. But if I had known about the COM difficulties beforehand, I might have used eXpat...
Name
+2  A: 

Maybe try using the #import statement.

I've used it in a VS6 project I have hanging around, you do something like this (for illustrative purposes only; this worked for me but I don't claim to be error proof):

#import  "msxml6.dll"

  ...

MSXML2::IXMLDOMDocument2Ptr pdoc;
HRESULT hr = pdoc.CreateInstance(__uuidof(MSXML2::DOMDocument60));
if (!SUCCEEDED(hr)) return hr;

MSXML2::IXMLDOMDocument2Ptr pschema;
hr = pschema.CreateInstance(__uuidof(MSXML2::DOMDocument60)); // reuse hr; declaring it a second time would not compile
if (!SUCCEEDED(hr)) return hr;

pschema->async=VARIANT_FALSE;
VARIANT_BOOL b;
b = pschema->loadXML(_bstr_t( /* your schema XML here */ ));

MSXML2::IXMLDOMSchemaCollection2Ptr pSchemaCache;
hr = pSchemaCache.CreateInstance(__uuidof(MSXML2::XMLSchemaCache60));
if (!SUCCEEDED(hr)) return hr;

_variant_t vp=pschema.GetInterfacePtr();
pSchemaCache->add(_bstr_t( /* your namespace here */ ),vp); 

pdoc->async=VARIANT_FALSE;
pdoc->schemas = pSchemaCache.GetInterfacePtr();
pdoc->validateOnParse=VARIANT_TRUE;
if (how == e_filename)
    b = pdoc->load(v);
else
    b = pdoc->loadXML(bxmldoc);

pXMLError = pdoc->parseError;
if (pXMLError->errorCode != 0)
    return E_FAIL; // an unhelpful return code, sigh....
Jason S
#import is a great way to go: it is easy, and there is very little code to write yourself
galets
I hadn't heard about "#import" before now (with C++, I have spent more time under Unix than under Windows so far). It seems to be quite interesting.
Name
A: 

Why don't you use an MSXML wrapper that would shield you from COM, such as Arabica?

Nemanja Trifunovic
Thanks for your answer. But I don't really want to add a layer of indirection to use the XML parser.
Name
+1  A: 

You can use TinyXML. It is open source and, moreover, platform independent.

Vinay
Interesting, I hadn't heard about this parser before. It could be a "light" alternative to eXpat (though I haven't checked how many resources eXpat needs).
Name
A: 

I'm happy I posted my question although I already had a solution because I got several alternative solutions. Thanks for all your answers.

Using another parser such as eXpat, or the possibly smaller TinyXML (not as powerful, but enough for my needs), could actually be a good idea, and would make it easier to port the program to another operating system.

Using the #import directive, apparently a Microsoft-specific extension that simplifies the use of COM, is also interesting. It brought me to the following web page, "MSXML in C++ but as elegant as in C#", which explains how to simplify the use of MSXML as much as possible.

Name