tags:

views:

157

answers:

2

Has any one used SQL server 2008 as an xml document database? what are you thoughts on doing so. Is the indexing and querying of the XML data type suficent to support this type of role? Is the query performance of XML acceptable?

A: 

I have not tried it, but XML seems a little too verbose to me. I figure I can generate xml from my data later, why worry about storing it in XML.

KM
I have documents that are xml and it would be nice to take advantage of that structure and query them. Much simpler then parsing the xml file to pull bits and pieces of interest into tables.
Aaron Fischer
@Aaron Fischer, there will be pain: either pulling them apart to store so you can query within them easily or you can easily store them but painfully pull them apart to query within them. you pick! I like to do it one time, when inserting them, not every time to query them. but then again, I have not messed with them that much to really understand the performance impact of them vs traditional table structures.
KM
+1  A: 

I don't know what exactly your requirements are, and how many documents and what sizes we're talking about here.

SQL Server 2005 does allow you to specify XML schemas so you can definitely get some validation into the equation which is certainly beneficial.

As for XML indexing - you can index on three different strategies once you've created a basic primary XML index.

  • the first index type more for optimizing the XPath to a single XML node when you do lots of XPath-based queries for nodes (CREATE XML INDEX ..... FOR PATH)
  • the second index type more for optimizing access to values inside your XML nodes, when you search more based on values in the XML document (CREATE XML INDEX ..... FOR VALUE)
  • the third is somewhat of a hybrid of the two above (which I never quite groked myself, to be honest; CREATE XML INDEX ..... FOR PROPERTY)

The XML indexes worked quite well in our samples, but the main drawback in our case was the sheer size of the indexes on disk. Our 1.3 GB database grew to over 11 GB just by adding a PRIMARY XML and a XML FOR PATH index to roughly 45'000 entries with an XML field. Due to disk constraints, we ended up having to take down those indices :-(

This is really not all that surprising considering how the XML index will be built-up with entries for each XML node, attribute and so forth - it's just lots of data.

What we've done in the end is create a number of stored functions that reach into the XML from our Entry table, and we extract those bits and pieces that we need most often. Those are now stored on the Entrytable as computed, persisted properties. This is as fast as "proper" fields on the Entry table, it's always up to date and gets set automatically when new data is inserted, and we hardly ever need to really use any significant XQuery requests anymore.

What I can say from personal experience is that the XML support in SQL Server 2005 is really quite profound and well thought out, in my opinion. So all in all, I would say - go give it a try! You won't really be able to tell whether it works and scales nicely enough in your specific case until you've given it a try.

Marc

marc_s