Why would I ever choose to store and manipulate XML in a relational database?

tags:

rdbms
xml

views:

532

answers:

+7 Q:

Why would I ever choose to store and manipulate XML in a relational database?

Modern RDBMS's have support for XML column types and functionality for dealing with XML in stored procedures. Historically I would always have mapped hierarchical data (whether of OO objects or XML) to relational tables. Given the widespread database support for XML should I change my ways?

You can store user generated XML in there.

If a website like stackoverflow used some sort of XML markup instead of mark down you could store the question/answers as XML in the database. You might find yourself trying to parse this user generated XML looking for proprietary tags.

tpower 2008-10-08 20:26:54

Can you elaborate?

Matthew Murdoch 2008-10-08 20:28:50

This can you also do with a LONGVARCHAR field.

Horcrux7 2008-10-08 20:32:54

But can you put a constraint on a varchar to say that it must be valid XML.

tpower 2008-10-08 20:36:30

Let's say you have an entity that has attributes. You can store all those attributes in XML instead of creating a separate attribute table. The XML would be more flexible.

John Sheehan 2008-10-08 20:32:02

+4 A:

If you don't see the need then don't change!

Sometimes you have to persist data that does not have a known structure, or its structure is very volatile. In those cases, instead of creating a table, just save the XML into your existing table

sebastian 2008-10-08 20:33:23

If its XML though, wouldn't you expect it to have a fixed, or at least well-defined structure?

Matthew Murdoch 2008-10-08 20:35:30

No what does the X in XML stand for? If you want fixed you'd just use a set of tables and fields. Well-defined is fine but the definition may change more frequently than you can push Database schema changes through.

AnthonyWJones 2008-10-09 08:32:07

Flexibility is one reason.

If the structure of your data can vary, then you can still keep a common RDBMS table, along with the queries, etc. that gor after it with the somewhat variably structured data.

If you need to add a field at some point, you can do so without changing your RDMS table structure, and thus not break everyone else's queries.

JohnMcG 2008-10-08 20:36:17

I'm not sure its that simple. If queries are being written to extract the XML then changing its structure could still break the code that processes the results.

Matthew Murdoch 2008-10-08 20:45:06

+1 A:

You can process XML data directly in SQL server. E.g. you can apply XPath expressions and just send the filtered result set to the client. SQL server features can build upon XML processing capabilities later.

The features above exist from MS SQL Server 2000 or 2005.

artur02 2008-10-08 20:39:56

But would you only process XML in the server if you were concerned about the overhead of transferring too much to the server/client application?

tpower 2008-10-08 20:43:03

I've no real life XP about it. I can recommend to measure out or find data about the performance of the XML processing power of your server. I read in an MCTS book that you can use XML processing in MSSQL and it's recomended. Check TechNet or MSDN for further details.

artur02 2008-10-12 21:15:09

Processing XPath in SQL Server is VERY slow.

Kirk Broadhurst 2009-09-01 10:41:42

+1 A:

For example, you get XML documents from some other system, with very rich or complex structure that you want to store; but you only need a few well-defined queries to retrieve that data. In that case, just parse the data you need to generate some indexes, and store the whole XML structure in a single field.

To do that you don't need much XML-specific support on the DB engine, but it still helps to keep queries expressive.

Besides that, I'd guess some DMBS with good XML support could let you simply store the XML document, maybe without specifying how to index it. You just use XQuery and hope it somehow optimises to your needs.

Javier 2008-10-08 20:55:12

+3 A:

I have a good real-life example. One of my clients receives an XML file from their suppliers very often with some important data. It is deeply nested. They need to compare it with the previous XML file to see what has changed. Without XML support in database, I had to build a tool that iterates through XML nodes and looks for matches in the tables of relational database. I could use some XML-XML comparison tool, but some of the checks relate to some other data that did not come from XML file and I need to join all that together. Ok, all of this is not that big deal, but still - with XML databases you get that functionality out-of-the-box.

Milan Babuškov 2008-10-08 20:59:34

+1 A:

Here is a real world example from a system I work on. We have a core system and create customer-specific code in java. A different class may be called depending on which customer is transacting. Sometimes this custom code needs to store something and we put it into an XML column in the relevant table. This saves us from modeling everything under the sun. Adding a new customer generally just means writing and installing the java code.

The downside is that reporting, querying and updates are more difficult on the XML column. There are none of the usual good database features like check constraints, etc.

WW 2008-10-20 13:31:33

+1 A:

I've not had the need to store XML so far, but I frequently use the ability to return XML from a stored procedure. It makes some things very useful - mainly reports. I can run a SP to generate the report, send back the results in XML, and then use XSLT to display the result on the site very easily.

Valerion 2008-10-20 13:41:41

I've done this too, but it scales badly. Some reports we had ran to many thousands of rows. The XML all had to be in memory at once (first on the DB, then on the client) vs a recordset being steamed across and consumed.

WW 2008-10-23 03:44:41

I use the XML column type to stash a copy of all business-critical messages we receive from a third-party service. It's very handy for a few reasons.

1) In case of data corruption, we can back-trace to see what data came in when, and in what format.
2) Future development work on systems can be based on real data from the log table - just deserialise and use the data as if it came from a call to the 3p service
3) Makes sure the infrasructure guys are busy allocating disk space to the DB server. ;)

ZombieSheep 2009-09-01 10:26:49

+1 A:

The only reason I would ever use it again is when for extensibility & flexibility.

The overhead of xml (xpath) and maintenance (namespaces) are really not worth the hassle if you can avoid it. We have previously stored large amounts of data in xml and used scalar functions to retrieve it, but it is too slow and causes immense headaches is the xml structure or namespace changes.

But the flexibility is fantastic. You can add new properties whenever you want, you can have project/client/job specific data in there which doesn't require proper columns. The XML does not have to be in a static structure - you simply need a factory which can spawn instances to deal with the different XML (which would need to be related to a project/client/job).

When adding a new table to an existing system, especially one that has a lot of existing data and can't easily be modified, I will add an XML column. In the future if I ever need to add another column to that table, I can simply utilise the XML column rather than being frustrated and having to do a lot of rework.

In summary, you don't start out by putting essential properties in XML. But you should add XML when you know that your table might need to be extended, precisely because it gives you the option of extending.

Kirk Broadhurst 2009-09-01 10:49:13

ansaurus

tags:

views:

answers:

Why would I ever choose to store and manipulate XML in a relational database?

related questions