views:

2904

answers:

3

Does it have any sense (except of server side validation XML/schema/dtd) to store XML in XML type instead of text/varchar/ntext? I'm not planning to do any XML manipulation on database side.

Purpose of my investigation is to decrease database size. Can a using an XML data type for untyped XML for the purpose? Which other pros and cons it will have?

Find an article related to the topic, but not sure if authors assumptions/conclusions are correct.

+4  A: 

If you store xml in an xml typed column, the data will not get stored as simple text, as in the nvarchar case, it will be stored in some sort of parsed data tree, which in turn will be smaller than the unparsed xml version. This not only decreases the database size, but gives you other advantages, like validation, easy manipulation etc. (even though you're not using any of these, still, they are there for future use).

On the other hand, the server will have to parse the data upon insertion, which will probably slow your database down - you have to make a decision of speed vs. size.

Edit:

Personally, I think that data in the database should be stored as xml only when it has structure which is hard to implement in a relational model, e.g. layouts, style descriptions etc. Usually that means that there won't be much data and speed is not an issue, thus added xml features, like data validation and manipulation ability (also, last but not least, the ability to click on the value in managment studio and see formatted xml - I really love that feature!), outweight the costs.

I don't have direct experience in storing large amounts of xml in the database and I wouldn't do that if I had the option, since it is almost always slower that a relational model, but if that would be the case, I'd recommend profiling both options, and choosing between size and speed that best suit your needs.

Saulius
I read the theory (i.e. MSDN article). Any practical experience with XML data type? I'm slightly shocked that storing XML as XML data type increased database processor usage (see link in my original post).
FoxyBOA
Can you please give us some more details regarding your statement that the database size is decreased when using XML? It is contrary to FoxyBOA measured.
Piotr Owsiak
+1  A: 

My quick investigation shows that MS SQL 2005 (Express Edition)

Microsoft SQL Server 2005 - 9.00.3073.00 (Intel X86) Aug 5 2008 12:31:12 Copyright (c) 1988-2005 Microsoft Corporation Express Edition on Windows NT 6.0 (Build 6000: )

store XML with overhead about 70% (possible for faster processing/parsing).

My data before conversion: rows=160320, reserved=178576 KB, data=178184 KB, index_size=272 KB, unused=120 KB

My data after the conversion: rows=160320, reserved=309702 KB, data=307216 KB, index_size=1672 KB, unused=184 KB

So it hasn't any sense to store XML data in XML data type if you are not planning use XML technology on database side.

FoxyBOA
Are you sure that DB size increases? It's contrary to what Saulius stated. It's important to me since I have a logging module that writes to a DB and some guy on my team proposed to keep some log details in XML. If DB would decreases I might consider converting to XML, but definitely not if the DB size increases.
Piotr Owsiak
See figures at my example (compare data value before and after conversion). In my case DB size was increased at 1,7 times.
FoxyBOA
+1  A: 

I am using xml extensively for communicating to handheld devices and am using XQuery in most all my stored procs to retrieve only the data from the XML that I need. It works GREAT!! I am just worried about storage space because with only a hundred thousand records or so, the DB size is 1 to 2 GB. We are expecting going live with millions of records stored for logging and customer usage.. so it will be worrisome until I see what it's actually going to do in terms of space usage.

I'm in other position: all my XML manipulations are executed on a side of application server. So I'm free to choose how to store XML in a database (in text or XML format).
FoxyBOA