I have a set of data in a tree structure, which I currently save to a binary file. The downside is that sorting and filtering the data is exceedingly difficult. On top of that, when the data set is large, reading it from the hard disk into memory is very slow.

So I am thinking about saving this tree-like data to XML files instead, for two reasons:

  1. XML comes with built-in libraries for filtering and data manipulation
  2. XML manipulation is well supported by the community, whereas right now I have to maintain my own data structure manipulation code

Given the .NET support for XML, I wonder whether it is faster (in terms of loading speed) to query data from XML than from a binary file. Is there any advantage in making the switch? I am pretty sure that, as far as programming effort goes, XML beats my own tree data structure hands down, but what about loading speed?

+1  A: 

As a rule of thumb: You won't find XML to be the smallest or fastest way to manage your data.

Your description doesn't give enough detail to say for sure, but perhaps a relational database would be a better approach. It's usually not difficult to map tree structures into relational models. (Going the other way around is a different story...)
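One common way to do the mapping is an adjacency list, where each node is a row that points at its parent. The sketch below is only an illustration under assumptions: it uses Python's built-in `sqlite3` module instead of .NET to stay short, and the `node` table, its columns, and the sample rows are made up for the example. It relies on SQLite's recursive common table expressions to walk a subtree.

```python
import sqlite3


def build_tree_db():
    """Create an in-memory database holding a small, hypothetical tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES node(id),"  # NULL marks the root
        " name TEXT NOT NULL,"
        " value REAL)"
    )
    rows = [
        (1, None, "root", 0.0),
        (2, 1, "fruit", 1.5),
        (3, 1, "vegetable", 2.5),
        (4, 2, "apple", 3.0),
        (5, 2, "pear", 4.0),
    ]
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)
    return conn


def descendants(conn, root_id):
    """Return (name, depth) for every node below root_id."""
    sql = """
    WITH RECURSIVE subtree(id, name, depth) AS (
        SELECT id, name, 0 FROM node WHERE id = ?
        UNION ALL
        SELECT n.id, n.name, s.depth + 1
        FROM node AS n JOIN subtree AS s ON n.parent_id = s.id
    )
    SELECT name, depth FROM subtree WHERE depth > 0 ORDER BY depth, name
    """
    return conn.execute(sql, (root_id,)).fetchall()


conn = build_tree_db()
print(descendants(conn, 1))

# Sorting and filtering become ordinary SQL once the tree is in a table.
filtered = conn.execute(
    "SELECT name FROM node WHERE value > 2 ORDER BY value"
).fetchall()
print(filtered)
```

The table layout is fixed up front, but the tree shape is not: any depth or branching is expressed purely through `parent_id`, so irregular trees still fit.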

Dan Breslau
Very succinctly put. +1
Cerebrus
Map a tree structure into a relational model... any guidelines on how to do this? How do I map tree nodes into columns and tables? The former is highly unstructured, but the latter are defined at the start.
Ngu Soon Hui
A good intro to this, with more links, can be found here: http://www.rockstarapps.com/wordpress/?p=82 NOTE: Unless you need to optimize on deeply-nested queries, you can stop reading when you reach the section "So what is the best way to do this?", about 20% down.
Dan Breslau
A: 

The data size will probably be larger than your current tree size, since XML is text and therefore all data has to be serialized to a textual representation. Whether loading ends up slower also depends on your current load implementation.
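To get a rough feel for the size difference, here is a small sketch that serializes the same numbers both ways. It is an illustration only: it is written in Python rather than .NET, the element names `values` and `v` are invented, and the ratio will vary with the data (short integers, for instance, can be more compact as text than as fixed-width binary fields).

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical payload: 1000 floating-point values.
values = [i * 0.5 for i in range(1000)]

# Binary form: each value is a little-endian 8-byte double.
binary_blob = struct.pack(f"<{len(values)}d", *values)

# XML form: each value becomes a <v> element holding its text representation.
root = ET.Element("values")
for v in values:
    ET.SubElement(root, "v").text = repr(v)
xml_blob = ET.tostring(root, encoding="utf-8")

print("binary bytes:", len(binary_blob))
print("xml bytes:   ", len(xml_blob))

# Reading the XML back requires parsing the markup and converting every
# text node to a number again, which is work the binary form avoids.
parsed = [float(e.text) for e in ET.fromstring(xml_blob)]
```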

As for the rest, querying and modifying data is very simple and fairly efficient if done right, but due to its textual nature XML usually cannot outperform a well-done binary implementation.
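As an illustration of how little code such queries take, here is a sketch using Python's standard `xml.etree.ElementTree` (in .NET, XPath or LINQ to XML would play the same role). The `catalog` document and its attribute names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree-shaped data stored as XML.
doc = """
<catalog>
  <category name="fruit">
    <item name="apple" price="3.0"/>
    <item name="pear" price="4.0"/>
  </category>
  <category name="vegetable">
    <item name="leek" price="2.5"/>
  </category>
</catalog>
"""
root = ET.fromstring(doc)

# Filter by structure: every item under the "fruit" category.
fruit_items = root.findall("./category[@name='fruit']/item")
fruit_names = sorted(item.get("name") for item in fruit_items)

# Filter by value: attributes are text, so convert before comparing.
expensive = [
    item.get("name")
    for item in root.iter("item")
    if float(item.get("price")) > 2.8
]

# Modify in place: update the price of one item.
for item in root.iter("item"):
    if item.get("name") == "leek":
        item.set("price", "2.75")

print(fruit_names, expensive)
```

Note that this loads the whole document into memory first, which is part of why large files get slow.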

If you need transformation of your tree data (for displaying etc.), XML is great - using XSL Transformations allows you to create pretty much any XML, HTML or text representation of your data with little programming (and therefore also testing and debugging) effort.

Lucero
A: 

From a performance standpoint, XML will almost surely lose the competition against a binary structure. However, from a development and technological standpoint, you are quite right in your estimation that XML wins hands down.

I completely concur with @Dan's statement. Performance with XML data structures degrades sharply as the size of the data increases, because the whole document typically has to be parsed from text before it can be queried. XML is so common because most applications do not deal with very large amounts of data; those are usually stored in databases or serialized to binary data.

Cerebrus