views:

102

answers:

1

i've got a list of all countries -> states -> cities (-> subcities/villages etc) in a XML file and to retrieve for example a state's all cities it's really quick with XML (using xml parser).

i wonder, if i put all this information in mysql, is retrieving a state's all cities as fast as with XML? cause XML is designed to store hierarchical data while relational databases like mysql are not.

the list contains like 500 000 entities. so i wonder if its as fast as XML using either of:

Adjacency list model

Nested Set model

And which one should i use? Cause (theoretically) there could be unlimited levels under a state (i heard that adjacency isn't good for unlimited child-levels). And which is fastest for this huge dataset?

Thanks!

+2  A: 

In this article Quassnoi creates a table with 2,441,405 rows in a heirarchical structure, and tests the performance of highly optimized queries for nested sets and adjacency lists. He runs a variety of different tests, for example fetching ancestors or descendents and times the results (read article for more details of exactly what was tested):

                                      Nested Sets    Adjacency Lists
All descendants                        300ms         7000ms
All ancestors                           15ms          600ms
All descendants up to a certain level 5000ms          600ms

His conclusion is that for MySQL nested sets is faster to query, but has a drawback that it is much slower to update. If you have infrequent updates, use nested sets. Otherwise prefer adjacency lists.

You might also wish to consider if using another database that supports recursive CTEs is an option for you.

I would imagine that an XML file of this size would take a reasonably long time to parse, but if you can cache the parsed structure in memory rather than reading it from disk each time then queries against it will be very fast.

Note that the main drawback of using MySQL for storing heirarchical data is that it requires some very complex queries. Whilst you can just copy the code from the article I linked to, if ever you need you modify it slightly then you will have to understand how it works. If you prefer to keep things simple then XML definitely has an advantage as it was designed for this type of data and so you should easily be able to create the queries you need.

Mark Byers
i've read another article about the differences between these two models, and it seems that nested set model is VERY difficult to update (requires a lot of work). so i think i would go with adjacency though it's so simple. i thought the model couldn't fetch ALL descendants when the levels are unlimited? so in this test, there was a fixed nr of levels? what do you think about XML database? i've read a little about it. does it sounds like a solution? or would u go with adjacency?
never_had_a_name
@asje: You might be thinking of this article: http://dev.mysql.com/tech-resources/articles/hierarchical-data.html - it doesn't go into as much detail as Quassnoi does in his. If you want my opinion, I would not use MySQL for this task if possible. I would use adjacency list and use SQL Server, Oracle or PostgreSQL as the database. All of these have support for recursive queries.
Mark Byers
@asjie: In Quassnoi's testing the queries support nesting to any depth. The test database he used had a maximum depth of 8.
Mark Byers
@mark byers. so you would prefer other RDBMS rather than a XML database server? isn´t XML better to use when dealing with tree-structure information? cause i think i will go with either mysql or xml database due to my lack of knowledge of the others.
never_had_a_name
There are, as always, variations... for example you can have a sparse nested set with gaps between the left and right values, which makes many updates much faster, at the expense of increasing the complexity (as you have to test for the worst case where you've run out of gaps, and rearrange the tree). Either way, I've used nested sets extensively in MySQL; there's nothing wrong with it. (Obviously I'm talking InnoDB and not awful MyISAM.)
bobince