Hi,
I am looking for an efficient way to represent and retrieve the geographical relationship eg. districts->states->USA. This should accommodate any level of hierarchy eg. district->region->states->big region(East/west/south/north) -> USA.
My requirements are
- I mostly operate at the lowest level - so getting all of them fast should be the first priority. Constant time is preferred.
- Then, I want to perform aggregates eg.combine districts data at state level (so obtain all the children for a node) easily - this is the second criteria.
- Order at a level does not matter -eg. For NC, I don't mind if I first get Raleigh or Fayetville.
As you have almost have guessed - A Tree datastructure lends itself to the problem logically. But I could not find a way to get all the leaves efficiently. I can check if a node is leaf in O(log n) time but I have check each of the nodes for that.
I have looked B, B+ trees but what I didn't understand is they maintain their order using some ordering like ascending or descending.
My gut feeling is there should efficient solutions for this because - windows or any file system does this. Files->Folders->Big Folders->C -> My Computer. Also this kind of computations must be done in data mining lets say for clustering (I remember reading something of this sorts)
Any leads in this direction would be appreciated.
Thanks