



To gain some experience, I am trying to make an expert system that can answer queries about the animal kingdom. However, I have run into a problem modeling the domain. I originally considered the animal kingdom hierarchy to be drawn like


I figured this would make queries like "give me all birds" easy, but would make queries like "give me all carnivores" much more expensive, so I rewrote the hierarchy to look like:


But now it will be much slower to query "give me all birds."

This is of course a simple example, but it got me thinking that I don't really know how to model complex relationships that are not so strictly hierarchical in nature in the context of writing an expert system to answer queries as stated above. A directed, cyclic graph seems like it could mathematically solve the problem, but storing this in a relational database and maintaining it (updates) would seem like a nightmare to me. I would like to know how people typically model such things. Explanations or pointers to resources to read further would be acceptable and appreciated.


If you take a look at the MongoDB manual page on Using Multikeys to Simulate a Large Number of Indexes, you will see that MongoDB would let you create one "document" in its database for each animal that contains all sorts of information:

  {
    _id: "hawk",
    attribs: [
      {diet: 'carnivore'},
      {kingdom: 'animal'},
      {class: 'Aves'},
      {order: 'Accipitriformes'},
      {locomotion: 'flight'}
    ]
  }

Then you can look up by any combination of attributes you want!
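To make the idea concrete without depending on a running MongoDB server, here is a minimal in-memory sketch in Python. It is not MongoDB's API. It only imitates the effect of indexing every (attribute, value) pair so that any combination can be looked up. The sample animals other than the hawk, and the names `ANIMALS`, `build_index`, and `find`, are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data; only "hawk" appears in the answer above.
ANIMALS = {
    "hawk": {"diet": "carnivore", "class": "Aves", "locomotion": "flight"},
    "sparrow": {"diet": "omnivore", "class": "Aves", "locomotion": "flight"},
    "wolf": {"diet": "carnivore", "class": "Mammalia", "locomotion": "walking"},
}


def build_index(animals):
    """Map each (attribute, value) pair to the set of animal ids that have it."""
    index = defaultdict(set)
    for animal_id, attribs in animals.items():
        for pair in attribs.items():
            index[pair].add(animal_id)
    return index


def find(index, criteria):
    """Return the sorted ids of animals that match every attribute in criteria."""
    sets = [index.get(pair, set()) for pair in criteria.items()]
    return sorted(set.intersection(*sets)) if sets else []


INDEX = build_index(ANIMALS)

birds = find(INDEX, {"class": "Aves"})
carnivores = find(INDEX, {"diet": "carnivore"})
carnivorous_birds = find(INDEX, {"class": "Aves", "diet": "carnivore"})
```

Neither "all birds" nor "all carnivores" is privileged here: both are a single index lookup, and combined queries are set intersections.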

Brandon Craig Rhodes

You've hit upon one of the problems with taxonomies (far from the only one, or even the worst one, in fact). Multiple inheritance (MI) as a conceptual tool avoids many of the problems with taxonomies. Put another way: a taxonomy defines a tree, whereas an MI-based classification scheme defines a more general directed acyclic graph, and therefore affords extra degrees of freedom in your modeling.
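As a rough illustration of the tree-versus-DAG point, the sketch below stores each category's direct parents and lets a node have more than one. The category names and the helpers `ancestors` and `members` are hypothetical, chosen only to mirror the birds and carnivores example from the question.

```python
# Hypothetical category graph: each node lists its direct parents.
# Allowing more than one parent turns the tree into a directed acyclic graph.
PARENTS = {
    "animal": [],
    "bird": ["animal"],
    "mammal": ["animal"],
    "carnivore": ["animal"],
    "hawk": ["bird", "carnivore"],
    "sparrow": ["bird"],
    "wolf": ["mammal", "carnivore"],
}


def ancestors(node, parents=PARENTS):
    """Return every category reachable from node by following parent links."""
    seen = set()
    stack = list(parents[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def members(category, parents=PARENTS):
    """Return the sorted nodes that have category among their ancestors."""
    return sorted(n for n in parents if category in ancestors(n, parents))


all_birds = members("bird")
all_carnivores = members("carnivore")
```

Because "hawk" has two parents, it shows up under both "bird" and "carnivore" without either query being the awkward one. Note that `members` also returns intermediate categories (for example, `members("animal")` includes "bird"), which may or may not be what a real system wants.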

A relational database approach would be different (not thinking of hierarchy or inheritance specifically) but would come to much the same conceptual results as "multiple inheritance": the "class" (in the Linnaean sense of phylum/class/order/family/genus/species) is one field of the record, and the diet (carnivore, herbivore, omnivore) is a distinct one. They don't constrain each other, either in conceptualization or in searches and retrieval.
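A minimal sketch of that relational layout, using Python's built-in `sqlite3` module with an in-memory database. The table name `animals`, the column names `tax_class` and `diet`, the sample rows, and the helper `names_where` are all assumptions for illustration, not part of any particular schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Taxonomic class and diet are independent columns of the same record.
conn.execute(
    "CREATE TABLE animals (name TEXT PRIMARY KEY, tax_class TEXT, diet TEXT)"
)
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        ("hawk", "Aves", "carnivore"),
        ("sparrow", "Aves", "omnivore"),
        ("wolf", "Mammalia", "carnivore"),
        ("cow", "Mammalia", "herbivore"),
    ],
)

# One index per column, so neither kind of query is favored over the other.
conn.execute("CREATE INDEX idx_animals_class ON animals (tax_class)")
conn.execute("CREATE INDEX idx_animals_diet ON animals (diet)")


def names_where(column, value):
    """Return the sorted names of animals whose column equals value."""
    # The column name is interpolated into SQL, so only allow known columns.
    if column not in {"tax_class", "diet"}:
        raise ValueError(f"unknown column: {column}")
    rows = conn.execute(
        f"SELECT name FROM animals WHERE {column} = ? ORDER BY name", (value,)
    )
    return [row[0] for row in rows]


all_birds = names_where("tax_class", "Aves")
all_carnivores = names_where("diet", "carnivore")
```

Both "give me all birds" and "give me all carnivores" become the same kind of indexed lookup, which is the sense in which the two fields do not constrain each other.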

If you're forced to model with tools that restrict you to taxonomies (AKA trees, single inheritance, &c), there are some tricks to ameliorate the pain they cause (to a modest degree), but they depend on each tool's specific restrictions, so it's hard to generalize.

Alex Martelli

I wrote up a user roles example that tackles a similar problem with a graph database back end. The example I use originally comes from this SQL-based example. I wouldn't even try using SQL for this kind of problem nowadays; it's such a pain. (Disclaimer: I'm on the Neo4j graphdb team.)
