To gain some experience, I am trying to build an expert system that can answer queries about the animal kingdom. However, I have run into a problem modeling the domain. I originally drew the animal kingdom hierarchy like this:

-animal
  -bird
    -carnivore
      -hawk
    -herbivore
      -bluejay
  -mammals
    -carnivores
    -herbivores

I figured this would make queries like "give me all birds" easy, but "give me all carnivores" would be much more expensive, so I rewrote the hierarchy to look like:

-animal
  -carnivore
    -birds
      -hawk
    -mammals
      -xyz
  -herbivores
    -birds
      -bluejay
    -mammals

But now it will be much slower to query "give me all birds."

This is of course a simple example, but it made me realize that I don't really know how to model complex relationships that are not strictly hierarchical when writing an expert system to answer queries like the ones above. A directed, cyclic graph seems like it could mathematically solve the problem, but storing it in a relational database and maintaining it (updates) seems like a nightmare to me. I would like to know how people typically model such things. Explanations or pointers to further reading would be appreciated.

A: 

If you take a look at the MongoDB manual page on Using Multikeys to Simulate a Large Number of Indexes, you will see that MongoDB would let you create one "document" in its database for each animal that contains all sorts of information:

{
  _id: "hawk",
  attribs: [
   {diet: 'carnivore'},
   {kingdom: 'animal'},
   {class: 'Aves'},
   {order: 'Accipitriformes'},
   {locomotion: 'flight'}
  ]
}

Then you can look up by any combination of attributes you want!
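To make the "any combination of attributes" idea concrete without a running database, here is a minimal in-memory sketch in Python. It is not MongoDB's API; it only imitates what a multikey index does by mapping every (attribute, value) pair to the set of animals that carry it. The names `ANIMALS`, `build_index`, and `find` are made up for illustration, the `wolf` entry is a hypothetical addition, and the diets follow the question's simplified example.

```python
from collections import defaultdict

# Hypothetical stand-ins for the per-animal documents shown above.
# "wolf" is an invented entry; diets follow the question's simplified example.
ANIMALS = {
    "hawk": {"diet": "carnivore", "class": "Aves", "locomotion": "flight"},
    "bluejay": {"diet": "herbivore", "class": "Aves", "locomotion": "flight"},
    "wolf": {"diet": "carnivore", "class": "Mammalia", "locomotion": "running"},
}


def build_index(animals):
    """Map each (attribute, value) pair to the set of animal names that have it."""
    index = defaultdict(set)
    for name, attribs in animals.items():
        for key, value in attribs.items():
            index[(key, value)].add(name)
    return index


def find(index, **criteria):
    """Return the animals matching every attribute=value pair in criteria."""
    matches = [index.get(pair, set()) for pair in criteria.items()]
    return set.intersection(*matches) if matches else set()


INDEX = build_index(ANIMALS)

# "class" is a Python keyword, so it has to be passed through a dict.
birds = find(INDEX, **{"class": "Aves"})
carnivores = find(INDEX, diet="carnivore")
carnivorous_birds = find(INDEX, diet="carnivore", **{"class": "Aves"})
```

Neither "all birds" nor "all carnivores" is privileged here: both are a single index lookup, and combined queries are set intersections.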

Brandon Craig Rhodes
A: 

You've hit upon one of the problems with taxonomies (far from the only one, or even the worst one, in fact). Multiple inheritance (MI) as a conceptual tool avoids many of the problems with taxonomies. Another way of putting it: a taxonomy defines a tree, whereas an MI-based classification scheme defines a more general directed acyclic graph, and therefore affords extra degrees of freedom in your modeling.
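As a rough illustration of the tree versus directed acyclic graph point, the question's animals can be sketched with Python classes that have two parents each. This is only one way to express it; the class names and the helpers `descendants` and `leaf_names` are invented for this example.

```python
class Animal:
    pass


# Two independent axes of classification, both rooted at Animal.
class Bird(Animal):
    pass


class Mammal(Animal):
    pass


class Carnivore(Animal):
    pass


class Herbivore(Animal):
    pass


# Each concrete animal has two parents, so the structure is a DAG, not a tree.
class Hawk(Bird, Carnivore):
    pass


class Bluejay(Bird, Herbivore):
    pass


def descendants(cls):
    """Collect every direct and indirect subclass of cls."""
    found = set()
    for sub in cls.__subclasses__():
        found.add(sub)
        found |= descendants(sub)
    return found


def leaf_names(cls):
    """Names of the descendants of cls that have no subclasses of their own."""
    return sorted(c.__name__ for c in descendants(cls) if not c.__subclasses__())


all_birds = leaf_names(Bird)
all_carnivores = leaf_names(Carnivore)
```

Both queries walk the same kind of edge, so neither "all birds" nor "all carnivores" is more expensive by construction.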

A relational database approach would be different (not thinking of hierarchy or inheritance specifically) but would reach much the same conceptual result as "multiple inheritance": the "class" (in the Linnaean sense of phylum/class/order/family/genus/species) is one field of the record, and the diet (carnivore, herbivore, omnivore) is a distinct one. They don't constrain each other, either in conceptualization or in searches and retrieval.
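A minimal sketch of that relational layout, using Python's standard `sqlite3` module with an in-memory database. The table name `animal`, the column names `tax_class` and `diet`, the helper `query`, and the `wolf` row are all assumptions made for the example; the diets follow the question's simplified data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Taxonomic class and diet are independent columns; neither nests inside the other.
conn.execute(
    "CREATE TABLE animal ("
    " name TEXT PRIMARY KEY,"
    " tax_class TEXT NOT NULL,"
    " diet TEXT NOT NULL)"
)
# One index per column so either kind of lookup avoids a full table scan.
conn.execute("CREATE INDEX idx_animal_class ON animal (tax_class)")
conn.execute("CREATE INDEX idx_animal_diet ON animal (diet)")

conn.executemany(
    "INSERT INTO animal (name, tax_class, diet) VALUES (?, ?, ?)",
    [
        ("hawk", "Aves", "carnivore"),
        ("bluejay", "Aves", "herbivore"),  # per the question's example
        ("wolf", "Mammalia", "carnivore"),  # hypothetical extra row
    ],
)


def query(sql, *params):
    """Run a parameterized query and return the first column of each row."""
    return [row[0] for row in conn.execute(sql, params)]


birds = query("SELECT name FROM animal WHERE tax_class = ? ORDER BY name", "Aves")
carnivores = query("SELECT name FROM animal WHERE diet = ? ORDER BY name", "carnivore")
carnivorous_birds = query(
    "SELECT name FROM animal WHERE tax_class = ? AND diet = ? ORDER BY name",
    "Aves",
    "carnivore",
)
```

Adding another axis, such as locomotion, is one more column and index rather than a restructuring of a tree.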

If you're forced to model with tools that restrict you to taxonomies (also known as trees, single inheritance, etc.), there are some tricks to ameliorate the pain they cause (to a modest degree), but they depend on each tool's specific restrictions, so it's hard to generalize.

Alex Martelli
A: 

I wrote up a user roles example that tackles a similar problem with a graph database back end. The example I use originally comes from this SQL-based example. I wouldn't even try using SQL for this kind of problem nowadays; it's such a pain. (Disclaimer: I'm on the Neo4j graphdb team.)

nawroth