You're certainly not talking about sci-fi, but a lot of this falls outside what software engineers are typically exposed to. I spent the last eight years building and using rule engines to do inference over semi-structured data in the retail world.
Doing inference over data is a well-established field. There are basically four classes of problems associated with it:
- Knowledge acquisition (getting rules out of people's heads and into the code/rules)
- Knowledge representation 'KR' (how to represent your data & rules)
- Efficient pattern matching (matching a rule from a large ruleset against a large number of facts/data)
- Inference / Reasoning (drawing further conclusions from rule matches, i.e. rules triggering more rules)
For knowledge acquisition, look at Ripple Down Rules and Decision Trees; they go a long way and are easy to understand.
Alternatively, the vast field of machine learning offers a variety of approaches to derive models from data.
For knowledge representation, look at RDF and OWL, and to a lesser degree Conceptual Graphs (CG). In terms of expressiveness, RDF and CG are roughly equivalent. The basic concept behind both is a serialisation-independent graph (triple) representation of data.
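To make the triple idea concrete, here is a minimal sketch in plain Python rather than a real RDF library. The `ex:` names, the `triples` set and the `match` helper are all made up for illustration; a real RDF store would use IRIs and a proper query language such as SPARQL.

```python
# A tiny in-memory triple store: each fact is a (subject, predicate, object) tuple.
# The "ex:" vocabulary is hypothetical and only serves the example.
triples = {
    ("ex:Widget42", "rdf:type", "ex:Product"),
    ("ex:Widget42", "ex:price", "9.99"),
    ("ex:Widget42", "ex:soldBy", "ex:StoreA"),
    ("ex:StoreA", "rdf:type", "ex:Retailer"),
}


def match(pattern, store):
    """Return the triples matching a pattern, where None acts as a wildcard."""
    return sorted(
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


# Everything that is a product, and everything known about one subject.
products = match((None, "rdf:type", "ex:Product"), triples)
about_widget = match(("ex:Widget42", None, None), triples)
```

The point is that the data is just a set of statements, independent of how it is written to disk, and that querying it is pattern matching with variables.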
For pattern matching, the classic algorithm is Rete, by Charles Forgy.
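For inference, there are two typical strategies: forward chaining and backward chaining. Forward chaining starts from the known facts and applies rules until nothing new can be derived; backward chaining starts from a goal and works back through the rules to the facts that would support it. Forward chaining is done over a ruleset like this: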
The data setup:
Rule 1: If A Then B
Rule 2: If B Then C
Facts: A
The execution:
Do {
NewFacts = Eval(RuleSet, Facts) - Facts   // keep only facts not already known
Facts = Facts + NewFacts
} while (NewFacts.Count > 0)
Feed the data A to this little algorithm, and you will 'infer' (discover) fact C from the data, thanks to the rulebase. Note that there are a lot of gotchas with inference, especially around things like non-monotonic reasoning (not just adding facts, but changing or removing facts, possibly giving rise to contradictions or loops in the inference).
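For reference, the loop above can be sketched in runnable Python roughly as follows. The rule encoding (a set of premises plus one conclusion) and the name `forward_chain` are my own assumptions for the example; it only handles the monotonic case, where facts are added and never retracted.

```python
def forward_chain(rules, facts):
    """Apply rules until no new facts appear (a fixed point).

    rules: list of (premises, conclusion) pairs, premises being a set of facts.
    facts: iterable of initially known facts.
    """
    known = set(facts)
    while True:
        # Conclusions of every rule whose premises are all known,
        # minus what is already known, so the loop can terminate.
        new_facts = {
            conclusion
            for premises, conclusion in rules
            if premises <= known
        } - known
        if not new_facts:
            return known
        known |= new_facts


rules = [
    ({"A"}, "B"),  # Rule 1: If A Then B
    ({"B"}, "C"),  # Rule 2: If B Then C
]
inferred = forward_chain(rules, {"A"})
print(sorted(inferred))  # ['A', 'B', 'C']
```

This re-evaluates every rule on every pass, which is fine for a handful of rules; avoiding that repeated work is exactly what Rete is for.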
A simplistic and naive way to get some kind of inference going would be to use a database and use joins to match up facts (statements). This may be enough for some applications. When it comes to reasoning, it's easy to get sucked into a world of complications and not-quite-there technologies. Keep it simple.
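As a rough illustration of the database approach, here is a sketch using Python's built-in `sqlite3` module. The `facts` table, the predicates and the retail-flavoured data are invented for the example; the rule being applied is "if X is in category C, and C is a subcategory of P, then X is in category P".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("widget42", "in_category", "garden_tools"),
        ("garden_tools", "subcategory_of", "outdoor"),
        ("hammer7", "in_category", "hand_tools"),
    ],
)

# The self-join matches the rule's two premises on the shared variable C
# (a.object = b.subject) and emits the conclusion as a new statement.
derived = conn.execute(
    """
    SELECT a.subject, 'in_category', b.object
    FROM facts AS a
    JOIN facts AS b ON a.object = b.subject
    WHERE a.predicate = 'in_category'
      AND b.predicate = 'subcategory_of'
    """
).fetchall()

print(derived)  # [('widget42', 'in_category', 'outdoor')]
```

This performs a single inference step. Inserting the derived rows back into `facts` and repeating until nothing new comes out gives you forward chaining on top of the database.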