views:

1207

answers:

2

Martin Fowler defines an elegant object model for the scheduling of recurring tasks here, which maps to OO code very nicely. Mapping this to a relational database schema for persistence, however, is tricky.

Can anyone suggest a schema + SQL combination that encapsulates all the functionality he describes, particularly in the image on page 11. Intersects and Unions are fairly obvious - the complexity lies in representing the 'Temporal Expressions', which take variable parameters and must be interpreted differently, and then combining those into a 'Temporal Set'.

To be clear, there are many ways to represent the concept of recurring events in relational databases. What I'd like everyone's input on is how to map this particular model.

Some possible options:

  • 'Meta' tables that define number of, and use of arguments. Ugly, but probably works. However, there is only likely to be a limited number of 'Temporal Expression' forms, so the extreme flexibility this offers is probably too much.
  • Some form of table inheritance, as supported by Postgres (and presumably, other) RBMS.

Serialising the parameter list and storing the result in a varchar() is not a solution as that method prevents set-based queries :)

+2  A: 

SQL is a language for querying sets of data. It doesn't easily support encoding of domain-specific logic operations. In other words, "rule to be evaluated" is not a data type in SQL. That's an object-oriented concept, that both data and logic are components of an object instance.

So I would say the most you could do within the SQL paradigm would be to store 365 rows, corresponding to the days of the year, and a true/false value for whether the respective day satisfies the criteria of the recurring schedule. So you have to use OO logic implementing Fowler's model to make the calculation, and store the resulting 365 rows.

Then when you need to know "is today (or any given date) part of the schedule?" it's very easy to look up the appropriate row and check the true/false column. Storing 365 rows per year is trivial for any database.

This may seem like cheating, but like I said, SQL is about sets of data, not logic.

Bill Karwin
Thanks for the idea... One immediate problem that I see though, is not being able to reconstruct the rules that were used to generate the day-set. Saving "Every Monday" would give you 52 (or 53) data points, but loading those points wouldn't necessarily imply "Every Monday".
majelbstoat
Model time better then. Have tables of days of year, days of week, months, etc, all linked by a time_id. You can then express any (day-based) rule using selects on these tables.
SquareCog
@Dmitriy: Even if you do that, you have the problem of which tables to join, and what boolean expression to use in the WHERE clause to combine the terms. This is not a problem that SQL was designed to solve.
Bill Karwin
+7  A: 

I'm afraid this answer will be a lot of references and very little practical code, and it has been a while since I last messed with this, but...

I think the two technologies you want to mix here are 'active databases' and 'temporal databases'.

The first would be useful for evaluating the rules and so on, and the second is useful to store temporal data and evaluate at when a certain record is valid. Both of these are pretty large research areas, but you can do most of the temporal stuff in plain SQL (provided your database has good time support). The active part is harder in SQL, but PostgreSQL at least has rules to help slightly with this. I don't know about the others databases, but most of them has rule/trigger/constraint support that would be able to translate to what you are looking for.

Active databases are databases that can react to changes in the facts that it stores using rules. These rules are specified in implementation specific languages, but for every day discussion Event-Condition-Action rules (ECA Rules) are common. For an introduction to active database systems read the articles The Active Database Management System Manifesto and Active Database Systems. For some more information on ECA rules, check out Logical Events and ECA Rules (the pages are in reverse order o_0) and Events in an Active Object-Oriented Database System.

Events processing is a special case of the rule handling dealing with how to handle composite events and trigger their actions appropriately. An interesting read regarding this is Composite Events for Active Databases: Semantics, Contexts and Detection and Anatomy of a Composite Event Detector. Also see the Complex Event Processing site and the Event Stream Processing and Complex Event Processing wikipedia articles.

Temporal databases can be seen as a database that can understand time, and in particular two specific kinds of time; valid-time and transaction-time. The valid-time of a record is the time period during which that record is valid, and the transaction-time of a record is the time during which it is present in the database. As a good practical introduction I'd recommand the book on how to do temporal databases in SQL: Developing Time-Oriented Database Applications in SQL by Richard T. Snodgrass.

Oterhwise, everything you might possibly want to know about temporal databases can be read in Temporal Database Entries for the Springer Encyclopedia of Database Systems which is a pretty comprehensive document (I would start at the 'Temporal Database' entry), but to get started a bit quicker, check out the Temporal Database Glossary which is rather easier to browse and read, and the site Time Center whose publications part has (or did have...) links to most notable publications in the area.

So, now that you know all about this you see quickly that the image on page 11 can be expressed as a composite event, and can be detected/evaluated as such provided you have implemented the proper required subset of a composite event detector, and the rest could be expressed as a entries in tables with temporal aspects :)

Martin Fowler addresses much of this himself in his Patterns for things that change with time that summarizes many patterns that deals with time.

In the end, I would probably create a database schema for the temporal information and either use the DB rules for the active parts or implement that part in the application (there be dragons though). If you use PostgreSQL, the rule mechanisms are described in The Rule System part of the docs.

Much to read, but if you thoroughly understand all this your professional net worth can go up quite a bit :)

Also, good terms to google are 'temporal database', 'active database', 'ECA Rule'.

Henrik Gustafsson
Very interesting stuff, and lots to look through, thanks!
majelbstoat
I was going to give you a proper answer, but I ran out of time, so you got a brain-dump of what I had when I had to stop. Hope you don't mind. I figured it would be a good place for me to store the links for future reference :)
Henrik Gustafsson
Not at all - Just the term 'temporal database' put me onto something new, including this book: http://www.cs.arizona.edu/people/rts/tdbbook.pdf - which, even if it isn't exactly what I need, will still be interesting :)
majelbstoat
Absolutely, how could I forget that one :) I'll put it in the answer so the next guy don't have to read the comments to find this. Snodgrass is very much an authority in the area.
Henrik Gustafsson