Does anyone know of papers/books/etc. that document patterns for databases? For example, one common rule of thumb is that every table should have a primary key and that the key should be devoid of information content. So I was wondering if anyone had written a book or published papers regarding design patterns for designing relational databases?
Books by E.F. Codd and C.J. Date are the most obvious answers. I have not read this particular book, but I am familiar with the authors, so it is likely quite good:
Applied Mathematics for Database Professionals by Lex de Haan and Toon Koppelaars.
Thanks for the link. Wish I'd been smart enough to think of that one myself. I guess I just didn't think there would be any books--I figured I'd be more likely to find research papers and things of that nature.
Actually, I think the rule of thumb is typically to use a natural key rather than a surrogate whenever possible...
So if I have, for instance, an Invoice table and an InvoiceDetail table, we can probably use InvoiceNumber as our primary key on the first one. It already exists in our data and (I assume?) would be unique. For the second table, we are probably going to be stuck needing a surrogate key, however -- whether it's combined with InvoiceNumber as a composite key or not.
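To make the Invoice/InvoiceDetail idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names other than InvoiceNumber (CustomerName, LineNumber, Description, AmountCents) and the sample rows are made up for illustration, and the composite key is just one of the two options mentioned above, not the only reasonable layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless asked to.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Invoice (
    InvoiceNumber TEXT PRIMARY KEY,   -- natural key, assumed unique in the data
    CustomerName  TEXT NOT NULL       -- hypothetical column
);
CREATE TABLE InvoiceDetail (
    InvoiceNumber TEXT    NOT NULL REFERENCES Invoice(InvoiceNumber),
    LineNumber    INTEGER NOT NULL,   -- surrogate part: no business meaning
    Description   TEXT    NOT NULL,   -- hypothetical column
    AmountCents   INTEGER NOT NULL,   -- hypothetical column
    PRIMARY KEY (InvoiceNumber, LineNumber)
);
""")

conn.execute("INSERT INTO Invoice VALUES ('INV-1001', 'Acme')")
conn.executemany(
    "INSERT INTO InvoiceDetail VALUES (?, ?, ?, ?)",
    [("INV-1001", 1, "Widgets", 2500), ("INV-1001", 2, "Shipping", 500)],
)


def try_insert(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second invoice with the same number violates the natural primary key.
duplicate_invoice = try_insert("INSERT INTO Invoice VALUES ('INV-1001', 'Other')")
# A detail row for an unknown invoice violates the foreign key.
orphan_detail = try_insert(
    "INSERT INTO InvoiceDetail VALUES ('INV-9999', 1, 'Ghost', 1)"
)
total = conn.execute(
    "SELECT SUM(AmountCents) FROM InvoiceDetail WHERE InvoiceNumber = 'INV-1001'"
).fetchone()[0]
```

Both rejected inserts come back as False, so the uniqueness assumption about InvoiceNumber is something the database checks rather than something we just hope for.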
In any event, back to the original question... hometoast's link should get you started.
-- Kevin Fairchild
Using primary keys with business meaning ("natural keys") certainly has its merits, but it can make refactoring your database very difficult. Use caution, especially if there's any reason to believe the database structure will change over time.
@Gaius,
That is the question that a database designer needs to weigh--what is the probable stability of the database structure? Given a long-enough horizon nothing is stable. Or to say the converse, given a long-enough horizon, everything is subject to change. A surrogate key (in theory) should never change its meaning because it never had meaning to begin with.
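A rough illustration of that trade-off, again with Python's sqlite3 module. The table names (InvoiceNat, DetailNat, InvoiceSur, DetailSur), the InvoiceId column, and the "year prefix" renumbering are all hypothetical; the point is only to show what happens to child rows when a key with business meaning changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.executescript("""
-- Variant 1: natural key, child rows store the invoice number itself.
CREATE TABLE InvoiceNat (
    InvoiceNumber TEXT PRIMARY KEY
);
CREATE TABLE DetailNat (
    InvoiceNumber TEXT NOT NULL
        REFERENCES InvoiceNat(InvoiceNumber) ON UPDATE CASCADE,
    LineNumber    INTEGER NOT NULL,
    PRIMARY KEY (InvoiceNumber, LineNumber)
);

-- Variant 2: surrogate key, the invoice number is an ordinary unique column.
CREATE TABLE InvoiceSur (
    InvoiceId     INTEGER PRIMARY KEY,
    InvoiceNumber TEXT NOT NULL UNIQUE
);
CREATE TABLE DetailSur (
    InvoiceId  INTEGER NOT NULL REFERENCES InvoiceSur(InvoiceId),
    LineNumber INTEGER NOT NULL,
    PRIMARY KEY (InvoiceId, LineNumber)
);

INSERT INTO InvoiceNat VALUES ('1001');
INSERT INTO DetailNat  VALUES ('1001', 1), ('1001', 2);
INSERT INTO InvoiceSur VALUES (1, '1001');
INSERT INTO DetailSur  VALUES (1, 1), (1, 2);
""")

# Hypothetical business change: invoice numbers gain a year prefix.
conn.execute(
    "UPDATE InvoiceNat SET InvoiceNumber = '2024-1001' WHERE InvoiceNumber = '1001'"
)
conn.execute(
    "UPDATE InvoiceSur SET InvoiceNumber = '2024-1001' WHERE InvoiceNumber = '1001'"
)

# Natural key: every child row had to be rewritten by the cascade.
nat_children = conn.execute(
    "SELECT InvoiceNumber FROM DetailNat ORDER BY LineNumber"
).fetchall()
# Surrogate key: child rows are untouched, they still point at InvoiceId 1.
sur_children = conn.execute(
    "SELECT InvoiceId FROM DetailSur ORDER BY LineNumber"
).fetchall()
```

With the natural key the change ripples into every referencing table (and only works here because ON UPDATE CASCADE was declared up front); with the surrogate it is a one-row update. Whether that matters depends on how likely the "natural" value is to change.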
I guess the other thing to consider in that particular design scenario is who will be seeing the primary key. If the primary key is something that end-users will actually need to refer to, then it makes sense to make it something they can understand. But I can't think of many cases where an end-user needs to see a primary key; usually the primary key is present to allow the DB engine to speed up certain operations.
My original thought in asking the question was to find design patterns for database design that were codified by more experienced database designers than myself so as to, hopefully, avoid some easily avoidable errors. It would be interesting reading if anyone had ever codified database design anti-patterns.
Specifically, regarding keys: I strongly disagree with the strange idea that keys must be without meaning. In general, I consider a database a collection of facts; as soon as you start adding arbitrary numbers (like generated keys) and other irrelevant information into it, it should be a warning sign. I recommend this article by Joe Celko for more on keys.
More general notes:
Suggestions for schema designs/data models for different businesses:
David C. Hay: Data Model Patterns: Conventions of Thought
Rather old, but there is a reason why it's still in print
http://www.dorsethouse.com/books/dmp.html
Maybe not very pattern-like, but still very good: Stephane Faroult, Peter Robson: The Art of SQL http://oreilly.com/catalog/9780596008949/
Another one which I can recommend: Vadim Tropashko: SQL Design Patterns - The Expert Guide to SQL Programming http://www.rampant-books.com/book_2006_1_sql_coding_styles.htm
Systematic text-book about data modelling: Graeme Simsion & Graham Witt, "Data Modeling Essentials" http://www.elsevierdirect.com/product.jsp?isbn=9780126445510
Maybe you are actually looking for a "style guide"? In that case: Joe Celko: SQL Programming Style http://www.elsevierdirect.com/product.jsp?isbn=9780120887972