I'm looking to replace our current model (written entirely in Tcl (please don't even start)) with a new model built around an SQL (SQLite) database, and I'm looking for books or articles that give advice on designing a DB schema as well as the model interface around it.

I've been reading questions about updating bad DB schemas as well as migrating schemas, and what views are and how they might help.

I believe I understand database normalization.

My question is: what guidance is there for writing my model so that its code remains stable in the presence of DB schema changes, and for designing my DB schema so that migrating older versions to the new schema is as painless as possible?

The situation is that we generate a largely read-only DB. These DBs are going to live for a long time, and it's expected that 5-10 years from now the application will still be able to open and view a DB written today. The DB schema is not expected to change much over time, but you never know.

If it helps, you can imagine the DB consisting of a table of people:

CREATE TABLE people(id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT,
                    age INTEGER, height INTEGER, sex INTEGER, ethnicity TEXT,
                    maritalstatus TEXT, autobiography TEXT);

Note: I'll normalize out ethnicity, maritalstatus, and autobiography, just bear with the example.

The end application basically has two views. The first is a tree that allows grouping people by sex, then by age, then by height, or by height, then by ethnicity. The second view shows the autobiographies for all the people selected in the tree view. The application also lets you change the maritalstatus of people.
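To make the grouping requirement concrete, here is a minimal sketch of how a tree grouping might be turned into a query. The `Grouping` struct and `buildGroupingQuery` function are hypothetical names, not existing code, and the sketch assumes column names come from a fixed list inside the model rather than from user input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one tree grouping,
// e.g. {"sex", "age", "height"} or {"height", "ethnicity"}.
struct Grouping {
    std::vector<std::string> columns;
};

// Builds the query that would feed the tree view. Rows come back sorted by
// the grouping columns, so the tree can be built in a single pass.
// Assumption: column names are trusted (whitelisted by the model).
std::string buildGroupingQuery(const Grouping& g) {
    std::string cols;
    for (std::size_t i = 0; i < g.columns.size(); ++i) {
        if (i > 0) cols += ", ";
        cols += g.columns[i];
    }
    if (cols.empty()) {
        return "SELECT id FROM people ORDER BY id";
    }
    return "SELECT " + cols + ", id FROM people ORDER BY " + cols + ", id";
}
```

With something like this, "by height by ethnicity" is just `Grouping{{"height", "ethnicity"}}`, and whether SQLite needs an index to serve the resulting `ORDER BY` efficiently becomes a separate tuning decision.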

In the future I might add weight INTEGER or zipcode INTEGER, that kind of thing.
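As an illustration of the kind of change I have in mind, here is a rough sketch of a versioned migration list. It assumes the whole DB carries a single integer schema version (SQLite's `PRAGMA user_version` could hold it) and that an old DB can be opened writable, or copied, before upgrading. The `Migration` struct and the `migrations` and `pendingMigrations` functions are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical migration step: SQL that upgrades a DB
// from schema version `from` to version `from + 1`.
struct Migration {
    int from;
    std::string sql;
};

// Ordered list of all known steps. Version 1 is assumed to be the
// people table as it exists today.
const std::vector<Migration>& migrations() {
    static const std::vector<Migration> steps = {
        {1, "ALTER TABLE people ADD COLUMN weight INTEGER"},
        {2, "ALTER TABLE people ADD COLUMN zipcode INTEGER"},
    };
    return steps;
}

// Returns the SQL statements still to be run, in order, for a DB
// whose stored schema version is `currentVersion`.
std::vector<std::string> pendingMigrations(int currentVersion) {
    std::vector<std::string> out;
    for (const Migration& m : migrations()) {
        if (m.from >= currentVersion) {
            out.push_back(m.sql);
        }
    }
    return out;
}
```

A DB written today (version 1) would get both statements, while a DB already at version 3 would get none.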

The DBs will contain on the order of 30-40k people, and the cumulative text of the autobiographies will easily reach 1-2GB.

The tools I have at my disposal are C/C++ and sqlite.

The questions I can think of (I'm sure I'm missing obvious ones):

  1. How do I manage reading an old DB schema with a read-only DB?
  2. When creating the grouping (e.g. by height by ethnicity) should I use indexes, or views?
  3. What is a good way to represent an index in the model? (given that it will be translated to an SQL query)
  4. Do I version the entire DB? Or each table separately?
  5. Is there a known pattern for abstracting away the fact that the model's core is an SQL DB? That is, should I aim to write the model so that most of its code doesn't depend on the core being SQL? Or do I just let all model code interact with the DB?
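To show what I mean by question 5, here is a minimal sketch of one possible shape: a storage-neutral interface that the rest of the application talks to, with the SQL hidden behind an implementation of it. All names (`Person`, `PersonStore`, `InMemoryPersonStore`) are hypothetical, and the in-memory class is only a stand-in for where an SQLite-backed class would go.

```cpp
#include <string>
#include <vector>

// Hypothetical plain value type that the views would see.
struct Person {
    int id;
    std::string name;
    std::string maritalstatus;
};

// Hypothetical storage-neutral interface; nothing here mentions SQL.
class PersonStore {
public:
    virtual ~PersonStore() = default;
    virtual std::vector<Person> all() const = 0;
    // Returns false if no person has the given id.
    virtual bool setMaritalStatus(int id, const std::string& status) = 0;
};

// In-memory stand-in. An SQLite-backed class would implement the same
// interface and keep every query string private to itself.
class InMemoryPersonStore : public PersonStore {
public:
    void add(const Person& p) { people_.push_back(p); }

    std::vector<Person> all() const override { return people_; }

    bool setMaritalStatus(int id, const std::string& status) override {
        for (Person& p : people_) {
            if (p.id == id) {
                p.maritalstatus = status;
                return true;
            }
        }
        return false;
    }

private:
    std::vector<Person> people_;
};
```

The open question is whether this extra layer pays for itself, or whether letting the model code issue SQL directly is the more practical choice for a model this small.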

I know my model isn't complicated, and my situation is certainly not unique. I'm just trying to learn from others' mistakes so I don't go through too much pain rediscovering the wheel. Heck, I bet I can't find what I want simply because I'm using the wrong search terms.