I am intentionally leaving this quite vague at first. I'm looking for discussion and what issues are important more than I'm looking for hard answers.
I'm in the middle of designing an app that does something like portfolio management. The design I have so far is
- Problem: a problem that needs to be solved
- Solution: a proposed solution to one or more problems
- Relationship: a relationship among two problems, two solutions, or a problem and a solution. Further broken down into:
- Parent-child - some sort of categorization / tree hierarchy
- Overlap - the degree to which two solutions or two problems really address the same concept
- Addresses - the degree to which a problem addresses a solution
My question is about the temporal nature of these things. Problems crop up, then fade. Solutions have an expected resolution date, but that might be modified as they are developed. The degree of a relationship might change over time as problems and solutions evolve.
So, the question: what is the best design for versioning of these things so I can get both a current and an historical perspective of my portfolio?
Later: perhaps I should make this a more specific question, though @Eric Beard's answer is worth an up.
I've considered three database designs. I'll enough of each to show their drawbacks. My question is: which to pick, or can you think of something better?
1: Problems (and separately, Solutions) are self-referential in versioning.
table problems
int id | string name | text description | datetime created_at | int previous_version_id
foreign key previous_version_id -> problems.id
This is problematic because every time I want a new version, I have to duplicate the entire row, including that long description
column.
2: Create a new Relationship type: Version.
table problems
int id | string name | text description | datetime created_at
This simply moves the relationship from the Problems and Solutions tables into the Relationships table. Same duplication problem, but perhaps a little "cleaner" since I already have an abstract Relationship concept.
3: Use a more Subversion-like structure; move all Problem and Solution attributes into a separate table and version them.
table problems
int id
table attributes
int id | int thing_id | string thing_type | string name | string value | datetime created_at | int previous_version_id
foreign key (thing_id, thing_type) -> problems.id or solutions.id
foreign key previous_version_id -> attributes.id
This means that to load the current version of a Problem or Solution I have to fetch all versions of the attribute, sort them by date and then use the most current. That might not be terrible. What seems really bad to me is that I can't type-check these attributes in the database. That value
column has to be free-text. I can make the name
column a reference into a separate attribute_names
table that has a type
column, but that doesn't force the correct type in the attributes
table.
later still: response to @Eric Beard's comments about multi-table foreign keys:
Alas, what I've described is simplistic: there are only two types of Things (Problems and Solutions). I actually have about 9 or 10 different types of Things, so I'd have 9 or 10 columns of foreign keys under your strategy. I wanted to use single-table inheritance, but the Things have so little in common that it would be extremely wasteful to do combine them into one table.