views:

61

answers:

3

We're building a simple web based system whereby somebody adds a record, a CMS page for example, that gets approved by someone in charge before being shown on website.

If the author then decides to edit that page later, we want to create a draft based on the live copy, on approval it will replace the old live page.

We thought about doing complete version control but believe we can keep this simpler by just having 1. Just a draft, 2. Just a live, or 3. One draft and one live.

This functionality is required across multiple 'things' not just pages.

Finally the question: Do you think it would be better to store these two records in the same table, or would a mirror table be better?

I guess it probably depends but I don't like the ideal of having two tables with the same structure. Is the trade off for slightly slower operations (as we'll have to query out the drafts all the time when displaying the data) worth it?

+4  A: 

Moving stuff from table to table when there's a state change is a bad idea.

When you want to add additional states to the workflow, you have to add yet more tables.

It's just a state change -- that's what relational databases are optimized for.

One table, multiple states is the standard approach.

If you find that things are horrifyingly slow -- and you can prove that the state-based query is the entire cause -- you can resort to "materialized views" or similar technology where the state change (and resulting move) is handled by the RDBMS.

Table-per-state is a bad idea.

  1. You can't easily add states. You have to add tables, also, making it painful. Further, you have to update code with the new table name(s) to reflect the new workflow.

    If a state is just a column, adding new states is adding new values and new if-statements in the code. State changes are just updates, not "delete-insert".

    Data lasts forever, workflows come and go every time a user has a clever idea. Don't punish them for wanting to change the workflow.

  2. You can't easily have sub-states. Many state machines are actually multiple, nested, state machines. Adding a sub-state with table-per-state creates yet more tables with yet more rules.

    If a state is just a column, a nested substate is just another column with new if-statements in the code. State changes are just updates, not "delete-insert".

  3. You can't easily have parallel state machines. Many times there are many parallel status code changes. Sometimes there are manual workflows (approvals) and automated worflows (archiving, copying to the data warehouse, etc.) With table-per-state and parallel state machines, there's no way to implement it rationally

    If each state is just a columns, parallel state machines are just parallel updates.

S.Lott
On the whole I agree with what you are saying, but you are aware right that this isn't just one record that goes through a set of states right? It will be initially, but then a draft will need to exist along with the live related record, and possible replace it later after approval.
tsdbrown
@rsdbrown: "adds a record". Sounds singular to me. Please update the question if it not one record. What's important is that updating multiple records or updating a single record doesn't matter. Table-per-state is a bad design and rapidly gets worse when you try to add states. State is just a status code on the row.
S.Lott
Look, we're in agreement with the two tables being a bad design, I even hinted to my thoughts on it in my question! The reason I asked it was because I've seen how others have implemented this in rails plugins such as acts_as_versioned and has_draft. I don't think you understand my question and that's probably my doing. However others seem to grasp the idea and I believe the question is clear enough. Thank you for your input. Yours, others and my own thoughts have helped confirm that I will not have two tables.
tsdbrown
"I don't think you understand my question". That comment didn't clarify your question, nor did you edit your question to clarify it. Simply saying I didn't understand doesn't seem to clarify it either. Feel free to actually update your question to make it more clear.
S.Lott
+2  A: 

No. One entity type, one table.

Reasons to reconsider:

  1. Draft records outnumber live records by a factor of thousands to one.

  2. Security conditions require that some users who access the database directly have certain rights over draft or live records at the GRANT/REVOKE level but not over the other type of record.

A second design to consider would be one table for the Items and a second table for LiveItems. The second table contains only the IDs for the Items that are live. That way you're maintaining your single table design, but you can find the LiveItems by joining your one-column table back to the main table.

Larry Lustig
Thank you for this.
tsdbrown
+1  A: 

Agreed with all the comments given above : only one table.
With the scopes, you can easily get only the published posts or the drafts.

I wouldn't recommand for it.
But if you really wish to have two different models for the drafts and published entries, there's an other solution though : STI.

You'd have two models :

class Post < ActiveRecord::Base
end

class Draft < Post
end

Any Draft object is taken from the Post table.
The Type parameter makes it a post or a draft.

Whenever you want to publish a post you'd then have to do :

@draft = Draft.first
@draft[:type] = 'Post'
Damien MATHIEU
STI was something else I've been considering, having a status column with scopes is almost the same as having a type column. Would you say the scopes/status way would be better then? I like the idea of STI as then the controllers will be like LivePost.all and DraftPost.all
tsdbrown