views:

50

answers:

5

This question has come up a few times in various forums in my searching, but none have provided a concise resolution.

If I have the following tables:

User
+- id
+- username
+- password

Article
+- id
+- title
+- content

and I want to join them in order to determine who created what articles, I could simply add the column user_id to Article to use as a reference. Alternatively, I'm adding an intermediate table to show who/when/what, for example:

User
+- ...

Article
+- ...

ChangeHistory
+- id
+- article_id
+- user_id
+- type [enum(insert, update, delete)]
+- datetime

Now this is fine, but the system I'm working on needs to be much more dynamic, in that new modules can be easily introduced and integrated. So now if I add a Media table I need to split the ChangeHistory between Article and Media having:

User
+- ...

Article
+- ...

Media
+- id
+- title
+- path

ArticleChangeHistory
+- id
+- article_id
+- user_id
+- type [enum(insert, update, delete)]
+- datetime

MediaChangeHistory
+- id
+- media_id
+- user_id
+- type [enum(insert, update, delete)]
+- datetime

This can get out of hand quickly with the introduction of many modules. Each module would need to be responsible for the creation and management of it's own ChangeHistory table.

TL;DR: What practices can I explore to create an intermediate table that can receive references to multiple other unrelated tables? I could add a *record_type* field, holding the name of the table to which the record belongs, but that's ugly. I would need something like a "table ID" to reference the table from which it's coming. That way, when/if modules are added or removed, the model doesn't fall apart.

Any ideas? Thanks so much in advance.

+2  A: 

It seems to me you're asking for a guide to lead you down a twisting, thorny path.

I don't know why you'd need to add a new table every time you get a new "module" (not clear to me what a module is) provided the modules can all be described with the same column layout. You could add a Modules table and then include a ModuleId column in the Articles table.

Also, articles can have more than one author, so if yours do, you'd need an ArticleAuthors table. And in some circles, the order of authors on an article is important. It's a ranking of the importance of their contribution or of their importance in the field. If that's the case, you'd need to have an "ordinal" column to reflect the author's position.

Tim
Thanks for your quick response Tim; Modules represent subsystems of a CMS I'm working on. Multiple authors of an article can be represented by the original author User, and subsequent author edits. Users are ranked by level (I didn't include many irrelevant fields in my example models for simplicity's sake) I've been hoping to find some sort of "table-ID" solution, that is independent of the model itself...
TomcatExodus
To expand a bit more, I do have a Modules table, for registering and unregistering modules (and holding config params, etc) but a Module may have multiple tables, such as Discography with Track and Album. Both Track and Album would need individual ChangeHistory tables (or references in a common intermediate table) and therefore a single module_id would not suffice to reference both Track and Album.
TomcatExodus
+2  A: 

I think you might be best off with a single ChangeHistory table that looks something like:

ItemChangeHistory
+- id
+- changedItem_id
+- user_id
+- changedItemType_id
+- type [enum(insert, update, delete)]
+- datetime

and then use changedItemType_id to figure out what was changed (eg. article, media, etc...) based on an itemType table. This way you only have one table for all content change histories, and you don't have to add/remove tables when you add/remove modules, you simply update the itemType table. Of course this means that removing a module has to be done carefully - you could leave behind the ItemChangeHistory records that reference that module's type (but that's messy) or remove all history associated with the module.


Now you mention a module may have other tables, so maybe those need to be tracked with a submodule ID, also referenced in ItemChangeHistory?

FrustratedWithFormsDesigner
A: 

Why not

ChangeHistory
+- id
+- content_id
+- content_type
+- user_id
+- action [enum(added, updated, deleted)]
+- datetime
Majid
+1  A: 

In my experience, when developers try to make their system really "dynamic" they're actually trying to code for problems that they haven't thought of yet. That's usually a bad path to take. Is it really so much extra work for a module to include two tables instead of one?

In every case where I've seen the pattern (or anti-pattern?) of trying to make a generic "does everything" table it's fallen flat on its face. RDBMSs work best with well-defined problem areas. If the module has a need to keep history then the module should add a history table to go with the table itself. This also has a huge advantage in that down the road you're likely to want to keep different types of information in the history depending on the table or module for which the history is being kept. If you have a generic history table that becomes much more difficult.

Now, if you want to simply capture the last user to update or insert a particular item (table row) and that could be in multiple tables then you could use a pattern of inheritance where you have a parent table and multiple children tables. For example:

CREATE TABLE Audited_Items
(
    id    INT    NOT NULL    IDENTITY,
    CONSTRAINT PK_Audited_Items PRIMARY KEY CLUSTERED (id)
)
CREATE TABLE Articles
(
    id    INT            NOT NULL,
    [Article specific columns]
    CONSTRAINT PK_Articles PRIMARY KEY CLUSTERED (id),
    CONSTRAINT FK_Articles_Audited_Items FOREIGN KEY (id) REFERENCES Audited_Items (id)
)
CREATE TABLE Media
(
    id    INT            NOT NULL,
    [Media specific columns]
    CONSTRAINT PK_Media PRIMARY KEY CLUSTERED (id),
    CONSTRAINT FK_Media_Audited_Items FOREIGN KEY (id) REFERENCES Audited_Items (id)
)
CREATE TABLE Audit_Trail
(
    audited_item_id    INT         NOT NULL,
    audit_datetime     DATETIME    NOT NULL,
    user_id            INT         NOT NULL,
    [audit columns]
    CONSTRAINT PK_Audit_Trail PRIMARY KEY CLUSTERED (audited_item_id, audit_datetime),
    CONSTRAINT FK_Audit_Trail_Audited_Items FOREIGN KEY (audited_item_id) REFERENCES Audited_Items (id)
)
Tom H.
Thank you for your detailed response Tom. I do very much like this design and am going to consider it in my further research.
TomcatExodus
Closing the loop on the question, thank you Tom H., your method appears to satisfy my design quite well. Hopefully few changes will need to be made down the road.
TomcatExodus
A: 

@Majid: I understand that's likely the most reasonable solution, but I was hoping to find something a bit more intuitive, something that could potentially take advantage of existing DB parameters; such as the elusive (and clearly non-existent) table id (by which I mean some sort of internal reference that the engine, InnoDB in my case, uses to describe tables)

@frustrated: I suppose I'm going to go with something like that. I'm going to keep investigating possibilities.

TomcatExodus