ansaurus

Question

Where do you store ad-hoc properties in a relational database?

Answer 1

+2 A:

Definitely the 3.

Sometimes 4 if you have a very good reason to do so.

Do not ever dynamically modify database structure to accommodate for incoming data. One day something could break and damage your database. It is simply not done this way.

Developer Art 2010-08-25 12:42:35

FYI, it's my understanding that the Deltek ERP system uses a variant of option 2 to store user-defined columns.

Scott Whitlock 2010-08-25 12:48:16

Well if someone does it the wrong way does not automatically mean it is a good thing to do as well :-)

Yves M. 2010-08-25 12:52:08

Sure, but in that case it allows them to index certain columns for searching. Using option 3 means you'd have to index all "varchar" type columns together, so if you're searching for "model 1234" you'll be searching all varchar columns, not just the model number column for refrigerators (to be fair, your joins should narrow it down). Option 2 also gives you the ability to have foreign key columns to other entities, which you just can't do with option 3 (or 4, as far as I can tell).

Scott Whitlock 2010-08-25 12:59:17

For Option 4 you can use xpath to query and join information stored as xml. which is just pretty cool.

Yves M. 2010-08-25 13:02:49

@Yves I've never found xpath and xml cool in TSQL.

Chuck Conway 2010-08-25 13:07:39

@Chuck Conway: well this is a bit like smoking in a car. some like it others don't...

Yves M. 2010-08-25 13:37:01

@Yves In general, people agree smoking is bad for you...

Chuck Conway 2010-08-25 14:45:30

@Chuck Conway: Argh. Got me.

Yves M. 2010-08-25 14:48:16

Answer 2

+3 A:

I am using the xml type column for this kind of situations...

http://msdn.microsoft.com/en-us/library/ms189887.aspx

Before xml we had to use the option 3. Which in my point of view is still a good way to do it. Espacialy if you have a Data Access Layer that is able to handle the type conversion properly for you. We stored everything as string values and defined a column that held the orignial data type for the conversion.

Options 1 and 2 are a no-go. Don't change the database schema in production on the fly.

Option 5 could be done in a separate database... But still no control over the schema and the user would need the rights to create tables etc.

Yves M. 2010-08-25 12:45:47

Useful information. Here's an article about the XML datatype with examples: http://www.developer.com/db/article.php/3531196/The-Fundamentals-of-the-SQL-Server-2005-XML-Datatype.htm

Scott Whitlock 2010-08-25 13:03:23

On a related note, if you are using LINQ to SQL as the DAL, then LINQ can't query the XML datatype directly, but you can encapsulate the XML query into a UDF, and LINQ to SQL supports UDF's: http://weblogs.asp.net/scottgu/archive/2007/08/16/linq-to-sql-part-6-retrieving-data-using-stored-procedures.aspx That means you could return custom columns from a LINQ expression, and filter or select on them. I believe you'd just pass in the custom column name (string) as a parameter to the UDF.

Scott Whitlock 2010-08-25 14:16:34

Answer 3

+2 A:

3 or 4 are the only ones I would consider - you don't want to be changing the schema on the fly, especially if you're using some kind of mapping layer.

I've generally gone with option 3. As a bit of sanity, I always have a type column in the CUSTOM_PROPERTY table, which is repeated in the CUSTOM_PROPERTY_VALUE table. By adding a superkey to the CUSTOM_PROPERTY table of <Primary Key, Type>, you can then have a foreign key that references this (as well as the simpler foreign key to just the primary key). And finally, a check constraint that ensures that only the relevant column in CUSTOM_PROPERTY_VALUE is not null, based on this type column.

In this way, you know that if someone has defined a CUSTOM_PROPERTY, say, Tine count, of type int, that you're actually only ever going to find an int stored in the CUSTOM_PROPERTY_VALUE table, for all instances of this property.

Edit

If you need it to reference multiple entity tables, then it can get more complex, especially if you want full referential integrity. For instance (with two distinct entity types in the database):

    create table dbo.Entities (
        EntityID uniqueidentifier not null,
        EntityType varchar(10) not null,
        constraint PK_Entities PRIMARY KEY (EntityID),
        constraint CK_Entities_KnownTypes CHECK (
            EntityType in ('Foo','Bar')),
        constraint UQ_Entities_KnownTypes UNIQUE (EntityID,EntityType)
    )
    go
    create table dbo.Foos (
        EntityID uniqueidentifier not null,
        EntityType as CAST('Foo' as varchar(10)) persisted,
        FooFixedProperty1 int not null,
        FooFixedProperty2 varchar(150) not null,
        constraint PK_Foos PRIMARY KEY (EntityID),
        constraint FK_Foos_Entities FOREIGN KEY (EntityID) references dbo.Entities (EntityID) on delete cascade,
        constraint FK_Foos_Entities_Type FOREIGN KEY (EntityID,EntityType) references dbo.Entities (EntityID,EntityType)
    )
    go
    create table dbo.Bars (
        EntityID uniqueidentifier not null,
        EntityType as CAST('Bar' as varchar(10)) persisted,
        BarFixedProperty1 float not null,
        BarFixedProperty2 int not null,
        constraint PK_Bars PRIMARY KEY (EntityID),
        constraint FK_Bars_Entities FOREIGN KEY (EntityID) references dbo.Entities (EntityID) on delete cascade,
        constraint FK_Bars_Entities_Type FOREIGN KEY (EntityID,EntityType) references dbo.Entities (EntityID,EntityType)
    )
    go
    create table dbo.ExtendedProperties (
        PropertyID uniqueidentifier not null,
        PropertyName varchar(100) not null,
        PropertyType int not null,
        constraint PK_ExtendedProperties PRIMARY KEY (PropertyID),
        constraint CK_ExtendedProperties CHECK (
            PropertyType between 1 and 4), --Or make type a varchar, and change check to IN('int', 'float'), etc
        constraint UQ_ExtendedProperty_Names UNIQUE (PropertyName),
        constraint UQ_ExtendedProperties_Types UNIQUE (PropertyID,PropertyType)
    )
    go
    create table dbo.PropertyValues (
        EntityID uniqueidentifier not null,
        PropertyID uniqueidentifier not null,
        PropertyType int not null,
        IntValue int null,
        FloatValue float null,
        DecimalValue decimal(15,2) null,
        CharValue varchar(max) null,
        EntityType varchar(10) not null,
        constraint PK_PropertyValues PRIMARY KEY (EntityID,PropertyID),
        constraint FK_PropertyValues_ExtendedProperties FOREIGN KEY (PropertyID) references dbo.ExtendedProperties (PropertyID) on delete cascade,
        constraint FK_PropertyValues_ExtendedProperty_Types FOREIGN KEY (PropertyID,PropertyType) references dbo.ExtendedProperties (PropertyID,PropertyType),
        constraint FK_PropertyValues_Entities FOREIGN KEY (EntityID) references dbo.Entities (EntityID) on delete cascade,
        constraint FK_PropertyValues_Entitiy_Types FOREIGN KEY (EntityID,EntityType) references dbo.Entities (EntityID,EntityType),
        constraint CK_PropertyValues_OfType CHECK (
            (IntValue is null or PropertyType = 1) and
            (FloatValue is null or PropertyType = 2) and
            (DecimalValue is null or PropertyType = 3) and
            (CharValue is null or PropertyType = 4)),
        --Shoot for bonus points
        FooID as CASE WHEN EntityType='Foo' THEN EntityID END persisted,
        constraint FK_PropertyValues_Foos FOREIGN KEY (FooID) references dbo.Foos (EntityID),
        BarID as CASE WHEN EntityType='Bar' THEN EntityID END persisted,
        constraint FK_PropertyValues_Bars FOREIGN KEY (BarID) references dbo.Bars (EntityID)
    )
    go
    --Now we wrap up inserts into the Foos, Bars and PropertyValues tables as either Stored Procs, or instead of triggers
    --To get the proper additional columns and/or base tables populated

Damien_The_Unbeliever 2010-08-25 12:59:05

Lets say *all* the tables in your database use uniqueidentifier columns as primary keys. Could you generalize the CUSTOM_PROPERTY_x tables to refer to *any* table in the database, since we can safely assume that all primary keys in the database are unique? Anyone done that?

Scott Whitlock 2010-08-25 13:06:48

Are you asking about, where you've commented elsewhere, that you want some properties to be foreign keys? Or are you meaning that you've got multiple tables that you wish to extend with additional properties, and you wish to place all of the additional properties in the same CUSTOM_PROPERTY_VALUE table?

Damien_The_Unbeliever 2010-08-25 13:31:22

@Damien_The_Unbeliever: The latter. Example: a general NOTES table where you want the user to be able to attach notes to anything in the database, and handle it all in a contained module. Any entity can pass in a request to the notes module and just has to give its primary key as a parameter.

Scott Whitlock 2010-08-25 13:40:29

Nice code example. A slightly modified version could work very well for a "tagging" implementation (where you want the user to be able to "tag" different entity types with tags that could apply to all of them).

Scott Whitlock 2010-08-25 14:30:51

Yeah. In short, I love computed columns, and using Super Keys to enforce stricter relationships than the plain foreign keys would give you.

Damien_The_Unbeliever 2010-08-25 14:35:39

Answer 4

A:

My inclination would be to store things as XML if the database supports that nicely, or else have a small number of different tables for different data types (try to format data so it will fit one of a small number of types--don't use one table for VARCHAR(15), another for VARCHAR(20), etc.) Something like #5, but with all tables pre-created, and everything shoehorned into the existing tables. Each row should hold a main-record ID, record-type indicator, and a piece of data. Set up an index based on record-type, subsorted by data, and it will be possible to query for particular field values (where RecType==19 and Data=='Fred'). Querying for records that match multiple field values would be harder, but such is life.

supercat 2010-08-25 15:35:01

ansaurus

tags:

views:

answers:

Where do you store ad-hoc properties in a relational database?

related questions