tags:

views:

78

answers:

6

My application has one table called 'events' and each event has approx 30 standard fields, but also user defined fields that could be any name or type, in an 'eventdata' table. Users can define these event data tables, by specifying x number of fields (either text/double/datetime/boolean) and the names of these fields. This 'eventdata' (table) can be different for each 'event'.

My current approach is to create a lookup table for the definitions. So if i need to query all 'event' and 'eventdata' per record, i do so in a M-D relaitionship using two queries (i.e. select * from events, then for each record in 'events', select * from 'some table').

Is there a better approach to doing this? I have implemented this so far, but most of my queries require two distinct calls to the DB - i cannot simply join my master 'events' table with different 'eventdata' tables for each record in in 'events'.

I guess my main question is: can i join my master table with different detail tables for each record?

E.g.

SELECT E.*, E.Tablename 
FROM events E 
LEFT JOIN 'E.tablename' T ON E._ID = T.ID

If not, is there a better way to design my database considering i have no idea on how many user defined fields there may be and what type they will be.

A: 

Those type of joins won't work because you will need to pivot the eventdata table to make it columns instead of rows. Therefore it depends on which database technology you are using.

Here is an example with MySQL: How to pivot a MySQL entity-attribute-value schema

Turnkey
Using MySQL - will have a look thanks.
Simon
A: 

My approach would be to avoid using a different table for each event, if that's possible.

I would use something like:

Event (EventId, ..., ...)
EventColumnType (EventColumnTypeId, EventTypeId, ColumnName)
EventColumnData (EventColumnTypeId, Data)

You are them limited to the type of data you can store (everything would have to be strings, for example), but you the number of events and columns are unrestricted.

Paul
unfortunately some of my queries at the moment require me to filter between eventdata numbers, e.g. select ED.* where ED.Value > x.
Simon
A: 

What I'm getting from your description is you have an event table, and then a separate EventData table for each and every event.

Rather than that, why not have a single EventCustomFields table that contains a foreign key to the event table, a field Name (event+field being the PK) and a field value.

Sure it's not the best. You'd be stuck serializing the value or storing everything as a string. And you'd still be stuck doing two queries, one for the event table and one to get it's custom fields, but at least you wouldn't have a new table for every event in the system (yuck x10)

Another, (arguably worse) option is to serialize the custom fields into a single column of the and then deserialize when you need. So your query would be something like

Select E.*, C.*
From events E, customFields C
Where E.ID = C.ID
Josh Sterling
Thanks Josh,However every event in the system will not have different eventData, rather there wil only be around 20 different types of eventdata (so 20 extra user defined tables). Not TOO hard to manage.
Simon
A: 

Is it possible to just impose a limit on your users? I know the tables underneath Sharepoint 2007 had a bunch of columns for custom data that were just named like CustomString1, CustomDate2, etc. That may end up easier than some of the approaches above, where everything is in one column (though that's an approach I've taken as well), and I would think it would scale up better.

joelt
This is possible, i was hoping to avoid a large amount of unneccessary space being used (e.g. some eventdata may just have 1 number, whereas others may have 10strings, 20 numbers etc). ty
Simon
+2  A: 

There are four ways of handling this.

  1. Add several additional fields named "Custom1", "Custom2", "Custom3", etc. These should have a datatype of varchar(?) or similiar

  2. Add a field to hold the unstructured data (like an XML column).

  3. Create a table of name /value pairs which are associated with some type of template. Let them manage the template. You'll have to use pivot tables or similiar to get the data out.

  4. Use a database like MongoDB or another NoSql style product to store this.

The above said, The first one has the advantage of being fast but limits the number of custom fields to the number you defined. Older main frame type applications work this way. SalesForce CRM used to.

The second option means that each record can have it's own custom fields. However, depending on your database there are definite challenges here. Tried this, don't recommend it.

The third one is generally harder to code for but allows for extreme flexibility. SalesForce and other applications have gone this route; including a couple I'm responsible for. The downside is that Microsoft apparently acquired a patent on doing things this way and is in the process of suing a few companies over it. Personally, I think that's bullcrap; but whatever. Point is, use at your own risk.

The fourth option is interesting. We've played with it a bit and the performance is great while coding is pretty darn simple. This might be your best bet for the unstructured data.

Chris Lively
+1 Nice summary of the options, as it's a tough problem that comes up many times. For the NoSQL option do you use a separate collection for the attribute labels or is there a way to do reflection on the collection to get the custom attribute labels?
Turnkey
@Turnkey: With the nosql option, I would still use a "template" of sorts which defined the properties on the records. BTW,Here's an article MSDN magazine did last month. It doesn't go as far as were talking about, but it's a good introduction: http://msdn.microsoft.com/en-us/magazine/ee310029.aspx
Chris Lively
Thanks Chris, i will look into the noSQL option and see how this can benefit my app.Cheers
Simon
A: 

The answer to your main question is: no. You can't have different rows in the result set with different columns. The result set is kind of like a table, so each row has to have the same columns. You can fake it with padding and dummy columns, but that's probably not much better.

You could try defining a fixed event data table, with (say) ten of each type of column. Then you'd store the usage metadata in a separate table and just read that in at system startup. The metadata would tell you that event type "foo" has a field "name" mapped to column string0 in the event data table, a field named "reporter" mapped to column string1, and a field named "reportDate" mapped to column date0. It's ugly and wastes space, but it's reasonably flexible. If you're in charge of the database, you can even define a view on the table so to the client it looks like a "normal" table. If the clients create their own tables and just stick the table name in the event record, then obviously this won't fly.

If you're really hardcore you can write a database procedure to query the table structures and serialize everything to a lilst of key/type/value tuples and return that in one long string as the last column, but that's probably not much handier than what you're doing now.

TMN
Thanks for answering my main question TMN. Its what i thought, and I have actually managed to implement this, albeit has been quite tricky at times.The worst of it is in the multiple queries to sometimes, retrieve one table of data, and the obvious performance hit of this (which is not a real big issue).
Simon