views:

67

answers:

4

Say you’re designing an application that, by requirement, allows a user the flexibility of creating custom types (for managing his data, whatever it may be). One way of handling this is to define a schema which allows us to use metadata for defining these types. This often means the resulting db schema will have some way of storing key/value pairs (the properties belonging to an instance of a type), where the value portion is usually stored as a string (regardless of the underlying datatype of the column). This alone presents a number of issues. I’ve read that some frown upon using a db to track key/value pairs.

The main issue I have with it is how it affects querying. For example, let's say the user wants to create a type called Event having these columns: event_name, description, start_at and end_at (datetimes). Using key/value pairs where all values are strings makes queries more sensitive to how the parameter values are formatted; therefore, querying for a set of events falling between two dates is not as simple as it would be if we were using actual datetime columns.

This moves me to consider alternative designs which would accommodate custom types. The first one that came to mind and which I liked the most was to use the database itself to define these custom types. That is, rather than create a separate set of meta-tables into which all types must be defined, simply permit the user a limited privilege to create/modify his own tables within the database (all of which would be prefixed with his username: e.g. usertable-johndoe-album). The most prominent issue I see with this approach is the sheer number of tables that might ultimately exist. I wonder if most open-source databases (MySQL, Postgres, etc.) have either a hard or a practical limit on how many tables they can manage without being impeded. That is, I know most production-ready databases are tuned to handle millions of records, but I don’t know if they’re equipped to handle hundreds of thousands of tables. Does anyone know?

Given the requirement for allowing users to create their own types, do you prefer key/value pairs or utilizing the database itself? Or if you have another pattern/idea, please describe it.

A: 

"Given the requirement for allowing users to create their own types,"

"do you prefer key/value pairs" Depends. In the general case, described above, it might be a problem. In the specific case of adding a few columns (described in the comments on this answer) it's the obvious and simple solution.

"or utilizing the database itself" In a way.

"There most prominent issue I see with this approach is the sheer number of tables that might ultimately exist" This is largely irrelevant -- you have no real evidence what the number of user-defined types could possibly be.


When you're allowing the users to add their own types, you're creating a framework. Not an application.

You have stuff to support your built-in types.

You have tools and utilities that will support additional extension types.

The "users" are simply plugging their code into your framework.

"But I want end-users, not programmers to use this". Bad idea. Civilian, non-programmers will not create their own types. Only programmers are equipped to understand the concepts in creating their own types.

Since you're creating a framework, try and simplify things so that programmers can add their own types without pain.

S.Lott
There are some cases where it does make sense for users to define types via meta-data. They are usually little more than a defined set of named text fields, and they are created through a (usually) user-friendly UI. Often they are used for some very specific business rule which no one dreamt of at the time the system was designed, and they may only be in use for a few months. For these situations, it makes a lot more sense for the system to be flexible enough to handle such type rather than require a programmer to create and compile something.
FrustratedWithFormsDesigner
Assume a web app that allows users to manage their own data, a PIM of sorts. (This wouldn't be the first application of its kind.) One user wants to track books he's read or wants to read. Another wants to track his contacts; he may even have particular facts he cares about not available in most online address books. The list goes on. Lots of users, though they might not use our jargon, I think, can handle the concept of tracking attributes for a type. It's hardly more than creating a spreadsheet with column names, which lots of people do.
Mario
Adding a few columns is not even in the same realm as allowing definition of custom types. For that key/value pairs does everything you need, and very, very simply. The "cost" of this is microscopic. Also, please update the vague question with these specifics.
S.Lott
+3  A: 

I doubt you'll be hitting a maximum table limit since they usually are bound to OS limitations rather then RDBMS limitations. However there's an alternative modelling approach called the EAV model which basically is a generic way of describing custom entities and associated columns (attributes). There are some downsides but it sounds like it might be a good fit for your needs.

If you do go for custom tables I would opt for namespaces rather then table prefixes to separate user/group specific entities.

Martijn Laarman
+1 for EAV model! I'm going to have to read up on this.
FrustratedWithFormsDesigner
"I doubt you'll be hitting a maximum table limit since they usually are bound to OS limitations rather then RDBMS limitations." This was helpful. Thanks!
Mario
Thanks for the info on EAV. I'll read up on it.
Mario
A: 

With regards to your second idea (assumption is that when you create usertable-johndoe-album it's structure is a mirror of an underlying base-album table): One thing I would say is do NOT create new tables for each user. Otherwise, if you have to change the underlying table structure in the future (for maintenance fixes, perhaps), you will then have to propgate the same change to ALL of the tables that were created by users. You might be able to find a clever way to automate this, but it still opens the door for a lot more potential trouble, IMO (not including hard upper limits to table numbers).

I do think the metatables are the better way to approach this. It might be possible to automatically create views on the types that the user defines, this might simplify the querying the metadata. You might also want to consider having more than just keys and values in the key/value table. Having a userID would allow you to index by UserID and that might help with performance problems.

FrustratedWithFormsDesigner
A: 

Another way to allow a end-user to create his own types is to use something called aspect programming.

In effect you create a number of "aspects" that the user combines in order to create his/her type. These aspects are not typically created by the end-user, instead they are provided by you however the user can combine the aspects to create new types.

Example:

In digital processing you can have 'objects' in your system connected to physical line/controllers/meters. Lets say one object is connected to a temperature meter so the object shows current temperature. The type from which the object was created contains a number of aspects (kind of like properties of the object), one of those aspects could be for instance a display to show the value - temp. in this case. Every time the user wants to connect a temp. meter he would instance an object of that type and configure it to connect to the physical meter (using the aspects).

Now if the end user may want to have something more done, say he wants to add a graph to show temp. for the last 24h readings (history) - provided there is an aspect for this - he can create a new type by inheriting from the before mentioned type and add the history aspect to the new type. So what he has done is he has created a new type. He can then instead of using the original type use this new type when instantiating a temp. object. This allows for a very dynamic way of handling types.

Anders K.