views:

256

answers:

4

I envision that in an application we are shortly developing, there will need to be a means in which the user (developers) can specify their own custom datastructures. These structures will need to be persisted, and then reported against within the system.

I know there are several way to achieve this i.e:

  • Having your SQL database hold information about the structures and values via several joined tables that look something like (tblDynamicStructure, tblDynamicStructureColoumns, tblDynamicStructureData) - This approach requires a lot of table joins which I assume will be very inefficient.

  • Having the SQL tables created on the fly when a user defines their structures - Dont really like the idea of giving this much power to a user.

  • Object Databases - I dont know enough about these to comment

  • Flat files or XML - loose the benifits of a relational store.

I KNOW that this is a common problem with several implementations, so Im not really after the defacto way to achieve this, Im looking more for peoples experiences with each approach along with some pros/cons.

Chris.

A: 

Just use any of the many ORM solutions out there for your language. This maps an object's state to RDBMS tables.

Depending on your language, users can just create an object that extends the ORM class and then make a call to persist it on the database. Good times.

Good luck

Aiden Bell
@Aiden - I dont think I was clear enough. I'm not after away of having my object model reflected as a database model (via an ORM or any other mechanism). What I want is to give the users of the application the ability to create their own data structures, and I want to know other opinions of the best way to do this. Your approach would be effectively allowing the users to create DB tables on the fly (via the ORM), which seems a little dangerous to me.
Owen
The danger-level depends on the ORM's features. Really, the best way to allow users to create persistent data structures (esp if they cross-reference eachother) is ORM or ODB. Allowing users to express their data 'in the language' will be simpler all-round. Even if you write your own ORM with safeguards, this seems like the most user-friendly approach ie code-structures->database.
Aiden Bell
Even if you allow users to upload a sanitized class definition that is instantiated to ODB/ORM through some validation.
Aiden Bell
A: 

Personally I never would create table during the runtime of the application. It will be a maintenance and upgrade nightmare.

Having a dynamic structure is what's left for you. Don't care too much about the joins, except you get hundred thousands to millions of items. Then, you should probably constraint the flexibility of the user to only create certain, kind of flat structures that could be stored in predefined tables.

Rereading the question I see that you write "user (developers)". Of course it depends if the user is a average user or a developer. if it is a developer, he can easily write classes and mapping definitions (eg. NHibernate mapping files) and the application has a framework that installs (eg. NHibernate schema creation) and uses the classes.

At the end, you can mix XML and database structures by serializing objects into columns. With NHibernate, this is trivial. Of course, you can't filter (SQL WHERE) on any serialized fields, but you can still have some predefined fields (from a base class) of specially marked fields in separate columns to filter for.

Stefan Steinegger
+1  A: 

There's a lot of solutions to this fairly common problem, with object oriented databases being the solution that most directly relates to it.

Object oriented databases was a very hot topic about 8-10 years ago, but never made it into the mainstream programming world. They are still out there, but I think they're typically very expensive, and not broadly used.

Here's an approach I've used for datastructures that can be made serializable, and doesn't require foreing key relations (a concept the in itself implies relational databases):

  1. Make your datastructure serializable so that every instance of the object kan be serialized.
  2. Create a table with some common fields that all custom datastructures can provide, like a Name, a type, DateCreated, DateModified, the table should also contain a blob field for binary data
  3. Create an interface that corresponds to the table created above, but doesn't have property for the blob field
  4. Create a helper class with a method that recieves an object that implements the above mentioned interface, does a binary serialization on it, and then stores it in the table mentioned above, using the interface properties for the table fields, and the serialized object for the blob field.
  5. This class should also contain a method that fetches the object from the database on criteria based on the interface, and returns the deserialized object.

This method has some drawbacks:

  • You'll have to have some way of identifying the objects that does not vary between different datastructures, this could be a guid, put on every object on creation time.
  • You'll only be able to search for objects in the database based on the common criteria defined by the interface.
  • You'll only be able to store objects that are serializable, but after all a row in a database table is a serialization of some set of data.

ORM's work by creating and managing your database from the code you write. A lot of them requires complex xml-files defining your datastructures, and won't give your developers much freedom.

Again if your datastructures are fairly simple, but you need to be able to change them while developing your product, there's a free, open sourced .NET product called SubSonic, which in V3 has something called SimpleRepository. This enables you to define your database from code, it doesn't require complex xml-files to setup, and if you change your datastructures, the SimpleRepository will make the necessary changes in your DB on the fly. Check it out on www.subsonicproject.com

Regards Jesper Hauge

Hauge
A: 

if you don't want to invest in an OODB (i haven't either), there's two main answers:

  • properties. a table with (object_id, property_name, value) fields, primary_key=(object_id, property_name), and maybe another index on (property_name, value).

  • serialized objects: a blob, XML or text field to store the whole object.

In the first case, simple queries would use the optional index, but complex ones would still be slow. In the second case objects are not indexed unless you specify 'magic' fields that would go to the index if present.

In both cases non-trivially linked objects can be an issue. In the fist case you have to translate pointers to object_id, in the second case you could store the whole composed object into a single record, or set a limit about how far the serialization should go.

Javier