I'm toying with the idea of writing another framework to make it easier to develop "bread'n'butter" applications (e.g. create a class with N fields and get an editor for it for free, plus the DB persistence).

All data models can be converted into the Entity-Attribute-Value form:

TYPE VARCHAR(32)
ID LONG INT
NAME VARCHAR(32)
VALUE VARCHAR(64000)

with maybe a second table for really large fields, so the VALUE column would hold a reference to the entry in that BLOB table. If I were in the mood, I could create one table per value type (so an int would go into an INTEGER column, avoiding all the conversion problems), and I could use a table to define the valid TYPEs, etc.
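
To make that concrete, here is a minimal sketch of how one object would land in such a table (the table name EAV and the Person example are invented here; the columns are the ones listed above):

CREATE TABLE EAV (
    TYPE  VARCHAR(32),
    ID    BIGINT,          -- the LONG INT from the sketch above
    NAME  VARCHAR(32),
    VALUE VARCHAR(64000)
);

-- one Person instance with three fields becomes three rows
INSERT INTO EAV (TYPE, ID, NAME, VALUE) VALUES ('Person', 1, 'firstName', 'Ada');
INSERT INTO EAV (TYPE, ID, NAME, VALUE) VALUES ('Person', 1, 'lastName', 'Lovelace');
INSERT INTO EAV (TYPE, ID, NAME, VALUE) VALUES ('Person', 1, 'comment', 'only some instances carry this field');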

This would effectively free me from having to worry about the DB design since there isn't one. The database could adjust to any change in my model by using simple updates. I could even have instances of the same class with additional fields.

The drawback is that for each object I either need to read N rows or build complex queries that contain N subqueries or self-joins.
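
As an illustration of those queries, reading back just two fields of one object from the hypothetical EAV table above already needs a self-join per additional field (or an equivalent subquery):

-- every further field costs another join or subquery
SELECT f.VALUE AS first_name,
       l.VALUE AS last_name
  FROM EAV f
  JOIN EAV l
    ON l.TYPE = f.TYPE AND l.ID = f.ID AND l.NAME = 'lastName'
 WHERE f.TYPE = 'Person'
   AND f.ID = 1
   AND f.NAME = 'firstName';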

Does anyone have experience with this? Has anyone ever implemented a larger system this way? What other options are out there to persist data besides plain SQL? I'd especially like to hear about agile systems which adapt easily to changes in the model or which allow me to "patch" the model (usually an instance will have a name, but for some I'd also like to add a comment). Or has anyone encountered something post-SQL? The next big thing?

A: 

Take a look at XML databases (like eXist). You can easily change your "data model" by modifying the XML schema. And you can use powerful query languages like XPath and XQuery.

Kees de Kooter
+1  A: 

I haven't used it but what you're trying to do kind of sounds like CouchDB. You may want to look there before reinventing the wheel...

Jason Punyon
A: 

I have never based an entire application on this principle, but in almost all applications I do use some form of key-value pair collection to handle the edge cases where a specific entity requires additional properties that the other entities don't need.

I basically serialize the dictionary and store it like that in the database alongside my entity data. That's what I use for post-production patching when I have to deal with something too obscure to warrant a change to the entire model.

With the key-value pair data, I do store the type as well, so I can automatically render appropriate HTML controls. I have just basic types: text, multi-line, RTF, checkbox, number and date.
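
One way to picture this (purely illustrative - the CUSTOMER table, the EXTRA_PROPS column and the name|type|value serialization format are all invented here):

ALTER TABLE CUSTOMER ADD EXTRA_PROPS VARCHAR(64000);

-- each serialized entry carries name, type and value,
-- so the UI can pick the matching HTML control when rendering
UPDATE CUSTOMER
   SET EXTRA_PROPS = 'comment|rtf|Handle with care;birthday|date|1980-12-10'
 WHERE ID = 42;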

muerte
+1  A: 

This approach is used by Amazon SimpleDB. You define domains, and each domain has rows with a bunch of key/value pairs in them. This kind of data is known as 'semi-structured'.

This approach has some strengths. Like your idea, you do not need to define a database schema. You can introduce new tables ad hoc, add new columns on a per-row basis, and even have columns that hold more than one value (instead of creating a has_many relationship with an extra table). If your schema changes, you can introduce the changes gradually rather than forcing a migration.
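
Translated back into the EAV table sketched in the question (names assumed from there), a multi-valued attribute is simply repeated rows, whereas the conventional relational design would demand an extra table:

-- EAV style: one row per value, no schema change needed
INSERT INTO EAV (TYPE, ID, NAME, VALUE) VALUES ('Person', 1, 'email', 'ada@example.org');
INSERT INTO EAV (TYPE, ID, NAME, VALUE) VALUES ('Person', 1, 'email', 'countess@example.org');

-- classic relational style: a dedicated has_many table
CREATE TABLE PERSON_EMAIL (
    PERSON_ID BIGINT,
    EMAIL     VARCHAR(255)
);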

On the other hand, you're throwing away decades of development on the relational model. You will hemorrhage speed because your indexing will either be too general or non-existent. Aggregate operations (groups, joins) will be extremely slow. Query optimisation will be difficult, etc.
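
As a rough illustration of why aggregates hurt, an 'average salary per department' over the generic table from the question (assuming Employee rows with 'department' and 'salary' fields, all stored as text) would look roughly like this:

SELECT d.VALUE AS department,
       AVG(CAST(s.VALUE AS DECIMAL(12,2))) AS avg_salary  -- values must be cast back from VARCHAR first
  FROM EAV d
  JOIN EAV s
    ON s.TYPE = d.TYPE AND s.ID = d.ID AND s.NAME = 'salary'
 WHERE d.TYPE = 'Employee'
   AND d.NAME = 'department'
 GROUP BY d.VALUE;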

Both Amazon SimpleDB and Apache CouchDB deal with this issue by making their databases highly distributed. While this ensures reliability and redundancy, it has its own set of problems, such as conflict resolution and out-of-date data.

From your question you seem dead set on 'agile' methods, so I would recommend one of those two DB engines (depending on whether you'd rather pay Amazon - albeit not much - or build your own setup). They both allow a completely dynamic database schema. Just beware of the pitfalls.

rjh