Advanced Database Design - Whats the most effective way to execute the following...

views:

145

answers:

Advanced Database Design - Whats the most effective way to execute the following...

I am good with ideas, but due to my limited programming experience I sometimes strain to come up with the most effective solution for a given concept. Currently my mind is trying to fathom the most efficient way to reference a database table specific to a data type.

Concept

I am building an admin interface that allows a site owner to 'build' a content set based on selected content 'containers'. The interface currently holds the various sql data types, such as int, varchar, text, etc. and allows the user to select the data type, label it, and then link them together into a group.

This group is then referenced as an instance of the data set and can be reused. For example, the containers 'title' and 'body' could used to build a simple page and each new instance of this group could be a new page.

Issue

The issue I am having is in referencing these containers in the most efficient manner possible. I can't simply have a table with an instance id and a container id, because there is no way to know they type of container. For example, the content-varchar table has an id field and then a value field in the format of varchar. The content-text table has an id field and then a value field in the format of text.

I thought about making one large table that held each possible type of data, but this would be a gross waste of space. Currently I use a sub query to select the specific value from the relating table once I know the data type, but there has to be a better solution.

What are your ideas / suggestions?

+1 A:

Here's what the MySQL manual has to say about the TEXT and BLOB fields:

In most respects, you can regard a BLOB column as a VARBINARY column that can be as large as you like. Similarly, you can regard a TEXT column as a VARCHAR column.

Translation: using all TEXT or BLOB columns wouldn't waste space. MySQL will truncate what it doesn't use (like in VARCHARs).

amphetamachine 2010-01-08 02:57:20

This makes sense, and will make things much more simple. I don't foresee the need to story anything in a binary format, since I can just store the filename of an image, instead of a whole image, so I may be able to get away with just using text. I can then define models that will correctly handle things like dates and timestamps.Thank you for your input here!

sterling 2010-01-08 03:04:40

+3 A:

You're doing things at a uselessly low level.

You're reinventing the relational database the hard way. Inside a relational database.

You can't effectively manage each little piece of data as strongly typed relational data.

One choice is to use higher-level structures (not individual strings, but composite records with multiple things bound together).

The other choice is to use just text. It's simpler. Eventually, it all becomes text when the HTML is sent back to the browser.

"...various sql data types, such as int, varchar, text, etc. and allows the user to select the data type, label it, and then link them together into a group."

Group of simple types == row. Allowing the user to freely define a table based on a collection of datatypes -- well -- you're making them a programmer. You're creating a front-end for table design.

BTW, Lots of tools already do this for you. RoR and Django come to mind: you define a row in a database and a template for positioning those elements on the page and view function that shows how to fetch the row and match it up with a page.

Label simple individual text fields with field label and group label. Much simpler. Just text with two keys. You can the fetch the "group" and use the label/value pairs to populate the template. In Django this is one lines of code to fetch rows, turn them into a dictionary and pass it to a template.

S.Lott 2010-01-08 02:58:09

I appreciate your feedback and I had a feeling I was over complicating this, as I tend to do. Although, I don't fully understand your response, I think you are saying what amphetamachine stated?

sterling 2010-01-08 03:02:18

Thank you for this additional insight. What you are describing is similar to what I am doing. John Feminella's answer pretty much explains my process, but I am extending it out a bit. Thank you again for this invaluable feedback. I tend to get carried away and over complicate things.

sterling 2010-01-08 20:25:45

+1 A:

Unless I'm missing something, this doesn't sound terribly complicated. But it does sound like you're putting a relational database inside a relational database. That's usually the wrong way to go about things.

Still, if you insist, there's a few observations to make here:

A ContentSet defines one or more ContentItems.
A ContentItem can be one of several well-defined primitive types, ContentTypes.

So, sounds like you'd have four tables:

content_sets: ContentSets and their associated metadata, such as the user who created each one, the title of the sets, and so on.
content_items: ContentItems, along with a foreign key to the ContentSet they belong to.
content_types: A list of possible types that can be stored in each ContentItem. Your application will enforce whether the data being stored in each item is correct.
content_groups: Instances of ContentSets, containing realized data and a foreign key to the ContentSet they are an instance of. Use a property bag/blob/text field to store the data itself.

John Feminella 2010-01-08 02:59:32

Yes, what you are stating is the exact correct approach that I have implemented, what I was missing was the fact that I could store each type of data in the same container if I used TEXT columns as pointed out by amphetamachine. Thank you so much for your input here.

sterling 2010-01-08 03:08:23

"you're putting a relational database inside a relational database" I think it's an interesting exercise--albeit a horrendously complicated one--to be doing this. A self-scaling database is an interesting concept to me.

amphetamachine 2010-01-08 03:52:37

+1 A:

Here's one example of this kind of implementation. I wouldn't worry too much about subverting the intent of the relational database - as long as you have a good enough reason to. Anyhow, this implementation went with a single table - and relied on metadata and real-time conversion of data to the needed type.

I had to support a user-extensible set of attributes within the context of a commercial product that users would manage themselves. Discarded alternatives included having them pay a DBA to modify their database (too expensive), generating actual tables (too many failure modes), and using a non-relational database (most of the app required a relational database - for reporting, etc).

In this implementation the customer-extensible data was modeled as key-value pairs tied to the extensible tables within the model, in which user-supplied metadata described the keys and their value constraints. Both attributes were varchars. This could hold any kind of data needed at that time (ten years ago): integers, floats, characters, etc. I don't think it could handle binary images, soundfiles, etc.

Declarative constraints for type, range, case, unknown value, specific values, etc were handled by a triggered stored procedure that read the user-supplied metadata. Conversion to the appropriate type was handled by either the app or a UDF that also read the metadata (can't remember which).

The solution wasn't fast, but it was very flexible, reliable and easy to use.

KenFar 2010-01-08 15:01:44

Thank you for the input here, I think I can find a healthy balance of speed and flexibility with the right table designs. For any binary objects (images / sound files) I will simply save them to the filesystem and store the path.

sterling 2010-01-08 20:08:15

ansaurus

tags:

views:

answers:

Advanced Database Design - Whats the most effective way to execute the following...

related questions