views:

405

answers:

2

[former title: Is there a way to force a relationship structure on a tag-based organizational methodology?]

I have some entities, and they have a series of attributes. Some of the attributes affect what other attributes the entity can have, many of the attributes are organized into groups, and occasionally entities are requited to have certain numbers of attributes from certain groups, or possibly a range of attributes from certain groups.

Is there a way to model these sorts of tag-to-tag relationships, such as requirement, grouping, exclusion, etc. using a database, or is this only possible with programmed "business rules"? Ideally, I would like the possible tags and their relationships to be easily configurable, and hence highly flexible.

One of the ways I have considered is to have the tags and possible relationships, and then you get a tag-tag-applied relationship sort of table, but this seems like a rather brittle approach.

So, is this possible in a more rigorous fashion, and if so, how would I even begin to go about it?

+1  A: 

edit: Your description of variable attributes that apply only depending on the values in other attributes is a non-relational, non-normalized design. RDBMS may not be the best solution for storing this kind of data. Probably RDF would be a good solution for data that requires this level of flexibility.

My earlier answer, pertaining to RDBMS solutions, is below:


Some people model flexible attributes with the Entity-Attribute-Value design, but this is often too unstructured and you end up fighting with data integrity problems. Use this only if you need a virtually limitless number of entity sub-types.

Other people use Single Table Inheritance, where you put all attribute columns used by all sub-types into one very wide table, and leave them NULL on rows where the attribute is irrelevant to the sub-type. But this has limits because the table can grow too wide, and you lose the ability to make any attributes mandatory, because they must all be nullable.

If you have a relatively small number of entity sub-types, I would recommend creating a dependent table for each group of required attributes. Define the dependent table's primary key as a foreign key to the parent table, so you get a one-to-one relationship.

CREATE TABLE Vehicles (
  vehicle_id INT PRIMARY KEY
  ...attributes common to all vehicles...
);

CREATE TABLE Automobiles (
  vehicle_id INT PRIMARY KEY,
  ...attributes specific to autos...
  FOREIGN KEY (vehicle_id) REFERENCES Vehicles(vehicle_id)
);

You can also provide a little more data integrity by encoding the subtype in the primary key of the parent table. That's to make sure a row in Automobiles cannot reference a motorcycle in Vehicles.

CREATE TABLE Vehicles (
  vehicle_id INT,
  vehicle_type VARCHAR(10),
  ...attributes common to all vehicles...
  PRIMARY KEY (vehicle_id, vehicle_type),
  FOREIGN KEY (vehicle_type) REFERENCES VehicleTypes (vehicle_type)
);

CREATE TABLE Automobiles (
  vehicle_id INT,
  vehicle_type VARCHAR(10) CHECK (vehicle_type = 'Automobile'),
  ...attributes specific to autos...
  FOREIGN KEY (vehicle_id, vehicle_type) 
    REFERENCES Vehicles(vehicle_id, vehicle_type)
);

Of course, you need to create a new dependent table each time you define a new sub-type, but this design does give you a lot more structure to enforce data integrity, NOT NULL attributes, and so on.

The only part you need to enforce in application logic is that to make sure to create a row in Automobiles for each row in Vehicles with vehicle_type = 'Automobile'.

Bill Karwin
+1  A: 

There's no difference between using databases to enforce your rules or using source code elsewhere. Code is data. That's the esoteric Lisp answer.

The real question you're asking is whether this is easier in a relational database or in (I assume) an Algol family language. You didn't specify a RDBMS, so I'm going to assume ANSI. That makes this hard.

BTW, this is easy in Prolog. But that's neither here nor there.

I would say to use check constraints for everything. The mental shift needed for this approach is to realize that your UI will need a way to define these tag relationships. Traditionally, you would issue CRUD statements from the UI to the DB. Instead, you need to issue ALTER TABLE statements to CRUD check constraints.

There are two problems with this approach:

  • Such statements are not parameterizable in most RDBMS's. Think SQL injection.
  • Implementations vary in their support for full ANSI check constraints. If subqueries aren't supported, forget it.

If you could clarify your question with a specific RDBMS, then we can give you a better answer.

entaroadun