I have a table in my database that represents data fields in a custom form. Each DataField describes what kind of control it should be rendered with and what value type it accepts. Simplified, you can say that I have two entities in this table: a Textbox taking any string and a Textbox taking only numbers.

Now I have the different values stored in a separate table, referencing the datafield definition. What is the best way to store the data value here, when the type differs?

One possible solution is to have the FieldValue table hold one field per possible value type. Now this would certainly be redundant, but at least I would get the value stored in its correct form - simplifying queries later.

FieldValue
----------
Id
DataFieldId
IntValue
DoubleValue
BoolValue
DataValue
..
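A minimal sketch of this one-column-per-type layout, using Python's built-in sqlite3 module only so that it runs anywhere. The DataField columns (Name, ValueType), the column types, and the helper names are assumptions made for illustration; only three of the value columns are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DataField (
    Id INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,        -- assumed column
    ValueType TEXT NOT NULL    -- assumed column: 'int', 'double' or 'bool'
);
CREATE TABLE FieldValue (
    Id INTEGER PRIMARY KEY,
    DataFieldId INTEGER NOT NULL REFERENCES DataField(Id),
    IntValue INTEGER,
    DoubleValue REAL,
    BoolValue INTEGER
);
""")

# Maps the type recorded on the DataField to the column that holds the value.
COLUMN_FOR_TYPE = {"int": "IntValue", "double": "DoubleValue", "bool": "BoolValue"}


def column_for(conn, field_id):
    """Look up which typed column a given DataField uses."""
    row = conn.execute(
        "SELECT ValueType FROM DataField WHERE Id = ?", (field_id,)
    ).fetchone()
    return COLUMN_FOR_TYPE[row[0]]


def store(conn, field_id, value):
    """Fill in only the column matching the field's type; the rest stay NULL."""
    column = column_for(conn, field_id)
    conn.execute(
        f"INSERT INTO FieldValue (DataFieldId, {column}) VALUES (?, ?)",
        (field_id, value),
    )


def average(conn, field_id):
    """Aggregate directly on the typed column, no casting needed."""
    column = column_for(conn, field_id)
    return conn.execute(
        f"SELECT AVG({column}) FROM FieldValue WHERE DataFieldId = ?", (field_id,)
    ).fetchone()[0]


conn.execute("INSERT INTO DataField (Id, Name, ValueType) VALUES (1, 'Age', 'int')")
for age in (30, 40, 50):
    store(conn, 1, age)

print(average(conn, 1))  # 40.0
```

The column name is taken from a fixed dictionary rather than from user input, so building the SQL string with it is safe in this sketch.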

Another possibility is just storing everything as String, and casting this in the queries. I am using .Net with NHibernate, and I see that at least here there is a Projections.Cast that can be used to cast e.g. string to int in the query.
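To illustrate what the string-only variant looks like at query time, here is a small sketch in plain SQL via Python's sqlite3 module (not NHibernate). The single Value column and the sample data are assumptions; the point is only that every numeric operation needs an explicit cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FieldValue ("
    "Id INTEGER PRIMARY KEY, DataFieldId INTEGER NOT NULL, Value TEXT)"
)
conn.executemany(
    "INSERT INTO FieldValue (DataFieldId, Value) VALUES (?, ?)",
    [(1, "9"), (1, "10"), (1, "100")],
)

# Without a cast the values compare as text, so the ordering is lexicographic.
text_order = [
    row[0]
    for row in conn.execute(
        "SELECT Value FROM FieldValue WHERE DataFieldId = 1 ORDER BY Value"
    )
]

# With a cast the database treats the stored strings as integers.
numeric_order = [
    row[0]
    for row in conn.execute(
        "SELECT CAST(Value AS INTEGER) FROM FieldValue "
        "WHERE DataFieldId = 1 ORDER BY CAST(Value AS INTEGER)"
    )
]

total = conn.execute(
    "SELECT SUM(CAST(Value AS INTEGER)) FROM FieldValue WHERE DataFieldId = 1"
).fetchone()[0]

print(text_order)     # ['10', '100', '9']
print(numeric_order)  # [9, 10, 100]
print(total)          # 119
```

A forgotten cast does not fail; it silently gives text semantics, which is the main risk of this approach.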

Either way in these two solutions I need to know which type to use when doing the query, but I will know that from the DataField, so that won't be a problem.

Anyway, I don't think either of these solutions sounds good. Are they? Or is there a better way?

A: 

If you only have 2 textboxes that the user can enter values into, and each holds either a string or a number, do you really need to distinguish between int and double? Can't you just store every number as a suitable numeric type (depending on the DB)? That would get you down to just 2 different types, and a possible solution would then be to have 2 tables, one for each type.

In general though, whenever it's difficult to know what datatype something is going to be, I start worrying that the project is trying to become too generic, which has a tendency to become quite messy.

ho1
The example is simplified. The real scenario will be more complex, with more types. The generic approach is completely necessary for the project, but I try to restrict it to only what's strictly necessary. And the different types are definitely part of what I really need.
stiank81
+1  A: 

There is no 3rd "magic" option: your specific situation determines how you want to proceed.

From my experience, a string-only solution makes sense with things like application settings. Usually, I don't need to use such data in queries directly so it doesn't bother me much that it's in string form.

I'm not sure, but it seems to be the case that you are extending an entity with custom attributes, which sounds like you might want to do some processing in the database at some point. In this case, you may as well go with the multiple-column approach and fill in only the column with the right type. It's not going to be pretty, but it might simplify queries.

As I said, it depends on if and what kind of queries you need to run, what kind of performance you need etc.

Tomislav Nakic-Alfirevic
Thx for your thoughts on this. I will run queries on large datasets stored like this, but I don't really depend on the queries to be fast. It is more like a "build statistics" functionality where you don't expect it to finish instantly.
stiank81
+1  A: 

Pulling the idea from Multi-tenant data storage, you can use the 'Name-Pair' values idea as described on MSDN. I suspect the article as a whole will be useful to you, beyond the specific section noted.

In effect, to make this a scalable solution, you will need to define the types of data for your custom form using a metadata table in which you record the actual type of the data you want to store (e.g. bool, text, int, datetime). You may also consider storing the .Net type, since this can help with input validation etc. Other details worth storing are the names of the fields as you expect them to appear on your custom form. Using this approach you build the custom form from the stored metadata.

I have successfully used this approach and it works great. As an addition, we also used the metadata table to define whether the expected value for the custom field is user provided (e.g. name, date of birth) or a pre-defined system value chosen from a drop-down list (e.g. a list of cities or countries). To support this, we have an additional table that contains the options for each list and is linked back to the metadata table.
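A rough sketch of such a metadata table plus an options table, again with Python's sqlite3 module. Every table, column, and function name here (FieldMetadata, FieldOption, IsListValue, describe_form) is hypothetical; the answer does not give the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FieldMetadata (
    Id INTEGER PRIMARY KEY,
    Label TEXT NOT NULL,                    -- name shown on the custom form
    DataType TEXT NOT NULL,                 -- e.g. bool, text, int, datetime
    NetType TEXT,                           -- e.g. System.String, for validation
    IsListValue INTEGER NOT NULL DEFAULT 0  -- 1 = value picked from a list
);
CREATE TABLE FieldOption (
    Id INTEGER PRIMARY KEY,
    FieldMetadataId INTEGER NOT NULL REFERENCES FieldMetadata(Id),
    OptionText TEXT NOT NULL
);
INSERT INTO FieldMetadata (Id, Label, DataType, NetType, IsListValue) VALUES
    (1, 'Name', 'text', 'System.String', 0),
    (2, 'Country', 'text', 'System.String', 1);
INSERT INTO FieldOption (FieldMetadataId, OptionText) VALUES
    (2, 'Norway'),
    (2, 'Sweden');
""")


def describe_form(conn):
    """Build a plain description of the form purely from the stored metadata."""
    fields = conn.execute(
        "SELECT Id, Label, DataType, IsListValue FROM FieldMetadata ORDER BY Id"
    ).fetchall()
    form = []
    for field_id, label, data_type, is_list in fields:
        options = []
        if is_list:
            options = [
                row[0]
                for row in conn.execute(
                    "SELECT OptionText FROM FieldOption "
                    "WHERE FieldMetadataId = ? ORDER BY Id",
                    (field_id,),
                )
            ]
        form.append(
            {
                "label": label,
                "type": data_type,
                "control": "dropdown" if is_list else "textbox",
                "options": options,
            }
        )
    return form


form = describe_form(conn)
for field in form:
    print(field)
```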

Ahmad
Thanks. There are some ideas here that might be helpful.
stiank81
A: 

Before offering my view, I would say that you may need to go back to the ER board. I suppose a CustomForm has many Fields and that these Fields are of different kinds (text, data, maybe even behavior and style). Rather than generalizing the concept of a Field, it might be worth creating one table for each type of Field, e.g. DateField, UserNameField, etc. That way each value is stored in exactly one non-null column with the correct type. I would also bet this would simplify your code (fewer conditions to check, since the db gives you all the info).

That said, you may not be able to go back to the ER board, or there may be valid underlying reasons for taking the multiple-columns-per-type approach. Here are some pros and cons of such an approach.

Pros

  • all in one table, may turn out to be faster (though remember the root of all evil is..)

Cons

  • DB can't enforce that not more than one value is specified (extra code to check exceptions)
  • DB can't enforce that at least one is not NULL (extra code to check for FieldValues without any value)
  • Adding support for a new data type requires changing all existing data (adding a NULL value in the new column).

If you are stuck with the CustomForm -> Field -> FieldValue structure, I would suggest creating one table per FieldValue type, e.g.

IntFieldValue
-------------
Id
DataFieldId
Value

DecimalFieldValue
-----------------
Id
DataFieldId
Decimal

DateFieldValue
--------------
Id
DataFieldId
Date

With the above you can still create a view which selects from the above tables. The view can be created to offer one column per value type and guarantee that one and only one of them is not NULL. This is also easier to extend (add a new table, change the view, but it does not require updating any existing data with null values for the new type column).
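As an illustration of the tables-plus-view idea, here is a minimal sketch using Python's sqlite3 module. It assumes each per-type table names its value column Value, stores dates as ISO text, and uses REAL for the decimal table (SQLite has no exact decimal type); a real schema would use the target database's own types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IntFieldValue (
    Id INTEGER PRIMARY KEY,
    DataFieldId INTEGER NOT NULL,
    Value INTEGER NOT NULL
);
CREATE TABLE DecimalFieldValue (
    Id INTEGER PRIMARY KEY,
    DataFieldId INTEGER NOT NULL,
    Value REAL NOT NULL
);
CREATE TABLE DateFieldValue (
    Id INTEGER PRIMARY KEY,
    DataFieldId INTEGER NOT NULL,
    Value TEXT NOT NULL   -- ISO-8601 date string
);

-- One column per value type; each row comes from exactly one table,
-- so exactly one of the typed columns is non-NULL.
CREATE VIEW FieldValue AS
    SELECT DataFieldId, Value AS IntValue, NULL AS DecimalValue, NULL AS DateValue
    FROM IntFieldValue
    UNION ALL
    SELECT DataFieldId, NULL, Value, NULL FROM DecimalFieldValue
    UNION ALL
    SELECT DataFieldId, NULL, NULL, Value FROM DateFieldValue;

INSERT INTO IntFieldValue (DataFieldId, Value) VALUES (1, 42);
INSERT INTO DecimalFieldValue (DataFieldId, Value) VALUES (2, 3.5);
INSERT INTO DateFieldValue (DataFieldId, Value) VALUES (3, '2010-04-01');
""")

rows = conn.execute(
    "SELECT DataFieldId, IntValue, DecimalValue, DateValue "
    "FROM FieldValue ORDER BY DataFieldId"
).fetchall()

for row in rows:
    print(row)
# (1, 42, None, None)
# (2, None, 3.5, None)
# (3, None, None, '2010-04-01')
```

Supporting a new type in this sketch means adding one table and one more UNION ALL branch to the view; the existing rows are not touched.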