Hi there, I'm aware of several questions on this forum relating to this, but I'm not talking about splitting tables for the same entity (like user, for example).

Suppose I have a huge options table that stores list options such as Gender, Marital Status, and many more domain-specific groups with the same structure. I plan to capture these in an OPTIONS table. Another simple option is to define the field as an ENUM, but that has disadvantages as well: http://www.brandonsavage.net/why-you-should-replace-enum-with-something-else/

OPTIONS Table:

option_id <will be referred instead of the name>
name
value <more like a description, and not a name/value pair>
group

Query: select .. from options where group = '15'

Usage: Gender & Marital_Status will be columns in the Persons table; however, the values stored will come from Options

    eg. 
    Person 
    ..
    id=34 name=Prasad gender=31 marital_status=41
    .. 

    Options
    .. 
    31 gender male male
    32 gender female female
    ...
    41 marital_status single single 
    42 marital_status married married
    ..
  • Since this table is expected to be multi-tenant, the number of rows could grow drastically.
  • I believe splitting the table into one table per group, instead of filtering by the group, would be easier to write & faster to execute.
  • Or perhaps partitioning by the group or tenant?
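
For reference, here is a minimal sketch of the single-table layout described above, using Python's built-in sqlite3 module. The tenant_id column, the index, the column name grp (group is a reserved word in SQL), and the two helper functions are assumptions made for illustration, not part of the schema listed earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE options (
    option_id  INTEGER PRIMARY KEY,
    tenant_id  INTEGER NOT NULL,   -- assumed column for multi-tenancy
    grp        TEXT    NOT NULL,   -- 'group' is a reserved word in SQL
    name       TEXT    NOT NULL,
    value      TEXT    NOT NULL    -- description shown to the user
);
-- Composite index so lookups by tenant and group avoid a full table scan.
CREATE INDEX idx_options_tenant_grp ON options (tenant_id, grp);

CREATE TABLE person (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    gender         INTEGER REFERENCES options (option_id),
    marital_status INTEGER REFERENCES options (option_id)
);
""")
conn.executemany(
    "INSERT INTO options VALUES (?, ?, ?, ?, ?)",
    [
        (31, 1, "gender", "male", "male"),
        (32, 1, "gender", "female", "female"),
        (41, 1, "marital_status", "single", "single"),
        (42, 1, "marital_status", "married", "married"),
    ],
)
conn.execute("INSERT INTO person VALUES (34, 'Prasad', 31, 41)")


def options_for(tenant_id, grp):
    """Return (option_id, value) pairs for one tenant's option group."""
    rows = conn.execute(
        "SELECT option_id, value FROM options "
        "WHERE tenant_id = ? AND grp = ? ORDER BY option_id",
        (tenant_id, grp),
    )
    return rows.fetchall()


def describe_person(person_id):
    """Resolve the option ids stored on a person into readable values."""
    return conn.execute(
        "SELECT p.name, g.value, m.value FROM person p "
        "JOIN options g ON g.option_id = p.gender "
        "JOIN options m ON m.option_id = p.marital_status "
        "WHERE p.id = ?",
        (person_id,),
    ).fetchone()
```

With an index on (tenant_id, grp), filtering by group is an index lookup rather than a scan, which is one thing to measure before deciding to split or partition.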

Please suggest. Thanks.

+1  A: 

This is essentially an EAV model, with all of the advantages and disadvantages therein.

An EAV model is used in circumstances where the number of attributes (properties, parameters) that can be used to describe a thing (an "entity" or "object") is potentially vast, but the number that will actually apply to a given entity is relatively modest. It is also known as a "sparse matrix."

A good example of an appropriate use for an EAV table is symptoms in a medical database. Although there are potentially thousands of possible symptoms, the average person going to the doctor will only present with a much smaller number of symptoms.
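
To make the sparse-matrix idea concrete, here is a small sketch of an EAV table for symptoms using Python's sqlite3 module. The table name, column names, and sample rows are invented for illustration. Each patient gets one row per symptom they actually present with, rather than one column per possible symptom.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_symptom (
    patient_id INTEGER NOT NULL,   -- entity
    symptom    TEXT    NOT NULL,   -- attribute
    value      TEXT    NOT NULL,   -- value
    PRIMARY KEY (patient_id, symptom)
);
""")
# Illustrative data only: rows exist just for symptoms that were recorded.
conn.executemany(
    "INSERT INTO patient_symptom VALUES (?, ?, ?)",
    [
        (1, "fever", "38.5 C"),
        (1, "cough", "dry"),
        (2, "headache", "mild"),
    ],
)


def symptoms_of(patient_id):
    """Return the recorded symptoms of one patient as a dict."""
    rows = conn.execute(
        "SELECT symptom, value FROM patient_symptom "
        "WHERE patient_id = ? ORDER BY symptom",
        (patient_id,),
    )
    return dict(rows.fetchall())
```

A patient with no recorded symptoms simply has no rows, which is where the storage saving over a wide table comes from.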

The Wikipedia article about EAV should tell you whether this model is appropriate for your particular application, and suggest some best practices in that regard.

Note that if your example columns are Gender and Marital Status, and you have a Persons table, those columns more appropriately belong in the Persons table, not an EAV table.

Robert Harvey
Thanks a lot, Robert. You slightly misunderstood: Gender and Marital_Status will be in the Persons table; however, the value stored will come from Options, e.g. Person: id=34 name=Prasad gender=31 marital_status=42. Options: 31 gender male male; 32 gender female female; 41 marital_status single single; 42 marital_status married married.
Prasad
+1  A: 

The system I work on has this precise problem. It's in the health care domain.

We have some standardized code tables, like gender (obvious) and patient status (inpatient, outpatient, emergency, observation, preop). We handle each of these as a separate small table. These tables, being tiny and fairly static in content, don't require much maintenance. So in these cases we embrace the efficiency of making the tables tiny, and pay the cost of having a variety of them.
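
A rough sketch of the separate-small-tables approach, using Python's sqlite3 module. The table names, the short codes, and the can_insert helper are made up for illustration; only the status labels come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE gender (code TEXT PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE patient_status (code TEXT PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE patient (
    id     INTEGER PRIMARY KEY,
    gender TEXT NOT NULL REFERENCES gender (code),
    status TEXT NOT NULL REFERENCES patient_status (code)
);
INSERT INTO gender VALUES ('M', 'male'), ('F', 'female');
INSERT INTO patient_status VALUES
    ('IN', 'inpatient'), ('OUT', 'outpatient'), ('ER', 'emergency'),
    ('OBS', 'observation'), ('PRE', 'preop');
INSERT INTO patient VALUES (1, 'F', 'OBS');
""")


def can_insert(patient_id, gender, status):
    """Try to insert a patient; return False if a code is not in its table."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO patient VALUES (?, ?, ?)",
                (patient_id, gender, status),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

One benefit of this layout is that each foreign key points at exactly one kind of code, so the database itself rejects a status that is not in patient_status.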

But then we also have some tables whose values are fed to us by our hospital customers: such things as religion and next-of-kin relationship (daughter, father, etc.). We also handle diagnoses in this table, because hospitals have different ways of coding these things, and they are ever-expanding. These tables routinely get new values when we add new hospital customers to our system, and when those hospitals encounter new problems.

Both the values in these tables, and the types of tables we need to keep, reflect the diversity of human life, and the fact that our hospital customers often discover new things about their patients. In this case it makes sense to keep all these codes in just one reference table. Each entry has an id. We also assign a customer ID and a code type (e.g. religion, diagnosis), code name (e.g. PROT, CATH, BUDD) and code value (e.g. Protestant, Catholic, Buddhist). Finally we add a priority, which lets us control the order of picklists in our app.
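
A minimal sketch of such a combined reference table, using Python's sqlite3 module. The column names follow the description above, but the table name, the uniqueness constraint, the customer ids, the priority numbers, and the rule that lower priority sorts first are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reference_code (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    code_type   TEXT    NOT NULL,            -- e.g. religion, diagnosis
    code_name   TEXT    NOT NULL,            -- e.g. PROT, CATH, BUDD
    code_value  TEXT    NOT NULL,            -- e.g. Protestant
    priority    INTEGER NOT NULL DEFAULT 0,  -- assumed: lower sorts first
    UNIQUE (customer_id, code_type, code_name)
);
""")
conn.executemany(
    "INSERT INTO reference_code "
    "(customer_id, code_type, code_name, code_value, priority) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (7, "religion", "PROT", "Protestant", 2),
        (7, "religion", "CATH", "Catholic", 1),
        (7, "religion", "BUDD", "Buddhist", 3),
        (8, "religion", "CATH", "Catholic", 1),
    ],
)


def picklist(customer_id, code_type):
    """Return (code_name, code_value) pairs in picklist display order."""
    rows = conn.execute(
        "SELECT code_name, code_value FROM reference_code "
        "WHERE customer_id = ? AND code_type = ? "
        "ORDER BY priority, code_name",
        (customer_id, code_type),
    )
    return rows.fetchall()
```

Because every code type lives in the same table, one function like picklist can drive every dropdown in the app.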

In this case the efficiency hit of a single large table is offset by the fact that we can have one code base to maintain this table, and a unified user interface to it.

DON'T put people's names, or any other potentially confidential information, in this code table, unless you want to deal with a complex security problem under a lot of pressure sometime in the future.

If you're working in health care IT, you'd better figure out what you will do about ICD-9 and ICD-10 diagnosis codes. A switchover is coming, and it's not going to be easy.

Good luck

Ollie Jones
Thanks a ton, Ollie. No, I'm working in a very simple domain. Good luck to you too with the migration to the new codes :)
Prasad