views:

133

answers:

6

I have a table and am debating between 2 different ways to store information. It has a structure like so

int id

int FK_id

varchar(50) info1

varchar(50) info2

varchar(50) info3

int forTable or char(3) forTable

The FK_id can be a foreign key to one of 6 tables so I need another field to determine which table it's for.

I see two solutions:

  • An integer that is a FK to a settings table which has its actual value.
  • A char(3) field with the a abbreviated version of the table.

I am wondering if anyone knows if one will be more beneficial speed wise over the other or if there will be any major problems using the char(3)

Note: I will be creating an indexed view on each of the 6 different values for this field. This table will contain ~30k rows and will need to be joined with much larger tables

+1  A: 

Have you considered using a TinyInt. Only takes one byte to store it's value. TinyInt has a range of values between 0 and 255.

G Mastros
Without specifically looking at the data how would I know which number goes with which table? Especially a month or so from now. Also is there much benefit to this over just a normal int thats a FK to a settings table which would contain the name of the corresponding table?
corymathews
With only ~30K rows, there probably wouldn't be much difference at all. In fact, I doubt you would *notice* any difference between an int, small int, tiny int, or char(3). Also, if you only have 6 different values in this table, it probably won't be selective enough to be a good candidate for the first column in an index either.
G Mastros
A: 

I think a small integer (tinyint) is called for here. An "abbreviated version" looks too much like a magic number.

I also think performance wise the integer should beat the char(3).

Vinko Vrsalovic
the abbreviated version would be something like 'ppl' for people or similar.
corymathews
+1  A: 

In this case, it probably doesn't matter except for the collation overhead (A vs a vs ä va à)

I'd use char(3), say for currency code like CHF, GBP etc But if my natural key was "Swiss Franc", "British Pound" etc, I'd take the numeric.

3 bytes + collation vs 4 bytes numeric? You'd need a zillion rows or be running a medium sized country before it mattered...

gbn
A: 

First off, a 50 character Id that is not globally unique sounds a little scary. Do the IDs have some meaning? If not, you can easily get a GUID in less space. Personally, I am a big fan of making things human readable whenever possible. I would, and have, put the full name in graphs until I needed to do otherwise. My preference would be to have linking tables for each possible related table though.

Unless you are talking about really large scale, you are much better off decreasing the size of the IDs and taking a few more characters for the name of the table. For really large scale, I would decrease the size of the IDs and use an integer.

Jacob

TheJacobTaylor
Sorry jacob the editor (or I) got a bit mixed up when writing it. The PK is an integer auto increment field. The varchar50 was not supposed to be on that line. Somehow it got placed wrong and I missed it when quickly reading it.
corymathews
Gotcha. 32 bits vs 100 Bytes makes a bit of a difference. I would say that from a speed perspective integer comparison is a single CPU instruction. String comparison is typically multiple (at least one per character). While they could handle more concurrently, I am not aware of them doing so.
TheJacobTaylor
+1  A: 

Because your FK cannot be enforced (since it is a variant depending upon type) by database constraint, I would strongly consider re-evaluating your design to use link tables, where each link table includes two FK columns, one to the PK of the entity and one to the PK of one of the 6 tables.

While this might seem to be overkill, it makes a lot of things simpler and adding new link tables is no more complex than accommodating new FK-types. In addition, it is more easily expandable to the case where an entity needs more than a 1-1 relationship to a single table, or needs multiple 1-1 relationships to the 6 other entities.

In a varying-FK scenario, you can lose database consistency, you can join to the wrong entity by neglecting to filter on type code, etc.

I should add that another huge benefit of link tables is that you can link to tables which have keys of varying data types (ints, natural keys, etc) without having to add surrograte keys or stored the key in a varchar or similar workarounds which are prone to problems.

Cade Roux
Or if you don't use or can't link tables (which is an idea I agree with), then you need a trigger to maintain data integrity as you can't set one field as an FK to 6 different tables.
HLGEM
+1  A: 

Is the reason you need a single table that you want to ensure that when the six parent tables reference a given instance of a child row that is guaranteed to be the same instance? This is the classic "multi-parent" problem. An example of where you might run into this is with addresses or phone numbers with multiple person/contact tables.

I can think of a couple of options:

Choice 1: A link table for each parent table. This would be the Hoyle architecture. So, something like:

Create Table MyTable(
                    id int not null Primary Key Clustered
                    , info1 varchar(50) null
                    , info2 varchar(50) null
                    , info3 varchar(50) null
                    )

Create Table LinkTable1(
                        MyTableId int not null
                        , ParentTable1Id int not null
                        , Constraint PK_LinkTable1 Primary Key Clustered( MyTableId, ParentTable1Id )
                        , Constraint FK_LinkTable1_ParentTable1
                            Foreign Key ( MyTableId )
                            References MyTable ( Id )   
                        , Constraint FK_LinkTable1_ParentTable1
                            Foreign Key ( ParentTable1Id )
                            References ParentTable1 ( Id )  
                        )
...
Create Table LinkTable2...LinkTable3

Choice 2. If you knew that you would never have more than say six tables and were willing to accept some denormalization and a fugly design, you could add six foreign keys to your main table. That avoids the problem of populating a bunch of link tables and ensures proper referential integrity. However, that design can quickly get out of hand if the number of parents grows.

If you are content with your existing design, then with respect to the field size, I would use the full table name. Frankly, the difference in performance between a char(3) and a varchar(50) or even varchar(128) will be negligible for the amount of data you are likely to put in the table. If you really thought you were going to have millions of rows, then I would strongly consider the option of linking tables.

If you wanted to stay with your design and wanted the maximum performance, then I would use a tinyint with a foreign key to a table that contained the list of the six tables with a tinyint primary key. That prevents the number from being "magic" and ensures that you narrow down the list of parent tables. Of course, it still does not prevent orphaned records. In this design, you have to use triggers to do that.

Thomas