views:

2035

answers:

4

I'm now to this point of my project that I need to design my database (Oracle). Usually for the status and countries tables I don’t use a numeric primary key, for example

STATUS (max 6)
AC --> Active
DE --> Deleted

COUNTRIES (total 30)
UK --> United Kingdom
IT --> Italy
GR --> Greece

These tables are static, not updated through the application and it's not foreseen to be change in the future so there is no chance having update problems in tables that will use these values as foreign keys.

The main table of the application will use status and country (more than once e.g. origin country, destination country) and it is foreseen that 600000 rows will be added per year

So my question is, will these VARCHAR(2) keys will have an impact in the performance when querying the join of there 3 tables. Will the first be significantly slower than the second?

SELECT m.*,
       s.status_name,
       c.country_name
  FROM main m, status s, countries c
 WHERE m.status_cd = s.status_cd
   AND m.country_cd = c.country_cd
   AND m.status_cd = 'AC'
   AND m.country_cd = 'UK'

SELECT m.*,
       s.status_name,
       c.country_name
  FROM main m, status s, countries c
 WHERE m.status_cd = s.status_cd
   AND m.country_cd = c.country_cd
   AND m.status_cd = 1
   AND m.country_cd = 2


Clarification:

Status is not binary ("max 6" next to the table name). The values will probably be:

* active
* deleted
* draft
* send
* replaced

and we need to display the decoded values to the user, so we need the names.

A: 

If 'status' is (and will always be?) a binary active/deleted field why bother with the table at all. It seems like normalization taken to an impractical extreme.

It would certainly be quicker, not to mention easier, to simply use a tinyint(1) field and record the active/deleted state as a 1 or 0.

This eliminates one of your joins entirely which has got to be a good thing.

philistyne
Since the question says there are 6 status codes, it isn't a binary field.
Jonathan Leffler
+1  A: 

Check out this link. Bottom line is there isn't much performance difference between varchar and num. So you should go for which ever makes sense for the column. Here, the varchar seems to make more sense.

trex279
A: 

It does not matter which methode you choose in this case. The important part is to use the same kind throughout the database and be consistent in your id convention.

Coentje
+1  A: 

Both the status and country tables are so small that they are going to be memory resident in practice, whether formally stated as such or not. Indeed, except that a foreign key normally requires an index on the referenced primary key field, you might be tempted not to bother with any indexes on the tables.

The performance difference between the joins with different types is going to be negligible, and the numeric code will, if anything, be slower since there's 'more' data to store (but it is all so small that it is negligible, again).

So, go with the natural codes. All else apart, the SQL in the first example is clearer; the 'UK' and 'AC' are much more meaningful than 1 and 2.

In non-Oracle DBMS, you would probably use CHAR(2) for both the status and country code values. Oracle users tend to use VARCHAR2 for everything; I'm not sure whether there is a penalty for using a CHAR(2) column instead, especially since the column values are fixed length. (Under Informix, for instance, a VARCHAR(2) field - a field of up to two characters - would store as 3 bytes, a length (always 2 in your case) and the 2 data bytes. By contrast, a CHAR(2) field would occupy just 2 bytes.)

Jonathan Leffler