tags:

views:

216

answers:

5

Are there any performance drawbacks in SQL to joining a table on a char (or varchar) value, as opposed to say joining on an integer value?

+1  A: 

No there is not (in Oracle). What can make a difference is:
- If datatypes in your join-condition are different (implicit conversion can have performance penalty). So joining a char to a varchar can be bad.
- If one field has an index and the other not.
- If the data in one column is much longer than in the other column.

Edwin
A: 

No, as long as the datatypes are the same between the tables that you join you should be fine

SQLMenace
+1  A: 

A char or varchar will be stored as ASCII or unicode depending on your character set. An int will be stored as some packed form. If you take a number in int form it will be smaller than the same number in character (string) form. Storing as numeric types will always be more performant than storing as characters.

Logicalmind
+4  A: 

Joining on a char or var char will normally have an overhead as opposed to joining on an int. There are two factors I'm aware of when comparing chars:
- collation has to be taken account of
- char fields have more data to be compared*

*(each character is represented by 8 bits, a 10 char value is thus 80 bits long, compared to 32 bits for an int)

*(this is 16 bits per character if using NCHAR or NVARCHAR)

Dems
I knew this question would raise your interest :)
Russ Cam
He says this because if I see joins on character fields I blow my top :)
Dems
+2  A: 

There are performance considerations related to the size of the join column. An integer is 4 bytes wide - so any char value larger than 4 bytes requires more disk I/O, cache space, etc. Obviously, a char(5) wouldn't affect much, but a char(4000) would be extraordinarily wasteful. Double the char sizes if you use unicode types.

Usually this question comes up in the context of natural vs surrogate keys. In that argument, the size of the datatype is not the only factor - as you can often avoid a join with a natural (char) key.

Of course, neither one is likely to be the bottleneck in your application - so design with impunity (at least in this area).

Mark Brackett
It depends on volume before you can say it won't be significant. When processing large amounts of data stored in a warehouse, joining on CHAR(64) fields is noticably less efficient than joining on INT fields.
Dems
In reality, that char(64) would hold some piece of domain data that is unique (the natural key). Often, you can avoid the join if you already had the natural key - which is even better than the int.But yes, if you're designing a large data warehouse, you can have your own opinion on these matters.
Mark Brackett