Are there any performance drawbacks in SQL to joining a table on a char (or varchar) value, as opposed to say joining on an integer value?
No, there is not (in Oracle). What can make a difference is:
- If the datatypes in your join condition differ (the implicit conversion can carry a performance penalty). So joining a char to a varchar can be bad.
- If one field has an index and the other does not.
- If the data in one column is much longer than in the other column.
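A minimal sketch of the mismatch case, using hypothetical `orders`/`customers` tables (the column names and types here are illustrative, not from any particular schema):

```sql
-- orders.customer_code is varchar(10); customers.code is char(10).
-- The join still works, but the engine must reconcile the two types
-- (implicit conversion, or blank-padded comparison semantics in
-- Oracle), which can prevent an index from being used on one side.
SELECT o.order_id, c.name
FROM   orders    o
JOIN   customers c ON c.code = o.customer_code;

-- Declaring both columns with the same datatype (e.g. both
-- varchar(10)) avoids the conversion entirely.
```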
No, as long as the datatypes are the same between the tables that you join you should be fine
A char or varchar will be stored as ASCII or Unicode, depending on your character set. An int will be stored in a packed binary form. A number in int form will be smaller than the same number in character (string) form, so storing values as numeric types will generally be more performant than storing them as characters.
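One way to see the size difference (a sketch assuming SQL Server syntax; `DATALENGTH` returns the number of bytes used to store a value):

```sql
DECLARE @as_int  int         = 1234567890;
DECLARE @as_char varchar(10) = '1234567890';

-- The int occupies 4 bytes regardless of the value; the same number
-- as a string occupies one byte per digit (two per digit if nvarchar).
SELECT DATALENGTH(@as_int)  AS int_bytes,   -- 4
       DATALENGTH(@as_char) AS char_bytes;  -- 10
```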
Joining on a char or varchar will normally have some overhead compared to joining on an int. There are two factors I'm aware of when comparing chars:
- collation has to be taken account of
- char fields have more data to be compared*
*(each character is represented by 8 bits, so a 10-char value is 80 bits long, compared to 32 bits for an int)
*(this becomes 16 bits per character if using NCHAR or NVARCHAR)
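The collation point can bite in practice when the two char columns don't share a collation (a sketch assuming SQL Server syntax; the table names and the `Latin1_General_CI_AS` choice are illustrative):

```sql
-- If table_a.code and table_b.code have different collations, the
-- join needs an explicit COLLATE clause, and the resulting collation
-- conversion can prevent an index seek on the converted side.
SELECT a.id
FROM   table_a a
JOIN   table_b b
  ON   a.code = b.code COLLATE Latin1_General_CI_AS;
```

Integer comparisons carry no such collation rules, which is part of why they are cheaper.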
There are performance considerations related to the size of the join column. An integer is 4 bytes wide - so any char value larger than 4 bytes requires more disk I/O, cache space, etc. Obviously, a char(5) wouldn't affect much, but a char(4000) would be extraordinarily wasteful. Double the char sizes if you use Unicode types.
Usually this question comes up in the context of natural vs surrogate keys. In that argument, the size of the datatype is not the only factor - as you can often avoid a join with a natural (char) key.
Of course, neither one is likely to be the bottleneck in your application - so design with impunity (at least in this area).
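To illustrate the natural-key point above (a hypothetical sketch; the `orders`/`states` schema is invented for the example):

```sql
-- With a surrogate integer key, recovering the value requires a join:
SELECT o.order_id, s.state_code
FROM   orders o
JOIN   states s ON s.state_id = o.state_id;

-- With a natural char(2) key stored directly on the row, no join is
-- needed at all:
SELECT order_id, state_code
FROM   orders;
```

So the wider char key can pay for itself by eliminating joins, which is why datatype size alone doesn't settle the argument.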