views:

64

answers:

3

With MySQL I often overlook some options like 'signed/unsigned' ints and 'allow null' but I'm wondering if these details could slow a web application down.

Are there any notable performance differences in these situations?

  1. using a low/high range of Integer primary key
    • 5000 rows with ids from 1 to 5000
    • 5000 rows with ids from 20001 to 25000
  2. Integer PK incrementing uniformly vs non-uniformly.
    • 5000 rows with ids from 1 to 5000
    • 5000 rows with ids scattered from 1 to 30000
  3. Setting an Integer PK as unsigned vs. signed
    • example: where the gain in range of unsigned isn't actually needed
  4. Setting a default value for a field (any type) vs. no default
    • example: update a row and all field data is given
  5. Allow Null vs deny Null
    • example: updating a row and all field data is given

I'm using MySQL, but this is more of a general question, thanks.

+1  A: 

From my understanding of B-trees (that's how relational databases are usually implemented, right?), these things should not make any difference. All you need is a fast comparison function on your key, and it usually doesn't matter what range of integers you use (unless you get out of the machine word size).

Of course, for keys, a uniform default value or allowing null doesn't make much sense. In all non-key fields, allowing null or providing default values should not have any significant impact.

Svante
+1  A: 

5000 rows is almost nothing for a database. They normally use large B-trees for indexes, so they don't care much about the distribution of primary keys.

Generally, whether to use the other options should be based on what you need from the database application. They can't significantly affect the performance. So, use a default value when you want a default value, use a NOT NULL contraint when you don't want the column to be NULL.

If you have database performance issues, you should look for more important problems like missing indexes, slow queries that can be rewritten efficiently, making sure that the database has accurate statistics about the data so it can use indexes the right way (although this is an admin task).

Lukáš Lalinský
A: 

using a low/high range of Integer primary key * 5000 rows with ids from 1 to 5000 * 5000 rows with ids from 20001 to 25000

Does not make any difference.

Integer PK incrementing uniformly vs non-uniformly. * 5000 rows with ids from 1 to 5000 * 5000 rows with ids scattered from 1 to 30000

If the distribution is uniform, this makes not difference.

Uniform distribution may help to build more efficient random sampling query, like described in this article in my blog:

It's distribution which matters, not bounds: 1, 11, 21, 31 is OK, 1, 2, 3, 31 is not.

Setting an Integer PK as unsigned vs. signed * example: where the gain in range of unsigned isn't actually needed

If you declare PRIMARY KEY as UNSIGNED, MySQL can optimize out predicates like id >= -1

Setting a default value for a field (any type) vs. no default * example: update a row and all field data is given

No difference.

Allow Null vs deny Null * example: updating a row and all field data is given

Nullable columns are one byte larger: the index key for an INT NOT NULL is 5 bytes long, that for an INT NULL is 4 bytes long.

Quassnoi