I'm using the following query to generate a unique new user:

INSERT INTO `users`
SET `display_name` = CONCAT(
    'user',
    (
        IF(
            EXISTS(
                SELECT COUNT(`id`) FROM (
                    SELECT `id` FROM `users`
                ) AS `A`),
            (SELECT COUNT(`id`) + 1 FROM (
                SELECT `id` FROM `users`
                ) AS `A`),
            1
        )
    )
);

The column id is an INT and an AUTO_INCREMENT PRIMARY KEY. The column display_name is a VARCHAR and is NOT NULL. The query inserts a new user with a sequential display name, such as user1, user2, user3, user4, ...

DESCRIBE returned this:

id    select_type    table    type    possible_keys    key       key_len    ref     rows    Extra
1     PRIMARY        NULL     NULL    NULL             NULL      NULL       NULL    NULL    No tables used
2     SUBQUERY       NULL     NULL    NULL             NULL      NULL       NULL    NULL    Select tables optimized away
3     DERIVED        users    index   NULL             UNIQUE    302        NULL    NULL    Using index
4     SUBQUERY       NULL     NULL    NULL             NULL      NULL       NULL    NULL    Select tables optimized away
5     DERIVED        users    index   NULL             UNIQUE    302        NULL    NULL    Using index

Is this query efficient? If not, what is a more efficient method?

+1  A: 

Leaving efficiency aside for a second, your query won't work once you have deleted a user row. COUNT() will decrease in that case, and you will reuse an old display_name.

With that said, MAX() is probably the function that better serves your logic. It would let you get rid of the EXISTS check and the second subquery, which should make the query faster.
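
As a rough sketch of what that simplification could look like (assuming the `users` table from the question), an INSERT ... SELECT can read from the target table directly, so the derived-table wrapper is no longer needed. COALESCE covers the empty-table case that the EXISTS check was meant to handle. This is only an illustration: it still has the concurrency problem raised elsewhere in this thread, and the generated number need not match the new row's id.

```sql
-- Sketch only: one pass over the primary key index instead of two subqueries.
-- MAX(id) is NULL on an empty table, so COALESCE turns that into 0 and the
-- first user becomes 'user1'.
INSERT INTO `users` (`display_name`)
SELECT CONCAT('user', COALESCE(MAX(`id`), 0) + 1)
FROM `users`;
```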

However, if your logic becomes complicated, I suggest looking into a user-defined function or stored procedure, depending on your version of MySQL. These are compiled and will be much faster and look cleaner, e.g.

INSERT INTO users SET display_name = UNIQUE_DISPLAY_NAME();
Jason McCreary
You're right about removing the EXISTS portion and the subquery. I should change COUNT() to MAX().
Exception
@Exception Bill is right above: you still have concurrency issues with `MAX()`. Why don't you just run an UPDATE statement after the INSERT and use `CONCAT('user', LAST_INSERT_ID())`?
Jason McCreary
+1  A: 

Your query is not a good technique because it can produce race anomalies and delete anomalies.

For example, if two clients go to create a user at the same time, they could both count 15 existing users and both allocate user16.
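
The interleaving can be pictured roughly as follows. The original single statement is split into a read step and a write step here purely for illustration, and the two sessions are hypothetical concurrent clients.

```sql
-- Session A
SELECT COUNT(`id`) + 1 FROM `users`;   -- A sees 15 rows and computes 16

-- Session B, before A has inserted anything
SELECT COUNT(`id`) + 1 FROM `users`;   -- B also sees 15 rows and computes 16

-- Session A
INSERT INTO `users` (`display_name`) VALUES ('user16');

-- Session B
-- Inserts a second 'user16', or fails with a duplicate-key error
-- if display_name carries a UNIQUE index.
INSERT INTO `users` (`display_name`) VALUES ('user16');
```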

Also, if you have 15 users but delete user10, then COUNT() will report 14, and you will re-allocate user15.
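
A minimal way to see this, assuming the table currently holds exactly user1 through user15:

```sql
-- Remove one user from the middle of the sequence.
DELETE FROM `users` WHERE `display_name` = 'user10';

-- 14 rows remain, so the COUNT-based name generator yields 15,
-- and 'user15' already exists.
SELECT COUNT(`id`) + 1 AS next_number FROM `users`;
```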

Instead, I would make the id be an auto-generated primary key. MySQL allocates id values outside of the transaction isolation rules, so concurrent clients are guaranteed never to allocate the same value.

edit: You can't use a trigger unfortunately, because the generated key value is not available in a before trigger, and you can't change the value of the display_name in an after trigger. So you have to do this in an insert followed by an update.

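-- MySQL has no DEFAULT VALUES clause, and display_name is NOT NULL,
-- so insert a placeholder name to get the id allocated.
INSERT INTO users (display_name) VALUES ('');
UPDATE users SET display_name = CONCAT('user', LAST_INSERT_ID())
WHERE id = LAST_INSERT_ID();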

Don't worry about gaps. That is, if user15 is ever deleted, rolled back, or whatever, it's okay. Don't fall for the Pseudokey Neat-Freak antipattern.
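
If other clients should never see the row in its half-finished state, the two statements can be wrapped in a transaction. This is a sketch that assumes a transactional engine such as InnoDB and that an empty string is acceptable as a temporary display_name. LAST_INSERT_ID() is tracked per connection and is not changed by the UPDATE, so both references return the id that the INSERT just allocated.

```sql
START TRANSACTION;

-- Placeholder name; the real one depends on the id MySQL allocates here.
INSERT INTO users (display_name) VALUES ('');

-- LAST_INSERT_ID() is per-connection, so concurrent clients do not interfere.
UPDATE users
SET display_name = CONCAT('user', LAST_INSERT_ID())
WHERE id = LAST_INSERT_ID();

COMMIT;
```

If display_name has a UNIQUE index, a second client inserting the same placeholder will probably wait until the first transaction commits, which is one more reason to keep the transaction short.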

Bill Karwin
I forgot to mention that the primary key is AUTO_INCREMENT, so it is auto-generated. But an anomaly will still occur if there is ever a DELETE, which is why I should use MAX() instead of COUNT().
Exception
No, you should not use MAX() because that allows two concurrent clients to get the same value.
Bill Karwin