Hello folks,

I've been developing on LAMP (Linux+Apache+MySQL+PHP) for as long as I can remember, but one question has been bugging me for years. I hope you can help me find an answer and point me in the right direction. Here is my challenge:

Say we are creating a community website where we allow our users to register. The MySQL table where we store all users would then look like this:

CREATE TABLE `users` (
  `uid` int(2) unsigned NOT NULL auto_increment COMMENT 'User ID',
  `name` varchar(20) NOT NULL,
  `password` varchar(32) NOT NULL COMMENT 'Password is saved as a 32-bytes hash, never in plain text',
  `email` varchar(64) NOT NULL,
  `created` int(11) unsigned NOT NULL default '0' COMMENT 'Timestamp of registration',
  `updated` int(11) unsigned NOT NULL default '0' COMMENT 'Timestamp of profile update, e.g. change of email',
  PRIMARY KEY  (`uid`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8;

So, from this snippet you can see that we have a 'uid' field that is unique and automatically incremented for every new user. As on every good and loyal community website, we need to give users the possibility to completely delete their profile if they want to cancel their participation in our community.

Here comes my problem. Let's say we have 3 registered users: Alice (uid = 1), Bob (uid = 2) and Chris (uid = 3). Now Bob wants to delete his profile and stop using our community. If we delete Bob's profile from the 'users' table, his missing 'uid' will create a gap which will never be filled again. In my opinion it's a huge waste of uids. I see 3 possible solutions here:

1) Increase the capacity of the 'uid' field in our table from SMALLINT (int(2)) to, for example, BIGINT (int(8)) and ignore the fact that some of the uids will be wasted.

2) Introduce a new field 'is_deleted', which will be used to mark deleted profiles (keeping them in the table instead of deleting them) so that their uids can be reused for newly registered users. The table would then look like this:

CREATE TABLE `users` (
  `uid` int(2) unsigned NOT NULL auto_increment COMMENT 'User ID',
  `name` varchar(20) NOT NULL,
  `password` varchar(32) NOT NULL COMMENT 'Password is saved as a 32-bytes hash, never in plain text',
  `email` varchar(64) NOT NULL,
  `is_deleted` int(1) unsigned NOT NULL default '0' COMMENT 'If equal to "1" then the profile has been deleted and will be re-used for new registrations',
  `created` int(11) unsigned NOT NULL default '0' COMMENT 'Timestamp of registration',
  `updated` int(11) unsigned NOT NULL default '0' COMMENT 'Timestamp of profile update, e.g. change of email',
  PRIMARY KEY  (`uid`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8;

3) Write a script to shift all following user records once a previous record has been deleted. E.g. in our case, when Bob (uid = 2) decides to remove his profile, we would replace his record with the record of Chris (uid = 3), so that Chris's uid becomes equal to 2, and mark Chris's old record (is_deleted = '1') as vacant for new users. In this case we keep the chronological order of uids according to registration time, so that older users have lower uids.

Please advise me which is the right way to handle gaps in auto_increment fields. This is just one example with users, but such cases occur very often in my programming experience.

Thanks in advance!

+1  A: 

Definitely not the moving-user-ids idea: that will kill you or your MySQL server at some point. Let's say you have 1,000,000 users and user 2 gets deleted. You would have to shift 999,998 records down by one... as simple as the query would be, it would still lock your db for a while. Also, I think that messes with your auto_increment value, which is tracked per table and advanced on each insert: insert -> AI+1 -> insert -> AI+1 -> delete -> AI stays the same... If you shifted all the IDs, the next auto_increment value would still be 1,000,001, which would now leave 1,000,000 empty.

I say use unsigned BIGINT and ignore it, because if you come even close to the limit of BIGINT you have many other problems to solve ;)

Tobias
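
The counter behaviour described in this answer (a delete leaves the counter where it is) can be tried out in a few lines. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, on the assumption that SQLite's AUTOINCREMENT is close enough to MySQL's auto_increment for this illustration, and the fourth user "Dave" is a made-up name that does not appear in the question.

```python
import sqlite3


def demo_gap():
    """Insert three users, delete the middle one, then register a fourth."""
    conn = sqlite3.connect(":memory:")
    # AUTOINCREMENT makes SQLite hand out increasing ids and never reuse one.
    # (Stand-in for MySQL's auto_increment; the two engines are not identical.)
    conn.execute(
        "CREATE TABLE users ("
        "uid INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT NOT NULL)"
    )
    for name in ("Alice", "Bob", "Chris"):
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

    # Bob cancels his membership: uid 2 disappears.
    conn.execute("DELETE FROM users WHERE name = ?", ("Bob",))

    # "Dave" is a hypothetical new registration.
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", ("Dave",))
    new_uid = cur.lastrowid
    rows = conn.execute("SELECT uid, name FROM users ORDER BY uid").fetchall()
    conn.close()
    return new_uid, rows


if __name__ == "__main__":
    new_uid, rows = demo_gap()
    print(new_uid)  # 4
    print(rows)     # [(1, 'Alice'), (3, 'Chris'), (4, 'Dave')]
```

The new user gets uid 4, and uid 2 simply stays unused. Nothing else in the table had to change.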
Thank you very much! I'll accept the gaps now :) Like the last sentence ;)
PHPguy
18,446,744,073,709,551,615 IDs are available with unsigned BIGINT, so I guess that definitely solves the problem of "gaps". Just as an example of how much that is: if you had 300,000,000 users (Facebook size), every single user would have to delete their account and create a new one more than 61 billion times before you ran out of IDs ;)
Tobias
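
The arithmetic in the comment above is easy to check. A small sketch, using the 300,000,000-user figure from the comment; the constant names are just illustrative:

```python
# Largest value of an unsigned 64-bit integer (MySQL BIGINT UNSIGNED).
BIGINT_UNSIGNED_MAX = 2**64 - 1

# User count taken from the comment above.
USERS = 300_000_000

# How many ids each user could burn through before the column runs out.
ids_per_user = BIGINT_UNSIGNED_MAX // USERS

if __name__ == "__main__":
    print(f"{BIGINT_UNSIGNED_MAX:,}")  # 18,446,744,073,709,551,615
    print(f"{ids_per_user:,}")         # 61,489,146,912
```

So each of those users could re-register roughly 61 billion times.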
A: 

First of all: why do you think it's a "waste" of uids? I mean, it's just an integer (or BIGINT); this isn't the '70s anymore.

Secondly, the performance loss you get if you implement one of your suggested options is far greater than the space you lose by "wasting" uids. If some user deletes his profile, then at worst every user who registered after him gets a new id, so you have to update a great many records...

I must admit, when I first started programming I remember having to get used to gaps in auto-increment columns. But you will have to accept them, move on, and just let them exist...

Lex
Thank you for your opinion, Lex! As you said, I have to get used to the gaps in auto_increment, just as you once did.
PHPguy
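
To put a number on "every user who registered after him gets a new id", here is a rough sketch of option 3. Again it uses Python's sqlite3 as a stand-in for MySQL, and the `posts` table with its `author_uid` and `written_by` columns is purely hypothetical (the question only shows `users`). It shifts the later users down one by one and then counts how many posts no longer point at the person who wrote them.

```python
import sqlite3


def renumber_after_delete(total_users, deleted_uid):
    """Delete one user, close the gap, and report the damage.

    Returns (rows_shifted, stale_posts).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    # Hypothetical second table; written_by records who really wrote each post.
    conn.execute(
        "CREATE TABLE posts ("
        "pid INTEGER PRIMARY KEY, "
        "author_uid INTEGER NOT NULL, "
        "written_by TEXT NOT NULL)"
    )
    for uid in range(1, total_users + 1):
        name = f"user{uid}"
        conn.execute("INSERT INTO users (uid, name) VALUES (?, ?)", (uid, name))
        conn.execute(
            "INSERT INTO posts (author_uid, written_by) VALUES (?, ?)", (uid, name)
        )

    # The user leaves: the profile and that user's own posts are removed.
    conn.execute("DELETE FROM posts WHERE author_uid = ?", (deleted_uid,))
    conn.execute("DELETE FROM users WHERE uid = ?", (deleted_uid,))

    # Close the gap: every later user moves down by one, lowest uid first.
    later = conn.execute(
        "SELECT uid FROM users WHERE uid > ? ORDER BY uid", (deleted_uid,)
    ).fetchall()
    for (uid,) in later:
        conn.execute("UPDATE users SET uid = ? WHERE uid = ?", (uid - 1, uid))

    # posts was not touched, so author_uid may now name the wrong user or nobody.
    stale_posts = conn.execute(
        "SELECT COUNT(*) FROM posts p "
        "LEFT JOIN users u ON u.uid = p.author_uid "
        "WHERE u.name IS NOT p.written_by"
    ).fetchone()[0]
    conn.close()
    return len(later), stale_posts


if __name__ == "__main__":
    print(renumber_after_delete(1000, 2))  # (998, 998)
```

With 1,000 users, deleting uid 2 means rewriting 998 user rows, and every table that stores a uid has to be rewritten in step or its references go stale. Leaving the gap costs nothing.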
A: 

I'd just ignore the gaps and ensure you have as large a range of ids as needed. The gaps do no real harm. Trying to fix them up by updating data might introduce broken relationships that are more trouble.

By the way, in MySQL the 2 in INT(2) specifies the display width; it doesn't affect the amount of storage or the range of values. INT(8) uses the same 4 bytes as INT(2), so if you want the larger range, use BIGINT as you imply.

martin clayton
Thank you Martin!
PHPguy
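
Since the range depends on the type and not on the number in parentheses, a quick sketch of the unsigned maximums may help. The byte sizes are the standard MySQL storage sizes for these types; the dictionary and function names are only illustrative.

```python
# Storage size in bytes of MySQL's integer types.
# The (N) in INT(N) is a display width and plays no part here.
STORAGE_BYTES = {
    "TINYINT": 1,
    "SMALLINT": 2,
    "MEDIUMINT": 3,
    "INT": 4,
    "BIGINT": 8,
}


def unsigned_max(type_name):
    """Largest value an UNSIGNED column of the given type can hold."""
    return 2 ** (8 * STORAGE_BYTES[type_name]) - 1


if __name__ == "__main__":
    for type_name in STORAGE_BYTES:
        print(f"{type_name:<10}{unsigned_max(type_name):,}")
```

An unsigned INT tops out at 4,294,967,295 whether it is declared as INT(2) or INT(8), and a real SMALLINT would stop at 65,535.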
A: 

The maximum value of an unsigned INT is 4,294,967,295. The current population of the internet is approximately 1.8 billion people. I'd recommend using an unsigned INT for your purposes and not worrying about gaps in your sequence.

On a philosophical note: Donald Knuth once said "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil".

Mark Maunder