I'm a beginning programmer, building a non-commercial web-site.

I need a user ID, and I thought it would be logical to use a simple INTEGER field with auto-increment for that. Does that make sense? The user ID will not be directly used by users (they'll have to select a user-name); should I care about where the values start (presumably 1)?

Any other best practices I should incorporate in building my 'Users' table?

Thanks!

JDelage

+1  A: 

Yes, that is correct. Auto-Increment starts at 1, usually. It's not usually accepted to have 0 as an ID.
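
As a small illustration of the numbering described above, here is a sketch using Python's built-in `sqlite3` module. The question does not name a database, so SQLite, the `users` table and the `username` column are all assumptions; other engines spell the auto-increment keyword differently.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical users table: the id is generated by the database.
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " username TEXT NOT NULL UNIQUE)"
)

# lastrowid reports the id the database assigned to each new row.
first = conn.execute(
    "INSERT INTO users (username) VALUES (?)", ("alice",)
).lastrowid
second = conn.execute(
    "INSERT INTO users (username) VALUES (?)", ("bob",)
).lastrowid

print(first, second)  # ids are assigned by the database, starting at 1
```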

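If you are storing passwords, do not store them as clear text; store a salted hash instead. MD5 has been the most popular choice, but it is too fast to resist brute-force guessing, so prefer a dedicated password-hashing function such as bcrypt.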
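
A minimal sketch of storing a password hash instead of clear text, using only Python's standard library. It deliberately swaps in salted PBKDF2-HMAC-SHA256 rather than a bare MD5 digest, because a fast unsalted hash is easy to brute-force. The function names and the iteration count are assumptions for illustration, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os

# Assumed work factor; higher is slower to compute and slower to attack.
ITERATIONS = 200_000


def hash_password(password):
    """Return (salt, digest) for storage; the clear text is never stored."""
    salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected_digest)


salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

Both the salt and the digest would be saved in the user's row; only the login code needs to call `verify_password`.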

Cyntech
+1  A: 

Yes, auto-incrementing is fine. You will probably be saving passwords as well, so make sure they have some kind of protection: store a salted hash (preferably something stronger than plain md5) rather than the clear text.

Emerion
JDelage
+4  A: 

auto_increment is okay, but you shouldn't care about its particular number.
Quite the contrary: you should never be concerned with the identifier's particular value. Treat it as an abstract identifier only.

Though I doubt it can be kept invisible to users. Do you have another identifier to use? Auto_increment identifiers are usually visible to users as well. For example, your ID here is 98361, and nobody is hiding it. Such numbers are very handy: being unique and unchanging, forever bound to a particular table row, they will always identify the same thing (a user, an article, etc.).

Col. Shrapnel
+1  A: 

Also make sure you index the columns you will use to perform lookups, such as email, to avoid full table scans.
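
To make the point about lookups concrete, here is a rough sketch with Python's `sqlite3` module. SQLite, the table layout and the index name `idx_users_email` are assumptions; the exact wording of the query plan differs between database engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " email TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)

QUERY = "SELECT * FROM users WHERE email = ?"


def plan(connection):
    """Return the planner's description of how QUERY would be executed."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, ("user42@example.com",)
    ).fetchall()
    # The last column of each row is the human-readable detail.
    return " ".join(row[-1] for row in rows)


before = plan(conn)  # no index on email yet: the whole table is scanned

# The index is added after the data already exists.
conn.execute("CREATE INDEX idx_users_email ON users (email)")

after = plan(conn)  # the planner can now search via the index

print(before)
print(after)
```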

methodin
I'm sorry, I don't understand what you are saying...
JDelage
@JDelage, an index is like the index in the back of a reference book, you look it up to find something quickly. An index in a database is set to a column or a number of columns to speed up the look up of records. It's generally good practise to index columns that store IDs (like the user ID you're asking about).
Cyntech
I'd read up on indexes as it's one of the most important concepts in maintaining a healthy database. Imagine having 1 million rows and searching for an email. The database has to look at all 1 million rows. With an index it can quickly scan and select the row almost immediately.
methodin
Sorry, I understand (+ or -) what indexes are, what throws me off is that the PK is indexed already, right? By definition...? I suppose the post was for non-PK fields that will be used in queries. The index can be created after the fact, right? I don't need to come up with all possible scenarios from the get go?
JDelage
@JDelage - yes, if it's a PK, it's an index already. Yes, they can be created at any time.
Cyntech
Got it, thanks.
JDelage
Yeah, just be wary that it may take some time if you are indexing a large table.
methodin
+1  A: 

An auto-incrementing field is fine unless you need to do things like share this ID across multiple databases; in that case you will probably need to create the ID value yourself. Also beware of exporting and importing data: if you are not careful, all the ID values will get reassigned.
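
One possible way to create the ID value yourself is a random UUID generated by the application. This is only a sketch of that idea, not necessarily the approach meant here; the helper names `make_db` and `add_user` are hypothetical, and SQLite stands in for whatever database is actually used.

```python
import sqlite3
import uuid


def make_db():
    """Create an independent in-memory database with a users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id TEXT PRIMARY KEY, username TEXT NOT NULL)"
    )
    return conn


def add_user(conn, username):
    """Insert a user whose id is generated by the application, not the database."""
    user_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO users (id, username) VALUES (?, ?)", (user_id, username)
    )
    return user_id


# Two separate databases hand out ids without coordinating with each other.
db_a, db_b = make_db(), make_db()
id_a = add_user(db_a, "alice")
id_b = add_user(db_b, "bob")

# Copying rows from B into A keeps the original ids; nothing is reassigned.
rows = db_b.execute("SELECT id, username FROM users").fetchall()
db_a.executemany("INSERT INTO users (id, username) VALUES (?, ?)", rows)

merged = db_a.execute("SELECT COUNT(*) FROM users").fetchone()[0]
kept = db_a.execute(
    "SELECT username FROM users WHERE id = ?", (id_b,)
).fetchone()[0]
print(merged, kept)
```

The trade-off is a longer key than a plain integer, which matters when the ID is repeated in many related tables.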

In general I avoid auto-incrementing fields so that I have more control over how the ID values are generated. That is not to say I care what the values are, just that they are unique. These are internal values the end user should never see.

Kevin Gale
Are you saying that it would be better to delegate the increment to the PHP code (for example) than to the db itself?
JDelage
In general I'd like to have some way for the database to generate an ID on demand. Some databases have this kind of ability built in. I'm not that familiar with MySQL. I've also done it with a stored procedure. I just know that most of the time I've ended up doing it this way in the long run.
Kevin Gale
+4  A: 

Your design is correct. Your internal PK should be a meaningless number, not seen by users of the system and maintained automatically. It doesn't matter if it starts at 1 and it doesn't matter if it's sequential or not, or if you have "holes" in the sequence. (For cases in which you do expose the number to end users, it is sometimes important that the numbers be neither sequential nor fully-populated so that they are not guessable).
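
For the case where a number does have to be shown to end users, one option is to keep the integer PK internal and expose a separate random token. The sketch below uses Python's `secrets` module; the function name `new_public_id` and the 16-byte length are assumptions.

```python
import secrets


def new_public_id():
    """Return a random, URL-safe token that is not derived from the row's PK."""
    return secrets.token_urlsafe(16)  # 16 random bytes, base64url-encoded


# Consecutive tokens are unrelated, so one cannot be guessed from another.
token_one = new_public_id()
token_two = new_public_id()
print(token_one)
print(token_two)
```

Such a token would live in its own column with a UNIQUE index, next to the auto-incremented PK used for joins.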

Users should identify themselves to the system with another, meaningful piece of information (such as an email address). That piece of information should either be guaranteed unique (using a UNIQUE index) or else your front end must provide an interface for disambiguation.

Among the benefits of this design are:

  1. The meaningful identifier for the account can be changed by updating one value in one record of one table, rather than requiring updates all around the database.

  2. Your PK value, which will appear many, many times in the database, is a small and efficiently indexed integer while your user-facing identifier can be of any type you want including a longish text string.

  3. You can use a non-unique identifier with disambiguation if the application calls for it.
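
The first two benefits can be sketched with Python's `sqlite3` module. The `users` and `posts` tables and their columns are hypothetical, and SQLite is only a stand-in for the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- meaningless internal key
        email TEXT NOT NULL UNIQUE                -- user-facing identifier
    );
    CREATE TABLE posts (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL REFERENCES users(id),
        title   TEXT NOT NULL
    );
    """
)

user_id = conn.execute(
    "INSERT INTO users (email) VALUES (?)", ("old@example.com",)
).lastrowid
conn.execute(
    "INSERT INTO posts (user_id, title) VALUES (?, ?)", (user_id, "Hello")
)

# Changing the meaningful identifier touches one value in one row;
# posts still points at the same small integer key.
conn.execute(
    "UPDATE users SET email = ? WHERE id = ?", ("new@example.com", user_id)
)
row = conn.execute(
    "SELECT u.email, p.title FROM posts p JOIN users u ON u.id = p.user_id"
).fetchone()

# The UNIQUE constraint rejects a second account with the same email.
try:
    conn.execute("INSERT INTO users (email) VALUES (?)", ("new@example.com",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(row, duplicate_rejected)
```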

Larry Lustig
Great, thank you very much.
JDelage