Hi, I have a pretty basic question on which is the preferred way of storing data in my database.

I have a table called "users" with each user getting a username and user_id. Now, I want to make a table called "comments" for users to comment on news.

Is it better to have a column in comments called "username" that stores the logged-in user's name, or a column called "user_id"? If I use user_id, my SQL statement would need another SELECT, such as "(SELECT username FROM users WHERE users.id = comments.user_id) AS username". It seems like performance would be better if I just stored the username.

I thought I had read that you should avoid duplicating data in a database, though.

Which is better?

Thanks

+1  A: 

If user_id is the primary key, then you should use user_id instead of username. If you want to use username instead of user_id, then why do you have a user_id in the first place?

AlbertEin
+9  A: 

Typically, you use ID fields to link tables together. The reason (in your situation) is that you might allow people to change their username, and you don't want to have to update every place it is stored.

Therefore, put the user_id in your comments table and pull the username out on a join, as you've shown.
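
To make the rename argument concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layouts, the comments_with_name table, and the sample rows are assumptions made up for illustration; only the users.id / comments.user_id link comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    -- Normalized layout: comments point at users by id.
    CREATE TABLE comments (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        body    TEXT NOT NULL
    );
    -- Hypothetical denormalized layout: every comment carries a copy of the name.
    CREATE TABLE comments_with_name (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        body     TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice')")
for body in ("first", "second", "third"):
    conn.execute("INSERT INTO comments (user_id, body) VALUES (1, ?)", (body,))
    conn.execute(
        "INSERT INTO comments_with_name (username, body) VALUES ('alice', ?)",
        (body,),
    )

# A rename touches a single row when comments reference the user by id.
rows_normalized = conn.execute(
    "UPDATE users SET username = 'alice2' WHERE id = 1"
).rowcount

# With copied names, every comment row has to be rewritten as well.
rows_denormalized = conn.execute(
    "UPDATE comments_with_name SET username = 'alice2' WHERE username = 'alice'"
).rowcount

# The join picks up the new name without any change to the comments table.
names_after_rename = [
    row[0]
    for row in conn.execute(
        "SELECT u.username FROM comments c JOIN users u ON u.id = c.user_id"
    )
]
print(rows_normalized, rows_denormalized, names_after_rename)
```

With three comments the difference is trivial, but the second UPDATE grows with the number of comments while the first always stays at one row.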

Chris Lively
this is exactly right
larson4
Yep, exactly right. Maybe one thing: I *always* use the name "id" for the main id in ALL my tables, and a foreign key begins with "id_" followed by the name of the table it references. Thus my queries look like: <<SELECT C.comment FROM users U JOIN comment C ON C.id_user=U.id WHERE U.name LIKE "Olivier">>. What's the point? Someone who doesn't know your database will very easily guess the names of the foreign keys, and (far more important) in the long run your database is easier to maintain.
Olivier Pons
+1  A: 

If there's the potential for the database to grow large, store the user_id in the comments table. Less overhead. Also consider that usernames may be modified more easily this way.

Sev
Less overhead, slower joins
Joe Philllips
Just to clarify, I think you meant that if they use usernames, then they'd encounter slower JOINs, right?
Sev
A: 

Storing the userid (integer) will mean faster JOINs later. Unless you plan on having people dig through the database by hand, there's really no reason to use the username
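
One way to see which access path the database picks for such a join is to ask for the query plan. This is only a sketch, assuming SQLite via Python's sqlite3 module and made-up table definitions. It does not measure speed, and other database engines report their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: integer primary key on users, integer link in comments.
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        body       TEXT NOT NULL
    );
""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT c.body, u.username
    FROM comments AS c
    JOIN users AS u ON u.user_id = c.user_id
""").fetchall()

# The last column of each plan row is a human-readable description of one step.
plan_steps = [row[-1] for row in plan]
for step in plan_steps:
    print(step)
```

The exact wording of the plan depends on the SQLite version, so treat the printed text as a diagnostic aid and not as a stable format.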

Joe Philllips
A: 

I'm pretty sure storing the user id in the comments table is sufficient. If you're returning rows from the comments table, just use the JOIN statement.
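
As a quick check that the JOIN returns the same thing as the subquery from the question, here is a small sketch using Python's sqlite3 module. The sample users and comments are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE comments (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        body    TEXT NOT NULL
    );
    INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO comments (id, user_id, body) VALUES
        (1, 1, 'Nice article'),
        (2, 2, 'I disagree'),
        (3, 1, 'Fair point');
""")

# The correlated subquery as written in the question.
subquery_rows = conn.execute("""
    SELECT comments.body,
           (SELECT username FROM users WHERE users.id = comments.user_id) AS username
    FROM comments
    ORDER BY comments.id
""").fetchall()

# The same result expressed as a join.
join_rows = conn.execute("""
    SELECT comments.body, users.username
    FROM comments
    JOIN users ON users.id = comments.user_id
    ORDER BY comments.id
""").fetchall()

print(subquery_rows == join_rows)
```

The two forms only differ for a comment whose user_id has no matching user: the subquery yields NULL for the name, while the inner join drops the row.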

Cheers

Fitzy
A: 

Which is going to be a unique identifier? The user_id, I'd bet, or you can't have two "John Smith"s in your system.

And if volume is much of a concern, text matching on the username field is going to be more expensive in the long term than linking to the users table in your query.
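
The "two John Smiths" case can be sketched like this with Python's sqlite3 module. It assumes names are not forced to be unique, and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption for this sketch: no UNIQUE constraint on username.
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        body       TEXT NOT NULL
    );
    INSERT INTO users (user_id, username) VALUES
        (1, 'John Smith'),
        (2, 'John Smith');
    INSERT INTO comments (user_id, body) VALUES
        (1, 'from the first John'),
        (1, 'also from the first John'),
        (2, 'from the second John');
""")

# Grouping by id keeps the two people apart.
per_user = conn.execute(
    "SELECT user_id, COUNT(*) FROM comments GROUP BY user_id ORDER BY user_id"
).fetchall()

# Grouping by name merges them into one apparent author.
per_name = conn.execute("""
    SELECT u.username, COUNT(*)
    FROM comments c
    JOIN users u ON u.user_id = c.user_id
    GROUP BY u.username
""").fetchall()

print(per_user, per_name)
```

If only the name were stored in comments, there would be no way to tell afterwards which John wrote which comment.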

DaveE
+1  A: 

Data should be stored in (at least) third normal form, so you should use user_id as the primary key in the users table and as a foreign key in the comments table, and use it to get the details:

SELECT comments.*, users.username
FROM comments
JOIN users ON users.user_id = comments.user_id;

If you are getting the comments based on an article, you could do it like this:

SELECT comments.*, users.username
FROM comments
JOIN users ON users.user_id = comments.user_id
WHERE comments.article_id = '$current_article_id';
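
For anyone who wants to try this layout, here is a minimal sketch using Python's sqlite3 module. The column types, the sample rows, and the comments_for_article helper are assumptions for illustration, not part of the answer above. It declares user_id as a foreign key and passes the article id as a bound parameter in place of pasting it into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        article_id INTEGER NOT NULL,
        body       TEXT NOT NULL
    );
    INSERT INTO users (user_id, username) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO comments (user_id, article_id, body) VALUES
        (1, 7, 'Great read'),
        (2, 7, 'Thanks'),
        (1, 8, 'Other article');
""")


def comments_for_article(article_id):
    """Hypothetical helper: comments for one article with the author's name."""
    return conn.execute(
        """
        SELECT comments.body, users.username
        FROM comments
        JOIN users ON users.user_id = comments.user_id
        WHERE comments.article_id = ?
        ORDER BY comments.comment_id
        """,
        (article_id,),  # bound parameter, never interpolated into the SQL text
    ).fetchall()


article_7 = comments_for_article(7)

# A comment pointing at a user that does not exist is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO comments (user_id, article_id, body) VALUES (99, 7, 'orphan')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(article_7, orphan_rejected)
```

The placeholder syntax differs between database drivers (SQLite uses ?), so check the driver you actually use.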
phalacee
You don't need the articles table in that second query at all.
Sohnee
@Sohnee: Are you seriously going to go through all the answers I have ever posted and find something to pick at them over?
phalacee
@phalacee - how many of your answers have I responded on?
Sohnee
@Sohnee: Why are you asking me this? You're the one who down-voted an answer that is over a year old, posted inane comments on it, and posted comments on a few other questions I've written answers to. I don't consider it a coincidence that this has all happened since I posted a comment on one of your responses and down-voted it. I have since removed that down-vote, as you corrected the response.
phalacee
@phalacee - this is a lot of chit chat then, since you still haven't removed the articles table from your answer, which is really what my comment was about, since you are joining a table but not selecting any data or filtering based on it.
Sohnee
A: 

Numeric values are cheaper to join and index than an alphanumeric id. Use a number to uniquely identify a row. Another benefit is that the PK doesn't need to change if a user needs to change their username. The last benefit is that this is the design used by most modern web frameworks, such as Django and Rails.

Joshua