I have a table which stores comments. A comment can come either from another user or from another profile, which are separate entities in this app.

My original thinking was that the table would have both user_id and profile_id fields, so if a user submits a comment, it sets user_id and leaves profile_id blank.

Is this right or wrong? Is there a better way?

+4  A: 

If comments can come from several kinds of objects, you could create a link table for each kind:

user_comments (user_id, comment_id)
profile_comments (profile_id, comment_id)

Then you do not have to have any empty columns in your comments table. It will also make it easy to add new comment-source-objects in the future without touching the comments table.
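A minimal sketch of this layout, run through Python's sqlite3 module so it can be tried directly. Only user_comments, profile_comments and their two columns come from the answer; the user, profile and comments tables, their columns and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
-- Assumed base tables; column names are illustrative only.
CREATE TABLE user     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE profile  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE comments (id INTEGER PRIMARY KEY, comment_text TEXT NOT NULL);

-- One link table per kind of commenter, so comments has no empty columns.
CREATE TABLE user_comments (
    user_id    INTEGER NOT NULL REFERENCES user(id),
    comment_id INTEGER PRIMARY KEY REFERENCES comments(id)
);
CREATE TABLE profile_comments (
    profile_id INTEGER NOT NULL REFERENCES profile(id),
    comment_id INTEGER PRIMARY KEY REFERENCES comments(id)
);

INSERT INTO user     VALUES (1, 'alice');
INSERT INTO profile  VALUES (1, 'acme');
INSERT INTO comments VALUES (10, 'from a user'), (11, 'from a profile');
INSERT INTO user_comments    VALUES (1, 10);
INSERT INTO profile_comments VALUES (1, 11);
""")

# Filtering by commenter type means joining through the matching link table.
user_rows = conn.execute("""
    SELECT u.name, c.comment_text
    FROM comments c
    JOIN user_comments uc ON uc.comment_id = c.id
    JOIN user u ON u.id = uc.user_id
""").fetchall()
print(user_rows)
```

Making comment_id the primary key of each link table keeps a comment from being linked twice within one table; nothing in this sketch stops the same comment from appearing in both link tables.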

becquerel
+1 for the properly normalized technique
DancesWithBamboo
I like this solution but doing a simple filtering of the comments by commenter or commenter type would require a join.
cherouvim
Indeed - but a join is a small drawback, and it's always possible to create a View or Stored Procedure for this so you don't have to think about the joins every time you need to get that data.
becquerel
Yes, I agree this is the cleanest solution. But maybe in some contexts (millions of comments) it's best to denormalize.
cherouvim
+2  A: 

Another way to solve this is to always denormalize (copy) the name of the commenter onto the comment and also store a reference back to the commenter via a type and an id field. That way you have a unified comments table on which you can search, sort and trim quickly. The drawback is that there isn't any real FK relationship between a comment and its owner.
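As a rough sketch of that idea, using Python's sqlite3 module: the column names commenter_type, commenter_id and commenter_name, the index and the sample rows are all assumptions, not something the answer specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (
    id             INTEGER PRIMARY KEY,
    commenter_type TEXT NOT NULL CHECK (commenter_type IN ('user', 'profile')),
    commenter_id   INTEGER NOT NULL,  -- soft reference: no FK can be declared here
    commenter_name TEXT NOT NULL,     -- copied from the commenter at write time
    comment_text   TEXT NOT NULL
);
-- Hypothetical index to keep "all comments by X" lookups cheap.
CREATE INDEX comments_by_commenter ON comments (commenter_type, commenter_id);

INSERT INTO comments (commenter_type, commenter_id, commenter_name, comment_text)
VALUES ('user',    1, 'alice', 'first'),
       ('profile', 1, 'acme',  'second'),
       ('user',    1, 'alice', 'third');
""")

# Everything needed to list or filter comments lives in the one table.
by_alice = conn.execute(
    "SELECT commenter_name, comment_text FROM comments "
    "WHERE commenter_type = ? AND commenter_id = ? ORDER BY id",
    ("user", 1),
).fetchall()
print(by_alice)
```

The copied name will go stale if the commenter is later renamed, so a design like this usually needs some way to refresh it.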

cherouvim
Of the methods mentioned so far, I prefer this one. Yeah, there's no real foreign key relationship; that's always going to be a drawback here, but not many web applications worry about that! This is the most extensible way without creating many different tables storing similar information.
Stephen Orr
I don't understand. Are you saying to store a copy of the commenter's name but no FK?
Tim
@user270797: the comment data is on the table already. The commenter's data (name, email etc) should be copied (denormalized) on the comment table for easy access. This solution does make sense in some scenarios.
cherouvim
+2  A: 

In the past I have used a centralized comments table and had a field for the fk_table it is referencing.

eg:

comments(id,fk_id,fk_table,comment_text)

That way you can use UNION queries to concatenate the data from several sources.

SELECT c.comment_text FROM comments c JOIN user u ON u.id=c.fk_id WHERE c.fk_table='user'
UNION ALL
SELECT c.comment_text FROM comments c JOIN profile p ON p.id=c.fk_id WHERE c.fk_table='profile'

This ensures that you can expand the number of objects that have comments without creating redundant tables.
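A small runnable sketch of this approach with Python's sqlite3 module. The comments columns follow the answer; the user and profile tables, their name column and the sample rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed source tables; only id is implied by the answer's joins.
CREATE TABLE user    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

CREATE TABLE comments (
    id           INTEGER PRIMARY KEY,
    fk_id        INTEGER NOT NULL,  -- id of the row in the table named by fk_table
    fk_table     TEXT NOT NULL,     -- 'user' or 'profile'
    comment_text TEXT NOT NULL
);

INSERT INTO user    VALUES (1, 'alice');
INSERT INTO profile VALUES (1, 'acme');
INSERT INTO comments (fk_id, fk_table, comment_text)
VALUES (1, 'user',    'hello from alice'),
       (1, 'profile', 'hello from acme');
""")

# One branch per source table; each branch filters on fk_table so that
# fk_id = 1 is matched against the right table.
rows = conn.execute("""
    SELECT c.id, u.name, c.comment_text
    FROM comments c JOIN user u ON u.id = c.fk_id
    WHERE c.fk_table = 'user'
    UNION ALL
    SELECT c.id, p.name, c.comment_text
    FROM comments c JOIN profile p ON p.id = c.fk_id
    WHERE c.fk_table = 'profile'
    ORDER BY 1
""").fetchall()
print(rows)
```

Because fk_id can point at either table, the database cannot enforce it as a foreign key; the fk_table filter in each branch is what keeps the two id spaces apart.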

Josh Stuart
+2  A: 

The best solution depends, IMHO, not just on the table itself but also on how it is used elsewhere in the application.

Assuming that the comments are all associated with some other object, let's say you extract all the comments for that object. In your proposed design, extracting all the comments requires selecting from just one table, which is efficient. But that is extracting the comments without extracting the information about the poster of each comment. Maybe you don't want to show it, or maybe it is already cached in memory.

But what if you had to retrieve information about the poster while retrieving the comments? Then you have to join with two different tables, and now the resulting record set is getting polluted with a lot of NULL values (for a profile comment, all the user fields will be NULL). The code that has to parse this result set also could get more complex.
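To make the NULL problem concrete, here is a small sketch of the two-column design using Python's sqlite3 module. The user_name and profile_title columns and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user    (id INTEGER PRIMARY KEY, user_name TEXT NOT NULL);
CREATE TABLE profile (id INTEGER PRIMARY KEY, profile_title TEXT NOT NULL);

-- The design from the question: two nullable references, one left blank.
CREATE TABLE comments (
    id           INTEGER PRIMARY KEY,
    user_id      INTEGER REFERENCES user(id),
    profile_id   INTEGER REFERENCES profile(id),
    comment_text TEXT NOT NULL
);

INSERT INTO user    VALUES (1, 'alice');
INSERT INTO profile VALUES (1, 'Acme page');
INSERT INTO comments (user_id, profile_id, comment_text)
VALUES (1, NULL, 'from a user'),
       (NULL, 1, 'from a profile');
""")

# Fetching poster details needs a LEFT JOIN to each possible source.
rows = conn.execute("""
    SELECT c.comment_text, u.user_name, p.profile_title
    FROM comments c
    LEFT JOIN user    u ON u.id = c.user_id
    LEFT JOIN profile p ON p.id = c.profile_id
    ORDER BY c.id
""").fetchall()

# Every row carries NULLs for the source it did not come from,
# so the caller has to branch to find the poster's display name.
labels = [
    (text, user_name if user_name is not None else profile_title)
    for text, user_name, profile_title in rows
]
print(rows)
print(labels)
```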

Personally, I would probably start with the fully normalized version, and then denormalize when I start seeing performance problems.

There is also a completely different possible solution to the problem, but this depends on whether or not it makes sense in the domain. What if there are other places in the application where a user and a profile can be used interchangeably? What if a User is just a special kind of Profile? Then I think that the problem should be solved generally in the user/profile tables. For example (some abbreviated pseudo-sql):

create table AbstractProfile (ID primary key, type ) -- type can be 'user' or 'profile'
create table User(ProfileID primary key references AbstractProfile , ...)
create table Profile(ProfileID primary key references AbstractProfile , ...)

Then any place in your application where a user or a profile can be used interchangeably, you can reference the AbstractProfile ID.
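One way the pseudo-sql above might be filled in, sketched with Python's sqlite3 module. The email and title columns, the comments table and its author_id column are assumptions added for illustration; only the three-table shape comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE AbstractProfile (
    ID   INTEGER PRIMARY KEY,
    type TEXT NOT NULL CHECK (type IN ('user', 'profile'))
);
CREATE TABLE User (
    ProfileID INTEGER PRIMARY KEY REFERENCES AbstractProfile(ID),
    email     TEXT  -- hypothetical user-only column
);
CREATE TABLE Profile (
    ProfileID INTEGER PRIMARY KEY REFERENCES AbstractProfile(ID),
    title     TEXT  -- hypothetical profile-only column
);
-- Comments reference the shared parent, so a real FK is possible.
CREATE TABLE comments (
    id           INTEGER PRIMARY KEY,
    author_id    INTEGER NOT NULL REFERENCES AbstractProfile(ID),
    comment_text TEXT NOT NULL
);

INSERT INTO AbstractProfile VALUES (1, 'user'), (2, 'profile');
INSERT INTO User    VALUES (1, 'alice@example.com');
INSERT INTO Profile VALUES (2, 'Acme page');
INSERT INTO comments (author_id, comment_text)
VALUES (1, 'from a user'), (2, 'from a profile');
""")

rows = conn.execute("""
    SELECT c.comment_text, a.type
    FROM comments c
    JOIN AbstractProfile a ON a.ID = c.author_id
    ORDER BY c.id
""").fetchall()
print(rows)

# A comment pointing at a non-existent author is rejected by the FK.
try:
    conn.execute(
        "INSERT INTO comments (author_id, comment_text) VALUES (99, 'orphan')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

Compared with the type-plus-id approach, the comments table here has a single non-null author column backed by a real foreign key, at the cost of an extra parent row for every user and profile.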

Pete
+1 This is the solution I was starting to write up. See also *Polymorphic Associations* in http://www.slideshare.net/billkarwin/practical-object-oriented-models-in-sql
Bill Karwin