For example, let's say I have an entity called user and an entity called profile_picture. A user may have zero or one profile picture.

So I thought I would just create a table called "user" with these fields:

user: user_id, profile_picture_id (I left out all other attributes like name, email, etc., to simplify this)

OK, so if a user has no profile_picture, its id would be NULL in my relational model. Now someone told me that I have to avoid setting anything to NULL, because NULL is "bad".

What do you think about this? Do I have to remove profile_picture_id from the user table and create a link table like user__profile_picture with user_id, profile_picture_id?

Which would be considered to be "better practice" in database design?

+3  A: 

This is a perfectly reasonable model. True, you can take the approach of creating a join table for a 1:1 relationship (or, somewhat better, you could put user_id in the profile_picture table), but unless you think that very few users will have profile pictures then that's likely a needless complication.

Readability is an important component in relational design. Do you consider the profile picture to be an attribute of the user, or the user to be an attribute of the profile picture? You start from what makes logical sense, then optimize away the intuitive design as you find it necessary through performance testing. Don't prematurely optimize.
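
For illustration, here is a minimal sketch of the model from the question, assuming SQLite through Python's built-in sqlite3 module. The file_name column and the sample rows are made up for the example; only user_id and profile_picture_id come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
CREATE TABLE profile_picture (
    profile_picture_id INTEGER PRIMARY KEY,
    file_name TEXT NOT NULL              -- hypothetical column
);
CREATE TABLE user (
    user_id INTEGER PRIMARY KEY,
    -- nullable by default: NULL means "this user has no picture"
    profile_picture_id INTEGER REFERENCES profile_picture(profile_picture_id)
);
INSERT INTO profile_picture VALUES (10, 'alice.png');
INSERT INTO user VALUES (1, 10);
INSERT INTO user VALUES (2, NULL);
""")

# One join is enough to go from a user to the picture (if any).
rows = conn.execute("""
    SELECT u.user_id, p.file_name
    FROM user AS u
    LEFT JOIN profile_picture AS p
           ON p.profile_picture_id = u.profile_picture_id
    ORDER BY u.user_id
""").fetchall()

# Users without a picture are found with IS NULL, not "= NULL".
without_picture = conn.execute(
    "SELECT COUNT(*) FROM user WHERE profile_picture_id IS NULL"
).fetchone()[0]

print(rows)             # [(1, 'alice.png'), (2, None)]
print(without_picture)  # 1
```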

Adam Robinson
Well, in the real world I would never start with the profile picture and ask which user it has ;)
openfrog
@frog: Exactly, which is why it (to me) makes more sense to leave the design as you have it, with the `profile_picture_id` as an attribute of the `user`. There is nothing wrong with `null` values as a rule.
Adam Robinson
Thanks. Going to send my Anti-Null-friend this link ;)
openfrog
To me, the profile_pictures table is dependent on the users table. So it makes more logical sense to have the user_id column in the profile_pictures table :)
Scott Anderson
Adam is right that there's really nothing "wrong" with this design, and it's simple to understand and implement. However, depending on how your picture table is implemented, Scott may have a pretty good point too: does a picture have its own lifetime completely independent of a user, or is it dependent on the user?
GalacticCowboy
In this case, it's dependent on the user. Of course there are some benefits to putting the user_id as an FK into the profile_picture table. When displaying a gallery of user images, a link to the user profile can probably be generated with much better performance. I'm trying to find a "rule" for my framework that covers all cases of (0-1):1 relationships.
openfrog
A: 

I agree that NULL is bad. It is not relational-database-style.

NULL is avoided by introducing an extra table named UserPictureIds. It would have two columns, UserId and PictureId. If a user has no picture, the table simply has no row for that user, while the user is still present in the Users table.
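
A rough sketch of that layout, assuming SQLite through Python's sqlite3 module. The Pictures table, its FileName column, and the sample rows are hypothetical; only Users, UserPictureIds, UserId and PictureId are named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (UserId INTEGER PRIMARY KEY);
CREATE TABLE Pictures (
    PictureId INTEGER PRIMARY KEY,
    FileName TEXT NOT NULL               -- hypothetical column
);
-- UserId as primary key keeps it to at most one picture per user;
-- no column in any table is nullable.
CREATE TABLE UserPictureIds (
    UserId INTEGER PRIMARY KEY REFERENCES Users(UserId),
    PictureId INTEGER NOT NULL REFERENCES Pictures(PictureId)
);
INSERT INTO Users VALUES (1), (2);
INSERT INTO Pictures VALUES (10, 'alice.png');
INSERT INTO UserPictureIds VALUES (1, 10);   -- user 2 simply has no row
""")

# Inner joins list only the users that have a picture.
with_picture = conn.execute("""
    SELECT u.UserId, p.FileName
    FROM Users AS u
    JOIN UserPictureIds AS up ON up.UserId = u.UserId
    JOIN Pictures AS p ON p.PictureId = up.PictureId
""").fetchall()

# Listing every user with an optional picture needs outer joins,
# and the missing picture then shows up as NULL in the result.
all_users = conn.execute("""
    SELECT u.UserId, p.FileName
    FROM Users AS u
    LEFT JOIN UserPictureIds AS up ON up.UserId = u.UserId
    LEFT JOIN Pictures AS p ON p.PictureId = up.PictureId
    ORDER BY u.UserId
""").fetchall()

print(with_picture)  # [(1, 'alice.png')]
print(all_users)     # [(1, 'alice.png'), (2, None)]
```

The stored data contains no NULLs, but the second query shows that a NULL can still appear in a result set as soon as an outer join is used.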

Edit due to peer pressure

This answer focuses not on why NULL is bad, but on how to avoid using NULLs in your database design.

For evaluating (NULL==NULL)==(NULL!=NULL), please refer to comments and google.
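
For readers who want to see the three-valued logic in action, here is a small sketch assuming SQLite through Python's sqlite3 module (the table t and its column x are made up). Other engines may differ in details, so treat it as an illustration rather than a reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with anything, even NULL, yields NULL (unknown),
# which Python's sqlite3 module returns as None.
eq, ne, is_null = conn.execute(
    "SELECT NULL = NULL, NULL <> NULL, NULL IS NULL"
).fetchone()
print(eq, ne, is_null)  # None None 1

# Consequence: a row holding NULL matches neither "x = 1" nor "x <> 1".
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,)])
equal_one = conn.execute("SELECT COUNT(*) FROM t WHERE x = 1").fetchone()[0]
not_one = conn.execute("SELECT COUNT(*) FROM t WHERE x <> 1").fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(equal_one, not_one, total)  # 1 0 2
```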

Pavel Radzivilovsky
In what possible way is null categorically "not relational database style"?
Adam Robinson
and if you left join against that table, you get a .... NULL!
RedFilter
Both left join and null are not a part of the relational model. They are pets causing lots of confusion, and can easily be avoided.
Pavel Radzivilovsky
Looks like it's time for someone to earn his Peer Pressure badge.
Adam Robinson
LEFT join is not part of the relational model? Which universe are you from? Everyone knows that it's the RIGHT join that doesn't belong in the relational model...
Charles Bretana
So you'd rather perform two joins with a reference table in between than have a simple nullable foreign key? This seems pretty silly in my book, sorry.
Scott Anderson
@charles: Universe without NULLs can perfectly live and work. In many ways, it's better than the popular one here.
Pavel Radzivilovsky
@Pavel: Life must be nice in academia.
Adam Robinson
@Scott: yes, definitely. Everything that starts with "hey, I might not know what this field is". If you don't know one of the fields, **then don't add it to your table**. Design the tables in such a way that this won't happen. Hacks like NULL quickly get you deep into the mysteries of three-state logic and the bugs that come with it, the value of NULL!=NULL and the bugs that come with it, and the difference between "inapplicable" and "unknown". You are welcome to do your databases your way though :)
Pavel Radzivilovsky
@Adam re "academia": I don't even have a degree in computer stuff (I am a physicist). However, I do debug code written by people who don't know what relational database design is.
Pavel Radzivilovsky
@Pavel it seems that you might be one of those people who doesn't "know what relational database design is." Ternary logic is perfectly acceptable if you use any modern RDBMS. It seems the consensus here is against you.
Scott Anderson
@Scott: I (sometimes) design databases that work. I believe you do the same. Both approaches are OK, and what's important is that neither of them is stupid. Still, I strongly suspect that this consensus was created by NULLifiers who do not understand the consequences of what they do. Are you aware that NULL is handled differently by different RDBMSs? Besides, what's so bad about an extra inner join instead of ternary logic?
Pavel Radzivilovsky
The problem is people who make blanket statements like "NULL is BAD" without understanding the problem, the purpose of NULL, or whether NULL really makes sense in a particular context or not.
GalacticCowboy
@Pavel You're treating NULL like it's some sort of evil without really providing any evidence. Maybe 20 years ago it was a bad idea, but any modern, decent RDBMS handles NULL values just fine now. I'd like to see any modern RDBMS where performing two joins versus only one with a nullable field saves time, performance, or even makes more logical sense.
Scott Anderson
@Scott I believe I referenced enough NULL issues both in the comments above and in the answer (types of ternary logic, etc.). I do not think ternary logic is evil. It depends on whether you are asking or answering (http://bugpwr.blogspot.com/2009/10/nullables-in-c.html). Most people who use NULL in their designs just don't know what they are doing. They have only a basic understanding of databases, along the lines of "here I want a dunno, let's make it NULL", and then wonder why their complex queries don't work as they would expect. In terms of performance, also, a JOIN should not be inferior.
Pavel Radzivilovsky
+3  A: 

"NULL is bad" is a rather poor excuse for a reason to do (or not do) something.

That said, you may want to model this as a dependent table, where the user_id is both the primary key and a foreign key to the existing table.

Something like this:

  Users                     UserPicture                   Picture
----------------          --------------------          -------------------
| User_Id (PK) |__________| User_Id (PK, FK) |__________| Picture_Id (PK) |
| ...          |          | Picture_Id (FK)  |          | ...             |
----------------          --------------------          -------------------

Or, if pictures are dependent objects (don't have a meaningful lifetime independent of users) merge the UserPicture and Picture tables, with User_Id as the PK and discard the Picture_Id.

Actually, looking at it again, this really doesn't gain you anything - you have to do a left join vs. having a null column, so the other scenario (put the User_Id in the Picture table) or just leave the Picture_Id right in the Users table both make just as much sense.
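
For what it's worth, the merged variant (pictures keyed by User_Id) could be sketched like this, assuming SQLite through Python's sqlite3 module. The File_Name column and the ON DELETE CASCADE clause are my own assumptions, the latter as one way to express that a picture has no lifetime of its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
CREATE TABLE Users (User_Id INTEGER PRIMARY KEY);
-- User_Id is both the primary key and a foreign key,
-- so a user can have at most one picture row.
CREATE TABLE Picture (
    User_Id INTEGER PRIMARY KEY
        REFERENCES Users(User_Id) ON DELETE CASCADE,   -- assumption
    File_Name TEXT NOT NULL                            -- hypothetical column
);
INSERT INTO Users VALUES (1), (2);
INSERT INTO Picture VALUES (1, 'alice.png');
""")

# Listing all users still needs a left join, which yields NULL
# for the user without a picture.
rows = conn.execute("""
    SELECT u.User_Id, p.File_Name
    FROM Users AS u
    LEFT JOIN Picture AS p ON p.User_Id = u.User_Id
    ORDER BY u.User_Id
""").fetchall()
print(rows)  # [(1, 'alice.png'), (2, None)]

# Deleting the user removes the dependent picture as well.
conn.execute("DELETE FROM Users WHERE User_Id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Picture").fetchone()[0]
print(remaining)  # 0
```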

GalacticCowboy
What would that look like, in detail?
openfrog
Nice ASCII art. Well done!
openfrog
+2  A: 

Your user table should not have a nullable field called profile_picture_id. It would be better to have a user_id column in the profile_picture table. It should of course be a foreign key to the user table.

klausbyskov
Both are reasonable, but are you more likely to look for the *user's profile picture* or a *profile picture's user*?
Adam Robinson
+2  A: 

Since when is a nullable foreign key relationship "bad"? Honestly, introducing another table here seems kind of silly, since there's no possibility of having more than one profile picture. Your current schema is more than acceptable. The "null is bad" argument doesn't hold any water in my book.

If you're looking for a slightly better schema, then you could do something like drop the "profile_picture_id" column from the users table, and then make a "user_id" column in the pictures table with a foreign key relationship back to users. Then you could even enforce a UNIQUE constraint on the user_id foreign key column so that you can't have more than one instance of a user_id in that table.

EDIT: It's also worth noting that this alternate schema could be a little bit more future-proof should you decide to allow users to have more than one profile picture in the future. You can simply drop the UNIQUE constraint on the foreign key and you're done.
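
A quick sketch of that alternate schema, assuming SQLite through Python's sqlite3 module. The table names, the index name one_picture_per_user and the sample ids are made up. The uniqueness rule is written as a separate unique index here (an assumption of this sketch, standing in for an inline UNIQUE constraint) so that it can be removed later with a plain DROP INDEX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY);
CREATE TABLE pictures (
    picture_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(user_id)
);
-- At most one picture per user, for now.
CREATE UNIQUE INDEX one_picture_per_user ON pictures(user_id);
INSERT INTO users VALUES (1);
INSERT INTO pictures VALUES (100, 1);
""")

# A second picture for the same user violates the unique index.
try:
    conn.execute("INSERT INTO pictures VALUES (101, 1)")
    second_picture_rejected = False
except sqlite3.IntegrityError:
    second_picture_rejected = True

# Later requirement change: allow several pictures per user.
conn.execute("DROP INDEX one_picture_per_user")
conn.execute("INSERT INTO pictures VALUES (101, 1)")
picture_count = conn.execute(
    "SELECT COUNT(*) FROM pictures WHERE user_id = 1"
).fetchone()[0]

print(second_picture_rejected, picture_count)  # True 2
```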

Scott Anderson
@Scott, the idea that having the join table is a "better schema" implies that having multiple pictures will be a requirement. It's not objectively better going on the information in the question.
Adam Robinson
It's simply moving the foreign key from one table to the other. The only reason I would call it "better" is because of the future proofing and the fact that the profile_pictures table is now explicitly dependent on the users table. I guess it's really just a matter of perspective.
Scott Anderson
+3  A: 

NULL isn't "bad". It means "I don't know." It's not wrong for you or your schema to admit it.

duffymo
+1 for saying "I don't know" being okay. That is a deep truth.
Anthony Potts
+1  A: 

It is true that having many columns with null values is not recommended. I would suggest you make the picture table a weak entity of user table and have an identifying relationship between the two. Picture table entries would depend on user id.

Lucas T
+1  A: 

Make the profile picture a nullable field on the user table and be done with it. Sometimes people normalize just for normalization's sake. NULL is perfectly fine, and in DB2, NULL is a first-class value that is included in indexes.

Anthony Potts
A: 

@Pavel.

There is no point in doing what you try to do here. It's a sheer waste of time. Being right about data management (as you are) around here can only get you shit loads of downvotes.

Erwin Smout