I have a MySQL table that records classified listings. We don't force users to join to post a listing, and therefore the listing will not always have a user_id associated with it.

I therefore need a method of recording the poster's email if they are not signed in.

Is it bad practice to create a column email that will sometimes be blank and sometimes be filled?

Or is there a better way to go about this that I don't realize?

+2  A: 

This is exactly what NULL is for. But you already knew that, because your user_id column will also sometimes be NULL, right?

longneck
To be honest, I'm fairly new... I didn't know what NULL was for. Since my user_id column is an int column, I figured that if the poster wasn't signed in, it would just be 0. Is that poor practice as well? Thanks!
johnnietheblack
No, don't use 0. Use NULL, like Pascal MARTIN said: 'Using NULL, which literally means "no value" is better than using some kind of "impossible value", like an empty string' (or 0).
longneck
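
To illustrate the NULL-versus-0 point above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and column names (`users`, `listings`, `title`) are assumptions, not taken from the question. It shows that a NULL user_id is accepted alongside a foreign key, while a placeholder 0 is rejected because no user with id 0 exists, and that NULL has to be matched with IS NULL rather than `=`.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE listings (
        id      INTEGER PRIMARY KEY,
        title   TEXT NOT NULL,
        user_id INTEGER NULL REFERENCES users(id),  -- NULL = posted by a guest
        email   TEXT NULL                           -- NULL = no guest email recorded
    );
""")

insert_sql = "INSERT INTO listings (title, user_id, email) VALUES (?, ?, ?)"

# A guest listing: no user, so user_id is NULL (None in Python).
conn.execute(insert_sql, ("Bike for sale", None, "guest@example.com"))

# Using 0 as a "no user" marker violates the foreign key: there is no user 0.
try:
    conn.execute(insert_sql, ("Sofa", 0, "other@example.com"))
    zero_accepted = True
except sqlite3.IntegrityError:
    zero_accepted = False

# NULL is matched with IS NULL; "= NULL" never matches any row.
guest_count = conn.execute(
    "SELECT COUNT(*) FROM listings WHERE user_id IS NULL").fetchone()[0]
equals_null_count = conn.execute(
    "SELECT COUNT(*) FROM listings WHERE user_id = NULL").fetchone()[0]

print(zero_accepted, guest_count, equals_null_count)
```

The same idea should carry over to MySQL with InnoDB tables, where the foreign key would be declared with a separate FOREIGN KEY clause.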
+2  A: 

Is it bad practice to create a column email that will sometimes be blank and sometimes be filled?

It is not a bad practice, no: just use a NULL column -- that's why they exist ;-)
See 12.1.17. CREATE TABLE Syntax: in the *column_definition* part of the create table query, you can specify NULL or NOT NULL.

BTW: Using NULL, which literally means "no value" is better than using some kind of "impossible value", like an empty string: NULL really means "no value", and makes your point obvious -- while an empty string could mean an error in your code.

And I don't really see another "logical" way, actually...

Note, though, that you'll have to handle a NULL value for the email, in your application's code, of course ;-)
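
As a rough sketch of what handling the NULL in application code could look like: in Python a NULL column comes back as None, so the code has to branch on it. The `listings` table, the `contact_line` helper and the fallback text are all made up for illustration, and SQLite is used only so the snippet runs on its own.

```python
import sqlite3

# Hypothetical table; SQLite is used only so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listings (id INTEGER PRIMARY KEY, title TEXT NOT NULL, email TEXT NULL)")
conn.executemany(
    "INSERT INTO listings (title, email) VALUES (?, ?)",
    [("Bike", "guest@example.com"), ("Sofa", None)],  # None is stored as NULL
)


def contact_line(email):
    """Return display text for a listing; email is None when the column is NULL."""
    if email is None:
        return "Contact the poster through their profile"
    return f"Contact: {email}"


lines = [contact_line(email)
         for (email,) in conn.execute("SELECT email FROM listings ORDER BY id")]
print(lines)
```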

Pascal MARTIN
+1  A: 

I think the approach you have laid out is perfectly acceptable. As longneck points out, that's what NULL is for in SQL databases.

However, if you're truly concerned about it, you could save space (possibly a significant amount, depending on the column type and number of rows) if you use the *user_id* column for either the user id or the email address, and then have another boolean column, say *is_email*, to distinguish which type of value is stored in the *user_id* column. This may simplify things for you because it is likely that your application does not care, in many places, whether the data is actually a user_id or an email address.

JoshJordan
That's a good idea for sure, but I feel like it's a little roundabout in the logic? Definitely an option though.
johnnietheblack
I think it's more straightforward than keeping separate columns, personally. For instance, if you have any sorting or searching features, you'll have to do extra work if you have the email and user_id separate, as well as checking both of them for NULL any time you use them. With this approach, you can mark the column as never being NULL, and only check for a distinction between email and user_id when necessary.
JoshJordan
If user_id is a foreign key to another table (like users, perhaps) then combining the user_id and email columns loses the advantages one gets with a FK.
dnagirl
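
The foreign key concern can be sketched as follows. This is only an illustration, assuming made-up table names (`listings_combined`, `listings_separate`) and using SQLite in place of MySQL: a combined text column happily stores a user id that does not exist, whereas a dedicated user_id column with a foreign key rejects it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);

    -- Combined design: one column holds either a user id or an email address.
    CREATE TABLE listings_combined (
        id       INTEGER PRIMARY KEY,
        poster   TEXT NOT NULL,
        is_email BOOLEAN NOT NULL
    );

    -- Separate design: user_id can carry a foreign key.
    CREATE TABLE listings_separate (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NULL REFERENCES users(id),
        email   TEXT NULL
    );

    INSERT INTO users (id) VALUES (1);
""")

# User 999 does not exist, but the combined column has no way to check that.
conn.execute(
    "INSERT INTO listings_combined (poster, is_email) VALUES (?, ?)", ("999", False))
combined_rows = conn.execute(
    "SELECT COUNT(*) FROM listings_combined").fetchone()[0]

# With a real foreign key the same mistake is caught by the database.
try:
    conn.execute("INSERT INTO listings_separate (user_id) VALUES (?)", (999,))
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False

print(combined_rows, dangling_allowed)
```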
+1  A: 

I have a MySQL table that records classified listings. We don't force users to join to post a listing, and therefore the listing will not always have a user_id associated with it.

I therefore need a method of recording the poster's email if they are not signed in.

What is the business key of your user entity? Or, more directly: what is your user entity? Is every distinct email address a key for a User entity with some users having registered and their email set in some profile, and others not registered and giving an email address every time they post? Or do you have two distinct entities, RegisteredUser and UnknownPosterWithEmailAddress, with their attributes stored in separate places?

In the latter case, you would use a NULLable user_id and a NULLable email field, like you suggested, but then queries like "for a given post, find the email address the reply should be sent to" are going to be awkward. For example, a list of all posts with their respective reply addresses will look like this:

select post.id, 
   case when post.user_id is not null then user.email
        else post.email end as email
   from post
   left join user on user.id=post.user_id;

This can get real messy after a while.

I'd rather use the former approach: each row in User is a distinct poster, with a non-NULLable unique email address, and a surrogate key as foreign key in posts:

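-- varchar rather than text: MySQL cannot put a unique index on a
-- text column without a prefix length
create table user(id integer primary key,
                  email varchar(255) not null unique,
                  is_registered boolean default false);
-- MySQL parses but ignores an inline "references" on a column, so the
-- foreign key is declared as a separate clause
create table post(id integer primary key,
                  user_id integer not null,
                  content text,
                  foreign key (user_id) references user(id));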

If a non-registered user enters an email address, you look it up in the user table and retrieve the user.id, adding a new entry in user if necessary. As a result, you can answer questions like "for a given email address, how many posts has this user made in the past week?" via the foreign key field, without having to compare strings in some NULLable attribute field.
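
The lookup-or-create step could be sketched like this. It is a minimal illustration rather than a definitive implementation: SQLite stands in for MySQL, the helper names `get_or_create_user_id` and `add_post` are made up, and there is no timestamp column, so the count covers all posts rather than the past week.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY,
                       email TEXT NOT NULL UNIQUE,
                       is_registered BOOLEAN DEFAULT 0);
    CREATE TABLE post (id INTEGER PRIMARY KEY,
                       user_id INTEGER NOT NULL REFERENCES user(id),
                       content TEXT);
""")


def get_or_create_user_id(conn, email):
    """Return the id of the user row for this email, inserting one if needed."""
    row = conn.execute("SELECT id FROM user WHERE email = ?", (email,)).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute("INSERT INTO user (email) VALUES (?)", (email,))
    return cur.lastrowid


def add_post(conn, email, content):
    """Store a post under the poster's user row and return that user's id."""
    user_id = get_or_create_user_id(conn, email)
    conn.execute("INSERT INTO post (user_id, content) VALUES (?, ?)", (user_id, content))
    return user_id


first = add_post(conn, "guest@example.com", "Bike for sale")
second = add_post(conn, "guest@example.com", "Sofa for sale")
other = add_post(conn, "someone@example.com", "Free kittens")

# Posts per email address, resolved through the foreign key.
post_count = conn.execute(
    "SELECT COUNT(*) FROM post JOIN user ON user.id = post.user_id WHERE user.email = ?",
    ("guest@example.com",),
).fetchone()[0]
print(first == second, post_count)
```

Note that a select followed by an insert is not safe under concurrent requests; in a real application the unique constraint on email would catch the duplicate, and the code would need to retry the lookup.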

When a user chooses to register, you can add the registration data either in user itself or in some separate table (again with user.id as a foreign key; some might argue that a boolean field is_registered is actually redundant then). Added benefits:

  • If he has posted before under the same email address, now all of his old posts become associated with his new registered identity automatically.
  • If the user changes his email address in his profile, all replies to older posts of his "see" the new updated email address.
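
Both benefits can be seen in a small sketch, again using SQLite in place of MySQL and made-up sample data: the posts are never touched, yet after the user registers and changes his email address, every old post resolves to the registered identity and the new address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY,
                       email TEXT NOT NULL UNIQUE,
                       is_registered BOOLEAN DEFAULT 0);
    CREATE TABLE post (id INTEGER PRIMARY KEY,
                       user_id INTEGER NOT NULL REFERENCES user(id),
                       content TEXT);

    -- An unregistered poster with two existing posts (sample data).
    INSERT INTO user (id, email) VALUES (1, 'guest@example.com');
    INSERT INTO post (user_id, content) VALUES (1, 'Bike for sale');
    INSERT INTO post (user_id, content) VALUES (1, 'Sofa for sale');
""")

# The poster registers: only the user row changes.
conn.execute("UPDATE user SET is_registered = 1 WHERE email = ?", ("guest@example.com",))

# Later he changes the address in his profile: again only the user row changes.
conn.execute("UPDATE user SET email = ? WHERE id = ?", ("new@example.com", 1))

# Old posts now resolve to the registered user and the new address.
rows = conn.execute("""
    SELECT post.id, user.email, user.is_registered
    FROM post JOIN user ON user.id = post.user_id
    ORDER BY post.id
""").fetchall()
print(rows)
```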
wallenborn
This second idea is a GREAT idea...thanks for pointing that option out!
johnnietheblack