I have a web-based application that notifies users of activity on the site via email. Users can choose which kinds of notifications they want to receive. So far there are about 10 different options (each one is a true/false).

I'm currently storing this in one varchar field as a 0 or 1 separated by commas. For example: 1,0,0,0,1,1,1,1,0,0

This works but it's difficult to add new notification flags and keep track of which flag belongs to which notification. Is there an accepted standard for doing this? I was thinking of adding another table with a column for each notification type. Then I can add new columns if I need, but I'm not sure how efficient this is.

Thanks in advance!

A: 

I'd expect that letting the DB manage it by using Bool columns would be better. I seem to recall that some systems will pack bools to bits (null might mess that up). To avoid clutter, you might make it a separate table.

(I'm no DBA)

Edit: slaps head "I just suggested exactly what you are thinking of" :b

BCS
+1  A: 

I would use 10 different bit or bool fields. But if you're going to do it in one field, you can use a bitmap such as 0b1111111111 stored as a big integer, or a text field without the commas. I've worked on different applications using all of those techniques, but I'd actually just go with the multiple fields. It will be a lot easier to write select statements against.
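As a rough sketch of the multiple-fields approach, the snippet below uses Python's built-in sqlite3 module purely so it can be run anywhere. The table name user_notifications and the notify_* column names are made up for illustration, and other databases spell their boolean/bit types differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One 0/1 column per notification type (hypothetical names).
conn.execute(
    """
    CREATE TABLE user_notifications (
        user_id        INTEGER PRIMARY KEY,
        notify_comment INTEGER NOT NULL DEFAULT 0 CHECK (notify_comment IN (0, 1)),
        notify_message INTEGER NOT NULL DEFAULT 0 CHECK (notify_message IN (0, 1))
    )
    """
)
conn.executemany(
    "INSERT INTO user_notifications (user_id, notify_comment, notify_message) "
    "VALUES (?, ?, ?)",
    [(1, 1, 0), (2, 0, 1), (3, 1, 1)],
)

# Adding a notification type is one ALTER TABLE; existing rows get the default (off).
conn.execute(
    "ALTER TABLE user_notifications ADD COLUMN notify_follow INTEGER NOT NULL DEFAULT 0"
)

# Selecting by flag is a plain WHERE clause, with no string parsing.
comment_subscribers = [
    row[0]
    for row in conn.execute(
        "SELECT user_id FROM user_notifications "
        "WHERE notify_comment = 1 ORDER BY user_id"
    )
]
follow_for_user_1 = conn.execute(
    "SELECT notify_follow FROM user_notifications WHERE user_id = 1"
).fetchone()[0]
```

Each flag has a name in the schema, which addresses the "which position is which flag" problem from the question.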

stephenbayer
A: 

It depends! It's down to space versus flexibility.

If you can guarantee that the number of options is 'always' going to be less than 32, then you could use an int to store them. But then you'll have to pack/unpack which makes for less readable code.
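To give a feel for the pack/unpack step, here is a minimal sketch in Python. The Notify enum and its member names are hypothetical; the idea is only that each notification type owns one bit of the stored integer.

```python
from enum import IntFlag


class Notify(IntFlag):
    # Hypothetical notification types, one bit each.
    COMMENT = 1 << 0
    MESSAGE = 1 << 1
    FOLLOW = 1 << 2


def pack(flags):
    """Combine the chosen flags into the single int that gets stored."""
    packed = Notify(0)
    for flag in flags:
        packed |= flag
    return int(packed)


def wants(packed, flag):
    """Check one bit of a stored int."""
    return bool(packed & flag)


stored = pack([Notify.COMMENT, Notify.FOLLOW])
wants_message = wants(stored, Notify.MESSAGE)
wants_follow = wants(stored, Notify.FOLLOW)
```

Naming the bits in one place keeps the mapping between bit position and notification type out of people's heads, but every reader and writer of the column still has to go through helpers like these.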

If you don't have space concerns, then create a separate table as you suggested. Given the right indexes, the join will be fast.

Mitch Wheat
I'm not a DB admin but I am wearing that hat for this project. What sort of index would I put on a table like this (10 boolean fields, 1 uniqueidentifier field, about 50,000 rows currently)?
Arthur Chaparyan
The uniqueidentifier for starters. After that, it would depend on if you need to look up by flag. If so... (uniqueidentifier,flagN) for each flag?
BCS
I don't know what DBMS you're using, but 50k rows isn't particularly large. You would of course have an index on your id field, and the rest depends on your implementation and needs.
stephenbayer
There isn't much benefit in indexing a boolean column. Low-cardinality columns are bad candidates for indexes (and sometimes performance is worse).
Owen
Bitmapped indexes are optimized for cases where the column being indexed has a very limited set of values. Boolean/Bit columns are perfect candidates.
Hank Gay
+14  A: 

I would use two tables. One table would store the user data and the other the notifications that they subscribe to. The second table would look something like this:

create table notifications (
   user_id int,
   notification_type int
);

I'd make a FK relationship between user_id and the user's id in the users table with a cascade on delete. Use both the user_id and notification_type as the primary key. To check if a user wants a particular notification simply do a join between the two tables and select rows where the notification_type matches the one in question. If the result set is non-empty the user wants the notification.

Adding new notifications becomes trivial (as does deleting). Simply add (delete) a new type value and let users choose to accept it or not. If you wanted to keep the notification types in a table to manage via the application that would work, too, but it would be a little more complex.
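Here is one way the two-table design could look end to end, sketched with Python's built-in sqlite3 module so it can be run as-is. The users table layout and the type codes NEW_COMMENT and NEW_MESSAGE are assumptions for the example; only the notifications table comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript(
    """
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE notifications (
        user_id           INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        notification_type INTEGER NOT NULL,
        PRIMARY KEY (user_id, notification_type)
    );
    """
)

NEW_COMMENT, NEW_MESSAGE = 1, 2  # hypothetical type codes

conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)", [(1, "alice"), (2, "bob")]
)
conn.executemany(
    "INSERT INTO notifications (user_id, notification_type) VALUES (?, ?)",
    [(1, NEW_COMMENT), (1, NEW_MESSAGE), (2, NEW_MESSAGE)],
)


def wants(conn, user_id, notification_type):
    """A non-empty join result means the user subscribed to this type."""
    row = conn.execute(
        """
        SELECT 1
        FROM users AS u
        JOIN notifications AS n ON n.user_id = u.id
        WHERE u.id = ? AND n.notification_type = ?
        """,
        (user_id, notification_type),
    ).fetchone()
    return row is not None


alice_wants_comment = wants(conn, 1, NEW_COMMENT)
bob_wants_comment = wants(conn, 2, NEW_COMMENT)

# Deleting a user cascades to that user's subscriptions.
conn.execute("DELETE FROM users WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM notifications").fetchone()[0]
```

A new notification type needs no schema change here: it is just a new value in notification_type.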

tvanfosson
This seems the best solution by far IMO. Storing them all in one column is not very extensible. And storing them as bool fields in a table means you have to add a new field every time there is a new notification type.
Craig
That would work well, particularly if you want to be able to scan the notification types.
BCS
Normalization is great for extensibility, but it does suffer a bit on performance.
stephenbayer
The FK relationship should induce an index on the user_id in the second table, making this an index join which should limit the impact on performance.
tvanfosson
Flexibility first. Optimize for performance, without sacrificing too much flexibility, later - and only when you know that you have to. Do it the way this answer suggests.
Justice
+1  A: 

Using MySQL?

Then the SET datatype is the answer.

"The MySQL SET datatype is stored as an integer value within the MySQL tables, and occupies from one to eight bytes, depending on the number of elements available." - http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html

yogman
SET is limited to 64 flags.
epochwolf
A: 

If you do decide to use a bit field like @stephenbayer mentioned, you can always use a view on the table to make it easier for developers to use. This means that you still have the space savings of the bit field and the ease of use of separate columns per field, while avoiding having to parse the column.
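A view over a bit field might look roughly like the following, again sketched with Python's sqlite3 module. The user_flags table, the view name, and the bit assignments (1, 2, 4) are all invented for the example, and the bitwise-operator syntax can differ between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- All flags packed into one integer column (hypothetical layout).
    CREATE TABLE user_flags (
        user_id INTEGER PRIMARY KEY,
        flags   INTEGER NOT NULL DEFAULT 0
    );

    -- The view exposes each bit as its own 0/1 column.
    CREATE VIEW user_notifications AS
    SELECT user_id,
           (flags & 1) != 0 AS notify_comment,
           (flags & 2) != 0 AS notify_message,
           (flags & 4) != 0 AS notify_follow
    FROM user_flags;
    """
)
conn.executemany(
    "INSERT INTO user_flags (user_id, flags) VALUES (?, ?)",
    [(1, 5), (2, 2)],  # 5 = comment + follow, 2 = message only
)

rows = conn.execute(
    "SELECT user_id, notify_comment, notify_message, notify_follow "
    "FROM user_notifications ORDER BY user_id"
).fetchall()
```

Readers query user_notifications as if it had one column per flag, while writes still go to the packed flags column.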

As mentioned, the separate table is an excellent option if you want your solution to be more extensible. The only downside is slightly increased complexity.

This is the trade-off. If you want something that is really easy to implement and is fast, consider the bit field. If you want something that is easier to extend and maintain at the cost of slightly more complexity, then by all means go for the separate table. If the votes tell you anything, you probably want to follow the separate table implementation.

smaclell