views:

224

answers:

9

I have a problem creating a database schema for the following scenario:

(I’m not creating a dating site but just using this as an example)

A user logs on to a dating site and is given a multiple selection for the hair colour they’d like their date to have:

This is easy enough to model with the three tables below:

Tables:

User {key}

HairColour {key}

UserHairColour {UserKey} {HairColourKey}

However, the user also has the option to select ‘any’ which means they don’t care about hair colour and all hair colour should be included in the selection.

How do I give the user the ‘any’ option?

I could obviously select all hair colours and shove them into ’UserHairColour’ but what if I need to add a new hair colour in the future?

A: 

Declare a temp table, fill it with the color values and query like this:

SELECT  *
FROM    UserHairColor
JOIN    User
ON      User.id = UserHairColor.UserID
WHERE   HairColorKey IN
        (
        SELECT  ColorKey
        FROM    @mytable
        )
UNION ALL
SELECT  *
FROM    UserHairColor
JOIN    User
ON      User.id = UserHairColor.UserID
        AND NOT EXISTS
        (
        SELECT  NULL
        FROM    @mytable
        )

This will select all users with requested hair colors, of all users at all if the table is empty.

Quassnoi
+9  A: 

Absence of any records for this particular user in the UserHairColour table will indicate they do not care about the hair colour.

Absence of a decision indicates they have no preference. Obviously, it cannot mean they want their date to have no hair color at all.

I do not see here a need for a separate value or any extra table design. What you have allows you to achieve your goal in a simple way.


EDIT: As reaction to a proposed solution with ANY extra value.

The idea of "ANY" will conceptually interfere with the other selections. We are talking about presenting the user with a multitude of choices, ANY being one of them, and allowing them to select many. So the user can technically select ANY along with the other options, making it unclear what takes precedence - ANY or specific options. I believe the approach with simply no records as an indicator of ANY is clearer - it can only be interpreted one way. No records - no preferred values. You obviously cannot interpret it in the other way - no preferred value - user does not want this value to be present - this will make for transparent hair color which makes no sense. You can say it can mean no hairs at all, but I would suggest to have a separate option or a separate question for that already.

Developer Art
I think this is looking like the most practical solution. The downside of this approach is that the application will have to make an extra trips database.E.gIF (User. PreferredHairColour.Count == 0) User. PreferredHairColour = “select * from HairColour where UserKey = User.Key ”(Not real code :-) )Thanks
Lee Smith
Your edit made me re-read the question more closely. I missed the possibility of "multiple selection" meaning multiple checkboxes rather than radio buttons. So, if the user can choose Blonde+Brunette then your reservations about Any are justified. But if the choice is exclusive then I think Any is a valid option.
APC
Actually, even with exclusive choice "Any" option will be redundant.
Developer Art
Rather belatedly I have decided I agree with you, and have deleted my answer in favour of the ANY value. I would continue to defend an ANY value if the HairColour was a column on the main User table but it is not appropriate with the UserHairColour table. Whether an intersection table is the appropriate design decision is a different question.
APC
+1  A: 

Given the example above, I would just add 'Any' or 'No Preference' as a selection and treat it as a specific hair color. This would work the best because if you did want to add more specific hair colors. Typically when I create new relational models I tend to add a -1 for the first key entry and keep the values for that row as my default go to ones. This would be better practice than just dummy'ing it out with a temp table or query in my opinion.

ajdams
A: 

If users can select any number of HairColours, I think, for consistency, it would be useful to do shove a record in UserHairColours for every colour. If users can select only one, one of which is 'any', then I favour New in town's solution.

Lex
A: 

Put (PersonID, HairColorPreference) in a table of its own. If someone has no preference, just don't record a row in that table.

Use views to put together people with preference with just that preference, and people with no preference with all hair colors.

BTW, what are you going to do with people whose preference is "anything but purple"?

Walter Mitty
+1  A: 

This should be simple to achieve. If the user chooses "Any", you simply handle it on the query:

select 
 * 
from
 User
left join 
 UserHairColour on UserHairColour.UserId=User.UserId
where
 (@hairpreference = 'Any' OR UserHairColour.HairColourId=@hairpreference)

If you can set the input var @hairpreference to null instead of 'Any', then it gets easier:

where
 (UserHairColour.HairColourId=COALESCE(@hairpreference, UserHairColour.HairColourId))
leandrosa81
A: 

As clearly you are not going to build a dating site you may make it clear are the other answers here fulfill your need or not. But my suggestion is to creat another table to tell if a user has selected any hair color of no hair color at all( sounds nonsense in your example but may have meaning in other situation). By having following tables in your database you may accomplish this.

  1. Users
  2. HairColor
  3. TypeOfColorSelection(1:Selected, 2:All, 3:Exclude, ...)
  4. UserColorSelectionProfile(UserID, TypeOfColorSelection)
  5. UserPreferredColor(UserID, HairColor)
Mehdi
I think 'New in Towns' answer solves the problem. It's a bit messy doing it this way, I'd have to use a transaction to update/insert to tables and use triggers to enforce data integrity.
Lee Smith
A: 

If you want the hair color option to be mandatory then the no choice (empty set) option doesn't work.

A: 

This reminds me of the classic UK TV ads for Whiskas cat food. The strapline was originally,

Eight out of ten owners say their cat prefers it

Later, it was changed to

Eight out of ten owners who expressed a preference said their cat prefers it

[The italics are mine.]

Clearly, the results are skewed when failing to show the difference between implicitly explicitly having no preference, otherwise why change a purrfectly good strapline for one that doesn't scan quite as well? QED ;)

My preference would be to use separate tables to model those who expressed a preference (along with the colour(s) they chose), those who expressed they had no preference and those who expressed no preference.

For a worked example, see How To Handle Missing Information Without Using NULL by Hugh Darwen.

onedaywhen