views:

284

answers:

7

I am not a database guy, but am trying to clean up another database. So my question is would normalizing the gender table be going too far?

User table:
userid int pk,
genderid char(1) fk
etc...

gender table:
genderid char(1) pk,
gender varchar(20)

Now at first it seemed silly to me, but then I considered it because i can then have a constant data source to populate from or bind from. I will be using WPF. If it was another framework I would probably avoid it, but what do you think?

+3  A: 

Just use a bit. 0 for male, 1 for female. Normalizing this to a lookup table only makes sense if you plan on adding "other. "

See also this post.

David Lively
-1: Forms that have precisely two options for gender belong in the last century. And in any case, what about the records where you don't have the information?
Vicky
@Vicky The degree to which you specify options for *gender* should probably depend on the business needs and sensitivity concerns within the application. Not all applications are going to have the same needs, though if you intend to only include the male/female binary option, perhaps *sex* would be a better word to use than *gender*.
Ben McCormack
@Ben: If you have a business need to collect the information at all, then surely you have a business need to collect it correctly.
Vicky
@Vicki the question was if it was worth normalizing a field that has exactly two possible values. In my opinion, No, it is not. Also, despite which century we're in, I'm having trouble envisioning a situation in which anyone may legally identify themselves as something other than male or female, and if the user doesn't answer the question, "null" always works just fine.
David Lively
And also, how does "collect it correctly" jive with "don't have the information?"
David Lively
Having Gender in the User table is already normalized assuming that each user has only one gender. Creating a new Gender table doesn't make it more normalized - and that's true no matter how many gender options there are.
dportas
@dportas absolutely correct; perhaps refactor would better describe the predicament? And I think you get points for best subtle-yet-deadly belaboration.
David Lively
A: 

Yes. I think that You can use enum in code and bind eventuatly to it.

null - unknow ; 0 - male ; 1 - female;

or you can use bool type to define this

null - unknow; true - male; false - female

Vash
Database design is supposed to make things easier to understand. `gender == true` is insane, and `gender == 1` isn't much better. Also I think half the world's population will be insulted by the idea that `female == false` (although obviously not half of the world's population of programmers)
APC
I never sad that this is good, but possible, everything depend of the business needs.
Vash
gender: true OR sex: yes please
yeffach nollid
@APC I think the world could understand defining female as "0" and male as "1", there's some pretty obvious symbology going on there :p In before comment flagged as offensive!
meagar
+4  A: 

I'm also not a database guy but I do it. It gives me the possibility to assure that only the genders are entered, that are valid (referencial integrity) and I can also use it to populate the selection control.

HCL
Makes it customisable for localisation also.
Binary Worrier
+1  A: 

Well, your company might have a requirement that, if possible, everything be normalized.

Also, depending on the business & data, you might need to include transgenders as well which would create 3+ genders (I don't know how many there are, haven't checked)

PowerUser
Even if there is one, you still would want to do it...Take the 5 mins now, rather than the 60 mins later...
NinjaCat
+4  A: 

Whether or not you choose to normalize your table structure to accomodate gender is going to depend on the requirements of your application and your business requirements.

I would normalize if:

  • You want to be able to manage the "description" of a gender in the database, and not in code.
    • This allows you to quickly change the description from Man/Woman to Male/Female, for example.
  • Your application currently must handle, or will possible handle in the future, localization requirements, i.e. being able to specify gender in different languages.
  • Your business requires that everything be normalized.

I would not normalize if:

  • You have a relatively simple application where you can easily manage the description of the gender in code rather than in the database.
  • You have tight programmatic control of the data going in and out of the gender field such that you can ensure consistency of the data in that field.
  • You only care about the gender field for information capture, meaning, you don't have a lot of programmatic need to update this field once it is set the first time.
Ben McCormack
Thank you. I didn't consider the localization issue, which I would like to plan for just in case.
AdamD
Also, other values allow multiple null meanings, such as "Not known", "Not disclosed", or other possibilities.
Adam Musch
@Adam I would make the field a `char(1)` and simply allow null.
meagar
@meagar - and what does NULL mean then - unknown, not disclosed, undefined, transhuman? It's an indeterminate value. Plus, inclusion of nulls generally makes logic less clear (gotta do trivalue, or use coalesce()) when in this case, it's pretty avoidable.
Adam Musch
+1  A: 

I can think of applications where I'd use different columns for sex and gender, have three values for sex (male/female/decline to state) and six for gender (male/female/transgendered male/transgendered female/asexual/decline to state). Granted, I live in San Francisco, where there's an level of public discussion of transgender issues that much of the rest of the world is behind the curve on.

The point is: without a compelling reason to think otherwise, I'd assume that any simplifying assumption I made about demographics was limited and parochial. The cost of breaking sex out to its own table is small now and expensive later. I wouldn't avoid the small cost on the basis of an assumption.

Robert Rossney
Apparently someone either doesn't know or doesn't want to hear about transgender issues.
Robert Rossney
A: 

I'll remark on another aspect: sorting. Normally, 'M' sorts after 'F'; in a project one time, a database table had a gender field with either of those two values. There was a desire to be able to sort results on the gender (census data) and a further preference to have 'M' appear before 'F'. My solution was to add a separate lookup table, assigning the Male value an ID of 0, and Female an ID of 1. So queries on the main table could easily be sorted on the new genderID field.

Grant Palin
What database query language doesn't let you write `ORDER BY Gender DESC`?
dan04
@dan04 Keep in mind that this was several years ago, so I'm not certain of the circumstances now. But I think the other requirement was to be able to filter on that column, as well as sort it, and numbers are easier to filter than characters. I think the sort+filter need led to the solution.
Grant Palin