views:

816

answers:

15

I'm designing a database schema, and I'm wondering what criteria I should use for deciding whether each column should be nullable or not.

Should I only mark as NOT NULL only those columns that absolutely must be filled out for a row to make any sense at all to my application?

Or should I mark all columns that I intend to never be null?

What are the performance implications of small vs large numbers of NOT NULL columns?

I assume lots of NOT NULL columns would slow down inserts a bit, but it might actually speed up selects, since the query execution plan generator has more information about the columns..

Can someone with more knowledge than me give me the low-down?

+13  A: 

There are no significant performance consequences. Don't even think about considering this as an issue. To do so is a huge early optimization antipattern.

"Should I only mark as NOT NULL only those columns that absolutely must be filled out for a row to make any sense at all to my application?"

Yes. It's as simple as that. You're a lot better off with a NULLable column without any NULL values in it, than with the need for NULLs and having to fake it. And anyway, any ambiguous cases are better filtered out in your Business Rules.


EDIT:

There's another argument for nullable fields that I think is ultimately the most compeilling, which is the Use Case argument. We've all been subject to data entry forms that require values for some fields; and we've all abandoned forms where we had no sensible values for required fields. Ultimately, the application, the form, and the database design are only defensible if they reflect the user requirements; and it's clear that there are many, many database columns for which users can present no value - sometimes at given points in the business process, sometimes never.

le dorfier
Do you have any technical resources to verify that?
Rex M
It would kinda be like trying to prove a negative. If there were any consequence, I would expect it to be widely documented.
le dorfier
The upvotes on his answer will serve as technical resources.
Portman
There are just as many upvotes for more hardline stances on nulls.
Steven Huwig
A comment on another answer says "no one would have a null salary", but its quite possible for an employee to be in the system with an -undetermined- salary, which is distinct from -no- salary. Another consideration is forward/backwards compatibility and the organic growth of...
Richard Levasseur
...growth of the database. As it grows, some fields may be deprecated or become unused.
Richard Levasseur
I submit that vote-counting ... is no reliable indicator of correctness. It just polls for conventional wisdom, which I find to be generally unreliable especially when it comes to databases.
le dorfier
@le dorfier: you wrote "Your data model must model the real world, and the real world is ambiguous." This does not entail using NULL values. Using NULL forces you to renounce the law of the excluded middle -- renouncing it causes ambiguity and does not solve it.
Steven Huwig
The correct way to deal with ambiguity is to model your data based on the propositions that may or may not be mde about that data. The proposition "my dog's birthdate is NULL" is nonsensical. Don't force yourself to claim nonsense. Make a new table with optional participation for this fact.
Steven Huwig
If you have no dog, then your dog's birthday is NULL. I understand your description of making additional tables instead, but that's just a dubious way to accomplish the same thing, and you end up with two tables. Find me one credible reference recommending you create "non-mandatory property tables."
le dorfier
How about a member of the SQL committee? http://www.dcs.warwick.ac.uk/~hugh/TTM/Missing-info-without-nulls.pdf . It's part 1a.
Steven Huwig
... and your citation requested on my answer is on pages 53-55 of Database in Depth, under "Why Nulls are Prohibited," now linked in my answer.
Steven Huwig
... and if you have no dog, your dog's birthday is mu, not null. "My dog's birthday is X" encompasses two propositions: the existential proposition that I have a dog, and the proposition that my dog's birthday is X. "My dog's birthday is NULL", OTOH, can only mean "I don't know my dog's birthday."
Steven Huwig
Or it can mean "I don't want to tell you my dog's birthday, and it shouldn't break your database if I won't."
le dorfier
Date's eternal war (God bless him, we need him badgering the vendors) is to enable the creation of pure relational databases and fully capable query languages. I've not seen many cases where a Date-designed refactoring has actually been implemented.
le dorfier
with today's technology most OLTP systems don't need nulls. the trouble is that designers are lazy and don't think in terms of the propositions being made about their universes of discourse.
Steven Huwig
Granted, database design is generally poor, as is query design. We can lament that. But there are two questions - how to get better tools, and how better to use the tools we have. I think we both care mostly about #2, since we (at least I) am but a feeble voice. Date addresses mainly #1.
le dorfier
you don't need NULLs to represent "no answer." you need vertical decomposition, per the Darwen presentation.
Steven Huwig
But since this is a question about existing tools and (specifically) SQL, then I think discussing doing away with NULLs when we don't have a good #1 alternative may not be very useful.
le dorfier
And if you want to start, then you *must* insist on complete decomposition. The worst case is a non-nullable dog birthday in the primary table, which is the inevitable consequence of a lot of the suggestions in this thread.
le dorfier
Using standard normalization techniques -- horizontal decomposition, vertical decomposition -- you can avoid the vast majority of NULLs. At some point I punt and include NULLs too, but not until that normalization is done.
Steven Huwig
Fair enough -- or even 'better,' my dog's birthday is 01-01-1900 but that's someone's "sentinel value."
Steven Huwig
@Steven – isn't NULL just a standard [sentinel value](http://en.wikipedia.org/wiki/Sentinel_value) itself?
Kenny Evitt
Probably in the physical implementation, but conceptually it's much more overloaded than that. For instance, you can test for equality with a sentinal value, but don't try "val = NULL".
le dorfier
Sentinel values need to be a member of the domain, and NULL is in no domain because it is not a value.
Steven Huwig
+2  A: 

I would tend to agree with dorfier.

Be serious in your application about being flexible when receiving database NULL values and treating them as empty values, and you give yourself a lot of flexibility to let NULL's get inserted for values you don't specify.

There's probably a lot of cases where you need some very serious data integrity (and/or the intense speed optimization of disallowing NULL fields) but I think that these concerns are tempered against the extra effort it takes to make sure every field has a default value and/or gets set to a sensible value.

danieltalsky
+10  A: 

I have found marking a column as NOT NULL is usually a good idea unless you have a useful meaning for NULL in the column. Otherwise you may unexpectedly find NULL in there later when you realise you don't want it, and changing is harder.

singpolyma
Cool! Another ex-seminarian programmer! I'll guess it was really more than just 'personal enrichment'.
FastAl
+1  A: 

If you can think long term, having NULLs in a column affects how you can design your queries. Whether you use CASE statements, COALESCE, or have to explicitly test for NULL values can make the decision for you.

From a performance standpoint, it's faster to not have to worry about NULLS. From a design standpoint, using NULL is an easy way to know that an item has never been filled in. Useful examples include "UpdatedDateTime" columns. NULL means an item has never been updated.

Personally I allow NULLs in most situations.

Some Canuck
Such "UpdateDateTime" columns can sometimes be made nullable (I'd have to see your db to say for certain), but most situations should NOT allow nullable. Otherwise, someday management will ask why there's data missing from the reports and tell you to "fix it".
Cybis
That's not true at all. How often do you dump a database table to a text document without formatting it? It's as simple as replacing the "NULL" value with "(not updated)" or something similar.
Some Canuck
@Cybus: That's crazy talk. It is clearly correct usage of a nullable field
Harry
+24  A: 

Honestly, I've always thought NOT NULL should be the default. NULL is the odd special case, and you should make a case for it whenever you use it. Plus it's much easier to change a column from NOT NULL to nullable than it is to go the other way.

kquinn
Agreed. Didn't someone recently post about using the number of nullable columns to judge app age/rott?
jms
The problem is when it gets to users, you end up coercing entries that they fake because they can't get past the forms. I'm sure you're familiar with those ... a good example of system-driven requirements overcoming user requirements.
le dorfier
Sure, there are uses for NULL -- did I ever say otherwise? But nullable columns really are the exception, not the rule, in just about every design. If NULL is the right solution, use it... just make sure it's the right solution first, and not just a lazy hack.
kquinn
Your experience is different from mine. You start from the position that fields should be required unless someone can defend why not. I find it friendlier to start from the position that they are optional unless someone can prove they are required - IMHO a more user-centered point of view.
le dorfier
I suppose I have a more backend-oriented mindset; dealing with user input has always been one of my least favorite things to do :) It's also one of the places where NULL tends to actually be appropriate.
kquinn
Actually, it's reasonably difficult to go from NOT NULLable to NULLable. You must change every piece of code that references the field to have a sensible interpretation of NULL, and check all JOIN, WHERE, etc. clauses to make sure they work properly with NULL. Not trivial.
Mark Brackett
@Mark Brackett - but that's just code changes... bad data is pretty much unrecoverably broken. Code is ephemeral and data is forever.
Steven Huwig
@Steven, the assertion is being made that NULL should be rare, and only used when no other option can be imagined. That's wrong. The choice to use NULLable should be made exclusively based on the facts of the problem domain, not whether the designer can find any possible alternative.
le dorfier
I believe Ingres provided (provides) NOT NULL as default; you had (have) to specify NULLS ALLOWED for a column to accept nulls.
Jonathan Leffler
If something needs to be null I break it out into a child table, and join back using a view to recover the unified view with nulls (if that was ever needed.)
magnifico
+9  A: 

Err on the side of NOT NULL. You will, at some point, have to decide what NULL "means" in your application - more than likely, it will be different things for different columns. Some of the common cases are "not specified", "unknown", "inapplicable", "hasn't happened yet", etc. You will know when you need one of those values, and then you can appropriately allow a NULLable column and code the logic around it.

Allowing random things to be NULL is, sooner or later, always a nightmare IME. Use NULL carefully and sparingly - and know what it means in your logic.

Edit: There seems to be an idea that I'm arguing for NO null columns, ever. That's ridiculous. NULL is useful, but only where it's expected.

Le Dorfier's DateOfDeath example is a good example. A NULL DateOfDeath would indicate "not happened yet". Now, I can write a view LivingPersons WHERE DateOfDeath IS NULL.

But, what does a NULL OrderDate mean? That the order wasn't placed yet? Even though there's a record in the Order table? How about a NULL address? Those are the thoughts that should go through your head before you let NULL be a value.

Back to DateOfDeath - a query of persons WHERE DateOfDeath > '1/1/1999' would not return the NULL records - even though we logically know they must die after 1999. Is that what you want? If not, then you better include OR DateOfDeath IS NULL in that query. If you allow all columns to be NULL, you have to think about that every single time you write a query. IME, that's too much of a mental tax for the 10% or so of columns that actually have legit meaning when they're NULL.

Mark Brackett
IME, you should deal with ambiguous cases in your BL. It's not a data integrity issue.
le dorfier
@le dorfier: The relational system is the business layer, unless your business model doesn't need things like propositions made about a universe of discourse (hint: it does.)
Steven Huwig
Disagree. There's not enough granularity to the expressiveness of the relational system, which really only wants to describe things at the table level without doing really nasty things with triggers. I suggest you read up on "Object Role Modeling" for copious examples of what I mean.
le dorfier
Ambiguity is a data integrity issue, imo. The next generation of developers maintaining your application will someday mistake your meaning of null, at which point the data becomes ambiguous in a way the BL cannot resolve. I see this often where I work - though not a nightmare, it's still a big pain.
Cybis
@le dorfier: I suggest you read Databases, Types, and the Relational Model to learn why you are wrong.
Steven Huwig
that's http://www.pearsonhighered.com/educator/academic/product/0,3110,0321399420,00.html
Steven Huwig
@Steven, I'm familiar with Chris Date's POV, and concur with it. I'm speaking in terms of the distinction that can legitimately be made between data integrity ambiguity (properly dealt with in the book) and business rules ambiguity (not specific to data elements). And you mix the two at your peril.
le dorfier
How is NULL (aka missing data) not a data integrity issue? How is it not meaningful to the data model? *Any* ambiguity in your data model is a failure. Your BL is for defining logic upon your data, not for cleaning up your misshapen data modeling mistakes.
Mark Brackett
Your data model must model the real world, and the real world is ambiguous. You won't have much of an application if it can only deal with what is known, with certainty, at the time a record is first added to a table.
le dorfier
There are no NULLs in the real world.
Steven Huwig
Whooaaa!! Pretty remarkable assertion. I don't have a "Date of Death" yet, I hope.
le dorfier
No, because you haven't died and hence shouldn't appear in any table that represented such a proposition. If you've got "Date of Death" in your Person table, that's a bad design. Anyone who appears in Dead_Person table will have a date of death. Therefore there are no nulls in the real world.
Steven Huwig
Steven, read Date's interview again. He talks about weaknesses in SQL, and specifically the way SQL deals with NULLs. Date's assertion (which I can generally agree with) is that SQL is a deficient query language. But it's the one we have. You aren't proposing any other.
le dorfier
How would you handle my date of death? A assume a new table something like "USER_DEATH"? What if you knew I had died, but didn't know when? Do you refuse to add the row? What if you didn't know whether I was dead or alive?
le dorfier
+2  A: 

Stick with NOT NULL on everything until someone squeaks with pain about it. Then remove it on one column at a time, as reluctantly as possible. Avoid nulls in your DB as much as you can, for as long as you can.

Jonathan Leffler
While I don't agree with you, I am at a loss as to why someone marked this as offensive.
HLGEM
Thanks - I'm puzzled too. I decided it isn't worth worrying about. If they're offended by this, they're going to be offended by lots of SO.
Jonathan Leffler
+9  A: 

I try to avoid using NULL's in the database as much as possible. This means that character fields are always not null. Same for numeric fields, especially anything representing money or similar (shares, units, etc).

I have 2 exceptions:

  1. Dates where the date might not be known (eg. DivorcedOn)
  2. Optional foriegn key relationships (MarriedToPersonId). Though on occasion I have used "blank" rows in the foreign key table and made the relatonship mandatory (eg. JobDescriptionCode)

I have also on occasion used explicit bit fields for "unknown"/"not set" (eg. JobDescriptionCode and IsEmployeed).

I have a few core reasons why:

  1. NULLs will always cause problems in numeric fields. Always. Always. Always. Doesn't matter how careful you are at somepoint select X + Y as Total is going to happen and it will return NULL.
  2. NULLs can easily cause problems in string fields, typically address fields (eg. select AddrLine1 + AddrLine2 from Addresses).
  3. Guarding against NULLs in the business logic tier is a tedious waste of effort... just don't let them in the DB and you can save 100's of lines of code.

My preferred defaults:

  • Strings -> "", aka an empty string
  • Numbers -> 0
  • Dates -> Today or NULL (see exception #1)
  • Bit -> false
I'll take null over empty string all day...but maybe that's just me. Do you not like isnull()?
dotjoe
@dotjoe, I agree completely. I also don't find 0 for numbers to be acceptable. 0 can have semantic meaning. I don't get the nullophobia.
BobbyShaftoe
I don't like unnecesary code. If a NULL string does not have a seperate meaning from an empty string then the NULL is unnecesary and so is any supporting code. For strings it has been my experience that NULL and empty string almost always have the same meaning. I like less code (kind of a DRY thing)
@Bobby... how can you have NULL salary (to pick an easy example)? It is not possible! A person can however have "no salary" which is an important distinct fact, important enough to merit being stored seperately.
checking for null takes one line of code...hardly a deal-breaker plus I don't like empty strings or padded strings cluttering up me tables.
dotjoe
Empty strings are much easier to cope with for optional values than NULL, which in relational algebra has unintuitive behaviour that you probably don't want for plain data.
bobince
eg. If I want to look for matches, I don't want to have to convert to “(a=b OR a IS NULL AND b IS NULL)”; looking for records which don't match a value I don't want “(a<>'x' OR a IS NULL)”... I especially don't want to have two versions of dynamic/parameterised statements for null/not null input.
bobince
Agreed. I try to avoid null as well unless there is no practical alternative. I strongly prefer empty strings over nulls, because if you allow nulls you'll someday encounter both! Furthermore, checking for null isn't "one line of code", it's "one EXTRA line of code" for practically every query.
Cybis
"0" is a bad alternative for a null number, however. Usually 0 is a valid value. And for the love of jesus, please don't recommend negative numbers to denote null.
Cybis
@bobince - well I've never had to query for matches on a optional string field and if I had to, I don't know why I would care about null values...I would be more annoyed by empty strings matching so it would have to be "(a=b and a <> '')"...
dotjoe
It's scary that this is a widely held view. This is how databases end up looking nothing like the problem domain - the attention is on the structure rather than the problem. If a value might not apply, or might not be known at some time during the record lifecycle, then it's NULLable by definition.
le dorfier
+5  A: 

You may find Chris Date's Database In Depth a useful resource for these kinds of questions. You can get a taste for his ideas in this interview, where he says among other things:

So yes, I do think SQL is pretty bad. But you explicitly ask what its major flaws are. Well, here are a few:

  • Duplicate rows
  • Nulls
  • Left-to-right column ordering
  • Unnamed columns and duplicate column names
  • Failure to support "=" properly
  • Pointers
  • High redundancy

In my own experience, nearly all "planned nulls" can be represented better with a child table that has a foreign key to a base table. Participating in the child table is optional, and that's where the null/not null distinction is actually made.

This maps well to the interpretation of a relation as a first-order logic proposition. It also is just common sense. When one does not know Bob's address, does one write in one's Rolodex:

Bob. ____

Or does one merely refrain from filling out an address card for Bob until one has an actual address for him?

Edit: Date's argument appears on pages 53-55 of Database In Depth, under the section heading "Why Nulls are Prohibited."

Steven Huwig
But then you are just adding another join where you could have just had a nullable column. I'd prefer the nullable column.
BobbyShaftoe
Much of the time there's a reasonable default - e.g., empty string for varchar columns. I would use that instead of a null or a child table.
Cybis
An empty string is not a reasonable default. Putting an empty string in a street address column makes the proposition, "Bob lives at ''", which is obviously not true. Your database should not represent untrue things.
Steven Huwig
@BobbyShaftoe: it's not a matter of preference, it is a matter of logical consistency.
Steven Huwig
It's a matter of having a database that accurately models the domain, in a generally acceptable manner.
le dorfier
No business domain is accurately modeled by a system with NULLs, since a system with NULLs allows a properly formed query to return incorrect answers. _Database in Depth_ has an example of this.
Steven Huwig
I challenge you to quote it and defend it then. You have a lot of remarks with no examples.
le dorfier
I will link to it and provide the page numbers. This is an answers forum, not a debate forum. Suffice to say that many of those who developed the relational model of data now offer varied and vigorous admonitions to never use NULL.
Steven Huwig
@Steven, note again in your quote that he ascribes the problem to deficiencies in the SQL language. But we don't currently have anything better. He's not asserting we're using SQL incorrectly wrt NULLs. FOr him it's a basic deficiency in SQL (as distinct from relational data stores).
le dorfier
The rational response to "SQL NULLS give incorrect query results" is "don't use SQL NULLs," not "don't use SQL." You have to read the book, and the rest of his work.
Steven Huwig
+4  A: 

I lean toward NOT NULL unless I see a reason otherwise -- like someone else said, like it or not, NULL is the weird special case.

One of my favorites in regards to NULL is:

SELECT F1 FROM T WHERE F2 <> 'OK'

...which (in DB2 at least) won't include any rows where f2 is null -- because in relational jargon, (NULL <> 'OK') IS NULL. But your intent was to return all not-OK rows. You need an extra OR predicate, or write F2 DISTINCT FROM 'OK' instead (which is special case coding in the first place).

IMO, NULL is just one of those programmer's tools, like pointer arithmetic or operator overloading, that requires as much art as science.

Joe Celko writes about this in SQL For Smarties -- the trap of using NULL in an application is that its meaning is, well, undefined. It could mean unknown, uninitialized, incomplete, not applicable -- or as in the dumb example above, does it mean OK or not-OK?

twblamer
+1 NULL's behaviour in relational algebra is unintuitive and a potential source of many bugs. You still need it for missing optional-foreign-key references, but for plain data null-like values such as the empty string are likely to be easier. (Except in Oracle where the empty string *is* null, argh)
bobince
It's only unintuitive if you find 3-valued logic unintuitive. And pretending a NULL value is some arbitrary other value is a potential source of just as many bugs.
le dorfier
+1  A: 

What are the performance implications of small vs large numbers of NOT NULL columns?

This may be stating the obvious, but, when a column is nullable, each record will require 1 extra bit of storage. So a BIT column will consume 100% more storage when it is nullable, while a UNIQUEIDENTIFIER will consume only 0.8% more storage when it is nullable.

In the pathological case, if your database has a single table consisting of a single BIT column, the decision to make that column nullable would reduce your database's performance in half. However, under the vast majority of real world scenarios, nullability will not have a measurable performance impact.

Portman
"However, under the vast majority of real world scenarios, nullability will not have a measurable performance impact." Glad you finally made sense in the end!
Harry
You are (probably) technically correct overall, but I love that you lead with a worst case scenario!
Kenny Evitt
+2  A: 

Personally I think you should mark the columns as Null or not null based on what kind of data they contain, if there is a genuine requirement for the data to always be there, and whether the data is always known at the time of input. Marking a column as not null when the users don't have the data will force then to make up the data which makes all your data useless (this how you end up with junk data such as an email field containing "[email protected]"). Failing to require something that must be there for the process to work(say the key field to show what customer made the order) is equally stupid. Null vice not null is a data integrity issue at the heart, do what makes the most sense towards keeping your data useable.

HLGEM
A very good plain and simple explanation using real-world logic. Which should always trump distorted data design and content based on incomplete understanding of the tool, or questionable (probably any) efficiency issues.
le dorfier
+4  A: 

Thanks for all the great answers, guys. You gave me a lot to think about, and helped me form my own opinion/strategy, which boils down to this:

Allow nulls if-and-only-if a null in that column would have a specific meaning to your application.

A couple of common meanings for null:

  • Anything that comes directly from the user
    • Here null means "user did not enter"
    • For these columns, it's better to allow nulls, or you'll just get [email protected] type input anyway.
  • Foreign keys for "0 or 1" relationships
    • null means "no related row"
    • So allow nulls for these columns
    • This one is controversial, but this is my opinion.

In general, if you cannot think of a useful meaning for null in a column, it should be NOT NULL. You can always change it to nullable later.

Example of the sort of thing I ended up with:

create table SalesOrderLine (
    Id int identity primary key,
    -- a line must have exactly one header:
    IdHeader int not null foreign key references SalesOrderHeader, 
    LineNumber int not null, -- a line must have a line number
    IdItem int not null, -- cannot have null item
    Quantity decimal not null, -- maybe could sell 0, but not null
    UnitPrice decimal not null, -- price can be 0, but not null
    -- a null delivery address means not for delivery:
    IdDeliveryAddress int foreign key references Address, 
    Comment varchar(100), -- null means user skipped it
    Cancelled bit not null default (0) -- true boolean, not three-state!
    Delivered datetime, -- null means not yet delivered
    Logged datetime not null default (GetDate()) -- must be filled out
)
Blorgbeard
+1 I wish I could upvote more than once.
vmarquez
The trouble with this solution is that the semantics of NULL in SQL are pretty clearly "unknown." So a row doesn't show up in either SELECT * WHERE IdDeliveryAddress = 1600 or SELECT * WHERE IdDeliveryAddress != 1600, but something not for delivery should show up in the latter query.
Steven Huwig
I think that's a pretty reasonable design, myself.
Mark Brackett
'Nullable Foreign keys for "0 or 1" relationships' - Why would you want orphaned records? I can't imagine a situation that calls for that behavour. I'm pretty sure 0 value for Foreign keys (assuming Primary Key is a Int)
Harry
@Steven, that is a good point. I would need to write "where IdDeliveryAddress not null and IdDeliveryAddress != 1600". I'll give that one some more thought.
Blorgbeard
@Harry, a value of 0 would point to the record with PK of 0. So that row would need to be maintained as a sentinal value meaning "no delivery".
Blorgbeard
This is really a discussion for the question I linked to above though.
Blorgbeard
@Blorgbeard: Codd thought that different NULL markers would be needed to represent different states. The trouble is that the SQL NULL handling works against you unless you use it to mean "unknown" and only "unknown."
Steven Huwig
For this particular design, consider a separate table SALES_ORDER_LINE_DELIVERY_ADDRESSES or somesuch, and make participation optional.
Steven Huwig
Using the `IdDeliveryAddress != 1600` example, I can understand why that wouldn't include NULLs – if it's NULL then it may (actually) be `1600` but we don't know! So, in a sense, SQL is forcing you to explicitly ask for either (1) addresses that are definitely not `1600`; or (2) addresses that are not `1600` AND addresses that are unknown or undefined.
Kenny Evitt
A: 

Any nullable column is a violation of third normal form.

But, that's not an answer.

Maybe this is: there are two types of columns in databases - ones that hold the structure of the data, and ones that hold the content of the data. Keys are structure, user-enterable fields are data. Other things - well - it's a judgment call.

Stuff that's structure, that is used in join clauses, is typically not null. Stuff that's data is typically nullable.

When you have a column that hold one of a list of choices or null (no choice made), it is usually a good idea to have a specific value for "no choice made" rather than a nullable column. These types of columns often participate in joins.

paulmurray
Why avoid having null (no choice made?)
Harry
because null means unknown, whereas no choice means no choice.
Steven Huwig
No it doesn't violate 3NF http://en.wikipedia.org/wiki/Third_normal_form
Meff
The reason for avoiding NULL is that in SQL, you have to use outer joins all over the place to handle it. Similarly, in java you need to specially handle the detection of the null condition.Having a special value for no choice allows you to - for instance - use polymorphic methods.
paulmurray
Dang - I had misunderstood 3nf. What's that one, then, where if a field is null then it is supposed to be farmed out to another table with a zero-or-one relationship? Effectively, a subclass?
paulmurray
A: 

Using 'Not Null' or 'Null' should be primarily driven by your particular persistance requirements.

Having a value being Nullable means there are two or three states (three states with Bit fields)

For instance; if I had a bit field which was called 'IsApproved' and the value is set at a later stage than insertion. Then there are three states:

  1. 'IsApproved' Not answered
  2. 'IsApproved' Is Approved
  3. 'IsApproved' Is Not Approved

So if a field can be legitimently considered Not Answered and there is no default value that is suitable. These fields should be considered for being nullable

Harry