views:

209

answers:

4

Hi there,

I have the task of re-factoring an SQLServer DB.... A lot of the tables and columns "ALLOW NULLS", Is this good practice...

I seem to remember the authour of CSLA.NET saying it was really bad practice to allow nulls in a DB...

If this is the case, what are my alternatives?

Remove all "ALLOW NULLS" from all columns.... and in numeric columns use a value of -1 for example??

I would really appreciate any input anyone has.

I am currently using a Model (from entity framework) from my DB and the db columns that "ALLOW NULLS" are null ... and some of the stored procedures require that i have a default value... i.e. BOOLEAN require FALSE as default ... but it is null..

Well i don't want to stray from my original question, ALLOW NULLS are a bad thing from what i can gather .... so how do i fix this ?

Any help really appreciated

+1  A: 

Null is sometimes a valid column value, but I would suggest that a better practice would be to instead add default values and make the column not null.

If you have existing data to consider you would have to do an update like:

update TableA set ColumnA = 'default value' where ColumnA is null 

... if you want to impose 'not null' on existing data.

If there is no reasonable default value then null can be a perfectly valid column value - but there is quite often a decent default available.

amelvin
I find that plenty of fields that aren't strings (who "cheat" with an empty string) can't most often be filled with a default value. Like birthdays, last editor's id, and so on. If it's an unknown value, if should be left as null - but most people fail to grasp that concept. =|
ANeves
I don't like using blank as an alternative to null - a blank value suggests that it was a choice. Using 'No description entered' instead of null in a description field would be a reasonable default and it would save casting nulls later. Like many things there is a happy medium between nulls scattered everywhere when better information could be easily defaulted to and a holy war against nulls where valid information is the first casualty.
amelvin
A: 

NULLs are a blight on the face of humanity that should be eradicated from existence. Unfortunately, however, current systems and languages (especially SQL) aren't up to the task of accommodating such a radical overhaul of current thinking. In fact, SQL is a major perpetrator of this travesty.

In the real world, therefore, the advice I would give is to avoid NULLs as much as you can, but accept the reality that sometimes there is no easy alternative.

Marcelo Cantos
-1: What?!? So, you're one of those people that use -1 for unknown ints, 1900-1-1 for unknown dates, and so on?
ANeves
Hmm, it certainly is hazardous taking an unpopular stance. No, I'm not one of those people. I am in favour of 5NF and 6NF normalisation of data, and the construction of DBMSs and associated tools that work well with this approach. Of course, since the folks who down-voted me probably don't have a clue what I'm talking about, I'm not expecting my rep back anytime soon. It is profoundly ironic that the highest-voted answer essentially agrees with my position that NULLs are bad, but generally unavoidable. OTOH, I don't agree with @Mike's implication that magic values are the only alternative.
Marcelo Cantos
+7  A: 

There is no question that NULLs should be avoided when possible, because of the problems they can introduce several problems with indexing and query syntax.

Due to the problems that NULLs can introduce, there has been a push to stop using them. However, like most such movements, it has gotten out of control, to the point that some people fanatically insist that NULLs should never be used in the database. I was in this camp for a lot of years, before I found it to be a little overzealous.

The answer is somewhere in between. NULLs should be avoided whenever possible, but there are indeed valid business reasons for storing them.

Often times you need to store an optional numeric value, and many people will tell you to just store a zero to indicate "no value", but this is an even worse antipattern of storing an magical value that really means something else. What if then you can't use zero for one field, because zero is considered a meaningful value as well, or because this value is being used as the multiple of a divisor, so you need to use some other magical value (-1?) for that field in just this case, and now you have a worse problem, because all of your optional numeric fields are now behaving differently. Oye.

Dates are an even more compelling candidate for nullable fields. The standard "no value" that people use in .NET is the default unassigned DateTime value, which is DateTime.MinValue, or more specifically January 1, 0001. However, you cannot write this magic value into SQL database because the default minimum value for a SQL Server DATETIME field is January 1, 1973. You now have to have something that checks that you are translating those values properly as they are written to and read from the database, and you have to have defensive coding all over the place that checks for whether your date fields are less than SqlDateTime.MinValue, intead of just checking whether they are equal to DateTime.MinValue. Double Oye.

My preference is to deal with the values as they truly are, and not to build a lot of artifical constructs to hide the true meaning and usage of the field. If a field may very well not have a value, make it nullable, and make it nullable in your application objects as well (if you language supports such a thing). Then, anytime you are using that field, you are required to consider what should be done in the case where it is NULL, but that is actually a good thing. Generally I an opposed to making developers waste braincycles on unnecessary code complexity, but that is because it steals focus away from the true business problem being solved; however, in this case, the lack of a value IS part of the true business problem and must be thought through. If you are just defaulting these value, then the developer writing a formula or algorithm will be less likely to think through those edge conditions where the values are missing, and may not even realize at the time that it is a possibility that those values are missing.

Mike Mooney
`new DateTime() != DateTime.MinValue` - aside from that, great answer, +1
Evgeny
A: 

Using a null value is a case where slight denormalisation can bring a big performance benefit. In the real world a NULL value is a perfectly valid way of saying that a value hasnt been recorded. This is especially relevant for boolean (BIT) values and optional numeric values.

The purists will have you create a new table with a left join to record this value. The realist will alter the existing table and add a null value.

There is definitely cases where nulls are relevant. The Nullable generic was added to .net framework for value types for exactly the same reason.

James Westgate