views:

96

answers:

2

The database development standards in our organization state the varchar fields should not allow null values. They should have a default value of an empty string (""). I know this makes querying and concatenation easier, but today, one of my coworkers questioned me about why that standard only existed for varchar types an not other datatypes (int, datetime, etc). I would like to know if others consider this to be a valid, defensible standard, or if varchar should be treated the same as fields of other data types?

I believe this standard is valid for the following reason:

I believe that an empty string and null values, though technically different, are conceptually the same. An empty, zero length string is a string that does not exist. It has no value. However, a numeric value of 0 is not the same as NULL.

For example, if a field called OutstandingBalance has a value of 0, it means there are $0.00 remaining. However, if the same field is NULL, that means the value is unknown. On the other hand, a field called CustomerName with a value of "" is basically the same as a value of NULL because both represent the non-existence of the name.

I read somewhere that an analogy for an empty string vs. NULL is that of a blank CD vs. no CD. However, I believe this to be a false analogy because a blank CD still phyically exists and still has physical data space that does not have any meaningful data written to it. Basically, I believe a blank CD is the equivalent of a string of blank spaces (" "), not an empty string. Therefore, I believe a string of blank spaces to be an actual value separate from NULL, but an empty string to be the absense of value conceptually equivalent to NULL.

Please let me know if my beliefs regarding variable length strings are valid, or please enlighten me if they are not. I have read several blogs / arguments regarding this subject, but still do not see a true conceptual difference between NULLs and empty strings.

+8  A: 

It pretty much boils down to this - in your application, for a specific string, is there a difference between having an empty string to having no string at all?

If there is no distinction, then the standard your are following is fine.

If you find that there is a difference, then the null has a distinct meaning and should be allowed.

In my experience, null is normally modelled to mean unknown.

Here is a more concrete example - middle names of people:

  • If you know the middle name, then the value is populated
  • If you know that the person has no middle name, then use an empty string ('')
  • If you don't know whether a person has a middle name, a null may be more appropriate

Again, if your application treats people with no middle name and those where this is unknown identically, then using an empty string for both makes sense (even if it does mean loosing some information).

Oded
+1: Business rules drive column optionality, not data type. Additionally, there's no space savings using NULL vs zero length strings: http://vampirebasic.blogspot.com/2009/01/sql-server-null-varchar-vs-empty.html
OMG Ponies
+1 for the middle name example... exactly what I was going to post :)
Tom H.
+1  A: 

No, null is a very distinctive value. For example - just one of over a dozen - null may mean "we don't have a value at all", while the empty string means "we have an answer and it is nothing." This would be useful, for example, as an answer to a question - never got an answer or the answer was nothing...

There's a GREAT whitepaper floating around out there entitled something like "the 18 meanings of NULL" - I don't recall just what the integer is! ANyway, that paper has been out there since at least the early 1990s and is truly fantastic if you can find it - I have not done a web search.

The real issue with nulls is that they can "errantly" alter what rows are returned. For example, if you say

select strcol1, datecol2, someint3 from fubar where ...

If strcol1 happens to be null, you will NOT get a value back for that row because there's an implied "where strcol1 is not null" - thus a whole row can be missing.

This is not true for all RDBMS systems, but has been true for some for quite a while so if you want your code to be able to be moved from one RDBMS to another, you should be VERY careful when handling nulls.

Another point: Oracle - or at least some versions of Oracle - will silently convert empty strings to null! This is truly outrageous, but somehow, they've had this in production "forever." Watch out! My solution is to use some other string to mean "the empty string", most often a single space character.

Richard T