views:

144

answers:

6
+3  Q: 

NCHAR(1) vs BIT

I'm working under refactoring of database (SQL Server 2008) scheme and gather arguments to change NCHAR(1) columns (which keep Y|N values) to BIT. Everybody understand this is necessary and don't know why does it take place but this change affects production database so weighty arguments are required. Table keeps address catalog (up to 1m of records).

First argument I found - each nchar fields take 2 bytes, each 8 bit fields - 1 byte (next 8 - additional 1 byte).

What's next? Maybe some indexes performance issues?

+8  A: 

I would hesitate to provide any arguments for such a change unless you had a good reason to make that change. i.e. you have to balance the cost of a change to what you would personally of done / prefer, vs the cost of actually implementing it and the benefits.

Have you checked whether the use of nchar(1) is hurting performance, or are you falling into the trap of premature-optimization? You are only talking about 1 million records here.

For the minor storage / IO cost you think you are incurring, consider the total man hours to change, retest and upgrade the system * hourly rate vs cost of just buying a faster disk. I suspect the disk will be far cheaper - as well as benefit every aspect of the system.

Andrew
+1 - Having to modify all the stored procs and such to account for the new logic would probably cost a LOT more than the extra space being used.
JNK
+1, it's not the end of the world and if it's not broken, don't fix it yet. The original developer might of had in mind terniary options, such as M for maybe.
Tom Gullen
Agreed. And, you cannot index a BIT field in SQL Server. Your composite indexes probably aren't doing you much good since there are only two values in your NCHAR field, but if they are then you're in for a world of hurt changing it up.
mattmc3
@mattmc3: Not true. We've been able to index bit columns since SQL 2000. http://msmvps.com/blogs/greglow/archive/2004/12/20/26432.aspx
Joe Stefanelli
Is it better to use a bit flag than a char field set to Y or N? Absolutely. Is it good to go back and change a ton of code to implement such an improvement? Almost certainly not. You'd have to find every place where this field is used, change every one to the new scheme, and then test the results. Odds are that there's some place in the code where a programmer is doing something "clever", like concatenating severals flags together and then comparing to "YYN" or, who knows, something truly bizaree like adding 1 to it and comparing to "Z". Making a change like this is very dangerous.
Jay
I think the main reason is not saving hdd space but performance
abatishchev
My point was that you could still buy a better 'performing' hard disk for a fraction of the price of the change you are implementing that has no discernable benefit. e.g. if your only on 7.2k disks, buy 10k, or 15k, of FC disks, or an iSCSI unit, or get space allocated from a SAN etc. The cost of the change + retest (you do of course test don't you?) + upgrade can be phenominally expensive
Andrew
+4  A: 

One common reason to find NCHAR(1) instead of bit is that Oracle did not support a bit type. If you had an Oracle or Oracle-trained developer, or a database that used to run on Oracle, you're gonna see this a lot. In Sql Server, there's really no need for this.

However, I've found that most places where I have a bit field (or NCHAR(1) in Oracle) what I really want is a datetime that indicates not as much the value of the flag but exactly when it became true. This isn't always that case, but when I think back about old code I've written I'd guess that 4 out of 5 times I used a bit field I should have used a datetime.

Joel Coehoorn
And doing datetime comparisons in your queries where could have been doing bit comparisons? I prefer to use both datetime and bit. datetime to know when it became true (or false) and the bit field itself for queries.
Jeroen
@Jeroen - Correctness first, performance second. But most of the time I'm just checking that it's not NULL anyway, and that's just about as fast as a bit check.
Joel Coehoorn
@Joel: Interesting point. Thanks
Jeroen
Hmm, interesting. I take it this means null is full and any date is true. But this assumes that you want to know the date it became true but you don't care about the date it became false. Frankly I think this is risky design: You're cramming two facts into one field: the current status, and the date that the current status was last changed. Like Jeroen, I'd create two fields if I need to know both facts.
Jay
@Jay - that's the other one time in four. A lot of time, a bit field is only ever intended to change once. For most of the rest, you want to know _every_ time the value changes, and that means not just another field but a whole new record.
Joel Coehoorn
@Joel: Well, sounds to me like a very specialized case: Flag starts out false, only changes to true once, never changes back, and you care about the date when it changed. If that's the overwhelming majority of the uses for booleans that you have, okay. Can't say my experience is anything like that.
Jay
+1  A: 

Is the field used extensively in queries Where fld = 'Y'?

If so i would consider doing a test to see whether or not changing it to bit impacts performance.

Changing it now just because it ought to be a bit field since you're storing boolean values on a table of 1m+ records doesn't sound like a good idea to me either and i'd go with @Andrew's answer.

Jeroen
+2  A: 

A bit field helps your logic by automatically enforcing what is currently an implicit business rule (i.e., this column can only contain 'Y' or 'N'). If you're enforcing that rule programmatically, you can save by eliminating that overhead. Indexing a bit column on its own has little value due to the low cardinality, but it could be useful as part of a composite index.

See also:

Joe Stefanelli
+2  A: 

Create the bit field, add a computed column that emulates the nchar(1) for now.

What not to use nchar:

  • Y vs y vs some unicode Y
  • Overhead of checking Y or N
  • Not natively "true" o "false" (eg won't map directly to .net boolean)
  • Y and N are English. Ja/Nein, Oui/Non etc

You shouldn't index this anyway so it comes down to efficient storage and use. bit is

  • smaller
  • datatype safe (eg no CHECK needed)
  • maps to client meaning directly
  • independent of region

Saying that, we use a smalldatetime "WhenInactive" field as a substitute for "IsActive" field. NULL = active.

gbn
+1  A: 

If you are using LINQ2SQL or Entity Framework a BIT column will translate into a bool, but NCHAR(1) will translate into a string.

Albin Sunnanbo