views:

2135

answers:

12

I'm working on an application developed by another mob and am confounded by the use of a char field instead of bit for all the boolean columns in the database. It uses "Y" for true and "N" for false (these have to be uppercase). The type name itself is then aliased with some obscure name like ybln.

This is very annoying to work with for a lot of reasons, not the least of which is that it just looks downright aesthetically unpleasing.

But maybe its me that's stupid - why would anyone do this? Is it a database compatibility issue or some design pattern that I am not aware of?

Can anyone enlighten me?

+18  A: 

I've seen this practice in older database schemas quite often. One advantage I've seen is that using CHAR(1) fields provides support for more than Y/N options, like "Yes", "No", "Maybe".

Other posters have mentioned that Oracle might have been used. The schema I referred to was in-fact deployed on Oracle and SQL Server. It limited the usage of data types to a common subset available on both platforms.

They did diverge in a few places between Oracle and SQL Server but for the most part they used a common schema between the databases to minimize the development work needed to support both DBs.

Jeremy Wiebe
Ahh, that good ol' boolean 'Maybe'!
Mitch Wheat
As an aside: watch out for right padded fields as well. I've found this quite common in DB's that use the "Y/N" idiom for booleans.
Mitch Wheat
Y/N/FileNotFound...
nickf
Also, in addition to simple Y/N fields, I've seen one schema that used the full words "Yes" and "No" and in that case the capitalization mattered. Now _that_ was a pain! :-(
Jeremy Wiebe
Ah, reminds me of a system I inherited. The 'Booleans' were stored in 32-bit integer columns with a foreign key to a table that defined Yes and No. Irritatingly, No was 1 and Yes was 2.
BlackWasp
FileNotFound ftw!!
Orion Edwards
@BlackWasp: better than 0 = yes, 1 = no. In that situation: if (myBooleanField) { .. } would execute if it was "no"
nickf
@nickf: Isn't it funny how we (software developers) can complicate even the simplest things? :-(
Jeremy Wiebe
@nickf - that's true, good point. It could have also been in bigints. I wonder, perhaps a UNIQUEIDENTIFIER with 00000000-0000-0000-0000-000000000000 for yes and 00000000-0000-0000-0000-000000000001 for no? Or maybe just whack a new GUID in every time you need a Y/N. It could go into a Y/N table! (Of course, you would then need another column in the Y/N table to explain which was in use for each GUID. Perhaps another GUID....
BlackWasp
+10  A: 

Welcome to brownfield. You've inherited an app designed by old-schoolers. It's not a design pattern (at least not a design pattern with something good going for it), it's a vestige of coders who cut their teeth on databases with limited data types. Short of refactoring the DB and lots of code, grit your teeth and gut your way through it (and watch your case)!

rp
Amen - welcome to inheritance tax :)
stephbu
just imagine if you did spend ages refactoring your code, and then some large customer comes along that uses Oracle and the salesman says "yup, we've had oracle support since day 1"... you'd just love refactoring it back. It works, leave it alone and spend your time more positively.
gbjbaanb
We actually use CHAR(1) fields for all Y/N values since it allows for complete cross-database compatibility without any thought.
Jess
+10  A: 

Other platforms (e.g. Oracle) do not have a bit SQL type. In which case, it's a choice between NUMBER(1) and a single character field. Maybe they started on a different platform or wanted cross platform compatibility.

WW
A: 

They probably were used to using Oracle and didn't properly read up on the available datatypes for SQL Server. I'm in exactly that situation myself (and the Y/N field is driving me nuts).

Stewart Johnson
+1  A: 

I don't like the Y/N char(1) field as a replacement to a bit column too, but there is one major down-side to a bit field in a table: You can't create an index for a bit column or include it in a compound index (at least not in SQL Server 2000).

Sure, you could discuss if you'll ever need such an index. See this request on a SQL Server forum.

splattne
There's not much point in indexing a bit column -- there are only two options, so the variance (I can't remember the technical term right now) isn't high enough to be efficient.Why you can't use them in compound indexes is beyond me, though.
Roger Lipscombe
You can use them in compound indexes in SQL 2005.
BlackWasp
You can index a bit field, since SQL 2000 I believe, and whether and index will be useful depends on the distribution of values... fer sure, if there half true and half false, then of course there's no point...
Charles Bretana
But if only 2-3% of records are true (like isSpecialCustomerFlag) then an index will speed up those retrieve Special Customer queries enormously...
Charles Bretana
A: 

The reasons are as follows (btw, they are not good reasons):

1) Y/N can quickly become "X" (for unknown), "L" (for likely), etc. - What I mean by this is that I have personally worked with programmers who were so used to not collecting requirements correctly that they just started with Y/N as sort of 'flags' with the superstition that it might need to expand (to which they should use an int as a status ID).

2) "Performance" - but as was mentioned above, SQL indexes are ruled out if they are not 'selective' enough... a field that only has 2 possible values will never use that index.

3) Lazyness. - Sometimes developers want to output directly to some visual display with the letter "Y" or "N" for human readableness, and they don't want to convert it themselves :)

There are all 3 bad reasons that I've heard/seen before.

Timothy Khouri
+1  A: 

I can't imagine any disadvantage in not being able to index a "BIT" column, as it would be unlikely to have enough different values to help the execution of a query at all.

I also imagine that in most cases the storage difference between BIT and CHAR(1) is negligible (is that CHAR a NCHAR? does it store a 16bit, 24bit or 32bit unicode char? Do we really care?)

MarkR
+1  A: 

They may have started development back with Microsoft SQl 6.5

Back then, adding a bit field to an existing table with data in place was a royal pain in the rear. Bit fields couldn't be null, so the only way to add one to an existing table was to create a temp table with all the existing fields of the target table plus the bit field, and then copy the data over, populating the bit field with a default value. Then you had to delete the original table and rename the temp table to the original name. Throw in some foriegn key relationships and you've got a long script to write.

Having said that, there were always 3rd party tools to help with the process. If the previous developer chose to use char fields in lieu of bit fields, the reason, in a nutshell, was probably laziness.

Aheho
A: 

I've seen worse ...

One O/R mapper I had occasion to work with used 'true' and 'false' as they could be cleanly cast into Java booleans.

Also, On a reporting database such as a data warehouse, the DB is the user interface (metadata based reporting tools notwithstanding). You might want to do this sort of thing as an aid to people developing reports. Also, an index with two values will still get used by index intersection operations on a star schema.

ConcernedOfTunbridgeWells
+1  A: 

This is terribly common in mainframe files, COBOL, etc.

If you only have one such column in a table, it's not that terrible in practice (no real bit-wasting); after all SQL Server will not let you say the natural WHERE BooleanColumn, you have to say WHERE BitColumn = 1 and IF @BitFlag = 1 instead of the far more natural IF @BooleanFlag. When you have multiple bit columns, SQL Server will pack them. The case of the Y/N should only be an issue if case-sensitive collation is used, and to stop invalid data, there is always the option of a constraint.

Having said all that, my personal preference is for bits and only allowing NULLs after careful consideration.

Apparently, bit columns aren't a good idea in MySQL.

Cade Roux
A: 

Sometimes such quirks are more associated with the application than the database. For example, handling booleans between PHP and MySQL is a bit hit-and-miss and makes for non-intuitive code. Using CHAR(1) fields and 'Y' and 'N' makes for much more maintainable code.

staticsan
A: 

I don't have any strong feelings either way. I can't see any great benefit to doing it one way over another. I know philosophically the bit fields are better for storage. My reality is that I have very few databases that contain a lot of logical fields in a single record. If I had a lot then I would definitely want bit fields. If you only have a few I don't think it matters. I currently work with Oracle and SQL server DB's and I started with Cullinet's IDMS database (1980) where we packed all kinds of data into records and worried about bits and bytes. While I do still worry about the size of data, I long ago stopped worrying about a few bits.

bruceatk