For .NET fields and properties that by definition only contain a single character, is there any benefit to defining them as char instead of string? Or is that an incorrect use of the char data type?

I'm thinking of a field that can hold an M or F for sex, or a middle initial, or an indicator stored in the database as a Y or N. I typically define these as string, but I'm wondering whether I should be defining them as char instead.

+2  A: 

Strings are essentially sequences of characters, so unless you expect to need the properties and methods that come with a string (such as Length and Substring), I would say stick with a char, since you are certain it is only one character.

TheTXI
+11  A: 

Yes, char is the correct data type for this. The general rule of thumb is: always choose the narrowest (most restrictive) data type possible in the context. This means that any invalid (or out of range) data will not be accepted if it gets input by accident, thus your program/database will become less error prone.
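As a rough sketch of that point in C# (the Person class and its members are made up for illustration, not taken from the question): a char property cannot hold more than one character by construction, whereas a string property needs a hand-written runtime check to enforce the same rule.

```csharp
using System;

// Hypothetical class, for illustration only.
public class Person
{
    // A char can never hold "MF" or an empty value;
    // the compiler rejects such assignments outright.
    public char MiddleInitial { get; set; }

    // Backing field for the string version; defaults to a single space.
    private string _middleInitialText = " ";

    // The string version has to enforce the one-character rule by hand.
    public string MiddleInitialText
    {
        get { return _middleInitialText; }
        set
        {
            if (value == null || value.Length != 1)
                throw new ArgumentException("Expected exactly one character.", nameof(value));
            _middleInitialText = value;
        }
    }
}
```

Note that char still accepts any single character (for example 'Q' where only 'M' or 'F' makes sense), so checking the allowed values remains a separate concern.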

To consider it from another angle: when you want to use an integer value in code, do you create an integer array of size one? Of course not, because although it would do the job, it is quite unnecessary. Compare:

int[] value = new int[1] { 123 };
value[0] = 456;

to:

int value = 123;
value = 456;

The first is simply absurd, as I'm sure you see. Admittedly, this isn't so obvious in the context of databases (usage is about as simple if you choose a string data type), but I think the analogy helps explain the logic behind the choice.

In addition, when it comes to manipulating the values in code, you should find that having the field in the more appropriate data type (i.e. char) makes it slightly more straightforward to use correctly.

In terms of performance, I wouldn't imagine that using string would give you any significant overhead. Ok, so it takes up marginally more memory, but that's probably not an issue. I do however think that the other reasons I have just proposed explain why you should choose char.

Noldorin
Good point about the invalid data which I did not include in my own answer.
TheTXI
Yeah, there are really two main reasons here: restrictiveness and protection against invalid data.
Noldorin
+4  A: 

A Char and a 1 character String are not going to differ much, performance-wise. The CLR interns string literals for you, so that is something to consider (but again, I cannot imagine the performance benefit being significant).

Remember that a programming language is a useful tool as an abstraction. Always create useful abstractions. In other words I would define your fields like this:

enum Gender { Male, Female }

class Foo
{
    Gender gender;
    Char middleInitial;
    Boolean indicator;
}

This class is semantically valuable now because the datatypes indicate their use. Always use the right tool for the job.

Andrew Hare
Agree with your enum example. It's certainly much more obvious than a single char.
Skurmedel
+1  A: 

The string class will have some overhead. If you have char, will you need to represent empty with 0x00 or <space> or some other unknown indicator? Will this be going to or coming from a database, and what convention are you going to use there?

I typically prefer enums/bools/native semantics in the application layer for this, translating from and to whatever database convention is required. After all, Person.Gender == Female and if (!Person.IsDeceased) {} are a lot more readable and clear.
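A minimal sketch of what such a translation layer might look like in C#. The DbCodes class and its method names are hypothetical, and the 'M'/'F', 'Y'/'N' and space-for-empty conventions are assumptions based on the question, not a fixed standard:

```csharp
using System;

public enum Gender { Male, Female }

// Hypothetical helpers that translate between single-character
// database codes and the richer types used in the application layer.
public static class DbCodes
{
    public static Gender ToGender(char code)
    {
        switch (code)
        {
            case 'M': return Gender.Male;
            case 'F': return Gender.Female;
            default:
                throw new ArgumentOutOfRangeException(nameof(code), code, "Expected 'M' or 'F'.");
        }
    }

    public static char FromGender(Gender gender)
    {
        switch (gender)
        {
            case Gender.Male: return 'M';
            case Gender.Female: return 'F';
            default:
                throw new ArgumentOutOfRangeException(nameof(gender), gender, "Unknown gender value.");
        }
    }

    public static bool ToFlag(char code)
    {
        switch (code)
        {
            case 'Y': return true;
            case 'N': return false;
            default:
                throw new ArgumentOutOfRangeException(nameof(code), code, "Expected 'Y' or 'N'.");
        }
    }

    public static char FromFlag(bool flag)
    {
        return flag ? 'Y' : 'N';
    }

    // Assumed convention: a single space in the database means
    // "no middle initial", represented in code as a null char?.
    public static char? ToMiddleInitial(char code)
    {
        return code == ' ' ? (char?)null : code;
    }

    public static char FromMiddleInitial(char? initial)
    {
        return initial ?? ' ';
    }
}
```

With something like this in place, the rest of the code only ever sees Gender, bool and char?, and the database convention is confined to one spot.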

Cade Roux
Good question. In the specific class I'm working on right now, the data will be stored in a database. The table (not my design) uses space padding instead of nulls, so a single space character will map directly to the database. For other cases where I need to persist a null instead of a space, that's a good point for me to consider.
John M Gant
Added a note about my personal design preference, which is similar to Andrew Hare's point.
Cade Roux