
Possible Duplicates:
Is there any advantage of being a case-sensitive programming language?
Why are many languages case sensitive?

Something I have always wondered, is why are languages designed to be case sensitive?

My pea brain can't fathom any possible reason why it is helpful.

But I'm sure there is one out there. And before anyone says it, having two variables called dog and Dog that are distinguished only by case is really, really bad practice, right?

Any comments appreciated, along with perhaps any history on the matter! I'm insensitive about case sensitivity generally, but sensitive about sensitivity around case sensitivity so let's keep all answers and comments civil!

+3  A: 

Imagine you have an object called dog, which has a method called Bark(). Also you have defined a class called Dog, which has a static method called Bark(). You write dog.Bark(). So what's it going to do? Call the object's method or the static method from the class? (in a language where :: doesn't exist)

Alexander
Why not call them different things so you don't get mixed up further down the line? Having a class called Dog and an object called dog is going to get confusing. This is what I don't understand.
Tom Gullen
Dog is too generic... An object from Dog such as labrador would make more sense.
So now the compiler knows what method to call, but the reader will still be confused. I think you should use something better to differentiate between the two than capitalization of 1 letter.
Miel
I have a few naming conventions. Private class members and local variables always start with a lower-case letter. Global stuff like class names, public members, etc. starts with an upper-case letter. This naming convention makes me naturally understand what I'm writing. It's very easy once you get the hang of it.
Alexander
@Tom: I prefer case sensitivity because together with a good set of naming conventions, you can keep things straight *and* your choice of identifier names isn't artificially limited by different strings comparing "equal". If I have a class called `Dog`, why should I have to resort to calling my instances things like `dogVar` or `dog_` or even (shudder) `m_dog`? @m00st: what if `Dog` is the root of an inheritance hierarchy, and you need to name a `Dog` pointer which might point to an instance of any subclass of dog? :)
shambulator
Alexander, that's probably the best/only good use for case sensitivity. Of course things may vary from language to language, but those are pretty good conventions to keep.
Wayne Werner
+16  A: 

It's not necessarily bad practice to have two members which are only differentiated by case, in languages which support it. For example, here's a fairly common bit of C#:

private readonly string name;
public string Name { get { return name; } }

Personally I'm quite happy with case sensitivity - particularly as it allows code like the above, where the member variable and property follow conventions anyway, avoiding confusion.

Note that case-sensitivity has a culture aspect too... not all cultures will deem the same characters to be equivalent...

Jon Skeet
Thanks for the reply, could you expand on cultural aspect? I'm not sure I fully understand.
Tom Gullen
The most commonly given example is the Turkish variants of "I". English-style case-insensitivity gets this wrong, because in Turkish "i" and "I" are not upper- and lower-case forms of the same letter.
maxwellb
+1 Jon, because I back my properties with lowercase samenames, as well. I personally dislike using "_" prefixes for private fields.
maxwellb
Also note that in this case, though different members are only differentiated by case, this code will be confusing to a reader.
Miel
@Miel: Which code would be confusing to the reader? Any competent C# developer should be aware of the conventions for the code they're reading - which in this case would be variables (always private) being camel-cased, and properties being Pascal-cased.
Jon Skeet
@Miel, how is it confusing? If you put your private field declaration 1000 lines away, maybe it would be difficult to read, but here you've got everything together; there is no way to miss it.
rochal
@Jon and @rochal I wanted to say _not_ confusing. I meant to contrast this with the answer Alexander gave. Hmm, now I seem to be confusing the readers...
Miel
I have to say that is less ugly than naming the member `_name` or `m_name`. Still, I'm not a fan of bare setters and getters like this.
T.E.D.
@joe: Whereas I've never been burned by case-sensitivity, but have been burned by not being allowed to name things how I want to. It seems to me it's a matter of individual taste.
Jon Skeet
+1  A: 

Case-sensitive comparison is (from a naive point of view that ignores canonical equivalence) trivial: simply compare code points. Case-insensitive comparison, by contrast, is not well defined and extremely complex in all cases, and the rules are impossible to remember. Implementing it is possible, but will inadvertently lead to unexpected and surprising behavior. BTW, some languages like Fortran and Basic have always been case-insensitive.
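
To make the contrast concrete, here is a minimal Python sketch (Python chosen only for illustration; the helper names `same_code_points` and `caseless_equal` are made up for this example). The first function is the naive code point comparison; the second is one possible caseless comparison, following the normalize-then-fold recipe described in Python's Unicode HOWTO. It is not the only reasonable definition, which is rather the point.

```python
import unicodedata

def same_code_points(a: str, b: str) -> bool:
    """Naive case-sensitive comparison: the code point sequences must match."""
    return [ord(c) for c in a] == [ord(c) for c in b]

def caseless_equal(a: str, b: str) -> bool:
    """One possible caseless comparison: decompose, case-fold, decompose again."""
    def fold(s: str) -> str:
        nfd = unicodedata.normalize("NFD", s)
        return unicodedata.normalize("NFD", nfd.casefold())
    return fold(a) == fold(b)

# Two canonically equivalent spellings of "e with acute accent".
composed = "\u00e9"     # single precomposed code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

print(same_code_points("dog", "Dog"))            # differs by case
print(same_code_points(composed, decomposed))    # differs by encoding only
print(caseless_equal("dog", "Dog"))
print(caseless_equal(composed, decomposed))
```

Note that the naive comparison also treats the two spellings of "é" as different, which is the canonical equivalence caveat mentioned above.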

Philipp
I would think that case "sensitive" comparison would be more trivial. How does one "compare codepoints" to say that "a"~"A"? In one character encoding, one can compare case sensitively by doing simple binary comparison...
maxwellb
I confused case-sensitivity with case-insensitivity.
Philipp
Every searcher I have ever used can do case-insensitive searches. Many (eg: Emacs') do it by default. Yes, things get muddy when you are dealing with non-English languages, but as long as the rules for your compiler and your searcher are the same, there is no real problem there.
T.E.D.
You're making the mistake of assuming that code files are written in Unicode. Nearly every source code file, even those inside Microsoft, is written (not intentionally) in the local code page, which is almost certainly "Windows-1252".
Ian Boyd
+2  A: 

I'm sure originally it was a performance consideration. Converting a string to upper or lower case for caseless comparison isn't an expensive operation exactly, but it's not free either, and on old systems it may have added complexity that the systems of the day weren't ready to handle.

And now, of course, languages like to be compatible with each other (VB for example can't distinguish between C# classes or functions that differ only in case), people are used to naming things the same text but with different cases (See Jon Skeet's answer - I do that a lot), and the value of caseless languages wasn't really enough to outweigh these two.
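
As a rough illustration of that interop problem, here is a small Python sketch that flags identifiers which a case-insensitive consumer could not tell apart. The function name `find_case_collisions` and the sample member names are hypothetical; this is not how any particular compiler checks it.

```python
from collections import defaultdict

def find_case_collisions(names):
    """Group identifiers by their case-folded form and keep the groups
    that contain more than one distinct spelling."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return {key: group for key, group in groups.items() if len(set(group)) > 1}

# Hypothetical public members of a class written in a case-sensitive language.
members = ["Name", "name", "Age", "GetAge"]
print(find_case_collisions(members))
```

Only the `Name`/`name` pair is reported, since those are the two members a case-insensitive language would see as one.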

Sean Edwards
I'm equally sure that originally it was nothing to do with performance that case was ignored in programming languages. Instead it was down to the 5-bit encoding schemes used on (many) 1950s teleprinters which simply didn't have both upper and lower case letters. YOU DONT NEED UPPERCASE OR PUNCTUATION IN A TELEGRAM STOP
High Performance Mark
I think it counts against the performance theory that case-insensitive languages like Fortran and Basic are among the oldest of all languages.
Philipp
Heh, Philipp wins. Didn't even think of that.
Sean Edwards
+1 This is probably the real reason case sensitivity exists in some languages. The people writing the C compiler in the 1970s didn't want to burn CPU time during compilation converting each ASCII uppercase character to lowercase. Presto, C is case sensitive. C begat C++. C++ begat Java. Java begat C#. And here we are today, with nobody willing to revisit the issue.
Ian Boyd
+8  A: 

One of the biggest reasons for case-sensitivity in programming languages is readability. Things that mean the same should also look the same.

I found the following interesting example by M. Sandin in a related discussion:

I used to believe case sensitivity was a mistake, until I did this in the case insensitive language PL/SQL (syntax now entirely forgotten):

function IsValidUserLogin(user:string, password :string):bool begin
   result = select * from USERS
            where USER_NAME=user and PASSWORD=password;
   return not is_empty(result);
end

This passed unnoticed for several months on a low-volume production system, and no harm came of it. But it is a nasty bug, sprung from case insensitivity, coding conventions, and the way humans read code. The lesson for me was that: Things that are the same should look the same.

Can you see the problem immediately? I couldn't...
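
For anyone who wants to reproduce the effect, here is a minimal sketch using Python's `sqlite3` module instead of PL/SQL (SQLite also treats unquoted identifiers case-insensitively). The table layout, the sample row and both function names are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

def is_valid_login_buggy(user, password):
    # The `password` argument is never used: inside the SQL text, PASSWORD and
    # password both name the column, so the second condition compares the
    # column with itself and holds for every non-NULL stored password.
    rows = conn.execute(
        "SELECT * FROM users WHERE USER_NAME = ? AND PASSWORD = password",
        (user,),
    ).fetchall()
    return len(rows) > 0

def is_valid_login(user, supplied_password):
    # Binding the supplied value as a parameter keeps it apart from the column.
    rows = conn.execute(
        "SELECT * FROM users WHERE user_name = ? AND password = ?",
        (user, supplied_password),
    ).fetchall()
    return len(rows) > 0

print(is_valid_login_buggy("alice", "wrong guess"))
print(is_valid_login("alice", "wrong guess"))
```

The buggy version accepts any password for an existing user, while the second version only accepts the stored one.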

0xA3
This is case-insensitive code, so `PASSWORD=password` is always true.
Blorgbeard
@Tom Gullen: It's explained [further down](http://lambda-the-ultimate.org/node/1114#comment-12098) in the linked thread.
0xA3
my guess? PASSWORD=password always evaluates to true? Hence any valid username will login... at least if I'm right :P
Wayne Werner
my guess is that the problem lies with "PASSWORD=password". With a case-insensitive language this is equivalent to "password=password" which probably doesn't do what the author expects.
Bryan Oakley
PASSWORD=password is presumably intended to be true if the value in the table's PASSWORD column is the same as the passed in parameter. Unfortunately, they both refer either to the parameter or the column. The example is, however, erroneous. `PASSWORD=password` only looks OK to us because we have been programming for twenty years or more in case sensitive languages and we are conditioned to believe that PASSWORD and password must be two different entities.
JeremyP
@JeremyP: Good point about our habits, but then again: Why not force different things to look different and same things to look the same? After all, this is what we are used to from our everyday life (where casing *may* totally change the meaning... do you like reading? or do you like Reading?)
0xA3
@JeremyP: Maybe we wouldn't need indenting either if we were used to code without it. I think people would start making conventions even if we had case-insensitivity, for the same reasons we have indent conventions.
luiscubal
+8  A: 

I like case sensitivity in order to differentiate between class and instance.

Form form = new Form();

If you can't do that, you end up with variables called myForm or form1 or f, which are not as clean and descriptive as plain old form.

Case sensitivity also means that you don't have references to form, FORM and Form which all mean the same thing. I find it difficult to read such code. I find it much easier to scan code where all references to the same variable look exactly the same.
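
The same pattern in another case-sensitive language, as a tiny Python sketch (the `Form` class here is a made-up stand-in, not a real UI class):

```python
class Form:
    """Hypothetical stand-in for a UI form class."""
    def __init__(self, title="untitled"):
        self.title = title

# The class and the instance differ only in case, and both stay usable.
form = Form("login")

print(type(form) is Form)
print(form.title)
```

Here `Form` always refers to the class and `form` to the instance, and a stray `FORM` would be an undefined name rather than a third spelling of the same thing.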

Blorgbeard
Having worked a lot in case-insensitive languages, I find that coming up with different names for classes and objects is a good skill to acquire.
T.E.D.
This problem doesn't appear in e.g. Visual Basic because type identifiers and object identifiers are clearly separated there (`Dim form As Form` etc., only types may appear after `As`). Again it's the C syntax which is flawed, not the idea in general.
Philipp
@T.E.D. It depends. Sometimes it's the most obvious and useful name - sometimes it isn't. It's nice to at least have the option.
Jon Skeet
"It's nice to at least have the option" would actually be a good summation of the entire C design philosophy. The problem comes when you are trying to maintain someone else's code. Then you find it would often have been much nicer if they *didn't* have the option.
T.E.D.
+2  A: 

The reason you can't understand why case-sensitivity is a good idea, is because it is not. It is just one of the weird quirks of C (like 0-based arrays) that now seem "normal" because so many languages copied what C did.

C uses case-sensitivity in identifiers, but from a language design perspective that was a weird choice. Most languages that were designed from scratch (with no consideration given to being "like C" in any way) were made case-insensitive. This includes Fortran, Cobol, Lisp, and almost the entire Algol family of languages (Pascal, Modula-2, Oberon, Ada, etc.)

Scripting languages are a mixed bag. Many were made case-sensitive because the Unix filesystem was case-sensitive and they had to interact sensibly with it. C kind of grew up organically in the Unix environment, and probably picked up the case-sensitive philosophy from there.

T.E.D.
I know computers were much less powerful in the 1970s than they are today, but I wonder if maybe it wasn't a case of premature optimization 40 years ago.
Ian Boyd
The entire C language could be looked at that way. For instance, pre and post increments and decrements were enshrined as operators because C's original (CISC) CPU had opcode variants for most operations to do that.
T.E.D.
+4  A: 

Something I have always wondered, is why are languages designed to be case sensitive?

Ultimately, it's because it is easier to implement a case-sensitive comparison correctly; you just compare bytes/characters without any conversions. You can also do other things, like hashing, really easily.

Why is this an issue? Well, case-insensitivity is rather hard to add unless you're in a tiny domain of supported characters (notably, US-ASCII). Case conversion rules vary by locale (the Turkish rules are not the same as those in the rest of the world) and there's no guarantee that flipping a single bit will do the right thing, or that it is always the same bit and under the same preconditions. (IIRC, there are some really complex rules in some language for throwing away diacritics when converting vowels to upper case, and reintroducing them when converting to lower case. I forget exactly what the details are.)

If you're case sensitive, you just ignore all that; it's just simpler. (Mind you, you still ought to pay attention to Unicode normalization forms, but that's another story and it applies whatever case rules you're using.)
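
A few of these surprises can be seen with Python's built-in, locale-independent case mappings. This is only a sketch: the helper `round_trips` is made up for the example, and a Turkish-aware implementation would behave differently again.

```python
def round_trips(s: str) -> bool:
    """True if upper-casing and then lower-casing gives back the original."""
    return s.upper().lower() == s

# Case-sensitive identifiers can be used as hash keys as-is.
symbols = {"dog": 1, "Dog": 2}

# Case conversion is not a one-to-one, single-bit flip outside ASCII.
sharp_s_upper = "\u00df".upper()   # German sharp s becomes two letters
dotted_i_lower = "\u0130".lower()  # capital I with dot above becomes "i" plus a combining dot
dotless_i_upper = "\u0131".upper() # dotless i upper-cases to a plain "I"

print(len(symbols))
print(sharp_s_upper, len(dotted_i_lower), dotless_i_upper)
print(round_trips("dog"), round_trips("\u00df"), round_trips("\u0131"))
```

With these default mappings, the sharp s and the dotless i do not survive an upper/lower round trip, so a compiler that "just lower-cases everything" would have to pick rules that are wrong for somebody.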

Donal Fellows
Also be glad that nobody does automated title-casing for computer languages (and I'm actually not sure how that would work); the rules for *that* are much more complex!
Donal Fellows