ansaurus

Question

What is the purpose of case sensitivity in languages?

Answer 1

+3 A:

Imagine you have an object called dog, which has a method called Bark(). Also you have defined a class called Dog, which has a static method called Bark(). You write dog.Bark(). So what's it going to do? Call the object's method or the static method from the class? (in a language where :: doesn't exist)

Alexander 2010-07-01 12:43:03
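
A rough Python sketch of the ambiguity described in the answer above. Python stands in for the unnamed language, and `Dog`, `Puppy`, `dog`, and the `CaselessNamespace` helper are hypothetical names invented for this illustration. With case-sensitive names the class and the object coexist; if names are folded to one case before lookup, the later binding silently replaces the earlier one.

```python
class Dog:
    @staticmethod
    def bark():
        return "static Dog.bark"


class Puppy:
    def bark(self):
        return "instance bark"


class CaselessNamespace:
    """Toy symbol table that folds names to lower case before storing them."""

    def __init__(self):
        self._names = {}

    def bind(self, name, value):
        self._names[name.lower()] = value

    def lookup(self, name):
        return self._names[name.lower()]


# Case-sensitive: Dog (the class) and dog (an object) are distinct names.
dog = Puppy()
case_sensitive_results = (Dog.bark(), dog.bark())

# Case-insensitive: both spellings share one slot, so the later binding wins.
ns = CaselessNamespace()
ns.bind("Dog", Dog)
ns.bind("dog", Puppy())
caseless_result = ns.lookup("Dog").bark()
```

In the toy namespace, `ns.lookup("Dog")` no longer reaches the class at all, which is one way to picture the question "which Bark() gets called?".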

Why not call them different things so you don't get mixed up further down the line? Having a class called Dog and an object called dog is going to get confusing. This is what I don't understand.

Tom Gullen 2010-07-01 12:45:01

Dog is too generic... An object from Dog such as labrador would make more sense.

2010-07-01 12:46:48

So now the compiler knows what method to call, but the reader will still be confused. I think you should use something better to differentiate between the two than capitalization of 1 letter.

Miel 2010-07-01 12:47:24

I have a few naming conventions. Private class members and local variables always start with a lower-case character. Global stuff like class names, public members, etc. start with an upper-case letter. This naming convention makes me naturally understand what I'm writing. It's very easy once you get the hang of it.

Alexander 2010-07-01 12:51:00

@Tom: I prefer case sensitivity because together with a good set of naming conventions, you can keep things straight *and* your choice of identifier names isn't artificially limited by different strings comparing "equal". If I have a class called `Dog`, why should I have to resort to calling my instances things like `dogVar` or `dog_` or even (shudder) `m_dog`? @m00st: what if `Dog` is the root of an inheritance hierarchy, and you need to name a `Dog` pointer which might point to an instance of any subclass of dog? :)

shambulator 2010-07-01 12:58:48

Alexander, that's probably the best/only good use for case sensitivity. Of course things may vary from language to language, but those are pretty good conventions to keep.

Wayne Werner 2010-07-01 12:59:01

Answer 2

+16 A:

It's not necessarily bad practice to have two members which are only differentiated by case, in languages which support it. For example, here's a fairly common bit of C#:

private readonly string name;
public string Name { get { return name; } }

Personally I'm quite happy with case sensitivity - particularly as it allows code like the above, where the member variable and property follow conventions anyway, avoiding confusion.

Note that case-sensitivity has a culture aspect too... not all cultures will deem the same characters to be equivalent...

Jon Skeet 2010-07-01 12:45:45

Thanks for the reply, could you expand on cultural aspect? I'm not sure I fully understand.

Tom Gullen 2010-07-01 12:50:52

The most commonly cited example is the Turkish variants of "I". English-style case-insensitivity does not handle them, because in Turkish "i" and "I" are not a case pair: the uppercase of "i" is "İ" and the lowercase of "I" is "ı".

maxwellb 2010-07-01 12:57:51
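
The Turkish "I" problem mentioned in the comment above can be seen with Python's built-in string methods, which apply the locale-independent Unicode mappings rather than Turkish rules. This is only a sketch; the constant and function names are made up for the illustration.

```python
DOTLESS_I = "\u0131"         # ı  LATIN SMALL LETTER DOTLESS I
DOTTED_CAPITAL_I = "\u0130"  # İ  LATIN CAPITAL LETTER I WITH DOT ABOVE


def round_trips(ch):
    """True if lowercasing the uppercase form gives back the original character."""
    return ch.upper().lower() == ch


# The English pairing: i <-> I.
ascii_pair = ("i".upper(), "I".lower())

# The dotless i uppercases to plain ASCII "I", so it cannot round-trip.
dotless_upper = DOTLESS_I.upper()

# The dotted capital lowercases to "i" plus U+0307 COMBINING DOT ABOVE.
dotted_lower = DOTTED_CAPITAL_I.lower()
```

A Turkish-aware comparison would need the pairs i/İ and ı/I instead, so the answer to "are these two identifiers equal?" depends on which culture's rules the compiler applies.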

+1 Jon, because I back my properties with lowercase fields of the same name, as well. I personally dislike using "_" prefixes for private fields.

maxwellb 2010-07-01 12:59:33

Also note that in this case, though different members are only differentiated by case, this code will be confusing to a reader.

Miel 2010-07-01 13:06:05

@Miel: Which code would be confusing to the reader? Any competent C# developer should be aware of the conventions for the code they're reading - which in this case would be variables (always private) being camel-cased, and properties being Pascal-cased.

Jon Skeet 2010-07-01 13:13:25

@Miel, how is it confusing? If you put your private field declaration 1000 lines away maybe it would be difficult to read, but here you've got everything together; there is no way to miss it.

rochal 2010-07-01 13:18:42

@Jon and @rochal I wanted to say _not_ confusing. I meant to contrast this with the answer Alexander gave. Hmm, now I seem to be confusing the readers...

Miel 2010-07-01 13:29:08

I have to say that is less ugly than naming the member `_name` or `m_name`. Still, I'm not a fan of bare setters and getters like this.

T.E.D. 2010-07-01 13:52:44

@joe: Whereas I've never been burned by case-sensitivity, but have been burned by not being allowed to name things how I want to. It seems to me it's a matter of individual taste.

Jon Skeet 2010-07-02 05:18:51

Answer 3

+1 A:

Case-sensitive comparison is (from a naive point of view that ignores canonical equivalence) trivial (simply compare code points), but case-insensitive comparison is not well defined and extremely complex in all cases, and the rules are impossible to remember. Implementing it is possible, but will inadvertently lead to unexpected and surprising behavior. BTW, some languages like Fortran and Basic have always been case-insensitive.

Philipp 2010-07-01 12:46:51
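
To make the contrast in the answer above concrete, here is a minimal Python sketch. The three helper functions are hypothetical names; `str.casefold` and `unicodedata.normalize` are the standard-library calls doing the real work.

```python
import unicodedata


def same_code_points(a, b):
    """Case-sensitive comparison: plain code point equality."""
    return a == b


def same_ignoring_case(a, b):
    """Caseless comparison using Unicode case folding."""
    return a.casefold() == b.casefold()


def same_canonical(a, b):
    """Code point equality after NFC normalization (canonical equivalence)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "\u00e9"      # é as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# Lowercasing both sides is not enough: "ß" stays "ß", while "SS" becomes "ss".
lower_matches = "straße".lower() == "STRASSE".lower()
fold_matches = same_ignoring_case("straße", "STRASSE")
```

Here `lower_matches` is false while `fold_matches` is true, and the two spellings of "é" only compare equal after normalization, which is the canonical-equivalence caveat the answer mentions.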

I would think that case "sensitive" comparison would be more trivial. How does one "compare codepoints" to say that "a"~"A"? In one character encoding, one can compare case sensitively by doing simple binary comparison...

maxwellb 2010-07-01 13:01:29

I confused case-sensitivity with case-insensitivity.

Philipp 2010-07-01 13:10:23

Every searcher I have ever used can do case-insensitive searches. Many (eg: Emacs') do it by default. Yes, things get muddy when you are dealing with non-English languages, but as long as the rules for your compiler and your searcher are the same, there is no real problem there.

T.E.D. 2010-07-01 14:08:46

You're making the mistake of assuming that code files are written in Unicode. Nearly every source code file, even inside Microsoft, is written (not intentionally) in the local code page, which is almost certainly "Windows-1252".

Ian Boyd 2010-07-02 13:18:12

Answer 4

+2 A:

I'm sure originally it was a performance consideration. Converting a string to upper or lower case for caseless comparison isn't an expensive operation exactly, but it's not free either, and on old systems it may have added complexity that the systems of the day weren't ready to handle.

And now, of course, languages like to be compatible with each other (VB for example can't distinguish between C# classes or functions that differ only in case), people are used to giving things names that differ only in case (see Jon Skeet's answer - I do that a lot), and the value of caseless languages wasn't really enough to outweigh these two.

Sean Edwards 2010-07-01 12:51:17

I'm equally sure that originally it was nothing to do with performance that case was ignored in programming languages. Instead it was down to the 5-bit encoding schemes used on (many) 1950s teleprinters which simply didn't have both upper and lower case letters. YOU DONT NEED UPPERCASE OR PUNCTUATION IN A TELEGRAM STOP

High Performance Mark 2010-07-01 13:08:59

I think it counts against the performance theory that case-insensitive languages like Fortran and Basic are among the oldest of all languages.

Philipp 2010-07-01 13:12:58

Heh, Philipp wins. Didn't even think of that.

Sean Edwards 2010-07-01 13:22:57

+1 This is probably the real reason case sensitivity exists in some languages. The people writing the C compiler in the 1970's didn't want to burn CPU time during compile converting each ASCII uppercase character to lowercase. Presto, C is case sensitive. C begat C++. C++ begat Java. Java begat C#. And here we have it today, nobody willing to revisit the issue

Ian Boyd 2010-07-02 13:15:54

Answer 5

+8 A:

One of the biggest reasons for case-sensitivity in programming languages is readability. Things that mean the same should also look the same.

I found the following interesting example by M. Sandin in a related discussion:

I used to believe case sensitivity was a mistake, until I did this in the case insensitive language PL/SQL (syntax now entirely forgotten):
function IsValidUserLogin(user:string, password :string):bool begin
   result = select * from USERS
            where USER_NAME=user and PASSWORD=password;
   return not is_empty(result);
end
This passed unnoticed for several months on a low-volume production system, and no harm came of it. But it is a nasty bug, sprung from case insensitivity, coding conventions, and the way humans read code. The lesson for me was that: Things that are the same should look the same.

Can you see the problem immediately? I couldn't...

0xA3 2010-07-01 12:51:31

This is case-insensitive code, so `PASSWORD=password` is always true.

Blorgbeard 2010-07-01 12:57:10

@Tom Gullen: It's explained [further down](http://lambda-the-ultimate.org/node/1114#comment-12098) in the linked thread.

0xA3 2010-07-01 12:58:41

my guess? PASSWORD=password always evaluates to true? Hence any valid username will login... at least if I'm right :P

Wayne Werner 2010-07-01 13:01:11

my guess is that the problem lies with "PASSWORD=password". With a case-insensitive language this is equivalent to "password=password" which probably doesn't do what the author expects.

Bryan Oakley 2010-07-01 13:01:15

PASSWORD=password is presumably intended to be true if the value in the table's PASSWORD column is the same as the passed in parameter. Unfortunately, they both refer either to the parameter or the column. The example is, however, erroneous. `PASSWORD=password` only looks OK to us because we have been programming for twenty years or more in case sensitive languages and we are conditioned to believe that PASSWORD and password must be two different entities.

JeremyP 2010-07-01 13:08:47
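
The bug discussed in these comments can be reproduced in a small, self-contained way. The sketch below uses SQLite through Python's `sqlite3` module rather than PL/SQL (the original syntax is, as the quote says, forgotten), and the table contents and function names are assumptions made for the illustration. SQLite treats identifiers case-insensitively, so `PASSWORD` and `password` in the query both name the column.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_name TEXT, password TEXT)")
    # Plain-text password only to keep the illustration short.
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def is_valid_login_buggy(conn, user, password):
    # PASSWORD and password both resolve to the column, so the condition
    # compares the column with itself; the Python argument is never used.
    rows = conn.execute(
        "SELECT * FROM users WHERE user_name = ? AND PASSWORD = password",
        (user,),
    ).fetchall()
    return len(rows) > 0


def is_valid_login_fixed(conn, user, password):
    # Bind the supplied password as a parameter instead of naming it.
    rows = conn.execute(
        "SELECT * FROM users WHERE user_name = ? AND password = ?",
        (user, password),
    ).fetchall()
    return len(rows) > 0
```

With the buggy version, any existing user name logs in regardless of the password supplied, which matches the guesses in the comments above.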

@JeremyP: Good point about our habits, but then again: Why not force different things to look different and same things to look the same? After all, this is what we are used to from our everyday life (where casing *may* totally change the meaning... do you like reading? or do you like Reading?)

0xA3 2010-07-01 13:13:12

@JeremyP: Maybe we wouldn't need indenting either if we were used to code without it. I think people would start making conventions even if we had case-insensitivity, for the same reasons we have indent conventions.

luiscubal 2010-07-01 15:15:55

Answer 6

+8 A:

I like case sensitivity in order to differentiate between class and instance.

Form form = new Form();

If you can't do that, you end up with variables called myForm or form1 or f, which are not as clean and descriptive as plain old form.

Case sensitivity also means that you don't have references to form, FORM and Form which all mean the same thing. I find it difficult to read such code. I find it much easier to scan code where all references to the same variable look exactly the same.

Blorgbeard 2010-07-01 12:53:13
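
For code written in a case-insensitive language, the "form, FORM and Form" situation from the answer above could be flagged mechanically. The following is a minimal sketch, not a real linter: the regular expression is a naive assumption about what an identifier looks like and it also matches words inside strings and comments.

```python
import re
from collections import defaultdict

# Naive identifier pattern (an assumption; real languages differ).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def inconsistent_spellings(source):
    """Group identifiers by lower-cased form and keep groups with several spellings."""
    groups = defaultdict(set)
    for name in IDENTIFIER.findall(source):
        groups[name.lower()].add(name)
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }


sample = "form.Show(); FORM.Hide(); Form.Close();"
report = inconsistent_spellings(sample)
```

In a case-sensitive language the same check would wrongly flag the deliberate `Form form` pairing, so it only makes sense where all spellings denote one entity.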

Having worked a lot in case-insensitive languages, I find that coming up with different names for classes and objects is a good skill to acquire.

T.E.D. 2010-07-01 13:09:23

This problem doesn't appear in e.g. Visual Basic because type identifiers and object identifiers are clearly separated there (`Dim form As Form` etc., only types may appear after `As`). Again it's the C syntax which is flawed, not the idea in general.

Philipp 2010-07-01 13:14:47

@T.E.D. It depends. Sometimes it's the most obvious and useful name - sometimes it isn't. It's nice to at least have the option.

Jon Skeet 2010-07-01 13:27:12

"It's nice to at least have the option" would actually be a good summation of the entire C design philosophy. The problem comes when you are trying to maintain someone else's code. Then you find it would often have been much nicer if they *didn't* have the option.

T.E.D. 2010-07-02 13:06:29

Answer 7

+2 A:

The reason you can't understand why case-sensitivity is a good idea, is because it is not. It is just one of the weird quirks of C (like 0-based arrays) that now seem "normal" because so many languages copied what C did.

C uses case-sensitivity in identifiers, but from a language design perspective that was a weird choice. Most languages that were designed from scratch (with no consideration given to being "like C" in any way) were made case-insensitive. This includes Fortran, Cobol, Lisp, and much of the Algol family of languages (Pascal, Ada, etc.), although Modula-2 and Oberon are case-sensitive.

Scripting languages are a mixed bag. Many were made case-sensitive because the Unix filesystem was case-sensitive and they had to interact sensibly with it. C kind of grew up organically in the Unix environment, and probably picked up the case-sensitive philosophy from there.

T.E.D. 2010-07-01 13:10:51

Computers were much less powerful in the 1970s than they are today, but I wonder if maybe it wasn't a case of premature optimization 40 years ago.

Ian Boyd 2010-07-02 13:19:23

The entire C language could be looked at that way. For instance, pre and post increments and decrements were enshrined as operators because C's original (CISC) CPU had opcode variants for most operations to do that.

T.E.D. 2010-07-02 14:02:08

Answer 8

+4 A:

Something I have always wondered, is why are languages designed to be case sensitive?

Ultimately, it's because it is easier to implement a case-sensitive comparison correctly; you just compare bytes/characters without any conversions. You can also do other things, like hashing, really easily.

Why is this an issue? Well, case-insensitivity is rather hard to add unless you're in a tiny domain of supported characters (notably, US-ASCII). Case conversion rules vary by locale (the Turkish rules are not the same as those in the rest of the world) and there's no guarantee that flipping a single bit will do the right thing, or that it is always the same bit and under the same preconditions. (IIRC, there's some really complex rules in some language for throwing away diacritics when converting vowels to upper case, and reintroducing them when converting to lower case. I forget exactly what the details are.)

If you're case sensitive, you just ignore all that; it's just simpler. (Mind you, you still ought to pay attention to Unicode normalization forms, but that's another story and it applies whatever case rules you're using.)

Donal Fellows 2010-07-01 13:50:33
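
The "flipping a single bit" remark in the answer above refers to the ASCII layout, where upper- and lower-case letters differ only in bit 0x20. A small Python sketch (the helper name is hypothetical) shows the trick working for ASCII and failing outside it.

```python
def flip_case_bit(ch):
    """ASCII trick: toggle bit 0x20 to switch between 'a'-'z' and 'A'-'Z'."""
    return chr(ord(ch) ^ 0x20)


# Works for every ASCII lower-case letter.
ascii_ok = all(flip_case_bit(c) == c.upper() for c in "abcdefghijklmnopqrstuvwxyz")

# U+0101 (ā) and its uppercase U+0100 (Ā) differ in bit 0x01, not 0x20.
macron_diff = ord("\u0101") ^ ord("\u0101".upper())

# U+00FF (ÿ): flipping 0x20 yields U+00DF (ß), but the real uppercase is U+0178 (Ÿ).
y_flipped = flip_case_bit("\u00ff")
y_upper = "\u00ff".upper()
```

So a compiler that wants caseless identifiers beyond US-ASCII needs full mapping tables rather than a bit mask, which is the extra cost the answer describes.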

Also be glad that nobody does automated title-casing for computer languages (and I'm actually not sure how that would work); the rules for *that* are much more complex!

Donal Fellows 2010-07-01 20:42:37
