views:

354

answers:

9

This may seem like a dumb question, but still I don't know the answer.

Why don't programming languages allow spaces in names (for instance, method names)?

I understand it is to facilitate ( allow ) the parsing, and at some point it would be impossible to parse anything if spaces were allowed.

Nowadays we are so used to it that the norm is not to see spaces.

For instance:

 object.saveData( data );
 object.save_data( data )
 object.SaveData( data );
 [object saveData:data];

etc.

Could be written as:

 object.save data( data )  // looks ugly, but that's the "natural" way.

If it is only for parsing, I guess the identifier could be whatever sits between the . and the (. Of course, procedural languages wouldn't be able to use this because there is no '.', but OO languages do have one.

I wonder if parsing is the only reason and, if it is, how important it is (I assume it is important and that it would be impossible to do otherwise, unless all the programming language designers just... forgot the option).

EDIT

I agree that allowing spaces in identifiers in general (as in the FORTRAN example) is a bad idea. Narrowing it to OO languages, and specifically to methods, I don't see a reason (I don't mean there isn't one) why it should be that way. After all, the . and the first ( could be used as delimiters.

And forget the saveData method; consider this one:

key.ToString().StartsWith("TextBox")

as:

key.to string().starts with("textbox");

Should this be CW? Just edit it 6 times :P

+37  A: 

Be cause i twoul d makepa rsing suc hcode reallydif ficult.

Jay Riggs
+1 for making me laugh and giving the correct answer at the same time XD
Maximilian Mayerl
Ironically, most people had no problem parsing this text. Bad example. :P
theycallmemorty
Yesparsingiseasierwithoutspaces. AndWeAre usedToIt, but_as_a_side_effect. :(
OscarRyz
This is the first time I upvoted an answer that is, IMO, wrong.
Nikolai Ruhe
+8  A: 

Yes, it's the parsing - both human and computer. It's easier to read and easier to parse if you can safely assume that whitespace doesn't matter. Otherwise, you can have potentially ambiguous statements, statements where it's not clear how things go together, statements that are hard to read, etc.

jprete
+12  A: 

I used an implementation of ALGOL (c. 1978) which, extremely annoyingly, required quoting of what are now known as reserved words, and allowed spaces in identifiers:

  "proc" filter = ("proc" ("int") "bool" p, "list" l) "list":
     "if" l "is" "nil" "then" "nil"
     "elif" p(hd(l)) "then" cons(hd(l), filter(p,tl(l)))
     "else" filter(p, tl(l))
     "fi";

Also, FORTRAN (the capitalized form means F77 or earlier), was more or less insensitive to spaces. So this could be written:

  799 S = FLO AT F (I A+I B+I C) / 2 . 0
      A  R E  A = SQ R T ( S *(S - F L O ATF(IA)) * (S - FLOATF(IB)) *
     +     (S - F LOA TF (I C)))

which was syntactically identical to

  799 S = FLOATF (IA + IB + IC) / 2.0
      AREA = SQRT( S * (S - FLOATF(IA)) * (S - FLOATF(IB)) *
     +     (S - FLOATF(IC)))

With that kind of history of abuse, why make parsing difficult for humans? Let alone complicate computer parsing.

wallyk
Not to mention: DO 10 I = 1.10 (an assignment) and DO 10 I = 1,10 (the start of a DO loop).
Jonathan Leffler
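The `DO 10 I` pitfall above can be sketched in a few lines of Python. This is only a toy illustration: `strip_blanks`, `classify`, and the regular expressions are assumptions made for the example, not how any real FORTRAN compiler was written. Blanks are dropped first, and after that only the comma separates a DO loop from an assignment to a variable named `DO10I`.

```python
import re

def strip_blanks(stmt: str) -> str:
    """Fixed-form FORTRAN ignored blanks, so drop them all (toy model)."""
    return stmt.replace(" ", "")

def classify(stmt: str) -> str:
    """Classify a statement as a DO loop or an assignment (toy rule)."""
    s = strip_blanks(stmt)
    # DO <label> <var> = <start>,<end>[,<step>]: the comma makes it a loop.
    if re.fullmatch(r"DO\d+[A-Z][A-Z0-9]*=[^,]+,[^,]+(,[^,]+)?", s):
        return "do-loop"
    # Otherwise everything left of '=' is read as one variable name.
    if re.fullmatch(r"[A-Z][A-Z0-9]*=.+", s):
        return "assignment to " + s.split("=")[0]
    return "unknown"

print(classify("DO 10 I = 1,10"))
print(classify("DO 10 I = 1.10"))
print(strip_blanks("A  R E  A = SQ R T ( S )"))
```

A single mistyped character changes the statement kind, and nothing in the spacing warns the reader.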
It's actually a requirement of the Algol spec. It always allowed whitespace in identifiers (but at the same time ignored it, so `foo bar(x)` was the same as `foobar(x)`), and required that keywords and identifiers be in different "namespaces", so an implementation had to quote one or the other.
Pavel Minaev
+5  A: 

Before the interpreter or compiler can build a parse tree, it must perform lexical analysis, turning the stream of characters into a stream of tokens. Consider how you would want the following parsed:

a = 1.2423 / (4343.23 * 2332.2);

Now consider how your rule above would work on it. It is hard to know how to tokenize it without understanding the meaning of the tokens. It would be really hard to build a parser that did the lexing at the same time.
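To make the lexing step concrete, here is a minimal tokenizer sketch in Python. The token kinds and `TOKEN_SPEC` are assumptions for a toy expression language, not any real compiler's grammar. The point is that the lexer decides what a token is from the characters alone, with whitespace acting only as a separator.

```python
import re

# Token kinds for a toy expression language (an assumption for illustration).
TOKEN_SPEC = [
    ("NUMBER", r"\d+\.\d+|\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[=/*+\-();.]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(src: str):
    """Turn a character stream into (kind, text) tokens, discarding whitespace."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = MASTER.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

spaced = tokenize("a = 1.2423 / (4343.23 * 2332.2);")
packed = tokenize("a=1.2423/(4343.23*2332.2);")
print(spaced)
print(spaced == packed)
```

Both spellings yield the same token stream. If a space could also be part of a name, the lexer would need to know what the parser expects before it could split the input.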

CBFraser
+1 An excellent point.
jprete
+1. Wouldn't my token be `save data`, because it is between a `.` and a `(`? What about the other way around? I know compilers have no problem with `a=1.2423/(4343.23*2332.2);a=a+a;` (no spaces at all).
OscarRyz
It doesn't have to be a single token. It could well be parsed as two different tokens, and then the parser looks at the context and figures out if those together should form a single name. Depending on the grammar of the language, it may even be unambiguous. So not really an implementation issue.
Pavel Minaev
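The "two tokens, joined by the parser" idea can be sketched as follows. This is a hypothetical post-lexing pass in Python, with `tokenize`, `is_ident`, and `join_method_names` invented for the example. It assumes a language where a run of identifiers between `.` and `(` can only be a method name.

```python
import re

def tokenize(src: str):
    """Split into string literals, identifiers, and single punctuation characters."""
    return re.findall(r'"[^"]*"|[A-Za-z_]\w*|[^\s\w]', src)

def is_ident(tok: str) -> bool:
    return re.fullmatch(r"[A-Za-z_]\w*", tok) is not None

def join_method_names(tokens):
    """Merge a run of identifiers that sits between '.' and '(' into one name."""
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == ".":
            out.append(".")
            j = i + 1
            run = []
            while j < len(tokens) and is_ident(tokens[j]):
                run.append(tokens[j])
                j += 1
            if run and j < len(tokens) and tokens[j] == "(":
                out.append(" ".join(run))  # context says: one method name
            else:
                out.extend(run)            # leave the tokens untouched
            i = j
        else:
            out.append(tokens[i])
            i += 1
    return out

result = join_method_names(tokenize('key.to string().starts with("textbox");'))
print(result)
```

This works only because the toy grammar has nothing else that may appear between `.` and `(`. Once keywords or operators written as words are allowed there, the join is no longer unambiguous.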
+1  A: 

Using a space as part of an identifier makes parsing really murky (is that a syntactic space or part of an identifier?), but the same sort of "natural reading" behavior is achieved with keyword arguments. object.save(data: something, atomically: true)
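As a small illustration of the keyword-argument style, here is the same call shape in Python. The `save` function and its parameters are hypothetical, and it returns a description instead of doing any I/O.

```python
def save(data, atomically=False):
    """Hypothetical save routine; returns a description instead of writing anything."""
    mode = "atomically" if atomically else "in place"
    return f"saved {data!r} {mode}"

# The call site reads almost like a phrase, with no spaces inside any identifier.
message = save(data="report", atomically=True)
print(message)
```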

Chuck
+1  A: 

There are a few languages which allow spaces in identifiers. Nearly all languages constrain the set of characters in identifiers because it makes parsing easier and because most programmers are accustomed to the compact no-whitespace style.

I don’t think there’s a real reason beyond that.

Nikolai Ruhe
Can you give an example of such a language? I think it would be interesting to take a look at.
jprete
Algol-60 (and Simula, as a superset of it) was one such. ABC (http://ftp.cwi.nl/abc/examples/advent/advent.ftp) allowed it in function names (e.g. `TRY TO TAKE object` or `INCLUDE object IN property FOR thing`; the uppercase identifiers are all parts of the function name, and the lowercase ones are parameter names).
Pavel Minaev
Another example would be AppleScript which allows it for some identifiers.
Nikolai Ruhe
Inform 7 allows multiple-word names. It's bad; you can't tell if a phrase is just a variable name or if the individual English words actually take their English meanings.
Jason Orendorff
+3  A: 

Check out Stroustrup's classic Generalizing Overloading for C++2000.

Jonathan Leffler
+3  A: 

We were allowed to put spaces in filenames back in the 1960s, and computers still don't handle them very well (everything used to break, then most things, now it's just a few things, but they still break).

We simply can't wait another 50 years before our code will work again. :-)

(And what everyone else said, of course. In English, we use spaces and punctuation to separate the words. The same is true for computer languages, except that computer parsers define "words" in a slightly different sense)

Jason Williams
+3  A: 

Such a change would make for an ambiguous language in the best of cases. For example, in a C99-like language:

if not foo(int x) {
    ...
}

is that equivalent to:

  1. A function definition of foo that returns a value of type ifnot:

    ifnot foo(int x) {
        ...
    }
    
  2. A call to a function called notfoo with a variable named intx:

    if notfoo(intx) {
        ...
    }
    
  3. A negated call to a function called foo (with C99's not which means !):

    if not foo(intx) {
        ...
    }
    

This is just a small sample of the ambiguities you might run into.
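To get a feel for how quickly the readings multiply, here is a short Python sketch (the helper `groupings` is made up for this example). It enumerates every way adjacent words could be glued into identifiers if whitespace were optional inside names: n words give 2^(n-1) candidate token sequences, before the grammar has ruled any of them out.

```python
from itertools import product

def groupings(words):
    """Yield every way to glue adjacent words together into identifiers."""
    for joins in product([False, True], repeat=len(words) - 1):
        tokens = [words[0]]
        for word, glue in zip(words[1:], joins):
            if glue:
                tokens[-1] += word   # treat the space as part of the name
            else:
                tokens.append(word)  # treat the space as a separator
        yield tokens

readings = list(groupings(["if", "not", "foo"]))
for r in readings:
    print(r)
```

A parser would have to try these candidates against the grammar and the symbol table, which is where the extra lookahead comes from.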

Update: I just noticed that obviously, in a C99-like language, the condition of an if statement would be enclosed in parentheses. Extra punctuation can help with ambiguities if you choose to ignore whitespace, but your language will end up having lots of extra punctuation wherever you would normally have used whitespace.

Greg Hewgill
And you would also need a whole lot of lookahead to realize that it is not a call to a function called "if not foo".
erikkallen