For example (in C):

int break = 1;
int for = 2;

Why will the compiler have any problems at all in deducing that break and for are variables here?


So, we need keywords because

  • we want the programs to be readable
  • we do not want to over-complicate the job of already complex compilers of today
  • but most importantly, a language is a lot more powerful if some 'key'words are reserved for special actions. The language can then focus on being useful at a higher level rather than struggling to implement a for loop in an unambiguous way.
+4  A: 

Several reasons:

  • The keywords may seem unambiguous in your samples, but those are not the only places where you would use the variable 'break' or the variable 'for'.

  • Writing the parser would be much harder and more error-prone, for little gain.

  • Using a keyword as a function or procedure name in a library may have undesired, possibly security-relevant, side effects.

lexu
The example given is weak, since it does not contain a three-element arglist, or for that matter, anything that can be considered any sort of arglist. The semicolons ruin it. Try rewriting your example in terms of `if` or `while`, since they take only a single parenthesized expression.
TokenMacGuy
@TokenMacGuy : you are correct, I removed the example..
lexu
+17  A: 

Then what will the computer do when it comes across a statement like:

while(1) {
  ...
  if (condition)
    break;
}

Should it actually break? Or should it treat it as 1;?

The language would become ambiguous in certain cases, or you'd have to create a very smart parser that can infer subtle syntax, and that's just unnecessary extra work.
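To make the ambiguity concrete, here is a minimal sketch in C. Since `break` really is reserved, the variable is called `brk` as a stand-in; the function name `first_negative_index` is invented for illustration. A bare name followed by a semicolon is already a valid expression statement, which is exactly what `break;` would look like if `break` could be a variable.

```c
#include <assert.h>

/* Returns the index of the first negative value, or count if there is none.
   "brk" is a hypothetical stand-in for a variable named "break". */
int first_negative_index(const int *values, int count) {
    int brk = 1;
    int i;
    assert(values != 0 || count == 0);
    for (i = 0; i < count; i++) {
        if (values[i] < 0)
            break;      /* keyword: leaves the loop */
        (void)brk;      /* expression statement: evaluates brk and discards it
                           (the cast only silences "no effect" warnings) */
    }
    return i;
}
```

If `break` were an ordinary identifier, both marked lines could be spelled `break;` and the parser would have to guess which one was meant.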

Kache4
while, true, if... so many keywords before break
Even worse if you've declared a function pointer (or a function) called `while`...
caf
@Mac: It's not clear (to me, at least) what you intend to convey with your observation that while and if occur before break in the example given. What do you mean by that?
Heath Hunnicutt
@Heath Hunnicutt: I think caf just presented an example of what 'he intended to convey'.
Kache4
@Mac, there is no such keyword as `true` in C.
Bertrand Marron
+2  A: 

The compiler would have problems if you write something like this:

while(*s++);
return(5);

Is that a loop or a call to a function named while? Did you want to return the value 5 from the current function, or did you want to call a function named return?

It often simplifies things if constructs with special meaning simply have special names that can be used to unambiguously refer to them.
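For reference, both lines are valid C statements today precisely because the words are reserved. A small sketch (the function names are made up for this example):

```c
#include <assert.h>
#include <stddef.h>

/* Computes a string length with a while loop whose body is the null statement. */
size_t length_by_empty_loop(const char *s) {
    const char *p = s;
    assert(s != NULL);
    while (*p++)
        ;                           /* empty body: all the work is in the condition */
    return (size_t)(p - s - 1);     /* p ends one past the terminating NUL */
}

/* The parentheses here are just grouping; this is a return statement, not a call. */
int five(void) {
    return(5);
}
```

With an unreserved `while` or `return`, each of these would also parse as a call to a function of that name.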

sth
Many languages that don't reserve keywords handle cases like these by treating them as context-dependent keywords (even C# does this). That is, where the word might be interpreted specially (e.g., as a "while statement") it is treated as such; where not, it isn't a problem. Your while statement above isn't a problem; it clearly can't be a while statement, as there is no body to execute. The return statement example you give is good; it would clearly be ambiguous without a context-keyword rule, which would make it a return statement.
Ira Baxter
@Ira: At least in C (as the question is tagged) this is a valid while-statement. That the body of the `while` is empty is no problem, it just won't execute anything there.
sth
@sth: Yes, you're right, because of C's null statement syntax. You can still resolve it by context dependence without having real keywords: this is now clearly a "while" loop :-} You'd have to use (*( to force a function call.
Ira Baxter
+2  A: 

If we are speaking of C++, it already has a very complicated grammar. Allowing keywords to be used as variable names, for example, would make it even more complicated.

Glorphindale
+24  A: 

It's not necessary -- Fortran didn't reserve any words, so things like:

if if .eq. then then if = else else then = if endif

are completely legal. This not only makes the language hard for the compiler to parse, but often makes it almost impossible for a person to read the code or spot errors. For example, consider classic Fortran (say, up through Fortran 77 -- I haven't used it recently, but at least hope they've fixed a few things like this in more recent standards). A Fortran DO loop looks like this:

DO 10 I = 1,10

Without them being side-by-side, you can probably see how you'd miss how this was different:

DO 10 I = 1.10

Unfortunately, the latter isn't a DO loop at all -- it's a simple assignment of the value 1.10 to a variable named DO10I (yes, it also allows spaces in a name, so DO 10 I is the same name). Since Fortran also supports implicit (undeclared) variables, this is (or was) all perfectly legal, and some compilers would even accept it without a warning!
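As a rough illustration of the lookahead this forces on a parser, here is a sketch in C (not taken from any real Fortran compiler; `is_do_loop` is a made-up helper). It assumes the simplified rule that a statement is a DO loop only if, with blanks removed, it starts with DO and has a comma outside parentheses after the `=`. Real fixed-form Fortran has more cases (character and Hollerith constants, for instance) that this ignores.

```c
#include <assert.h>
#include <string.h>

/* Simplified heuristic: returns 1 for a DO loop, 0 for anything else.
   Statements longer than the local buffer are truncated. */
int is_do_loop(const char *statement) {
    char compact[128];
    size_t n = 0;
    int depth = 0;
    const char *p;
    const char *equals;

    assert(statement != NULL);

    /* Blanks are not significant, so drop them first. */
    for (p = statement; *p != '\0' && n < sizeof compact - 1; p++)
        if (*p != ' ')
            compact[n++] = *p;
    compact[n] = '\0';

    if (strncmp(compact, "DO", 2) != 0)
        return 0;
    equals = strchr(compact, '=');
    if (equals == NULL)
        return 0;

    /* Only a comma at parenthesis depth 0 separates the loop bounds. */
    for (p = equals + 1; *p != '\0'; p++) {
        if (*p == '(')
            depth++;
        else if (*p == ')')
            depth--;
        else if (*p == ',' && depth == 0)
            return 1;
    }
    return 0;
}
```

Under these assumptions the whole statement has to be scanned before the first two letters can be classified, which is why a one-character typo changes the meaning of the entire line.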

Jerry Coffin
http://stackoverflow.com/questions/1995113/strangest-language-feature/2002154#2002154 - I am sure you know this already though :-)
Alok
I didn't know it had been posted here, but it doesn't surprise me. The other thing to keep in mind is that ',' and '.' are right next to each other, so it wasn't even all that rare a problem (back when I was writing Fortran, you ran into it about once every three months or so -- not as often as mis-counted Hollerith constants, but still often enough that it was something you checked if a loop misbehaved).
Jerry Coffin
Another good confusing example: say there's a three-dimensional array named if, and you are doing an arithmetic IF: `if(if(1,2,3))1,2,3`
mpez0
And let's not forget Python (pre-3.0): `True = False`
BlueRaja - Danny Pflughoeft
+8  A: 

They don't. PL/1 famously has no keywords; every "keyword" (BEGIN, DO, ...) can also be used as a variable name. But allowing this means you can write really obscure code: IF DO>BEGIN THEN PRINT:=CALL-GOTO; Reserving the "statement keywords" in the language usually isn't a loss if that set of names is modest (as it is in every language I've ever seen except PL/1 :-).

APL also famously has no keywords. But it has a set of some 200 amazing iconic symbols in which to write complicated operators (the "domino" operator [don't ask!] is a square box with a calculator divide sign in the middle). In this case, the language designers simply used icons instead of keywords. The consequence is that APL has a reputation for being a "write only" language.

Bottom line: not a requirement, but it tends to make programs a lot more readable if the keywords are reserved identifiers from a small set known to the programmers. (Some languages have insisted that "keywords" start with a special punctuation character like "." to allow all possible identifiers to be used, but this isn't worth the extra trouble to type or the clutter on the page; it's pretty easy to stay away from "identifiers" that match keywords when the keyword set is small.)

Ira Baxter
APL started as a mathematical syntax to describe algorithms. The lack of keywords or text operators actually becomes an advantage in some polylingual institutions, such as ESA or CERN. It is easy to get completely confused, though.
mpez0
Good APL programmers don't get confused at all regarding what the operators are or do. They do get completely confused as what an APL statement is trying to accomplish.
Ira Baxter
+4  A: 

As others said, this makes it easier for the compiler to parse your source code. But I would like to say a bit more: it can also make your source code more readable. Consider this example:

if (if > 0) then then = 10 end if

The second "if" and the second "then" are variables, while others are not. I think this kind of code is not readable. :)

Dylan Lin
Who cares about how hard the compiler has to work, except the compiler guy? He gets outvoted on this topic every time, rightly, IMHO; his job is to make everybody's job easier, not the other way around. This issue is really about usability for coders.
Ira Baxter
You're right. :)
Dylan Lin
@Ira Baxter: Not just the compiler guy. If you want to write a program analysis tool or refactoring tool or something, you've got to parse the language. Moreover, if you make building parsers expensive, you've got fewer resources to work on other aspects of the compiler.
David Thornley
@Thornley: True, you don't want the "front end" to work a lot harder if you can avoid it. However, having keywords or not doesn't change the average cost of parsing; for GLR parsers and virtually every modern language, the parsing cost is linear with a small constant (I've implemented dozens of front ends http://www.semanticdesigns.com/Products/DMS/FrontEnds.html with a GLR parser and have validated this empirically). The analysis step is usually much more costly as it requires a lot more inference, at least if it does anything interesting.
Ira Baxter
A: 

I guess the code would look very weird, and the parser would be hard if not impossible to write. E.g.:

int break = 1;
while (true) {
   // code to change break
   if (!break) break;   // not very readable code.
}
fastcodejava
+6  A: 

Since this is tagged C: in the original C language, any variable declared without a type defaulted to type int.

It means that foo; would declare a variable of type int.

Let's say you do break;. So how does the compiler know whether you want to declare a variable named break or use the keyword break?

Bertrand Marron
+2  A: 

Because we want to keep what little sanity points we've got:

void myfunction(bool) { .. };

funcp while = &myfunction;
while(true); 
Marcus Lindblom
You forgot to define true = false. On a related note, "#define while myfunction" works in C.
josefx
define is a different thing from keywords.
Marcus Lindblom
just thought that keywords don't help with sanity if you can redefine them. If the preprocessor instructions count as part of the language then only those are reserved, every other keyword can be redefined.
josefx
A: 

Depending on the language definition a compiler may or may not need keywords. When it does not know what to do it can try to apply precedence rules or just fail.
An example:

void return(int i){printf("%d",i);}
public int foo(int a)
{
  if(a > 2)return (a+1)*2;
  return a + 3;
}

What happens if a is greater than 2?

  • The language specification may require the compiler to fail
  • The language specification may require the compiler use the return function
  • The language specification may require the compiler to return

You can define a language which doesn't use keywords. You can even define a language which allows you to replace all symbols (since they are only very short keywords themselves).
The problem is not the compiler; if your specification is complete and error-free, it will work. The problem is PEBCAD: programs using this feature of the language will be hard to read, as you have to keep track of the symbol definitions.

josefx
A: 

FWIW, Tcl doesn't have any reserved words. You can have variables and functions named "if", "break", etc. The interpretation of a token is totally dependent on the context. The same token can represent a command in one context, a variable in another, or a literal string in another.

Bryan Oakley
A: 

In many cases, it would be possible for the compiler to interpret keywords as normal identifiers, like in your example:

int break = 1;
int for = 2;

As a matter of fact, I just wrote a compiler for a simple assembly-like toy language which does this, but warns the user in such cases.
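A minimal sketch of what such a rule could look like, written in C (this is not the toy compiler mentioned above; the names `classify`, `is_type_name`, and `spells_keyword`, and the short word lists, are all invented for illustration). It assumes one hypothetical rule: a word that directly follows a type name is being declared, so it counts as an identifier even if it spells a keyword.

```c
#include <assert.h>
#include <string.h>

enum token_kind { TOKEN_KEYWORD, TOKEN_IDENTIFIER };

/* Hypothetical, deliberately tiny list of type names. */
static int is_type_name(const char *word) {
    return word != NULL
        && (strcmp(word, "int") == 0 || strcmp(word, "char") == 0);
}

/* Hypothetical, deliberately tiny list of keyword spellings. */
static int spells_keyword(const char *word) {
    static const char *const keywords[] = { "break", "for", "while", "if", "return" };
    size_t i;
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(word, keywords[i]) == 0)
            return 1;
    return 0;
}

/* Classifies word using only the token before it (NULL at statement start). */
enum token_kind classify(const char *previous, const char *word) {
    assert(word != NULL);
    if (is_type_name(previous))
        return TOKEN_IDENTIFIER;    /* "int break": a declaration */
    return spells_keyword(word) ? TOKEN_KEYWORD : TOKEN_IDENTIFIER;
}
```

A one-token rule like this handles `int break = 1;`, but it cannot decide a later bare `break;`, which still classifies as a keyword here.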

But sometimes the syntax is defined in a way that keywords and identifiers are ambiguous:

int break;

while(...)
{
    break; // <-- treat this as expression or statement?
}

And the most obvious reason is that editors will emphasize keywords so that the code is more readable for humans. Allowing keywords to be treated as identifiers would make code highlighting harder, and would also lead to bad readability of your code.

AndiDog