The Java Generics book, while contrasting C++ templates with Java generics, says:

(In C++, a problem arises because >> without the space denotes the right-shift operator. Java fixes the problem by a trick in the grammar.)

What is this trick?

+3  A: 

This is actually being fixed in C++ in the next version. There really isn't much of a trick; if you encounter >> while in the process of parsing a generic or template where instead you expected >, then you already have enough information to generate an error message. And, if you have enough information to generate an error message, you also have enough information to interpret >> as two separate tokens: > followed by >.

Michael Aaron Safyan
This is incorrect in the case of C++, since `A<1<<b>>c>` is a valid template type with a single integral argument, `1<<b>>c`.
Marcelo Cantos
@Marcelo, but there is no error in that case, and so it is not necessary to expand >> or <<... so I don't see how what I have said is incorrect.
Michael Aaron Safyan
+4  A: 

It's a simple parser/lexer hack. The lexical analyser normally recognises the pair >> as a single token. However, when in the middle of parsing a generic type, the parser tells the lexer not to recognise >>.

Historically, C++ didn't do this for the sake of implementation simplicity, but it can (and will) be fixed using the same trick.
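To make the lexer-feedback idea concrete, here is a minimal sketch. `ModeLexer` and its `inTypeArgs` flag are hypothetical names made up for illustration, not taken from any real compiler; the only point is that the parser flips a flag and the lexer stops gluing `>` characters together while it is set.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy lexer: illustrates the idea only, not real compiler code.
final class ModeLexer {
    private final String src;
    private int pos = 0;

    // The parser would set this while it is inside a type-argument list.
    boolean inTypeArgs = false;

    ModeLexer(String src) { this.src = src; }

    // Returns the next token, or null at end of input.
    String next() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        if (pos >= src.length()) return null;
        char c = src.charAt(pos);
        if (Character.isJavaIdentifierStart(c)) {
            int start = pos;
            while (pos < src.length() && Character.isJavaIdentifierPart(src.charAt(pos))) pos++;
            return src.substring(start, pos);
        }
        // Longest match for ">>" applies only outside type arguments.
        if (c == '>' && !inTypeArgs && src.startsWith(">>", pos)) {
            pos += 2;
            return ">>";
        }
        pos++;
        return String.valueOf(c);
    }

    // Convenience helper: lex the whole input with the flag fixed to one value.
    static List<String> tokens(String src, boolean inTypeArgs) {
        ModeLexer lx = new ModeLexer(src);
        lx.inTypeArgs = inTypeArgs;
        List<String> out = new ArrayList<>();
        for (String t = lx.next(); t != null; t = lx.next()) out.add(t);
        return out;
    }
}
```

With the flag off, `a >> b` lexes as `a`, `>>`, `b`; with it on, `List<List<String>>` ends in two separate `>` tokens.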

Marcelo Cantos
This isn't how it works for Java. The lexer always converts (e.g.) `>>` and `>>>` to their relevant tokens. The grammar has some hacks to accept the token and replace it with one with fewer '>' characters in it.
spong
+2  A: 

It's not really a trick; they just defined the grammar such that a right shift token is synonymous with two right angle brackets (thus allowing that token to close a template). You can still create ambiguities that have to be resolved with parentheses, but unambiguous sequences are parsed without developer intervention. This is also done in C++0x.
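From the Java programmer's side this just means nested type arguments need no space before the closing brackets. A small illustration (the class and method names are made up for the example):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; only the two uses of ">>" matter.
final class NestedGenericsDemo {
    static int demo() {
        // Here ">>" closes two type-argument lists; no "> >" is needed.
        Map<String, List<Integer>> byName = new HashMap<String, List<Integer>>();
        List<Integer> values = new ArrayList<Integer>();
        values.add(16);
        byName.put("v", values);

        int n = byName.get("v").get(0);
        // In expression context the same two characters are the shift operator.
        return n >> 2;
    }
}
```

Both uses compile in the same method, and `demo()` returns 4.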

Nick Bastin
+1  A: 

The Java Language Specification, Third Edition, shows the full grammar. Both shift operators are listed in the InfixOp production, and there is no (obvious) trick. Which of the operations >, >> or >>> is intended is decided by the scanner using a lookahead technique.

stacker
Yes, but section 3.2 actually requires `>>` to be parsed into one token, no matter what. A little inconsistency there.
irreputable
+1  A: 

The OpenJDK javac parser, JavacParser, massages the lexer tokens GTGTGTEQ (>>>=), GTGTEQ (>>=), GTEQ (>=), GTGTGT (>>>) and GTGT (>>) into the token with one fewer '>' character when parsing type arguments.

Here is a snippet of the magic from JavacParser#typeArguments():

    switch (S.token()) {
    case GTGTGTEQ:
        S.token(GTGTEQ);
        break;
    case GTGTEQ:
        S.token(GTEQ);
        break;
    case GTEQ:
        S.token(EQ);
        break;
    case GTGTGT:
        S.token(GTGT);
        break;
    case GTGT:
        S.token(GT);
        break;
    default:
        accept(GT);
        break;
    }

One can clearly see that it is indeed a trick, and it's in the grammar :)
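To see the same token rewriting work end to end, here is a self-contained toy parser. Everything in it (`TypeArgParser`, `lex`, `closeTypeArgs`, the bracketed output format) is hypothetical and far simpler than javac; only the idea of swapping the current token for one with one fewer '>' is borrowed from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy parser with made-up names; not javac code.
final class TypeArgParser {
    private final List<String> toks;
    private int i = 0;

    TypeArgParser(String src) { toks = lex(src); }

    // Longest-match lexer: ">>>" and ">>" always come out as single tokens.
    static List<String> lex(String s) {
        List<String> out = new ArrayList<>();
        int p = 0;
        while (p < s.length()) {
            char c = s.charAt(p);
            if (Character.isWhitespace(c)) { p++; continue; }
            int q = p + 1;
            if (Character.isJavaIdentifierStart(c)) {
                while (q < s.length() && Character.isJavaIdentifierPart(s.charAt(q))) q++;
            } else if (c == '>') {
                while (q < s.length() && q - p < 3 && s.charAt(q) == '>') q++;
            }
            out.add(s.substring(p, q));
            p = q;
        }
        return out;
    }

    private String peek() { return i < toks.size() ? toks.get(i) : "<EOF>"; }

    // Type := Identifier [ '<' Type { ',' Type } '>' ]
    // Returns a bracketed rendering so the recovered nesting is visible.
    String parseType() {
        String name = peek();
        if (!Character.isJavaIdentifierStart(name.charAt(0)))
            throw new IllegalStateException("identifier expected, got " + name);
        i++;
        if (!peek().equals("<")) return name;
        i++; // consume '<'
        StringBuilder sb = new StringBuilder(name).append('[').append(parseType());
        while (peek().equals(",")) {
            i++; // consume ','
            sb.append(", ").append(parseType());
        }
        closeTypeArgs();
        return sb.append(']').toString();
    }

    // Consume exactly one '>' even if the lexer glued it to the ones after it.
    private void closeTypeArgs() {
        switch (peek()) {
            case ">>>": toks.set(i, ">>"); break; // two '>' remain for enclosing lists
            case ">>":  toks.set(i, ">");  break; // one '>' remains
            case ">":   i++;               break; // ordinary case: consume it
            default: throw new IllegalStateException("'>' expected, got " + peek());
        }
    }

    static String parse(String src) {
        TypeArgParser p = new TypeArgParser(src);
        String t = p.parseType();
        if (p.i != p.toks.size())
            throw new IllegalStateException("trailing input: " + p.peek());
        return t;
    }
}
```

For example, `TypeArgParser.parse("A<B<C<D>>>")` returns `A[B[C[D]]]` even though the lexer produced a single `>>>` token, while `A<B>>` is rejected because one `>` is left over.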

spong