Well, first I should probably ask if this is browser dependent.

I've read that if an invalid token is found, but the section of code is valid until that invalid token, a semicolon is inserted before the token if it is preceded by a line break.
However, the common example cited for bugs caused by semicolon insertion is:

return
  _a+b;

which doesn't seem to follow this rule, since _a would be a valid token. On the other hand, breaking up call chains works as expected:

$('#myButton')
  .click(function(){alert("Hello!")});

Does anyone have a more in-depth description of the rules?

A: 

The biggest "gotcha" that comes to mind for me is that you want to prefer this style:

if (...) {
   // do something
} 

over this style:

if (...)
{
   // do something
}

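because if you don't, in the wrong circumstances the engine will put a semicolon at the end of the first line and then treat the braces starting on the next line as a separate block. The classic case is a return whose object literal starts on the following line. An if, while or function header is not complete without its body, so no semicolon is inserted there, but using the same brace style everywhere means you never have to think about which case you are in.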
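A minimal sketch of the case where brace placement changes the result, as far as I can tell the one that actually bites: an object literal after return. The function names are made up for illustration.

```javascript
// Hypothetical function names, for illustration only.
function braceOnNextLine() {
  // A semicolon is inserted right after "return"; the braces below are then
  // parsed as a block containing the label "ok" and the expression "true".
  return
  {
    ok: true
  };
}

function braceOnSameLine() {
  // The brace follows "return" on the same line, so it opens an object literal.
  return {
    ok: true
  };
}

console.log(braceOnNextLine()); // undefined
console.log(braceOnSameLine()); // { ok: true }
```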

Joel Coehoorn
+2  A: 

As a rule of thumb, if the line is valid as is, then a semicolon will be inserted. If the line as is would be a syntax error, the engine continues reading. The exception is if, for, while and the like: the engine will go on to the next line to look for the opening brace if necessary.

Example 1

return
  a + b;

In this example a semicolon is automatically inserted after return.
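To see the effect at runtime, here is a small sketch wrapping Example 1 in a function. sumBroken and sumFixed are hypothetical names; the parentheses in the second version are one common way to keep a multi-line return expression together.

```javascript
// Hypothetical function names, for illustration only.
function sumBroken(a, b) {
  // Parsed as "return;" followed by the unreachable statement "a + b;".
  return
    a + b;
}

function sumFixed(a, b) {
  // The opening parenthesis sits on the same line as "return",
  // so the expression may safely span several lines.
  return (
    a + b
  );
}

console.log(sumBroken(1, 2)); // undefined
console.log(sumFixed(1, 2));  // 3
```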

Example 2

alert(c + 
  d);

In this example a semicolon is NOT automatically inserted, because the first line on its own is invalid syntax.
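A runnable variant of Example 2, plus the call-chain pattern from the question. show() is a hypothetical stand-in for alert() so the value can be checked, and the array chain is only an assumed substitute for the jQuery one.

```javascript
// show() is a hypothetical stand-in for alert() so the value can be inspected.
function show(value) {
  return value;
}

const c = 1;
const d = 2;

// "show(c +" is not valid on its own, so parsing continues onto the next line.
const shown = show(c +
  d);

// A line starting with "." is a legal continuation, so no semicolon is inserted.
const doubled = [1, 2, 3]
  .map(function (n) { return n * 2; });

console.log(shown);   // 3
console.log(doubled); // [ 2, 4, 6 ]
```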

Delan Azabani
+11  A: 
CMS
CMS - Thanks for this answer. In case 1, you have `;2; }; 3;`, while the same example in the Spec has `;2 ;} 3;`. Wondering if there's a reason, or just a typo. Also, the first sentence of the description of case 1 seems to imply that the `LineTerminator` is the offending token, when it would seem that the token *after* the `LineTerminator` is the offending one (the `2` in this case). As you state in the first bullet point *"The token is separated from the previous token by at least one LineTerminator."* Sorry to nitpick. Just a little confused. Thanks. :o)
patrick dw
@patrick: Yes, thanks, it's obviously a typo (sorry for the confusion), the semicolon after the Block statement shouldn't be there. I will soon write a more in-depth article including cases not mentioned in 7.9.1 where ASI should happen, e.g. regarding `MultiLineComment` `(function () { return /*\n*/1;})() === undefined`, and other lexical conventions where the rules described above can make ASI happen, but it shouldn't. :)
CMS
So, you have ECMAScript standards for breakfast? =)
alcuadrado
Thanks CMS. Much appreciated.
patrick dw
A: 

Straight from the ECMA-262, Fifth Edition ECMAScript Specification:

7.9.1 Rules of Automatic Semicolon Insertion

There are three basic rules of semicolon insertion:

  1. When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
    • The offending token is separated from the previous token by at least one LineTerminator.
    • The offending token is }.
  2. When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
  3. When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation "[no LineTerminator here]" within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.

However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 12.6.3).
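The rules quoted above can be poked at with a short sketch. parses() is a hypothetical helper, not part of any API; it relies on new Function() only compiling its argument, and assumes an environment where the Function constructor is allowed. The source strings are adapted from the examples in section 7.9.2 of the same specification.

```javascript
// parses() is a hypothetical helper: new Function() only compiles the body,
// so a SyntaxError here means the parser rejected the source.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Rule 1: the offending token 2 follows a line break, and } always qualifies.
const rule1WithBreak = parses("{ 1\n2 } 3"); // true
const rule1NoBreak = parses("{ 1 2 } 3");    // false

// Overriding condition: never an empty statement, never inside a for header.
const emptyStatement = parses("if (a > b)\nelse c = d"); // false
const forHeader = parses("for (a; b\n)");                // false

// Rule 3: postfix ++ is a restricted production, so ++ binds to the next line.
let i = 1;
let j = 1;
i
++
j
// now i === 1 and j === 2

// No rule applies when the next token is allowed: ( continues the expression.
const twice = function (n) { return n * 2; }
const called = twice
(1 + 2);
// called === 6, not the function itself
```

The last case is the one that tends to surprise people: the first line is a complete statement on its own, yet no semicolon is inserted because the opening parenthesis is not an offending token.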

Jörg W Mittag