In JavaScript, why does an octal number string cast as a decimal number? I can cast a hex literal string using Number() or +, why not an octal?

For instance:

1000 === +"1000" // -> true
0xFF === +"0xFF" // -> true
0100 === +"0100" // -> false - +"0100" gives 100, not 64

I know I can parse with parseInt("0100" [, 8]), but I'd like to know why casting doesn't work like it does with hex and dec numbers.

Also, does anyone know why octal literals are dropped from ECMAScript 5th Edition in strict mode?

+4  A: 

Because you're not actually performing casting in the proper sense (JS doesn't have casting) - it's just type juggling.

When you call a method on a primitive value in JavaScript, a wrapper object is created behind the scenes for you.

"foo".toUpperCase(), for example, is evaluated roughly as if you had written new String("foo").toUpperCase();

Since strings can't be evaluated with a unary + operator, JS converts your string to a number - and it doesn't use parseInt() or parseFloat() internally - you guessed it - it uses Number().

So, the value you see is what you'd get back from Number(), which doesn't appear to assume octals.

Peter Bailey
Thanks Peter, I did already assume `Number()` to be used when unary "casting" (http://stackoverflow.com/questions/61088/hidden-features-of-javascript/2243631#2243631), it just seems odd to me that `Number()` wouldn't accept any stringified numeric literal as defined by the grammar. It just seems like it would have made more sense to re-use the behind the scenes code already there for parsing numeric literals. Thanks for the info on the behind-the-scenes object creation, I've read this before and it slipped my mind, it's easy to forget these things when they're magically done for you :-)
Andy E
@Peter: It always bugs me when one of my answers is unaccepted because the SE system doesn't tell you which one it was, so I thought I'd be polite and let you know where your 15 points went. CMS wrote a good answer that explained the reasons in more detail, so accepting his answer seemed appropriate. Sorry, and thanks for your answer :-)
Andy E
@Andy no worries - I agree - he has the better answer. Cheers.
Peter Bailey
+4  A: 

I'm a bit late to the question but I think I can give a good answer.

The accepted answer doesn't tell you anything more than what you already know and mention in the question itself: Number(value) works like +value, but not like parseInt(value).

The key is to know that there is a semantic difference between type conversion and parsing.

Why does an octal number string cast as a decimal number?

Because the Number constructor called as a Function (Number(value)) and the Unary + Operator (+value) behind the scenes use the ToNumber internal operation. The purpose of those constructs is type conversion.

When ToNumber is applied to the String Type a special grammar production is used, called the StringNumericLiteral.

This production can hold only Decimal literals and Hexadecimal Integer literals:

...

StrNumericLiteral :::
   StrDecimalLiteral
   HexIntegerLiteral

...

There are also semantic differences between this grammar and the grammar of "normal" NumericLiterals.

A StringNumericLiteral:

  • May be preceded and/or followed by white space and/or line terminators.
  • That is decimal may have any number of leading 0 digits (so no octals!).
  • That is decimal may be preceded by + or - to indicate its sign.
  • That is empty or contains only white space is converted to +0.
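The rules above can be checked directly. This is only a small sketch: the property names in `results` are made up for illustration, and the signed-hex case is an inference from the third rule (a sign is allowed only on the decimal form).

```javascript
// Each entry applies ToNumber (via Number() or unary +) to a string.
const results = {
  padded: Number(" \t\n 42 \r\n"), // surrounding white space and line terminators are ignored
  leadingZeros: +"0100",           // leading zeros are still decimal, not octal
  signedDecimal: +"-0100",         // a sign is allowed on a decimal string
  hex: +"0xFF",                    // hexadecimal integer strings are accepted
  signedHex: +"-0xFF",             // ...but a sign on the hex form is not part of the grammar
  empty: +"",                      // the empty string converts to 0
  whitespaceOnly: +" \t\n",        // so does a white-space-only string
};

console.log(results);
// padded: 42, leadingZeros: 100, signedDecimal: -100, hex: 255,
// signedHex: NaN, empty: 0, whitespaceOnly: 0
```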

Now on to the parseInt and parseFloat functions.

The purpose of those functions obviously is parsing, which is semantically different to type conversion, for example:

parseInt("20px");     // 20
parseInt("10100", 2); // 20
parseFloat("3.5GB");  // 3.5
// etc..
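To make the contrast with type conversion concrete, here is a rough side-by-side sketch. The `inputs` list and the `comparison` table are just illustrative names; the point is that Number() rejects any string that is not entirely a numeric literal, while the parsing functions read as far as they can and stop.

```javascript
// Feed the same strings to conversion (Number) and to parsing (parseInt / parseFloat).
const inputs = ["20px", "3.5GB", "0xFF", ""];

const comparison = inputs.map((input) => ({
  input,
  converted: Number(input),       // whole string must match the grammar
  parsedInt: parseInt(input),     // reads a leading integer, ignores the rest
  parsedFloat: parseFloat(input), // reads a leading decimal number, ignores the rest
}));

console.log(comparison);
// "20px"  -> converted NaN, parsedInt 20,  parsedFloat 20
// "3.5GB" -> converted NaN, parsedInt 3,   parsedFloat 3.5
// "0xFF"  -> converted 255, parsedInt 255, parsedFloat 0
// ""      -> converted 0,   parsedInt NaN, parsedFloat NaN
```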

It is worth mentioning that the algorithm of parseInt changed in the ECMAScript 5th Edition Specification: it no longer treats a string as octal just because it has a leading zero:

parseInt("010"); // 10, ECMAScript 5 behavior
parseInt("010"); // 8,  ECMAScript 3 behavior

As you can see, that introduced an incompatibility between ES3 and ES5 implementations, so, as always, it is recommended to pass the radix argument to avoid any possible problems.
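With an explicit radix the result no longer depends on which edition the engine follows. A minimal sketch, where `parseOctalString` is a hypothetical helper name and not a built-in:

```javascript
// Hypothetical helper: always interpret the digits as base 8.
function parseOctalString(text) {
  return parseInt(text, 8);
}

const asDecimal = parseInt("010", 10);       // radix 10: leading zero is ignored
const asOctal = parseInt("010", 8);          // radix 8: same digits read as octal
const fromQuestion = parseOctalString("0100"); // the string from the question

console.log(asDecimal);    // 10
console.log(asOctal);      // 8
console.log(fromQuestion); // 64
```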

Now your second question:

Why are octal literals dropped from ECMAScript 5th Edition in strict mode?

Actually, the effort to get rid of octal literals dates back to 1999. The octal literal productions (OctalIntegerLiteral and OctalEscapeSequence) were removed from the main grammar as of the ECMAScript 3rd Edition specification; implementations may still include them (also under ES5) for backwards compatibility with older versions of the standard.

In fact, they are included in all major implementations, but technically an ES3 or ES5 compliant implementation could choose to not include them, because they are described as non-normative.

That was the first step; now ECMAScript 5 Strict Mode disallows them completely.

But why?

Because they were considered an error-prone feature, and in the past they did cause unintentional or hard-to-catch bugs, just like the implicit octal handling in parseInt.

Now, under strict mode an octal literal causes a SyntaxError exception (currently only observable in Firefox 4.0 betas).
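A rough way to observe the difference is to compile the same literal with and without the "use strict" directive. The sketch below uses the Function constructor so that the literal is compiled separately from the surrounding file; it assumes an engine that implements strict mode and still supports legacy octal literals in non-strict code.

```javascript
// Non-strict code: the legacy octal literal is still accepted.
const sloppyValue = new Function("return 0100;")();

// Strict code: the same literal is rejected when the function body is compiled.
let strictError = null;
try {
  new Function('"use strict"; return 0100;');
} catch (error) {
  strictError = error;
}

console.log(sloppyValue);                        // 64
console.log(strictError instanceof SyntaxError); // true
```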

CMS
This is a great answer and more than what I was originally expecting. I guess I overlooked `StringNumericLiteral` in the spec, and I certainly didn't know that white-space was permitted. That's just one of those things, I always expected white-space to result in *NaN*.
Andy E
Thanks @Andy, yes, I really often see people surprised that e.g. `isNaN("\t\r\n ")` returns `false` ;)
CMS
+1  A: 

To elaborate on why octal support was removed in ES5, it's mostly because, to the novice or non-programmer, the syntax is unexpected. Imagine vertically aligning a series of numbers (perhaps being added), using leading zeroes to align them, for example -- if your numbers don't use 8 or 9, they'll end up being treated as octal. Suddenly your code's off in the weeds for no obvious reason! This is why octal support was removed. A different octal syntax might someday be added if it doesn't create such a misfortune (I think I remember seeing 0o755 as one strawman idea), but for now octal is out.
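The alignment scenario can be sketched like this. The source text is compiled through the Function constructor as non-strict code, assuming an engine that still accepts legacy octal literals there; `source`, `values` and `total` are illustrative names.

```javascript
// Three numbers padded with leading zeros so that they line up vertically.
const source = [
  "return [",
  "  100,",
  "  010,", // silently read as octal: 8
  "  001",  // octal 1 happens to equal decimal 1
  "];",
].join("\n");

const values = new Function(source)();
const total = values.reduce((sum, n) => sum + n, 0);

console.log(values); // [100, 8, 1]
console.log(total);  // 109, not the 111 the author probably intended
```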

Regarding the incompatible parseInt change noted in past responses: no implementation has made this change, and I suspect no implementation will make it. ES5 is mostly grounded in reality. Its new features generally don't break existing code (except that new code of course must take care in using new features not to break existing code as part of that use) that doesn't try to use the new features. Its incompatibilities are mostly negligible as well, or they're irrelevant because real-world implementations blithely ignored the specification for compatibility reasons. But not all the incompatibilities are well-founded: some are more aspirational than harmonizing. The change to parseInt is an example of an aspirational change. It breaks existing code that expects octal syntax, without an explicit radix, to parse as octal.

For the span of a few days SpiderMonkey (Mozilla's JavaScript engine) implemented a halfway-change to make parseInt, when called from strict mode code, disregard octal, and to support octal when not called from strict mode code. This is closer to what ES5 wants, but it's a clear impediment to converting non-strict code to strict mode, it would likely be confusing to the user, and -- perhaps most interestingly for implementers -- it means you couldn't implement parseInt in JavaScript itself (because there's no way in the specification to examine the strictness of one's calling function), as might be desirable at some future time (to reduce attack surface, ease implementation, and so on). So we undid the dependency. (I wrote the patch to make parseInt caller-dependent, and I reviewed the patch to undo it, spawned by further discussion after my initial patch landed.) parseInt now conforms to ES3 again, and given the web as it is, and that ES5's semantics are probably not compatible with it, I doubt we'll change. Therefore I doubt others will change, either. (I'm also pretty sure they'd agree with our estimation of the degree of incompatibility of the web with ES5's aspirational forbidding of implicit octal syntax in parseInt, and probably with our other reasons too. Even if we were to change I'm not sure they would follow, and I suspect they'd be smart not to.)

Jeff Walden
@Jeff: +1, thanks for the extra insight. I agree, the grammar for octal literals is rather dangerous to the unaware, unlike hex literal grammar's `0x` prefix that makes it stand out from decimal literals.
Andy E