This is more of a language-design question than a programming question.

The following is an excerpt from JLS 15.19 Shift Operators:

If the promoted type of the left-hand operand is int, only the five lowest-order bits of the right-hand operand are used as the shift distance.

If the promoted type of the left-hand operand is long, then only the six lowest-order bits of the right-hand operand are used as the shift distance.

This behavior is also specified in C#, and while I'm not sure whether it's in the official JavaScript spec (if there is one), it also holds there, at least based on my own tests.

The consequence is that the following is true:

(1 << 32) == 1
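
A few more cases make the masking visible (a minimal Java sketch; the expected results follow directly from the JLS rule quoted above):

    public class ShiftMaskDemo {
        public static void main(String[] args) {
            // int: only the low 5 bits of the count are used, so 32 & 0x1F == 0.
            System.out.println((1 << 32) == 1);        // true
            System.out.println((1 << 33) == (1 << 1)); // true, since 33 & 0x1F == 1

            // long: only the low 6 bits are used, so 64 & 0x3F == 0.
            System.out.println((1L << 64) == 1L);      // true
            System.out.println((1L << 32) == 1L);      // false: 32 is a legal long shift
        }
    }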

I understand that this specification is probably "inspired" by the fact that the underlying hardware only takes 5 bits for the count operand when shifting 32-bit values (and 6 bits for 64-bit), and I can understand such behavior being specified at the JVM level, for example. But why would high-level languages such as C# and Java retain this rather low-level behavior? Shouldn't they provide a more abstract view beyond the hardware implementation and behave more intuitively? (Even better if they could take a negative count to mean a shift in the OTHER direction!)

+3  A: 

Because in most programming environments an integer is only 32 bits wide, so 5 bits (enough to express 32 values) is already enough to shift the entire integer. Similar reasoning applies to a 64-bit long: 6 bits is all you need to shift the entire value.

I can understand part of the confusion: if your right-hand operand is the result of a calculation that ends up with a value greater than 32, you might expect it to just shift all the bits rather than apply a mask.
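
In other words, the language behaves as if the mask were applied to the count first; a small sketch of that equivalence:

    public class EffectiveShift {
        public static void main(String[] args) {
            int count = 37;            // e.g. the result of some calculation
            int masked = count & 0x1F; // 37 & 31 == 5

            // For int, the shift uses only the masked count.
            System.out.println((1 << count) == (1 << masked)); // true

            // For long, the mask is 0x3F (six bits) instead.
            System.out.println((1L << count) == (1L << (count & 0x3F))); // true
        }
    }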

Joel Coehoorn
I understand why effective shifting of a 32-bit value only needs 5 bits at most, and anything beyond that essentially wipes out the entire register -- and as a user of the language, sometimes that's exactly what I want! As I said, the question is not why the parameters are chosen as such, but rather why something so low-level is retained at the high-level language.
polygenelubricants
+5  A: 

Java and C# are not fully "high-level". They try very hard to be compilable into efficient code, in order to shine in micro-benchmarks. This is why they have "value types" such as int, instead of having, as the default integer type, true integers that would be objects in their own right and not limited to a fixed range.

Hence, they mimic what the hardware does. They tidy it up a bit, in that they mandate the masking, whereas C merely permits it (the out-of-range case is undefined there). Still, Java and C# are "medium-level" languages.
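
Java's own BigInteger shows what the "higher-level" alternative looks like: it models a true, unbounded integer, so there is no fixed width to mask against, and it even accepts negative shift distances, at the cost of allocating an object for every operation. A minimal sketch:

    import java.math.BigInteger;

    public class UnboundedShift {
        public static void main(String[] args) {
            // No fixed width, so nothing is masked: shifting by 32 really shifts by 32.
            System.out.println(BigInteger.ONE.shiftLeft(32)); // 4294967296

            // A negative distance shifts the other way, as the question wished for.
            System.out.println(BigInteger.valueOf(8).shiftLeft(-2)); // 2
        }
    }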

Thomas Pornin
The determination of whether a language is high-level or not is very subjective, I understand. I would think, however, that most people would classify Java and C# as "high", at least for non-scripting languages.
polygenelubricants
Absolutely. But Java and C# still retain low-level characteristics, for efficiency's sake (or at least _perceived efficiency_). The 32-bit `int` type and the shift count masking are such characteristics. Other languages, such as Scheme, are "higher level" in that matter.
Thomas Pornin
http://therighttool.hammerprinciple.com/statements/this-is-a-high-level-language
starblue
+2  A: 

C# and Java define shifting as using only the low-order bits of the shift count, because that's what both SPARC and x86 shift instructions do. Java was originally implemented by Sun on SPARC processors, and C# by Microsoft on x86.

In contrast, C/C++ leave the behavior of a shift undefined if the shift count is not in the range 0..31 (for a 32-bit int), allowing any behavior. That's because when C was first implemented, different hardware handled this differently. For example, on a VAX, shifting by a negative amount shifts in the other direction. So with C, the compiler can just use the hardware shift instruction and do whatever it does.
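
Java's mandated masking also means there is a defined (if surprising) result even for a negative count, where C leaves it undefined; a small sketch:

    public class NegativeCount {
        public static void main(String[] args) {
            // -1 & 0x1F == 31, so a count of -1 behaves like a count of 31.
            System.out.println((1 << -1) == (1 << 31)); // true
            System.out.println(1 << -1);                // -2147483648 (Integer.MIN_VALUE)
        }
    }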

Chris Dodd