If I understand this correctly:

Current CPU-developing companies like AMD and Intel each have their own API codes (their assembly languages), which they treat as the 2G language sitting on top of machine code (the 1G language).

Would it be possible or desirable (for performance or otherwise) to have a CPU that performs IL handling at its core, instead of the current API calls?

+10  A: 

This is not a new idea - the same thing was predicted for Java, and Lisp machines were actually built.

But experience with those shows that it's not really useful - by designing special-purpose CPUs, you can't benefit from the advances of "traditional" CPUs, and you very likely can't beat Intel at their own game. A quote from the Wikipedia article illustrates this nicely:

cheaper desktop PCs soon were able to run Lisp programs even faster than Lisp machines, without the use of special purpose hardware.

Translating from one kind of machine code to another on the fly is a well-understood and common problem (modern CISC CPUs even do something like it internally, because at their core they're really RISC). We can assume it is already done efficiently enough that avoiding it would not yield significant benefits - certainly not when avoiding it means decoupling yourself from the state of the art in traditional CPUs.

Michael Borgwardt
+1. Good point, if there were benefits to this approach we would have seen Java byte code processors make more of a splash.
AnthonyWJones
I think Chuck Moore actually made quite a bit of money with his FORTH chips.
paxdiablo
A: 

I wouldn't have thought so, for two reasons:

  1. If you had hardware processing IL, that hardware would not be able to run a newer version of IL. With a JIT, you just need a new JIT and existing hardware can run the newer IL.
  2. IL is simple and designed to be hardware-agnostic. The IL would have to become much more complex to describe operations in the most efficient manner and get anywhere close to the performance of existing machine code - but that would make the IL much harder to run on hardware that isn't IL-specific.
AnthonyWJones
+6  A: 

I would say no.

The actual machine language instructions that need to run on a computer are lower level than IL. IL, for example, doesn't really describe how methods calls should be made, how registers should be managed, how the stack should be accessed, or any other of the details that are needed at the machine code level.

Getting the machine to recognize IL directly would, therefore, simply move all the JIT compilation logic from software into hardware.

That would make the whole process very rigid and unchangeable.

By having the machine language based on the capabilities of the machine, and an intermediate language based on capturing programmer intent, you get a much better system. The folks defining the machine can concentrate on defining an efficient computer architecture, and the folks defining the IL system can focus on things like expressiveness and safety.

If both the tool vendors and the hardware vendors had to use the exact same representation for everything, then innovation in either the hardware space or the tool space would be hampered. So, I say they should be separate from one another.

Scott Wisniewski
+18  A: 

A similar technology does exist for Java - ARM makes a range of CPUs that can do this, under the name "Jazelle".

However, the operations represented by .net IL opcodes are only well-defined in combination with the type information held on the stack, not on their own. This is a major difference from Java bytecode, and would make it much more difficult to create sensible hardware to execute IL.
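To make this difference concrete, here is a toy sketch (in Python, not real MSIL semantics) of an evaluation-stack interpreter in which a single generic add opcode only has a defined meaning once the types of the operands already on the stack are known. Java bytecode, by contrast, has separate typed opcodes (iadd, fadd, ladd, dadd), so each instruction can be decoded in isolation.

```python
# Toy stack machine illustrating an MSIL-style generic "add":
# the operation performed depends on the types found on the stack,
# not on the opcode alone. Opcode names here are illustrative.

def run(program):
    stack = []
    for op, *args in program:
        if op == "ldc":                 # push a constant
            stack.append(args[0])
        elif op == "add":               # generic add: meaning depends on operand types
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)         # could be integer add, float add, ...
    return stack.pop()

# The very same "add" opcode performs two different operations:
print(run([("ldc", 2),   ("ldc", 3),   ("add",)]))  # integer addition -> 5
print(run([("ldc", 2.5), ("ldc", 0.5), ("add",)]))  # float addition   -> 3.0
```

Hardware that wanted to execute such opcodes directly would have to track the stack's type state itself, which is exactly the verification work the JIT does today.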

Moreover, IL is intended for compilation to a final target. Most back ends that spit out IL do very little optimisation, aiming instead to preserve semantic content for verification and optimisation in the final compilation step. Even if the hardware problems could be overcome, the result will almost certainly still be slower than a decent optimising JIT.

So, to sum up: while it is not impossible, it would be disproportionately hard compared to other architectures, and would achieve little.

moonshadow
I think Your answer addresses my question perfectly..thanks.
Ric Tokyo
+14  A: 

You seem a bit confused about how CPUs work. Assembly is not a separate language from machine code; it is simply a different (textual) representation of it.

Assembly code is simply a sequential listing of instructions to be executed. And machine code is exactly the same thing. Every instruction supported by the CPU has a certain bit pattern that causes it to be executed, and it also has a textual name you can use in assembly code.

If I write add $10, $9, $8 and run it through an assembler, I get the machine code for the add instruction, taking the values in registers 9 and 8, adding them and storing the result in register 10.

There is a 1 to 1 mapping between assembler and machine code.
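That 1:1 mapping can be demonstrated by hand-assembling the example instruction. A minimal sketch in Python, assuming the instruction above is MIPS-style and using the MIPS R-type encoding (opcode, two source registers, destination register, shift amount, function code):

```python
# Hand-assemble the MIPS instruction "add $10, $9, $8" into its
# 32-bit machine-code word. R-type field layout, high bits first:
#   opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)

def encode_add(rd, rs, rt):
    opcode, shamt, funct = 0, 0, 0x20   # R-type "add": opcode 0, funct 0x20
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

word = encode_add(rd=10, rs=9, rt=8)
print(hex(word))  # -> 0x1285020
```

Running the result back through a disassembler recovers exactly the original text, which is what "1 to 1 mapping" means in practice.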

There also are no "API calls". The CPU simply reads from address X, and matches the subsequent bits against all the instructions it understands. Once it finds an instruction that matches this bit pattern, it executes the instruction, and moves on to read the next one.

What you're asking is in a sense impossible or a contradiction. IL stands for Intermediate Language, that is, a kind of pseudocode that is emitted by the compiler, but has not yet been translated into machine code. But if the CPU could execute that directly, then it would no longer be intermediate, it would be machine code.

So the question becomes "is your IL code a better, more efficient representation of a program, than the machine code the CPU supports now?"

And the answer is most likely no. MSIL (I assume that's what you mean by IL, which is a much more general term) is designed to be portable, simple and consistent. Every .NET language compiles to MSIL, and every MSIL program must be translatable into machine code for any CPU anywhere. That means MSIL must be general and abstract and make no assumptions about the CPU. For this reason, as far as I know, it is a purely stack-based architecture: instead of keeping data in registers, each instruction processes the data on top of the stack.

That's a nice, clean and generic system, but it's not very efficient, and it doesn't translate well to the rigid structure of a CPU. (In your wonderful little high-level world, you can pretend the stack can grow freely. For the CPU to get fast access to it, the stack would have to live in some small, fast, finite on-chip memory. So what happens when your program pushes too much data onto the stack?)
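The mismatch between a stack model and a register machine is essentially what a JIT bridges. A toy sketch (hypothetical opcode names; a real JIT would also allocate the finite set of physical registers and spill to memory when the stack runs deep) that converts stack opcodes into three-address register code:

```python
# Toy "JIT"-style pass: rewrite stack-machine opcodes as three-address
# register code, assigning one virtual register per live stack slot.

def translate(stack_code):
    slots = []       # models the evaluation stack as virtual registers
    out = []
    next_reg = 0
    for op, *args in stack_code:
        if op == "push":
            r = f"r{next_reg}"; next_reg += 1
            out.append(f"mov {r}, {args[0]}")
            slots.append(r)
        elif op in ("add", "mul"):
            b, a = slots.pop(), slots.pop()
            out.append(f"{op} {a}, {a}, {b}")   # a := a op b
            slots.append(a)
    return out

# (a + b) * c as stack code:
code = [("push", "a"), ("push", "b"), ("add",),
        ("push", "c"), ("mul",)]
for line in translate(code):
    print(line)
```

Five stack opcodes become five register instructions here, but only because the expression is tiny; deciding which values stay in which of the CPU's limited registers is the hard, machine-specific part the IL deliberately leaves out.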

Yes, you could make a CPU to execute MSIL directly, but what would you gain? You'd no longer need to JIT code before execution, so the first time you start a program, it would launch a bit faster. Apart from that, though? Once your MSIL program has been JIT'ed, it has been translated to machine code and runs as efficiently as if it had been written in machine code originally. MSIL bytecode no longer exists, just a series of instructions understood by the CPU.

In fact, you'd be back where you were before .NET. Non-managed languages are compiled straight to machine code, just as IL would be under your suggestion. The only difference is that non-managed code targets machine code designed by CPU designers to be suitable for execution on a CPU, while in your case it would target machine code designed by software designers to be easy to translate to and from.

jalf
Very thorough, great post!
Bob Somers
In the old days there were CISC and RISC processors; now some CISC chips translate their instructions down to RISC-like operations internally. I think this is what the OP is referring to.
Martin Brown
Yeah, but once again, what would be the point? You'd lose the flexibility/portability that JIT'ed bytecode offers, and you'd require the CPU to do more work for every instruction, just to save a bit of work up front during JIT-compilation.
jalf