I've been working in C and CPython for the past 3 - 5 years. Consider that my base of knowledge here.

If I were to issue an assembly instruction such as MOV AL, 61h to a processor that supported it, what exactly is inside the processor that interprets this code and dispatches it as voltage signals? How would such a simple instruction likely be carried out?

Assembly even feels like a high level language when I try to think of the multitude of steps contained in MOV AL, 61h or even XOR EAX, EBX.

EDIT: I read a few comments asking why I put this as embedded when the x86-family is not common in embedded systems. Welcome to my own ignorance. Now I figure that if I'm ignorant about this, there are likely others ignorant of it as well.

It was difficult for me to pick a favorite answer considering the effort you all put into your answers, but I felt compelled to make a decision. No hurt feelings, fellas.

I often find that the more I learn about computers the less I realize I actually know. Thank you for opening my mind to microcode and transistor logic!

EDIT #2: Thanks to this thread, I have just comprehended why XOR EAX, EAX is faster than MOV EAX, 0h. :)

+7  A: 

This is a big question, and at most universities there's an entire semester-long class to answer it. So, rather than give you some terribly butchered summary in this little box, I'll direct you to the textbook that has the whole truth: Computer Organization and Design: The Hardware/Software Interface by Patterson and Hennessy.

Crashworks
+13  A: 

Edit: Here is an example of a CPU (the 6502) that has been simulated in Python/JavaScript AT THE TRANSISTOR LEVEL: http://visual6502.org You can put your own code in to see how it does what it does.

Edit: Excellent 10,000 m level view: The Soul of a New Machine by Tracy Kidder

I had great difficulty envisioning this until I did microcoding. Then it all made sense (abstractly). This is a complex topic, but here is a very, very high-level view.

Essentially think of it like this.

A CPU instruction is essentially a set of charges stored in the electrical circuits that make up memory. There is circuitry that causes those charges to be transferred from memory to the inside of the CPU. Once inside the CPU, the charges are set as input to the wiring of the CPU's circuitry. This is essentially a mathematical function that will cause more electrical output to occur, and the cycle continues.

Modern CPUs are far, far more complex and include many layers of microcoding, but the principle remains the same. Memory is a set of charges. There is circuitry to move the charges, and other circuitry to carry out functions which result in other charges (output) being fed to memory or to other circuitry that carries out other functions.

To understand how the memory works you need to understand logic gates and how they are created from multiple transistors. This leads to the discovery that hardware and software are equivalent in the sense that they both essentially perform functions in the mathematical sense.

Preet Sangha
+12  A: 

This is a question that requires more than an answer on StackOverflow to explain.

To learn about this all the way from the most basic electronic components up to basic machine code, read The Art of Electronics, by Horowitz and Hill. To learn more about computer architecture, read Computer Organization and Design by Patterson and Hennessy. If you want to get into more advanced topics, read Computer Architecture: A Quantitative Approach, by Hennessy and Patterson.

By the way, The Art of Electronics also has a companion lab manual. If you have the time and resources available, I would highly recommend doing the labs; I actually took the classes taught by Tom Hayes, in which we built a variety of analog and digital circuits, culminating in building a computer from a 68k chip, some RAM, some PLDs, and some discrete components. You would enter machine code directly into RAM using a hexadecimal keypad; it was a blast, and a great way to get hands on experience at the very lowest levels of a computer.

Brian Campbell
The Art of Electronics rocks.
aaa
Too bad it hasn't been updated recently. It's getting somewhat dated. :-( Otherwise an excellent resource!
Brian Knoblauch
I would also recommend the later chapters in SICP (http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-30.html#%_chap_5)
TokenMacGuy
@TokenMacGuy I would recommend anyone interested in programming read all of SICP, but I would say that for this particular question, Horowitz and Hill is better for the low-level, hands-on kind of experience, and Patterson and Hennessy is better for describing real-world, relatively modern computer architecture. But yeah, I will always second a recommendation to read SICP.
Brian Campbell
What the chapters discussing register machines did for me was help me understand a bit better how logic gates are combined to form functional blocks and how those blocks are combined to execute instructions.
TokenMacGuy
@Brian Knoblauch I think they are working on the next edition.
aaa
Woaaah, that class sounds *awesome*.
Paul Nathan
+5  A: 

A simpler but still very good introduction to a computer, from the wire up:
http://img.amazon.com/images/I/31GBgcA5PML._BO2,204,203,200_PIsitb-sticker-arrow-click,TopRight,35,-76_AA240_SH20_OU15_.jpg

Charles Petzold's Code

Martin Beckett
I <3ed this book; it's really quite enjoyable to read.
James McNellis
+1 This is the book OP needs to read.
claws
+1  A: 

The most basic element in a digital circuit is the logic gate. Logic gates can be used to build logic circuits that perform Boolean arithmetic, decoders, or sequential circuits such as flip-flops. The flip-flop can be thought of as a 1-bit memory. It is the basis of more complex sequential circuits, such as counters or registers (arrays of bits).
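To make the "1-bit memory" idea a little more concrete, here is a minimal Python sketch of a latch built from two cross-coupled NAND gates. The names `nand` and `SRLatch` are my own, the inputs are active-low, and real flip-flops add clocking and timing behaviour that this ignores.

```python
def nand(a: int, b: int) -> int:
    """NAND gate: the output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


class SRLatch:
    """Active-low set/reset latch made of two cross-coupled NAND gates."""

    def __init__(self) -> None:
        self.q = 0      # the stored bit
        self.q_bar = 1  # complement of the stored bit

    def step(self, set_n: int, reset_n: int) -> int:
        # Each gate's output feeds the other gate's input, so iterate
        # the feedback loop a few times until the outputs settle.
        for _ in range(4):
            q = nand(set_n, self.q_bar)
            q_bar = nand(reset_n, q)
            self.q, self.q_bar = q, q_bar
        return self.q


latch = SRLatch()
latch.step(set_n=0, reset_n=1)            # pull "set" low: store a 1
held = latch.step(set_n=1, reset_n=1)     # both inputs idle: the 1 is remembered
latch.step(set_n=1, reset_n=0)            # pull "reset" low: store a 0
cleared = latch.step(set_n=1, reset_n=1)  # idle again: the 0 is remembered
print(held, cleared)
```

A register is then just several of these side by side, one per bit.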

A microprocessor is just a bunch of sequencers and registers. "Instructions" to a microprocessor are no more than patterns of bits that get sequentially pushed onto some of the registers to trigger specific sequences that perform calculations on "data". Data is represented as arrays of bits... and now we're on a higher level.

Simón
A: 

Well here's one terribly butchered summary :-)

MOV AL, 61h is again a human-readable form of code which is fed into the assembler. The assembler generates the equivalent hex code, which is basically a sequence of bytes understood by the processor and which is what you would store in memory. In an embedded system environment, the linker scripts give you fine-grained control over where to place these bytes in memory (separate areas for program, data, etc.).

The processor essentially contains a finite state machine (microcode) implemented using flip-flops. The machine reads (fetch cycle) the hex code for 'MOV' from memory, figures out (decode cycle) that it needs an operand, which in this case is 61h, fetches that from memory as well, and executes the instruction (i.e. copies 61h into the accumulator register). 'Read', 'fetch', 'execute', etc. all mean that bytes are shifted/added in and out of shift registers using digital circuits like adders, subtractors, multiplexers, etc.
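The fetch/decode/execute cycle described above can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the `run` function and the register dictionary are made up for illustration, and only two opcodes are handled. The byte values themselves are real, though: B0h followed by an immediate byte is the x86 encoding of MOV AL, imm8, and F4h is HLT.

```python
MOV_AL_IMM8 = 0xB0  # x86 opcode for MOV AL, imm8
HLT = 0xF4          # x86 opcode for HLT, used here only to stop the loop


def run(memory: bytes) -> dict:
    """Toy fetch/decode/execute loop for two opcodes (not an x86 emulator)."""
    regs = {"AL": 0, "IP": 0}
    while True:
        opcode = memory[regs["IP"]]        # fetch the opcode byte
        regs["IP"] += 1
        if opcode == MOV_AL_IMM8:          # decode: this opcode needs one operand
            operand = memory[regs["IP"]]   # fetch the operand byte
            regs["IP"] += 1
            regs["AL"] = operand           # execute: copy it into AL
        elif opcode == HLT:
            return regs
        else:
            raise ValueError(f"unknown opcode {opcode:#04x}")


program = bytes([0xB0, 0x61, 0xF4])  # MOV AL, 61h ; HLT
state = run(program)
print(hex(state["AL"]))
```

In hardware there is no `while` loop, of course; the same sequencing is done by a state machine that advances on each clock.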

itisravi
A: 

what exactly is inside the processor that interprets this code and dispatches it as voltage signals

I'd like to say 'hardware', but a truer answer is 'microcode'.

ChrisW
RISC and VLIW architectures are not microcoded and are prevalent in embedded systems.
Clifford
@Clifford `MOV AL, 61h` and `XOR EAX, EBX` are x86-family instructions.
ChrisW
@ChrisW: I took that merely as a generic example of a typical instruction; the question seemed more broad than that (perhaps too broad!); but fair point, both examples are x86 instructions. So I am left wondering why it was tagged "embedded", since the question is broader than that too (and x86 is not that common in embedded systems).
Clifford
+14  A: 

I recently started reading Charles Petzold's book titled Code, which so far covers exactly the kinds of things I assume you are curious about. But I have not gotten all the way through it, so thumb through the book first before buying or borrowing it.

This is my relatively short answer, not Petzold's... and hopefully in line with what you were curious about.

You have heard of the transistor, I assume. The original way to use a transistor was for things like a transistor radio. It is basically an amplifier: take the tiny radio signal floating in the air and feed it into the input of the transistor, which opens or closes the flow of current on a circuit next to it. You wire that circuit with higher power, so you can take a very small signal, amplify it, and feed it into a speaker, for example, and listen to the radio station (there is more to it, such as isolating the frequency and keeping the transistor balanced, but you get the idea, I hope).

Once the transistor existed, that led to a way to use a transistor as a switch, like a light switch. The radio is like a dimmer switch: you can turn it anywhere from all the way on to all the way off. A non-dimmer light switch is either all on or all off; there is some magic place in the middle of the switch where it changes over. We use transistors the same way in digital electronics. Take the output of one transistor and feed it into another transistor's input. The output of the first is certainly not a small signal like the radio wave; it forces the second transistor all the way on or all the way off. That leads to the concept of TTL, or transistor-transistor logic. Basically you have one transistor that drives a high voltage, let's call it a 1, and one that sinks to zero voltage, let's call that a 0. You arrange the inputs with other electronics so that you can create AND gates (if both inputs are 1 then the output is 1), OR gates (if either input is 1 then the output is 1), inverters, NAND gates (an AND with an inverter), NOR gates (an OR with an inverter), etc. There used to be a TTL handbook, and you could buy chips with 8 or so pins that had one, two, or four of some kind of gate (NAND, NOR, AND, etc.) inside, with two inputs and an output for each. Now we don't need those; it is cheaper to create programmable logic or dedicated chips with many millions of transistors. But we still think in terms of AND, OR, and NOT gates for hardware design (usually more like NAND and NOR).
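As a rough illustration of the switch idea, here is a Python sketch that treats each transistor as an ideal on/off switch, wires two of them in series to get a NAND, and then builds the other gates out of NAND alone. The function names are mine, and the "pull-up plus two switches to ground" picture is a simplification of what real TTL or CMOS circuits do.

```python
def transistor(gate: int) -> bool:
    """Idealised transistor used as a switch: it conducts when its input is 1."""
    return gate == 1


def nand(a: int, b: int) -> int:
    # Two switches in series between the output and ground, with a pull-up
    # on the output: the output is dragged low only if both switches conduct.
    path_to_ground = transistor(a) and transistor(b)
    return 0 if path_to_ground else 1


# Every other gate can be wired up from NAND gates alone.
def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def nor(a: int, b: int) -> int:
    return not_(or_(a, b))


# Print the truth table for each two-input gate.
print("a b | NAND AND OR NOR")
for a in (0, 1):
    for b in (0, 1):
        print(f"{a} {b} |  {nand(a, b)}    {and_(a, b)}   {or_(a, b)}   {nor(a, b)}")
```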

I don't know what they teach now, but the concept is the same: for memory, a flip-flop can be thought of as two of these TTL pairs (NANDs) tied together, with the output of one going to the input of the other. Let's leave it at that. That is basically a single bit in what we call SRAM, or static RAM. SRAM takes several transistors per bit (commonly six, four in some designs). DRAM, or dynamic RAM, the memory sticks you put in your computer yourself, takes one transistor per bit, so for starters you can see why DRAM is the thing you buy gigabytes' worth of. SRAM bits remember what you set them to so long as the power doesn't go out. DRAM starts to forget what you told it as soon as you tell it. Basically DRAM uses the transistor in yet a third way: there is some capacitance (as in capacitor, I won't get into that here) that is like a tiny rechargeable battery; as soon as you charge it and unplug the charger it starts to drain.

Think of a row of glasses on a shelf with little holes in each glass. These are your DRAM bits. You want some of them to be ones, so you have an assistant fill up the glasses you want to be a one. That assistant has to constantly refill the pitcher, go down the row, and keep the "one" glasses full enough with water, and let the "zero" glasses remain empty. That way, any time you want to see what your data is, you can look over and read the ones and zeros: water levels definitely above the middle are a one, and levels definitely below the middle are a zero. So even with the power on, if the assistant is not able to keep the glasses full enough to tell a one from a zero, they will eventually all drain out and look like zeros. It's the trade-off for more bits per chip.

So the short story here is that outside the processor we use DRAM for our bulk memory, and there is assistant logic that takes care of keeping the ones a one and the zeros a zero. But inside the chip, the AX and DS registers, for example, keep your data using flip-flops or SRAM. And for every bit you know about, like the bits in the AX register, there are likely hundreds or thousands or more that are used to get the bits into and out of that AX register.
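The leaky-glasses picture of DRAM is easy to play with in code. The sketch below is only an analogy in Python: the leak rate, the threshold, and the refresh interval are numbers I made up, and real DRAM refresh is handled by dedicated controller logic, not a loop.

```python
THRESHOLD = 0.5  # a level above this reads as 1, below as 0
LEAK = 0.8       # fraction of charge left after each time step (made-up value)


def leak(cells: list[float]) -> list[float]:
    """Every glass loses some water on each time step."""
    return [level * LEAK for level in cells]


def read(cells: list[float]) -> list[int]:
    """Compare each level with the middle of the glass."""
    return [1 if level > THRESHOLD else 0 for level in cells]


def refresh(cells: list[float]) -> list[float]:
    """The assistant: top up anything that still reads as 1, empty the rest."""
    return [1.0 if level > THRESHOLD else 0.0 for level in cells]


bits = [1, 0, 1, 1]
start = [float(b) for b in bits]

# No assistant: after a few steps every glass reads as zero.
neglected = start
for _ in range(5):
    neglected = leak(neglected)

# Assistant comes by every second step, before the ones drop below the middle.
refreshed = start
for step in range(1, 11):
    refreshed = leak(refreshed)
    if step % 2 == 0:
        refreshed = refresh(refreshed)

print(read(neglected), read(refreshed))
```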

You know that processors run at some clock speed, these days around 2 gigahertz, or two billion clocks per second. The clock is generated by a crystal (another topic), but the logic sees that clock as a voltage that goes high, zero, high, zero at this clock rate, 2 GHz or whatever (Game Boy Advances are 17 MHz, old iPods around 75 MHz, the original IBM PC 4.77 MHz).

So transistors used as switches allow us to take voltage and turn it into the ones and zeros we are familiar with both as hardware engineers and software engineers, and go so far as to give us AND, OR, and NOT logic functions. And we have these magic crystals that allow us to get an accurate oscillation of voltage.

So we can now do things like say: if the clock is a one, and my state variable says I am in the fetch-instruction state, then I need to switch some gates so that the address of the instruction I want, which is in the program counter, goes out on the memory bus, so that the memory logic can give me my instruction for MOV AL, 61h. You can look this up in an x86 manual and find that some of those opcode bits say this is a mov operation and the target is the lower 8 bits of the EAX register, and that the source of the mov is an immediate value, which means it is in the memory location after this instruction. So we need to save that instruction/opcode somewhere and fetch the next memory location on the next clock cycle. Now we have saved the mov al, immediate, we have the value 61h read from memory, and we can switch some transistor logic so that bit 0 of that 61h is stored in the bit 0 flip-flop of AL, bit 1 in bit 1, etc.

How does all that happen, you ask? Think about a Python function performing some math formula. You start at the top of the program with some inputs to the formula that come in as variables, you have individual steps through the program that might add a constant here or call the square root function from a library, etc., and at the bottom you return the answer. Hardware logic is done the same way, and today it is written in programming languages (HDLs), one of which looks a lot like C. The main difference is that your hardware functions might have hundreds or thousands of inputs, and the output is a single bit. On every clock cycle, bit 0 of the AL register is being computed with a huge algorithm, depending on how far out you want to look. Think about that square root function you called for your math operation: that function is itself one of these some-inputs-produce-an-output things, and it may call other functions, maybe a multiply or divide. So you likely have a bit somewhere that you can think of as the last step before bit 0 of the AL register, and its function is: if clock is one then AL[0] = AL_next[0]; else AL[0] = AL[0];

But there is a higher function that contains that next AL bit computed from other inputs, and a higher function, and a higher function, and much of this is created by the compiler, in the same way that your three lines of Python can turn into hundreds or thousands of lines of assembler. A few lines of HDL can become hundreds or thousands or more transistors. Hardware folks don't normally look at the lowest-level formula for a particular bit to find out all the possible inputs and all the possible ANDs and ORs and NOTs that it takes to compute it, any more than you probably inspect the assembler generated by your programs. But you could if you wanted to.
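Since the comparison is with a Python function, here is the "if clock is one then AL = AL_next" idea written as one. It is a loose sketch: `al_next`, `clock_tick`, and the `load_imm` control signal are names I invented, it handles the whole byte at once instead of a single bit, and real hardware reacts to clock edges, not to a function call.

```python
def al_next(al: int, load_imm: bool, imm: int) -> int:
    """Combinational logic: what AL should become at the next clock.

    This is a multiplexer: if the decoder has asserted the (hypothetical)
    load_imm signal, pick the immediate value, otherwise keep the old AL.
    """
    return imm & 0xFF if load_imm else al


def clock_tick(al: int, clock: int, load_imm: bool, imm: int) -> int:
    """Register update: AL only takes the new value when the clock is 1."""
    return al_next(al, load_imm, imm) if clock == 1 else al


al = 0x00
al = clock_tick(al, clock=0, load_imm=True, imm=0x61)   # clock low: AL unchanged
before = al
al = clock_tick(al, clock=1, load_imm=True, imm=0x61)   # clock high: AL takes 61h
after = al
al = clock_tick(al, clock=1, load_imm=False, imm=0xFF)  # no load signal: AL holds
print(hex(before), hex(after), hex(al))
```

In an HDL the same thing is usually a couple of lines, and the synthesis tool turns it into the flip-flops and multiplexers.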

A note on microcoding: most processors do not use microcoding. You get into it with the x86, for example, because it was a fine instruction set for its day but on the surface struggles to keep up with modern times. Other instruction sets do not need microcoding and use logic directly in the way I described above. You can think of microcoding as a different processor, using a different instruction set/assembly language, that is emulating the instruction set you see on the surface. It is not as complicated as when you try to emulate Windows on a Mac or Linux on Windows, etc. The microcoding layer is designed specifically for the job. You may think of there being only the four registers AX, BX, CX, DX, but there are many more inside. And naturally that one assembly program can somehow get executed on multiple execution paths in one core or in multiple cores.

Just like the program for the processor in your alarm clock or washing machine, the microcode program is simple and small, debugged, and burned into the hardware, hopefully never needing a firmware update. At least ideally. But as with your iPod or phone, you sometimes do want a bug fix or whatever, and there is a way to upgrade your processor (the BIOS or other software loads a patch on boot). Say you open the battery compartment of your TV remote control or calculator: you might see a hole where you can see some bare metal contacts in a row, maybe three or five or many. For some remotes and calculators, if you really wanted to, you could reprogram them and update the firmware. Normally not, though; ideally that remote is perfect, or perfect enough to outlive the TV set.

Microcoding provides the ability to get a very complicated product (millions, hundreds of millions of transistors) on the market and fix the big and fixable bugs in the field down the road. Imagine a 200-million-line Python program that your team wrote in, say, 18 months, and having to deliver it or the company will lose to the competition's product. Same kind of thing, except that only a small portion of that code can be updated in the field; the rest has to remain carved in stone. For the alarm clock or toaster, if there is a bug or the thing needs help, you just throw it out and get another.
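One way to picture the "processor inside the processor" is a table that maps each visible instruction to a short list of simpler internal steps. The Python sketch below is purely illustrative: the micro-operation names, the extra TMP and ALU registers, and the table itself are all invented, real microcode formats are proprietary and much richer, and on real chips simple instructions like these may well be handled by fixed logic with no microcode lookup at all.

```python
# Hypothetical micro-operations for two visible instructions.
MICROCODE = {
    "MOV AL, imm8": ["FETCH_IMM8_TO_TMP", "TMP_TO_AL"],
    "XOR AL, BL": ["AL_TO_ALU_A", "BL_TO_ALU_B", "ALU_XOR", "ALU_OUT_TO_AL"],
}


def execute(instruction: str, state: dict, imm: int = 0) -> dict:
    """Run the micro-operations for one visible instruction, in order."""
    for uop in MICROCODE[instruction]:
        if uop == "FETCH_IMM8_TO_TMP":
            state["TMP"] = imm & 0xFF
        elif uop == "TMP_TO_AL":
            state["AL"] = state["TMP"]
        elif uop == "AL_TO_ALU_A":
            state["ALU_A"] = state["AL"]
        elif uop == "BL_TO_ALU_B":
            state["ALU_B"] = state["BL"]
        elif uop == "ALU_XOR":
            state["ALU_OUT"] = state["ALU_A"] ^ state["ALU_B"]
        elif uop == "ALU_OUT_TO_AL":
            state["AL"] = state["ALU_OUT"]
    return state


# AL and BL are visible to the programmer; the rest are hidden internal registers.
state = {"AL": 0, "BL": 0x0F, "TMP": 0, "ALU_A": 0, "ALU_B": 0, "ALU_OUT": 0}
execute("MOV AL, imm8", state, imm=0x61)
after_mov = state["AL"]
execute("XOR AL, BL", state)
after_xor = state["AL"]
print(hex(after_mov), hex(after_xor))
```

Patching microcode in the field then amounts to replacing entries in a table like this one, while the logic that carries out each individual step stays fixed.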

If you dig through Wikipedia or just google stuff, you can look at the instruction sets and machine language for things like the 6502, Z80, 8080, and other processors. There may be 8 registers and 250 instructions, and you can get a feel from the number of transistors that those 250 assembly instructions are still a very high-level language compared to the sequence of logic gates it takes to compute each bit in a flip-flop per clock cycle. You are correct in that assumption. Except for the microcoded processors, this low-level logic is not reprogrammable in any way; you have to fix the hardware bugs with software (for hardware that has been or is going to be delivered and not scrapped).

Look up that Petzold book, he does an excellent job of explaining stuff, far superior to anything I could ever write.

dwelch
Nice answer. Though I wouldn't call it "relatively short" ;-).
sleske
@sleske It is relatively short; relative to the length which a discussion of this topic could take, such as my answer, which references three textbooks and a lab manual. Compared to that, this answer is short.
Brian Campbell
Right, that is what I was thinking. Compared to a full-blown EE degree, with years of calculus so you can take senior physics to compute electron charges, superconductors, etc., and the analog electrical engineering it takes before a transistor becomes a simple switch, it is a blip on the radar; compared to normal SO posts, it is relatively huge. sleske did put the little smiley face on there, so you know it was a joke comment...
dwelch
+9  A: 

Explaining the whole system in any detail is impossible to do without entire books, but here is a very high level overview of a simplistic computer:

  • At the lowest level there is physics and materials.
  • Using physics and materials, you can derive the NAND logic gate.
  • Using the NAND gate, you can derive all the other basic logic gates (AND, OR, XOR, NOT, etc).
  • Using the basic logic gates, you can derive more complicated circuits such as the adder, the multiplexer, and so forth.
  • Also using the basic logic gates, you can derive stateful digital circuit elements such as the flip-flop, the clock, and so forth.
  • Using your more complicated stateful circuits, you can derive higher-level pieces like counters, memory, registers, the arithmetic-logic unit, etc.
  • Now you just have to glue your high level pieces together such that:
    • A value comes out of memory
    • The value is interpreted as an instruction by dispatching it to the appropriate place (e.g. the ALU or memory) using multiplexers, etc. (Basic instruction types are read-from-memory-into-register, write-from-register-into-memory, perform-operation-on-registers, and jump-to-instruction-on-condition.)
    • The process repeats with the next instruction

To understand how an assembly instruction causes a voltage change, you simply need to understand how each of those levels is represented by the level below. For example, an ADD instruction will cause the value of two registers to propagate to the ALU, which has circuits that compute all of the logic operations. Then a multiplexer on the other side, being fed the ADD signal from the instruction, selects the desired result, which propagates back to one of the registers.
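Here is a small Python sketch of that ADD path: a one-bit full adder written with gate-level operators, chained into an 8-bit adder, and an ALU that computes every operation and lets the operation name select one result, the way the multiplexer does. The function names and the set of operations are my own choices for illustration.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """One-bit adder expressed with XOR, AND and OR gates."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def add8(x: int, y: int) -> int:
    """Ripple-carry adder: eight full adders chained through the carry."""
    result, carry = 0, 0
    for i in range(8):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result  # the carry out of bit 7 is dropped, as in an 8-bit register


def alu(op: str, x: int, y: int) -> int:
    # Like the hardware, compute every result, then let the operation
    # signal pick one of them (the multiplexer).
    results = {
        "ADD": add8(x, y),
        "AND": x & y,
        "OR": x | y,
        "XOR": x ^ y,
    }
    return results[op]


print(hex(alu("ADD", 0x61, 0x05)), hex(alu("ADD", 0xFF, 0x01)), hex(alu("XOR", 0x61, 0x61)))
```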

Strilanc
You typically don't build all the circuits up from just NAND; you use some combinations that don't entirely follow too (for efficiency). And the single most important part of any CPU is the one you omit: the latch, typically driven by a clock signal. (It's also the core of how a CPU register works.)
Donal Fellows
@Donal This is for a simplistic computer, not a practical computer. I had to cut a lot of information at the other levels as well. Also, I said flip flop instead of latch.
Strilanc
A: 

The rough draft of the book "Microprocessor Design" is currently online at Wikibooks.

I hope that someday it will include an excellent answer to that question. Meanwhile, perhaps you can still learn something from the current rough draft of an answer to that question, and help us make improvements or at least point out stuff we forgot to explain and areas where the explanation is confusing.

David Cary
+2  A: 

VERY briefly,

A machine code instruction is stored within the processor as a series of bits. If you look up MOV in the processor data sheet, you'll see that it has a hex value, like (for example) 0xA5, that is specific to the MOV instruction. (There are different types of MOV instructions with different values, but let's ignore that for the moment.)

0xA5 hex == 10100101 binary.

*(this is not a real opcode value for MOV on an X86 - I'm just picking a value for illustration purposes).

Inside of the processor, this is stored in a "register", which is really an array of flip-flops or latches, which store a voltage:

+5 0 +5 0 0 +5 0 +5
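The mapping from the hex value to that row of voltages is mechanical, as this short Python sketch shows. The `to_voltages` helper is made up for the illustration, and the +5/0 levels simply follow the example above; real logic levels depend on the chip.

```python
def to_voltages(value: int, high: str = "+5", low: str = "0") -> str:
    """Show an 8-bit value as the voltage held by each flip-flop, MSB first."""
    bits = f"{value:08b}"
    return " ".join(high if bit == "1" else low for bit in bits)


binary = f"{0xA5:08b}"
levels = to_voltages(0xA5)
print(binary)  # 10100101
print(levels)  # +5 0 +5 0 0 +5 0 +5
```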

Each of these voltages feeds into the input of a gate or collection of gates.

At the next clock edge, those gates update their output based on the input voltages from the register.

The output of those gates feeds into another level of gates, or back to themselves. That level feeds into the next, which feeds into the next, and so on.

Eventually, a gate output way down the line will be connected back to another latch/flip-flop (internal memory), or one of the output pins on the processor.

Register->(clock)->Gate A->(clock)->Gate B->pin
                                          ->latch

(ignoring feedback for different gate types and higher-level structures)

These operations happen in parallel to a certain degree, as defined by the core architecture. One of the reasons that "faster" processors (say, 2.0GHz vs 1.0GHz) perform better is that a faster clock speed (the GHz value) moves results from one collection of gates to the next more times per second.

It's important to understand that, at a very high level, all a processor does is change pin voltages. All of the glorious complexity that we see when using a device such as a PC is derived from the internal pattern of gates and the patterns in the external devices/peripherals attached to the processor, like other CPUs, RAM, etc. The magic of a processor is the patterns and sequences in which its pins change voltages, and the internal feedback that allows the state of the CPU at one moment to contribute to its state at the next. (In assembly, this state is represented by flags, the instruction pointer/counter, register values, etc.)

In a very real way, the bits of each opcode (machine code instruction) are physically tied to the internal structure of the processor (though this may be abstracted to a certain degree with an internal lookup table/instruction map where necessary).
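A real x86 example of opcode bits being tied to structure: in the one-byte opcodes B0h through B7h (MOV r8, imm8), the top five bits identify the operation and the low three bits are the number of the destination register. The Python sketch below decodes just that one instruction form; the function name is mine, and it ignores prefixes and every other instruction.

```python
# x86 numbering of the 8-bit registers (without any prefix bytes).
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def decode_mov_r8_imm8(code: bytes) -> str:
    """Decode a two-byte MOV r8, imm8 instruction by slicing the opcode bits."""
    opcode, imm = code[0], code[1]
    if opcode & 0xF8 != 0xB0:   # top five bits 10110 mean MOV r8, imm8
        raise ValueError("not a MOV r8, imm8 instruction")
    reg = REG8[opcode & 0x07]   # low three bits select the register
    return f"MOV {reg}, {imm:02X}h"


first = decode_mov_r8_imm8(bytes([0xB0, 0x61]))
second = decode_mov_r8_imm8(bytes([0xB3, 0x10]))
print(first)
print(second)
```

In silicon, those three low bits can be wired more or less straight to the logic that selects which register gets written, which is the sense in which the encoding and the structure go together.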

Hope that helps. I've also got a nice EE education under my belt and a whole lot of embedded development experience, so these abstractions make sense to me, but may not be very useful to a neophyte.

David Lively