I came across a problem about implementing a load byte instruction in a single-cycle datapath without changing the data memory. The solution was the following:

[Image: datapath diagram from the solution]

This is actually quite a realistic question; most memory systems are entirely word-based, and individual bytes are typically only dealt with inside the processor. When you see a “bus error” on many computers, this often means that the processor tried to access a memory address that was not properly word-aligned, and the memory system raised an exception.

Anyway, because byte addresses might not be a multiple of 4, we cannot pass them to memory directly. However, we can still get at any byte, because every byte can be found within some word, and all word addresses are multiples of 4. So the first thing we do is to make sure we get the right word. If we take the high 30 bits of the address (i.e., ALUresult[31-2]) and combine them with two 0 bits at the low end (this is what the “left shift 2” unit is really doing), we have the byte address of the word that contains the desired byte. This is just the byte’s own address, rounded down to a multiple of 4. This change means that lw will now also round addresses down to multiples of 4, but that’s OK since non-aligned addresses wouldn’t work for lw anyway with this memory unit.

OK, now we get the data word back from memory. How do we get the byte we want out of it? Well, note that the byte’s byte-offset within the word is just given by the low-order 2 bits of the byte’s address. So, we simply use those 2 bits to select the appropriate byte out of the word using a mux. Note the use of big-endian byte numbering, as is appropriate for MIPS.

Next, we have to zero-extend the byte to 32 bits (i.e., just combine it with 24 zeros at its high end), because the problem specifies to do so. Actually, this was a slight mistake in the question: in reality, the lbu instruction zero-extends the byte, but lb sign-extends it. Oh, well.

Finally, we have to extend the MemtoReg-controlled mux to accept one new input: the zero-extended byte for the lb case. The MemtoReg control signal must be widened to 2 bits. The original 0 and 1 cases change to 00 and 01, respectively, and we add a new case 10 which is only used in the case of lb.
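The steps in the quoted solution can be modeled in software. The following is only a rough Python sketch, not the actual hardware: the function names, the use of a `bytes` object as the word-only memory, and the assignment of 00 to the ALU result and 01 to the memory word are assumptions made for illustration.

```python
def read_word(memory, word_addr):
    """Model of a word-only memory: returns the 32-bit big-endian word
    stored at an address that must be a multiple of 4."""
    if word_addr % 4 != 0:
        raise ValueError("memory only accepts word-aligned addresses")
    return int.from_bytes(memory[word_addr:word_addr + 4], "big")


def load_byte_unsigned(memory, alu_result):
    """Zero-extending byte load built on top of the word-only memory."""
    # ALUresult[31-2] with two 0 bits appended: the address rounded down to a multiple of 4.
    word_addr = (alu_result >> 2) << 2
    # The low-order 2 bits are the byte offset within the word.
    offset = alu_result & 0b11
    word = read_word(memory, word_addr)
    # Big-endian numbering: offset 0 is the most significant byte.
    shift = 8 * (3 - offset)
    # Masking to 8 bits leaves the upper 24 bits zero (zero extension).
    return (word >> shift) & 0xFF


def write_back_mux(mem_to_reg, alu_result, mem_word, zero_ext_byte):
    """2-bit MemtoReg mux. Assumption: 00 selects the ALU result and 01 the
    memory word; the source only says the old 0 and 1 become 00 and 01."""
    return {0b00: alu_result, 0b01: mem_word, 0b10: zero_ext_byte}[mem_to_reg]


# Hypothetical memory contents: two words, 0x11223344 and 0xAABBCCDD.
memory = bytes([0x11, 0x22, 0x33, 0x44, 0xAA, 0xBB, 0xCC, 0xDD])
```

For example, `load_byte_unsigned(memory, 6)` reads the word at address 4 and then selects the byte at offset 2 within it.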

I don't quite understand how this works even after reading the explanation, especially the claim that left-shifting the ALU result by 2 gives the byte address. How is this possible? If I wanted to load a half word, would I do one left shift to get the address of the half word? Also, what would be a better way to implement load byte and load half word if the data memory could be modified? (The question above adds the constraint that we can't modify the data memory.)

A: 

The original author simply seems to be adding a byte multiplexer to the 32-bit data being read from the memory. This memory allows a full 32-bit naturally aligned load (lw instruction) and the additional byte multiplexer and zero extension allows for load byte instructions as well (lbu instruction).

The left shift of the ALU result yields a word address, NOT a byte address, and accounts for the implicit right shift by two in the signal routing. The end result is simply the lower two bits of the ALU result being masked (zeroed) before being sent to the memory. The two LSBs of the ALU value are fed down-stream of the memory to the byte multiplexer, allowing the word memory to read arbitrary bytes.
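To make the point about masking concrete, here is a small Python sketch. It assumes 32-bit addresses, and the function names are made up for illustration. It shows that dropping the two LSBs and shifting back left gives the same value as clearing the two LSBs directly.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit address width


def word_address_by_shifting(alu_result):
    """Take ALUresult[31-2], then shift left by 2 (as drawn in the diagram)."""
    upper30 = (alu_result & MASK32) >> 2
    return (upper30 << 2) & MASK32


def word_address_by_masking(alu_result):
    """Clear the two least significant bits directly."""
    return alu_result & MASK32 & ~0b11


def byte_select(alu_result):
    """The two LSBs that go to the byte multiplexer instead of the memory."""
    return alu_result & 0b11
```

With an ALU result of `0x1007`, both functions return `0x1004`, and `byte_select` returns 3, so the memory sees an aligned address while the multiplexer picks the last byte of that word.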

There is no direct support in the logic shown for loading half-words (16 bits), just bytes and full 32-bit words. You could, however, easily modify the byte addressing logic to support half-words instead of bytes (or even both) using a similar approach.

Charles Steinkuehler
So the point is: 1. Will this work without having to shift left by 2? 2. If I want to load a half word, do I need a 2x1 mux with the least significant bit as the control signal?
EquinoX
1) There is really no left shift by two. There is a right shift by two (caused by dropping the two LSBs: ALUResult[31-2]), and the left shift by two simply puts the bits back in the same place (with zeros in the two LSBs). It would be much more obvious to simply mask the lower two bits. 2) To do an aligned half-word read, you would use ALUresult[1] to control a 16-bit 2-to-1 multiplexer. You cannot support arbitrary single-cycle half-word reads with this memory, because one of the unaligned cases (address LSBs = 11) requires bytes from two different 32-bit words (i.e., two memory accesses).
Charles Steinkuehler
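The half-word case described in the comment above can be sketched the same way. This Python model is an illustration only: the function names and memory contents are hypothetical, and it assumes the big-endian numbering used in the quoted solution, so ALUresult[1] = 0 selects the upper 16 bits of the word.

```python
def load_half_unsigned_aligned(memory, alu_result):
    """Zero-extending aligned half-word load using a 16-bit 2-to-1 mux."""
    if alu_result & 1:
        raise ValueError("half-word address must be a multiple of 2")
    word_addr = alu_result & ~0b11  # mask the two LSBs for the word memory
    word = int.from_bytes(memory[word_addr:word_addr + 4], "big")
    upper, lower = word >> 16, word & 0xFFFF
    # ALUresult[1] is the mux select; big-endian means 0 picks the upper half.
    return lower if (alu_result >> 1) & 1 else upper


def words_touched_by_half(alu_result):
    """Aligned word addresses that a 2-byte read starting at alu_result needs."""
    first = alu_result & ~0b11
    last = (alu_result + 1) & ~0b11
    return [first] if first == last else [first, last]


# Hypothetical memory contents: two words, 0x11223344 and 0xAABBCCDD.
memory = bytes([0x11, 0x22, 0x33, 0x44, 0xAA, 0xBB, 0xCC, 0xDD])
```

Calling `words_touched_by_half(3)` returns two word addresses, which is the unaligned case that cannot be served in a single memory access, while an unaligned read starting at offset 1 still fits inside one word.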