views:

461

answers:

2

I'm getting an Undefined Instruction error while running an embedded system, no coprocessor, no MMU, Atmel 9263. The embedded system has memory in the range 0x20000000 - 0x23FFFFFF. I've had two cases so far:

  1. SP 0x0030B840, LR 2000AE78 - the LR points at valid code, so I'm not sure what causes the exception, although the SP is bogus. What other addresses, registers, memory locations should I look at?

  2. SP 0x20D384A8, LR 0x1FFCA59C - SP is ok, LR is bogus. Is there some kind of post mortem that I can do to find out how the LR got crushed? Looks like it rolled backwards off the end of the address space, but I can't figure out how.

Right now I am just replacing large chunks of code with simulations and running the tests agin to try and isolate the issue - the problem is sometimes it takes 4 hours to show the problem.

Any hints out there would be appreciated, thanks!

The chip is the AT91SAM9263, and we are using the IAR EWARM toolchain. I'm pretty sure it is straight ARM, but I will check.

EDIT

Another example of the Undef Instruct - this time SP/LR look fine. LR = 0x2000b0c4, and when I disassemble near there:

2000b0bc e5922000 LDR R2, [R2, #+0]
2000b0c0 e12fff32 BLX R2
2000b0c4 e1b00004 MOVS R0, R4

since LR is the instruction following the Undef Exception - how is BLX identified as Undefined? Note that CPSR is 0x00000013, so this is all ARM mode. However, R2 is 0x226d2a08 which is in the heap area, and I think is incorrect - the disassmbly there is ANDEQ R0,R0,R12, the instruction is 0x0000000C, and the other instructions there look like data to me. So I think the bad R2 is the problem, I'm just trying to understand why the Undef at the BLX?

thanks!

+1  A: 

Check the T bit in the CPSR. If you are inadvertently changing from ARM mode to Thumb mode (or vice versa), undefined instructions will occur.

As far as the SP or LR getting corrupted, it could be that you execute a few instructions in the wrong mode that corrupt them before hitting the undefined instruction.

EDIT

Responding to the new error case in the edit of the question:

LR contains the return address from the BLX R2, so it makes sense that it points to one instruction after the BLX.

If R2 was pointing to the heap when the BLX R2 was executed, you'll jump into the heap and start executing the data as if they were instructions. This will cause an undefined instruction exception in short order...

If you want to see the exact instruction that was undefined, look at the R14_und register (defined while you're in the undefined instruction handler) - it contains the address of the next instruction after the Undefined one.

The root cause is the bad value in R2. Assuming this is C code, my guess is a bad pointer dereference, but I'd need to see the source to know for sure.

Brent
CPSR = 0x00000013 which means ARM mode, right?
Jeff
Yep, bit 5 is the T bit, and 0 means ARM.
Brent
+1  A: 

Is this an undefined instruction or a data abort because you are reading from an unaligned address?

edit:

On an undefined exception CPSR[4:0] should be 0b11011 or 0x1B not 0x13, 0x13 is a reset according to the arm arm.

dwelch