I have a watchdog in my microcontroller that resets the processor if it is not kicked. My application runs fine for a while but eventually resets because the watchdog did not get kicked. If I step through the program, it works fine.

What are some ways to debug this?

EDIT: Conclusion: The way I found my bug was the watchdog breadcrumbs.

I am using a PIC that has a high and a low ISR vector. The high vector was supposed to handle the LED matrix and the low vector was supposed to handle the timer tick, but I had put both ISR handlers in the high vector. So when I disabled the LED matrix interrupt and the timer tick needed service, the processor would get stuck in the low vector trying to handle the timer tick, because the timer tick handler was not there.

The breadcrumbs narrowed my search down to the function that handles the LED matrix, and specifically to the point where it disables the LED matrix interrupt.

A: 

Question every assumption you make, twice:

  • Make sure the watchdog is kicked (I don't know the logging facilities on the processor).
  • Make sure the watchdog, when kicked, doesn't reset the processor.

Also consider what differs between 'stepping through' and running freely; timing constraints will surely matter.

xtofl
+3  A: 

Add an uninitialized global variable that is set to different values throughout the code. Specifically, set it before and after major function calls.

Put a breakpoint at the beginning of main.

When the processor resets, the global variable will still hold the last value it was set to. Keep adding these "breadcrumbs" to narrow the search down to the problem function.
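A minimal sketch of the breadcrumb idea, assuming a simple main loop. The names (`g_breadcrumb`, the `CRUMB_*` values, `update_led_matrix`, `do_timer_work`) are made up for illustration. On a real target the variable has to live in RAM that the startup code does not clear; how to request that is toolchain-specific and is not shown here.

```c
#include <stdint.h>

/* Hypothetical breadcrumb values; one per checkpoint in the code. */
enum {
    CRUMB_NONE              = 0,
    CRUMB_BEFORE_LED_UPDATE = 1,
    CRUMB_AFTER_LED_UPDATE  = 2,
    CRUMB_BEFORE_TIMER_WORK = 3,
    CRUMB_AFTER_TIMER_WORK  = 4
};

/* Assumption: on the target this variable is placed in a section that
 * survives a watchdog reset (not zeroed by the startup code). */
volatile uint8_t g_breadcrumb;

/* Placeholders standing in for the real application functions. */
static void update_led_matrix(void) { }
static void do_timer_work(void) { }

/* One pass of the main loop, with a breadcrumb before and after
 * each major call. */
void main_loop_iteration(void)
{
    g_breadcrumb = CRUMB_BEFORE_LED_UPDATE;
    update_led_matrix();
    g_breadcrumb = CRUMB_AFTER_LED_UPDATE;

    g_breadcrumb = CRUMB_BEFORE_TIMER_WORK;
    do_timer_work();
    g_breadcrumb = CRUMB_AFTER_TIMER_WORK;
}
```

With a breakpoint at the start of main, a value of `CRUMB_BEFORE_LED_UPDATE` after a reset would suggest the code never returned from `update_led_matrix()`, so that is where the next round of breadcrumbs should go.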

Robert
+1  A: 

Many software watchdogs are automatically disabled when you attach a debugger (to prevent them from restarting the processor while the debugger has the application halted).

That said, here are some basics:

Is this a multithreaded application? Are you using an RT scheduler? If so, is your watchdog task starved?

Make sure your watchdog task can't be stuck on anything (pending semaphore, waiting for a message, etc). Sometimes, functions can block in ways you might not expect; for example, I have a Linux platform I'm working on right now where I can get printf to block quite easily.

If it's single threaded, a profiler may help you identify timing issues.

If this is a new system, make sure the watchdog works correctly; test simple code that just hits the WD and then sleeps in an infinite loop.

Mikeage
+1 for the first paragraph in particular.
Steve Melnikoff
+1  A: 

I use state-based programming, and a trick I've always wanted to employ is to reserve one output port for the current state in binary, then hook up a logic analyzer and see the timings of the state changes. You could do something similar here: do what Robert said and create a global variable, and change its value at key points, preferably with a function that immediately sets the value of the port to the current state (i.e. changeState(nextState);). Change the state when you enter the function that kicks the dog, then change it back to the previous state before you leave the function. You should be able to see which functions it DOESN'T get kicked from, and then you can work on those.
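Roughly what I have in mind, as an untested-on-hardware sketch. `DEBUG_PORT` is just a variable standing in for whatever memory-mapped output port register you reserve, and the state names and `kick_watchdog` are hypothetical:

```c
#include <stdint.h>

/* Stand-in for the output port register reserved for debugging.
 * Assumption: on the target this would be the real port latch. */
volatile uint8_t DEBUG_PORT;

/* Hypothetical application states. */
enum {
    STATE_IDLE          = 0,
    STATE_LED_REFRESH   = 1,
    STATE_KICK_WATCHDOG = 2
};

static uint8_t current_state = STATE_IDLE;

/* Record the new state and mirror it on the port right away so a
 * logic analyzer sees the transition. */
void changeState(uint8_t next_state)
{
    current_state = next_state;
    DEBUG_PORT = next_state;
}

/* Mark the kick on the port, then restore the state we came from. */
void kick_watchdog(void)
{
    uint8_t previous_state = current_state;

    changeState(STATE_KICK_WATCHDOG);
    /* The device-specific watchdog clear would go here. */
    changeState(previous_state);
}
```

On the analyzer, the state that is showing when the kick pulses stop coming is the one to look at.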

Good luck; it sounds like a timing problem, and those are tough to solve.

Stephen Friederichs
A: 

Usually the watchdog task/thread runs at a low priority. So if the watchdog isn't getting kicked, it is most likely because the processor is busy doing something else, probably something that it shouldn't be doing.

It would be really useful to dump out the execution context (local stack, scheduling state etc.) for each task/thread just before the processor resets. With a bit of luck and work, you'll be able to determine what is preventing the watchdog task from kicking the timer.

billmcc
A: 

I'd use an extra output pin, set high then low at appropriate points in the code, to limit the scope of where I'm looking. Then I'd trace it on a digital scope or logic analyzer. This is equivalent to the breadcrumbs method mentioned by another poster, but you'll be able to time-correlate it with the reset pulse much better.
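Something along these lines, as a sketch only. `DEBUG_LATCH` is a plain variable standing in for the real output latch, and the bit position and `suspect_function` are assumptions picked for illustration:

```c
#include <stdint.h>

/* Stand-in for the output latch that holds the spare pin.
 * Assumption: on the target this is a memory-mapped register. */
volatile uint8_t DEBUG_LATCH;

/* Hypothetical choice of bit 3 as the spare debug pin. */
#define DEBUG_PIN_MASK ((uint8_t)(1u << 3))

/* Set or clear only the debug pin, leaving the other bits alone. */
static inline void debug_pin_high(void)
{
    DEBUG_LATCH |= DEBUG_PIN_MASK;
}

static inline void debug_pin_low(void)
{
    DEBUG_LATCH &= (uint8_t)~DEBUG_PIN_MASK;
}

/* Placeholder for the code region under suspicion. */
static void suspect_function(void) { }

/* Bracket the suspect region so the pin is high while it runs. */
void run_suspect_with_marker(void)
{
    debug_pin_high();
    suspect_function();
    debug_pin_low();
}
```

If the pin is still high when the reset pulse arrives, execution was inside the bracketed region; move the markers inward and repeat.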

Michael Kohne