I have encountered a problem in a C program running on an AVR microcontroller (ATMega328P). I believe it is due to a stack/heap collision but I'd like to be able to confirm this.

Is there any way I can visualise SRAM usage by the stack and the heap?

Note: the program is compiled with avr-gcc and uses avr-libc.

Update: The actual problem I am having is that the malloc implementation is failing (returning NULL). All mallocing happens on startup and all freeing happens at the end of the application (which in practice is never since the main part of the application is in an infinite loop). So I'm sure fragmentation is not the issue.

+1  A: 

The usual approach would be to fill the memory with a known pattern and then to check which areas are overwritten.

starblue
+2  A: 

If you're using both stack and heap, then it can be a little more tricky. I'll explain what I've done when no heap is used. As a general rule, all the companies I've worked for (in the domain of embedded C software) have avoided using heap for small embedded projects—to avoid the uncertainty of heap memory availability. We use statically declared variables instead.

One method is to fill most of the stack area with a known pattern (e.g. 0x55) at start-up. This is usually done by a small bit of code early in the software execution, either right at the start of main(), or perhaps even before main() begins, in the start-up code. Take care not to overwrite the small amount of stack in use at that point of course. Then, after running the software for a while, inspect the contents of stack space and see where the 0x55 is still intact. How you "inspect" depends on your target hardware. Assuming you have a debugger connected, then you can simply stop the micro running and read the memory.
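
A minimal sketch of the fill-and-inspect idea, for the case where you have no debugger and want the firmware to measure itself. The function names and the 0x55 constant are illustrative only. On a real target the region bounds would come from your linker script or map file, and the upper bound must stay below the current stack pointer.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0x55u  /* the known pattern; any fixed value works */

/* Fill [lo, hi) with the pattern. On a real target, hi must stay below the
 * current stack pointer so the live part of the stack is not overwritten. */
void paint_region(volatile uint8_t *lo, volatile uint8_t *hi)
{
    while (lo < hi) {
        *lo++ = STACK_FILL;
    }
}

/* Count pattern bytes still intact, scanning up from the low end. With a
 * stack that grows down, this is the space the stack never reached. */
size_t count_untouched(const volatile uint8_t *lo, const volatile uint8_t *hi)
{
    size_t n = 0;
    while (lo < hi && *lo == STACK_FILL) {
        lo++;
        n++;
    }
    return n;
}
```

Call paint_region() once at start-up, then print count_untouched() over the same bounds periodically. The count can be slightly optimistic if the deepest stack bytes happen to equal the fill value.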

If you have a debugger that can do a memory-access breakpoint (a bit more fancy than the usual execution breakpoint), then you can set a breakpoint in a particular stack location—such as the farthest limit of your stack space. That can be extremely useful, because it also shows you exactly what bit of code is running when it reaches that extent of stack usage. But it requires your debugger to support the memory-access breakpoint feature and it's often not found in the "low-end" debuggers.

If you're also using heap, then it can be a bit more complicated because it may be impossible to predict where stack and heap will collide.

Craig McQueen
A: 

If you can edit the code for your heap, you could pad it with a couple of extra bytes (tricky on such low resources) on each block of memory. These bytes could contain a known pattern different from the stack. This might give you a clue if it collides with the stack by seeing it appear inside the stack or vice versa.
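
If you cannot edit the heap code itself, a wrapper can approximate the padding idea. This is only a sketch: guarded_malloc, guarded_ok and guarded_free are made-up names, the guard value 0xA5 is arbitrary, and the stored size header plus two guard bytes cost extra RAM on every block (4 bytes per block where size_t is 2 bytes, as on AVR).

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  2
#define GUARD_BYTE 0xA5  /* chosen to differ from any stack fill pattern */

/* Layout of each block: [size_t size][user data ...][GUARD_LEN guard bytes] */
void *guarded_malloc(size_t size)
{
    unsigned char *raw = malloc(sizeof(size_t) + size + GUARD_LEN);
    if (raw == NULL) {
        return NULL;
    }
    memcpy(raw, &size, sizeof size);
    memset(raw + sizeof(size_t) + size, GUARD_BYTE, GUARD_LEN);
    return raw + sizeof(size_t);
}

/* Returns 1 if the guard bytes after the block are intact, 0 otherwise. */
int guarded_ok(const void *ptr)
{
    const unsigned char *user = ptr;
    size_t size;
    memcpy(&size, user - sizeof(size_t), sizeof size);
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (user[size + i] != GUARD_BYTE) {
            return 0;
        }
    }
    return 1;
}

void guarded_free(void *ptr)
{
    if (ptr != NULL) {
        free((unsigned char *)ptr - sizeof(size_t));
    }
}
```

Calling guarded_ok() on the most recently allocated block from the main loop would show whether something, possibly the stack, has written over the end of it.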

DoxaLogos
The pattern could be checked in the free function, but it will still be hard to find out when the error occurred. Also note that sometimes extra stack space is reserved for local variables that might not be used (depends on compiler/code). In that case the patterns could be left unaltered while the heap is still corrupted.
Ron
+1  A: 

I'm assuming you're using just one stack (so not an RTOS or anything), that the stack is at the end of memory and grows down, and that the heap starts after the BSS/DATA region and grows up. I've seen implementations of malloc that actually check the stack pointer and fail on a collision. You could try to do that.
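
The kind of check such a malloc might make can be sketched as a pure function. The name heap_request_fits and the margin parameter are illustrative, not taken from any particular allocator. avr-libc documents tunables named __malloc_margin and __malloc_heap_end for a similar purpose, so its malloc documentation is worth reading before writing your own check.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if a block of `request` bytes starting at `heap_top` would end
 * at least `margin` bytes below `stack_ptr`, and 0 otherwise.
 * Assumes the heap grows up and the stack grows down. */
int heap_request_fits(uintptr_t heap_top, uintptr_t stack_ptr,
                      size_t request, size_t margin)
{
    if (stack_ptr <= heap_top) {
        return 0;  /* heap and stack have already met */
    }
    uintptr_t gap = stack_ptr - heap_top;
    if (gap < margin) {
        return 0;  /* not even the safety margin is left */
    }
    return request <= gap - margin;
}
```

The margin exists because the stack keeps growing after the allocation succeeds, so a request that fits exactly is not safe.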

If you're not able to adapt the malloc code, you could choose to put your stack at the start of the memory (using the linker file). In general it's always a good idea to know/define the maximum size of the stack. If you put it at the start, you'll get an error on reading beyond the beginning of the RAM (at least on hardware that traps such accesses). The heap will be at the end and probably cannot grow beyond the end if it's a decent implementation (it will return NULL instead). The good thing is that you now have 2 separate error cases for 2 separate issues.

To find out the maximum stack size, you could fill your memory with a pattern, run the application and see how far the stack went; see also Craig's reply.

Ron
The malloc implementation is failing (returning NULL). The problem I have is that I'm not sure it is a collision which is causing it...
Matthew Murdoch
A collision would typically result in very weird things like returning to the wrong function, data changing, or suddenly executing outside the RAM/FLASH and getting an addressing error. If your application looks "normal" it's likely not a collision. To debug this, set a breakpoint where the NULL is returned (or if you can debug the malloc function itself, setting the breakpoint there is even better). At that point, check the stack pointer and see whether a collision has occurred. Also, is your heap defined as "whatever memory is not used" or can you set the heap size?
Ron
I don't have the luxury of a debugger unfortunately...
Matthew Murdoch
Can you print things to e.g. the serial line? If so, you could still print the info to the serial line when it occurs. To make it easier, you could write a wrapper around the malloc function and call that from your code instead of malloc. In the wrapper, check for the NULL return value and print the stack pointer. If you don't have access to a print function, how are you debugging at the moment?
Ron
@Ron - I can print to the serial line. How would you suggest I print the stack pointer?
Matthew Murdoch
One ugly hack I used before is to make a local variable array. For example make "char buffer[8]" in the scope of a "printStack" function. This buffer is allocated on the stack. Then use a for-loop to print this buffer but instead of just reading 8 bytes, you can read 256 bytes or just how many you want. Reading past the boundary of "buffer" will result in reading up the stack. To know the address of the stackpointer, print the address of buffer. It's not 100% correct but it'll give you an idea. I know, it's ugly, but it works.
Ron
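
The wrapper and the local-variable trick from the comments above could be combined roughly like this. It is a sketch only: debug_malloc is a made-up name, and it assumes printf has already been routed to the serial line, which on avr-libc needs a stdout stream to be set up first.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: call this instead of malloc(). */
void *debug_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL) {
        char marker;  /* lives on the stack, so its address approximates SP */
        printf("malloc(%lu) failed, stack near 0x%lx\n",
               (unsigned long)size,
               (unsigned long)(uintptr_t)&marker);
    }
    return p;
}
```

The printed address is that of a local inside the wrapper, so it sits a few bytes below the caller's stack pointer. That is close enough to tell whether the stack is anywhere near the heap.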
+2  A: 

Don't use the heap / dynamic allocation on embedded targets. Especially with a processor with such limited resources. Rather redesign your application because the problem will reoccur as your program grows.

Gerhard
+1  A: 

On Unix-like operating systems a library function named sbrk() with a parameter of 0 allows you to access the topmost address of dynamically allocated heap memory. The return value is a void * pointer and could be compared with the address of an arbitrary stack-allocated variable.

The result of this comparison should be used with care. Depending on the CPU and system architecture, the stack may be growing down from an arbitrary high address while the allocated heap will move up from low-bound memory.
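
There is no sbrk() on a bare-metal AVR, but the same comparison can be sketched with the address of a local variable and the allocator's break value. The helper name heap_stack_gap is illustrative. The __AVR__ section assumes the avr-libc symbols __heap_start and __brkval, so verify them against your toolchain before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes between the top of the heap and a stack address, or 0 if the two
 * have already met. Assumes the heap grows up and the stack grows down. */
size_t heap_stack_gap(uintptr_t heap_top, uintptr_t stack_addr)
{
    return stack_addr > heap_top ? (size_t)(stack_addr - heap_top) : 0;
}

#ifdef __AVR__
/* Assumption: avr-libc symbols; __brkval stays NULL until the first malloc. */
extern char __heap_start;
extern char *__brkval;

size_t free_ram(void)
{
    char top;  /* address of a local approximates the stack pointer */
    uintptr_t heap_top = (uintptr_t)(__brkval ? __brkval : &__heap_start);
    return heap_stack_gap(heap_top, (uintptr_t)&top);
}
#endif
```

This only measures the gap between the heap's high-water mark and the stack. Blocks that were freed below that mark are not counted.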

Sometimes the operating system has other concepts for memory management (e.g. OS/9) which place heap and stack in different memory segments in free memory. On these operating systems, especially on embedded systems, you need to define the maximum memory requirements of your applications in advance so that the system can allocate memory segments of matching sizes.

Ralf Edmund
The ATMega328P doesn't run Linux/BSD (it only has 2K of RAM and I'm not running an OS on it at all) and yes, the stack grows down and the heap grows up.
Matthew Murdoch
+3  A: 

You say malloc is failing and returning NULL:

The obvious cause, which you should look at first, is that your heap is "full", i.e. the memory you've asked malloc for cannot be allocated because it's not available.

There are two scenarios to bear in mind:

a: You have a 16K heap, you've already malloced 10K and you try to malloc a further 10K. Your heap is simply too small.

b: More commonly, you have a 16K heap, you've been doing a bunch of malloc/free/realloc calls and your heap is less than 50% 'full'. You call malloc for 1K and it FAILS. What's up? Answer: the heap's free space is fragmented, so there isn't a contiguous 1K of free memory that can be returned. C heap managers cannot compact the heap when this happens, so you're generally in a bad way. There are techniques to avoid fragmentation, but it's difficult to know if this is really the problem. You'd need to add logging shims to malloc and free so that you can get an idea of what dynamic memory operations are being performed.
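
A logging shim of the kind mentioned in (b) might look like the following sketch. The names logged_malloc and logged_free and the counters are made up for illustration, and the output is assumed to go wherever printf is routed (for example a serial port).

```c
#include <stdio.h>
#include <stdlib.h>

/* Running totals, so the log shows how much has been asked for overall. */
static unsigned long malloc_calls, malloc_failures, free_calls;
static unsigned long bytes_requested;

void *logged_malloc(size_t size)
{
    void *p = malloc(size);
    malloc_calls++;
    bytes_requested += (unsigned long)size;
    if (p == NULL) {
        malloc_failures++;
    }
    printf("malloc #%lu size=%lu total=%lu -> %s\n", malloc_calls,
           (unsigned long)size, bytes_requested, p ? "ok" : "NULL");
    return p;
}

void logged_free(void *p)
{
    free_calls++;
    printf("free #%lu\n", free_calls);
    free(p);
}
```

If the running total at the first failure is already close to the heap size, scenario (a) is the likely explanation. A failure at a much lower total points towards fragmentation or a heap that is smaller than expected.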

EDIT:

You say all mallocs happen at startup, so fragmentation isn't the issue.

In which case, it should be easy to replace the dynamic allocation with static.

old code example:

#include <stdlib.h>

char *buffer;

void init(void)
{
  buffer = malloc(BUFFSIZE);  /* BUFFSIZE is defined elsewhere */
}

new code:

char buffer[BUFFSIZE];

Once you've done this everywhere, your LINKER should warn you if everything cannot fit into the memory available. Don't forget to reduce the heap size - but beware that some runtime io system functions may still use the heap, so you may not be able to remove it entirely.

Roddy
+1 It's most likely a). All mallocing happens on startup and all freeing happens at the end of the application (which in practice is never since the main part of the application is in an infinite loop). So I'm sure fragmentation is not the issue.
Matthew Murdoch
This is a very important point! Please, stop dropping these little hints in comments and just add as much detail as you can to the question itself!
Artelius
@Artelius - Have added this to the question. Thanks.
Matthew Murdoch