
Hello Stack Overflow,

What would be a good way to detect a C++ memory leak in an embedded environment? I tried overloading the new operator to log every allocation, but I must have done something wrong; that approach isn't working. Has anyone else run into a similar situation?

This is the code for the new and delete operator overloads.

EDIT:

Full disclosure: I am looking for a memory leak in my program, and I am using this code, which someone else wrote, to overload the new and delete operators. Part of my problem is that I don't fully understand what it does. I know that the goal is to log the address of the caller and the previous caller, the size of the allocation, a 1 if we are allocating, a 2 if we are deallocating, plus the name of the thread that is running.

Thanks for all the suggestions, I am going to try a different approach that someone here at work suggested. If it works, I will post it here.

Thanks again to all you first-rate programmers for taking the time to answer.

StackOverflow rocks!

Conclusion

Thanks for all the answers. Unfortunately, I had to move on to a different, more pressing issue. This leak only occurred under a highly unlikely scenario. I feel crappy about just dropping it; I may go back to it if I have more time. I chose the answer I am most likely to use.

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include "nucleus.h"
#include "plus/inc/dm_defs.h"
#include "plus/inc/pm_defs.h"

#include "posix\inc\posix.h"

extern void* TCD_Current_Thread;
extern "C" void rd_write_text(char * text);
extern PM_PCB * PMD_Created_Pools_List;

// The logging state below is used by the operators but defined elsewhere in the
// module; these declarations are assumed (adjust the types to match the real
// definitions). MEMLOG_SIZE is likewise assumed to be a constant or macro
// defined elsewhere.
extern bool loggingEnabled;
extern bool memLogNarrow;
extern void* memLog;                     // raw log buffer
extern uint32_t currentMemLogWriteIndex;
extern uint32_t currentMemLogReadIndex;

typedef struct {
    void* addr;
    uint16_t size;
    uint16_t flags;
} MemLogEntryNarrow_t;

typedef struct {
    void* addr;
    uint16_t size;
    uint16_t flags;
    void* caller;
    void* prev_caller;
    void* taskid;
    uint32_t timestamp;
} MemLogEntryWide_t;

//Number of set bits in each 4-bit value (0x0 - 0xF); used below to count the
//registers saved by the caller's STMDB (push) instruction.
unsigned char MEM_bitLookupTable[] = {
 0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4
};

//#pragma CODE_SECTION ("section_ramset1_0")
void* operator new(size_t size)
{
   asm(" STR R14, [R13, #0xC]");   //save the caller's return address (LR) into temp[0]
   asm(" STR R13, [R13, #0x10]");  //save the current stack pointer (SP) into temp[1]

   if ( loggingEnabled )
   {
      uint32_t savedInterruptState;
      uint32_t currentIndex;

      // protect the thread unsafe section.
      savedInterruptState = NU_Local_Control_Interrupts(NU_DISABLE_INTERRUPTS);

      // Note that this code is FRAGILE.  It peeks backwards on the stack to find the return
      // address of the caller.  The location of the return address on the stack can be easily changed
      // as a result of other changes in this function (i.e. adding local variables, etc).
      // The offsets may need to be adjusted if this function is touched.
      volatile unsigned int temp[2];

      unsigned int *addr = (unsigned int *)temp[0] - 1;
      unsigned int count = 1 + (0x20/4);   //current stack space ***

      //Scan for previous store
      while ((*addr & 0xFFFF0000) != 0xE92D0000)
      {
         if ((*addr & 0xFFFFF000) == 0xE24DD000)
         {
            //add offset in words
            count += ((*addr & 0xFFF) >> 2);
         }
         addr--;
      }

      count += MEM_bitLookupTable[*addr & 0xF];
      count += MEM_bitLookupTable[(*addr >>4) & 0xF];
      count += MEM_bitLookupTable[(*addr >> 8) & 0xF];
      count += MEM_bitLookupTable[(*addr >> 12) & 0xF];

      addr = (unsigned int *)temp[1] + count;
      // FRAGILE CODE ENDS HERE

      currentIndex = currentMemLogWriteIndex;
      currentMemLogWriteIndex++;

      if ( memLogNarrow )
      {
         if (currentMemLogWriteIndex >= MEMLOG_SIZE/2 )
         {
            loggingEnabled = false;
            rd_write_text( "Allocation Logging is complete and DISABLED!\r\n\r\n");
         }
         // advance the read index if necessary.
         if ( currentMemLogReadIndex == currentMemLogWriteIndex )
         {
            currentMemLogReadIndex++;
            if ( currentMemLogReadIndex == MEMLOG_SIZE/2 )
            {
               currentMemLogReadIndex = 0;
            }
         }

         NU_Local_Control_Interrupts(savedInterruptState);

         //Standard operator
         //(For partition analysis: a size-0 allocation is always treated as size 1, so our partitions must be optimized for this.)
         if (size == 0) size = 1;

         ((MemLogEntryNarrow_t*)memLog)[currentIndex].size = size;
         ((MemLogEntryNarrow_t*)memLog)[currentIndex].flags = 1;    //allocated

         //Standard operator
         void * ptr;
         ptr = malloc(size);

         ((MemLogEntryNarrow_t*)memLog)[currentIndex].addr = ptr;

         return ptr;
      }
      else
      {
         if (currentMemLogWriteIndex >= MEMLOG_SIZE/6 )
         {
            loggingEnabled = false;
            rd_write_text( "Allocation Logging is complete and DISABLED!\r\n\r\n");
         }
         // advance the read index if necessary.
         if ( currentMemLogReadIndex == currentMemLogWriteIndex )
         {
            currentMemLogReadIndex++;
            if ( currentMemLogReadIndex == MEMLOG_SIZE/6 )
            {
               currentMemLogReadIndex = 0;
            }
         }

         ((MemLogEntryWide_t*)memLog)[currentIndex].caller = (void *)(temp[0] - 4);
         ((MemLogEntryWide_t*)memLog)[currentIndex].prev_caller = (void *)*addr;
         NU_Local_Control_Interrupts(savedInterruptState);
         ((MemLogEntryWide_t*)memLog)[currentIndex].taskid = (void *)TCD_Current_Thread;
         ((MemLogEntryWide_t*)memLog)[currentIndex].size = size;
         ((MemLogEntryWide_t*)memLog)[currentIndex].flags = 1;    //allocated
         ((MemLogEntryWide_t*)memLog)[currentIndex].timestamp = *(volatile uint32_t *)0xfffbc410;   // for arm9

         //Standard operator
         if (size == 0) size = 1;

         void * ptr;
         ptr = malloc(size);

         ((MemLogEntryWide_t*)memLog)[currentIndex].addr = ptr;

         return ptr;
      }
   }
   else
   {
       //Standard operator
       if (size == 0) size = 1;

       void * ptr;
       ptr = malloc(size);

       return ptr;
   }
}
//#pragma CODE_SECTION ("section_ramset1_0")
void operator delete(void *ptr)
{
   uint32_t savedInterruptState;
   uint32_t currentIndex;

   asm(" STR R14, [R13, #0xC]");  //save stack address temp[0]
   asm(" STR R13, [R13, #0x10]");  //save pc return address temp[1]

   if ( loggingEnabled )
   {
      savedInterruptState = NU_Local_Control_Interrupts(NU_DISABLE_INTERRUPTS);

      // Note that this code is FRAGILE.  It peeks backwards on the stack to find the return
      // address of the caller.  The location of the return address on the stack can be easily changed
      // as a result of other changes in this function (i.e. adding local variables, etc).
      // The offsets may need to be adjusted if this function is touched.
      volatile unsigned int temp[2];

      unsigned int *addr = (unsigned int *)temp[0] - 1;
      unsigned int count = 1 + (0x20/4);   //current stack space ***

      //Scan for previous store
      while ((*addr & 0xFFFF0000) != 0xE92D0000)
      {
         if ((*addr & 0xFFFFF000) == 0xE24DD000)
         {
            //add offset in words
            count += ((*addr & 0xFFF) >> 2);
         }
         addr--;
      }

      count += MEM_bitLookupTable[*addr & 0xF];
      count += MEM_bitLookupTable[(*addr >>4) & 0xF];
      count += MEM_bitLookupTable[(*addr >> 8) & 0xF];
      count += MEM_bitLookupTable[(*addr >> 12) & 0xF];

      addr = (unsigned int *)temp[1] + count;
      // FRAGILE CODE ENDS HERE

      currentIndex = currentMemLogWriteIndex;
      currentMemLogWriteIndex++;

      if ( memLogNarrow )
      {
         if ( currentMemLogWriteIndex >= MEMLOG_SIZE/2 )
         {
            loggingEnabled = false;
            rd_write_text( "Allocation Logging is complete and DISABLED!\r\n\r\n");
         }
         // advance the read index if necessary.
         if ( currentMemLogReadIndex == currentMemLogWriteIndex )
         {
            currentMemLogReadIndex++;
            if ( currentMemLogReadIndex == MEMLOG_SIZE/2 )
            {
               currentMemLogReadIndex = 0;
            }
         }

         NU_Local_Control_Interrupts(savedInterruptState);

         // finish logging the fields.  These are thread safe, so they don't need to be inside the protected section.
         ((MemLogEntryNarrow_t*)memLog)[currentIndex].addr = ptr;
         ((MemLogEntryNarrow_t*)memLog)[currentIndex].size = 0;
         ((MemLogEntryNarrow_t*)memLog)[currentIndex].flags = 2;    //unallocated
      }
      else
      {
         ((MemLogEntryWide_t*)memLog)[currentIndex].caller = (void *)(temp[0] - 4);
         ((MemLogEntryWide_t*)memLog)[currentIndex].prev_caller = (void *)*addr;

         if ( currentMemLogWriteIndex >= MEMLOG_SIZE/6 )
         {
            loggingEnabled = false;
            rd_write_text( "Allocation Logging is complete and DISABLED!\r\n\r\n");
         }
         // advance the read index if necessary.
         if ( currentMemLogReadIndex == currentMemLogWriteIndex )
         {
            currentMemLogReadIndex++;
            if ( currentMemLogReadIndex == MEMLOG_SIZE/6 )
            {
               currentMemLogReadIndex = 0;
            }
         }
         NU_Local_Control_Interrupts(savedInterruptState);

         // finish logging the fields.  These are thread safe, so they don't need to be inside the protected section.
         ((MemLogEntryWide_t*)memLog)[currentIndex].addr = ptr;
         ((MemLogEntryWide_t*)memLog)[currentIndex].size = 0;
         ((MemLogEntryWide_t*)memLog)[currentIndex].flags = 2;    //unallocated
         ((MemLogEntryWide_t*)memLog)[currentIndex].taskid = (void *)TCD_Current_Thread;
         ((MemLogEntryWide_t*)memLog)[currentIndex].timestamp = *(volatile uint32_t *)0xfffbc410;   // for arm9
      }

      //Standard operator
      if (ptr != NULL) {
         free(ptr);
      }
   }
   else
   {
      //Standard operator
      if (ptr != NULL) {
        free(ptr);
      }
   }
}
+1  A: 

Overloading new and delete should work if you pay close attention.

Maybe you can show us what isn't working with that approach?

John Weldon
This approach assumes that all allocations are done with operators new and delete. It won't catch e.g. memory allocated with malloc.
laalto
+1  A: 

I'm not an embedded-environment expert, so the only advice I can give is to test as much code as you can on your development machine using your favorite free or proprietary tools. Tools for your particular embedded platform may also exist, and you can use them for final testing, but the most powerful tools are for desktops.

In a desktop environment I like the job DevPartner Studio does (Windows, proprietary). There are free tools available for Linux, but I don't have much experience with them. As an example, there's Electric Fence (EFence).


Oleg Zhylin
Valgrind is great!
stepancheg
+3  A: 

One way is to store the file name and line number (via pointers) of the module allocating memory in the allocated block of data. The file and line number come from the standard __FILE__ and __LINE__ macros. When the memory is deallocated, that information is removed.

One of our systems has this feature and we call it a "memory hog report". So anytime from our CLI we can print out all the allocated memory along with a big list of information of who has allocated memory. This list is sorted by which code module has the most memory allocated. Many times we'll monitor memory usage this way over time, and eventually the memory hog (leak) will bubble up to the top of the list.
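
A minimal sketch of this idea (the names AllocHeader, dbg_alloc and mem_report are illustrative, not part of the system described above, and real code would also need locking around the list):

#include <cstdio>
#include <cstdlib>

// Each allocation gets a small header recording where it came from; live
// headers sit on an intrusive doubly linked list so a "memory hog" style
// report can walk them at any time.
struct AllocHeader {
    const char*  file;   // __FILE__ of the allocating call site
    int          line;   // __LINE__ of the allocating call site
    std::size_t  size;   // bytes requested by the caller
    AllocHeader* next;
    AllocHeader* prev;
};

static AllocHeader* g_live = 0;   // head of the live-allocation list

void* dbg_alloc(std::size_t size, const char* file, int line)
{
    AllocHeader* h = static_cast<AllocHeader*>(std::malloc(sizeof(AllocHeader) + size));
    if (!h) return 0;
    h->file = file; h->line = line; h->size = size;
    h->prev = 0; h->next = g_live;
    if (g_live) g_live->prev = h;
    g_live = h;
    return h + 1;                  // user data follows the header
}

void dbg_free(void* p)
{
    if (!p) return;
    AllocHeader* h = static_cast<AllocHeader*>(p) - 1;
    if (h->prev) h->prev->next = h->next; else g_live = h->next;
    if (h->next) h->next->prev = h->prev;
    std::free(h);
}

// CLI-style "memory hog report": dump every block that is still alive.
void mem_report()
{
    for (AllocHeader* h = g_live; h; h = h->next)
        std::printf("%u bytes still allocated from %s:%d\n",
                    static_cast<unsigned>(h->size), h->file, h->line);
}

#define DBG_ALLOC(sz) dbg_alloc((sz), __FILE__, __LINE__)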

DoxaLogos
I don't think I fully understand. Do you modify every object to have a string pointer to the allocation location? If so, how do you instrument third party objects which you can't modify? If not, what am I missing?
Alex B
I don't know how your memory management is designed, so I'll talk about ours. We don't modify the objects at all; it's all handled in the call to new and in the memory allocation code. We have a home-grown block manager for handling the heap, and inside each block we reserve bytes for the filename (pointer) and line number. We modify the new function so that a macro (in debug builds) can pass this information into the memory blocks. On another system we also have a function that walks up the stack to find the caller.
DoxaLogos
Actually, on second thought, it may be easier to look up the call stack to determine who called the new and store that in the memory block. You can use the address to determine the code location.
DoxaLogos
A: 

You can use a third-party tool to do this.

You can detect leaks within your own class structures by adding memory counters to your new and delete calls to increment/decrement the counters, and printing out a report when your application closes. However, this won't detect leaks for memory allocated outside your class system - a third-party tool can do that, though.
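
A hedged sketch of that counter idea (it only sees allocations that go through operator new/delete, and real code would also overload the array forms and add locking):

#include <cstdio>
#include <cstdlib>
#include <new>

static long g_liveBlocks = 0;   // allocations minus deallocations

void* operator new(std::size_t size)
{
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    ++g_liveBlocks;
    return p;
}

void operator delete(void* p) throw()
{
    if (!p) return;
    --g_liveBlocks;
    std::free(p);
}

static void report_leaks()
{
    // A non-zero count at exit suggests a leak somewhere in the new/delete traffic.
    std::printf("blocks still allocated at exit: %ld\n", g_liveBlocks);
}

// Register the report as early as possible so it runs when the application closes.
static struct ReportInstaller {
    ReportInstaller() { std::atexit(report_leaks); }
} g_reportInstaller;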

Larry Watanabe
A: 

Can you describe what is "not working" with your log methods?

Do you not get the expected logs? Or do they show that everything is fine while you still have leaks?

How have you confirmed that this is definitely a leak and not some other type of corruption?

One way to check that your overloading is correct: instantiate a counter object per class, increment it in the class's new and decrement it in its delete. If you have a growing count, you have a leak. You would then expect your log lines to coincide with the increment and decrement points.
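
A sketch of the per-class counter (the InstanceCounted name is illustrative; check the count at a point where you expect it to be stable):

#include <cstdio>

template <typename T>
class InstanceCounted {
public:
    static long live() { return s_live; }
protected:
    InstanceCounted()                       { ++s_live; }
    InstanceCounted(const InstanceCounted&) { ++s_live; }
    ~InstanceCounted()                      { --s_live; }
private:
    static long s_live;
};

template <typename T>
long InstanceCounted<T>::s_live = 0;

// Any class you suspect of leaking just inherits from the counter.
class Packet : public InstanceCounted<Packet> { /* ... */ };

int main()
{
    Packet* p = new Packet;
    std::printf("live Packets: %ld\n", Packet::live());   // 1
    delete p;
    std::printf("live Packets: %ld\n", Packet::live());   // 0 - a growing number here means a leak
    return 0;
}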

nik
+1  A: 

If you add logging to the constructors and destructors of your classes, you can print to the screen or a log file. Doing this will tell you when things are being created, what is being created, and the same information for deletion.

For easy browsing, you can add a temporary global variable, "INSTANCE_ID", and print (and increment) it on every constructor/destructor call. Then you can browse by ID, which should make things a little easier.
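
For example (a tiny sketch; the Widget class and the exact log text are made up):

#include <cstdio>

static unsigned INSTANCE_ID = 0;     // incremented on every construction

class Widget {
public:
    Widget() : m_id(++INSTANCE_ID) { std::printf("Widget %u created\n",   m_id); }
    ~Widget()                      { std::printf("Widget %u destroyed\n", m_id); }
private:
    unsigned m_id;
};

int main()
{
    Widget a;                        // "Widget 1 created"
    Widget* leaked = new Widget;     // "Widget 2 created" - never deleted
    (void)leaked;
    return 0;
}                                    // only "Widget 1 destroyed" is printed, so ID 2 leaked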

Mike Trpcic
A: 

Not specifically for embedded development, but we used to use BoundsChecker for that.

fretje
Our experience with BoundsChecker has been nothing but good. Whenever we're stuck finding a leak, we turn to BoundsChecker, and more often than not, we find the leak within a couple of hours to about a day or two. The only downside is all of the false positives that you have to wade through to get to the real problem.
RobH
+4  A: 

http://www.linuxjournal.com/article/6059

Actually, in my experience it is always better to create memory pools for embedded systems and use a custom allocator/deallocator; leaks are then easy to identify. For example, we had a simple custom memory manager for VxWorks that stored the task ID and a timestamp in each allocated memory block.
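
A hedged sketch of that kind of pool (the block size, pool size and the get_task_id/get_timestamp placeholders are made up; on VxWorks they would come from calls such as taskIdSelf and tickGet, and real code would need locking):

#include <stdio.h>
#include <stdint.h>

static uint32_t get_task_id()   { return 0; }   // placeholder: RTOS task/thread id
static uint32_t get_timestamp() { return 0; }   // placeholder: RTOS tick or HW timer

struct PoolBlock {
    uint32_t      taskId;       // owner recorded at allocation time
    uint32_t      timestamp;    // when it was allocated
    bool          inUse;
    unsigned char data[64];     // fixed payload size for this pool
};

static const unsigned kPoolBlocks = 128;
static PoolBlock g_pool[kPoolBlocks];

void* pool_alloc()
{
    for (unsigned i = 0; i < kPoolBlocks; ++i) {
        if (!g_pool[i].inUse) {
            g_pool[i].inUse     = true;
            g_pool[i].taskId    = get_task_id();
            g_pool[i].timestamp = get_timestamp();
            return g_pool[i].data;
        }
    }
    return 0;   // pool exhausted
}

void pool_free(void* p)
{
    for (unsigned i = 0; i < kPoolBlocks; ++i)
        if (g_pool[i].data == p) { g_pool[i].inUse = false; return; }
}

// Leak report: anything still in use, with owner and age information attached.
void pool_report()
{
    for (unsigned i = 0; i < kPoolBlocks; ++i)
        if (g_pool[i].inUse)
            printf("block %p held by task %lu since tick %lu\n",
                   (void*)g_pool[i].data,
                   (unsigned long)g_pool[i].taskId,
                   (unsigned long)g_pool[i].timestamp);
}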

Warrior
A: 

Use smart pointers and never think about it again; there are loads of established types around, but it's pretty easy to roll your own too:

#include <objbase.h>    // CoTaskMemRealloc / CoTaskMemFree
#include <cstring>      // memcpy
#include <stdexcept>
#include <string>

class SmartCoMem
{
public:
    SmartCoMem() : m_ptr64( 0 ), m_size( 0 ) {
    }

    ~SmartCoMem() {
        if( m_size )
            CoTaskMemFree( (LPVOID)m_ptr64 );
    }

    void copyin( LPCTSTR in, const unsigned short size )
    {
        LPVOID ptr = CoTaskMemRealloc( (LPVOID)m_ptr64, size );
        if( ptr == NULL )
            throw std::runtime_error( "SmartCoMem: CoTaskMemRealloc failed" );

        m_size = size;
        m_ptr64 = (__int64)ptr;
        memcpy( (LPVOID)m_ptr64, in, size );
    }

    std::string copyout( ) {
        std::string out( (LPCSTR)m_ptr64, m_size );
        return out;
    }

    __int64* ptr() {
        return &m_ptr64;
    }

    unsigned short size() {
        return m_size;
    }

    unsigned short* sizePtr() {
        return &m_size;
    }

    bool loaded() {
        return m_size > 0;
    }

private:
    // don't allow copying, as this is a wrapper around raw memory
    SmartCoMem( const SmartCoMem & );
    SmartCoMem & operator = ( const SmartCoMem & );

    __int64 m_ptr64;
    unsigned short m_size;
};

There's no encapsulation in this example due to the API I was working with, but it's still better than working with completely raw pointers.

Patrick
+1  A: 

The way we did it with our C 3D toolkit was to create custom new/malloc and delete macros that logged each allocation and deallocation to a file. We had to ensure that all the code called our macros, of course. Writing to the log file was controlled by a run-time flag and only happened in debug builds, so we didn't have to recompile.

Once the run was complete, a post-processor ran over the file, matching allocations to deallocations and reporting any unmatched allocations.

It had a performance hit, but we only needed to do it once in a while.
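
A sketch of what such a post-processor can look like, assuming each log line is either "A <address> <file> <line>" for an allocation or "F <address>" for a free (the format is made up; adapt the parsing to whatever your macros actually write):

#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

int main(int argc, char** argv)
{
    if (argc < 2) { std::cerr << "usage: leakcheck <logfile>\n"; return 1; }

    std::ifstream log(argv[1]);
    std::map<std::string, std::string> live;   // address -> allocation site

    std::string line;
    while (std::getline(log, line)) {
        std::istringstream in(line);
        std::string op, addr;
        in >> op >> addr;
        if (op == "A") {
            std::string site;
            std::getline(in, site);            // rest of the line: file and line number
            live[addr] = site;
        } else if (op == "F") {
            live.erase(addr);                  // matched allocation, forget it
        }
    }

    // Whatever is left was allocated but never freed.
    for (std::map<std::string, std::string>::const_iterator it = live.begin();
         it != live.end(); ++it)
        std::cout << "unmatched allocation at " << it->first << " from" << it->second << '\n';

    return 0;
}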

ChrisF
++ for simple and effective.
Mike Dunlavey
+6  A: 

There are several forms of operator new:

 void *operator new (size_t);
 void *operator new [] (size_t);
 void *operator new (size_t, void *);
 void *operator new [] (size_t, void *);
 void *operator new (size_t, /* parameters of your choosing! */);
 void *operator new [] (size_t, /* parameters of your choosing! */);

All of the above can exist at both global and class scope. For each operator new there is an equivalent operator delete. You need to make sure you add logging to all versions of the operator if that is the way you want to do it.

Ideally, you would want the system to behave the same regardless of whether the memory logging is present or not. For example, the MS VC runtime library allocates more memory in debug than in release because it prefixes each allocation with a bigger information block and adds guard blocks to the start and end of the allocation. The best solution is to keep all the memory logging information in a separate chunk of memory and use a map to track the allocations. This can also be used to verify that the memory passed to delete is valid.

new
  allocate memory
  add entry to logging table

delete
  check address exists in logging table
  free memory
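
A hedged sketch of that separate-table idea (a fixed-size array rather than a std:: container, since a container would itself call operator new; the capacity, names and lack of locking are all simplifications):

#include <stdio.h>
#include <stdlib.h>
#include <new>

struct Record { void* addr; size_t size; };

static Record g_table[4096];    // capacity is an arbitrary choice
static size_t g_used = 0;

void* operator new(size_t size)
{
    void* p = malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    if (g_used < sizeof g_table / sizeof g_table[0]) {   // silently stops logging when full
        g_table[g_used].addr = p;
        g_table[g_used].size = size;
        ++g_used;
    }
    return p;
}

void operator delete(void* p) throw()
{
    if (!p) return;
    for (size_t i = 0; i < g_used; ++i) {
        if (g_table[i].addr == p) {
            g_table[i] = g_table[--g_used];   // swap-remove the entry
            free(p);
            return;
        }
    }
    // Not in the table: either logging overflowed or this is a bad/double delete.
    printf("delete of untracked pointer %p\n", p);
    free(p);
}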

However, you're writing embedded software, where memory is usually a limited resource. On such systems it is usually preferable to avoid dynamic memory allocation for several reasons:

  1. You know how much memory there is so you know in advance how many objects you can allocate. Allocation should never return null as that is usually terminal with no easy way of getting back to a healthy system.
  2. Allocating and freeing memory leads to fragmentation. The number of objects you can allocate will decrease over time. You could write a memory compactor to move allocated objects around to free up bigger chunks of memory but that will affect performance. As in point 1, once you get a null, things get tricky.

So, when doing embedded work, you usually know up front how much memory can be allocated to various objects and, knowing this, you can write more efficient memory managers for each object type that take appropriate action when memory runs out - discarding old items, crashing, etc.

Skizz

EDIT

If you want to know what called the memory allocation, the best thing to do is use a macro (I know, macros are generally bad):

#define NEW new (__FILE__, __LINE__, __FUNCTION__)

and define an operator new:

void *operator new (size_t size, const char *file, int line, const char *function)
{
   // log the allocation somewhere, no need to strcpy file or function, just save the 
   // pointer values
   return malloc (size);
}

and use it like this:

SomeObject *obj = NEW SomeObject (parameters);

Your compiler might not have the __FUNCTION__ preprocessor definition, so you can safely omit it if necessary.
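
One related caveat (this is standard C++ behaviour, not something from the snippet above): a placement operator new like this should have a matching placement operator delete, which the compiler only calls if the constructor throws; a plain "delete obj;" still goes through the ordinary operator delete, so that one has to free (and un-log) as well:

// Matching placement delete: invoked automatically only if the constructor
// throws inside "NEW SomeObject (parameters)".
void operator delete (void *ptr, const char *file, int line, const char *function)
{
   // remove the corresponding log entry here, then release the memory
   free (ptr);
}

// Ordinary delete still handles "delete obj;", so it must free (and un-log) too.
void operator delete (void *ptr)
{
   free (ptr);
}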

Skizz
+8  A: 

If you're running Linux, I suggest trying Valgrind.

Michael
+1 - Valgrind is the business. One of its nicest features is that the output (at least in emacs) looks like standard compiler output. And of course it does all that other stuff, like "massif", which shows you how you're using memory. You may not have leaks, but if you have a memory-hungry application, massif lets you know where you're using the most memory, potentially showing you other kinds of problems - like not freeing memory immediately when you're finished with it.
Richard Corden
+1  A: 

Is there a real need to roll your own memory leak detection?

Assuming you can't use dynamic memory checkers, like the open-source Valgrind tool on Linux, static analysis tools such as the commercial products Coverity Prevent and Klocwork Insight may be of use. I've used all three and have had very good results with all of them.

Void
The static tools have significant limitations compared to the dynamic ones when it comes to finding leaks. Among them is properly handling heap-allocated resources: once your resource gets assigned to a global or an instance field, it is very difficult to reason about what will happen. None of the commercial tools have a good solution for that; they either report them all as flaws, giving a high false-positive rate, or ignore them all, cutting out a huge proportion of the potential bugs to be found.
Michael Donohue
I wasn't suggesting using static analysis tools alone. They are certainly useful complements to dynamic tools, particularly if they can be incorporated into automated or periodic manual builds.
Void
A: 

For testing like this, try compiling your embedded code natively for Linux (or whatever OS you use) and use a well-established tool like Valgrind to test for memory leaks. It can be a challenge to do this, but you just need to replace any code that directly accesses hardware with code that simulates something suitable for your testing.

I've found that using SWIG to convert my embedded code into a Linux-native library and running the code from a Python script is really effective. You can use all of the tools that are available for non-embedded projects and test all of your code except the hardware drivers.

Neil
+1  A: 

Lots of good answers.

I would just point out that if the program is one that, like a small command-line utility, runs for a short period of time and then releases all its memory back to the OS, memory leaks probably do no harm.

Mike Dunlavey