Try using valgrind; I haven't tried valgrind on OS X, and I don't understand your problem, but "try valgrind" is the first thing I think of when you say "clobbered".
gdb often acts weird with multithreaded programs. Another solution (if you can afford it) would be to put printf() calls all over the place to try and catch the moment where your value gets clobbered. Not very elegant, but sometimes effective.
static variables and multi-threading generally do not mix.
Without seeing your code (you should include your threaded code), my guess is that you have two threads concurrently writing to the addr variable. That doesn't work.
You either need to:
- create separate instances of addr for each thread; or
- provide some sort of synchronisation around addr to stop two threads changing the value at the same time.
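As a rough sketch of the second option, assuming pthreads and assuming addr is only ever touched through two accessor functions (set_addr and get_addr are made-up names, not anything from your code), a mutex around addr might look like this:

```c
#include <pthread.h>
#include <stddef.h>

static int *addr;  /* the shared pointer from the question */
static pthread_mutex_t addr_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: every write to addr goes through here. */
void set_addr(int *new_value)
{
    pthread_mutex_lock(&addr_lock);
    addr = new_value;
    pthread_mutex_unlock(&addr_lock);
}

/* Hypothetical helper: every read of addr goes through here. */
int *get_addr(void)
{
    pthread_mutex_lock(&addr_lock);
    int *value = addr;
    pthread_mutex_unlock(&addr_lock);
    return value;
}
```

This only protects the pointer itself, not whatever it points to. For the first option, declaring addr thread-local (C11 _Thread_local, where your toolchain supports it) would give each thread its own copy instead.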
One thing you could try would be to create a separate thread whose only purpose is to watch the value of addr, and to break when it changes. For example:
    #include <pthread.h>

    static int * volatile addr; // volatile here is important, and must be after the *

    void *addr_thread_proc(void *arg)
    {
        (void)arg; // unused
        while (1)
        {
            int *old_value = addr;
            while (addr == old_value) /* spin */;
            __asm__("int3"); // x86 only: break into the debugger, or raise SIGTRAP if no debugger
        }
    }
    ...
    pthread_t spin_thread;
    pthread_create(&spin_thread, NULL, &addr_thread_proc, NULL);
Then, whenever the value of addr changes, the int3 instruction will run, which will break into the debugger, stopping all threads.
- You could put an array of unsigned ints between some_values and addr to determine whether you are overrunning some_values or whether the corruption affects more addresses than you first thought. I would initialize the padding to 0xDEADBEEF or some other obvious pattern that is easy to distinguish and unlikely to occur in the program. If a value in the padding changes, reinterpret its bits as a float and see if the number makes sense as a float. (The compiler is not required to place separate objects in declaration order, so it is worth printing their addresses to confirm the padding really sits between them.)
    static float some_values[SIZE];
    static unsigned int padding[1024];
    static int * addr;
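A minimal sketch of how the padding could be filled and checked, assuming float and unsigned int are both 4 bytes on your platform. The names padding_init, padding_check, PADDING_WORDS and PADDING_PATTERN are made up for illustration, and the array here is a stand-in for the padding declared above:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PADDING_WORDS 1024
#define PADDING_PATTERN 0xDEADBEEFu

/* Stand-in for the padding array placed between some_values and addr. */
unsigned int padding[PADDING_WORDS];

/* Fill the padding with the marker pattern; call once at startup. */
void padding_init(void)
{
    for (size_t i = 0; i < PADDING_WORDS; i++)
        padding[i] = PADDING_PATTERN;
}

/* Print every padding word that no longer holds the pattern, both as hex
 * and with its bits reinterpreted as a float. Returns the number of
 * changed words. */
size_t padding_check(void)
{
    size_t changed = 0;
    for (size_t i = 0; i < PADDING_WORDS; i++) {
        if (padding[i] != PADDING_PATTERN) {
            float as_float;
            /* Copy the bits rather than converting the integer value. */
            memcpy(&as_float, &padding[i], sizeof as_float);
            fprintf(stderr, "padding[%zu] = 0x%08x (as float: %g)\n",
                    i, padding[i], as_float);
            changed++;
        }
    }
    return changed;
}
```

Calling padding_check() periodically (or from the debugger) should show how far past some_values the writes reach, and whether the stray values look like the floats your code normally stores.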
- Run the program multiple times. In each run, disable a different thread and see when the problem goes away.
- Set the program's process affinity to a single core and then try the watchpoint. You may have better luck if you don't have two threads simultaneously modifying the value. NOTE: This does not rule out the race itself. It may only make it easier to catch in a debugger.
I have not done any debugging on OS X, but I have seen the same behavior in GDB on Linux: the program crashes, yet GDB can read and write the memory which the program just tried to read/write unsuccessfully.
This doesn't necessarily mean GDB is confused; rather, the kernel allowed GDB to read/write memory via ptrace() which the inferior process is not allowed to read or write. In other words, it was a (recently fixed) kernel bug.
Still, it sounds like GDB watchpoints aren't working for you for whatever reason.
One technique you could use is to mmap space for some_values rather than statically allocating space for them, arrange for the array to end on a page boundary, and arrange for the next page to be non-accessible (via mprotect).
If any code tries to access past the end of some_values, it will get an exception (effectively you are setting a non-writable "watch point" just past some_values).