I have a target platform that reports when memory is read or written, as well as when locks (a mutex, for example) are taken or freed. Each report carries the program counter, the data address, and a read/write flag. I am writing a program that uses this information on a separate host machine, where the reports are received, so that it does not interfere with the target. The target already reports this data, so I am not changing the target code at all.

Are there any references or already available algorithms that do this kind of detection? For example, some way of detecting race conditions when multiple threads try to write to a global variable without protecting it first.

I am currently brewing my own, but I am convinced there is code out there that already does this, or at least a proven algorithm for going about it.

Note: This is not about detecting memory leaks.

Note: The implementation language is C++.

I am trying to keep the detection code platform agnostic, so I am using only standard C++ and the STL, plus libraries such as Boost, POCO, and Loki.

Any leads will help.

Thanks.

+1  A: 

There is no standard way, since the C and C++ standards do not deal with OS-specific concepts like memory protection. Have a look at Breakpad, the crash reporting library used by Mozilla on various platforms such as OS X, Win32, and Linux.

Remus Rusanu
Thanks. I cannot add any code to what is running on the target.
MeThinks
+4  A: 

It is probably too late to talk you out of this, but this does not work. Threading races are caused by subtle timing issues between threads, and you can never diagnose timing-related problems with logging. It is a Heisenberg effect: just logging alters the timing of a thread, especially the kind of logging you are contemplating. Infamously, plenty of software has shipped with logging left turned on because it would nosedive with logging turned off.

Flushing out threading bugs is hard. The kind of tool that works is one that intentionally injects random delays into the code. Microsoft CHESS is an example; it works on native code too.

Hans Passant
Thanks. Definitely a helpful and insightful response. The code running on the target is already instrumented to report the information I mentioned in the question, so I am just trying to process events the target already reports. There is nothing to turn on or off in this case. I will take your advice when experimenting with whatever solution I adopt. Additionally, it is not a timing issue I am investigating. It is more a matter of access control.
MeThinks
A: 

Check out this article by Andrei Alexandrescu: http://www.drdobbs.com/184403766;jsessionid=LKUUBKFR00O0VQE1GHRSKH4ATMY32JVN

It advocates using the volatile keyword on your data that is accessed by more than one thread. If you cast away that volatility with your locking mechanism, you will know via compiler error where you need to lock that data.
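A minimal sketch of that idea, not the article's exact code: the article uses its own mutex class, while this version assumes C++11 `std::mutex`, and the names `SharedLog` and `append_and_count` are hypothetical. Here `volatile` acts only as a compile-time tag. It does not synchronize anything by itself. The underlying object is defined non-volatile and exposed through a volatile reference, so casting the qualifier away under the lock stays well-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Holds the mutex for its whole lifetime and hands out a non-volatile view.
template <typename T>
class LockingPtr {
public:
    LockingPtr(volatile T& obj, std::mutex& mtx)
        : lock_(mtx), ptr_(const_cast<T*>(&obj)) {}
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }
private:
    std::lock_guard<std::mutex> lock_;  // locked before ptr_ is set up
    T* ptr_;
};

// Hypothetical shared state. Callers are expected to go through `values`.
struct SharedLog {
    std::mutex mtx;
    std::vector<int> storage;                     // the real, non-volatile object
    volatile std::vector<int>& values = storage;  // volatile-tagged view of it
};

std::size_t append_and_count(SharedLog& log, int v) {
    // log.values.push_back(v);  // does not compile: push_back is not a volatile member
    LockingPtr<std::vector<int>> p(log.values, log.mtx);
    p->push_back(v);
    assert(!p->empty());  // sanity check: the element was just appended
    return p->size();
}
```

The commented-out line is the compiler error the answer refers to: any access that skips `LockingPtr` fails to build, which points at the spot that needs a lock.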

I have used this method and found it extremely helpful.

Hope that helps.

Thanks, I will keep this article as a reference. I am not trying to fix the code running on the target, just trying to detect whether there are any access violations at runtime. The advice in this article would certainly be put to good use once a problem has been detected. If I were to write the embedded target code myself, this article and related references would certainly come in handy.
MeThinks
+2  A: 

To address only part of your question, race conditions are extremely nasty precisely because there is no good way to test for them. By definition they're unpredictable sequences of events that are quite difficult to diagnose. Detection code depends on the fact that the race condition is actually happening, and in that case it's likely that you'll see errant behavior anyway. Any test code you add may make them more or less likely to appear, or possibly even change the timing such that they never appear at all.

Instead of trying to detect race conditions, what about attempting program design that helps make you more resilient to having them in the first place?

For example, if your global variable were simply encapsulated in an object that knows all the proper protection that needs to happen on access, then it's impossible for threads to concurrently write to it, because such an interface doesn't exist. Programmatically preventing race conditions is going to be easier than trying to detect them algorithmically (chances are you'll still catch some during unit/subsystem testing).
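A minimal sketch of that kind of encapsulation, assuming C++11 `std::mutex`. The class name `GuardedCounter` and its member functions are hypothetical and only illustrate the shape of the interface.

```cpp
#include <mutex>

// The raw value is private, so every access goes through a member function
// that takes the mutex first. There is no unprotected way to touch it.
class GuardedCounter {
public:
    int get() const {
        std::lock_guard<std::mutex> lock(mtx_);
        return value_;
    }

    void set(int v) {
        std::lock_guard<std::mutex> lock(mtx_);
        value_ = v;
    }

    // Read-modify-write done under a single lock acquisition,
    // so two threads cannot interleave between the read and the write.
    int add(int delta) {
        std::lock_guard<std::mutex> lock(mtx_);
        value_ += delta;
        return value_;
    }

private:
    mutable std::mutex mtx_;  // mutable so get() can lock in a const method
    int value_ = 0;
};
```

Note that calling `get()` followed by `set()` from outside is still two separate critical sections, which is why a combined operation such as `add()` belongs inside the class.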

Mark B
+1 for the encapsulation recommendation. At a minimum, writing simple "getter" and "setter" wrapper functions (that coordinate access using a mutex or similar) around a global variable works wonders. Limit the scope of the "raw" variable such that only the interface code which you *know* uses proper access control can touch it directly.
bta
Yes. Definitely sound advice that should be taken seriously. However, the code is already out there on the target and already reports data accesses, so I am just trying to work with the reported data accesses.
MeThinks
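
For the event stream described in the question (program counter, data address, read/write flag, lock taken/freed), one known starting point is the lockset idea used by the Eraser race detector: for each data address, keep the set of locks that were held on every access so far, and flag the address when that set becomes empty. The sketch below is a simplified version of that idea, not Eraser itself and not anything proposed in this thread. It assumes each report also carries a thread id and that a lock is identified by its address, neither of which the question states. `Event`, `Kind`, and `LocksetChecker` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <map>
#include <set>
#include <vector>

enum class Kind { Acquire, Release, Read, Write };

struct Event {
    Kind kind;
    int thread;          // assumption: the report identifies the thread
    std::uint64_t addr;  // data address, or lock address for Acquire/Release
    std::uint64_t pc;    // program counter of the access
};

class LocksetChecker {
public:
    // Returns true when this event leaves a shared, written address
    // with no lock that was held on every access to it.
    bool feed(const Event& e) {
        std::set<std::uint64_t>& held = held_[e.thread];
        if (e.kind == Kind::Acquire) { held.insert(e.addr); return false; }
        if (e.kind == Kind::Release) { held.erase(e.addr); return false; }

        VarState& v = vars_[e.addr];
        if (!v.seen) {
            v.candidates = held;  // first access: start from the locks held now
            v.seen = true;
        } else {
            std::set<std::uint64_t> kept;  // keep only locks held on every access
            std::set_intersection(v.candidates.begin(), v.candidates.end(),
                                  held.begin(), held.end(),
                                  std::inserter(kept, kept.begin()));
            v.candidates.swap(kept);
        }
        v.threads.insert(e.thread);
        if (e.kind == Kind::Write) v.written = true;

        if (v.candidates.empty() && v.threads.size() > 1 && v.written && !v.reported) {
            v.reported = true;  // report each address once
            reports_.push_back(e);
            return true;
        }
        return false;
    }

    const std::vector<Event>& reports() const { return reports_; }

private:
    struct VarState {
        std::set<std::uint64_t> candidates;
        std::set<int> threads;
        bool seen = false, written = false, reported = false;
    };
    std::map<int, std::set<std::uint64_t>> held_;  // thread id -> locks held
    std::map<std::uint64_t, VarState> vars_;       // data address -> state
    std::vector<Event> reports_;
};
```

Because the check looks at which locks protect an address rather than at when the accesses happened, it can flag an unprotected write even if the two threads never actually collided in the recorded run. It will also produce false positives for data that is protected by something other than a lock, such as initialization before the threads start.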