My application crashes after running for around 18 hours. I am not able to find the point in the code where it actually crashes. I checked the call stack, but it does not provide any useful information. The last few calls in the call stack are greyed out, meaning I cannot see the code for that part; they all belong to MFC libraries.

However, when it crashes I get a 'Microsoft Visual Studio' pop-up which says:

Unhandled exception at 0x7c809e8a in NIMCAsst.exe: 0xC0000005: Access violation reading location 0x154c6000.

Could the above information be useful for understanding where it is crashing? Is there any software that could tell me which variable in the code holds a particular memory address?

+3  A: 

If you can't catch the exception, sometimes you just have to go through your code line by line. It's very unpleasant, but I'd put money on the bug being in your code and not in MFC (it always is with my bugs). Check extra carefully how you're using memory and what you're passing into the MFC functions.

Patrick
+2  A: 

The crash is probably caused by a buffer overflow or another type of memory corruption. This has overwritten the part of the stack holding the return address, which has made the debugger unable to reconstruct the stack trace correctly. Alternatively, you do not have correct symbols for the code that caused the crash (if the stack trace shows a module name, this would be the case).

My first step would be to examine the code calling the code that crashed for possible issues that might have caused it. Do you get any other exceptions or error conditions before the crash? Maybe you are ignoring an error return? Did you try using the Debug Heap? What about ADPlus? Or Application Verifier to turn on heap checks?

Other possibilities include running a tool like PC-lint over the code to check for obvious memory-use issues. Are you using threads? Maybe there is a race condition. The list could go on forever really.

1800 INFORMATION
An error may be occurring, but the time of the crash is not consistent: it can crash any time after running for 6 hours. So I cannot put a watch on the errors returned for such a long time.
Rakesh Agarwal
Yeah, it's a multithreaded app.
Rakesh Agarwal
+1  A: 

The above information only tells you which memory was accessed illegally.

You can use exception handling to narrow down the place where the problem occurs, but then you need at least an idea of which corner to look in.

You say that you're seeing the call stack, which suggests you're using a debugger. The source code of MFC is available (but perhaps not with all VC++ editions), so in principle one can trace through it. Which VC++ version are you using?

The fact that the bug takes so long to occur suggests that it is memory corruption: some other function writes to a location that it doesn't own. This goes unnoticed for a long time, but eventually the function alters a pointer that MFC needs, and after a while MFC accesses the pointer and you are notified.

Sometimes the 'location' can be recognized as data, in which case you have a hint. For example, if the error said:

Access violation reading location 0x31323334

you'd recognize this as a part of an ASCII string "1234", and this might lead you to the culprit.
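
As a rough illustration of that check, the sketch below prints the four bytes of a 32-bit address as characters. The function names are made up for this example. On little-endian x86 the bytes sit in memory in the reverse order of the hex digits, so it is worth looking at both readings.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: read the four bytes of a 32-bit value as characters,
// most significant byte first (the order the hex digits are printed in).
// Non-printable bytes are shown as '.'.
std::string bytesAsText(std::uint32_t value) {
    std::string text;
    for (int shift = 24; shift >= 0; shift -= 8) {
        const unsigned char byte =
            static_cast<unsigned char>((value >> shift) & 0xFFu);
        text += (byte >= 0x20 && byte < 0x7F) ? static_cast<char>(byte) : '.';
    }
    assert(text.size() == 4);  // exactly one character per byte
    return text;
}

// Same bytes in the order they would sit in memory on a little-endian
// machine such as x86 (least significant byte first).
std::string bytesAsTextMemoryOrder(std::uint32_t value) {
    const std::string text = bytesAsText(value);
    return std::string(text.rbegin(), text.rend());
}
```

Calling `bytesAsText(0x31323334)` gives "1234". The address from the question, 0x154c6000, contains non-printable bytes, so it looks more like a genuine address than like overwritten text.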

Pim
I am using Visual Studio 2005, and I have attached to the application from the solution in order to debug it.
Rakesh Agarwal
+1  A: 

As Patrick says, it's almost certainly your code giving MFC invalid values. One guess would be that you're passing in an incorrect length, so the library is reading too far. But there are really a multitude of possible causes.

Matthew Flaschen
Not only can user code pass in improperly structured data, but it can, for example, call a function that invokes a callback on completion and give it a userdata pointer to a valid object, only for the object to have been deleted by the time the callback arrives. The possibilities are endless here.
sharptooth
+1  A: 

Is the crash clearly reproducible?

If yes, use logfiles! Add a number of statements that just log the source file and line number they pass through. Start with a few statements at the entry point (main event handler) and on the most common execution paths. After the crash, inspect the last entry in the logfile. Then add new entries further down the path or paths that must have been taken, and so on. Usually, after a few iterations of this you will find the point of failure. Given your long wait time, the logfile might become huge, and each iteration will take another 18 hours, so you may need some technique for rotating logfiles. But with this approach I have been able to find some comparable bugs.
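
A minimal sketch of such a trace statement follows. The names `traceLine`, `lastTraceLine` and `TRACE_HERE` are invented for this example. It assumes that reopening and closing the file for every entry is acceptable, so that each line has been handed to the operating system before the next statement runs, and it takes a mutex because the application is multithreaded. It does not rotate the logfile.

```cpp
#include <cassert>
#include <fstream>
#include <mutex>
#include <string>

// Hypothetical helper: append "file:line" to the log, then flush and close
// the file so the entry survives a crash of the process right afterwards.
void traceLine(const std::string& logPath, const char* file, int line) {
    assert(file != nullptr);
    static std::mutex traceMutex;            // serialize writers across threads
    std::lock_guard<std::mutex> lock(traceMutex);
    std::ofstream log(logPath, std::ios::app);
    if (!log) {
        return;                              // never let logging crash the app
    }
    log << file << ':' << line << '\n';
    log.flush();
}                                            // ofstream destructor closes the file

// Hypothetical helper: return the last non-empty entry, i.e. the last
// trace point that was reached before the crash.
std::string lastTraceLine(const std::string& logPath) {
    std::ifstream log(logPath);
    std::string line;
    std::string last;
    while (std::getline(log, line)) {
        if (!line.empty()) {
            last = line;
        }
    }
    return last;
}

// Drop this at the points of interest, e.g. TRACE_HERE("trace.log");
#define TRACE_HERE(logPath) traceLine((logPath), __FILE__, __LINE__)
```

Opening the file for every entry is slow, so this only suits a handful of trace points on the main paths. The mutex also changes thread timing, which can make a race condition harder to reproduce.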

Some more questions:

Is your app multithreaded?

Does it use any arrays not managed by STL or comparable containers (C strings, C/C++ arrays, etc.)?

RED SOFT ADAIR
Yes, the application is multithreaded. There are no arrays that are not managed by STL. The logging is a good idea.
Rakesh Agarwal
Also, when appending to a logfile, I would recommend opening, writing, flushing and closing the file each time, so that when the crash occurs the logfile is properly closed with the last printed text in it.
KPexEA
A: 

Try attaching a debugger to the process and have the debugger break on access violations.

If this isn't possible, we use a tool called "User mode process dumper" to create a memory dump of the process at the point where the access violation happened. You can download it here:

http://www.microsoft.com/downloads/details.aspx?FamilyID=E089CA41-6A87-40C8-BF69-28AC08570B7E&displaylang=en

How it works: You configure rules on a per-process (or optionally system-wide) basis, and have the tool create either a minidump or a full dump at the point where it detects any one of a list of exceptions - one of them being an access violation. After the dump has been made the application continues as normal (and so if the access violation is unhandled, you will then see this dialog).

Note that ALL access violations in your process are captured, even those that are later handled. Also, a full dump can take a while to create, depending on the amount of memory the application is using (10-20 seconds for a process consuming 100-200 MB of private memory). For this reason it's probably not a good idea to enable it system-wide.

You should then be able to analyse the dump using tools like WinDbg (http://www.microsoft.com/whdc/devtools/debugging/default.mspx) to figure out what happened. In most cases you will find that you only need a minidump, not a full dump (however, if your application doesn't use much memory, there aren't really many drawbacks to a full dump other than its size and the time it takes to create).

Finally, be warned that debugging access violations using WinDbg can be a fairly involved and complex process. If you can get a stack trace another way, you might want to try that first.

Kragen
A: 

Hello Rakesh,

Did you manage to fix the bug?

I have exactly the same problem with a service application I've developed, so I'm very interested in this topic. Using the Structured Exception Handler, I've coded a function that shows the exception address and the virtual address of the inaccessible data at the exception location: EvalException(EXCEPTION_POINTERS* pep){...}

Maybe we can help each other...

Best Regards,

Adelmo.

I could not solve the problem and moved on :(. Sorry I couldn't help you.
Rakesh Agarwal