My software recently got deployed to a customer who said that the application was crashing immediately after it started. After some initial debugging, the customer provided me remote access to one of the computers which was unable to run the application. I found that the crash wasn't specific to my application. Any application which depended on the .NET framework crashed immediately.

Conveniently, Visual Studio 2008 was installed, so I created a quick hello world application in it and clicked Debug. The application worked fine. But when I then tried to run the generated binary (/bin/Debug/HelloWorld.exe) outside of Visual Studio, it crashed.

List of things I've tried (UPDATED):

  • I checked that "Everyone" has Read&Execute permissions for c:\Windows.
  • To confirm that the problem was with the .NET Framework (and not my application), I tried installing Paint.NET on the computer. Its setup frontend crashed in the same manner.
  • Performed a repair of the .NET framework as outlined in http://support.microsoft.com/kb/908077 (Boy was this fun and time consuming). No luck.
  • Installed .NET 3.5 SP1 (before, it just had .NET 3.5). Note: my application targets 2.0, so this was more of a long shot... but I learned in the process that .NET 3.5 SP1 also updates the underlying frameworks.
  • Ran Aaron Stebner's .NET Setup Verification Tool. This tool indicated that .NET was successfully installed. (I forget if I checked all the versions, but at least 2.0 passed.)
  • Tested some mini hello world applications which were targeted for .NET 2.0 and .NET 3.5 and both crashed in the same way.
  • Tried launching .NET apps from the WinDbg command line. This did allow me to invoke my simple hello world applications. So a simple .NET hello world works when launched by WinDbg or via Debug in Visual Studio... but not if I try to execute it standalone.

I created a dump file using WinDbg. It wasn't all that revealing to me.

FAULTING_IP:
mscorwks!PEImage::GetEntryPointToken+21
79f4ff9d f6401010        test    byte ptr [eax+10h],10h

EXCEPTION_RECORD:  0012f710 -- (.exr 0x12f710)
ExceptionAddress: 79f4ff9d (mscorwks!PEImage::GetEntryPointToken+0x00000021)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 00000000
   Parameter[1]: 00000010
Attempt to read from address 00000010

FAULTING_THREAD:  00000b44
PROCESS_NAME:  MyProcess.exe
ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION}  Breakpoint  A breakpoint has been reached.
EXCEPTION_CODE: (HRESULT) 0x80000003 (2147483651) - One or more arguments are invalid
DETOURED_IMAGE: 1
NTGLOBALFLAG:  0
APPLICATION_VERIFIER_FLAGS:  0

MANAGED_STACK: !dumpstack -EE
OS Thread Id: 0xb44 (0)
Current frame:
ChildEBP RetAddr  Caller,Callee

EXCEPTION_OBJECT: !pe cb10b4
Exception object: 00cb10b4
Exception type: System.ExecutionEngineException
Message: <none>
InnerException: <none>
StackTrace (generated): <none>
StackTraceString: <none>
HResult: 80131506

MANAGED_OBJECT_NAME:  System.ExecutionEngineException

CONTEXT:  0012f72c -- (.cxr 0x12f72c)
eax=00000000 ebx=00000000 ecx=00000000 edx=0000000e esi=001a1490 edi=00000001
eip=79f4ff9d esp=0012f9f8 ebp=0012fa1c iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
mscorwks!PEImage::GetEntryPointToken+0x21:
79f4ff9d f6401010        test    byte ptr [eax+10h],10h     ds:0023:00000010=??
Resetting default scope

READ_ADDRESS:  00000010
FOLLOWUP_IP:
mscorwks!PEImage::GetEntryPointToken+21
79f4ff9d f6401010        test    byte ptr [eax+10h],10h

BUGCHECK_STR:  APPLICATION_FAULT_NULL_CLASS_PTR_DEREFERENCE_SHUTDOWN
PRIMARY_PROBLEM_CLASS:  NULL_CLASS_PTR_DEREFERENCE_SHUTDOWN
DEFAULT_BUCKET_ID:  NULL_CLASS_PTR_DEREFERENCE_SHUTDOWN
LAST_CONTROL_TRANSFER:  from 79ef02b5 to 79f4ff9d

STACK_TEXT:
79f4ff9d mscorwks!PEImage::GetEntryPointToken+0x21
79ef02b5 mscorwks!PEFile::GetEntryPointToken+0xa0
79eefeaf mscorwks!SystemDomain::ExecuteMainMethod+0xd4
79fb9793 mscorwks!ExecuteEXE+0x59
79fb96df mscorwks!_CorExeMain+0x15c
7900b1b3 mscoree!_CorExeMain+0x2c
7c817077 kernel32!BaseProcessStart+0x23

SYMBOL_STACK_INDEX:  0
SYMBOL_NAME:  mscorwks!PEImage::GetEntryPointToken+21
FOLLOWUP_NAME:  MachineOwner
MODULE_NAME: mscorwks
IMAGE_NAME:  mscorwks.dll
DEBUG_FLR_IMAGE_TIMESTAMP:  471ef729
STACK_COMMAND:  .cxr 0012F72C ; kb ; dds 12f9f8 ; kb
FAILURE_BUCKET_ID:  NULL_CLASS_PTR_DEREFERENCE_SHUTDOWN_80000003_mscorwks.dll!PEImage::GetEntryPointToken
BUCKET_ID:  APPLICATION_FAULT_NULL_CLASS_PTR_DEREFERENCE_SHUTDOWN_DETOURED_mscorwks!PEImage::GetEntryPointToken+21
WATSON_STAGEONE_URL:  http://watson.microsoft.com/StageOne/MyProcess_exe/2_4_4_39/4a8a192c/unknown/0_0_0_0/bbbbbbb4/80000003/00000000.htm?Retriage=1

Followup: MachineOwner

Edit 1: The event log details for this error say it's a .NET Runtime version 2.0.50727.3053 - Fatal Execution Engine Error (7A097706)(80131506).
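
The 80131506 in that event log entry is an HRESULT, and its bit fields can be unpacked to see where it comes from. The snippet below is only an illustrative Python sketch (the helper name decode_hresult is made up here); it assumes the standard HRESULT layout of a severity bit, an 11-bit facility and a 16-bit code.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    failed = bool((value >> 31) & 1)   # bit 31: severity (1 = failure)
    facility = (value >> 16) & 0x7FF   # bits 16-26: facility
    code = value & 0xFFFF              # bits 0-15: facility-specific code
    return failed, facility, code


failed, facility, code = decode_hresult(0x80131506)
print(failed, hex(facility), hex(code))
```

Facility 0x13 is FACILITY_URT, the range used by the .NET runtime, and 0x80131506 is the value defined as COR_E_EXECUTIONENGINE, which matches the System.ExecutionEngineException in the dump. That only confirms that the CLR itself raised the failure; it says nothing about the cause.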

[Screenshot of the .NET Fatal Execution Engine Error]

Edit 2 (10-7-09): This issue is still active.

Edit 3 (3-29-10): This update is to let everyone know that I never did successfully solve the problem. The customer whose machine this was on lost interest in solving it and just reimaged the machine :(. Thanks for all the contributions though.

A: 

Sounds like you got yourself a "fun" one there. I've no clue, but here are my karma-whoring stab-in-the-dark suggestions anyway.

1) In addition to the regular user permissions assigned by Windows, there is also a separate set of security settings specifically for the .NET framework. If you have the .NET SDK installed, look for the "Microsoft .NET Framework Configuration" tool in your Control Panel (may be under Administrative Tools). See if any of the settings are different there from your dev machine.

2) I'm guessing your customer is under the thumb of an IT regime at his workplace. See if you or your customer can get your hands on a fresh Windows install where the program works and then, working with his IT group, apply one of their requirements (security settings, antivirus programs, etc.) at a time until your program stops working. A good old last working state bug-hunt. Of course, this assumes cooperation from his IT dept, so I hope your program is important to an executive somewhere.

sskuce
@sskuce I tried the Configuration Tool... Under "Runtime Security Policy" I clicked "Reset All Policy Levels" but that didn't help unfortunately. Also, I'd suspect that .NET would give a better error message if it was a security setting. As for your second suggestion, that's a last resort... it's strange b/c it works on 80% of the customer's computers and they are all managed by the same desktop management utility... so theoretically the environments should be identical to where the application does work properly.
blak3r
You hadn't mentioned that it works on most computers - I assumed it didn't work on any computers at your customer's. In that case, have you checked the patch levels of the OS and .NET on the failing machines?
sskuce
I was told that all the computers were identical. I just checked and this one has 3.5 but not SP1... I'm installing that now. My application was designed for 2.0, so I'm not sure if that matters.
blak3r
3.5 sp1 didn't help and it took a really long time to install.
blak3r
A: 

There is no silver bullet fix and I do not think it is a permission issue.

Here is what I would try

  1. If it is a 64-bit machine, try switching to 32-bit mode. I have seen this with 32-bit DLLs trying to run in 64-bit.
  2. Create a new website on the server and run aspnet_regiis on it.
  3. Uninstall and reinstall the 2.0 and 3.5 frameworks. Be sure to run aspnet_regiis when complete.
Gary
@Gary Not sure if IIS is even installed. My application is a desktop application. I'm failing to see why aspnet_regiis would get me any closer to knowing what the problem is.
blak3r
It's a good point to check if it's 64 or 32-bit, though. Depending on your compiler options and build machine, you may be targeting a particular system.
TrueWill
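
One way to check what an executable was built for is to read the Machine field from its PE header. The following is a minimal Python sketch, not a full PE parser; the names pe_machine and make_stub are hypothetical, and make_stub only fabricates a few header bytes so the example runs without a real EXE.

```python
import struct

MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0x0200: "IA-64"}


def pe_machine(data):
    """Return the machine type recorded in a PE image's COFF header."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 16-bit Machine field directly follows the signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, hex(machine))


def make_stub(machine):
    """Build a fake header for demonstration only (not a loadable EXE)."""
    stub = bytearray(0x80 + 6)
    stub[:2] = b"MZ"
    struct.pack_into("<I", stub, 0x3C, 0x80)
    stub[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<H", stub, 0x84, machine)
    return bytes(stub)


print(pe_machine(make_stub(0x014C)))
print(pe_machine(make_stub(0x8664)))
```

For a real file, pass the bytes read from it, e.g. pe_machine(open(path, "rb").read()). Note that for a managed assembly the Machine field is not the whole story: an AnyCPU build is also marked x86 there, and the 32-bit-required bit in the CLR header decides how it runs on a 64-bit OS.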
A: 

A couple more suggestions to try:

  • Just to make sure it isn't the vshost.exe issue, I would try running MyProcess.exe under cdb/windbg and see the behavior.
  • The issue looks like a read AV, and if the app works properly under a debugger, I would try to repair my .Net installation, in case there is some corruption in the way the OS hands over execution of a .Net assembly to mscorwks.dll.
Arun Mahapatra
@Codito I am able to get further when I launch through windbg. I can launch a very simple hello world app that way. So, I'm attempting to repair.
blak3r
+1  A: 

We had a very similar issue in a large-scale deployment. In all cases, running the repair on the framework fixed the issue. I would give it a try.

rerun
Which framework did you repair? They have 1, 2, 3.5, etc. Was it repairing the framework version your application targeted that generally fixed the problem, or the latest one?
blak3r
Our customer attempted to repair by following the steps outlined in this kb article: http://support.microsoft.com/kb/908077. No luck.
blak3r
A: 

Launch it using WinDbg. Using Son of Strike you should be able to see exactly why it is crashing. There may be a low level assembly load error. I have run down similar problems using WinDbg in the past.

Steve
@Steve I have tried windbg. That's how I created the dump info that is listed in the question :P. The .NET execution engine is fatally crashing. It's not specific to my application, and I wasn't really able to get anything out of that information. Any .NET application crashes. For example, I tried to install Paint.NET on the customer's computer and it also failed.
blak3r
I realized that after I posted. Can you crack open the stack frame where the exception is happening? It almost looks like it doesn't think there is a managed entrypoint, which would make no sense. Can you replicate this on any other machine? Do native apps that call into the CLR work? Have you tried explicitly targeting x86?
Steve
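
One way to read the faulting instruction in the question's dump, offered as an interpretation rather than something the dump proves: test byte ptr [eax+10h],10h checks bit 0x10 of the field at offset 0x10 from eax, and eax is 0, hence the read from address 0x10. In the documented IMAGE_COR20_HEADER layout the Flags field sits at offset 0x10, and 0x10 is COMIMAGE_FLAGS_NATIVE_ENTRYPOINT, so eax may have been meant to hold a pointer to the image's CLR header. The Python sketch below just recomputes that offset from the header layout; the variable names are illustrative.

```python
import struct

# Leading fields of IMAGE_COR20_HEADER (corhdr.h), little-endian, no padding:
#   DWORD cb; WORD MajorRuntimeVersion; WORD MinorRuntimeVersion;
#   IMAGE_DATA_DIRECTORY MetaData (DWORD VirtualAddress; DWORD Size);
#   DWORD Flags; DWORD EntryPointToken (or EntryPointRVA); ...
FIELDS_BEFORE_FLAGS = "<IHHII"
COMIMAGE_FLAGS_NATIVE_ENTRYPOINT = 0x10

# Size of everything that precedes Flags == offset of Flags.
flags_offset = struct.calcsize(FIELDS_BEFORE_FLAGS)

# Values copied from the dump in the question.
eax = 0x00000000
read_address = 0x00000010
tested_mask = 0x10

print(hex(flags_offset))
print(eax + flags_offset == read_address)
print(tested_mask == COMIMAGE_FLAGS_NATIVE_ENTRYPOINT)
```

If that reading is right, the runtime reached the entry-point lookup without a CLR header pointer for the main image, which would fit the remark that it "doesn't think there is a managed entrypoint". It does not explain why the pointer was null.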
A: 

Make sure the user is not launching the application from a network share. By default, .Net throws a security exception when attempting to launch an application from an untrusted source.

You can change this behaviour using the Microsoft .Net Framework Configuration utility in Administrative Tools.
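
A quick way to spot the obvious case is to check whether the launch path is a UNC path. This is a rough Python sketch with a hypothetical helper name (looks_like_unc) and made-up example paths; it only inspects the path string, so it cannot tell that a mapped drive letter such as Z: actually points at a share.

```python
import ntpath


def looks_like_unc(path):
    """True if a Windows path string has a UNC (server/share) prefix."""
    drive, _ = ntpath.splitdrive(path)
    # splitdrive keeps the original separators, so normalise before testing.
    return drive.replace("/", "\\").startswith("\\\\")


print(looks_like_unc(r"\\fileserver\apps\HelloWorld.exe"))
print(looks_like_unc(r"C:\apps\HelloWorld.exe"))
```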

X-Cubed
@X-Cubed Thanks, but that isn't the issue. All .NET-dependent apps have this issue. I tried Paint.NET and a few other ones I wrote.
blak3r
These are the default machine-wide security settings that the .Net framework has, so my point applies to all .Net applications, not yours specifically.
X-Cubed
@X-Cubed I understand. Thanks.
blak3r
A: 

I had a similar problem a few months ago (I do not remember the error code though). After trying many things, the following solved the problem (as far as I can remember):

Removing all temporary files in the .net temporary folder (and also checking the permission of that folder)

Samuel
@Samuel I investigated temporary directories for .NET because I had never heard of such a thing, and found that there are temporary files for ASP.NET sites... The directory is of the form: "%windir%\Microsoft.NET\Framework\<VERSION>\Temporary ASP.NET Files". But as far as desktop applications go, I don't know of such a thing. Was your issue with an ASP.NET site or a desktop application?
blak3r
It was an ASP.NET application. Sorry, I missed the context.
Samuel
A: 

Since it doesn't happen on every client machine, could it be RAM? Can you reboot the bad machine and run the memory diagnostic tool?

Could DEP have been switched on for this machine at some point in the past? You wouldn't get a .Net exception for this kind of security because your app simply doesn't get to run so there's little chance to throw an error.

Rocjoe
@Rocjoe Bad Ram... No for lots of reasons. Not sure about DEP. I'll ask.
blak3r
+6  A: 

Based on your windbg output it looks like someone has injected a DLL into the process at process-launch, and that the injection isn't designed for whatever version of mscorwks that has been loaded. If this is a casual workstation (e.g. secretary) I would have it confiscated for MIS/IT to inspect for malware. If it is a machine sitting in a server room I would look toward the customer to perform a relocation to another machine.

I don't suspect this would happen to any other customer. In 8 years of .NET development, the only thing I have seen that can (expectedly) cause the behavior you're describing is an attempt to run a .NET application on a system with an older version of the framework installed (e.g. lack of support, which results in a standard debug/cancel dialog on most versions of Windows), and that is NOT what this problem is. This is also not related to processor architecture, framework version or SP level, and it is not related to any commercial AV software or any commercial network-security software.

It's clearly not something in your code, and I don't see that it is something you can fix for your client. I know of no tool nor series of steps you can use to resolve this issue short of having the customer re-image the target machine. Before they do so, again, have it ghosted by MIS/IT for potential malware (specifically, malware that wouldn't be distributed through the general public.)

For related reading: http://research.microsoft.com/apps/pubs/default.aspx?id=68568

Good luck.

Shaun Wilson
Interesting theory... what exactly did you see in the windbg output that makes you think this? My customer is a university and the software is being installed on most of their computers across campus using Altiris. So, it seems unlikely they distributed malware to a set of computers.
blak3r
Altiris doesn't prevent the installation (or development) of malware. I'm just making a best guess: the null-pointer fault is thrown during entry of the process, and the only thing I know of that does this is a bad method rewrite for a ctor (or similar). Again, a best guess. Since this is meant to be a controlled environment, you could audit this machine against another ghosted machine to determine what the binary differences are. It depends on how necessary it is to resolve the problem, I suppose.
Shaun Wilson
I'm guessing that the reason running .NET executables via WinDbg/VS.NET works is because those apps directly launch the .NET runtime - whereas if you're running a .NET app via Explorer, Explorer has to look inside the EXE to determine whether it's a Win32 or .NET executable, and take the appropriate action (see the kernel32!BaseProcessStart entry in your stack trace).
Ian Kemp
A: 

Maybe this link could help you resolve your problem: Fixing The Error “.NET Runtime version 2.0.50727.3053 – Fatal Execution Engine Error” [.NET].

What operating system (Windows XP, Vista, etc.) does your customer's PC have installed?

Have you tried completely uninstalling the .NET Framework (not repairing it) and then reinstalling it?

Jehof
@Jehof The suggestion provided by that link was to use System Restore to roll back to an earlier point, which isn't really an option. I know the issue is occurring on WinXP SP3; I'm not sure if it happens on Vista. I'm not sure exactly what they did beyond following the suggestions in http://support.microsoft.com/kb/908077. They are leaning towards reghosting because the repair takes so long, so I doubt they'd try any more .NET Framework repairs.
blak3r
A: 

Had the same issue and it was fixed by using the Publish wizard. That's how I found out the target machine did not have the Visual Basic Powerpack 3.0 package installed. After installing that, it works like a charm.

SwissJay
@SwissJay I no longer have access to the box on which this problem occurred. Thanks for your suggestion. If this helps someone else, let me know and I'll mark it as the correct answer. It would probably help if you elaborated on the "Publish wizard". I'm not familiar with it, so I'm assuming others won't be either.
blak3r
A: 

I recently had an issue that was manifesting in a very similar way. It turned out that certain third-party DLLs were not part of my deployment (I was just copying things from the bin directory). I created a setup application that picked up all the DLLs and once those got deployed correctly, it stopped crashing. It's strange that it was a hard fail rather than an exception.

This might not apply to you since you say it applies to any .NET application. Maybe you are working in an old project file with some leftover references?

strongopinions
@strongopinions Hmm... I do actually copy DLLs to my bin directory, so that is interesting. But it certainly doesn't explain why other .NET apps don't work, nor why this issue hasn't happened on other computers.
blak3r
A: 

My father-in-law experienced an issue where a .NET app was crashing when launched on his personal laptop. The app was complaining about the installed version of the .NET framework. I was going to repair the installation, but first I ran Windows update. It has to be the dumbest fix I ever made, but all I did was install some optional updates. I can't remember which specific updates they were, but they were all things I would have installed on my own computer. The app worked perfectly after that.

Aaron
@Aaron Thanks for the suggestion, but I believe these machines had all the Windows updates, and since reinstalling .NET didn't fix the problem, I suspect this wouldn't have fixed my issue.
blak3r