I have a very complicated program that is failing, and I've simplified it to this test set with a batch file and C program.

My C program uses ExitProcess to pass back the errorlevel to a batch file. Sometimes on Windows 7 (Microsoft Windows [Version 6.1.7600]), the errorlevel is not being interpreted correctly.

I expect the test below to run forever. On Windows XP it appears to do so. On two different dual-core Windows 7 machines (one 64-bit, one 32-bit), it fails within a couple of minutes.

I can't imagine that I'm doing something wrong, but in case there is something funny about ExitProcess on Windows 7, I thought I'd ask. Have I done anything invalid here?

Batch file test.bat for cmd.exe:

@ECHO OFF
SET I=0
:pass
SET /A I=I+1
Title %I%
start/wait level250
if errorlevel 251 goto fail
if errorlevel 250 goto pass
:fail

Program level250.c:

#include "windows.h"

static volatile int Terminate = 0;

static DWORD WINAPI TestThread(LPVOID unused)
    {
    Terminate = 1;
    return 0;
    }

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpszCmdLine, int nCmdShow)
    {
    CreateThread(NULL, 0, TestThread, NULL, 0, NULL);

    while (Terminate == 0) Sleep(1);
    ExitProcess(250);
    }

My compiler version and invocation are:

Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8804 for 80x86

Copyright (C) Microsoft Corp 1984-1998. All rights reserved.

cl /MT level250.c

Other information: I have also tried running under JPSoft's TCC and get the same behavior as using CMD. I am using a straight .c program, not .cpp. I do not see any failures in a single threaded version. I put the sources and binaries on http://jcook.info/win7fail and the zip file MD5 is 579F4FB15FC7C1EA454E30FDEF97C16B and CRC32 is C27CB73D.

EDIT: After suggestions, I have further changed the test case and still see the failures. In our actual application, there are hundreds of threads. Some threads exit with various significant return codes, some run forever, and some are hung in operating system calls or DLLs and are difficult (if not impossible) to kill.

#include "windows.h"

static DWORD WINAPI TestThread(LPVOID unused)
    {
    return 0;
    }

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpszCmdLine, int nCmdShow)
    {
    CreateThread(NULL, 0, TestThread, NULL, 0, NULL);
    return(250);
    }
+1  A: 

Print what the return code actually is. If something goes wrong (a segmentation fault, or some other error you're not aware of), there's no guarantee that you will get the 250 or 251 that you expect. Furthermore, I can't see where your code returns 251. Be wary of high exit codes: I believe 255 is the safe portable limit, and on some systems the limit may be below 64, or at most 127. (This might be irrelevant since you're obviously using Windows, but it's worth noting.)

Also try attaching a debugger, or loading a core dump after the process dies unexpectedly.

Matt Joiner
In a batch file "if errorlevel 251" performs the action if the errorlevel is 251 or more. Therefore following that with "if errorlevel 250" is the way to test for exactly 250. We have all sorts of handling for segfault and such, but I have reduced a multi-thousand line problem to this simple case, hoping to learn why this simple case fails.
piCookie
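
The `if errorlevel` semantics described in the comment above can be sketched in C. This is only an illustration: `errorlevel_at_least` and `batch_would_pass` are hypothetical helper names, not part of cmd.exe or any API, and they simply mirror the order of the two checks in test.bat.

```c
/* Mirrors "if errorlevel N" in a batch file: true when the
   exit code is N or greater, not only when it equals N. */
static int errorlevel_at_least(int errorlevel, int n)
{
    return errorlevel >= n;
}

/* Mirrors the branch order in test.bat: the 251 check runs first,
   so only an exit code of exactly 250 reaches "goto pass".
   Returns 1 for the :pass branch and 0 for the :fail branch. */
static int batch_would_pass(int errorlevel)
{
    if (errorlevel_at_least(errorlevel, 251)) return 0; /* goto fail */
    if (errorlevel_at_least(errorlevel, 250)) return 1; /* goto pass */
    return 0;                                           /* falls through to :fail */
}
```

With these checks, an exit code of 250 loops again, while any other value (lower or higher) ends the batch file.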
+1  A: 

When it fails, the process appears to return the thread's return value as its exit code. I changed the return value of the thread to 37 and added echo %errorlevel% to the end of the batch file. When it stopped on my PC, it printed 37. So it seems that there is some kind of synchronization problem. To fix it, I changed the code in WinMain to the following:

HANDLE h = CreateThread(NULL, 0, TestThread, NULL, 0, NULL);
while (Terminate == 0) Sleep(1);
WaitForSingleObject( h, INFINITE );
ExitProcess(250);

The documentation for ExitProcess clearly says that the exit code is "for the process and all threads", so this looks like a bug. However, relying on ExitProcess to kill all the threads doesn't seem like the best plan, so waiting for them to finish is probably a reasonable course of action.

I built the program and reproduced the problem with VC6 (the version you used, I believe), VS2005, and VS2008. Out of curiosity, I ran it on a 2-core Windows 7 laptop and a 4-core Windows 7 desktop machine. It did not reproduce on an older single-core hyperthreaded XP machine, but that isn't to say it would never fail there; maybe it just needed to run longer.

Edit: It would be a bit of a kludge, but perhaps a workaround would be to store the exit code in a global variable in the application and return that value from all threads. Then, in the situation where this problem/bug occurs, the exit code of the application would still be the desired value.
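
A minimal sketch of that kludge, under stated assumptions: `g_exit_code` is a hypothetical global introduced here for illustration, and the typedefs in the non-Windows branch are stand-ins (not Win32 definitions) that only let the thread routine compile elsewhere. It has not been verified to avoid the failure; it only shows the idea of every thread returning the intended process exit code.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the sketch compiles off Windows; NOT part of the Win32 API. */
typedef unsigned long DWORD;
typedef void *LPVOID;
#define WINAPI
#endif

/* Hypothetical global: the exit code the process intends to report. */
static volatile DWORD g_exit_code = 250;

/* Every thread returns the intended process exit code, so if a thread's
   return value is ever reported in place of the ExitProcess argument,
   the batch file still sees the desired value. */
static DWORD WINAPI TestThread(LPVOID unused)
{
    (void)unused;
    return g_exit_code;
}

#ifdef _WIN32
int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance,
                   LPSTR lpszCmdLine, int nCmdShow)
{
    HANDLE h = CreateThread(NULL, 0, TestThread, NULL, 0, NULL);
    if (h != NULL)
        CloseHandle(h);   /* closing the handle does not stop the thread */
    ExitProcess(g_exit_code);
}
#endif
```

The obvious cost is that thread return values can no longer carry any other information.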

Mark Wilkins
Thank you very much for your efforts! Unfortunately, we cannot wait for threads to finish in our actual program (some are hung in the OS, etc.), but I will explore what else can be done before calling ExitProcess. Our program has run for years on Windows NT 4.0 and later without ever having an errorlevel failure like this; I suspect your XP machine will never fail.
piCookie
Alas, your edit kludge is also not workable in our live application because some threads that terminate pass back information in their return value. Thanks for your continuing ideas!
piCookie
@piCookie: The real world conspires against the easy solutions :) I think you are correct that it is specifically a Win7 problem. Or at least I am unable to prove you wrong. I let it run over half a million iterations on a two-core XP machine without failure. It is an interesting problem (more for me than you I'm sure since it is not my direct problem).
Mark Wilkins