My C# application sends me a stack trace when it throws an unhandled exception, and I'm looking at one now that I don't understand.

It looks as though this can't possibly be my fault, but usually when I think that I'm subsequently proved wrong. 8-) Here's the stack trace:

mscorlib caused an exception (ArgumentOutOfRangeException): startIndex cannot be larger than length of string.
Parameter name: startIndex
   System.String::InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy) + 6c
   System.String::Substring(Int32 startIndex) + 0
   System.IO.Directory::InternalGetFileDirectoryNames(String path, String userPathOriginal, String searchPattern, Boolean includeFiles, Boolean includeDirs, SearchOption searchOption) + 149
   System.IO.Directory::GetFiles(String path, String searchPattern, SearchOption searchOption) + 1c
   System.IO.Directory::GetFiles(String path) + 0
   EntrianSourceSearch.Index::zz18ez() + 19b
   EntrianSourceSearch.Index::zz18dz() + a

So my code (the obfuscated function names at the end) calls System.IO.Directory.GetFiles(path) which crashes with a string indexing problem.

Sadly I don't know the value of path that was passed in, but regardless of that, surely it shouldn't be possible for System.IO.Directory::GetFiles to crash like that? Try as I might I can't come up with any argument to GetFiles that reproduces the crash.

Am I really looking at a bug in the .NET runtime, or is there something that could legitimately cause this exception? (I could understand things going wrong if the directory was being changed at the time I called GetFiles, but I wouldn't expect a string indexing exception in that case.)

Edit: Thanks to everyone for their thoughts! The most likely theory so far is that there's a pathname with dodgy non-BMP Unicode characters in it, but I still can't make it break. Looking at the code in GetFiles with Reflector, I think the only way it can break is for GetDirectoryName() to return a path that's longer than its input, even when its input is already fully normalised. Bizarre. I've tried making pathnames with non-BMP characters in (I've never had a directory called {MUSICAL SYMBOL G CLEF} before 8-) but I still can't make it break.

What I've done is add additional logging around the failing code (and made sure my logging works with non-BMP characters!). If it happens again, I'll have a lot more information.
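
For anyone wanting to do the same, here is a minimal sketch of that kind of logging. It is not the actual code from my application: `GetFilesDiagnostics`, `EscapeForLog` and `GetFilesLogged` are hypothetical names, and the `log` callback stands in for whatever logging mechanism you already have. The idea is to escape everything outside printable ASCII as `\uXXXX`, so that surrogate pairs (non-BMP characters) survive whatever encoding the log goes through.

```csharp
using System;
using System.IO;
using System.Text;

public static class GetFilesDiagnostics
{
    // Escape every char outside printable ASCII as \uXXXX. A non-BMP
    // character is two chars (a surrogate pair), so it becomes two escapes.
    public static string EscapeForLog(string s)
    {
        if (s == null) return "<null>";
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (c >= 0x20 && c < 0x7F) sb.Append(c);
            else sb.Append("\\u").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    // Hypothetical wrapper: calls Directory.GetFiles and, if it throws,
    // reports the exact path through the supplied callback, then rethrows.
    public static string[] GetFilesLogged(string path, Action<string> log)
    {
        try
        {
            return Directory.GetFiles(path);
        }
        catch (Exception ex)
        {
            log(string.Format("GetFiles failed with {0}; path=\"{1}\" (length {2})",
                ex.GetType().Name, EscapeForLog(path),
                path == null ? -1 : path.Length));
            throw;
        }
    }
}
```

Logging the length in UTF-16 code units alongside the escaped text matters here, because the failing code does index arithmetic on exactly those units.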

A: 

Perhaps it has something to do with the obfuscator, and the obfuscator is screwing things up. Try running the code without the obfuscator and post your results.

edit: Are you able to reproduce the crash?

Henri
...he doesn't know the input that caused the exception.
JoshJordan
@Henri: The same obfuscated code is running nicely for everyone else - there's just this one customer with the crash. I don't think the obfuscator's relevant. All it's doing is passing a string to GetFiles - how could obfuscation screw that up? 8-) And no, I'm not able to reproduce the crash.
RichieHindle
Obfuscation can do some weird stuff to .NET code. If the obfuscator does things like fake code injection, your program logic is no longer exactly the same as it was before the obfuscation. I have a feeling that it is doing something to manipulate that string input. You might try excluding that function from obfuscation and see if that corrects the issue.
Navaar
@Navaar: This is my own obfuscator, and trust me, it's not that clever. 8-)
RichieHindle
Indeed, I agree with Navaar. Although an obfuscator should not change any logic in the program, some can do strange stuff. Anyway, I'm afraid you can't have a definite answer unless you add some logging to the app and wait for the exception to happen again. Another note: I can't work out which version of .NET you're using. I checked mscorlib 2.0, 3.5 and 4.0 and none of them has a call stack like the one you've posted (i.e. I can't find GetFiles(string) -> GetFiles(string, string, SearchOption)).
Henri
@Henri: It's likely that the JIT (or NGen) has inlined some of the intermediate methods so that they're not appearing in the stack trace.
LukeH
Can your customer reproduce the exception? Or did this only happen once?
Sprklnh2o
@Sprklnh2o: The crash happens in a background thread that indexes files as they change, so it's not easy to tell the customer how to go about trying to reproduce it. I will go down that route, but I don't want to bug the customer until I have as much information as I can possibly get.
RichieHindle
+1  A: 

Just a guess... are any of the paths passed as arguments longer than about 260 characters (MAX_PATH)? The standard .NET Framework System.IO functions cannot handle a path that is longer than that.

Matt Hamsmith
Sorry - that would generate a PathTooLongException, not your error.
Matt Hamsmith
@Matt: That's something I hadn't thought of, but I think you're right, it would give the obvious exception, not a weird one. I've spent a long time trying to break `GetFiles` by giving it broken arguments, and it invariably throws a sensible exception back at me.
RichieHindle
I checked the official documentation (as I'm sure you have), and ArgumentOutOfRangeException isn't even listed as one of the exceptions that this function can throw. The only thing even remotely close that I can think of would be a wildcard character or maybe a relative path that needs expansion - any data that needs to be expanded to make an absolute path, like Lucas points to.
Matt Hamsmith
+2  A: 

You can try looking into the code for System.IO.Directory.GetFiles() with .NET Reflector. From a quick look, it apparently only calls String.Substring() to split something off the end of the path, which it adds back near the end of the method. It checks for Path.DirectorySeparatorChar (the backslash, '\') and Path.AltDirectorySeparatorChar (the slash, '/') to determine the start index of the substring.

My guess would be that invalid or Unicode file or folder names are confusing the method.

Lucas
@Lucas: It looks like the only way it can break is for `GetDirectoryName()` to return a path that's longer than its input, even when its input is already fully normalised. Bizarre. And I can't come up with a pathname that makes that happen.
RichieHindle
+1  A: 

Wow.. I don't think that's ever happened to me.

You're saying that it's only this one customer that this happens to?

  1. You might want to start logging the path parameters and set up the program to send the logs to you for analysis. I feel that the problem is in the format of the argument.
  2. If this obfuscated code was created by your own obfuscator, why don't you try testing it on your machine 'un-obfuscated' with some of the parameters collected and see the result?
  3. Isn't there anything in the Path class, like Path.Exist() or Path.IsValid(), to give the parameter a check? Maybe there are funny '/' or '\' or other characters, so that when the internal API parses each component, it gets confused about where each portion of the path string ends. Just an observation, since the Substring is failing.

Hope that helps and good luck! Please let us know what solution you find, as it will definitely be an interesting one.

lb
@Leo: 1. More logging, yes, already done that for the next release. 8-) 2. Will do. 3. `GetFiles` already protects itself against all that stuff - see my comment to Matt's answer http://stackoverflow.com/questions/1324806/can-you-explain-this-bizarre-crash-in-the-net-runtime/1324846#1324846
RichieHindle
+1  A: 

Perhaps you could provide some details about the customer having the issue. Things like: 1. OS name and version 2. OS Language 3. .Net version you are targeting, vs .Net version the customer is running.

There could be Unicode characters in the directory path that are causing the string length to be off by one or more.

Another note: the exception text suggests that your program was written in managed C++. You aren't mixing in any unmanaged string manipulation are you?

I might suggest that, if you can, you modify your diagnostics to capture the actual path variable that causes the error. A plausible explanation: http://support.microsoft.com/kb/943804/
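
The environment details asked for above could be gathered by the application itself and sent along with the stack trace. This is only a sketch under assumptions: `EnvironmentReport` and `Describe` are hypothetical names, and `CultureInfo.InstalledUICulture` is used here as an approximation of the "OS language".

```csharp
using System;
using System.Globalization;

public static class EnvironmentReport
{
    // Builds a one-line summary of OS version, installed UI culture and
    // the version of the CLR the process is actually running on.
    public static string Describe()
    {
        return string.Format("OS: {0}; UI culture: {1}; CLR: {2}",
            Environment.OSVersion,
            CultureInfo.InstalledUICulture.Name,
            Environment.Version);
    }
}
```

Note that `Environment.Version` reports the runtime the customer is really running, which can differ from the version the application was built against.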

Joe Caffeine
@Joe: XP-32 SP3, English, 2.0, 2.0.50727.1433 (which Earwicker comments is 3.5 without SP1). It's all C# (the C++-like syntax in the stack trace is because they're formatted by my own code, and I'm more of a C++ guy than a C# guy. 8-)
RichieHindle
Technically, 2.0.50727.1433 is 2.0 SP1. You can get 2.0 SP1 by updating the 2.0 installation or by installing 3.5 (w/o SP1), which installs 2.0 SP1 and 3.0 SP1 as prerequisites.
Lucas
+1  A: 

First and only question should have been, "Have you run ChkDsk?"

AMissico
A: 

Not sure this is related, but I'm using GetFiles in Visual C++ and it was crashing when listing the contents of C:. It turned out I had a folder with messed-up permissions from a previous install. I reclaimed the folder for my current user and that fixed the crash.

Gavin
@Gavin: Do you know what the exception was?
RichieHindle
A: 

From the source and your comments, I suspect a UNC path is causing problems, possibly combined with a security permission or share permission issue. For instance, if the user turned off creation of 8.3 file names, you will definitely have UNC path issues, because it causes the network provider to fail to retrieve proper file names in Windows 2000 and Windows XP. (I forget which service packs fixed this bug.)

Following is the source code of importance.

    String tempStr = Path.InternalCombine(fullPath, searchPattern);

    // If path ends in a trailing slash (\), append a * or we'll
    // get a "Cannot find the file specified" exception
    char lastChar = tempStr[tempStr.Length-1];
    if (lastChar == Path.DirectorySeparatorChar || lastChar == Path.AltDirectorySeparatorChar || lastChar == Path.VolumeSeparatorChar) 
        tempStr = tempStr + '*';

    fullPath = Path.GetDirectoryName(tempStr); 
    BCLDebug.Assert((fullPath != null),"fullpath can't be null!");

    String searchCriteria;
    bool trailingSlash = false;
    bool trailingSlashUserPath = false;

    lastChar = fullPath[fullPath.Length-1];
    trailingSlash = (lastChar == Path.DirectorySeparatorChar) || (lastChar == Path.AltDirectorySeparatorChar); 

    if (trailingSlash) {
        // Can happen if the path is C:\temp, in which case GetDirectoryName would return C:\ 
        searchCriteria = tempStr.Substring(fullPath.Length);
    }
    else
        searchCriteria = tempStr.Substring(fullPath.Length + 1);
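
To make the failure condition concrete, here is a small sketch that mirrors only the final `Substring` step of the source above. It is not the framework code: `SearchCriteriaSketch` and `Split` are hypothetical names, and `dirName` stands for whatever `Path.GetDirectoryName(tempStr)` is assumed to have returned.

```csharp
using System;

public static class SearchCriteriaSketch
{
    // tempStr: the combined path plus search pattern.
    // dirName: the value GetDirectoryName is assumed to have returned.
    public static string Split(string tempStr, string dirName)
    {
        char last = dirName[dirName.Length - 1];
        bool trailingSlash = last == '\\' || last == '/';
        int start = trailingSlash ? dirName.Length : dirName.Length + 1;
        // Substring throws ArgumentOutOfRangeException when start > tempStr.Length.
        return tempStr.Substring(start);
    }
}
```

With consistent inputs such as `Split(@"C:\temp\*", @"C:\temp")` the start index is always within the string. The exception in the question only appears when the directory name is at least as long as the combined string (or strictly longer, if it ends in a slash), which should never happen for a well-formed path.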
AMissico
A: 

Is it possible to quickly code up a console app and run it in debug mode? Basically, loop through the entire directory tree using the GetFiles method. Maybe something will hit, and you should be able to quickly locate the offending file.
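
Something along these lines might do as a starting point. It is only a sketch: `GetFilesProbe` and `FindFailures` are hypothetical names, and it catches every exception type on purpose so that permission errors and the odd `ArgumentOutOfRangeException` both get reported with the directory that triggered them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class GetFilesProbe
{
    // Walks the tree under root, calls Directory.GetFiles on every
    // directory, and returns one "path: exception" entry per failure.
    public static List<string> FindFailures(string root)
    {
        List<string> failures = new List<string>();
        Stack<string> pending = new Stack<string>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            try
            {
                Directory.GetFiles(dir);
                foreach (string sub in Directory.GetDirectories(dir))
                    pending.Push(sub);
            }
            catch (Exception ex)
            {
                failures.Add(dir + ": " + ex.GetType().Name + ": " + ex.Message);
            }
        }
        return failures;
    }
}
```

A console `Main` would just call `GetFilesProbe.FindFailures(args[0])` and print each entry.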

It only ever happened on a customer's machine, so the customer would have to have done it. To my knowledge it hasn't happened again, so perhaps AMissico was right with his ChkDsk suggestion.
RichieHindle