I have an application that crunches a bunch of text files. Currently, I have code like this (snipped-together excerpt):

FileInfo info = new FileInfo(...);
if (info.Length > 0) {
    string contents = getFileContents(...);
        // uses a StreamReader
        // returns reader.ReadToEnd();
    Debug.Assert(!string.IsNullOrEmpty(contents)); // FAIL
}

private string getFileContents(string filename)
{
    TextReader reader = null;
    string text = "";

    try
    {
        reader = new StreamReader(filename);
        text = reader.ReadToEnd();
    }
    catch (IOException)
    {
        // File is concurrently accessed. Come back later.
        text = "";
    }
    finally
    {
        if (reader != null)
        {
            reader.Close();
        }
    }

    return text;
}

Why is the assert failing? I already used the FileInfo.Length property to check that the file is non-empty.

Edit: This appears to be a bug in my code: I'm catching IOException and returning an empty string. But, given the discussion around FileInfo.Length, here's something interesting: FileInfo.Length returns 2 for an otherwise empty text file that contains only a BOM (created in Notepad).

+4  A: 

You might have a file which is empty apart from a byte-order mark. I think TextReader.ReadToEnd() would remove the byte-order mark, giving you an empty string.

Alternatively, the file could have been truncated between checking the length and reading it.

For diagnostic purposes, I suggest you log the file length when you get an empty string.

Jon Skeet
Will try logging.
ashes999
You're correct on that italicised thinking. Unless you explicitly open it as a BOMless encoding like US-ASCII, it'll eat the BOM (as indeed, it should).
Jon Hanna
A file with just BOM returns a non-zero length, but ReadToEnd() gives empty contents; see updated question.
ashes999
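The BOM-only case discussed above can be checked with a small program. This is only a sketch under stated assumptions: the class name `BomOnlyDemo`, the helper `ReadAllText`, and the temp file are made up for illustration, and the two-byte UTF-16 little-endian BOM is written by hand instead of being produced by Notepad.

```csharp
using System;
using System.IO;

static class BomOnlyDemo
{
    // Hypothetical helper: reads a file with a default StreamReader,
    // as the code in the question does.
    public static string ReadAllText(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Write only the UTF-16 little-endian byte-order mark (FF FE).
            File.WriteAllBytes(path, new byte[] { 0xFF, 0xFE });

            long length = new FileInfo(path).Length; // bytes on disk
            string text = ReadAllText(path);         // decoded characters

            Console.WriteLine("FileInfo.Length = " + length);
            Console.WriteLine("text.Length = " + text.Length);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Run as written, this should report a file length of 2 and a string length of 0, which matches the observation in the updated question: FileInfo.Length counts bytes on disk, while ReadToEnd() returns decoded characters with the BOM consumed.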
A: 

If I remember correctly, a file ends with an end-of-file marker, which won't be included when you call ReadToEnd.

Therefore, the file size is not 0, but its content size is.

Maupertuis
Nope, EOF is only handled by the console, but not by the `StreamReader`. See my quick test: http://pastebin.com/mhjr8yrV
Lucero
Yes, but we are speaking about FileInfo.Length, which is the complete file length, not the stream reader length. I am quite sure that they are not equal (which this question seems to demonstrate).
Maupertuis
@Lucero I don't really understand your pastebin example.
ashes999
@ashes999, the point is that the EOF character (26) is read into the string normally; reading does not stop or break there, so a possible EOF character is not the cause of the problem at hand.
Lucero
I agree with you, Lucero; the root cause here is thinking that FileInfo.Length == FileStream.Length == string.Length.
Maupertuis
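The claim that character 26 is read like any other character can be illustrated as follows. This is a sketch and not a copy of the linked pastebin; the class name `EofCharDemo`, the helper `ReadAllText`, and the three-byte test file are assumptions made for the example.

```csharp
using System;
using System.IO;

static class EofCharDemo
{
    // Hypothetical helper: reads the whole file with a default StreamReader.
    public static string ReadAllText(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // 'a', then byte 26 (0x1A, the old Ctrl+Z "EOF" character), then 'b'.
            File.WriteAllBytes(path, new byte[] { (byte)'a', 0x1A, (byte)'b' });

            string text = ReadAllText(path);

            // Reading does not stop at byte 26; all three characters arrive.
            Console.WriteLine("text.Length = " + text.Length);
            Console.WriteLine("text[1] = " + (int)text[1]);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Here the string should contain three characters, with the middle one having code 26, so an embedded EOF character does not by itself explain an empty result.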
A: 

What's in the getFileContents method?

It may be repositioning the stream's pointer to the end of the stream before ReadToEnd() is called.

Toby
+1  A: 

See that catch (IOException) block you have? That's what returns an empty string and triggers the assert even when the file is not empty.

Ben Voigt
Bingo. Complete oversight on my end.
ashes999