Is there a way to keep StreamReader from doing any buffering?

I'm trying to handle output from a Process that may be either binary or text. The output will look like an HTTP Response, e.g.

Content-type: application/whatever
Another-header: value

text or binary data here

What I want to do is to parse the headers using a StreamReader, and then either read from its BaseStream or the StreamReader to handle the rest of the content. Here's basically what I started with:

private static readonly Regex HttpHeader = new Regex("([^:]+): *(.*)");
private void HandleOutput(StreamReader reader)
{
  var headers = new NameValueCollection();
  string line;
  while((line = reader.ReadLine()) != null)
  {
    Match header = HttpHeader.Match(line);
    if(header.Success)
    {
      headers.Add(header.Groups[1].Value, header.Groups[2].Value);
    }
    else
    {
      break;
    }
  }
  DoStuff(reader.ReadToEnd());
}

This seems to trash binary data. So I changed the last line to something like this:

if(headers["Content-type"] != "text/html")
{
  // reader.BaseStream.Position is not at the same place that reader
  // makes it look like it is.
  // i.e. reader.Read() != reader.BaseStream.Read()
  DoBinaryStuff(reader.BaseStream);
}
else
{
  DoTextStuff(reader.ReadToEnd());
}

... but StreamReader buffers its input, so reader.BaseStream is in the wrong position. Is there a way to unbuffer StreamReader? Or can I tell StreamReader to reset the stream back to where StreamReader is?

A: 

Well, you can use Stream.Seek to set the position of the stream, provided the stream supports seeking (check Stream.CanSeek). It sounds to me like the problem you're having here is that StreamReader is reading characters rather than bytes (and, depending on the encoding, a character may not be exactly 1 byte). From the MSDN Library:

StreamReader is designed for character input in a particular encoding, whereas the Stream class is designed for byte input and output.

When you call reader.ReadToEnd(), it reads the data in as a character string based on whatever encoding it's using. You might have better luck using the Stream.Read method. Read in your string data with StreamReader and then pull out the binary data into a byte[] when you've read in the header that notifies you of incoming binary data.
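
A minimal sketch of that idea, with one variation that is an assumption of this example rather than something stated above: the header lines are read byte by byte from the Stream itself instead of through a StreamReader, so nothing past the blank line is consumed before the body is read. The names HeaderThenBody, ReadAsciiLine, ReadHeaders and ReadBody are hypothetical, and the headers are assumed to be plain ASCII.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.IO;
using System.Text;

static class HeaderThenBody
{
    // Reads one ASCII line directly from the stream, one byte at a time,
    // so nothing after the line terminator is consumed.
    // Returns null when the stream is already at its end.
    public static string ReadAsciiLine(Stream stream)
    {
        var bytes = new List<byte>();
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (b == '\n')
            {
                // Drop a trailing '\r' so both \n and \r\n endings work.
                if (bytes.Count > 0 && bytes[bytes.Count - 1] == (byte)'\r')
                    bytes.RemoveAt(bytes.Count - 1);
                return Encoding.ASCII.GetString(bytes.ToArray());
            }
            bytes.Add((byte)b);
        }
        return bytes.Count > 0 ? Encoding.ASCII.GetString(bytes.ToArray()) : null;
    }

    // Parses "Name: value" lines until the first blank line (or end of stream).
    public static NameValueCollection ReadHeaders(Stream stream)
    {
        var headers = new NameValueCollection();
        string line;
        while ((line = ReadAsciiLine(stream)) != null && line.Length > 0)
        {
            int colon = line.IndexOf(':');
            if (colon > 0)
                headers.Add(line.Substring(0, colon), line.Substring(colon + 1).Trim());
        }
        return headers;
    }

    // Copies whatever is left in the stream into a byte array.
    public static byte[] ReadBody(Stream stream)
    {
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

With a Process, this would be called on process.StandardOutput.BaseStream before anything has been read through the StandardOutput reader itself. If the content type turns out to be text, the bytes returned by ReadBody can still be decoded afterwards with the appropriate Encoding.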

Stuart Childs
A: 

This answer is late and possibly no longer relevant to you but it may come in handy for someone else who stumbles across this problem.

My problem involved PPM files, which have a similar format of:

  • ASCII text in the beginning
  • Binary bytes for the rest of the file

The problem I ran into was that the StreamReader class cannot read one byte at a time without buffering ahead in the underlying stream. This caused unexpected results in some cases, since the Read() method reads a single character, not a single byte.
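
The read-ahead is easy to observe. The following is only an illustrative sketch (ReadAheadDemo and PositionAfterFirstLine are made-up names): it reads a single line through a StreamReader and then reports where the underlying stream ended up.

```csharp
using System;
using System.IO;
using System.Text;

static class ReadAheadDemo
{
    // Returns how far the underlying stream has advanced
    // after a single ReadLine() call on a StreamReader.
    public static long PositionAfterFirstLine(byte[] data)
    {
        var stream = new MemoryStream(data);
        using (var reader = new StreamReader(stream, Encoding.ASCII))
        {
            reader.ReadLine();
            // The reader fills its internal buffer in one go, so the stream
            // position is usually well past the end of the first line.
            return stream.Position;
        }
    }
}
```

For a small input such as a couple of header lines followed by a few binary bytes, the reported position is normally past the first line terminator, which is why reading from BaseStream afterwards skips data.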

My solution was to write a wrapper around a stream that would read bytes one at a time. The wrapper has 2 important methods, ReadLine() and Read().

These 2 methods allow me to read the ASCII lines of a stream, unbuffered, and then read a single byte at a time for the rest of the stream. You may need to make some adjustments to suit your needs.

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

class UnbufferedStreamReader: TextReader
{
    Stream s;

    public UnbufferedStreamReader(string path)
    {
        s = new FileStream(path, FileMode.Open);
    }

    public UnbufferedStreamReader(Stream stream)
    {
        s = stream;
    }

    // Reads bytes up to the next line feed and decodes them as ASCII.
    // A trailing \r is dropped, so both \n and \r\n line endings work.
    // Returns null at the end of the stream, as TextReader.ReadLine() does.
    public override string ReadLine()
    {
        List<byte> bytes = new List<byte>();
        int current;
        while ((current = Read()) != -1 && current != (int)'\n')
        {
            bytes.Add((byte)current);
        }
        if (current == -1 && bytes.Count == 0)
        {
            return null;
        }
        if (bytes.Count > 0 && bytes[bytes.Count - 1] == (byte)'\r')
        {
            bytes.RemoveAt(bytes.Count - 1);
        }
        return Encoding.ASCII.GetString(bytes.ToArray());
    }

    // Read works differently than the `Read()` method of a 
    // TextReader. It reads the next BYTE rather than the next character
    public override int Read()
    {
        return s.ReadByte();
    }

    public override void Close()
    {
        s.Close();
    }
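
    // Only touch the managed stream when called from Dispose(),
    // not from a finalizer, and let the base class clean up too.
    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            s.Dispose();
        }
        base.Dispose(disposing);
    }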

    public override int Peek()
    {
        throw new NotImplementedException();
    }

    public override int Read(char[] buffer, int index, int count)
    {
        throw new NotImplementedException();
    }

    public override int ReadBlock(char[] buffer, int index, int count)
    {
        throw new NotImplementedException();
    }       

    public override string ReadToEnd()
    {
        throw new NotImplementedException();
    }
}
Dan Herbert