ansaurus

Question

Answer 1

+2 A:

Read the response until you reach a double CRLF (`\r\n\r\n`). What you have at that point are the response headers. Parse them to find the Content-Length header, which gives the number of bytes remaining in the response body.

Here is a regular expression that can catch the Content-Length header.

David's Updated Regex

Content-Length: (?<1>\d+)\r\n

Content-Length

Note

If the server does not properly set this header I would not use it.
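To make the steps above concrete, here is a minimal sketch, assuming the response carries a Content-Length header and ASCII-compatible header text. The class and method names (`HttpResponseReader`, `ReadHeaders`, `ParseContentLength`, `ReadExactly`) are made up for illustration and are not part of any framework API. It reads the headers one byte at a time, which is slow but keeps the example short.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not a framework class.
public static class HttpResponseReader
{
    // Reads byte by byte until the blank line (CRLF CRLF) that ends the headers.
    public static string ReadHeaders(Stream stream)
    {
        var headers = new StringBuilder();
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            headers.Append((char)b);
            int n = headers.Length;
            if (n >= 4 && headers[n - 4] == '\r' && headers[n - 3] == '\n'
                       && headers[n - 2] == '\r' && headers[n - 1] == '\n')
            {
                return headers.ToString();
            }
        }
        throw new EndOfStreamException("Connection closed before the end of the headers.");
    }

    // Returns the Content-Length value, or -1 when the header is absent.
    // Header names are case-insensitive, hence RegexOptions.IgnoreCase.
    public static int ParseContentLength(string headers)
    {
        Match m = Regex.Match(headers, @"^Content-Length:[ \t]*(\d+)[ \t]*\r?$",
                              RegexOptions.IgnoreCase | RegexOptions.Multiline);
        return m.Success ? int.Parse(m.Groups[1].Value) : -1;
    }

    // Reads exactly count bytes; a single Read call may return fewer than requested.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        var body = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(body, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed before the full body arrived.");
            offset += read;
        }
        return body;
    }
}
```

With a `NetworkStream` called `ns`, usage would look roughly like `string headers = HttpResponseReader.ReadHeaders(ns);` followed by `HttpResponseReader.ReadExactly(ns, HttpResponseReader.ParseContentLength(headers))`, after checking that the length is not -1.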

ChaosPandion 2010-02-03 17:35:30

+1 for clarity and the regex. Thanks.

David Lively 2010-02-03 18:30:15

+1 for picking up on the `content-length` issue at the same time and putting in an example. Seems so often there's a deeper issue behind the question.

Aaronaught 2010-02-03 18:34:30

See also: http://en.wikipedia.org/wiki/Chunked_transfer_encoding Check for this if there is no Content-Length header.

Foole 2010-02-04 02:30:43

Answer 2

A:

I may be wrong, but it looks like your call to Write is writing (under the hood) to the stream ns (via StreamWriter). Later, you're reading from the same stream (ns). I don't quite understand why you are doing this.

Anyway, you may need to use Seek on the stream, to move to the location where you want to start reading. I'd guess that it seeks to the end after writing. But as I said, I'm not really sure if this is a useful answer!

Tomas Petricek 2010-02-03 17:36:55

That's how `NetworkStream` works. Attempting to seek on one will always throw a `NotSupportedException`.

Aaronaught 2010-02-03 17:39:48

Tomas, NetworkStream is bound to a buffered IP channel. Writing sends data to the server; reading attempts to read from a receive buffer. `Seek()` doesn't make sense in that context.

David Lively 2010-02-03 17:55:03

Thanks for the clarification! Glad you got a better answer!

Tomas Petricek 2010-02-03 18:56:57

Answer 3

+5 A:

Contrary to what the documentation for NetworkStream.Read implies, the stream obtained from a TcpClient does not simply return 0 for the number of bytes read when there is no data available - it blocks.

If you look at the documentation for TcpClient, you will see this line:

The TcpClient class provides simple methods for connecting, sending, and receiving stream data over a network in synchronous blocking mode.

Now my guess is that if your Read call is blocking, it's because the server has decided not to send any data back. This is probably because the initial request is not getting through properly.

My first suggestion would be to eliminate the StreamWriter as a possible cause (i.e. buffering/encoding nuances), and write directly to the stream using the NetworkStream.Write method. If that works, make sure that you're using the correct parameters for the StreamWriter.

My second suggestion would be not to depend on the result of a Read call to break the loop. The NetworkStream class has a DataAvailable property that is designed for this. The correct way to write a receive loop is:

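NetworkStream netStream = client.GetStream();
int read = 0;
byte[] buffer = new byte[1024];
StringBuilder response = new StringBuilder();
do
{
    // Blocks until at least one byte arrives; returns 0 if the remote side closed the connection.
    read = netStream.Read(buffer, 0, buffer.Length);
    response.Append(Encoding.ASCII.GetString(buffer, 0, read));
}
// DataAvailable only reports whether bytes are already waiting in the receive buffer.
while (netStream.DataAvailable);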

Aaronaught 2010-02-03 17:37:56

Again, the request works fine when going through the Fiddler proxy. I can see the entire response coming through and being appended to my StringBuilder (response). It just appears that the connection isn't being closed when the server is done sending the response, or my code isn't detecting it. Argh.

David Lively 2010-02-03 17:53:08

@David: See my update, I added an example of how to write the loop using `DataAvailable` instead of simply blocking on every read. If this fails as well, it means that you are not getting any response from the server when you don't go through Fiddler.

Aaronaught 2010-02-03 18:03:20

@Aaronaught Even when going direct (not through Fiddler), I *AM* receiving the *ENTIRE* response that I expect (I can see this in the debugger). I just don't get any sort of indication that the transaction is complete once that happens. Also, doesn't DataAvailable just indicate that there is data in the receive buffer? If that's the case, a false value for DataAvailable may not necessarily indicate that the transaction is complete, just that no data is yet available, or that the server is taking its time before sending the next chunk.

David Lively 2010-02-03 18:08:00

@David: Please, humour me and try it. Yes, `DataAvailable` means that there is data in the receive buffer, but `Read` is a **blocking call**. Your code is probably working by accident because Fiddler closes the socket prematurely (I've had this issue with Fiddler before) - a real server is *not* obligated to close the socket right away and in fact should not always do this - sometimes the connection needs to remain open. The way your code is written, it will *always* loop forever unless it is interrupted, and you can't control that factor.

Aaronaught 2010-02-03 18:11:31

@Aaronaught I agree that DataAvailable will work under light load. My concern is that, when the target server is getting hammered, DataAvailable will temporarily be false while awaiting a yet-to-be-transmitted chunk. If Fiddler is closing the connection prematurely, how do browsers or other applications handle this situation? High server latency could easily make DataAvailable fail.

David Lively 2010-02-03 18:14:31

@David: That is exactly the reason why the HTTP protocol has a `content-length` header. All a browser has to do is read enough data to grab that header, then it knows exactly how much more data it needs to read. Chunked works differently but that's way beyond the scope of this question. So if you're trying to use a `TcpClient` with HTTP (why not use a `WebRequest` instead?), then the only way to be sure is to check the content-length. If you have no idea how much data is coming back, you either need to rely on `DataAvailable` or wait for some predetermined timeout.

Aaronaught 2010-02-03 18:17:40

@Aaronaught - Bingo

ChaosPandion 2010-02-03 18:20:23

@Aaronaught This code needs to be protocol-agnostic. The fact that this particular layer uses HTTP is irrelevant. Also, WebRequest does some freaky stuff with cookies (concatenating multiple set-cookie headers, which effectively breaks when any cookie value has a comma). Sorry for the frustration. If I were sending, say, an image or a ZIP file, how would I know when to stop reading data?!

David Lively 2010-02-03 18:20:51

@David, There is **no such thing** as a "protocol-agnostic way" of reading the exact amount of data that is and ever will be available from a simple stream of bytes, unless the stream has a known length (which a `NetworkStream` does not). This logic is always part of the underlying protocol. HTTP uses `content-length` to get around this limitation, and the newer HTTP 1.1 can use chunked encoding, where each chunk is prefixed with its own size and a zero-length chunk marks the end of the body. It's one or the other. Welcome to the wonderful world of network programming. ;)
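For the chunked case, a rough sketch of a decoder might look like the following. It assumes the stream is positioned just after the response headers and that the response uses `Transfer-Encoding: chunked`. The names `ChunkedBodyReader`, `ReadChunkedBody`, `ReadLine` and `ReadFully` are hypothetical helpers, and any trailer headers are simply skipped.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical helper, not a framework class.
public static class ChunkedBodyReader
{
    // Each chunk is "<hex size>[;extensions]\r\n<data>\r\n"; a size of 0 ends the body.
    public static byte[] ReadChunkedBody(Stream stream)
    {
        var body = new MemoryStream();
        while (true)
        {
            string sizeLine = ReadLine(stream);
            int semi = sizeLine.IndexOf(';');            // drop optional chunk extensions
            if (semi >= 0) sizeLine = sizeLine.Substring(0, semi);
            int size = int.Parse(sizeLine.Trim(), NumberStyles.HexNumber,
                                 CultureInfo.InvariantCulture);
            if (size == 0) break;                        // last chunk
            byte[] chunk = ReadFully(stream, size);
            body.Write(chunk, 0, size);
            ReadLine(stream);                            // consume the CRLF after the chunk data
        }
        while (ReadLine(stream).Length > 0) { }          // skip trailers up to the blank line
        return body.ToArray();
    }

    // Reads up to LF and strips a trailing CR.
    private static string ReadLine(Stream stream)
    {
        var line = new StringBuilder();
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (b == '\n')
            {
                if (line.Length > 0 && line[line.Length - 1] == '\r') line.Length--;
                return line.ToString();
            }
            line.Append((char)b);
        }
        throw new EndOfStreamException("Connection closed in the middle of a line.");
    }

    // Reads exactly count bytes or throws if the connection closes early.
    private static byte[] ReadFully(Stream stream, int count)
    {
        var data = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(data, offset, count - offset);
            if (read == 0) throw new EndOfStreamException("Connection closed mid-chunk.");
            offset += read;
        }
        return data;
    }
}
```

Either way, the end of the body is signalled by the protocol itself and not by the stream going quiet.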

Aaronaught 2010-02-03 18:24:22

@David - Many *mini-protocols* I have designed send the size of the data in the first 4-8 bytes. Since you are dealing with IIS this is not an option.
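As an illustration of that kind of size-prefixed framing, here is a small sketch that assumes a 4-byte big-endian length in front of every message. The wire format and the names `LengthPrefixedFraming`, `WriteMessage` and `ReadMessage` are invented for this example and do not describe any particular protocol.

```csharp
using System;
using System.IO;

// Hypothetical framing helper: every message is a 4-byte big-endian length plus payload.
public static class LengthPrefixedFraming
{
    public static void WriteMessage(Stream stream, byte[] payload)
    {
        int n = payload.Length;
        byte[] prefix = { (byte)(n >> 24), (byte)(n >> 16), (byte)(n >> 8), (byte)n };
        stream.Write(prefix, 0, prefix.Length);
        stream.Write(payload, 0, n);
    }

    public static byte[] ReadMessage(Stream stream)
    {
        byte[] prefix = ReadFully(stream, 4);
        int n = (prefix[0] << 24) | (prefix[1] << 16) | (prefix[2] << 8) | prefix[3];
        if (n < 0) throw new InvalidDataException("Negative length prefix.");
        return ReadFully(stream, n);   // the receiver knows exactly when the message ends
    }

    // Loops because a single Read call may return fewer bytes than requested.
    private static byte[] ReadFully(Stream stream, int count)
    {
        var data = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(data, offset, count - offset);
            if (read == 0) throw new EndOfStreamException("Connection closed mid-message.");
            offset += read;
        }
        return data;
    }
}
```

A premature close shows up as an `EndOfStreamException` instead of a silently truncated message.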

ChaosPandion 2010-02-03 18:27:45

@Aaronaught That's making more sense now. I suppose when requesting any item via HTTP the content-length header will be present, and hopefully correct. Answer accepted. Thanks.

David Lively 2010-02-03 18:29:19

@ChaosPandion I've done the same with binary protocols, but mostly so I could tell if the socket was prematurely closed. I hadn't actually written myself into a situation where the expected response length wasn't immediately available until now.

David Lively 2010-02-03 18:35:53

Answer 4

A:

Two Suggestions...

Have you tried using the DataAvailable property of NetworkStream? It should return true if there is data to be read from the stream.


    while (ns.DataAvailable)
    {
     //Do stuff here
    }

Another option would be to set ReadTimeout (in milliseconds) to a low value so you don't end up blocking for a long time. It can be done like this:


    ns.ReadTimeout = 100;

thorkia 2010-02-03 18:01:26

I'm concerned that when the target IIS server is under heavy load, this could cause me to prematurely close the socket. I think that DataAvailable indicates that there is data in the receive buffer; if it is false, the server may still be rendering data to be sent. Setting a low Timeout could cause the same issue.

David Lively 2010-02-03 18:05:25

Answer 5

+1 A:

Not sure if this is helpful or not but with HTTP 1.1 the underlying connection to the server might not be closed so maybe the stream doesn't get closed either? The idea being that you can reuse the connection to send a new request. I think you have to use the content-length. Alternatively use the WebClient or WebRequest classes instead.

Timbo 2010-02-03 19:08:29

Adding the "Connection: close" header fixed this, and it appears to be working for just about everything. Good call.
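For anyone landing here later, a sketch of that approach is below. It assumes a plain HTTP GET over a `TcpClient` and relies on the server honouring `Connection: close`, so that `Read` eventually returns 0. The names `CloseDelimitedClient`, `BuildRequest`, `ReadToEnd` and `Fetch` are made up for the example. The returned bytes are the raw response (status line, headers and body), and the body may still be chunk-encoded.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;

// Hypothetical helper, not a framework class.
public static class CloseDelimitedClient
{
    // "Connection: close" asks the server to close the socket after this response.
    public static string BuildRequest(string host, string path)
    {
        return "GET " + path + " HTTP/1.1\r\n" +
               "Host: " + host + "\r\n" +
               "Connection: close\r\n" +
               "\r\n";
    }

    // Read returns 0 only once the remote side has closed the connection.
    public static byte[] ReadToEnd(Stream stream)
    {
        var result = new MemoryStream();
        var buffer = new byte[4096];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            result.Write(buffer, 0, read);
        }
        return result.ToArray();
    }

    // Sends the request and returns the raw response bytes.
    public static byte[] Fetch(string host, int port, string path)
    {
        using (var client = new TcpClient(host, port))
        using (NetworkStream ns = client.GetStream())
        {
            byte[] request = Encoding.ASCII.GetBytes(BuildRequest(host, path));
            ns.Write(request, 0, request.Length);
            return ReadToEnd(ns);
        }
    }
}
```

Writing the request bytes directly with `NetworkStream.Write` also sidesteps any `StreamWriter` buffering questions raised earlier in the thread.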

David Lively 2010-02-04 00:37:23


C# NetworkStream.Read oddity
