I am attempting to implement an HTTP tunnel in Java with the Netty framework, using techniques similar to those web browsers employ to simulate a full-duplex connection. I want it to work in the presence of real-world HTTP proxies. I am doing this without a servlet container, both to avoid unnecessary library dependencies and because the servlet API does not fit the usage patterns of a full-duplex HTTP tunnel.
I am aware of some restrictions that HTTP proxies impose that "break" some potential uses of the HTTP protocol:
- HTTP pipelining may not be honoured beyond the connection between the client and the proxy, i.e. the proxy may send a single request and wait for the response before sending the next request, even if the client has dispatched multiple pipelined requests to the proxy.
- Chunked encoding may similarly not be honoured beyond the connection between the server and the proxy: the server may send a response back in chunks, but the proxy may wait for the end chunk before dispatching the full, dechunked response to the client.
- HTTP CONNECT is often only allowed for SSL/TLS ports, typically only port 443, so this cannot be used as a sneaky way to get an unfettered TCP connection to the outside world.
However, there is one additional possibility that I am not sure about: do real-world HTTP proxies also share a persistent connection to a server between multiple clients? For instance:
- Client A sends requests A1, A2, and A3 to server X
- Client B sends requests B1 and B2 to server X
- Client C sends requests C1, C2 and C3 to server X
Would the proxy then potentially open a single connection to server X and send messages in the order:
A1, A2, B1, C1, B2, A3, C2, C3
or a similar order that preserves the ordering from each individual client, but potentially interleaved? Or even worse, could the proxy open multiple connections to the server and scatter messages from each client across those connections, for example:
Connection 1: A1, C1, C2, C3
Connection 2: B1, B2, A2, A3
If so, my approach requires more thought, as I may need to demultiplex these messages into a separate queue for each tunnel and cannot simply rely on identifying a connection as belonging to a particular client.
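To illustrate the kind of demultiplexing I have in mind, here is a minimal sketch in plain Java (no Netty types). It assumes each request carries some tunnel identifier chosen by the client, for example in a header or the URL; the class `TunnelDemultiplexer` and its methods `dispatch` and `drain` are hypothetical names for illustration only, not part of Netty or any other library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/**
 * Hypothetical helper: routes incoming messages to per-tunnel queues,
 * keyed by a tunnel ID carried in the request rather than by the
 * connection the request arrived on.
 */
class TunnelDemultiplexer {
    private final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    /** Appends a message to the queue for the given tunnel, creating the queue on first use. */
    void dispatch(String tunnelId, String message) {
        queues.computeIfAbsent(tunnelId, id -> new ConcurrentLinkedQueue<>()).add(message);
    }

    /** Removes and returns all messages currently queued for a tunnel, in arrival order. */
    List<String> drain(String tunnelId) {
        List<String> drained = new ArrayList<>();
        Queue<String> queue = queues.get(tunnelId);
        if (queue != null) {
            String message;
            while ((message = queue.poll()) != null) {
                drained.add(message);
            }
        }
        return drained;
    }
}
```

Feeding it the interleaved sequence A1, A2, B1, C1, B2, A3, C2, C3 with the matching tunnel IDs would leave A1, A2, A3 in tunnel A's queue, regardless of which connection each request arrived on. Note that this only preserves arrival order; if a proxy scattered one client's requests across several connections, I assume I would also need a per-tunnel sequence number to restore the original order.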
Does anyone know of any good resources that describe the quirks of commonly used HTTP proxies and stateful inspecting firewalls?