I have a CGI script which takes about 1 minute to run. Right now Apache only returns results to the browser once the process has finished.

How can I make it show the output like it was run on a terminal?

Here is an example that demonstrates the problem.

I want to see the numbers 1 to 5 appear as they are printed.

+1  A: 

There are several factors at play here. To rule out a couple of suspects first: neither Apache nor bash is buffering any of the output. You can verify this with the following script:

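#!/bin/bash
# Bash rather than sh: the {1..10} brace expansion below is a Bash feature.

# CGI header block, ended by a blank line.
cat <<END
Content-Type: text/plain

END

# Print one number per second so any buffering is easy to spot.
for i in {1..10}
do
    echo "$i"
    sleep 1
done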

Save this as chunkit.cgi somewhere that Apache is configured to execute CGI scripts, make it executable, and test with netcat:

$ nc localhost 80
GET /cgi-bin/chunkit.cgi HTTP/1.1
Host: localhost

HTTP/1.1 200 OK
Date: Tue, 24 Aug 2010 23:26:24 GMT
Server: Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.7l DAV/2
Transfer-Encoding: chunked
Content-Type: text/plain

2
1

2
2

2
3

2
4

2
5

2
6

2
7

2
8

2
9

3
10

0

When I do this, I see in netcat each number appearing once per second, as intended.
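
To confirm the timing without eyeballing the terminal, one option is to timestamp each line as it arrives. The sketch below assumes a POSIX shell and `date`; `stamp_lines` is a name made up for this example.

```shell
#!/bin/sh
# stamp_lines: prefix every line read from stdin with the time it arrived.
# (Hypothetical helper name; not part of curl or netcat.)
stamp_lines () {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date +%T)" "$line"
    done
}
```

Piping the response through it, for instance `curl -sN http://localhost/cgi-bin/chunkit.cgi | stamp_lines` (the `-N` flag turns off curl's own output buffering), should show timestamps about one second apart if nothing in between is buffering.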

Note that my version of Apache, at least, applies the chunked transfer encoding automatically, presumably because I didn't include a Content-Length. If you return the Transfer-Encoding: chunked header yourself, then you also need to encode the output of your script in the chunked transfer encoding. That's pretty easy, even in a shell script:

chunk () {
    # ${#1} counts characters, so this assumes single-byte (ASCII) data.
    printf '%x\r\n' "${#1}"  # Length of the chunk in hex, CRLF
    printf '%s\r\n' "$1"     # Chunk itself, CRLF
}

# $'...' is a Bash-ism, used here because it's pretty hard to get a
# newline character portably.
chunk $'1\n'
chunk $'2\n'
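
A chunked body is only complete once it ends with a zero-length chunk, like the final `0` in the netcat output above. As a rough sketch (POSIX sh, ASCII text assumed; `chunk_line`, `last_chunk`, and `chunk_lines` are names invented here), a filter that chunks each line of its input and then terminates the stream might look like this:

```shell
#!/bin/sh
# chunk_line TEXT: emit TEXT plus a trailing newline as one chunk.
# The hex length counts the text and the newline; ASCII input is assumed.
chunk_line () {
    printf '%x\r\n%s\n\r\n' "$(( ${#1} + 1 ))" "$1"
}

# last_chunk: the zero-length chunk that marks the end of the body.
last_chunk () {
    printf '0\r\n\r\n'
}

# chunk_lines: encode each line of stdin as its own chunk, then terminate.
chunk_lines () {
    while IFS= read -r line; do
        chunk_line "$line"
    done
    last_chunk
}
```

Something like `your_command | chunk_lines` would then produce the same wire format as the netcat transcript, one chunk per output line.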

However, serve this to a browser, and you'll get varying results depending on the browser. On my system, Mac OS X 10.5.8, I see different behaviors between my browsers. In Safari, Chrome, and Firefox 4 beta, I don't start seeing output until I've sent somewhere around 1000 characters (I would guess 1024 including the headers, or something like that, but I haven't narrowed it down to the exact behavior). In Firefox 3.6, it starts displaying immediately.

I would guess that this delay is due to content type sniffing, or character encoding sniffing, which are in the process of being standardized. I have tried to see if I could get around the delay by specifying proper content types and character encodings, but without luck. You may have to send some padding data (which would be pretty easy to do invisibly if you use HTML instead of plain text), to get beyond that initial buffer.
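
As a sketch of the padding workaround: the CGI preamble below sends an HTML comment of a bit over 1024 bytes before any real content. The 1024 figure is only the guess from above, not a documented threshold, and `padding` and `preamble` are hypothetical helper names.

```shell
#!/bin/sh
# padding N: print an HTML comment containing N spaces (N + 10 bytes in all).
# N defaults to 1024, which is a guess rather than a known browser limit.
padding () {
    printf "<!-- %${1:-1024}s -->\n" ''
}

# preamble: CGI headers, the blank line that ends them, then the padding.
preamble () {
    printf 'Content-Type: text/html; charset=utf-8\n\n'
    padding 1024
}
```

A script would call `preamble` once and then print its real output as usual; the comment is invisible in the rendered page.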

Once you start streaming HTML instead of plain text, the structure of your HTML matters too. Some content can be displayed progressively, while some cannot. For instance, streaming <div>s into the body, with no styling, works fine, and they display progressively as they arrive. If you open a <pre> tag and just stream content into that, WebKit-based browsers will wait until they see the closing tag before laying it out, while Firefox is happy to display it progressively. I don't know all of the corner cases; you'll have to experiment to see what works for you.
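
To illustrate the unstyled `<div>` approach, here is a minimal sketch in POSIX sh. `stream_divs` and `stream_page` are hypothetical helpers, and whether a given browser paints each `<div>` as it arrives is something to test rather than assume.

```shell
#!/bin/sh
# stream_divs COUNT DELAY: emit COUNT unstyled <div>s, pausing DELAY seconds
# after each one so the browser has a chance to render it.
stream_divs () {
    i=1
    while [ "$i" -le "$1" ]; do
        printf '<div>%s</div>\n' "$i"
        sleep "$2"
        i=$((i + 1))
    done
}

# stream_page COUNT DELAY: headers, opening tags, the <div>s, closing tags.
stream_page () {
    printf 'Content-Type: text/html\n\n'
    printf '<html><body>\n'
    stream_divs "$1" "$2"
    printf '</body></html>\n'
}
```

Ending the script with `stream_page 5 1` would reproduce the one-number-per-second test from the question; browsers that wait for the first kilobyte or so would still need padding before the first `<div>`.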

Anyhow, I hope this helps you get started. Let me know if you have any more questions!

Brian Campbell
I can only get `curl -vv http://test.dabase.com/foo.cgi` working. Firefox 3.6.8, w3m, and Chrome are not showing anything. :( Can you tell me how you run netcat in one step? I'm having issues getting it running with my Apache2 VHOST setup. I can only get that curl running from the host itself. When I'm remote, it stops working. :( I am guessing a proxy is messing it up?
hendry
@hendry To run netcat, you invoke it as `nc localhost 80`, and then you type in the HTTP request and headers manually (`GET /cgi-bin/...` up through the blank line). At that point, you should see the response coming back from the server. It's a pretty low-tech way of debugging HTTP, but it can be helpful to see exactly what's going on. If things work locally but not remotely, then I would expect that a proxy or some aspect of your Apache config is messing you up. I'm not sure how to help there, without knowing more about your setup.
Brian Campbell
I thought you could echo and pipe to nc to debug quicker.
hendry
@hendry Sure, you could instead do: `(echo 'GET /cgi-bin/chunkit.cgi HTTP/1.1'; echo 'Host: localhost'; echo) | nc localhost 80` or `printf 'GET /cgi-bin/chunkit.cgi HTTP/1.1\r\nHost: localhost\r\n\r\n' | nc localhost 80`, or just stick the request in a file and do `nc localhost 80 < request.txt`. I usually just type the headers in manually because I'm doing a one-off test, but you might try one of those if you want something quicker for repeated testing.
Brian Campbell
Can you get it working with my remote service, `printf 'GET /foo.cgi HTTP/1.1\r\nHost: test.dabase.com\r\n\r\n' | nc test.dabase.com 80`? I can't :(
hendry
@hendry It works just fine for me. I think you might be right, that there's an issue with a firewall somewhere in between.
Brian Campbell
I am giving up on Apache. I can get **chunking** working nicely in EVERY browser with [nhttpd](http://www.nazgul.ch/dev_nostromo.html) and the test page <http://hetty.webconverger.org:8080/test/>
hendry
@hendry I notice that the webconverger server is sending a few extra headers. You may want to try adding them to your CGI on Apache; in particular, the `Cache-Control: no-cache` might be helping you bypass whatever proxy you might be behind. You might want to experiment with the `Connection: close` header as well, in case that makes a difference. Both of your examples are working equally well for me, so I think that there's a strong chance that a proxy is affecting you.
Brian Campbell
I have a reduced example running now at <http://3547488.naovi.com/>
hendry
@hendry I'm not quite sure what trouble you are still having. That example works fine for me if I use curl or netcat to fetch the page and watch it as it arrives; the chunked transfer encoding is working, and the content is streamed out slowly. If I try to open it in most browsers, it doesn't display until it has all downloaded, most likely due to the reason I mentioned, that many browsers wait to sniff the first 1k or so of data before they decide to render it as HTML. You need to include some kind of padding data at the front, and then you should see it display progressively.
Brian Campbell
It does seem to work for me in FF3.6 without padding using nhttpd. I'm not convinced that it is correct to introduce "chunked transfer encoding" in this topic, since I just generally want "progressive loading" to work more reliably. I do want to try to file this as a bug in browsers, since I really don't like this padding workaround as a matter of taste.
hendry