I have read quite a bit of material on the Internet where different authors suggest using output buffering. The funny thing is that most authors argue for its use because it allows setting HTTP response headers while also generating response output. Frankly, if for no other reason, I think that responsible web applications SHOULD NOT mix outputting headers and content; web developers should instead look for the places where headers are attempted to be sent after output has been generated, since that most likely indicates errors in script flow control logic. This is my first argument against PHP's ob_ output buffering API. The little convenience it gives, "mixing" headers with output, is not a good enough reason to use it, unless one needs to hack up scripts fast, which is not my current task at all.

Are there other, more important advantages to output buffering?

Also, I think most people meeting or dealing with the output buffering API do not think about the fact that even without explicit output buffering enabled, PHP, in combination with the web server it is plugged into, STILL does some internal buffering. It is easy to check: echo some short string, sleep for say 10 seconds, and echo again. Request your script with a browser and watch the blank page pause for 10 seconds before both lines are shown at the same time. Before some say that this is a rendering artefact, not traffic: tracing the actual traffic between the client and the server shows that the server has generated the 'Content-Length' header with an appropriate value for the entire output, suggesting that the output was not sent progressively with each 'echo' call, but accumulated in some buffer and then sent on script termination. This is one of my gripes with explicit output buffering - why do we need two different output buffer implementations on top of one another? Could it be because the internal (inaccessible) PHP/web server output buffering is subject to conditions a PHP developer cannot control, and is thus not really usable?
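The experiment described above could be sketched roughly as follows. The function names are made up for illustration, and `bufferState()` is an addition that reports whether any ob_ buffer is already active (for example one started by the `output_buffering` ini setting), which is worth ruling out before blaming the web server.

```php
<?php
// Hypothetical helper: two echoes separated by a pause, with no flush() in between.
function delayedEcho(int $pauseSeconds): void
{
    echo "first line\n";
    sleep($pauseSeconds);   // whatever was echoed so far may still sit in a buffer
    echo "second line\n";
}

// Hypothetical helper: reports the ob_ nesting level and the ini setting.
// A level of 0 means no ob_ buffer is active at this point.
function bufferState(): array
{
    return [
        'level' => ob_get_level(),
        'ini'   => ini_get('output_buffering'),
    ];
}
```

Calling `delayedEcho(10);` from a page and watching the traffic reproduces the test; logging `bufferState()` first shows whether PHP itself is holding the output.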

In any case, I for one am starting to think one should avoid explicit output buffering (the series of ob_ functions) and rely on the implicit one, assisting it with the good old flush function when necessary. Maybe if there was some guarantee that the web server one uses actually sends output to the client on each echo/print call, then it would be useful to set up explicit buffering - after all, one does not want to send the response to the client in 100-byte chunks. But the alternative with two buffers seems like a somewhat useless layer of abstraction.

Opinions are very welcome...

A: 

We used to use it back in the day for pages with enormously long tables filled with data from a database. You'd flush the buffer every x rows so the user knew the page was actually working. Then someone heard about usability and pages like that got paging and search.

Tom
Well, that is implicit output buffering, right? You can do just fine without the ob_ family of functions, no matter the length of the data, and regardless of usability and paging.
amn
+4  A: 

i use output buffering for one reason ... it allows me to send a "location" header after i've begun processing the request.

Don Dickinson
What would "processing the request" mean here? Sure, you can process, as long as you don't send any data before the headers. I am still not convinced there is a good reason to send a Location header after sending data. Maybe there exists a corner case?
amn
processing the request might mean sending some sql data back to the client or perhaps setting a cookie. later in the call, an sql call fails. you may not want the user to see the first stuff ...the cookie or the other sql data. in that case, canceling the buffer and sending a location to a generic error page or whatever might be necessary. in theory it would be great to not send data, but you never know when an error will occur and there may be very good reasons to NOT send the buffer to the client.
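A minimal sketch of that pattern, assuming the failure surfaces as an exception. `renderOrDiscard()` and the `/error.php` URL are placeholders, not part of any real application.

```php
<?php
// Hypothetical helper: runs $render inside a buffer and returns its output,
// or throws the partial output away and returns null if $render fails.
function renderOrDiscard(callable $render): ?string
{
    ob_start();
    try {
        $render();
        return ob_get_clean();
    } catch (Throwable $e) {
        ob_end_clean();   // discard everything echoed so far
        return null;
    }
}

$page = renderOrDiscard(function (): void {
    echo "<p>first query result</p>";
    throw new RuntimeException('second SQL call failed'); // simulated failure
});

if ($page === null) {
    // Nothing has reached the client yet, so a redirect is still possible.
    if (!headers_sent()) {
        header('Location: /error.php'); // placeholder URL
    }
} else {
    echo $page;
}
```

Note that the buffer only holds body output; headers already queued by the failed code are not undone by discarding it.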
Don Dickinson
A: 

It's useful if you're trying to display a progress bar during a page that takes some time to process. Since PHP code isn't multi-threaded, you can't do this if the processing is hung up doing 1 function.

Langdon
Ehm, and I have tested this, you can do exactly that without output buffering API?! Echo your progress message at the beginning of a lengthy operation, call 'flush', and start your lengthy processing. The server will switch to 'chunked' transfer encoding, your progress message will already be at the client end, while your script is still executing.
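Roughly, that approach looks like the sketch below. It assumes no ob_ buffer is active (otherwise the text stays in that buffer until it is flushed as well) and that nothing between PHP and the browser, such as compression or a proxy, holds the data back. `lengthyOperation()` is a stand-in.

```php
<?php
// Hypothetical stand-in for the lengthy processing.
function lengthyOperation(): string
{
    usleep(200000); // pretend to work for 0.2 seconds
    return 'done';
}

echo "Working, please wait...\n";
flush();                       // ask PHP to push what has been written so far
$result = lengthyOperation();  // the message above can already be on its way
echo "Result: $result\n";
```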
amn
+1  A: 

Output buffering is critical on IIS, which does no internal buffering of its own. With output buffering turned off, PHP scripts appear to run a lot slower than they do on Apache. Turn it on and they run many times faster.

Jon Benedicto
Can you provide any proof of this beyond the fact that it "seems" slower?
Langdon
@Langdon, to be fair, perceived performance definitely is real performance, from a user's perspective.
eyelidlessness
I could, but I don't want to restart my IIS server right now :-) Try running phpBB3 in IIS with and without output buffering, and use Firebug to note the time it takes to receive the HTML.
Jon Benedicto
I have no extensive experience with IIS. What you imply is that the server sends chunks of output data with each 'echo' etc., without 'Content-Length'? In that case it is indeed better to have SOME output buffering than none at all, especially if sending short strings, I guess. However, I was arguing against the type of cases where both output buffering schemes work in parallel, which seems to be a waste.
amn
+2  A: 

If you want to output a report to the screen but also send it through email, output buffering lets you not have to repeat the processing to output your report twice.
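As a sketch of the idea: `printReport()` below is a made-up renderer that can only echo, and the mail call is left commented out with a placeholder address.

```php
<?php
// Hypothetical report renderer that writes straight to output.
function printReport(array $rows): void
{
    echo "<table>\n";
    foreach ($rows as $name => $total) {
        echo '<tr><td>' . htmlspecialchars($name) . '</td><td>' . (int) $total . "</td></tr>\n";
    }
    echo "</table>\n";
}

ob_start();                                   // start capturing
printReport(['apples' => 3, 'pears' => 5]);   // the work is done once
$html = ob_get_clean();                       // take the text, end the buffer

echo $html;                                   // copy 1: to the screen
// copy 2: the same string by email (placeholder recipient)
// mail('reports@example.com', 'Report', $html, 'Content-Type: text/html');
```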

Daniel Vandersluis
That's one good reason. Thanks. I missed it, although I did it in one script a while ago; not with emails, but post-processing data before it was actually sent to the client, like fixing URLs in an HTML text.
amn
It's something that we do a lot of at work because we've got a lot of report applications where the user wants the output for posterity. Saves on duplication, obviously. :)
Daniel Vandersluis
+2  A: 

OK, here is the real reason: the output is not started until everything is done. Imagine an app which opens an SQL connection and doesn't close it before starting the output. What happens is that your script gets a connection, starts outputting, waits for the client to receive everything, and then, at the end, closes the connection. Woot, a 2s connection where a 0.3s one would be enough.

Now, if you buffer, your script connects, puts everything in a buffer, disconnects automatically at the end, then starts sending your generated content to the client.

Arkh
Eh!? Something is seriously wrong with your page if it takes 2 seconds just to load the structure. Also, having an open connection to the DB has almost no impact. What matters is how often you make those connections and what you do with them. -1
Maiku Mori
Again, a good reason, thanks. On the other hand, you can choose: a shorter database connection timespan or more progressive user response feedback. I would say, and this especially applies to queries with big results, that progressive user feedback wins. Unless you hit the connection limit, of course, which IS a HARD limit.
amn
Thanks for the downvote, but yeah, some pages can be quite heavy and take a "long" time for the browser to download (lots of data with embedded JavaScript, for example; a 2MB page is not exceptional). During that time you have at least one useless open database connection (if not several, plus file handles which prevent other scripts or apps from editing those files, etc.).
Arkh
Amn, sure, progressive user feedback is good. But if your script is a web service asked for a big report, you've got to do it. Just to put things into perspective: to load this page, Firebug says my Firefox spends 515ms in the download phase (not counting the request send and waiting time).
Arkh
You have a valid point, I marked your answer as useful. I would think however, that when one wants to fetch query results and free the connection fast, a better solution is to offload the results in internal PHP variables (arrays, lists, etc), which may or may not have anything to do with output. After that you close the connection and start to send the formatted data to the client, buffered implicitly. It is your argument, just a bit modified.
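A rough sketch of that variant using PDO. The function names, the DSN, and the `name` column are assumptions made for illustration.

```php
<?php
// Fetch everything into a plain array, then release the connection
// before any output starts. DSN and query are supplied by the caller.
function fetchAllAndDisconnect(string $dsn, string $sql): array
{
    $pdo = new PDO($dsn);
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
    $pdo = null;   // dropping the last reference closes the connection
    return $rows;
}

// Render rows that are already in memory; no connection is open here,
// however long the client takes to download the output.
function renderRows(array $rows): void
{
    foreach ($rows as $row) {
        echo htmlspecialchars($row['name']), "\n";
    }
}
```

With the SQLite driver available, something like `renderRows(fetchAllAndDisconnect('sqlite::memory:', "SELECT 'alice' AS name"));` would exercise both steps.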
amn
I agree it helps mainly against "bad" coding. But you never know in what kind of hands your code will end its life. So you better prepare it for some abuse.
Arkh
You can push JS to another file instead of embedding it, so it gets cached. 2MB is a lot of data for a human to perceive (+- 1000 pages of text). Most of the time such data can be filtered and split into smaller chunks, since people are usually looking for something specific. If that cannot be done, or the size is even bigger, then I'd generate static documents on the server and serve those instead of building the page from scratch for each request. Anyhow, my point is that it's most likely a design flaw.
Maiku Mori
+1  A: 

I use output buffering in order to avoid generating HTML by string concatenation, when I need to know the result of a render operation to create some output before I use the rendering.

Don
+1  A: 

The most obvious use cases are:

  1. An output filter (eg ob_gzhandler or any number of filters you could devise on your own); I have done this with APIs that only support output (rather than return values) where I wanted to do subsequent parsing with a library like phpQuery.
  2. Maintenance (rather than rewriting) of code written with all the problems you discuss; this includes things like sending headers after output began (credit Don Dickinson) or suppression of certain output that has already been generated.
  3. Staggered output (credit here to Tom and Langdon); note that your tests may have failed because it conflicts with PHP/Apache's default internal buffer, but it is possible to do, it simply requires a certain amount to be flushed before PHP will send anything—PHP will still keep the connection open though.
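For the first item, a home-made output filter can be sketched like this. `collapseWhitespace()` is an invented example filter; the built-in `ob_gzhandler` is passed to `ob_start()` the same way.

```php
<?php
// Illustrative filter: collapses runs of spaces and tabs into one space.
// ob_start() calls it with the buffered text and sends whatever it returns.
function collapseWhitespace(string $buffer): string
{
    return preg_replace('/[ \t]+/', ' ', $buffer) ?? $buffer;
}

ob_start('collapseWhitespace');
echo "<p>Hello,     world</p>\n";   // buffered, not yet sent
ob_end_flush();                     // runs the filter, then sends the result
```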
eyelidlessness