We have an application that takes snapshots of certain web pages. It's quite tightly integrated into the code, so we're not ready to incorporate another library.

But we have no way to calculate the web page height, so we end up taking snapshots 8000px tall, which is now proving troublesome when they are inserted into PDFs.

Is there a way to find the height of the webpage in PHP?

A: 

Hmm, try this: http://www.php.net/manual/en/function.get-browser.php

JavaScript is a much better option, though.

JorgeV44
This has nothing to do with webpage height
gAMBOOKa
did you bother reading the link you provided?
acmatos
Nope. Wasn't aware that was required.
JorgeV44
A: 

By definition, no. You can reliably tell the height of a web page only after it has been rendered, because the rendering engine decides how it is going to interpret the markup provided.

PHP does not have an HTML rendering engine, so it's impossible to tell a page's height using PHP.

You need to utilize your snapshot application for this. Only the renderer built into that app can give you reliable info about how tall the web page is going to be in the end result.

Don't forget that the page height can vary even between different versions of the same browser (most prominently, Internet Explorer), depending on how margins, padding, etc. are interpreted.

If the images that the snapshot app produces have too much space at the bottom, consider using ImageMagick and its -trim option, which can remove excess space from images.

Pekka
We use xvfb + Firefox: http://www.semicomplete.com/blog/geekery/xvfb-firefox.html
gAMBOOKa
@gamBO Something may be possible with JavaScript/Greasemonkey, tapping into the Firefox instance, but I'd simply clip the image before inserting it into the PDF.
Pekka
Hmmm... is there a way to auto-crop image whitespace? We get over 200 snaps a day, so manual cropping is out of the question. I remember my old scanner could detect non-white areas, so it should be possible with a simple script.
gAMBOOKa
@gaMbooka As I said, ImageMagick can do this; it should also be fairly easy to do in PHP with GD. Just start at the bottom and run through each line. As long as every pixel in a line is white, it's excess space.
Pekka
I'll look into ImageMagick's trim. But GD won't work because we're saving as JPEGs, which are lossy, so white ain't exactly white.
gAMBOOKa
@gamBoo Mmm, good point. ImageMagick may have problems with that, too. I think you'd have to save as PNG first, clip, then make a JPG out of it for clipping to work.
Pekka
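
For what it's worth, the bottom-up scan discussed in the comments above can be sketched roughly as follows. This is only an illustration under some assumptions: it works on a plain nested list of RGB tuples instead of GD or ImageMagick (decoding the image is left out), the names `is_near_white` and `content_height` are made up for this example, and the default tolerance of 12 is an arbitrary guess at how far JPEG noise pushes "white" away from 255.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def is_near_white(pixel: Pixel, tolerance: int) -> bool:
    """True if every channel is within `tolerance` of pure white (255)."""
    return all(255 - channel <= tolerance for channel in pixel)


def content_height(rows: List[List[Pixel]], tolerance: int = 12) -> int:
    """Return how many rows to keep, scanning upward from the bottom.

    A row made up entirely of near-white pixels counts as excess space.
    The scan stops at the first row that contains a non-white pixel.
    """
    height = len(rows)
    while height > 0 and all(
        is_near_white(pixel, tolerance) for pixel in rows[height - 1]
    ):
        height -= 1
    return height


if __name__ == "__main__":
    white = (255, 255, 255)
    noisy_white = (250, 252, 247)  # lossy compression: almost white
    dark = (20, 20, 20)
    image = [
        [white, dark, white],
        [white, white, white],
        [dark, white, white],
        [noisy_white, white, noisy_white],
        [white, noisy_white, white],
    ]
    # Rows below the last one containing a dark pixel are trimmed away.
    print(content_height(image))
```

With a tolerance of 0 the same scan would keep the noisy rows, which is the JPEG problem raised above; the tolerance is what lets the scan treat "almost white" as white. The returned height would then be used to crop the image before it goes into the PDF.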
A: 

Rather than taking a giant screenshot, could you not simply print the page to PDF or another file format?

Jordan Sissel
Can you save as an image? PDFs won't do.
gAMBOOKa
You can convert PDF to any of a number of formats, using pdf2svg, for example. From SVG, you can go to PNG. It may not be a straight "firefox to png" in one simple tool you can download, but you can code up the conversion (print to pdf, pdf to svg, svg to whatever) and just invoke that.
Jordan Sissel
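
A minimal sketch of the "pdf to svg, svg to whatever" part of that chain, with some assumptions: the print-to-PDF step is not shown, `rsvg-convert` is my pick for the SVG to PNG step (the comment above does not name a tool for it), and `conversion_commands` and `convert` are hypothetical helper names. Both command-line tools have to be installed for `convert` to work.

```python
import subprocess
from pathlib import Path
from typing import List


def conversion_commands(pdf_path: str, page: int = 1) -> List[List[str]]:
    """Build the two commands that turn one PDF page into a PNG.

    Output files sit next to the input and share its base name.
    """
    pdf = Path(pdf_path)
    svg = pdf.with_suffix(".svg")
    png = pdf.with_suffix(".png")
    return [
        # pdf2svg <input.pdf> <output.svg> [<page number>]
        ["pdf2svg", str(pdf), str(svg), str(page)],
        # Assumption: rsvg-convert (from librsvg) rasterizes the SVG.
        ["rsvg-convert", str(svg), "-o", str(png)],
    ]


def convert(pdf_path: str, page: int = 1) -> None:
    """Run each step in order, raising if any tool exits with an error."""
    for command in conversion_commands(pdf_path, page):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is required
    # to be installed just to see what would be invoked.
    for command in conversion_commands("snap.pdf"):
        print(" ".join(command))
```

Keeping the command building separate from the execution makes it easy to swap in a different rasterizer for the second step.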