I'd like to write a function (ideally in PHP) that takes a URL and returns the text from that webpage that would render largest in a browser (any standard browser is fine).

Getting the webpage and tokenizing it with DOM is pretty straightforward, but what's the best way to calculate the final rendered size of each text token? How do you account for CSS that sets font sizes in px, em, %, etc.?

Has anyone done something like this before I go and reinvent the wheel?

Thanks in advance.

+1  A: 

I don't think PHP can measure rendered HTML elements, because those elements aren't rendered on the server, but on the client side, and PHP is server-side.

I've done measurements on rendered HTML elements using jQuery and the outerHeight() and outerWidth() functions.

See the source at http://www.ccsnetwork.eu > lib.js > correctHeightMainAndSidebar()

Niels Bom
This is part of a mini-spider, so client-side JS is not ideal. What I was planning on doing was grabbing the CSS referenced or included in the webpage, pairing each tag with its attributes, and then computing the height. That's a fair bit of work, though, because you basically have to recreate the cascading style rule logic (which even the major browser releases can't seem to get right most of the time). I am hoping there is some open source renderer I can commandeer instead.
TMG
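For anyone weighing the do-it-yourself route TMG describes, here is a minimal sketch of only the unit-resolution step, in Python rather than PHP. It assumes the cascade has already been worked out, so each element has at most one winning font-size declaration, and it assumes a 16px browser default. The names `resolve_font_size` and `resolve_chain` are made up for illustration, and keyword sizes such as "larger" are not handled.

```python
import re

# Assumed browser default when no font-size is declared anywhere.
DEFAULT_PX = 16.0

# Absolute CSS units expressed in px (CSS defines 1in = 96px = 72pt).
ABSOLUTE_UNITS = {"px": 1.0, "pt": 96.0 / 72.0}

_SIZE_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*(px|pt|em|%)\s*$")


def resolve_font_size(declared, parent_px=DEFAULT_PX):
    """Return the computed font size in px for one element.

    declared is the winning font-size declaration after the cascade
    (e.g. "1.5em"), or None when the element simply inherits.
    """
    if declared is None:
        return parent_px
    match = _SIZE_RE.match(declared.lower())
    if match is None:
        # Keywords such as "larger" or "x-small" are not handled here;
        # this sketch falls back to the inherited size.
        return parent_px
    value, unit = float(match.group(1)), match.group(2)
    if unit == "em":
        # em is relative to the parent's computed font size.
        return value * parent_px
    if unit == "%":
        # Percentages are also relative to the parent's font size.
        return value / 100.0 * parent_px
    return value * ABSOLUTE_UNITS[unit]


def resolve_chain(declarations, root_px=DEFAULT_PX):
    """Resolve a root-to-leaf list of declarations to the leaf's px size."""
    size = root_px
    for declared in declarations:
        size = resolve_font_size(declared, size)
    return size
```

With something like this, you would walk from the root down to each text node, call `resolve_chain` on the declarations along that path, and keep the text with the largest result. The hard part, deciding which declaration wins for each element, is still left to a real cascade implementation.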
The problem is of course that there is no standard that everyone adheres to exactly. However, because you only want to measure font sizes, I'd try WebKit (http://webkit.org/), which is used by Safari and Chrome, amongst others. WebKit is written in C++, which you can run server-side, as you need to. I seriously doubt there's a web page rendering engine written in PHP.
Niels Bom