I'm looking for a library in Ruby or Python that would take some HTML and CSS as the input and return data that contains the positions and sizes of the elements. If it helps, I don't need the info for all the elements but just the major divs of the page.

+3  A: 

Scriptor, I think what you're looking for is more likely to be found in JavaScript than in Ruby or Python. The positions and sizes are ultimately determined by the rendering engine (the browser). You might consider using something like jQuery to loop through the elements you care about, outputting each one's name (like the DIV's ID) along with its height and width. So, for what it's worth, if I were in your position I'd look at jQuery and its height() and width() methods. You never know, there may already be a jQuery plugin for this.

Tim K.
I'm actually using jQuery for the other half of my project. The reason I'd like to do this on the server side is so that it could be automated. I'd just run the script and it would keep parsing pages. On the client the user must initiate the parsing. Thanks though, I'm still considering this method.
Scriptor
dimensions plugin: http://brandonaaron.net/docs/dimensions/
FF 3 has: https://developer.mozilla.org/en/DOM/element.getBoundingClientRect
Gene T
The dimensions plugin is built into jQuery now, you no longer need it as a separate plugin.
Tim K.
A: 

Both Ruby and Python have a regex library. Why not search for things like /width=\"(\d+)px\"/ and /height:(\d+)px/? In Ruby, use $1 to get the value captured by the group; in Python, call group(1) on the match object. I'm not a regex expert and I'm doing this from memory, so refer to any of the tutorials on the net for the correct syntax and variable usage, but that's where to start. Good luck, bsperlinus
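For illustration, here is a rough Python sketch of that regex idea. The function name `explicit_sizes` and the sample markup are made up for this example, and the patterns are my own guesses at reasonable ones, not anything taken from a library. It only picks up widths and heights that are written out in pixels, so it says nothing about sizes the browser computes during layout.

```python
import re

# Hypothetical patterns: they only match sizes stated explicitly in pixels.
# The lookbehind skips names such as max-width or data-height.
ATTR_RE = re.compile(r'(?<![\w-])(width|height)\s*=\s*"(\d+)(?:px)?"', re.IGNORECASE)
CSS_RE = re.compile(r'(?<![\w-])(width|height)\s*:\s*(\d+)px', re.IGNORECASE)


def explicit_sizes(markup):
    """Return (property, pixels) pairs for each explicit width/height.

    HTML attribute matches come first, then CSS declaration matches.
    """
    found = []
    for pattern in (ATTR_RE, CSS_RE):
        for prop, value in pattern.findall(markup):
            found.append((prop.lower(), int(value)))
    return found


# Made-up sample input, just to show the call.
sample = (
    '<div id="main" style="width: 600px; height:400px">'
    '<img width="120" height="80"></div>'
)
print(explicit_sizes(sample))
```

The pairs are not tied back to the element they came from, so for anything beyond a quick scan you would still need a real HTML parser, and a rendering engine for the computed values.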

The problem with that is that locations and sizes are not always explicitly stated in the code, whether HTML or CSS. That's why I'm looking for a library that would take care of figuring all of this out for me. To implement it myself, I'd have to learn more about layout than I want to know.
Scriptor