This is related to this question: (http://stackoverflow.com/questions/1547614/how-to-get-html-element-coordinates-using-c)

The answer given there is quite good. However, opening a full instance of Internet Explorer seems like overkill if you're trying to process a good deal of information from within your own spider.

Has anyone found, encountered, or thought of a way to do this using the underlying interface structures of IE (like IHTMLDocument2 and its brethren)?

Thanks in advance.

+1  A: 

Most browsers don't agree on per-pixel rendering in every situation. CSS compliance, rounded corners, transparency support and padding bugs (I'm looking at you, IE6) are just a few of the areas where they differ.

The only way to do this reliably is to open a browser, as in the question you linked, and read the positions from it. Even then, know that your results may not match what another browser renders.

Side note: Different DPI and zoom settings for accessibility will also affect this; there are tons of variables in what people see. Mac font rendering also differs, so those users will usually see things slightly differently as well.

Nick Craver
Hi Nick, thanks - Re: the above. Do you think something like Crowbar (http://simile.mit.edu/wiki/Crowbar) would perform better than opening the page up in IE? I'm actually popping it open in IE8 now and accessing the MSHTML object tree to get at the positioning data. This works well enough for the sort of analytics I'm doing, but it's *dog* slow. I haven't given Crowbar a play yet (will tonight), but do you think the div positions will be rendered, albeit with deviation from what's rendered in IE? I only need estimates. Thanks!
Dr.HappyPants
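
For readers wondering what "accessing the MSHTML object tree to get at the positioning data" might look like, one common approach is to sum each element's offsets up its offset-parent chain. The sketch below is only an illustration in Python: the `Element` class is a hypothetical stand-in that mirrors the `offsetLeft`, `offsetTop` and `offsetParent` properties exposed by `IHTMLElement`, not a real MSHTML binding, and the sample values are made up. It also ignores scrolling and border widths, so treat the result as an estimate.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Element:
    """Hypothetical stand-in for an MSHTML element (not a real binding)."""
    offset_left: int   # mirrors IHTMLElement.offsetLeft
    offset_top: int    # mirrors IHTMLElement.offsetTop
    offset_parent: Optional["Element"] = None  # mirrors IHTMLElement.offsetParent


def absolute_position(element: Element) -> Tuple[int, int]:
    """Sum offsets up the offset-parent chain to estimate document coordinates."""
    x, y = 0, 0
    node: Optional[Element] = element
    while node is not None:
        x += node.offset_left
        y += node.offset_top
        node = node.offset_parent
    return x, y


# Made-up layout: body -> container -> div
body = Element(0, 0)
container = Element(20, 100, body)
div = Element(15, 30, container)
print(absolute_position(div))  # (35, 130)
```

The walk itself is cheap; in the setup described above, the slow part is presumably loading and rendering the page, which this sketch does not address.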