I am parsing pages using Simple DOM parser. It is neat, but I would like to get the CSS styles applied to each element: not only the inline styles, but every style that applies to that element, whether inline, in-page, or external.

Is there a class that does that? If not, how would you do it? I do not really care about overriding styles, the cascade, or browser-specific styles. Having all the directly applied styles would suffice.

+1  A: 

That's a pretty tall order. Consider this simple example:

<style>
   p .foo {
     color: yellow;
   }
   span > *[href] {
     color: red;
   }
   img + .foo {
     color: green;
   }
   span #bar {
     color: blue;
   }
   .baz #bar {
     color: black;
   }
</style>      
<p class="baz">Lorem ipsum <span>dolor sit 
  <img src="x.png"><a id="bar" class="foo" href="#top">amet</a>,</span>
  consectetur adipiscing elit.
</p>

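What color is the link? Each of the 5 rules applies directly to the link element. Even if you only consider CSS1 selectors (descendant, class, and ID; the child, attribute, and adjacent-sibling selectors arrived with CSS2), you still have 3 rules to process.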
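To make the question concrete, here is a rough sketch of the specificity calculation an interpreter would have to run on the five selectors above. It is written in Python only for brevity; nothing in it is language-specific. The `specificity` and `winning_color` helpers are hypothetical, handle only the simple selector syntax used in this example, and assume that every rule has already been found to match the link.

```python
import re

def specificity(selector):
    """Return (ids, classes + attributes, element types) for a simple selector.

    Hypothetical helper: understands #id, .class, [attr], type names and the
    combinators ' ', '>' and '+'. Pseudo-classes and the like are not handled.
    """
    attrs = re.findall(r"\[[^\]]*\]", selector)
    # Remove attribute selectors first so '#' or '.' inside them is not counted.
    rest = re.sub(r"\[[^\]]*\]", " ", selector)
    ids = re.findall(r"#[\w-]+", rest)
    classes = re.findall(r"\.[\w-]+", rest)
    # What remains after stripping ids and classes are type names, '*' and combinators.
    rest = re.sub(r"[#.][\w-]+", " ", rest)
    types = re.findall(r"[A-Za-z][\w-]*", rest)
    return (len(ids), len(classes) + len(attrs), len(types))

# The five rules from the example, in source order.
rules = [
    ("p .foo", "yellow"),
    ("span > *[href]", "red"),
    ("img + .foo", "green"),
    ("span #bar", "blue"),
    (".baz #bar", "black"),
]

def winning_color(rules):
    """Pick the value from the most specific rule; later rules win ties."""
    ranked = [(specificity(sel), order, color)
              for order, (sel, color) in enumerate(rules)]
    return max(ranked)[2]

for selector, color in rules:
    print(selector, specificity(selector), color)
print(winning_color(rules))  # black
```

Here `.baz #bar` has the highest specificity, (1, 1, 0), so the link ends up black. The sketch skips the harder part, which is deciding whether each selector matches the element in the first place.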

As Gumbo says, without a full CSS parser and interpreter, this cannot be solved. I haven't seen one written in PHP yet, although it should be theoretically possible to write one.

(There are classes for CSS parsing, yes; see the answers to this question. But those would only tell you "for this file, you have these CSS declarations". The interpreter is the hardest part, and I'm not aware of one written in PHP.)
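As an illustration of how little a parser alone gives you, here is a minimal sketch (in Python for brevity) that turns flat CSS text into a list of selectors and declarations. The `parse_rules` function is hypothetical and assumes the input has no comments, @-rules, or nested blocks. Note that it says nothing about which elements each selector matches.

```python
import re

def parse_rules(css):
    """Split flat CSS text into (selector, {property: value}) pairs.

    Assumption: no comments, @-rules or nested blocks, and no commas
    inside attribute selectors.
    """
    rules = []
    for match in re.finditer(r"([^{}]+)\{([^{}]*)\}", css):
        selector_group = " ".join(match.group(1).split())  # collapse whitespace
        declarations = {}
        for part in match.group(2).split(";"):
            if ":" in part:
                prop, value = part.split(":", 1)
                declarations[prop.strip().lower()] = value.strip()
        # A rule such as "h1, h2 { ... }" applies to each selector separately.
        for selector in selector_group.split(","):
            rules.append((selector.strip(), dict(declarations)))
    return rules

css = """
   p .foo {
     color: yellow;
   }
   span > *[href] {
     color: red;
   }
"""
for selector, declarations in parse_rules(css):
    print(selector, declarations)
```

The result is just "these selectors carry these declarations"; mapping them onto a DOM tree is the interpreter's job.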

Your best bet would be to render the page in a web rendering engine (e.g. Gecko or WebKit) and query the computed CSS properties. That, unfortunately, is far beyond the scope of a simple PHP class.

Piskvor
+3  A: 

As Martin says, in doing this you're almost writing a browser in PHP - it's a big ask! As with any big project, the key is to break it down into more manageable steps (although some of these aren't exactly straightforward).

You'll need to:

  • work out which (if any) external CSS files are linked to
  • (to echo Gumbo): find (or develop) a way of reading and interpreting the external CSS, in-page CSS and inline CSS
  • work out which styles apply to each element (including styles applied by .class, #id and element type) and to each of its parents, including which CSS rules override which others
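The first step, and the gathering part of the second, can be sketched roughly as follows. This uses Python's standard `html.parser` purely for illustration; the `StyleSourceCollector` class and the `site.css` file name are made up for the example, and the sketch does not fetch or interpret any CSS.

```python
from html.parser import HTMLParser

class StyleSourceCollector(HTMLParser):
    """Collect the three kinds of style sources found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.external = []   # href values of <link rel="stylesheet">
        self.in_page = []    # text content of each <style> block
        self.inline = []     # (tag, style attribute value) pairs
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel and attrs.get("href"):
            self.external.append(attrs["href"])
        if tag == "style":
            self._in_style = True
            self.in_page.append("")
        if attrs.get("style"):
            self.inline.append((tag, attrs["style"]))

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current block.
        if self._in_style:
            self.in_page[-1] += data

# Hypothetical page used only for this example.
html = """
<html><head>
<link rel="stylesheet" href="site.css">
<style>p .foo { color: yellow; }</style>
</head><body>
<p style="margin: 0">Lorem <a class="foo" href="#top">ipsum</a></p>
</body></html>
"""
collector = StyleSourceCollector()
collector.feed(html)
collector.close()
print(collector.external)  # ['site.css']
print(collector.in_page)   # ['p .foo { color: yellow; }']
print(collector.inline)    # [('p', 'margin: 0')]
```

Everything after this point (downloading the external files, parsing the rules, matching selectors against elements) is where the real work lies.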

I wouldn't say it's impossible, as things like mPDF do almost the same thing (and may provide a good starting point), but I don't think there's a neat quick fix.

Waggers
A: 

I'm looking for the same sort of thing. Any luck in finding anything? Thanks.

Joe
A: 

You might wanna check out the CSS part of the QUAIL accessibility library. We needed that feature too and have basically been building a pseudo-browser based on DOMDocument. Because of some of the weird things with XPath in DOMDocument, we had to hack an additional attribute onto every node on the page that acts as a pointer to a central array of computed styles, but we're about 70% of the way there in terms of passing the W3C tests.
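One way to picture the pointer-attribute idea is the sketch below. It is not QUAIL's code: it uses Python's `xml.etree.ElementTree` as a stand-in for DOMDocument, and the `data-style-id` attribute name and both helper functions are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

def attach_style_pointers(root, attr="data-style-id"):
    """Tag every element with an index into a central list of style dicts."""
    computed = []
    for element in root.iter():          # document order, root first
        element.set(attr, str(len(computed)))
        computed.append({})              # to be filled in by the cascade later
    return computed

def style_of(element, computed, attr="data-style-id"):
    """Follow an element's pointer attribute to its entry in the central list."""
    return computed[int(element.get(attr))]

root = ET.fromstring(
    '<p class="baz">Lorem <span><a id="bar" class="foo" href="#top">amet</a></span></p>'
)
computed = attach_style_pointers(root)

link = root.find(".//a")
style_of(link, computed)["color"] = "black"

print(link.get("data-style-id"))  # 2
print(computed[2])                # {'color': 'black'}
```

Because the pointer is an ordinary attribute, any node returned by a query can be mapped back to its computed styles without relying on object identity.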

Kevin Miller