Guys, I need to develop a tool that meets the following requirements:

  1. Input: an XHTML document with CSS rules in the head section.
  2. Output: an XHTML document with those CSS rules computed into the style attributes of the tags.

The best way to illustrate the behavior I want is as follows.

Example input:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
  <style type="text/css" media="screen">
    .a { color: red; }
    p { font-size: 12px; }
  </style>
</head>
<body>
    <p class="a">Lorem Ipsum</p>
    <div class="a">
         <p>Oh hai</p>
    </div>
</body>
</html>

Example output:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<body>
    <p style="color: red; font-size: 12px;">Lorem Ipsum</p>
    <div style="color: red;">
         <p style="font-size: 12px;">Oh hai</p>
    </div>
</body>
</html>

Which tools or libraries would fit best for such a task? I'm not sure whether BeautifulSoup and cssutils are capable of doing this.

Python is not a requirement. Any recommendations will be highly appreciated.

A: 

It depends on how complicated your CSS is going to be. If it only uses element selectors ("p {}", "a {}") and IDs/classes ("#test {}"), then it is probably easiest to use regular expressions. You'd need one expression to find all the style definitions so you can parse them, and then more regular expressions to find the tags that match each selector.

For example, if you found you had a style for A tags, you could use a regular expression like:

<a\b[^>]*>(.*?)</a>

to find them, and then do a replace to add the style. Of course, you'd want the regex to take the tag name as a parameter (the A tag in this case).
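A rough Python sketch of that replace step, in case it helps. The function name add_inline_style is made up for illustration, and it assumes plain tag selectors only and simply skips any tag that already has a style attribute:

```python
import re

def add_inline_style(html, tag, style):
    """Add a style attribute to every opening <tag ...> that lacks one.

    Hypothetical helper: `tag` is a bare element name such as "p" or "a".
    """
    # Match the opening tag only, skip tags that already carry style=,
    # and keep a trailing "/" (self-closing tags) where it belongs.
    pattern = re.compile(
        r"<%s\b(?![^>]*\bstyle=)([^>]*?)(/?)>" % re.escape(tag)
    )

    def repl(match):
        attrs, slash = match.group(1), match.group(2)
        return '<%s%s style="%s"%s>' % (tag, attrs, style, slash)

    return pattern.sub(repl, html)
```

Calling add_inline_style(html, "p", "font-size: 12px;") would rewrite the opening p tags and leave everything else (including tags such as pre) alone.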

If you got into child selectors or anything more than bare elements and IDs/classes, this could get messy fast.

Consider just defining the styles inline to begin with?

Parrots
+3  A: 

Try premailer

code.dunae.ca/premailer.web

More info: campaignmonitor.com

garrow
Perfect, thank you :)
ohnoes
+1  A: 

While I do not know any specific tool to do this, here is the basic approach I would take:

Load as xml document
Extract the css classes and styles from document
For each pair of css class and style
  Construct xpath query from css class
  For each matching node
    Set the style attribute for that class
Remove style node from document
Convert document to string
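The steps above could be sketched in Python with only the standard library, roughly like this. It is a minimal sketch, not a full CSS engine: the names inline_styles and selector_to_path are made up, it only understands bare tag and single-class selectors, it matches the class attribute exactly, it ignores specificity (rules are applied in source order), and it assumes the document has no XML namespace and no undefined entities.

```python
import re
import xml.etree.ElementTree as ET

# Very naive rule splitter: "selector { declarations }" (assumption: no
# comments, no @media blocks, no nested braces).
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")

def selector_to_path(selector):
    """Map a bare tag ("p") or class (".a") selector to an ElementTree path."""
    selector = selector.strip()
    if selector.startswith("."):
        return ".//*[@class='%s']" % selector[1:]
    return ".//" + selector

def inline_styles(xhtml):
    root = ET.fromstring(xhtml)  # load as XML document
    for style in root.findall(".//style"):
        for selector, body in RULE_RE.findall(style.text or ""):
            declarations = " ".join(body.split())  # normalise whitespace
            for node in root.findall(selector_to_path(selector)):
                existing = node.get("style", "")
                node.set("style", (existing + " " + declarations).strip())
    # Remove the style nodes once their rules have been applied.
    for parent in list(root.iter()):
        for style in parent.findall("style"):
            parent.remove(style)
    return ET.tostring(root, encoding="unicode")
```

Note that this leaves the class attributes and the (now empty) head element in place; dropping them would be an extra step.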

Ross Goddard
+1  A: 

There is a premailer Python package on PyPI.

Grégoire Cachet