Hi,

I am working with some HTML for my site; I am basically moving the site from PHP to Rails.

I have literally thousands of pages and some parts of the site have different CSS files from others.

I can grab the tags fine, but I added some conditions so that different stylesheets are loaded if the browser is IE6, IE7, IE8, etc.

I am trying to figure out how to parse this:

<!--[if lt IE 7]>
<link type="text/css" rel="stylesheet" media="all" href="/templates/default/css/ie6.css" />
<![endif]-->
<!--[if IE 7]>
<link type="text/css" rel="stylesheet" media="all" href="/templates/default/css/ie7.css" />
<![endif]-->        
<!--[if IE 8]>
<link type="text/css" rel="stylesheet" media="all" href="/templates/default/css/ie8.css" />
<![endif]-->

And end up with:

/templates/default/css/ie6.css
/templates/default/css/ie7.css
/templates/default/css/ie8.css

I am using Nokogiri to pull any other information I need, but I cannot get these URLs because they are wrapped in comments. Could anyone lend a hand? I was thinking about using a regex, but I am still not very good with regex (learning, however) :)

Thanks

Eef

+2  A: 

The Nokogiri documentation implies that it creates nodes in the document to represent comments (http://nokogiri.org/Nokogiri/XML/Comment.html). If you can get the content of these nodes, you can then use a simple regex to extract the URL.

Something like:

/<link[^>]*?href="([^"]+)"/
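
To show how that pattern could be applied to comment text only, here is a minimal sketch in plain Ruby. It makes a few assumptions: the conditional comments are located with a second regex (a stand-in for reading the content of Nokogiri's comment nodes), the names CONDITIONAL_COMMENT, LINK_HREF, SAMPLE_HTML and conditional_stylesheets are made up for illustration, and the main.css line is a hypothetical stylesheet outside any comment, added to show that it is skipped.

```ruby
# Matches an IE conditional comment and captures everything between
# the "<!--[if ...]>" opener and the "<![endif]-->" closer.
CONDITIONAL_COMMENT = /<!--\[if [^\]]*\]>(.*?)<!\[endif\]-->/m

# The pattern from the answer: captures the href value of a <link> tag.
LINK_HREF = /<link[^>]*?href="([^"]+)"/

# Returns the hrefs of <link> tags found inside conditional comments only.
def conditional_stylesheets(html)
  html.scan(CONDITIONAL_COMMENT).flat_map do |(body)|
    body.scan(LINK_HREF).flatten
  end
end

SAMPLE_HTML = <<~HTML
  <link type="text/css" rel="stylesheet" href="/templates/default/css/main.css" />
  <!--[if lt IE 7]>
  <link type="text/css" rel="stylesheet" media="all" href="/templates/default/css/ie6.css" />
  <![endif]-->
  <!--[if IE 7]>
  <link type="text/css" rel="stylesheet" media="all" href="/templates/default/css/ie7.css" />
  <![endif]-->
  <!--[if IE 8]>
  <link type="text/css" rel="stylesheet" media="all" href="/templates/default/css/ie8.css" />
  <![endif]-->
HTML

# Prints the ie6.css, ie7.css and ie8.css paths, one per line;
# main.css is not printed because it sits outside the comments.
puts conditional_stylesheets(SAMPLE_HTML)
```

If the comment text is already in hand from the parser, only the inner `body.scan(LINK_HREF)` step should be needed.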
fd
A: 

Pretty sure

/href="([^"]+\.css)"/i

would work fine

Kieran Allen
This does work fine, but I wanted to grab only the CSS files that are inside the comments.
Eef