I want to use a regular expression to grab the URL from an HTML link tag in the HTML source code.

For example:

<link rel="apple-touch-icon" href="http://sample.com/icons/apple-touch-icon.png">

Any help?

+4  A: 

Standard response: Do not use regexes to read XML and especially not HTML. Use an HTML parser.
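As a rough illustration of the parser route, here is a minimal sketch in Python using the standard library's html.parser module. The question doesn't say which language is in use, so Python is an assumption, and the names LinkHrefCollector and extract_link_hrefs are made up for this example.

```python
from html.parser import HTMLParser


class LinkHrefCollector(HTMLParser):
    """Collects the href of every <link> tag (hypothetical helper name)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lowercases
        # tag and attribute names and strips the surrounding quotes.
        if tag == "link":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def extract_link_hrefs(html):
    """Return the href values of all <link> tags found in html."""
    parser = LinkHrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


sample = '<link rel="apple-touch-icon" href="http://sample.com/icons/apple-touch-icon.png">'
print(extract_link_hrefs(sample))
```

Because the parser deals with quoting and attribute order itself, the same function should also cope with input such as <link href='a.css' rel=stylesheet>.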

Realistic response: if the input you're working with is consistent and predictable, then this may work for you:

/<link\s+rel="([^"]*)"/

If the input varies at all from the example in your question, this will break. The following are examples of valid HTML which will not work with the regex (the multi-line one fails only when the input is matched line by line, since \s also matches newlines):

<link rel=apple-touch-icon>
<link rel='apple-touch-icon'>
<link href='...' rel='apple'>
<link
   rel="apple">
<link rel = "apple">

etc, etc... You could try extending the regex to cater for some of these, but you'd be entering a world of hurt.
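To see the brittleness concretely, here is a small sketch that runs the pattern above against those variants. Python is an assumption (the question names no language), and rel_value is a hypothetical helper. Note that in Python the multi-line variant does match when the whole string is searched at once, because \s also matches newlines; it only fails if the source is processed line by line.

```python
import re

# The pattern from the answer, written as a Python raw string.
REL_PATTERN = re.compile(r'<link\s+rel="([^"]*)"')

# The tag from the question, which the pattern was written for.
matching = '<link rel="apple-touch-icon" href="http://sample.com/icons/apple-touch-icon.png">'

# The valid-HTML variants listed above.
variants = [
    "<link rel=apple-touch-icon>",
    "<link rel='apple-touch-icon'>",
    "<link href='...' rel='apple'>",
    '<link\n   rel="apple">',
    '<link rel = "apple">',
]


def rel_value(html):
    """Return the captured rel value, or None if the pattern does not match."""
    m = REL_PATTERN.search(html)
    return m.group(1) if m else None


print(repr(matching), "->", rel_value(matching))
for html in variants:
    print(repr(html), "->", rel_value(html))
```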

edit: sorry, I just noticed that you wanted the href. The above still applies, just replace 'rel' with 'href' as necessary.
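One caveat when doing that swap, sketched in Python (an assumed language): in the question's tag the href comes after rel, so a literal substitution matches nothing. The looser pattern below is one possible adaptation, not the only one, and it is just as fragile as the original in every other respect (single quotes, unquoted values, spaces around =). The names NAIVE_HREF, LOOSER_HREF and first_group are made up for this example.

```python
import re

# Literal substitution of 'rel' with 'href' in the original pattern.
NAIVE_HREF = re.compile(r'<link\s+href="([^"]*)"')

# Assumed adaptation: allow other attributes between <link and href.
LOOSER_HREF = re.compile(r'<link\s+[^>]*?href="([^"]*)"')

sample = '<link rel="apple-touch-icon" href="http://sample.com/icons/apple-touch-icon.png">'


def first_group(pattern, html):
    """Return the first captured group, or None if there is no match."""
    m = pattern.search(html)
    return m.group(1) if m else None


print(first_group(NAIVE_HREF, sample))   # no match: rel precedes href
print(first_group(LOOSER_HREF, sample))  # captures the URL
```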

nickf
+1 For the disclaimer
Jamie Wong
A: 

See this blog post by Jeff Atwood.

harschware