I'm trying to run a regular expression on some href attributes using JavaScript.
Unfortunately, positive lookbehind doesn't work in JavaScript :-(

I want to prefix all the relative URLs in those tags with their full path.

I am using this regex-replace pattern: ([hs]r[ce]f?)[\s]?=[\s\"\']?(?!http|\/)(.*?)[\s\"\']

<Sorry, I'm unable to post my HTML sample test code here because it contains URLs, so I've had to provide this LINK to my regex & sample code>

The regex matches the coloured sections shown in the hrefs on lines 1-4, but it also matches on lines 6 & 7.

I do not wish to select any part of lines 6 & 7.

Any help would be most appreciated as this is starting to drive me up the wall.

A: 

Honestly? I'd ditch the regex-only approach and work with the DOM:

updateRelative( 'a', 'href' );
updateRelative( 'img', 'src' );

function updateRelative( tag, attributeName )
{
   var collection = document.getElementsByTagName( tag )
     , attribute;
   for ( var i = 0, l = collection.length; i < l; i++ )
   {       
     attribute = collection[i].getAttribute( attributeName );
     // Skip values that start with "/" or "http"; the group keeps ^ applied to both alternatives
     if ( attribute && !/^(?:\/|http)/.test( attribute ) )
     {
       // How you actually implement this line is up to you
       collection[i][attributeName] = 'http://example.com/' + attribute;
     }
   }
}
Peter Bailey
Peter, I appreciate your help and will investigate how I can implement this into my script. I can see that it would be more flexible - certainly easier to develop than the terse (although powerful) functions of RegEx patterns.
Philofax
A: 

This is what is happening:

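[\s\"\']? will initially match the opening double-quote. The negative look-ahead assertion then fails at that point, because the text that follows starts with http or /. Since the quote match is optional, the engine backs up a character and tries again with the quote left unmatched. This time the assertion passes, because the next character is the double-quote itself rather than http or /, and the rest of the pattern matches: (.*?) matches nothing and the final [\s\"\'] consumes the quote.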
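The backtracking described above can be reproduced with a short sketch. The two sample tags are hypothetical stand-ins, since the original test HTML was only linked and not posted; the pattern is the one from the question.

```javascript
// The pattern from the question, written as a regex literal.
var pattern = /([hs]r[ce]f?)[\s]?=[\s\"\']?(?!http|\/)(.*?)[\s\"\']/;

// Hypothetical sample tags (assumptions, not the asker's real test data).
var relative = '<a href="page.html">';
var absolute = '<a href="http://example.com/">';

// Relative URL: the optional quote is consumed and the look-ahead passes.
var m1 = pattern.exec(relative);
console.log(m1[0]); // href="page.html"
console.log(m1[2]); // page.html

// Absolute URL: the look-ahead fails after the quote, the engine backtracks,
// and the pattern still matches, with an empty second group.
var m2 = pattern.exec(absolute);
console.log(JSON.stringify(m2[0])); // "href=\""
console.log(JSON.stringify(m2[2])); // ""
```

The second result is the unwanted match: the expression stops right after the opening quote instead of rejecting the attribute.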

If you are running the script in a browser you should skip the regular expression and just use the DOM objects.

If you can't use DOM (if you are running outside a browser) you could use something like this regular expression that correctly matches all your examples.

/(href|src)\s*=\s*("(?!http|\/)[^"]*?"|'(?!http|\/)[^']*?'|(?!http|\/)[^\s"']+)/
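As a rough sketch of how an expression like this could be used for the actual replacement, the following applies it with String.prototype.replace. The base URL, the function name prefixRelative and the g flag are assumptions added for illustration, and the single-quoted branch here excludes single quotes rather than double quotes.

```javascript
// Hypothetical base URL; substitute the real site root.
var BASE = 'http://example.com/';

// Variant of the expression above, with the g flag so every attribute is processed.
var relativeAttr = /(href|src)\s*=\s*("(?!http|\/)[^"]*?"|'(?!http|\/)[^']*?'|(?!http|\/)[^\s"']+)/g;

// Hypothetical helper: returns the HTML string with relative href/src values prefixed.
function prefixRelative(html) {
  return html.replace(relativeAttr, function (match, name, value) {
    var quote = value.charAt(0);
    if (quote === '"' || quote === "'") {
      // Quoted value: keep both quotes and insert the base after the opening one.
      return name + '=' + quote + BASE + value.slice(1);
    }
    // Unquoted value: prefix it directly.
    return name + '=' + BASE + value;
  });
}

console.log(prefixRelative('<a href="page.html">'));
// <a href="http://example.com/page.html">
console.log(prefixRelative('<a href="http://example.com/x">'));
// <a href="http://example.com/x">  (unchanged)
```

Because the quotes are part of the second group, the callback has to put the base URL after the opening quote instead of in front of the whole value.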
mcrumley
Thanks for your help here. I've been banging my head on the wall with this one.
Philofax