I have a bunch of urls in static html files which need to be changed.

They look like this now:

<img src="/foldera/folderb/folderc/images/imgxyz.jpg" />

They need to look like this:

<img src="imgxyz.jpg" />

So I wrote a PHP script that opens each file and runs preg_replace() on it.

My regex (with the double escaped backslashes, yes):


$regex = '/<img src="\\/foldera\\/folderb\\/folderc\\/images\\/([^"]*)" \\/>/';

$replacement = '<img src="$0" />';

So I am only capturing whatever comes after /images/ up to the closing quote. But what I get is something like:

<img src="<img src="/foldera/folderb/folderc/images/imgxyz.jpg" />" />

It seems the capture group is overzealous, or something is not matching the /foldera/folderb part.

What is going on here?

+5  A: 

Use $1 in the replacement. $0 refers to the entire match, not just the captured part. You want the first capture group.

$replacement = '<img src="$1" />' ;
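To see the difference side by side, here is a minimal sketch that runs the question's pattern against the question's sample tag with both replacements. The variable names `$html`, `$withWholeMatch`, and `$withFirstGroup` are only illustrative.

```php
<?php
// Sample input taken from the question.
$html = '<img src="/foldera/folderb/folderc/images/imgxyz.jpg" />';

// Same pattern as in the question; group 1 captures the file name.
$regex = '/<img src="\\/foldera\\/folderb\\/folderc\\/images\\/([^"]*)" \\/>/';

// $0 is the entire match, so the whole original tag ends up inside the new one.
$withWholeMatch = preg_replace($regex, '<img src="$0" />', $html);

// $1 is the first capture group, i.e. only the file name.
$withFirstGroup = preg_replace($regex, '<img src="$1" />', $html);

echo $withWholeMatch, "\n";
echo $withFirstGroup, "\n";
```

The first line printed is the nested tag from the question; the second is the intended `<img src="imgxyz.jpg" />`.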


An even better way would be to use basename as part of your replacement:

$regex = '/(<img src=")([^"]*)(" \\/>)/e';

$replacement = "stripslashes('\$1').basename(stripslashes('\$2')).stripslashes('\$3')";
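Note that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so on current versions the same idea has to go through preg_replace_callback(). The following is a sketch of that approach, not a drop-in from the original answer: the `#` delimiter, the variable names, and the choice to keep the closing quote inside the third group (so it survives the replacement) are my own assumptions.

```php
<?php
// Sample input taken from the question.
$html = '<img src="/foldera/folderb/folderc/images/imgxyz.jpg" />';

// Group 1: opening part of the tag, group 2: the path,
// group 3: closing quote plus the end of the tag.
$regex = '#(<img src=")([^"]*)(" />)#';

$result = preg_replace_callback(
    $regex,
    function (array $m) {
        // basename() strips the directory part and keeps only the file name.
        return $m[1] . basename($m[2]) . $m[3];
    },
    $html
);

echo $result, "\n";
```

Because the callback receives the raw matched text, no stripslashes() calls are needed here.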
Andrew Moore
oh, b'doy. Thanks :)
bobobobo
Would you people calm down with the basename function? It isn't that good, really.
bobobobo
**@bobobobo:** `basename()` is fantastic! It can give me amazing back massages when I need them!
Andrew Moore
A: 
  • Change the index to 1, as index 0 refers to the whole matched string, or

  • Use the "basename" function, or

  • Use the following:

    $regex = '//'

In which case you'd have to change the index to 2.

Daniel
+1  A: 

Just as a side note, now that the question has been answered: if you have slashes '/' in the regex, using slashes as the delimiter forces you to escape the ones inside the regex, as in the example you proposed:

$regex = '/<img src="\\/foldera\\/folderb\\/folderc\\/images\\/([^"]*)" \\/>/';

It really makes things harder to understand, modify, and maintain :-(

You can use another character as the delimiter, as long as it's the same at the beginning and the end of the regex. For example, in this kind of situation, people often use '#', like this:

$regex = '#<img src="/foldera/folderb/folderc/images/([^"]*)" />#';

Easier to read, no?

(Of course, if you have '#' inside the regex, you'll have to escape it, since it's now the delimiter.)
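As a quick check that the two delimiters behave the same, here is a small sketch that runs both patterns on the question's sample tag. It also builds the pattern with preg_quote(), whose second argument names the delimiter to escape, which can help when the path comes from a variable. The names `$prefix`, `$built`, and the result variables are made up for the illustration.

```php
<?php
// Sample input taken from the question.
$html = '<img src="/foldera/folderb/folderc/images/imgxyz.jpg" />';
$replacement = '<img src="$1" />';

// Slash-delimited: every literal slash has to be escaped.
$slashDelimited = '/<img src="\\/foldera\\/folderb\\/folderc\\/images\\/([^"]*)" \\/>/';

// Hash-delimited: slashes are ordinary characters.
$hashDelimited = '#<img src="/foldera/folderb/folderc/images/([^"]*)" />#';

// Built from a variable: preg_quote() escapes regex metacharacters in the
// literal text, plus the delimiter passed as its second argument.
$prefix = '/foldera/folderb/folderc/images/';
$built = '#<img src="' . preg_quote($prefix, '#') . '([^"]*)" />#';

$fromSlash = preg_replace($slashDelimited, $replacement, $html);
$fromHash  = preg_replace($hashDelimited, $replacement, $html);
$fromBuilt = preg_replace($built, $replacement, $html);

echo $fromSlash, "\n", $fromHash, "\n", $fromBuilt, "\n";
```

All three calls print the same rewritten tag; only the readability of the pattern differs.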

Pascal MARTIN
neat! I was just thinking "this isn't perl..."
bobobobo
Perl would be more fun :-p
Pascal MARTIN