I'm trying to change strings like this:

<a href='../Example/case23.html'><img src='Blablabla.jpg'

To this:

<a href='../Example/case23.html'><img src='<?php imgname('case23'); ?>'

And I've got this monster of a regular expression:

find . -type f | xargs perl -pi -e \
  's/<a href=\'(.\.\.\/Example\/)(case\d\d)(.\.html\'><img src=\')*\'/\1\2\3<\?php imgname\(\'\2\'); \?>\'/'

But it isn't working. In fact, I think it's a problem with Bash, which could probably be pointed out rather quickly.

r: line 4: syntax error near unexpected token `('
r: line 4: `  's/<a href=\'(.\.\.\/Example\/)(case\d\d)(.\.html\'><img src=\')*\'/\1\2\3<\?php imgname\(\'\2\'); \?>\'/''

But if you want to help me with the regular expression that'd be cool, too!

+1  A: 

Bash single-quotes do not permit any escapes.

Try this at a bash prompt and you'll see what I mean:

FOO='\'foo'

This makes Bash prompt for more input, because it is still waiting for a fourth single quote: the backslash did not escape the second one. If you supply it, you'll find FOO's value is

\foo

You'll need to use double-quotes around your expression. Although in truth, your HTML should be using double-quotes in the first place.
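As a rough illustration of the trade-off (a sketch only; the variable names are made up for this example), the snippet below shows that a backslash inside single quotes is kept literally, and that inside double quotes the shell expands $1 before Perl ever sees it unless the dollar sign is escaped:

```shell
# Inside single quotes the backslash is an ordinary character.
single='\d\d'
printf '%s\n' "$single"

# Clear the positional parameters so that $1 is empty in this demo.
set --

# Inside double quotes the shell expands $1 itself.
double_unescaped="case$1.html"
printf '%s\n' "$double_unescaped"

# Escaping the dollar sign passes a literal $1 through.
double_escaped="case\$1.html"
printf '%s\n' "$double_escaped"
```

So if you do switch to double quotes, every Perl variable in the program needs a backslash in front of its dollar sign.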

Uncle Mikey
Also, as others have said, regular expressions are not the best way to parse HTML. But if your case really is limited to a pattern as relatively simple as this, you can probably get away with it.
Uncle Mikey
Downvote: This is wrong, single quotes *do* permit escapes (see my answer for proof), and therefore double quotes are *not* needed. Double quotes are very inconvenient anyway, many Perl variables such as `$_` or `$1` are going to be interpreted as shell variables. To keep one's sanity, always `-e''`, never `-e""`.
daxim
@daxim: That's wrong, they don't. See my comment to your answer.
Dennis Williamson
A: 

I wouldn't use a one-liner. Put your Perl code in a script, which makes it much easier to get the regex right without wondering about escaping quotes and such.

I'd use a script like this:

#!/usr/bin/perl -pi

use strict;
use warnings;

s{
    ( <a \b [^>]* \b href=['"] [^'"]*/case(\d+)\.html ['"] [^>]* > \s*
      <img \b [^>]* \b src=['"] ) [^'"<] [^'"]*
}{$1<?php imgname('case$2'); ?>}gix;

and then do something like:

find . -type f | xargs fiximgs
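One caveat with that pipeline: plain xargs splits its input on whitespace, so a file name containing a space is passed as two arguments. A small sketch of the difference, assuming your find and xargs support the widely available -print0 and -0 options (the temporary directory and file name are invented for the demo):

```shell
# Create a throwaway directory holding one file whose name contains a space.
workdir=$(mktemp -d)
printf 'x\n' > "$workdir/my page.html"

# Whitespace-delimited: xargs splits the name into two arguments.
split_count=$(find "$workdir" -type f | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-delimited: the name arrives as a single argument.
safe_count=$(find "$workdir" -type f -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "plain: $split_count, NUL-delimited: $safe_count"
rm -rf "$workdir"
```

If that matters for your tree, find . -type f -print0 | xargs -0 fiximgs is the equivalent call. Either way, fiximgs has to be executable and on your PATH for xargs to find it.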

– Michael

mscha
+2  A: 

Teaching you how to fish:

s/…/…/

Use a separator other than / for the s operator because / already occurs in the expression.

s{…}{…}

Cut down on backslash quoting: prefer [.] over \. because we'll shell-quote the program later. Keep backslashes only where they are necessary or important, which here means the \d digit character class.

s{<a href='[.][.]/Example/case(\d\d)[.]html'>…

Capture only the variable part. There is no need to reassemble the string later if most of it is static.

s{<a href='[.][.]/Example/case(\d\d)[.]html'><img src='[^']*'}{<a href='../Example/case$1.html'><img src='<?php imgname('case$1'); ?>'}

Use $1 instead of \1 to denote backreferences. [^']* means everything until the next '.

To serve as the argument for Perl's -e option, this program needs to be shell-quoted. The following helper program does that; an alias or shell function would work as well:

> cat `which shellquote`
#!/usr/bin/env perl
use String::ShellQuote qw(shell_quote); undef $/; print shell_quote <>

Run it, paste the program body, and terminate input with Ctrl+D. You receive:

's{<a href='\''[.][.]/Example/case(\d\d)[.]html'\''><img src='\''[^'\'']*'\''}{<a href='\''../Example/case$1.html'\''><img src='\''<?php imgname('\''case$1'\''); ?>'\''}'

Put this together with the shell pipeline:

find . -type f | xargs perl -pi -e 's{<a href='\''[.][.]/Example/case(\d\d)[.]html'\''><img src='\''[^'\'']*'\''}{<a href='\''../Example/case$1.html'\''><img src='\''<?php imgname('\''case$1'\''); ?>'\''}'
daxim
That's not single quotes permitting escapes. That's "open-quote, close-quote, escape-quote (unquoted/outside of quotes), open-quote, ..."
Dennis Williamson
+1 for taking the time to explain in so much detail how to solve this problem step-by-step. Great answer!
kander
+1  A: 

Single quotes within single quotes in Bash:

set -xv
echo ''"'"''
echo $'\''
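To spell out what is going on, here is a small sketch of three ways to get a single quote into a shell string that should work in any POSIX-style shell (the variable names are just for illustration). In each case the single-quoted string is closed, a quote is supplied by other means, and the string is reopened, or the whole thing is double-quoted instead. The $'\'' form shown above is a Bash/ksh/zsh feature and is left out here.

```shell
# 1. Close the single-quoted string, add a double-quoted single quote, reopen.
a='imgname('"'"'case23'"'"')'

# 2. Close the string, add a backslash-escaped single quote, reopen.
b='imgname('\''case23'\'')'

# 3. Put the whole string in double quotes (safe here: it contains no $ or backticks).
c="imgname('case23')"

printf '%s\n' "$a" "$b" "$c"
```

All three variables end up holding the same text.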
karl
A: 

If you install the mysql package, it comes with a command called replace.

With the replace command you can:

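# Read each line, extract the case name, and collect the rewritten lines in NewFile.
while IFS= read -r line
do
  X=$(printf '%s\n' "$line" | replace "<a href='../Example/" "" | replace ".html'><" " " | awk '{print $1}')
  echo "<a href='../Example/$X.html'><img src='<?php imgname('$X'); ?>'"
done < myfile > NewFile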

The same can be done with sed, for example sed s/'my string'/'replace string'/g. replace is just easier to work with when the strings contain special characters.
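For comparison, a sketch of how the sed version might look for the line in the question. This is only one way to write it: it uses basic regular expressions with | as the delimiter so the slashes in the path need no escaping, and double quotes around the program so the single quotes can appear as-is (the program contains no $ or backticks for the shell to expand).

```shell
# Sample input line, copied from the question.
sample="<a href='../Example/case23.html'><img src='Blablabla.jpg'"

# \1 is everything up to and including src=', \2 is the case name.
result=$(printf '%s\n' "$sample" | sed "s|\(<a href='\.\./Example/\(case[0-9][0-9]\)\.html'><img src='\)[^']*'|\1<?php imgname('\2'); ?>'|")

printf '%s\n' "$result"
```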

Adam Outler