I have some strings with this pattern in some files:

  • domain.com/page-10
  • domain.com/page-15 ....

and I want to replace them with something like

  • domain.com/apple-10.html
  • domain.com/apple-15.html

I have found that I can use the sed command to replace them all at once, but because something has to be added after the numbers, I guess I have to use a regular expression. I don't know how to do that.

+2  A: 
sed -i 's/page-\([0-9]*\)/apple-\1.html/' <filename>

The \([0-9]*\) captures a group of digits (in sed's default basic regular expressions the parentheses must be escaped to form a group); the \1 in the replacement string references that capture and inserts it into the replacement.

You may want to use something like -i.backup if you need to keep a copy of the file without the replacements, or just omit the -i and use I/O redirection to write the result to a new file instead.
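
A minimal sketch of the redirection variant, for anyone who wants to try it safely first. The file name links.txt and its contents are made up for illustration; the trailing g flag (replace every match on a line) and the [0-9][0-9]* form (at least one digit) are my additions, not part of the command above.

```shell
# Hypothetical sample file; the name and contents are assumptions for illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'domain.com/page-10' 'domain.com/page-15' > "$tmpdir/links.txt"

# No -i: sed writes to stdout, so redirect to a new file, then replace the original.
sed 's/page-\([0-9][0-9]*\)/apple-\1.html/g' "$tmpdir/links.txt" > "$tmpdir/links.new" &&
  mv "$tmpdir/links.new" "$tmpdir/links.txt"

# Show the rewritten file and clean up.
result=$(cat "$tmpdir/links.txt")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Because the original is only overwritten if sed succeeds, a typo in the expression leaves the file untouched.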

Amber
You didn't specify how to use \d+
ghostdog74
Thanks, but I ran the command you gave and got this error: invalid reference \1 on `s' command's RHS
hd
That's because you should escape the parentheses... see my 2nd answer
ghostdog74
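
For reference, a small sketch of why that error shows up, assuming sed is running in its default basic-regex mode. The sample input line is made up for illustration.

```shell
# Unescaped parentheses are literal characters in basic regex, so no group
# exists and sed rejects the \1 reference before processing any input.
if printf '%s\n' 'domain.com/page-10' | sed 's/page-([0-9]*)/apple-\1.html/' 2>/dev/null; then
  unescaped=ok
else
  unescaped=failed
fi

# Escaped parentheses create the group that \1 refers to.
escaped=$(printf '%s\n' 'domain.com/page-10' | sed 's/page-\([0-9]*\)/apple-\1.html/')

printf '%s\n%s\n' "$unescaped" "$escaped"
```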
+2  A: 
sed -i.bak -r 's/page-([0-9]+)/apple-\1.html/' file

sed  's/page-\([0-9][0-9]*\)/apple-\1.html/' file > t && mv t file

Besides sed, you can also use gawk's gensub() (a GNU awk extension, so this needs gawk rather than another awk implementation):

awk '{b=gensub(/page-([0-9]+)/,"apple-\\1.html","g",$0) ;print b  }' file
ghostdog74
You didn't see the entirety of what the OP wants.
Amber
Uses a GNU Sed extension - can be written in standard sed replacing the '+' with '[0-9]*'.
Jonathan Leffler
@jonathan, you sure? `*` is zero or more. `+` is one or more. `[0-9][0-9]*` should be more appropriate.
ghostdog74
Yes, so when the '+' is replaced as I said, then you end up with '[0-9][0-9]*' which is one or more digits - and, as you said, that is appropriate (and I agree).
Jonathan Leffler
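
A quick sketch of the difference being discussed here. The input 'domain.com/page-' (no digits after the hyphen) is a made-up edge case for illustration.

```shell
# [0-9]* also matches zero digits, so a bare 'page-' is still rewritten.
loose=$(printf '%s\n' 'domain.com/page-' | sed 's/page-\([0-9]*\)/apple-\1.html/')

# [0-9][0-9]* requires at least one digit, so the same line is left alone.
strict=$(printf '%s\n' 'domain.com/page-' | sed 's/page-\([0-9][0-9]*\)/apple-\1.html/')

printf '%s\n%s\n' "$loose" "$strict"
```

The first command yields domain.com/apple-.html, while the second leaves the line as domain.com/page-.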