I want to replace all 4 instances of the number 300 in the code below (which my website users will paste whenever they create a new blog post) with 470.

<div>
  <object width="300" height="300">
    <embed src="link-removed" width="300" height="300"></embed>
  </object>
  <p>
    <a href="another-link">link</a>
  </p>
</div>

The width and height of the code being pasted might not always be 300 by 300.

So I figure I probably need a regular expression that replaces any numeric value following the strings "width=" and "height=", while accounting for the quotation marks that surround the number. Can anyone tell me whether that's the best way and, if so, what the best regex would be?

In case it matters, the code being pasted is stored as "text" in the db rather than as a string, as it's quite lengthy (I've removed a few hundred chars from what you see pasted here)...

+4  A: 

You can find (width|height)="\d+" and replace it with $1="470". This captures either width or height into group 1, and in replacement strings you refer back to this captured string as $1.

The pattern can become more complex depending on the requirements. If you want to be liberal with whitespace, you can allow \s* around the =; to avoid matching, say, tablewidth="300", you can precede the pattern with \b (a word boundary), and so on.
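As a rough sketch of those two refinements together (the helper name resize_embed, its size parameter, and the sample string are made up for illustration, not part of the question):

```ruby
# Hypothetical helper: rewrite every width/height attribute value to `size`.
def resize_embed(html, size = 470)
  # \b stops "tablewidth" from matching; \s* tolerates spaces around "=".
  # "\\1" in a double-quoted string is the backreference \1 to group 1.
  html.gsub(/\b(width|height)\s*=\s*"\d+"/, "\\1=\"#{size}\"")
end

sample = '<embed src="x" width = "300" height="250" tablewidth="300"></embed>'
puts resize_embed(sample)
# prints: <embed src="x" width="470" height="470" tablewidth="300"></embed>
```

Note that the replacement also normalizes any spaces around the = away, which is probably harmless for HTML attributes. A hyphenated name such as data-width would still match, since \b sees a boundary after the hyphen.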

On capturing groups

The (...) construct is what is called a "capturing group".

Given this test string:

i have 35 dogs, 16 cats and 10 elephants

Then (\d+) (cats|dogs) yields 2 match results (see on rubular.com)

  • Result 1: 35 dogs
    • Group 1 captures 35
    • Group 2 captures dogs
  • Result 2: 16 cats
    • Group 1 captures 16
    • Group 2 captures cats
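If you'd rather check this locally than on rubular, something along these lines should reproduce the results above in Ruby (String#scan returns one array of captured groups per match; the variable names are just for illustration):

```ruby
text = 'i have 35 dogs, 16 cats and 10 elephants'

# With capturing groups in the pattern, scan yields [group 1, group 2] per match.
matches = text.scan(/(\d+) (cats|dogs)/)

matches.each_with_index do |(count, animal), i|
  puts "Result #{i + 1}: group 1 = #{count}, group 2 = #{animal}"
end
# prints:
# Result 1: group 1 = 35, group 2 = dogs
# Result 2: group 1 = 16, group 2 = cats
```

"10 elephants" produces no result because elephants is not one of the alternatives in group 2.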

In Ruby

In replacement strings, Ruby uses \ instead of $ as the sigil for backreferences to capturing groups.

ruby-doc.org -- String#gsub: If a string is used as the replacement, special variables from the match (such as $& and $1) cannot be substituted into it, as substitution into the string occurs before the pattern match starts. However, the sequences \1, \2, and so on may be used to interpolate successive groups in the match.

Thus, the solution you're looking for is something like this:

text = 'blah blah width="300" and height="299" more blah'
puts text.gsub(/(width|height)="\d+"/, '\1="470"')

This prints (as seen on ideone.com):

blah blah width="470" and height="470" more blah
polygenelubricants
I think that's the most helpful answer anyone has ever given me on the internet - thanks a lot polygenelubricants - I'm not so scared of regexes now
stephenmurdoch