tags:

views:

353

answers:

7

I have this Perl snippet from a script that I am translating into Python. I have no idea what the "s!" operator is doing; some sort of regex substitution. Unfortunately searching Google or Stackoverflow for operators like that doesn't yield many helpful results.

 $var =~ s!<foo>.+?</foo>!!;
 $var =~ s!;!/!g;

What is each line doing? I'd like to know in case I run into this operator again.

And, what would equivalent statements in Python be?

+15  A: 

s!foo!bar! is the same as the more common s/foo/bar/, except that foo and bar can contain unescaped slashes without causing problems. What it does is, it replaces the first occurence of the regex foo with bar. The version with g replaces all occurences.

sepp2k
Perl borrows from a lot of languages. It borrowed this from sed.
runrig
+2  A: 

s is the substitution operator. Normally this uses '/' for the delimiter:

s/foo/bar/

, but this is not required: a number of other characters can be used as delimiters instead. In this case, '!' has been used as the delimiter, presumably to avoid the need to escape the '/' characters in the actual text to be substituted.

In your specific case, the first line removes text matching '.+?'; i.e. it removes 'foo' tags with or without content.

The second line replaces all ';' characters with '/' characters, globally (all occurences).

The python equivalent code uses the re module:

f=re.sub(searchregx,replacement_str,line)
ire_and_curses
"...it removes 'foo' tags with or without content." Not quite -- it removes 'foo' tags enclosing *at least one* character. +1, however, for actually showing some pythonic code.
pilcrow
@pilcrow: Hmm, thanks for the clarification. The '?' here seems superfluous then. I'd assumed '.+?' would work like '(.+)?'. But it doesn't.
ire_and_curses
'.+?' means "one or more, but as few as possible while still getting a match". As opposed to '.+' which will match as much as possible.
sepp2k
+13  A: 

It's doing exactly the same as $var =~ s///. i.e. performing a search and replace within the $var variable.

In Perl you can define the delimiting character following the s. Why ? So, for example, if you're matching '/', you can specify another delimiting character ('!' in this case) and not have to escape or backtick the character you're matching. Otherwise you'd end up with (say)

s/;/\//g;

which is a little more confusing.

Perlre has more info on this.

Brian Agnew
A: 

And the python equivalent is to use the re module.

liori
+3  A: 

s is the substitution operator. Usually it is in the form of s/foo/bar/, but you can replace // separator characters some other characters like !. Using other separator charaters may make working with things like paths a lot easier since you don't need to escape path separators.

See manual page for further info.

You can find similar functionality for python in re-module.

Juha Syrjälä
+10  A: 

Perl lets you choose the delimiter for many of its constructs. This makes it easier to see what is going on in expressions like

$str =~ s{/foo/bar/baz/}{/quux/};

As you can see though, not all delimiters have the same effects. Bracketing characters (<>, [], {}, and ()) use different characters for the beginning and ending. And ?, when used as a delimiter to a regex, causes the regexes to match only once between calls to the reset() operator.

You may find it helpful to read perldoc perlop (in particular the sections on m/PATTERN/msixpogc, ?PATTERN?, and s/PATTERN/REPLACEMENT/msixpogce).

Chas. Owens
+2  A: 

s! is syntactic sugar for the 'proper' s/// operator. Basically, you can substitute whatever delimiter you want instead of the '/'s.

As to what each line is doing, the first line is matching occurances of the regex <foo>.+?</foo> and replacing the whole lot with nothing. The second is matching the regex ; and replacing it with /.

s/// is the substitute operator. It takes a regular expression and a substitution string.

s/regex/replace string/;

It supports most (all?) of the normal regular expression switches, which are used in the normal way (by appending them to the end of the operator).

Matthew Scharley