tags:

views:

315

answers:

5

I came across a function today that made me stop and think. I can't think of a good reason to do it:

sub replace_string {
        my $string  = shift;
        my $regex   = shift;
        my $replace = shift;

        $string =~ s/$regex/$replace/gi;

        return $string;
}

The only possible value I can see to this is that it gives you the ability to control the default options used with a substitution, but I don't consider that useful. My first reaction upon seeing this function get called is "what does this do?". Once I learn what it does, I am going to assume it does that from that point on. Which means if it changes, it will break any of my code that needs it to do that. This means the function will likely never change, or changing it will break lots of code.

Right now I want to track down the original programmer and beat some sense into him or her. Is this a valid desire, or am I missing some value this function brings to the table?

+4  A: 

My first reaction upon seeing that is a new Perl programmer didn't want to remember the syntax for a regular expression and created a function he or she could easily remember, without learning the syntax.

Narthring
Some lazy PHP programmer, perhaps. :)
Ether
Laziness, impatience, hubris.
Snake Plissken
+5  A: 
print replace_string("some/path", "/", ":");

Yes, you get some magic in not having to replace / with a different delimiter or escape / in the regex.

BillThor
That is a good point, if only that is what the one place in the code was doing; but you have averted the beating that poor programmer would have otherwise received (I didn't want to track him or her down anyway).
Chas. Owens
+5  A: 

If it's just a verbose replacement for s/// then I'd guess that it was written by someone who came to Perl from a language where using regular expressions required extra syntax and who is/was more comfortable coding that way. If that's the case I'd classify it as Perl baby-talk: silly and awkward to seasoned coders but not bad -- not bad enough to warrant a beating, anyway. ;)

If I squint really hard I can almost see cases where such a function might be useful: applying a bunch of patterns to a bunch of strings, allowing user input for the terms, supplying a CODE reference for a callback...

Michael Carman
They deserve a beating for not running quotemeta on the match string though...
Eric Strom
@Eric: that's a good example of how wrapper functions can be dangerous - they lull the user into thinking that the "edge" cases have been considered, when they actually haven't been.
Ether
A naive `quotemeta` isn't much better. It should test to see if `$regex` is a `qr//`-compiled pattern first and apply `quotemeta` only if it isn't. To Ether's point: While there may be a false sense of security there's also a single place to fix the problem.
Michael Carman
@Eric Storm `quotemeta` would actually break the code. It expects the second parameter to be a valid regex not a string. The proper way to call it would be something like `$s = replace_string $s, qr/\s+/, "+";`
Chas. Owens
@Chas Owens => even more reason to give the programmer a beating, as Ether says, the simple wrapper hides the implementation, which in this case is broken in multiple ways and doesn't even bother to validate its args at all.
Eric Strom
+1  A: 

The only reason I can see other than the ones mentioned already ( new programmer does not want to remember regex syntax ) is that it is possible they may be using some IDE that does not have any syntax highlighting for regex, but it does exist for functions they've written. Not the best of reasons, but plausible.

stocherilac
+10  A: 

The problems with that function include:

  • Opaque: replace_string doesn't tell you that you're doing a case-insensitive, global replace without escaping.
  • Non-idiomatic: $string =~ s{$this}{$that}gi is something you can learn what it means once, and its not like its some weird corner feature. replace_string everyone has to learn the details of, and its going to be different for everyone who writes it.
  • Inflexible: Want a non-global search-and-replace? Sorry. You can put in some modifiers by passing in a qr// but that's far more advanced knowledge than the s/// its hiding.
  • Insecure: A user might think that the function takes a string, not a regex. If they put in unchecked user input they are opening up a potential security hole.
  • Slower: Just to add the final insult.

The advantages are:

  • Literate: The function name explains what it does without having to examine the details of the regular expression (but it gives an incomplete explanation).
  • Defaults: The g and i defaults are always there (but that's non-obvious from the name).
  • Simpler Syntax: Don't have to worry about the delimiters (not that s{}{} is difficult).
  • Protection From Global Side Effects: Regex matches set a salad of global variables ($1, $+, etc...) but they're automatically locally scoped to the function. They won't interfere if you're making use of them for another regex.

A little overzealous with the encapsulation.

Schwern
nice summary. expand on the slower?
George
@George => the addition of a function call, its associated stack manipulation, shifting arguments off the stack... all these add constant time to the substitution. Being that substitutions are usually performed many times, the overhead can add up.
Eric Strom
@George And linear time wrt the string length due to the copying of the string into `$string` and the `return $string`.
Chas. Owens
How can one avoid the "Hidden Side Effects"?
Snake Plissken
@Snake The hidden side effects are mostly just the salad of global variables which all regular expressions set. But regex globals are locally scoped by default, my mistake. Putting the regex in a function actually spares you from them.
Schwern