I have a function, call it f, that takes a string and returns a string.

I have a file with lines that look like this:

stuff:morestuff:stuff*:otherstuff:otherstuff*\n

Colons only appear as delimiters and * only appears at the end of each word. I want to loop over the file and replace all instances of stuff* with f(stuff). The previous line would go to

stuff:morestuff:f(stuff):otherstuff:f(otherstuff)\n

I can do this in a few lines, but there must be a way to do it in one.

Edit: To be clear, by f(stuff) I mean f called on "stuff", not the string "f(stuff)".

A: 
$a =~ s/(^|:)([^*:]*)\*(?=(:|$))/${1}f($2)/g;

-EDIT-

If f() is a function, I don't see any particular reason to do it in one line: split, process, join.

def f(x):
    return x.upper()

a = 'stuff*:morestuff:stuff*:otherstuff:otherstuff*\n'

# endswith() also copes with empty fields, where x[-1] would raise IndexError
print(':'.join(f(x[:-1]) if x.endswith('*') else x for x in a.strip().split(':')))

Sounds just as simple as the task. I love python ;)

Antony Hatchkins
The question is tagged as Perl so I'm not sure an answer in Python is all that helpful.
Dave Webb
+5  A: 

I'd do it this way:

#! /usr/bin/perl

use warnings;
use strict;

sub f { uc reverse $_[0] }

while (<DATA>) {
  chomp;

  my $out = join ":" =>
            map s/(stuff)\*$/$1/ ? f($_) : $_,
            split /:/;

  print $out, "\n";
}

__DATA__
stuff:morestuff:stuff*:otherstuff:otherstuff*
otherstuff
stuff
stuff*
stuff*:otherstuff*

Output:

stuff:morestuff:FFUTS:otherstuff:FFUTSREHTO
otherstuff
stuff
FFUTS
FFUTS:FFUTSREHTO

But if you have allinoneregexitis, go with

while (<DATA>) {
  chomp;

  s/  (?:^|(?<=:))     # BOL or just after colon
      ([^:]*stuff)\*   # ending with 'stuff*'
      (?=:|$)          # at EOL or just before colon
  / f $1 /gex;

  print $_, "\n";
}

This works because of the /e switch:

A /e will cause the replacement portion to be treated as a full-fledged Perl expression and evaluated right then and there.
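
To see the difference /e makes, here is a minimal sketch that runs the same substitution with and without it. The f below is only a stand-in that upper-cases its argument (an assumption for illustration, not the asker's real function).

```perl
use strict;
use warnings;

# Hypothetical stand-in for f: upper-cases its argument.
sub f { return uc $_[0] }

my $plain = "stuff*:other";
my $evald = "stuff*:other";

# Without /e the replacement is an interpolated string,
# so the text "f(stuff)" is inserted literally.
$plain =~ s/([^:]+)\*/f($1)/g;

# With /e the replacement is evaluated as Perl code,
# so f is actually called on the captured text.
$evald =~ s/([^:]+)\*/f($1)/ge;

print "$plain\n";   # f(stuff):other
print "$evald\n";   # STUFF:other
```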

Greg Bacon
+9  A: 

If you use the /e option with s///, the right-hand side is evaluated as code. So this is as simple as:

$line =~ s/([^:]+)\*/f($1)/ge;

Breaking down the match:

  • ( starts marking part of the pattern
  • [^:] means anything but a :
  • + means one or more of these, i.e. one or more characters that are not colons
  • ) ends marking part of the pattern as $1
  • \* means literally a *

This pattern is relying on the fact that * only appears at the end of each word. If it could appear in the middle of a field you'd need to tweak the pattern a little.

Or, putting the pattern in a whole script:

sub f {
    my $word = shift;
    $word =~ tr/a-z/A-Z/;
    return $word;
}

while (<>) {
    s/([^:]+)\*/f($1)/ge;
    print;
}
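
On the caveat about a * appearing in the middle of a field: one possible tweak is to require that the * is followed by a colon or the end of the line. This is a sketch only; the input line is made up, and f is again a hypothetical upper-casing stand-in.

```perl
use strict;
use warnings;

# Hypothetical stand-in for f: upper-cases its argument.
sub f { return uc $_[0] }

my $unanchored = "a*b:stuff*:c";   # made-up line with a * inside the first field
my $anchored   = $unanchored;

# Original pattern: also fires on the * in the middle of "a*b".
$unanchored =~ s/([^:]+)\*/f($1)/ge;

# Tweaked pattern: the lookahead only accepts a * that ends a field.
$anchored =~ s/([^:]+)\*(?=:|$)/f($1)/ge;

print "$unanchored\n";   # Ab:STUFF:c
print "$anchored\n";     # a*b:STUFF:c
```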
Dave Webb
+2  A: 
 $string =~ s{ (^|:) ([^:*]+) \* }{$1 . f($2)}gxe;

Should be enough.
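
For what it's worth, here is that substitution run on the sample line from the question. The f used here is a made-up stand-in that wraps its argument in angle brackets, just so the calls are visible in the output.

```perl
use strict;
use warnings;

# Hypothetical stand-in for f: wraps its argument in angle brackets.
sub f { return "<$_[0]>" }

my $string = "stuff:morestuff:stuff*:otherstuff:otherstuff*\n";

# $1 holds the leading delimiter (or nothing at the start of the line),
# so it has to be put back in front of the result of f.
$string =~ s{ (^|:) ([^:*]+) \* }{$1 . f($2)}gxe;

print $string;   # stuff:morestuff:<stuff>:otherstuff:<otherstuff>
```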

depesz