answers:

6

Can regular expressions be used to perform arithmetic, such as finding all numbers in a file and multiplying them by a scalar value?

+1  A: 

Regular expressions themselves can't - they're all about text - so sed can't directly. It's easy enough to do something like that in a full scripting language like Python or Perl, though.

Jefromi
Sed could do that. Because sed scripts support conditional jumps, sed is Turing complete in theory. In practice, there are limits on the length and number of lines in a script. Somewhere on the internet, there is a sed script for a full-fledged calculator, including trig and exp functions. But that is not the kind of task for which I would consider using sed.
Christian Semrau
+9  A: 

You can achieve this using re.sub() with a callback:

import re

def repl(matchobj):
  i = int(matchobj.group(0))
  return str(i * 2)

print(re.sub(r'\d+', repl, '1 a20 300c'))

Output:

2 a40 600c

From the docs:

re.sub(pattern, repl, string[, count])

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
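
As a minimal sketch of the same callback idea with the scalar passed in as a parameter, a closure can carry the factor into the replacement function. The helper name `scale_numbers` is hypothetical and not part of the answer above; it only handles unsigned integers, like the original pattern.

```python
import re

def scale_numbers(text, factor):
    """Return text with every run of digits multiplied by factor."""
    def repl(match):
        # match.group(0) is the matched digit string, e.g. '20'
        return str(int(match.group(0)) * factor)
    # re.sub calls repl once per non-overlapping match of \d+
    return re.sub(r'\d+', repl, text)

print(scale_numbers('1 a20 300c', 3))  # prints: 3 a60 900c
```

Defining `repl` inside the function keeps the factor out of global state, so the same helper can be reused with different scalars.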

Ayman Hourieh
I was unaware of the 'iterator quality' of `re.sub`. Thanks for posting.
Arrieta
Thanks, your solution is very elegant. This is great feedback!
cdated
+2  A: 

I prepared a small script which uses re.finditer to find all the integers (you can change the regexp so that it can deal with floats or scientific notation) and then uses map to return a list of scaled numbers.

import re
import sys

def scale(fact):
    """Return a function that scales a number by the factor 'fact'."""
    return lambda val: fact * val

def find_and_scale(path, fact):
    """Find all the numbers (integers) in the file at 'path' and
    return a list of all such numbers scaled by the factor 'fact'."""
    num = re.compile(r'(\d+)')  # raw string so \d is not read as a string escape
    scaling = scale(fact)
    # The with block closes the file even if reading fails.
    with open(path, 'r') as f:
        text = f.read()
    numbers = [int(m.group(1)) for m in num.finditer(text)]
    # Build a real list, since map returns a lazy iterator on Python 3.
    return [scaling(n) for n in numbers]

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: %s file factor" % sys.argv[0])
        sys.exit(1)
    numbers = find_and_scale(sys.argv[1], int(sys.argv[2]))
    for number in numbers:
        print("%d " % number)

If you have a file whose numbers you want to scale by a factor fact, you call the script from the command line as python script.py file fact and it will print to STDOUT all the scaled numbers. Of course, you can do something more useful if you wanted...
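
For the float and scientific-notation case mentioned above, one possible pattern is sketched here. The regexp and the name `find_and_scale_floats` are assumptions for illustration, not part of the script; in particular, the optional leading sign means that text such as "3-2" would be read as 3 and -2.

```python
import re

# Assumed pattern: optional sign, digits with an optional fraction (or a bare
# fraction like .5), and an optional exponent such as e-3.
NUMBER = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')

def find_and_scale_floats(text, fact):
    """Return every number found in text, as a float scaled by fact."""
    return [float(m.group(0)) * fact for m in NUMBER.finditer(text)]

print(find_and_scale_floats('x=1.5 y=-2 z=3e2', 2))  # prints: [3.0, -4.0, 600.0]
```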

Arrieta
Thanks, this fits perfectly with the scenario I had in mind.
cdated
+3  A: 

In Perl you can do this with the /e modifier. This causes the replacement part of the substitution to be evaluated as code rather than used as a literal string. Assuming $line contains a line of the file:

 my $scalar= 4;
 $line =~ s/([\d]+)/$1*$scalar/ge;

Applying this to every line will do the job for you. For example, applying this to a $line containing "foo2 bar25 baz" transforms it to "foo8 bar100 baz".

Jasmeet
Thanks for reminding me that there's always a one-line solution in Perl.
cdated
A: 

Ayman Hourieh's answer can be reduced to be a little bit simpler, and imo more readable:

>>> import re
>>> repl = lambda m: str(int(m.group(0)) * 2)
>>> print(re.sub(r'\d+', repl, '1 a20 300c'))
2 a40 600c
pyrony
`lambda` provides a way to create an anonymous function. If you end up giving it a name anyway it's usually clearer to use `def`.
gnibbler
That is a matter of opinion, as using several lines for a simple function like this is not inherently "clearer".
pyrony
+1  A: 

To those of you who doubt that sed can do arithmetic I offer this counter-example. This one is even wilder.

High Performance Mark
Those are neat tricks, but they show that you have to be pretty creative with sed to get an arithmetic result.
cdated