views:

157

answers:

4

Hey. I have a problem I'm trying to solve in Python and I can't think of a clever implementation. The input is a string of letters. Some of them represent variables, others represent operators, and I want to iterate over a large amount of values for the variables (different configurations).

I think an equivalent question would be that you get an equation with many variables stored as a string and you want to see the solution when you substitute x,y,z etc. for many different values, or satisfying a boolean formula.

I can't think of any clever data structure that would implement this - just 'setting' the formula every time, substiuting variables, takes more time than actually evaluating it.

I know it's a bit abstract - but anyone has any ideas or has experience with somtehing similar?

Someone usggested I want to implement an evaluator. It's true but beside the point. For the sake of the question supposed the eval(string) function is already implemented. What is an efficient data structure that will hold the values and allow to change them every time cleanly and efficiently? Surely not a string! It has to be a modifiable list. And if it has a few instances of the variable x, it should be able to accsess them all fast and change their value before evaluation, etc, etc.

A: 

What you want is obviously an expression evaluator.

Other you use a built in facility in the language typically called "eval" (I don't know if Python offers this) or you build an expression parser/evaluator.

To get some of ideas of how to build such an expression evaluator, you can look at the following SO golf exercise: http://stackoverflow.com/questions/1384811/code-golf-mathematical-expression-evaluator-full-pemdas

To do what you want, you'd have to translate the basic ideas (they're all pretty much the same) to Pythonese, and add a facility to parse variable names and look up their values. That should be a very straightforward extension.

Ira Baxter
A: 

Parse that string into a tree (or equivalently interpretable data structure), just once, and then repeatedly use a function to "interpret the tree" for each variable-assignment-set of interest. (You could even generate Python bytecode as the "interpretable data structure" so you can just use eval as the "interpret the tree" -- that makes for slow generation, but that's needed only once, and fast interpretation).

As you say that's a bit abstract so let's give a concrete, if over-simplistic, example. Say for example that the variables are the letters x, y, z, t, and the operators are a for addition and s for subtraction -- strings of adjacent letters implicitly mean high-priority multiplication, as in common mathematical convention; no parentheses, and strict left-to-right execution (i.e. no operator precedence, beyond multiplication). Every character except these 6 must be ignored. Here, then, is a very ad-hoc parser and Python bytecode generator:

class BadString(Exception): pass

def makeBytecode(thestring):
  theoperator = dict(a='+', s='-')
  python = []
  state = 'start'
  for i, letter in enumerate(thestring):
    if letter in 'xyzt':
      if state == 'start':
        python.append(letter)
        state = 'variable'
      elif state == 'variable':
        python.append('*')
        python.append(letter)
    elif letter in 'as':
      if state == 'start':
        raise BadString(
            'Unexpected operator %r at column %d' % (letter, i))
      python.append(theoperator[letter])
      state = 'start'

  if state != 'variable':
    raise BadString(
      'Unexpected operator %r at end of string' % letter)
  python = ''.join(python)
  # sanity check
  # print 'Python for %r is %r' % (thestring, python)
  return compile(python, thestring, 'eval')

Now, you can simply call eval with the result of this as the first argument and the dict associating values to x, y, z and t as the second argument. For example (having imported the above module as par and uncommented the sanity check):

>>> c=par.makeBytecode('xyax')
Python for 'xyax' is 'x*y+x'
>>> for x in range(4):
...   for y in range(5):
...     print 'x=%s, y=%s: result=%s' % (x,y,eval(c,dict(x=x,y=y)))
... 
x=0, y=0: result=0
x=0, y=1: result=0
x=0, y=2: result=0
x=0, y=3: result=0
x=0, y=4: result=0
x=1, y=0: result=1
x=1, y=1: result=2
x=1, y=2: result=3
x=1, y=3: result=4
x=1, y=4: result=5
x=2, y=0: result=2
x=2, y=1: result=4
x=2, y=2: result=6
x=2, y=3: result=8
x=2, y=4: result=10
x=3, y=0: result=3
x=3, y=1: result=6
x=3, y=2: result=9
x=3, y=3: result=12
x=3, y=4: result=15
>>>

For more sophisticated, yet still simple!, parsing of the string & building of the rapidly interpretable data structure, see for example pyparsing.

Alex Martelli
+1  A: 

as Ira said you need an expression evaluator like Lex/Yacc to do this job, you have plenty available here :

http://wiki.python.org/moin/LanguageParsing

I have been using pyparsing and works greats for that kind of things

Chmouel Boudjnah
+1 for PyParsing.
cjrh
A: 

I see that the eval idea has already been mentioned, and this won't help if you have operators to substitute, but I've used the following in the past:

def evaluate_expression(expr, context):
    try:
        return eval(expr, {'__builtins__': None}, context)
    except (TypeError, ZeroDivisionError, SyntaxError):
        # TypeError is when combining items in the wrong way (ie, None + 4)
        # ZeroDivisionError is when the denominator happens to be zero (5/0)
        # SyntaxError is when the expression is invalid
        return None

You could then do something like:

values = {
    'a': 1.0,
    'b': 5.0,
    'c': 1.0,
}
evaluate_expression('a + b - (1/c)', **values)

which would evaluate to 1.0 + 5.0 - 1/1.0 == 5.0

Again, this won't let you substitute in operators, that is, you can't let 'd' evaluate to +, but it gives you a safe way of using the eval function in Python (you can't run "while True" for example).

See this example for more information on using eval safely.

jgeewax