views:

322

answers:

3

I've got some dynamically-generated boolean logic expressions, like:

  • (A or B) and (C or D)
  • A or (A and B)
  • A
  • empty - evaluates to True

The placeholders get replaced with booleans. Should I,

  1. Convert this information to a Python expression like True or (True or False) and eval it?
  2. Create a binary tree where a node is either a bool or Conjunction/Disjunction object and recursively evaluate it?
  3. Convert it into nested S-expressions and use a Lisp parser?
  4. Something else?

Suggestions welcome.

+5  A: 

If you set up dicts with the locals and globals you care about then you should be able to safely pass them along with the expression into eval().

Ignacio Vazquez-Abrams
There is no need to use `eval` here; you only need to evaluate a very simple language, not Python. (Also, limiting what you pass to `eval` for locals/globals doesn't make it secure if you end up wanting to pass much at all, and certainly doesn't prevent impossibly-big calculations.)
Mike Graham
+1: Just create a dictionary and use eval.
S.Lott
+5  A: 

It shouldn't be difficult at all to write a evaluator that can handle this, for example using pyparsing. You only have a few operations to handle (and, or, and grouping?), so you should be able to parse and evaluate it yourself.

You shouldn't need to explicitly form the binary tree to evaluate the expression.

Mike Graham
This pyparsing example (http://pyparsing.wikispaces.com/file/view/simpleBool.py) should be almost a drop-in solution.
Paul McGuire
I'm going to accept this answer since it's not as broadly terrifying as `eval()` and it's the most easily extendable.
a paid nerd
+2  A: 

Here's a small (possibly, 74 lines including whitespace) module I built in about an hour and a half (plus almost an hour to refactoring):

str_to_token = {'True':True,
                'False':False,
                'and':lambda left, right: left and right,
                'or':lambda left, right: left or right,
                '(':'(',
                ')':')'}

empty_res = True


def create_token_lst(s, str_to_token=str_to_token):
    """create token list:
    'True or False' -> [True, lambda..., False]"""
    s = s.replace('(', ' ( ')
    s = s.replace(')', ' ) ')

    return [str_to_token[it] for it in s.split()]


def find(lst, what, start=0):
    return [i for i,it in enumerate(lst) if it == what and i >= start]


def parens(token_lst):
    """returns:
        (bool)parens_exist, left_paren_pos, right_paren_pos
    """
    left_lst = find(token_lst, '(')

    if not left_lst:
        return False, -1, -1

    left = left_lst[-1]

    #can not occur earlier, hence there are args and op.
    right = find(token_lst, ')', left + 4)[0]

    return True, left, right


def bool_eval(token_lst):
    """token_lst has length 3 and format: [left_arg, operator, right_arg]
    operator(left_arg, right_arg) is returned"""
    return token_lst[1](token_lst[0], token_lst[2])


def formatted_bool_eval(token_lst, empty_res=empty_res):
    """eval a formatted (i.e. of the form 'ToFa(ToF)') string"""
    if not token_lst:
        return empty_res

    if len(token_lst) == 1:
        return token_lst[0]

    has_parens, l_paren, r_paren = parens(token_lst)

    if not has_parens:
        return bool_eval(token_lst)

    token_lst[l_paren:r_paren + 1] = [bool_eval(token_lst[l_paren+1:r_paren])]

    return formatted_bool_eval(token_lst, bool_eval)


def nested_bool_eval(s):
    """The actual 'eval' routine,
    if 's' is empty, 'True' is returned,
    otherwise 's' is evaluated according to parentheses nesting.
    The format assumed:
        [1] 'LEFT OPERATOR RIGHT',
        where LEFT and RIGHT are either:
                True or False or '(' [1] ')' (subexpression in parentheses)
    """
    return formatted_bool_eval(create_token_lst(s))

The simple tests give:

>>> print nested_bool_eval('')
True
>>> print nested_bool_eval('False')
False
>>> print nested_bool_eval('True or False')
True
>>> print nested_bool_eval('True and False')
False
>>> print nested_bool_eval('(True or False) and (True or False)')
True
>>> print nested_bool_eval('(True or False) and (True and False)')
False
>>> print nested_bool_eval('(True or False) or (True and False)')
True
>>> print nested_bool_eval('(True and False) or (True and False)')
False
>>> print nested_bool_eval('(True and False) or (True and (True or False))')
True

[Partially off-topic possibly]

Note, the you can easily configure the tokens (both operands and operators) you use with the poor-mans dependency-injection means provided (token_to_char=token_to_char and friends) to have multiple different evaluators at the same time (just resetting the "injected-by-default" globals will leave you with a single behavior).

For example:

def fuzzy_bool_eval(s):
    """as normal, but:
    - an argument 'Maybe' may be :)) present
    - algebra is:
    [one of 'True', 'False', 'Maybe'] [one of 'or', 'and'] 'Maybe' -> 'Maybe'
    """
    Maybe = 'Maybe' # just an object with nice __str__

    def or_op(left, right):
        return (Maybe if Maybe in [left, right] else (left or right))

    def and_op(left, right):
        args = [left, right]

        if Maybe in args:
            if True in args:
                return Maybe # Maybe and True -> Maybe
            else:
                return False # Maybe and False -> False

        return left and right

    str_to_token = {'True':True,
                    'False':False,
                    'Maybe':Maybe,
                    'and':and_op,
                    'or':or_op,
                    '(':'(',
                    ')':')'}

    token_lst = create_token_lst(s, str_to_token=str_to_token)

    return formatted_bool_eval(token_lst)

gives:

>>> print fuzzy_bool_eval('')
True
>>> print fuzzy_bool_eval('Maybe')
Maybe
>>> print fuzzy_bool_eval('True or False')
True
>>> print fuzzy_bool_eval('True or Maybe')
Maybe
>>> print fuzzy_bool_eval('False or (False and Maybe)')
False
mlvljr
`nested_bool_eval` will fail if you don't actually perform any operation, i.e., `nested_bool_eval("True")` (or False).
Mike Graham
@Mike Graham Ah, forgot that case, fixing, thanks :)
mlvljr
This is disturbingly impressive. (applause)
a paid nerd
@a paid nerd Always welcome! :))
mlvljr