I'm automatically generating C++ code from Python; in particular, I need to select some events from a list of events. I declare some selections:

selectionA = Selection(name="selectionA", formula="A>10")
selectionB = Selection(name="selectionB", formula="cobject->f()>50")
selectionC = selectionA * selectionB # * means AND

This generates the C++ code:

for(...) { // loop over events
  event = GetEvent(i);

  bool selectionA = A>10;
  bool selectionB = cobject->f()>50;
  bool selectionC = (A>10) and (cobject->f()>50);

  if (selectionA) { histo_selectionA->Fill(event); }
  if (selectionB) { histo_selectionB->Fill(event); }
  if (selectionC) { histo_selectionC->Fill(event); }
}

This is not very smart, because smarter code would be:

bool selectionC = selectionA and selectionB;

This problem seems simple, but it is not, because I have 100+ base selections (like selectionA or selectionB) and 300+ derived selections, and of course a derived selection can itself be derived from other derived selections. Obviously, derived selections are not built from base selections following a regular pattern.

I understand that this is difficult to answer, but can someone give me some hints? For example: is it really necessary to write smart code? I mean, aren't compilers able to optimize this code?

+2  A: 

It's unlikely that a compiler could optimize this code. Partly because cobject->f() might have side effects the compiler can't see.

You could help in a minor way by declaring your bools as const.

Otherwise, it looks like you're already overloading operators to compose selections. So it shouldn't be too hard to make a composed selection use the names of the selections it's composed from instead of their expressions. This does some of the optimization for you and will allow the compiler to optimize further if possible, especially if you declare your selection bools as const.

You will also have to be careful to emit the code that initializes the bool flags in the same order the selection objects are created in Python. This makes sure a bool is always declared and initialized before it's used later. You can do this by keeping a list in the Python Selection class and having the __init__ method add each new Selection to that list. Of course, if you create Selection objects that you then throw away, that might be a problem. But if you keep them all, it works.
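
A minimal sketch of this idea, assuming a Selection class shaped like the one in the question. The `registry` list, the `emit_declarations` function and the `X_and_Y` naming scheme for composed selections are hypothetical choices made for illustration, not part of the original code:

```python
class Selection:
    # Every Selection ever created, in creation order (hypothetical registry).
    registry = []

    def __init__(self, name, formula):
        self.name = name
        self.formula = formula
        Selection.registry.append(self)

    def __mul__(self, other):
        # A composed selection refers to its operands by name, not by formula.
        name = "{}_and_{}".format(self.name, other.name)
        return Selection(name, "{} && {}".format(self.name, other.name))


def emit_declarations(selections):
    """Return one const bool declaration per selection, in creation order."""
    return "".join(
        "const bool {} = {};\n".format(s.name, s.formula) for s in selections
    )


selectionA = Selection("selectionA", "A>10")
selectionB = Selection("selectionB", "cobject->f()>50")
selectionC = selectionA * selectionB
print(emit_declarations(Selection.registry))
```

Because both operands must already exist before `*` can be applied, they always sit earlier in the registry than the selection composed from them, so emitting in creation order is enough to declare each bool before it is used.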

Omnifarious
+1  A: 

Compilers might be able to optimize this code, but if you have hundreds of complicated expressions that depend on each other I would doubt that it would work that well.

But a more basic question is: Do you really need optimization? Computers are fast, and if you don't run that code very often it might very well not matter if cobject->f()>50 is run once or ten times.

On the other hand, if cobject->f() has side effects (like, for example, it prints something) the compiler will never optimize away the repeated calls and you will have to make sure that it is only called in your generated code as often as you want it to print something.

The best solution would be if your Selection class could just output name instead of formula when used as part of a derived definition. How hard or easy that is depends on your generating code.

sth
A: 

As an amendment to my comment, I don't even think you'll need a tracking variable as I suggested. Why not try this?

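import string

class Selection:
    # All selections, in creation order, so operands are declared before use.
    selections = []
    # Uppercase first, then lowercase: selectionA..selectionZ, selectiona..selectionz.
    letters = string.ascii_uppercase + string.ascii_lowercase

    def __init__(self, name, formula):
        self.name = name
        self.formula = formula
        Selection.selections.append(self)

    def __mul__(self, selection):
        # Refer to the operands by name rather than repeating their formulas.
        name = 'selection' + Selection.letters[len(Selection.selections)]
        return Selection(name, self.name + ' and ' + selection.name)

    @classmethod
    def generate(cls):
        code = []
        for selection in cls.selections: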
            code.append('bool ')
            code.append(selection.name)
            code.append(' = ')
            code.append(selection.formula)
            code.append(';\n')

        code.append('\n')

        for selection in c.selections:
            code.append('if (')
            code.append(selection.name)
            code.append(') { histo_')
            code.append(selection.name)
            code.append('->Fill(event); }\n')

        return ''.join(code)

Of course, this only works if you have at most 52 selection objects, but that limitation exists only because this class generates names of the form selection[A-Za-z].
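
One way to lift that limit, sketched here under the assumption that the generated names only need to be unique C++ identifiers, is to number the derived selections instead of lettering them. The `derived_name` helper and the `selection_derived_N` naming scheme are hypothetical:

```python
import itertools

# Hypothetical counter shared by all derived selections.
_derived_ids = itertools.count()


def derived_name():
    """Return a fresh identifier such as 'selection_derived_0'."""
    return "selection_derived_{}".format(next(_derived_ids))


# 300 derived selections, well past the 52-letter limit.
names = [derived_name() for _ in range(300)]
print(names[0], names[-1])
```

In `__mul__`, the name would then come from `derived_name()` rather than from indexing into `letters`.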

blwy10
A: 

So, if I understand your example correctly, you are creating a set of Selection objects and then using them to generate the code?

First of all, why not just write the code in C++? As it is, you're embedding C++ expressions in Python as strings and using overloaded mathematical operators to construct boolean expressions (the fact that you felt the need to comment that * means AND is an indication that this is a poor choice). That's just plain ugly!

That being said, you have all the information you need -- in the Python code, selectionA knows its name is "selectionA" and selectionB knows its name is "selectionB". The only thing is, you don't provide enough context to know what type of object selectionC is. I'm assuming it's something like AndExpression, and holds references to selectionA and selectionB (maybe as param1 and param2?). Just have it output "(" + self.param1.name + " && " + self.param2.name + ")".
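
As a rough sketch of that suggestion, assuming the guessed `AndExpression` class with `param1` and `param2` attributes really exists in some form. The `cpp` and `and_` method names are made up for illustration:

```python
class Selection:
    """A base selection: a name plus a raw C++ boolean expression."""

    def __init__(self, name, formula):
        self.name = name
        self.formula = formula

    def cpp(self):
        # Base selections emit their formula verbatim.
        return self.formula

    def and_(self, other, name):
        # A named method instead of an overloaded operator.
        return AndExpression(name, self, other)


class AndExpression(Selection):
    """A derived selection that references its operands by name."""

    def __init__(self, name, param1, param2):
        self.name = name
        self.param1 = param1
        self.param2 = param2

    def cpp(self):
        return "(" + self.param1.name + " && " + self.param2.name + ")"


selectionA = Selection("selectionA", "A>10")
selectionB = Selection("selectionB", "cobject->f()>50")
selectionC = selectionA.and_(selectionB, "selectionC")
print("bool {} = {};".format(selectionC.name, selectionC.cpp()))
```

Since `AndExpression` is itself a `Selection`, a derived selection can be combined again in the same way, and each level still emits only the names of its direct operands.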

Nathan Davis
I'm generating C++ code with Python because the code is very boring, simple, and repetitive, but very long (10000+ lines). The Python code generates the C++ code, compiles it, executes several instances of the program in parallel with different parameters, and uses their output.
wiso
In Python you can't overload the `and` and `or` operators because they're lazy, so you need to use something else. It is not so ugly.
wiso
Yes, but it is not so simple. The point is that you need to know that you must declare `selectionA` and `selectionB` before `selectionC`. My example is trivial, real life is not.
wiso
Umm, maybe there is a better way to write the C++ code? Really, if you find yourself writing 10000+ lines of code that is "boring, simple, repetitive, but very long", then you are doing something wrong.
Nathan Davis
Also, I understand that you can't override `and` and `or` in Python. But the point is, if you need to explain what an override means, then it's probably not an appropriate override -- use a method instead.
Nathan Davis