I'm generating C++ code, and it looks like it's going to get very messy: even my simple generator classes already have tons of special cases. Here is the code as it stands now: http://github.com/alex/alex-s-language/tree/local%2Fcpp-generation/alexs_lang/cpp .

+3  A: 

I wrote Cog partly to generate C++ code from an XML data schema. It lets you use Python code embedded in C++ source files to generate C++ source.

Ned Batchelder
+5  A: 

One technique I've used for code generation is to not worry at all about formatting in the code generator. Then, as a next step after generating the code, run it through indent to format it reasonably so you can read (and more importantly, debug) it.

Greg Hewgill
+1 not a bad idea.
ConcernedOfTunbridgeWells
Hmm... indent only works for C, though. And when generating Python code, you really must worry about formatting.
Eli Bendersky
eliben: That's true, you have to do something a bit more sophisticated for Python. I wonder if there are any modules that help to do that.
Greg Hewgill
Code generation is almost never needed in Python, since you can use the dynamic language and introspection/reflection techniques to achieve the same ends.
Ned Batchelder
It is quite easy to generate pretty code. Simply create a class with push()/write()/pop() methods and make sure that the write() method takes into account the current push/pop count. I use code generators every day and the generated C++ and C# code looks exactly like my own code.
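
A minimal sketch of the kind of push()/write()/pop() class described in this comment, written in Python since that is what the question's generator uses. This is an illustration, not the commenter's own code: the names `IndentingWriter`, `getvalue` and `emit_class` are hypothetical.

```python
class IndentingWriter:
    """Collects lines of generated code, indented by the current push/pop depth."""

    def __init__(self, indent="    "):
        self._indent = indent
        self._depth = 0
        self._lines = []

    def push(self):
        # Enter a nested block: subsequent lines get one more indent level.
        self._depth += 1

    def pop(self):
        # Leave a nested block; refuse to go below depth zero.
        if self._depth == 0:
            raise ValueError("pop() without a matching push()")
        self._depth -= 1

    def write(self, line=""):
        # Blank lines stay empty so the output has no trailing whitespace.
        self._lines.append((self._indent * self._depth + line) if line else "")

    def getvalue(self):
        return "\n".join(self._lines) + "\n"


def emit_class(out, name, fields):
    """Emit a simple C++ class with public data members (hypothetical helper)."""
    out.write("class %s {" % name)
    out.write("public:")
    out.push()
    for ctype, field in fields:
        out.write("%s %s;" % (ctype, field))
    out.pop()
    out.write("};")


out = IndentingWriter()
emit_class(out, "Point", [("int", "x"), ("int", "y")])
print(out.getvalue())
```

The generator code never builds indentation strings itself; it only says when a block starts and ends, which keeps the formatting concern in one place.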
MB
@MB: That's one approach but unnecessarily hard to do in XSLT (yes, I've used XSLT for C++ code generation).
Greg Hewgill
In XSLT you pass a parameter to the template you call, and append to it each time you recurse into a block.
Pete Kirkham
+2  A: 

See Tooling to Build Test Cases.

It's not clear what your problem is.

If your question is "how do I handle all the special cases in my generating classes?" then here's some advice. If your question is something else, then update your question.

  1. Use a template generator. Mako, for example, will make your life simpler.

    Write an example of your result. Replace parts with ${thing} placeholders. Since you started with something that worked, turning it into a template is easy.

  2. When generating code in another language, you need to have all of the class definitions in the other language designed for flexible assembly. You want to generate as little fresh, new code as possible. You want to tweak and customize a bit, but you don't want to generate a lot of stuff from scratch.

  3. Special cases are best handled with ordinary polymorphism. Separate subclasses of a common superclass can implement the various exceptions and special cases. Really complex situations are handled well by the Strategy design pattern.

    In essence, you have Python classes that represent the real-world objects. Those classes have attributes that can be fit into a C++ template to generate the C++ version of those objects.
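
The points above can be sketched roughly as follows. To keep the example self-contained it uses the standard library's `string.Template` as a stand-in for Mako (both use `${thing}` placeholders; Mako itself offers much more). The classes `Field` and `ArrayField` and the function `render_struct` are hypothetical names chosen for illustration.

```python
from string import Template

# Start from hand-written C++ that works, then replace parts with ${...} placeholders.
STRUCT_TEMPLATE = Template(
    "struct ${name} {\n"
    "${members}"
    "};\n"
)


class Field:
    """A plain data member of the generated struct."""

    def __init__(self, name, ctype):
        self.name = name
        self.ctype = ctype

    def declaration(self):
        return "    %s %s;\n" % (self.ctype, self.name)


class ArrayField(Field):
    """Special case handled by a subclass: a fixed-size array member."""

    def __init__(self, name, ctype, size):
        super().__init__(name, ctype)
        self.size = size

    def declaration(self):
        # In C++ the array size follows the member name, not the type.
        return "    %s %s[%d];\n" % (self.ctype, self.name, self.size)


def render_struct(name, fields):
    # The caller never checks field kinds; each object knows its own C++ form.
    members = "".join(f.declaration() for f in fields)
    return STRUCT_TEMPLATE.substitute(name=name, members=members)


print(render_struct("Packet", [Field("id", "int"), ArrayField("payload", "char", 16)]))
```

Adding another special case then means adding another subclass, rather than another `if` branch in the generator.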

S.Lott
A: 

I have a code generation system, and one of the best choices I have made with it is to put much of the resultant program in non-generated code, e.g. a library/runtime. Using templates works well also. Complex template systems may be hard to work with by hand, but you're not working with them by hand, so leverage that.

BCS
A: 

It would actually just be recursing straight down, except for two things: I need to pull all function declarations out and put them elsewhere, and for every function call I need to build a vector of all of the arguments and then pass that to the function, since C++ doesn't have a literal syntax for vectors.

Alex Gaynor
It does have a syntax for variadic arguments, if you need that. But do you want to generate code which pushes three arguments to a vector and passes that to a four-parameter function?
Pete Kirkham
+1  A: 

I agree with S.Lott, that you should write out an example of what you want to generate.

Solving a problem with code generation should be less complicated than without.

This is because your total program has to deal with a lot of input information, and if a subset of that information changes very seldom, like once a week, the code generator only has to condition on that subset. The generated code conditions on the remaining input that changes more frequently. It's a divide-and-conquer strategy. Another name for it is "partial evaluation".

Generated code should also run a lot faster because it's less general.

In your specific case, there's no harm in doing the code generation in 2 (or more) passes. For example, on pass 1 you generate the declarations, and on pass 2 you generate the processing code. Alternatively, you could generate two output streams and concatenate them at the end.
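
A rough sketch of the two-output-stream variant, assuming the generator is written in Python. The function name `generate` and the shape of its input (tuples of name, parameter names and body lines, with everything typed as `int`) are assumptions made only for this illustration.

```python
import io


def generate(functions):
    """Generate C++ from (name, param_names, body_lines) tuples (hypothetical input shape)."""
    decls = io.StringIO()  # stream 1: forward declarations, emitted first in the output
    defs = io.StringIO()   # stream 2: function definitions

    for name, params, body in functions:
        signature = "int %s(%s)" % (name, ", ".join("int " + p for p in params))
        # A single walk over the input feeds both streams.
        decls.write(signature + ";\n")
        defs.write(signature + " {\n")
        for line in body:
            defs.write("    " + line + "\n")
        defs.write("}\n")

    # Concatenate at the end: all declarations, a blank line, then all definitions.
    return decls.getvalue() + "\n" + defs.getvalue()


print(generate([
    ("add", ["a", "b"], ["return a + b;"]),
    ("twice", ["a"], ["return add(a, a);"]),
]))
```

The tree is still walked once, straight down; only the destination of each piece of output differs.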

Hope that helps. Sorry if I'm just saying what's obvious.

Mike Dunlavey