What are some examples of code generators you have used? I think it's a cool idea, but I have trouble thinking of things they can do besides make a class based on an object's attributes/database schema (as described in The Pragmatic Programmer). What language did you write them in and what language did they output?

Edit: Thanks for the responses so far. What I am really looking for is examples of code generators made from scratch for a specific purpose. I mentioned it in the title, but didn't make it very clear in my question. How did you go about making a code generator on your own, and what specifically did it achieve?

+2  A: 

erm, technically speaking any compiler. I've used gcc to generate machine-specific instructions from C and C++.

Peter
Yes, that's true, but not exactly the kind of answer I'm looking for :)
rosscj2533
well, ok, it's a little pedantic - but actually, I think it's important to realise that 'code generation' isn't actually anything particularly special - it's used extremely frequently by developers - we just don't necessarily recognise it as such.
Peter
Yeah, you are right and it's a good point to bring up. What I was more looking for was code generators that were made from scratch. I'll update the question to reflect that.
rosscj2533
+2  A: 

Perl module Jeeves is nice. (The language that the templates are written in is a hybrid of Perl and a very simple Jeeves-specific command set that does looping, conditionals, foreach, etc.)

I use mine for database purposes as in TPP to write DB CRUD functions plus a few other convenience functions. The output code is in VB.NET.

Also I wrote a generator to do structured data parsing from a spec. The generated code did byte-swapping, type casting, and scaling, and stored the results in a struct tailored for the message type. Also generated were structure-specific formatted output functions (printf variety). That code was C++.

In each case I ran Jeeves as a prebuild step in my projects.

John at CashCommons
+4  A: 

The best use I've found recently is to use C# to output C++ hooks that communicate with the same C# library that creates the C++.

jestro
My head spins...
CesarGon
+5  A: 

I have used flex and bison to produce C code for a compiler. You can see examples for flex and bison in their manuals.

I have worked with a code generator written in C++ to produce a large number of C++ data structures for passing messages between a client and server. The input to this generator was a table giving a list of the data structures and the types of messages that applied to them. The generator assigned a unique ID to each message and generated all the header files that could be used for the messages.
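
To make the table-driven idea concrete, here is a minimal sketch in Python (not the C++ tool described above). The table layout, the struct names and the MSG_ constant naming are all assumptions made up for illustration.

```python
# Hypothetical input table: (struct name, fields, message kinds that apply to it).
MESSAGES = [
    ("Position", [("double", "x"), ("double", "y")], ["Get", "Set"]),
    ("Status", [("int", "code")], ["Notify"]),
]


def generate_header(table):
    """Return C++ header text: one struct per row, one unique id per message."""
    lines = ["#pragma once", ""]
    next_id = 1
    for name, fields, kinds in table:
        lines.append(f"struct {name} {{")
        for ctype, field in fields:
            lines.append(f"    {ctype} {field};")
        lines.append("};")
        for kind in kinds:
            # Ids are handed out sequentially, so they are unique by construction.
            lines.append(
                f"constexpr int MSG_{kind.upper()}_{name.upper()} = {next_id};"
            )
            next_id += 1
        lines.append("")
    return "\n".join(lines)


header = generate_header(MESSAGES)
print(header)
```

Adding a row to the table and re-running the script is then enough to get a new struct and fresh message ids.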

garethm
What does the text fed into flex and bison look like?
rosscj2533
+2  A: 

I wrote some code to generate "dumb" web interfaces (ASPX, ASPX.CS), service layer (interface, basic contract implementation), business and data access layers, based on class diagrams created on Enterprise Architect.

I also wrote a VB6/ASP3 generator, based on a COBOL parser, to format an IMS/CICS-generated string into an XML file to be consumed on intranet websites.

Rubens Farias
+6  A: 

Assuming that compilers don't count... I used f2c once, when I had a nasty FORTRAN program and didn't really know FORTRAN that well, so I translated it into C... I took one look at the C that it generated and immediately went back to the FORTRAN...

I've also used LaTeX2HTML and various other LaTeX translators. Those tended to not be nearly as bad.

Brian Postow
LOL :D +1 for going back to fortran
Stefano Borini
Right. Generated code isn't necessarily readable code.
John at CashCommons
I helped develop a COBOL compiler many years ago, which generated C code, which ran on over a dozen different Unix machines. And yes, the generated code was not that pretty, but it performed as well as COBOL compilers that generated native code. The key is to generate enough useful comments to help debug your code generator.
Loadmaster
+8  A: 

They're extremely helpful in White Box testing. You can write a simple Perl (or whatever) script that spits out literally thousands of test cases for some piece of code. For example, suppose you want the answer to this question:

Does your source-code debugger properly handle every type of variable declaration that GCC allows?

Testing that by hand would really be a drag. Re-testing it after every change in the debugger is likely to lead to insanity.
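
A rough sketch of such a script, in Python rather than Perl. The lists below are a small assumed sample, nowhere near everything GCC accepts, and the v0, v1, ... names are made up.

```python
from itertools import product

# A small, assumed sample of the pieces a C declaration can be built from.
STORAGE = ["", "static ", "extern "]
QUALIFIERS = ["", "const ", "volatile "]
TYPES = ["char", "short", "int", "long", "float", "double"]
DECLARATORS = ["{n}", "*{n}", "{n}[4]"]


def generate_declarations():
    """Return one C declaration per combination of the lists above."""
    decls = []
    combos = product(STORAGE, QUALIFIERS, TYPES, DECLARATORS)
    for i, (storage, qualifier, ctype, declarator) in enumerate(combos):
        name = f"v{i}"
        decls.append(f"{storage}{qualifier}{ctype} {declarator.format(n=name)};")
    return decls


declarations = generate_declarations()
print(len(declarations), "declarations, e.g.", declarations[:3])
```

Each generated line can be dropped into a test source file, and the debugger's view of every variable checked automatically.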


Another extremely common use is for embedding multimedia data (images or sounds, for example) into program code. You write a program that reads a media file, parses it, converts it to another format, then generates a declaration that looks like this:

unsigned char logo[320*200] = { 123, 34, 119, ... };

I bet there's not a single embedded system with a screen that doesn't have something like this in the build process for the bootloader, BIOS, or equivalent.
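
A minimal sketch of the last step, assuming the media file has already been read and converted to raw bytes. The function name and the 12-values-per-line layout are arbitrary choices.

```python
def bytes_to_c_array(name, data, per_line=12):
    """Return a C declaration embedding `data` (non-empty bytes) as an array."""
    rows = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # A trailing comma after the last element is legal in a C initializer.
        rows.append("    " + ", ".join(str(b) for b in chunk) + ",")
    body = "\n".join(rows)
    return f"unsigned char {name}[{len(data)}] = {{\n{body}\n}};\n"


declaration = bytes_to_c_array("logo", bytes([123, 34, 119, 0, 255]))
print(declaration)
```

In a real build step the bytes would come from something like open(path, "rb").read(), and the output would be written to a .c or .h file.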

Mark Bessey
I think you just created a converter to xpm...
Stefano Borini
+2  A: 

I've used the Web Service Software Factory, which generates code for .NET web service clients and servers based on a description of the service.

John Saunders
+2  A: 

I've used JAXB for generation of Java classes from an XML schema. The code thus generated was used for creation of XML documents compliant with the schema.

sateesh
+2  A: 

I learned flex and bison so that I could parse some IDL files. It started out as a simple tool (at first just a bash script) to create basic C++ classes from the IDL definitions. This is reinventing the wheel -- there are commercial and free IDL compilers available -- but we had very specific requirements about what needed to be generated.

Once you have written a tool like this, and have a sizable collection of data structures defined in IDL, you find all sorts of other uses for it. Eventually it could count SLOC, generate XML definitions of the data structures, generate conversion routines between the C++ classes it created and the C++ classes output by the real IDL compiler, generate serialization code, etc.

At first it was just flex/bison and C used to generate C++. Working in C became cumbersome, particularly for one feature I was adding, so I rewrote large parts of it in C++.

I think what made it worthwhile was the scale -- probably close to a thousand structure definitions. Coding everything by hand would have been tedious and error-prone. If you only have say 40 or so it may be easier to do it by hand.

Dan
+1  A: 

I used Powershell to translate a PDF of MS-BINXML's functional specification into a C# parser class.

Richard Berg
+1  A: 

I used a tool called CAPE (Computer Aided Protocol Engineering) which allowed me to define a protocol using SDL (Q.931 was the main protocol I implemented) and generated a skeleton state machine that implemented the protocol. By skeleton I mean that all the action routines had to be written by me - things like "send RELEASE message", "start timer t304", etc.

State machine mechanism 9/10, ultimate usefulness 4/10 (code was obfuscated, small changes in SDL would result in large changes in code generated).

Tony van der Peet
+2  A: 

I once created a code generator that can output different languages from a visual representation. It works great for configuring systems.

http://www.memention.com/designer/

I created the first version many years ago for my final thesis for my M.Sc in CS studies.

I rewrote the engine a few years ago because I liked the project and we have one customer using it. I'd like to take it a few steps further... if I only had the time.

epatel
+1  A: 
  • SQL generators, some implemented in POSIX shell environment (mostly awk), some with Python.
  • sh generators, implemented with sh strings and variables: var="some string using variables"; rv=$($var);
just somebody
+1  A: 

I developed a library to provide an XML DOM interface to Fortran (77 and 95). The library is actually a wrapper around libgdome2, which is written in C; my library (F77xml, you can find it somewhere) just handles the interfacing from Fortran to C properly.

The fact is, writing the DOM interface is boring. I wrote a Python program that reads an XML file in which I specify the basic information about each routine. The Python program just generates .c files to compile and link. It works pretty well, but I never actually got to use the library, so I lost interest in it.

Stefano Borini
+1  A: 

I too have used f2c. We used it to convert FORTRAN to ANSI C so that we could port it to different platforms with no changes to the original code. It allowed us to create a generic FORTRAN program on the PC and port it to Sun, SGI, Pr1me, AIX, and a few others.

Dave
+1  A: 

HTML-Canvas to PostScript in JavaScript

I was also experimenting with a JavaScript module system. This version still has some problems, so the demo page might need a few reloads.

Justin Love
+1  A: 

Ruby Equations to Graphviz dot, described more in a journal post.

This one uses GraphvizR, (itself almost an example) so I'm not sure it counts as from scratch.

Justin Love
+1  A: 

MetWare has code generation and starts with a SKOS document (SKOS uses OWL uses RDF). The code that is autogenerated includes SQL table creation instructions as well as Java Beans and Java Server Pages/Faces (JSP/JSF) components. These code generators are written from scratch to give me the flexibility to do various kinds of transformations, like mapping the XML Schema data types from the RDF onto various field types (Java, SQL) and field content validators (JSF).

Egon Willighagen
+1  A: 

SQL routines for CRUD. Mechanically generated Stored Procedures that handle GET (for an Amend or Delete form), SAVE (allowing Insert new, Update existing, or Upsert; included validation based on the constraints set in the DB - to return a user-friendly error, rather than DB error) and DELETE (with validation that no FKeys will be violated etc.)

Includes other in-house issues - such as tables which have a Record Owner and thus SProcs need a parameter for the "Group" the editor belongs to, and so on.

The generated Stored Procedures are sometimes hand edited, but as the majority of the code remains the same it is fairly trivial to regenerate a copy of the generated code (e.g. if a new column is added to the table) and DIFF with the actual code.
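
The regenerate-and-DIFF workflow could look roughly like this in Python. The table, the column names and the one-procedure generator are hypothetical, and the real generated procedures described above do far more (validation, record owners and so on).

```python
import difflib


def generate_get_proc(table, columns, key):
    """Return the lines of a (much simplified) T-SQL GET stored procedure."""
    return [
        f"CREATE PROCEDURE {table}_Get @{key} INT AS",
        f"SELECT {', '.join(columns)}",
        f"FROM {table}",
        f"WHERE {key} = @{key};",
    ]


# The copy in source control: generated earlier, then edited by hand.
hand_edited = generate_get_proc("Customer", ["Id", "Name"], "Id")
hand_edited.insert(1, "-- hand edit: audit reads here")

# A column was added to the table, so regenerate and compare.
regenerated = generate_get_proc("Customer", ["Id", "Name", "Email"], "Id")

diff = list(
    difflib.unified_diff(
        hand_edited, regenerated, "hand_edited", "regenerated", lineterm=""
    )
)
print("\n".join(diff))
```

The diff shows both the new column and the hand edit that would be lost, so the two can be merged deliberately.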

Kristen
+1  A: 

I used Perl to automatically process several million lines' worth of BDE-based Delphi programs, changing all of the database components from the BDE version (e.g. TTable, TQuery) to the DOA "equivalent" (TOracleDataSet, TOracleQuery) and also changing the source to use the different way of accessing fields, executing SQL, etc. It was a huge project in terms of scope, but the actual conversion was a remarkably tiny part of the project.

mcottle
+1  A: 

Do pre-historic times count? I wrote a code converter very similar to yacc that I used to make a Pascal compiler. I do not remember why yacc was not sufficient. Something to do with LR(1) I think it was. That was in the mid 80's. (And no, I did not choose Pascal.)

Jive Dadson
+1  A: 

I use commore (http://sourceforge.net/projects/commore/) right now. It's like "CORBA for dummies". It can generate C/C++ or Java code from an IDL. Useful for distributed systems.

Antoine Claval
+1  A: 

I work on software that has to communicate with an embedded device. The PC software is written in C#, the embedded software is written in a proprietary semi-OOP language. To send structured data between these two components, I have written a program that parses the embedded program, extracts all structure definitions and generates equivalent C# classes for these structures. It also creates, for each of these classes:

  • code to serialize the data into a stream, so it can be easily read on the embedded side (and vice versa)
  • code to check if each of the members is in the valid range (the valid range for each member is "declared" in a comment in the original embedded code)
  • deep-clone and equality-testing methods
  • a method to find the "differences" between two instances of the same data structure (i.e. the set of members that don't have the same values in both instances, recursively for nested structures)

Now I can just fire up a data type converter whenever the embedded firmware changes, and I can immediately use the new data types without writing any new code. I'm quite happy with it.
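
A heavily simplified sketch of the range-check part, written in Python. The embedded language is proprietary, so the C-like struct syntax, the "// range: lo..hi" comment format and the direct reuse of type names in C# are all assumptions.

```python
import re

# Assumed input: a C-like struct with the valid range declared in a comment.
SOURCE = """
struct Motor {
    int speed;   // range: 0..100
    int current; // range: -5..5
}
"""

FIELD = re.compile(r"\s*(\w+)\s+(\w+);\s*//\s*range:\s*(-?\d+)\.\.(-?\d+)")


def generate_csharp(source):
    """Return C# class text with public fields and an IsValid() range check."""
    name = re.search(r"struct\s+(\w+)", source).group(1)
    fields = FIELD.findall(source)  # tuples of (type, name, low, high)
    out = [f"public class {name}", "{"]
    for ctype, field, _low, _high in fields:
        out.append(f"    public {ctype} {field};")
    checks = " && ".join(
        f"{field} >= {low} && {field} <= {high}"
        for _ctype, field, low, high in fields
    )
    out += [
        "    public bool IsValid()",
        "    {",
        f"        return {checks};",
        "    }",
        "}",
    ]
    return "\n".join(out)


csharp = generate_csharp(SOURCE)
print(csharp)
```

Serialization, cloning and diffing methods would be emitted the same way, by looping over the same parsed field list.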

nikie
+1  A: 

Data access layer classes (C#) from a schema (in XML format). We also generated database creation scripts from that same schema. We did write this from scratch, but if we were doing it now we'd probably consider a T4 script.

stusmith
+1  A: 

On a C programming course where one was supposed to write a Brainfuck interpreter, I submitted a Brainfuck to C++ compiler that then used execvp to run GCC (compiling the generated code) and finally ran the program. I couldn't use system() because the sandbox environment didn't permit that, but apparently they didn't prevent using execvp and running GCC there...

Got full points :)
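
For anyone curious what the translation step looks like, here is a minimal sketch in Python that emits C++ source. It is not the submitted program: the 30000-cell tape and the function name are assumptions, and it only produces the text, without invoking a compiler.

```python
def brainfuck_to_cpp(program):
    """Translate Brainfuck source into the text of an equivalent C++ program."""
    ops = {
        ">": "++p;",
        "<": "--p;",
        "+": "++*p;",
        "-": "--*p;",
        ".": "std::putchar(*p);",
        ",": "*p = static_cast<unsigned char>(std::getchar());",
        "[": "while (*p) {",
        "]": "}",
    }
    lines = [
        "#include <cstdio>",
        "int main() {",
        "    static unsigned char tape[30000];",
        "    unsigned char *p = tape;",
    ]
    depth = 1  # current indentation level inside main()
    for ch in program:
        if ch not in ops:
            continue  # every other character is a comment in Brainfuck
        if ch == "]":
            depth -= 1
            if depth < 1:
                raise ValueError("unmatched ']'")
        lines.append("    " * depth + ops[ch])
        if ch == "[":
            depth += 1
    if depth != 1:
        raise ValueError("unmatched '['")
    lines += ["    return 0;", "}"]
    return "\n".join(lines)


cpp = brainfuck_to_cpp("+[-].")
print(cpp)
```

Each Brainfuck command maps to exactly one C++ statement, which is why this kind of compiler is often shorter than an interpreter.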

Tronic
+1  A: 

There's a really cool article in the book "Beautiful Code" in Chapter 8 titled "On-the-Fly Code Generation for Image Processing" that described how early Microsoft Windows versions used code generation to do all sorts of bitmap processing. Since these functions would do a certain patterned operation on every pixel in a region, it gave huge performance benefits to generate the function being applied to each pixel at runtime and go from there.
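
The general idea can be shown in a few lines of Python, as a toy analogue only: the technique described in the book generates machine code, whereas this sketch builds Python source for the per-pixel operation at runtime and compiles it with exec. The function names are made up, and the expression string is assumed to come from trusted code.

```python
def make_pixel_loop(expression):
    """Build a function applying `expression` to each (d, s) pixel pair.

    The operation is baked into the generated source, so the loop itself
    contains no per-pixel dispatch on the operation type.
    """
    source = (
        "def apply(dst, src):\n"
        "    return [" + expression + " for d, s in zip(dst, src)]\n"
    )
    namespace = {}
    exec(source, namespace)  # only safe because `expression` is trusted
    return namespace["apply"]


xor_blit = make_pixel_loop("d ^ s")
and_blit = make_pixel_loop("d & s")

print(xor_blit([0b1100, 0b1010], [0b1010, 0b1010]))  # [6, 0]
print(and_blit([0b1100, 0b1010], [0b1010, 0b1010]))  # [8, 10]
```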

You'd probably enjoy the article, and the book as a whole is worthwhile. I'd paste it here or something but that'd probably get me in trouble...

Adrian Petrescu
+2  A: 

Ah, code generation. I joke with people that I am a computer programmer because I am fundamentally a lazy person - I want the machine to do my work FOR me. Granted, to be more accurate, I want the machine to do EVERYONE's work...

I've written the SQL DDL-to-Objects program a few times. I have it write the T-SQL Stored Procedure for CRUD operations, the VB.NET code for the data access layer, the VB.NET code for the business layer, including a generic list of the object in question with a few handy methods pre-written. Then it's just a simple case of copy/paste into Visual Studio and, presto, custom objects derived from a table design.

But this isn't the first time I've done this.

Going back to 1984 I was working for a company that wrote "municipal software" - tax assessment software, utility billing software, that sort of thing. The Tax Assessment system was being written from scratch but we had something very rare in the industry - a LONG design phase that assessors helped produce. We had it down to knowing what our indexed files would be (think of them as old-school SQL tables), the relations between them, etc.

We were going to have a LOT of maintenance screens to support all the 'parameter' files. Something like 17 small files that each needed their own screen.

The system was VAX/VMS based, written in DEC-Cobol, using FMS (Forms Management System) for a front-end screen handler. That was the "official word". I had to write the 17 maintenance programs and I didn't want to. So I wrote the first one and "genericized" my code so that I could clone it easily for the other 16.

FMS had a feature that would produce a Cobol data division from your form definition. It hit me that with all the 1-to-1 screen-to-file field relationships, I could do something neat.

So, if you followed a few naming conventions for the user-editable fields, you had all the information you needed.

I wrote a COBOL "template" that required a single "Search/Replace All" and a "Save As" to properly customize.

Then I wrote a DEC-BASIC program that read the output from FMS and produced a series of INCLUDE files in COBOL. The INCLUDE files covered everything - the FD, data definitions, the COBOL part of the CRUD code, loading the screen from the record buffer and vice-versa, error checking - you name it. I just had to make sure the 'hooks' I put in the program template matched the 'latches' that I put in the INCLUDE files.

You literally went from the design of the screen to a working program in less than two minutes.

When I told my boss that I had a BASIC program writing COBOL code (after all, BASIC was better for manipulating strings) he looked at me like I had 3 heads. The next day when I told him that I had written 10 of the maintenance programs in one day (the schedule said we had 3 days for each program) he didn't believe me until I demoed them all. I was able to knock over a month off the development schedule.

I don't like repetitive work!

David
Wow, nice work!
rosscj2533
+1  A: 

The SPIRAL project takes DSP algorithms (or any linear transform) like the Fast Fourier Transform and Walsh-Hadamard Transform, specified in the SPL language, and converts them to intermediate code. It can then do optimizations that are very specific to linear transforms and spit out highly optimized and portable C code.

Here's how you would specify an FFT of size 8:

; common definition
(define F4
  (compose (tensor (F 2) (I 2)) (T 4 2)
           (tensor (I 2) (F 2)) (L 4 2)))

; formula-1
(compose (tensor (F 2) (I 4)) (T 8 4)
         (tensor (I 2) F4) (L 8 2))

; formula-2
(compose (tensor F4 (I 2)) (T 8 2)
         (tensor (I 4) (F 2)) (L 8 4))

The two formulas following the definition of an FFT of size 4 are the breakdown rules. Depending on the cache architectures and other properties of the system one might be more efficient than the other. SPIRAL uses Dynamic Programming to decide which is better to use.

We're working on building a small working model of this in CS650: Automatic Program Generation and Optimization to learn the basic principles. I'm using flex and bison to do this. The lexing and parsing were the easy parts; it's the semantics of the language that are difficult.

Lolindrath
+2  A: 

Saltarelle, which creates HTML-producing classes from templates. If you, like people did in the IT boom, count HTML as code, it's even a code generator that generates a code generator.

erikkallen
+1  A: 

Not sure if this qualifies:

http://francisshanahan.com/index.php/2010/a-simple-gwt-generator-example/

Francis Shanahan
Nice example, something like that is what I imagined when I first started thinking about what one would look like. I'm glad you pointed out some of the pain in using it. I'd say it definitely qualifies :)
rosscj2533
+1  A: 

My team has created code generators for testing purposes. We have a piece of hardware that we want to test, and the driver only has a C language API. We created a simple interpreter that will take test descriptions that we write in a human-readable plain text file (using a "scripting language" that we defined) and will generate C code that will perform the tests and dump the test results to a log file. We use this in our automated testing lab. We tell our overnight hardware test systems which tests need to be run and the test system will generate C code to perform the tests, compile the code into a test utility, run the utility, and report back in the morning with the test results.

We built the code generator after we noticed that we were constantly doing the same thing, night after night. We wanted to automate our testing, and we found that the individual cases we were writing test code for followed the same basic code flow. In many cases, we were copy/pasting from an application "skeleton" and simply fleshing out a few details. From that point, it wasn't a big leap to code generation using app skeletons as templates.

Another example is a project I worked on where we were taking data from overnight tests and storing the data in a database. Users could access a web interface to view the data, compare tests from several nights, etc. We presented the data as line graphs using SVG images. An SVG image is written like an XML file, so we put together some Ruby scripts that would use SVG templates and database data to generate images on the fly. Like the previous example, we created the code generator after we identified a repetitive activity that looked like it could be automated, then broke down that activity into parts that don't change (implemented as code templates) and parts that always change (the code that has to be generated).
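
The SVG case splits neatly into a fixed template and generated data. Here is a minimal sketch in Python (the scripts described above were Ruby); the template, the sizes and the scaling are assumptions, and it expects at least two non-negative values.

```python
# The part that never changes: a fixed SVG skeleton with placeholders.
SVG_TEMPLATE = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="{w}" height="{h}">'
    '<polyline fill="none" stroke="black" points="{points}"/>'
    "</svg>"
)


def line_graph(values, w=200, h=100):
    """Return SVG text plotting `values` as one line scaled to w x h."""
    top = max(values) or 1  # avoid dividing by zero when every value is 0
    step = w / (len(values) - 1)
    # The part that always changes: one "x,y" pair per data point.
    # SVG's y axis points down, so larger values get smaller y coordinates.
    points = " ".join(
        f"{i * step:.1f},{h - v / top * h:.1f}" for i, v in enumerate(values)
    )
    return SVG_TEMPLATE.format(w=w, h=h, points=points)


svg = line_graph([0, 5, 10])
print(svg)
```

In the setup described above, the values would come from a database query instead of a literal list.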

bta
Cool work, sounds like it was worth it to automate a repetitive process :)
rosscj2533
+2  A: 

I have written a code generator for Java as an Eclipse plugin. It is available at: http://fast-code.sourceforge.net/. Let me know what you think of it. It is useful for Spring-based applications.

fastcodejava
+2  A: 

I created my own here. It's just the start, but it shows the idea:

1) an online central repository in a simple, searchable and readable blog, which could be shared and duplicated for refinement by others in the future
2) executable with a free online Rebol client (in the future it will be integrated with Eclipse and Visual Studio)
3) it can do more than just create the basic class; in the future it can also generate and even execute tests

example with simple video instruction

http://askcodegeneration.com/java/class/

http://askcodegeneration.com/csharp/class/

Rebol Tutorial