I would like to create a domain-specific language as an augmented C++ language. I will mainly need two kinds of constructs:

  • Top-level constructs for specialized types or declarations
  • In-code constructs, i.e. added primitives that make function calls or idioms easier

The language will be used for scientific computing, and will ultimately be translated into plain C++. C++ was chosen because it seems to offer a good compromise between ease of use, efficiency, and the availability of a wide range of libraries.

A previous attempt using flex and bison failed due to the complexity of the C++ syntax, and the existing parser still fails on some constructs. So we want to start over, but on a better foundation.

Do you know of similar projects? If you attempted this, what tools would you use? What would be the main pitfalls? Would you have recommendations in terms of syntax?

+1  A: 

If you really want to extend C++, you'll need a full C++ parser plus name and type resolution. As you've found out, this is pretty hard. Your best solution is to get an existing one and modify it.

The DMS Software Reengineering Toolkit is an infrastructure for implementing language processors. It is designed to support the construction of tools that parse languages, carry out transformations, and spit out the same language (with enhanced code) or a different language/dialect.

DMS has a full C++ Front End that parses C++ and builds abstract syntax trees and symbol tables (e.g., all that name and type resolution stuff).

The DMS/C++ front end is provided with DMS in source form, so that it can be customized to achieve the kind of effect you want. You'd define your DSL as an extension of the C++ front end, then write transformations that convert your special constructs into "vanilla" C++ constructs, and then spit out a compilable result.

DMS/C++ has been used for a wide variety of transformation tasks, including ones that involved extending C++ as you've described, and including tasks that carry out massive reorganizations of large C++ applications. (See the Publications at that website).

Ira Baxter
“If you really want to extend C++, you'll need a full C++ parser plus name and type resolution” – no, not necessarily. It will be enough to parse out the *augmented* constructs. Of course, this requires not choking on the rest of the code but that doesn’t entail having a full parser.
Konrad Rudolph
It is in fact possible to find extensions trivial enough that you can do this with just Perl and regex hacking. In practice, it's pretty hard to find interesting extensions to C++ (such as the OP's notion of "specialized types and declarations") that will allow you to get away with just parsing, especially if writing such a declaration can have an impact on other parts of the code. Even then you'll find parsing to be a daunting task; if you try to parse *without* name and type resolution, you have to keep all the ambiguous parses around.
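
For a sense of scale, here is a minimal sketch of the "regex hacking" end of the spectrum, written with `std::regex` rather than Perl. The `quantity Name;` construct and the `translate` function are hypothetical, invented purely for illustration; the rest of the source is passed through without any attempt to parse it.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical extension: a line consisting of "quantity Name;" is rewritten
// into a plain struct. Everything else is copied through untouched, so no
// C++ parsing or name resolution takes place.
inline std::string translate(const std::string& source) {
    // (^|\n) anchors the match at the start of the input or just after a
    // newline, without relying on multiline regex support.
    static const std::regex decl(
        R"((^|\n)[ \t]*quantity[ \t]+([A-Za-z_]\w*)[ \t]*;)");
    return std::regex_replace(source, decl, "$1struct $2 { double value; };");
}

// Small self-check of the rewrite on a two-line input.
inline bool translate_demo() {
    const std::string in = "quantity Length;\nint main() { return 0; }\n";
    const std::string out = translate(in);
    return out == "struct Length { double value; };\nint main() { return 0; }\n";
}
```

A rewrite like this has no idea whether the line sits inside a block comment or a multi-line string literal, and it cannot react to how `Length` is used elsewhere, which is where the approach stops scaling.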
Ira Baxter
That looks very good, but unfortunately, this works only for Windows, which is a system I don't use.
PierreBdR
Best of luck. These beasts are pretty rare. You might check out EDG, but its parser is implemented by hand, which will make extending it a lot harder than just adding extra grammar rules, and you'll have to build all that transformation support infrastructure yourself. You'll likely find that to be of the same scale as the C++ parsing engine.
Ira Baxter
A: 

To solve your first bullet, maybe you can use the new C++0x features "initializer lists" and "user-defined literals", avoiding the need for a new parser. They may help with the second bullet, too.
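
As a rough sketch of how these two features can stand in for new syntax, the snippet below combines them. The `Meters` and `Samples` types and the `_m` / `_km` suffixes are hypothetical names made up for this example, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical unit type, for illustration only.
struct Meters {
    long double value;
};

// User-defined literals: 2.5_m and 1.5_km both produce a Meters value.
constexpr Meters operator""_m(long double v) { return Meters{v}; }
constexpr Meters operator""_km(long double v) { return Meters{v * 1000.0L}; }

// Hypothetical container that accepts a brace-enclosed list of Meters.
class Samples {
public:
    Samples(std::initializer_list<Meters> init) {
        for (const Meters& m : init) data_.push_back(m.value);
    }
    std::size_t size() const { return data_.size(); }
    long double sum() const {
        long double total = 0.0L;
        for (long double x : data_) total += x;
        return total;
    }
private:
    std::vector<long double> data_;
};

// The declaration reads almost like a line in a small DSL.
inline long double total_distance() {
    Samples track = {1.5_km, 250.0_m, 0.25_km};
    return track.sum();
}
```

Note that the literals must be written with a decimal point here (`250.0_m`), because only the `long double` overload is defined; accepting `250_m` would need an additional `unsigned long long` overload.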

Didier Trosset
A: 

The way to extend C++ is not to try to extend the language, which will be extremely difficult and probably break as new base compiler releases implement new features, but to write class libraries to support your problem domain. This has been what C++ programming has been all about since the language's inception.
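
A minimal sketch of what that can look like for scientific code, assuming a made-up `Vec3` type and `euler_step` helper (neither comes from an existing library): operator overloading lets the domain code read close to the underlying math.

```cpp
#include <cassert>

// Hypothetical 3-D vector type, for illustration only.
struct Vec3 {
    double x, y, z;
};

inline Vec3 operator+(const Vec3& a, const Vec3& b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}

inline Vec3 operator*(double s, const Vec3& v) {
    return {s * v.x, s * v.y, s * v.z};
}

inline double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// One explicit Euler step: x_next = x + dt * v
inline Vec3 euler_step(const Vec3& position, const Vec3& velocity, double dt) {
    return position + dt * velocity;
}
```

A call such as `euler_step(x, v, 0.5)` then needs no knowledge of templates or of how `Vec3` is stored.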

anon
I know, and this is what I started with. But the result still looks quite scary to non-C++ programmers. This is why I now want to add a thin layer above C++ to make things more readable (like avoiding complex templates, adding a few idiomatic constructions that occur a lot in the code, or replacing some function calls with keywords so as to provide a nicer syntax).
PierreBdR
This is not exactly true. Some of the best C++-based projects contain augmented C++ DSLs. To name a few: the MOC preprocessor in Qt and the tablegen DSL in LLVM.
SK-logic
+1  A: 

There are many (clever) attempts to have domain specific languages within the C++ language.

It's usually called DSEL for Domain Specific Embedded Language. For example, you could look up the Boost.Spirit syntax, or Boost.rdb (in the boost vault).

Those are fully compliant C++ libraries which make use of C++ syntax.

If you want to hide some complexity, you might add in a few macros.
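
To give something concrete, here is a small sketch of a macro used that way. The `Grid` class and the `FOR_EACH_CELL` macro are hypothetical, invented for this example; the macro hides a pair of nested index loops behind a single keyword-like construct.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D grid stored in row-major order, for illustration only.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0.0) {}
    double& operator()(std::size_t r, std::size_t c) {
        return cells_[r * cols_ + c];
    }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
private:
    std::size_t rows_, cols_;
    std::vector<double> cells_;
};

// Hypothetical macro: expands to two nested loops over all (r, c) indices.
#define FOR_EACH_CELL(grid, r, c)                       \
    for (std::size_t r = 0; r < (grid).rows(); ++r)     \
        for (std::size_t c = 0; c < (grid).cols(); ++c)

// Fills each cell with its row-major index and returns the sum of all cells.
inline double fill_and_sum(Grid& g) {
    double total = 0.0;
    FOR_EACH_CELL(g, i, j) {
        g(i, j) = static_cast<double>(i * g.cols() + j);
        total += g(i, j);
    }
    return total;
}
```

The usual macro caveats apply: the loop variables are declared by the macro, and errors inside the body are reported against the expanded code.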

I would be happy to provide some examples if you gave us something to work with :)

Matthieu M.
+2  A: 

You can try extending the open source Elsa C++ parser (it is now part of Mozilla's Pork project):

https://wiki.mozilla.org/Pork

SK-logic