Hi All,

My partner and I are working on a prettyprinter for C++. The tool parses C++ and prints the resulting AST, so we have quite a bit of flexibility. We've implemented a few options for the user to control the output, and now we're looking for opinions about which options matter most. If you could take a look at our current options (below) and tell us what you like or dislike, what else should be there, etc., that would be great.

Thanks, Joe

Below are some of the current options (sorry for the length):


1. Control Blocks


1.1 IndentString


Define the white space string that’s used for each indent.

Example:

• IndentString “  ”

void f ()
{
  int a;
}

• IndentString “\t”

void f ()
{
    int m;
}
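
To make the IndentString behaviour concrete, here is a minimal sketch of how the leading whitespace could be built from the configured string and the nesting depth. The function name `makeIndent` is hypothetical and not part of the tool described above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the tool's real API): build the leading
// whitespace for one line by repeating the configured IndentString
// once per nesting level.
std::string makeIndent(const std::string& indentString, std::size_t depth) {
    std::string out;
    out.reserve(indentString.size() * depth);
    for (std::size_t i = 0; i < depth; ++i) {
        out += indentString;
    }
    return out;
}
```

With a two-space IndentString, a statement nested two levels deep would be prefixed with four spaces; with “\t” it would be prefixed with two tab characters.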


1.2 OpenBraceLocation


The three options are: “EndOfLine”, “NextLine”, or “NextLineAsWellAsCloseParen”

Places the open brace on the same line as, or the line after, the keyword it is associated with. The last option also moves the close paren that precedes the open brace, if one exists, to the next line.

Applies to if, while, for, switch and do-while statements.

If not present the “EndOfLine” option is used.

Example:

• OpenBraceLocation EndOfLine

if(val){
    val++;
}

• OpenBraceLocation NextLine

if(val)
{
    val++;
}

• OpenBraceLocation NextLineAsWellAsCloseParen

if(val
){
    val++;
}


1.3 NoBracesAroundSingleStatementBlock


Braces are removed from statement blocks that have only one statement. This option applies to do-while, for, if, and while blocks.

Example:

• NoBracesAroundSingleStatementBlock is present

if(a)
    func();

• NoBracesAroundSingleStatementBlock is not present

if(a)
{
    func();
}


2. Classes


2.1 virtualQualifier


The options are: “Everywhere” or “Minimalist”. When “Everywhere” is used the keyword “virtual” appears in all derived classes in front of the function declared to be virtual in the base class. With “Minimalist” it only appears in the base class.

Example :

• virtualQualifier Everywhere

class Base
{
    virtual void f(int a);
};

class Derived : public Base
{
    virtual void f(int a);
};

class MostDerived : public Derived
{
    virtual void f(int a);
};

• virtualQualifier Minimalist

class Base
{
    virtual void f(int a);
};

class Derived : public Base
{
    void f(int a);
};

class MostDerived : public Derived
{
    void f(int a);
};


2.2 SortClassMembers


The level options are “Access”, “Data/Functions” or “Functions/Data”, and “Alpha”. If no level option is provided, or SortClassMembers is not present, the order of the members is unchanged.

Example:

• SortClassMembers Data/Functions Access Alpha

class Compiler 
{
private:
    string inputFileName;
public:
    Compiler( string const & inputFileName_);
    genOutput( string const & outputFileName_);
private:
    analyze();
    emitCode( string const & );
    parse();
    tokenize( string const & inputFileName_);
}

• SortClassMembers Access Functions/Data Alpha

class Compiler
{
public:
    Compiler( string const & inputFileName_);
    genOutput( string const & outputFileName_);
private:
    analyze();
    emitCode( string const & );
    parse();
    tokenize( string const & inputFileName_);
private:
    string inputFileName;
}

• SortClassMembers Access Alpha

class Compiler
{
public:
    Compiler( string const & inputFileName_);
    genOutput( string const & outputFileName_);
private:
    analyze();
    emitCode( string const & );
    string inputFileName;
    parse();
    tokenize( string const & inputFileName_);
}
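
The level options above read like an ordered list of sort keys, so one way to picture them is as a multi-level comparison. The sketch below is only an illustration under that assumption: the `Member` record, the enums, and `sortMembers` are hypothetical names, and placing public before private is inferred from the examples rather than stated anywhere.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of a class member; these names are illustrative only.
enum class Access { Public, Protected, Private };
enum class Kind { Data, Function };
enum class SortLevel { Access, DataFirst, FunctionsFirst, Alpha };

struct Member {
    Access access;
    Kind kind;
    std::string name;
};

// Compare two members on a single level: negative, zero, or positive.
int compareOn(SortLevel level, const Member& a, const Member& b) {
    switch (level) {
    case SortLevel::Access:
        return static_cast<int>(a.access) - static_cast<int>(b.access);
    case SortLevel::DataFirst:
        return static_cast<int>(a.kind) - static_cast<int>(b.kind);
    case SortLevel::FunctionsFirst:
        return static_cast<int>(b.kind) - static_cast<int>(a.kind);
    case SortLevel::Alpha:
        return a.name.compare(b.name);
    }
    return 0;
}

// Apply the levels in the order given; earlier levels take priority.
// With no levels the comparator never reports "less", so the stable
// sort leaves the original member order unchanged.
void sortMembers(std::vector<Member>& members,
                 const std::vector<SortLevel>& levels) {
    std::stable_sort(members.begin(), members.end(),
        [&levels](const Member& a, const Member& b) {
            for (SortLevel level : levels) {
                const int c = compareOn(level, a, b);
                if (c != 0) {
                    return c < 0;
                }
            }
            return false;
        });
}
```

Under these assumptions, “SortClassMembers Access Alpha” corresponds to the level list {Access, Alpha}, and “Data/Functions Access Alpha” to {DataFirst, Access, Alpha}.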


3. Files


3.1 MaxLineWidth


Define the maximum line width. PrettyC++ will intelligently wrap longer lines if possible.

Example:

• MaxLineWidth 80

int x = 123456789;

• MaxLineWidth 10

int x =
123456789;
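
As a rough illustration of the wrapping above, here is a greedy sketch that breaks only at whitespace. It is a simplification and an assumption on my part: an AST-based printer would presumably choose break points from the syntax tree, and the name `wrapLine` is hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical greedy wrapper: breaks only between whitespace-separated
// tokens and ignores indentation and string literals. A token longer than
// maxWidth is left on its own line unbroken.
std::string wrapLine(const std::string& line, std::size_t maxWidth) {
    std::istringstream words(line);
    std::string word;
    std::string current;
    std::string out;
    while (words >> word) {
        if (current.empty()) {
            current = word;
        } else if (current.size() + 1 + word.size() <= maxWidth) {
            current += ' ';
            current += word;
        } else {
            out += current;
            out += '\n';
            current = word;
        }
    }
    out += current;
    return out;
}
```

For the declaration in the example, a width of 80 leaves the line alone, while a width of 10 moves the initializer to its own line, because “int x = 123456789;” no longer fits.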


3.2 constLocation


The options are “Before” or “After”. The Before option places the const keyword before the type specifier. The After option places the const keyword after the type specifier.

Example :

• constLocation Before

const int x;

• constLocation After

int const x;


4. Names


4.1 AllNamesStartCase


Options are “LowerCase” or “UpperCase”.

Example:

• AllNamesStartCase LowerCase

int variable = 123456789;

• AllNamesStartCase UpperCase

int Variable = 123456789;


4.2 AllNamesDelimitWords


Options are “CaseDelimited” or “UnderscoreDelimited”. Words are identified as either starting with a capital letter or following an underscore.

Example:

• AllNamesDelimitWords CaseDelimited

int myVariable = 123456789;

• AllNamesDelimitWords UnderscoreDelimited

int my_variable = 123456789;
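
The word rule above (a word starts at a capital letter or after an underscore) can be sketched as a split-and-rejoin step. The helper names below are hypothetical, and the sketch deliberately ignores corner cases such as acronyms ("HTTPServer" would be split letter by letter) and leading underscores, which are dropped.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split a name into lower-case words, where a word
// starts at a capital letter or after an underscore.
std::vector<std::string> splitWords(const std::string& name) {
    std::vector<std::string> words;
    std::string current;
    for (char ch : name) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (ch == '_') {
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
            continue;
        }
        if (std::isupper(c) && !current.empty()) {
            words.push_back(current);
            current.clear();
        }
        current += static_cast<char>(std::tolower(c));
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}

// Join the words with underscores: myVariable -> my_variable.
std::string toUnderscoreDelimited(const std::string& name) {
    std::string out;
    for (const std::string& word : splitWords(name)) {
        if (!out.empty()) {
            out += '_';
        }
        out += word;
    }
    return out;
}

// Capitalise every word after the first: my_variable -> myVariable.
// The case of the first letter is left to the StartCase options.
std::string toCaseDelimited(const std::string& name) {
    std::string out;
    for (std::string word : splitWords(name)) {
        if (!out.empty()) {
            word[0] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(word[0])));
        }
        out += word;
    }
    return out;
}
```

Both conversions go through the same word list, so a name that is already in the target style comes back unchanged apart from the case of its first letter.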


4.3 FunctionNamesStartCase


Options are “LowerCase” or “UpperCase”.

Example:

• FunctionNamesStartCase LowerCase

void function() { return; }

• FunctionNamesStartCase UpperCase

void Function() { return; }


4.4 FunctionNamesDelimitWords


Options are “CaseDelimited” or “UnderscoreDelimited”. Words are identified as either starting with a capital letter or following an underscore.

Example:

• FunctionNamesDelimitWords CaseDelimited

void myFunction() { return; }

• FunctionNamesDelimitWords UnderscoreDelimited

void my_function() { return; }
+1  A: 

I would suggest taking a look at Eclipse's code formatter. It may be for Java, but it still has a lot of options that would apply to C++. Otherwise, what you currently have looks good to me, but maybe a bit sparse... I would elaborate a bit more, but I don't have the time right now, and I'm sure someone else will answer as well...

Also, Eclipse does have support for C++. I've never looked at it, but I'd assume it has a code formatter as well.

DeadHead
Thanks for the pointer.
Joe
+1  A: 

GNU Indent is an auto-indent program designed for C or C++. You can take a look at the options or the source if you'd like.

LiraNuna
Good point. Thanks.
Joe
+1  A: 

Take a look at Vera++

Piotr Dobrogost
I hadn't heard of this tool. Thanks for the pointer.
Joe
A: 

As far as I know, the general wisdom is to pick a style and go with it. It doesn't really matter too much what style you pick as long as you pick one. See Python for a good example of the language enforcing a particular style and, while it irks some people, it works pretty well.

I guess the ones that people tend to care about are

  • indenting, spaces vs tabs and number of spaces (or number of spaces the tabs represent)
  • Where the brace goes
  • spaces around operators for(int i=0... vs for(int i = 0...
  • spaces after if before the (

Things I wouldn't be as interested in:

  • Sorting the members of a class
  • Enforcing variable naming schemes automagically (they should be in place but there are too many corner cases that cause trouble)
  • Line length limits are a sore issue for me. Depending on the area of our code, I will strictly use 80. But really, modern IDEs can deal with longer lines without too much trouble. But then printing becomes a pain. No matter what, this is very hard to automate well. Again, too many corner cases that cause trouble and would need to be tweaked.
Matt Price
Thanks for the feedback! Appreciate it.
Joe
+1  A: 

Well, I like it very much. Is it different to astyle though? Astyle works pretty well for this kind of thing.

Thanks! My understanding of astyle is that it changes tabs and spaces to make source files consistent, since different tools sometimes interpret a tab as equivalent to 2, 4, or 6 spaces. Since we parse the code and build a complete Abstract Syntax Tree (AST), we can do more than just modify the white space. We can do many refactorings too. Thanks again for the pointer to astyle. I need to spend some time with it.
Joe
A: 

One thing I miss in code formatters is something that gives me more hints about the structure of the code. I often find myself indenting consecutive lines of almost-equal statements to highlight their correspondences and differences, much like this:

auto_ptr<Base::Int> x1 = get<DataModelI::Base::Int>( context, c_Input1 );
auto_ptr<Base::Real> x2 = get<DataModelI::Base::Real>( context, c_Input2 );
auto_ptr<Composite::Array> x3 = get<Composite::Array>( context, c_Input3 );
auto_ptr<Base::Real> x4 = get<DataModelI::Base::Real>( context, c_Input4 );

into that

auto_ptr<Base::Int       > x1 = get<DataModelI::Base::Int >( context, c_Input1 );
auto_ptr<Base::Real      > x2 = get<DataModelI::Base::Real>( context, c_Input2 );
auto_ptr<Composite::Array> x3 = get<Composite::Array      >( context, c_Input3 );
auto_ptr<Base::Real      > x4 = get<DataModelI::Base::Real>( context, c_Input4 );

This is totally unmaintainable: when another, longer, line is added, I lose time indenting all previous lines. And yes, I did read Code Complete and partially agree with their statement about this :)

If you can add this (quite human, and subjective) aesthetic heuristic to a viewer, I would be pleased to hear from it.
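
A much simpler variant of this idea can be sketched in a few lines: pad the left-hand side of consecutive statements so that the first " = " lines up. This is not the full template-argument alignment shown above, only an illustration of the column-padding step, and `alignAssignments` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: align the first " = " in a group of lines by padding
// the text before it with spaces. Lines without " = " are passed through.
std::vector<std::string> alignAssignments(const std::vector<std::string>& lines) {
    const std::string marker = " = ";

    // First pass: find the widest left-hand side in the group.
    std::size_t widest = 0;
    for (const std::string& line : lines) {
        const std::size_t pos = line.find(marker);
        if (pos != std::string::npos) {
            widest = std::max(widest, pos);
        }
    }

    // Second pass: pad every left-hand side up to that width.
    std::vector<std::string> out;
    out.reserve(lines.size());
    for (const std::string& line : lines) {
        const std::size_t pos = line.find(marker);
        if (pos == std::string::npos) {
            out.push_back(line);
            continue;
        }
        out.push_back(line.substr(0, pos) +
                      std::string(widest - pos, ' ') +
                      line.substr(pos));
    }
    return out;
}
```

Because the padding is recomputed for the whole group each time, adding a longer line would not require re-indenting the earlier ones by hand, which is the maintenance cost described above.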

I want my code as clear as possible. In my dream code viewer, I can toggle viewing using declarations - maybe I don't even need them: I can toggle full-namespace viewing, can hide local variable declarations to just focus on the control flow, ...

xtofl
Ohhhh the dream code viewer .... I've had many a dream. :-) Thanks for the suggestion. Let me think about how to implement this and get back to you.
Joe
Can't be _that_ hard, really :)
xtofl
A: 

How about a style formatter, e.g. one that formats according to GNU, BSD, K&R, etc.?

e.g. http://en.wikipedia.org/wiki/Indent_style

Matt H
astyle and GNU indent are actually capable of that
tr9sh
A: 

Thanks to all for the feedback.

Joe