Compilers, like all software, are prone to bugs and logical errors.

How does one validate the output generated by a compiler? Specifically, my questions are:

  • How to validate that the machine code generated is correct?

  • How to ensure that the generated machine code conforms to the language specification?

  • Does it make sense to just pick an open source project (in C, if one is also writing a C compiler) and compile it with the "compiler"? In that case, how does one judge that the compiler is behaving as expected?

  • Are there any formal test cases (literature) provided by the language standards committee that a "language complying" compiler has to satisfy?

  • What are the sure giveaways that a problem in a compiled program is a compiler bug and not a program bug?

    - Any examples where mainstream compilers get confused and compile the code wrong?

Links to any literature would be appreciated.

+4  A: 

There are several compiler test suites out there. We've had some luck using the Plum Hall test suite for a C compiler. It consists of a large set of C code specifically written to test against the language standard. It verifies that the compiler can handle the language syntax and semantics.

Trent
Does not answer everything but the link is excellent. Thanks.
Aditya Sehgal
+2  A: 

The general practice is to create a large set of small programs that each demonstrate one aspect of the compiler. These include both programs that should compile and ones that shouldn't. Generally, the assembly coming out of the back end is not checked; rather, the program is run and its output is checked. As for how to make sure there are no bugs in the test cases: make them small, as in 5-10 lines each.

These test suites can be very large, as in hundreds to thousands of tests (for example: an out-of-date test suite for the D programming language), and generally include one or more test cases for every bug ever reported.
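As a rough illustration of the run-and-check approach described above, here is a minimal Python sketch of a suite runner. Everything in it is hypothetical: the `Case` record, the `run_suite` function, and the `make_cc_backend` helper are names invented for this example, and `cc` is only an assumed compiler command. The backend is passed in as a function so that the compiler under test can be swapped in.

```python
import os
import subprocess
import tempfile
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Case:
    name: str
    source: str
    # Expected program output; None means "the compiler must reject this".
    expected_stdout: Optional[str]


# A backend takes source text and returns (compiled_ok, program_stdout).
Backend = Callable[[str], Tuple[bool, str]]


def run_suite(cases: List[Case], backend: Backend) -> List[str]:
    """Run every case and return the names of the ones that failed."""
    failures = []
    for case in cases:
        compiled, out = backend(case.source)
        if case.expected_stdout is None:
            ok = not compiled  # negative test: compilation should fail
        else:
            ok = compiled and out == case.expected_stdout
        if not ok:
            failures.append(case.name)
    return failures


def make_cc_backend(compiler: str = "cc") -> Backend:
    """Backend that shells out to a C compiler ('cc' is an assumption)."""
    def backend(source: str) -> Tuple[bool, str]:
        with tempfile.TemporaryDirectory() as tmp:
            src = os.path.join(tmp, "t.c")
            exe = os.path.join(tmp, "t.out")
            with open(src, "w") as f:
                f.write(source)
            build = subprocess.run([compiler, src, "-o", exe],
                                   capture_output=True, text=True)
            if build.returncode != 0:
                return False, ""
            run = subprocess.run([exe], capture_output=True,
                                 text=True, timeout=10)
            return True, run.stdout
    return backend
```

A call such as `run_suite(cases, make_cc_backend("mycc"))` would then list the failing tests, where `mycc` stands in for whatever compiler is being validated.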

BCS
But if I make small source files and test them individually, how does that ensure that they will all work when part of the same program? For example, when I pick up an open source project and let *my* compiler loose on it.
Aditya Sehgal
It doesn't. But it is far more likely your compiler will fail one of the small programs. There is no possible way of proving that a compiler can correctly compile any given source - this would be equivalent to solving the halting problem.
anon
Generally you get a mix of test cases that are generated from the language spec (mostly really small) and cases from bugs (as small as the reproduction case can be cut down to). As for knowing things will work in all cases, I'm not sure you can (for most languages) even prove that the spec is consistent, let alone that it is implemented correctly.
BCS
Isn't it odd that something like this is not standardized, or that there is not at least an effort to standardize it?
Aditya Sehgal
+1  A: 

There was an earlier question related to this for C, but it comes down to a carefully written compiler test suite.

As to when compilers get the code wrong, I've hit that often enough in my professional career, thanks. It's happened less and less over time, but I found a bug in MS C++ compilers targeting CLI this week.

plinth
+1  A: 

The Eiffel compiler is open source and has an extensive library of test cases and internal design contracts.

http://dev.eiffel.com

+2  A: 

Good test suites for real languages are expensive to create and maintain. There's a reason that the Plum Hall test suite, which is the industry standard for ANSI C, is so bloody expensive.

George Necula's translation validation is a brilliant idea but also quite expensive to implement.

The one thing that's cheap and easy is this: maintain a suite of regression tests, and every time you fix a bug in your compiler, put a suitable test into your regression suite. With compilers, it's unbelievable how easy it is to keep reintroducing the same bug over and over. Disciplined additions to your regression suite will prevent that, and they don't cost much.
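One possible way to keep that discipline cheap is to store each fixed bug as a pair of files: the reduced source and the output it should produce. The Python sketch below assumes a made-up layout (`bug_NNNN.c` next to `bug_NNNN.expected`); the function names and the file naming are illustrative only, not any particular project's convention.

```python
from pathlib import Path


def add_regression(suite_dir, bug_id, source, expected_stdout):
    """Record the reduced reproduction for a fixed bug as a new test case."""
    directory = Path(suite_dir)
    directory.mkdir(parents=True, exist_ok=True)
    src = directory / f"bug_{bug_id:04d}.c"
    if src.exists():
        # Refuse to overwrite: an existing case documents an earlier fix.
        raise FileExistsError(f"regression case already exists: {src}")
    src.write_text(source)
    src.with_suffix(".expected").write_text(expected_stdout)
    return src


def load_regressions(suite_dir):
    """Return (name, source, expected_stdout) tuples, sorted by file name."""
    cases = []
    for src in sorted(Path(suite_dir).glob("bug_*.c")):
        expected = src.with_suffix(".expected")
        cases.append((src.stem, src.read_text(), expected.read_text()))
    return cases
```

Each time a compiler bug is fixed, `add_regression` is called once with the bug's number and its smallest reproduction; the whole directory is then rerun on every build.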

Norman Ramsey
A: 

On the idea of compiling a big open source project:

You could take a project that itself has a test suite. Then you compile the project and its test suite and see if the tests pass. To validate these results, you compile the project and its test suite with another compiler and run the tests again.
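The comparison step could be sketched roughly as follows in Python. It assumes the project's test results have already been collected into two dictionaries mapping test name to pass/fail, one per compiler; the `classify` function and the category names are invented for this example. A disagreement is only a hint, not proof: a project that relies on undefined or implementation-defined behaviour can legitimately behave differently under two correct compilers.

```python
def classify(mine, reference):
    """Compare per-test results (True = passed) from two compilers.

    'mine' holds results from the compiler under test, 'reference' those
    from an established compiler. Only tests present in both are compared.
    """
    report = {
        "agree_pass": [],         # passed under both compilers
        "suspect_compiler": [],   # failed only under the compiler under test
        "suspect_reference": [],  # failed only under the reference compiler
        "suspect_project": [],    # failed under both: likely a project bug
    }
    for name in sorted(set(mine) & set(reference)):
        if mine[name] and reference[name]:
            key = "agree_pass"
        elif reference[name]:
            key = "suspect_compiler"
        elif mine[name]:
            key = "suspect_reference"
        else:
            key = "suspect_project"
        report[key].append(name)
    return report
```

The tests listed under `suspect_compiler` are the ones worth reducing to a small reproduction first.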

Eike