Hi,

Looking at several posts, I get the feeling that many of the questions arise because compilers/implementations often (though not always) fail to emit a meaningful message. This is especially true of templates, where error messages can be daunting, to say the least. A case in point could be the discussion topic

Therefore, I would like to understand a few things:

a) Why is it that compilers are sometimes unable to give more meaningful/helpful error messages? Is the reason purely practical or technical, or is there something else? (I don't have a compiler background.)

b) Why can't they give a reference to the most relevant conforming C++ Standard Verse/section, so that developer community can learn C++ better?

EDIT:

Refer the thread here for another example.

+4  A: 

A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools. --- Douglas Adams

I'll try to explain some rationale behind diagnostics (as the standard calls them):

a) Why is it that compilers are sometimes unable to give more meaningful/helpful error messages?

Compilers are bound to obey the standard. The standard sorts more or less everything into a few categories: things the compiler needs to diagnose (e.g. syntax errors), because these are invariants; implementation-defined behavior, which the vendor needs to document (with some leeway as to how); unspecified behavior, which the vendor can get away without documenting; and undefined behavior (if the standard can't define it, what error message can the compiler possibly spit out?).
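To make the categories a bit more concrete, here is a minimal sketch. The function names (char_is_signed, mark, ignore_both, argument_order) are made up for illustration and are not from any library. The signedness of plain char is implementation-defined, so the vendor documents it; the order in which function arguments are evaluated is unspecified, so either result below is allowed and the vendor need not say which one you get.

```cpp
#include <climits>
#include <string>

// Implementation-defined: the vendor picks whether plain char is signed
// and has to document that choice.
constexpr bool char_is_signed() { return CHAR_MIN < 0; }

// Hypothetical helper: appends a tag so the evaluation order can be observed.
inline int mark(std::string& log, char tag) {
    log += tag;
    return 0;
}

// Hypothetical helper: only exists so that two arguments get evaluated.
inline int ignore_both(int, int) { return 0; }

// Unspecified: the two mark() calls may run in either order, so the
// returned log is "ab" on some implementations and "ba" on others.
inline std::string argument_order() {
    std::string log;
    ignore_both(mark(log, 'a'), mark(log, 'b'));
    return log;
}
```

Neither case calls for a diagnostic: both programs are well-formed, so there is nothing for the compiler to complain about.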

b) Why can't they give a reference to the most relevant conforming C++ Standard Verse/section, so that developer community can learn C++ better?

  • Not everyone has a copy of the standard.

  • Instead, what the compiler tries to do is group errors into categories and then settle on a human-understandable error message that is generic enough to cover all sorts of errors in that category while still being meaningful.

  • Also, not all compilers are standards compliant. Sad, but true.

  • Some compilers implement more than one standard. Do you really expect them to quote C&V of 3 standards texts for a simple "missing ;" error?

  • Finally, the standard is terse and less human readable than the committee would like to think (okay, this is a tongue-in-cheek remark but reflects the state of affairs pretty accurately!)

And read the quote at the top once more ;)

PS: As far as template error messages are concerned, I have to offer the following:

  • For immediate relief, use STLFilt
  • Pray that Concepts make their way into the next standard
dirkgently
+1 for STLFilt. I meant diagnosable conditions, i.e. conditions which should be diagnosed by the implementation. The point about throwing up a reference to the Standard could be controlled by some sort of switch/flag for the more interested users.
Chubsdad
Most of the time, when examining code, the compiler has to pull in more than one section of the standard. The sheer amount of juggling you'd have to do with the text is reason enough not to even think about it. Other reasons, well, I have mentioned them. The point is to get things done as easily as possible. The compiler authors don't want you to ruminate on how accurately they can quote the standard, but rather on how accurately they have implemented its wording! YMMV.
dirkgently
+1  A: 

Compiler authors aren't chosen for their English abilities, and don't choose their work for the writing opportunities.

That said, I think error messages have consistently improved over the last decade. With GCC, the problem is usually sifting through too much information. The discussion you linked was about a "no matching function" message. That's a common error which is usually followed by a torrent of candidate functions.

Being referred to the standard's rules on overload resolution would possibly even be counterproductive in this case. To resolve the issue, I'll find the candidate I want and compare it to the call site. 99% of the time, I want a simple no-frills match, and 99% of the sophisticated resolution machinery won't apply. Having to review the resolution rules in the standard often indicates you're getting into deep doo-doo.

I think only a minority of programmers are really inclined or fully able to navigate and interpret the ISO standard, anyway.

On the bright side, there are always avenues to contact the authors of any actively-maintained compiler. If you have any kind of suggestion for improved wording, send it in!

Potatoswatter
+1  A: 

Some compilers are better than others. I've heard that the Comeau compiler gives significantly nicer errors. You can try it out at http://www.comeaucomputing.com/tryitout/

obelix
+1  A: 

IMHO, often times what matters is not the text of the message, but the ability to relate it to the source. The C++ compiler in VS2005 seems to show error messages indicating the file where the error occurred, but not the file it was included from. That can be a real pain when e.g. a mistake in one header file causes compilation errors in the next one. It can also be difficult to ascertain what's going on with preprocessor macros.

supercat
+1  A: 

A factor not mentioned in the other answers I've read: C++ compilers have a very complicated job as is, and don't further complicate it by classifying the code they're compiling into "expected" stuff and "unexpected". For example, we as programmers understand that std::string is a particular instantiation of std::basic_string with a given character type, traits, allocator - whatever. So, when there's an error we just want to know it involves a string and not see all that other stuff. But, say we're asked to debug an error message a client encountered when using our library. We may need to see exactly how a template has been instantiated in order to see where the problem is, and simply seeing some typedef that's inside their code - that we may not even have access to - would make the error messages useless. So, programmers at different levels in the software stack want to see different things, and most compilers don't want to buy into guessing about this or allowing customisations. They just spit everything out and trust that the programmer will quickly learn to focus in on the stuff at the level they need. Most of the time, programmers quickly learn to do that, but sometimes it's harder than others.
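As a small illustration of the std::string point, the sketch below spells out the instantiation that the typedef hides. The alias name expanded_string is made up for this example. The long form on the right is what a compiler has in hand when it reports an error, which is why it tends to show up in messages.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical alias: the full instantiation behind the std::string typedef.
using expanded_string =
    std::basic_string<char, std::char_traits<char>, std::allocator<char>>;

// Both names refer to one and the same type; only the spelling differs.
static_assert(std::is_same<std::string, expanded_string>::value,
              "std::string is a typedef for the expanded instantiation");
```

Which spelling is more useful depends on who is reading the message, which is exactly the guess the compiler would have to make.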

Another factor is that sometimes there may be many small variations on the erroneous code that would all be valid, so it's impractical for the compiler to know what the programmer intended and display a message about that delta. Programmers however are often unaware of the other ways the code might almost have made sense, and just think the compiler is dumb for not seeing it from their perspective.

Cheers, Tony

Tony
+3  A: 

The fundamental problem is that compiler diagnostics deal with things you haven't written.

In order to give you a meaningful error message, the compiler has to guess what you meant, and then tell you how your code differs from that.

If you're missing a semicolon, the compiler obviously can't see that semicolon anywhere. Of course, one of the things it can do is to guess "maybe the user is missing a semicolon. That's a common mistake, after all". But where should that semicolon have been? Because you made an error, the code can't be parsed into a syntax tree, so there's no clear indicator that "this node is missing from the tree". And there might be more than one place where a semicolon could be inserted so that the surrounding code would parse correctly. And moreover, how much code are you going to try to parse/recompile once you've found what might be the error? The compiler could insert the semicolon, but then at the very least it has to restart parsing of that block of code. But maybe it introduced errors further down in the code. So maybe the entire program should be recompiled, just to make sure the fix the compiler came up with was actually the right one. But that's hardly an option either. It takes too long.

Say you have some code like this:

struct foo {
 ...
}

void bar();

What is the error here? Looking at it, you and I would say "you're missing the semicolon after the class definition". But how can the compiler tell? void could be a typo. Perhaps you actually intended to write the name of an instance of type foo. Then the real error would be that it is followed by what now looks like a function call.

So the compiler has to guess. "This looks like it could have been a class definition, and what comes after it looks like the name of a type. If that is true, the user is missing a semicolon to separate them".
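To see why this really is a guess, here is a minimal sketch of well-formed code in which a class definition is followed directly by a name, with the semicolon only at the very end. The names instance and instance_value are made up for illustration.

```cpp
// A class definition may be followed directly by declarators, so a name
// after the closing brace is not automatically an error.
struct foo {
    int value;
} instance = {42};  // defines the type foo and an object named instance

// Hypothetical helper, only here so the object can be inspected.
int instance_value() { return instance.value; }
```

Since this form is legal, the parser cannot decide at the closing brace that a semicolon is missing; it only finds out that something is wrong once it reaches the next token.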

And guessing isn't a very precise science. And matters are further complicated because every time the compiler tries to be clever and makes a guess, it's only going to add confusion if the guess is wrong.

So sometimes, it might be better to output a short, terse message saying only what we're sure of (say, that a class definition cannot be followed by a type name). That's not as helpful as saying "you're missing a semicolon after the class definition", but it's less harmful if the compiler guesses wrong.

If it tells you you're missing a semicolon, and the error was actually something else, it's just misleading you. So maybe a terse and less helpful error message is better in the worst case, even if it isn't as nice in the best case.

Writing good compiler errors isn't easy, especially not in a messy language like C++. That said, some compilers (including MSVC and GCC) could be a lot better. I believe that better compiler diagnostics are one of the primary goals of Clang.

jalf
I'll have to disagree about the semi-colon or curly brace issue. Crimson Editor (not a compiler - mind you) can tell you which curly brace goes where and if one is missing simply by placing your cursor. I imagine any good compiler could do the same thing.
0A0D