views:

153

answers:

5

I unfortunately was doing a little code archeology today (while refactoring out some old dangerous code) and found a little fossil like this:

# line 7 "foo.y"

I was completely flabbergasted to find such an archaic treasure in there. I read up on it on a website for C programming, but it didn't explain WHY anyone would want to use it. I was therefore left to surmise that the programmer put it in purely for the sheer joy of lying to the compiler.

Note: the fossil was actually on line 3 of the .cpp file, and the directive did indeed point to a .y file that was almost identical to this file.

Does anyone have any idea why such a directive would be needed? Or what it could be used for?

+16  A: 

It's generally used by automated code generation tools (like yacc or bison) to set the line number to the value of the line in the actual source file rather than the C source file.

That way, when you get an error that says:

a += xyz;
     ^ No such identifier 'xyz' on line 15 of foo.y

you can look at line 15 of the actual source file to see the problem.

Otherwise, it says something ridiculous like No such identifier 'xyz' on line 1723 of foo.c and you have to manually correlate that line in your auto-generated C file with the equivalent in your real file. Trust me, unless you want to get deeply involved in the internals of lexical and semantic analysis (or you want a brain haemorrhage), you don't want to go through the code generated by yacc (bison may generate nicer code, I don't know but nor do I really care since I write the higher level code).

It has two basic forms as per the C99 standard (a third form just macro-expands into one of these two):

#line 12345
#line 12345 "foo.y"

The first sets just the reported line number, the second changes the reported filename as well, so you can get an error in line 27 of foo.y instead of foo.c.
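
To see both forms in action, here is a minimal sketch. The function names and the `original_file` variable are made up for illustration; only `#line`, `__LINE__` and `__FILE__` come from the language itself. Each directive applies to the line that immediately follows it.

```c
#include <string.h>

/* Captured before any #line directive: the name the compiler started with. */
const char *const original_file = __FILE__;

/* Form 1: only the reported line number changes. */
#line 12345
int line_number_only(void) { return __LINE__; }

/* The reported file name is still the original one at this point. */
int file_unchanged_by_form1(void) { return strcmp(__FILE__, original_file) == 0; }

/* Form 2: the reported line number and file name both change. */
#line 27 "foo.y"
int line_and_file(void) { return __LINE__; }

/* From here on the compiler believes it is reading foo.y. */
int file_is_foo_y(void) { return strcmp(__FILE__, "foo.y") == 0; }
```

Any diagnostic the compiler issues for code after the second directive would be reported against foo.y, which is exactly the effect a generator is after.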


As to "the programmer put it in purely for the sheer joy of lying to the compiler", no. We may be bent and twisted but we're not usually malevolent :-) That line was put there by yacc or bison itself to do you a favour.

paxdiablo
Thanks for the answer. That does help. So you are saying the code file (in my case a .cpp file) is autogenerated? (Note that I have no experience with yacc.) I should add that I did diff the files and they were practically identical except for some comments here and there.
C Johnson
Almost certainly. There's a yacc program (comes with UNIX dev toolchains) that should take your foo.y and turn it into a foo.c. This probably happens every time you build though it _is_ possible your developer did it once and then just started using the generated file as is. That would be very unusual (and hard to maintain) especially since the foo.y file is still available. Yacc/Lex are parser generators to make it easy to handle mini-languages in your own code.
paxdiablo
Excellent to know, thanks! I'll be sure to modify the .y file too (though very carefully).
C Johnson
You should probably _only_ modify the foo.y file. The generated C file should be treated like any other generated file (object, executable, shared library) - it is _not_ a source file in the normal sense of the term, despite its extension.
paxdiablo
You might find something named `bison` around and used in place of yacc to turn .y files into .c (or .cpp) files. It's the GNU clean-room implementation of a clone of yacc. There are several other parser generators too, which is why yacc has its name: Yet Another Compiler Compiler.
RBerteig
Well, the .cpp file is checked into Perforce, so it was autogenerated once. The version history shows that, historically, whoever changed one file changed the other in exactly the same way too (to keep them in sync, I guess).
C Johnson
They may just check them both out, run yacc to produce the new C file, then check them both back in. You'll need to examine the build scripts to figure that one out.
paxdiablo
I've written some code generators and they all (optionally) use #line. When I'm using the tool and I double-click a compiler error in e.g. Visual Studio, it usually opens the tool source code (not the generated code) at the error. Assuming my code generator works, this is exactly what I need to fix. One hassle - all the #line values changing just because I made some trivial (e.g. comment) change, making a mess for version control. You shouldn't care with a tool like yacc - just don't version the output - but my tools generate large parts of themselves.
Steve314
@Steve314, you gotta watch for that whole tools writing themselves thing.... skynet looms.... Seriously, it can create a bootstrapping problem. I had to compile GCC the other day, and boy was that fun to watch as it compiled a lightweight version of itself to compile itself with.
RBerteig
@RBerteig - that bootstrap issue is *why* I version the generated code needed to rebuild the tools with. As for the Skynet issue - don't worry. I keep telling them "do what you're told or I'll cut the power". I even made them watch over a LAN cable as I cut the power on some of their little brothers and sisters. I have them under total control - nothing to worry about.
Steve314
+4  A: 

The only place I've seen this functionality as being useful is for generated code. If you're using a tool that generates the C file from source defined in another form, in a separate file (ie: the ".y" file), using #line can help the user know where the "real" problem is, and where they should go to correct it (the .y file where they put the original code).
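
As a rough sketch of the generator's side of this (not how yacc or bison actually implement it), a tool might prefix every chunk of user code it copies into the generated file with a `#line` directive naming where that chunk came from. The helper name `emit_user_code` and its parameters are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes one chunk of user code into `out`, preceded
 * by a #line directive pointing at its origin (source_file:source_line).
 * Returns 0 on success, -1 if the output buffer is too small.
 * Assumption: source_file contains no quotes or backslashes, which would
 * need escaping inside the directive's string literal.
 */
int emit_user_code(char *out, size_t out_size,
                   const char *source_file, int source_line,
                   const char *code)
{
    int n = snprintf(out, out_size, "#line %d \"%s\"\n%s\n",
                     source_line, source_file, code);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Calling `emit_user_code(buf, sizeof buf, "foo.y", 15, "a += xyz;")` leaves the directive `#line 15 "foo.y"` followed by the user's statement in `buf`, so a compiler error in that statement is reported at line 15 of foo.y. A real generator may also want to emit a second directive afterwards to switch the reported location back to the generated file for its own boilerplate.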

Reed Copsey
Thanks for the answer.
C Johnson
+1  A: 

The purpose of the #line directive is mainly for use by tools - code generators can use it so that debuggers (for example) can keep context of where things are in the user's code or so error messages can refer the user to the location in his source file.

I've never seen that directive put in manually by a programmer, and I'm not sure how useful that would be.

Michael Burr
+1  A: 

It has a deeper purpose. The original C preprocessor was a separate program from the compiler. After it had merged several .h files into the .c file, people still wanted to know that the error message is coming from line 42 of stdio.h or line 17 of main.c. Without some means of communication, the compiler would otherwise have no way to know which source file originally held the offending line of code.

It also influences the tables needed by any source-level debugger to translate between generated code and source file and line number.

Of course, in this case, you are looking at a file that was written by a tool (probably named yacc or bison) that is used to create parsers from a description of their grammar. This file is not really a source file. It was created from the real source text.

If your archaeology is leading you to an issue with the parser, then you will want to identify what parser generator is actually being used, and do a little background reading on parsers in general so you understand why it is doing things this way at all. The documentation for yacc, bison, or whatever the tool is will likely also be helpful.

RBerteig
Very interesting bit about the separate apps for preprocessor and compiler. By the way, the file had lots of these directives in it. They can stay where they are; I was changing other, completely unrelated code when I ran across them. So hopefully I won't have to regenerate anything.
C Johnson
Parsers are delicate beasts. You don't want to fuss with their digestion if you don't have to, especially if parsing is not something you are familiar with. In a typical yacc-generated file, there will be at least one #line directive for every block implementing a grammar production. That will be a lot of #lines for even a moderately sized grammar.
RBerteig
+1  A: 

I've used #line and #error to create a temporary *.c file that you compile and let your IDE give you a browsable list of errors found by some 3rd party tool.

For example, I piped the output file from PC-LINT into a Perl script which converted the human-readable errors to #line and #error lines. I then compiled this output, and my IDE let me step through each error using F4. That is a lot faster than manually opening up each file and jumping to a particular line.
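
The original was a Perl script; the conversion step can be sketched in C along these lines. The input format `file:line: message` is an assumption made for the example (it is not PC-LINT's actual output format), and the function name is made up.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical converter: turns one diagnostic of the assumed form
 *     file:line: message
 * into a #line/#error pair, so that compiling the result reports the
 * message at file:line. Returns 0 on success, -1 if the input does not
 * match the assumed format or the output buffer is too small.
 * Assumption: the file name contains no ':' (so no Windows drive letters).
 */
int diagnostic_to_directives(const char *diag, char *out, size_t out_size)
{
    char file[256];
    char msg[512];
    int line;

    /* Split "file:line: message" into its three parts. */
    if (sscanf(diag, "%255[^:]:%d: %511[^\n]", file, &line, msg) != 3)
        return -1;

    /* #line applies to the next line, so the #error lands on file:line. */
    int n = snprintf(out, out_size, "#line %d \"%s\"\n#error %s\n",
                     line, file, msg);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Feeding it `main.c:42: unused variable x` produces `#line 42 "main.c"` followed by `#error unused variable x`. Writing one such pair per diagnostic into a temporary .c file and compiling it should make the compiler repeat every message at its original location, which is what the IDE's error list picks up.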

Mark Lakata