How does a macro-enabled language keep track of the source code for debugging?

views:

165

answers:

+7 Q:

This is a more theoretical question about macros (I think). I know macros take source code and produce object code without evaluating it, enabling programmers to create more versatile syntactic structures. If I had to classify macro systems, I'd say there are two kinds: the "C style" macro and the "Lisp style" macro.

It seems that debugging macros can be a bit tricky because at runtime, the code that is actually running differs from the source.

How does the debugger keep track of the execution of the program in terms of the preprocessed source code? Is there a special "debug mode" that must be set to capture extra data about the macro?

In C, I can understand that you'd set a compile time switch for debugging, but how would an interpreted language, such as some forms of Lisp, do it?

Apologies for not trying this out, but the Lisp toolchain requires more time than I have to spend figuring it out.

I don't know about lisp macros (which I suspect are probably quite different than C macros) or debugging, but many - probably most - C/C++ debuggers do not handle source-level debugging of C preprocessor macros particularly well.

Generally, C/C++ debuggers don't 'step' into the macro definition. If a macro expands into multiple statements, then the debugger will usually just stay on the same source line (where the macro is invoked) for each debugger 'step' operation.
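To make the stepping behaviour concrete, here is a toy Python model of it. This is only a sketch under my own assumptions, not how any real preprocessor or debugger is implemented: the `SWAP` macro, the `MACROS` table and the `expand` helper are all made up for illustration. The point is just that every statement produced by an expansion is attributed to the single line where the macro was invoked.

```python
# Toy model: a hypothetical statement-like macro and its multi-statement body.
MACROS = {
    "SWAP(a, b);": ["int tmp = a;", "a = b;", "b = tmp;"],
}


def expand(source_lines):
    """Return (source_line_number, statement) pairs after macro expansion."""
    expanded = []
    for lineno, text in enumerate(source_lines, start=1):
        stripped = text.strip()
        # Every statement from an expansion keeps the invocation's line number.
        for statement in MACROS.get(stripped, [stripped]):
            expanded.append((lineno, statement))
    return expanded


program = ["int a = 1, b = 2;", "SWAP(a, b);", "return a;"]
for lineno, statement in expand(program):
    print(lineno, statement)
```

A line-oriented debugger working from this information has three statements that all report line 2, which is why stepping appears to stay on the invocation line.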

This can make debugging macros a little more painful than they might otherwise be - yet another reason to avoid them in C/C++. If a macro is misbehaving in a truly mysterious way, I'll drop into assembly mode to debug it or expand the macro (either manually or using the compiler's switch). It's pretty rare that you have to go to that extreme; if you're writing macros that are that complicated, you're probably taking the wrong approach.

Michael Burr 2010-07-09 17:26:38

+3 A:

I don't think there's a fundamental difference in "C style" and "Lisp style" macros in how they're compiled. Both transform the source before the compiler-proper sees it. The big difference is that C's macros use the C preprocessor (a weaker secondary language that's mostly for simple string substitution), while Lisp's macros are written in Lisp itself (and hence can do anything at all).

(As an aside: I haven't seen a non-compiled Lisp in a while ... certainly not since the turn of the century. But if anything, being interpreted would seem to make the macro debugging problem easier, not harder, since you have more information around.)

I agree with Michael: I haven't seen a debugger for C that handles macros at all. Code that uses macros gets transformed before anything happens. The "debug" mode for compiling C code generally just means it stores functions, types, variables, filenames, and such -- I don't think any of them store information about macros.

For debugging programs that use macros, Lisp is pretty much the same as C here: your debugger sees the compiled code, not the macro application. Typically macros are kept simple, and debugged independently before use, to avoid the need for this, just like C.

For debugging the macros themselves, before you go and use them somewhere, Lisp does have features that make this easier than in C, e.g., the REPL and macroexpand-1 (though in C there is obviously a way to macroexpand an entire file, fully, at once). You can see the before-and-after of a macroexpansion, right in your editor, when you write it.

I can't remember any time I ran across a situation where debugging into a macro definition itself would have been useful. Either it's a bug in the macro definition, in which case macroexpand-1 isolates the problem immediately, or it's a bug below that, in which case the normal debugging facilities work fine and I don't care that a macroexpansion occurred between two frames of my call stack.
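For readers without a Lisp at hand, the single-step idea behind macroexpand-1 can be sketched in Python. This is a toy model under stated assumptions: forms are nested lists, the `MACROS` table and the `unless` expander are hypothetical, and the only thing borrowed from Common Lisp is that macroexpand-1 expands the outermost form once and also reports whether an expansion happened.

```python
def expand_unless(form):
    """Hypothetical macro: (unless test body...) -> (if (not test) (progn body...))."""
    _head, test, *body = form
    return ["if", ["not", test], ["progn", *body]]


# Macro name -> expander function (an assumption of this toy model).
MACROS = {"unless": expand_unless}


def macroexpand_1(form):
    """Expand the outermost macro call once; return (form, expanded?)."""
    is_call = isinstance(form, list) and form and isinstance(form[0], str)
    if is_call and form[0] in MACROS:
        return MACROS[form[0]](form), True
    return form, False


form = ["unless", "done", ["print", "x"]]
expanded, did_expand = macroexpand_1(form)
print(form)
print(expanded, did_expand)
```

Printing the form before and after one step is the "before-and-after" view described above: a bug in the macro shows up directly in the expanded form, with no debugger involved.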

Ken 2010-07-09 18:02:35

+1 A:

In C, source-level debugging usually has line granularity (the "next" command) or instruction-level granularity ("step into"). Macro processors insert special directives into the processed source that allow the compiler to map compiled sequences of CPU instructions to source code lines.

In Lisp there is no convention between macros and the compiler for tracking the mapping from source code to compiled code, so it is not always possible to single-step through source code.

The obvious option is to single-step through the macroexpanded code. The compiler already sees the final, expanded version of the code and can track the mapping from source code to machine code.

The other option is to use the fact that Lisp expressions have identity while they are being manipulated. If the macro is simple and just destructures its input and pastes code into a template, then some expressions in the expanded code will be identical (with respect to EQ comparison) to expressions that were read from the source code. In this case the compiler can map some expressions from the expanded code back to the source code.
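The identity trick can be sketched in Python, where `is` (and `id`) play the role of EQ. This is a toy model, not any real Lisp compiler: the `note` helper stands in for a reader that records line numbers, the line numbers themselves are invented, and `expand_unless` is a hypothetical template-pasting macro.

```python
source_locations = {}  # id(list object) -> line number, filled by a pretend reader


def note(form, line):
    """Record where a form was 'read' from and return the same object."""
    source_locations[id(form)] = line
    return form


# Pretend the reader produced these forms from lines 10 and 11 of a file.
test = note(["<", "x", 0], 10)
body = note(["print", "x"], 11)
call = note(["unless", test, body], 10)


def expand_unless(form):
    """Paste the user's subforms, unchanged, into a fresh template."""
    _head, test_form, *body_forms = form
    return ["if", ["not", test_form], ["progn", *body_forms]]


def locate(form, found=None):
    """Collect (subform, line) for subforms identical to ones the reader saw."""
    if found is None:
        found = []
    if isinstance(form, list):
        line = source_locations.get(id(form))
        if line is not None:
            found.append((form, line))
        for sub in form:
            locate(sub, found)
    return found


expanded = expand_unless(call)
mapped = locate(expanded)
print(mapped)
```

Only the two subforms the user wrote are found. The `if`, `not` and `progn` lists were created by the macro, so they have no recorded location, which is the "some expressions" limitation mentioned above.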

dmitry_vk 2010-07-09 18:18:46

+1 A:

You should really look into the kind of support that Racket has for debugging code with macros. This support has two aspects, as Ken mentions. On one hand there is the issue of debugging macros: in Common Lisp the best way to do that is to just expand macro forms manually. With CPP the situation is similar but more primitive -- you'd run the code through only the CPP expansion and inspect the result. However, both of these are insufficient for more involved macros, and this was the motivation for having a macro debugger in Racket -- it shows you the syntax expansion steps one by one, with additional GUI-based indications for things like bound identifiers etc.

On the side of using macros, Racket has always been more advanced than other Scheme and Lisp implementations. The idea is that each expression (as a syntactic object) is the code plus additional data that contains its source location. This way when a form is a macro, the expanded code that has parts coming from the macro will have the correct source location -- from the definition of the macro rather than from its use (where the forms are not really present). Some Scheme and Lisp implementations will implement a limited form of this using the identity of subforms, as dmitry_vk mentioned.
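A rough way to picture "code plus source location" is the following Python toy. It is not Racket's actual API: the `Syntax` class, the file names and the line numbers are all assumptions made up for this sketch. It only shows the bookkeeping, where pieces written by the macro user keep their own location and pieces that come from the macro's template carry the location of the macro definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syntax:
    """A piece of code plus where it came from (toy stand-in for a syntax object)."""
    datum: object  # a symbol or literal, or a tuple of nested Syntax objects
    file: str
    line: int


MACRO_FILE = "my-unless.rkt"  # hypothetical file holding the macro definition


def expand_unless(stx):
    """(unless test body ...) => (if (not test) (begin body ...))"""
    _head, test, *body = stx.datum

    def from_macro(datum, line):
        # Template pieces are located in the macro definition, not at the use.
        return Syntax(datum, MACRO_FILE, line)

    return from_macro(
        (
            from_macro("if", 3),
            from_macro((from_macro("not", 3), test), 3),
            from_macro((from_macro("begin", 4), *body), 4),
        ),
        3,
    )


def user(datum, line):
    """Build a syntax object as if it had been read from the user's file."""
    return Syntax(datum, "user.rkt", line)


test = user((user("zero?", 10), user("n", 10)), 10)
body = user((user("display", 11), user("n", 11)), 11)
call = user((user("unless", 10), test, body), 10)

expanded = expand_unless(call)
condition = expanded.datum[1]  # the (not test) form written by the macro
print(condition.file, condition.line)
print(condition.datum[1].file, condition.datum[1].line)
```

An error raised in the `(not ...)` wrapper would point into the macro's file, while an error in the user's test expression would still point at line 10 of the user's file.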

Eli Barzilay 2010-07-09 18:18:50

But mapping compiled code back to the macro does not solve the following problem: when a macro generates code based on declarative input (e.g., a macro that generates a parser from a context-free grammar), it would be very useful during debugging to be able to find out which part of the input is "active" (e.g., which rule of the grammar is being matched, if that's possible). That would require the macro writer to say explicitly which parts of the generated code correspond to which parts of the macro input. Do Racket macros have such an ability? Otherwise, it's just equivalent to debugging (partially) expanded code.

dmitry_vk 2010-07-10 03:11:03

"Otherwise, it's just equivalent to debugging (partially) expanded code." Now I think that I'm wrong in this sentence. Please ignore it.

dmitry_vk 2010-07-10 03:37:29

dmitry_vk: Right -- the `syntax` special form in Racket is basically taking care of combining pieces of code from the macro user with pieces of code from the macro itself, and making sure that the source location on the resulting forms is correct in all forms.

Eli Barzilay 2010-07-12 16:58:48

+2 A:

In LispWorks developers can use the Stepper tool.

LispWorks provides a stepper, where one can step through the full macro expansion process.

Rainer Joswig 2010-07-09 21:26:34

The simple answer is that it is complicated ;-) There are several different things that contribute to being able to debug a program, and even more for tracking macros.

In C and C++, the preprocessor is used to expand macros and includes into actual source code. The originating filenames and line numbers are tracked in this expanded source file using #line directives.

http://msdn.microsoft.com/en-us/library/b5w2czay(VS.80).aspx
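The role of #line directives can be illustrated with a small Python toy. It is a sketch under my own assumptions, not a real preprocessor: it only handles `#include "..."`, the file contents are invented, and real tools differ in the exact marker syntax they print (GCC's preprocessor output, for instance, uses `# N "file"` linemarkers). The second function shows how a consumer of the expanded text can recover the original file and line of every line it sees.

```python
def preprocess(name, files):
    """Inline #include lines, emitting '#line N "file"' markers along the way."""
    out = [f'#line 1 "{name}"']
    for lineno, text in enumerate(files[name], start=1):
        if text.startswith('#include "'):
            included = text.split('"')[1]
            out.extend(preprocess(included, files))
            # Resume numbering in the including file after the include line.
            out.append(f'#line {lineno + 1} "{name}"')
        else:
            out.append(text)
    return out


def origins(expanded):
    """Map every non-marker output line back to its (file, line) origin."""
    result = []
    current_file, current_line = None, 0
    for text in expanded:
        if text.startswith("#line "):
            _directive, number, quoted = text.split(" ", 2)
            current_file, current_line = quoted.strip('"'), int(number)
        else:
            result.append((text, current_file, current_line))
            current_line += 1
    return result


# Hypothetical in-memory "file system" for the example.
files = {
    "main.c": ['#include "util.h"', "int main(void) {", "  return twice(21);", "}"],
    "util.h": ["int twice(int x) { return 2 * x; }"],
}

expanded = preprocess("main.c", files)
for text, origin_file, origin_line in origins(expanded):
    print(f"{origin_file}:{origin_line}: {text}")
```

Even though the compiler proper only sees one flat stream of text, the markers let it attribute `int twice(...)` to util.h line 1 and `return twice(21);` to main.c line 3.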

When a C or C++ program is compiled with debugging enabled, the assembler generates additional information in the object file that tracks source lines, symbol names, type descriptors, etc.

http://sources.redhat.com/gdb/onlinedocs/stabs.html

The operating system has features that make it possible for a debugger to attach to a process and control the process execution; pausing, single stepping, etc.

When a debugger is attached to the program, it translates the process stack and program counter back into symbolic form by looking up the meaning of program addresses in the debugging information.
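The address lookup can be pictured with a short Python sketch. Everything in it is an assumption for illustration: the addresses, the table contents and the `lookup` helper are invented, and real debug formats such as stabs encode this information in a far more elaborate way. The idea is only that the debugger finds the last table entry that starts at or before the program counter.

```python
import bisect

# Hypothetical line table: (start address, file, line), sorted by address.
LINE_TABLE = [
    (0x1000, "main.c", 3),
    (0x1008, "main.c", 4),
    (0x1014, "util.h", 1),
    (0x1020, "main.c", 5),
]
STARTS = [start for start, _file, _line in LINE_TABLE]


def lookup(pc):
    """Return (file, line) for the entry covering `pc`, or None if before the table."""
    index = bisect.bisect_right(STARTS, pc) - 1
    if index < 0:
        return None
    _start, file, line = LINE_TABLE[index]
    return file, line


for pc in (0x1000, 0x100C, 0x1014):
    print(hex(pc), lookup(pc))
```

Note that nothing in such a table mentions macros: if several instructions came from one macro invocation, they simply all map to the line of that invocation.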

Dynamic languages typically execute in a virtual machine, whether it is an interpreter or a bytecode VM. It is the VM that provides hooks to allow a debugger to control program flow and inspect program state.

donaldh 2010-07-09 21:47:34
