I keep wondering: how does a debugger work? Particularly one that can be 'attached' to an already running executable. I understand that the compiler translates code to machine language, but then how does the debugger 'know' what it is being attached to?

+19  A: 

The details of how a debugger works will depend on what you are debugging and what the OS is. For native debugging on Windows you can find some details on MSDN: Win32 Debugging API

The user tells the debugger which process to attach to, either by name or by process ID. If it is a name, the debugger looks up the process ID and initiates the debug session via a system call; under Windows this would be DebugActiveProcess.

Once attached, the debugger enters an event loop much like that of any UI, but instead of events coming from the windowing system, the OS generates events based on what happens in the process being debugged, for example an exception occurring. See WaitForDebugEvent.

The debugger is able to read and write the target process's virtual memory, and even adjust its register values, through APIs provided by the OS. See the list of debugging functions for Windows.

The debugger is able to use information from symbol files to translate from addresses to variable names and locations in the source code. Symbol files are read through a separate set of APIs that isn't a core part of the OS as such. On Windows this is the Debug Interface Access SDK.
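To make the address-to-name translation concrete, here is a minimal sketch in Python. It does not use the Debug Interface Access SDK; the symbol table, the addresses, and the `symbolize` helper are all made up for illustration. Real symbol files also carry sizes, types, and line tables. The idea is simply to find the last symbol that starts at or before the address.

```python
import bisect

# Hypothetical symbol table: (start address, function name, source location).
# A real debugger would load this from a symbol file such as a PDB.
SYMBOLS = [
    (0x401000, "main", "app.c:10"),
    (0x401080, "parse_args", "app.c:42"),
    (0x401200, "compute", "math.c:7"),
]
STARTS = [start for start, _, _ in SYMBOLS]  # sorted start addresses


def symbolize(address):
    """Map a raw code address to 'function+offset (file:line)', or None."""
    # Index of the last symbol whose start address is <= address.
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    start, name, location = SYMBOLS[index]
    return f"{name}+0x{address - start:x} ({location})"
```

With this table, `symbolize(0x401094)` gives `parse_args+0x14 (app.c:42)`, which is roughly the kind of line a debugger prints in a stack trace.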

If you are debugging a managed environment (.NET, Java, etc.), the process will typically look similar, but the details are different, as the virtual machine environment provides the debug API rather than the underlying OS.

Rob Walker
+2  A: 

My understanding is that when you compile an app or DLL, whatever it compiles to contains symbols representing the functions and the variables. When you have a Debug build, these symbols are far more detailed than when it's a Release build, thus allowing the debugger to give you more information. When you attach the debugger to a process, it looks at which functions are currently executing and resolves all the available debugging symbols from there (since it knows what the internals of the compiled file look like, it can ascertain what might be in memory: the contents of ints, floats, strings, etc.). As the first poster said, this information and how these symbols work depend greatly on the environment and the language.

DavidG
+5  A: 

If you're on a Windows OS, a great resource for this would be "Debugging Applications for Microsoft .NET and Microsoft Windows" by John Robbins:

(or even the older edition: "Debugging Applications")

The book has a chapter on how a debugger works that includes code for a couple of simple (but working) debuggers.

Since I'm not familiar with the details of Unix/Linux debugging, this stuff may not apply at all to other OSes. But I'd guess that, as an introduction to a very complex subject, the concepts (if not the details and APIs) should 'port' to most any OS.

Michael Burr
+9  A: 

In Linux, debugging a process begins with the ptrace(2) system call. This article has a great tutorial on how to use ptrace to implement some simple debugging constructs.
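As a rough illustration of attaching to an already running process, here is a minimal Linux-only sketch that calls ptrace through Python's ctypes. It is not taken from the article. The request numbers are the Linux values from `<sys/ptrace.h>`, and the helper names (`ptrace`, `attach_and_detach`, `demo`) are made up for this example. Attaching can fail with a permission error depending on system settings, so treat this as a sketch under those assumptions.

```python
import ctypes
import os
import signal
import time

# Request numbers from <sys/ptrace.h> on Linux.
PTRACE_ATTACH = 16
PTRACE_DETACH = 17

libc = ctypes.CDLL(None, use_errno=True)
libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_long,
                        ctypes.c_void_p, ctypes.c_void_p]
libc.ptrace.restype = ctypes.c_long


def ptrace(request, pid):
    """Thin wrapper (hypothetical helper) that raises OSError on failure."""
    if libc.ptrace(request, pid, None, None) == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def attach_and_detach(pid):
    """Attach to a running process, wait for it to stop, then let it go.

    Returns the signal that stopped the target (normally SIGSTOP).
    """
    ptrace(PTRACE_ATTACH, pid)          # kernel sends SIGSTOP to the target
    _, status = os.waitpid(pid, 0)      # block until the stop is reported
    stop_signal = os.WSTOPSIG(status) if os.WIFSTOPPED(status) else None
    # While stopped, a real debugger would inspect memory and registers here.
    ptrace(PTRACE_DETACH, pid)          # resume the target, untraced
    return stop_signal


def demo():
    """Spawn a sleeping child and attach to it as if it were any process."""
    child = os.fork()
    if child == 0:
        time.sleep(30)                  # stand-in for a long-running program
        os._exit(0)
    try:
        return attach_and_detach(child)
    finally:
        os.kill(child, signal.SIGKILL)  # clean up the demo child
        os.waitpid(child, 0)
```

Calling `demo()` should return `signal.SIGSTOP`: the attach request stops the target, and the tracer learns about the stop through `waitpid`. The demo attaches to its own child because many Linux systems restrict attaching to unrelated processes.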

Adam Rosenfield
@Adam Rosenfield: Does the `(2)` tell us something more (or less) than "ptrace is a system call"?
Lazer
@eSKay, no not really. The `(2)` is the manual section number. See http://en.wikipedia.org/wiki/Man_page#Manual_sections for a description of the manual sections.
Adam Rosenfield
@Adam Rosenfield: thanks!
Lazer