Monday, May 07, 2012

How a debugger works

Debuggers are one of those parts of computing that seem like magic. I've often wondered how they work, so I set out to find out. What follows is what I learned about the Windows user-mode debugger. Kernel-mode debugging is similar, but a bit more complicated. I imagine other OSs work in a similar way, but I don't really know.

Your standard debugger runs in a separate process from the debuggee, so it must have some special privileges over the debuggee process. While the debugged process is running, the debugger does nothing and execution proceeds as usual.

Debugging on Windows basically works via Windows exceptions, that is, the structured exception handling (SEH) mechanism. As an aside, I never realised that C++ exceptions are built on top of SEH (at least with Microsoft's compiler); I always assumed they just used jumps. No wonder they are expensive. Anyway, when the debugger attaches to the debugged process, the OS (Windows) gives it the first look at every exception in the debuggee: this is first-chance exception handling. Any exception raised by the debuggee is reported to the debugger before the debuggee's own handlers see it, and the debugger can do with it as it pleases. Usually it will print a notification and pass the exception on to the debuggee. The debuggee can then run its usual exception handling procedure.

If the debuggee doesn't catch the exception, which would usually cause a crash (or at least revert to OS handling), then the debugger gets another go with the exception, called second-chance exception handling. Usually, the debugger will stop execution and let the user debug the 'crash'.
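
To make the first-chance / second-chance ordering concrete, here is a toy model in Python. It is only a sketch of the flow described above: nothing in it talks to Windows, and the names (Debuggee, Debugger, dispatch) and the string exception codes are all made up for illustration.

```python
# Toy model of first-chance / second-chance exception delivery.
# Nothing here calls Windows; every name is hypothetical.

class Debuggee:
    def __init__(self, handled_codes):
        # Exception codes the program's own handlers can deal with.
        self.handled_codes = set(handled_codes)

    def try_handle(self, code):
        return code in self.handled_codes


class Debugger:
    def __init__(self):
        self.log = []  # (chance, code) pairs, in the order they were seen

    def on_exception(self, code, first_chance):
        chance = "first" if first_chance else "second"
        self.log.append((chance, code))


def dispatch(code, debuggee, debugger):
    """Deliver one exception in the order the post describes."""
    # 1. The debugger sees the exception first and passes it on.
    debugger.on_exception(code, first_chance=True)
    # 2. The debuggee runs its usual exception handling.
    if debuggee.try_handle(code):
        return "handled by debuggee"
    # 3. Nobody caught it: the debugger gets a second chance and stops.
    debugger.on_exception(code, first_chance=False)
    return "stopped in debugger"
```

With a debuggee that can handle "file_not_found" but not "access_violation", the first exception is logged once and handled by the program, while the second is logged twice and ends with the debugger in control.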

Another, longer, aside: this gets complicated if the debuggee has some kind of crash reporter for dealing with unhandled exceptions. In normal execution, any unhandled exception is handled by the unhandled exception filter, which can be set with a system call (SetUnhandledExceptionFilter). But when a debugger is attached, this filter is not called, so the debuggee never gets its last-chance exception handler run. Instead the debugger catches the exception, as described above.

There is a special set of exceptions which are used only by the debugger. These are caught by the debugger process in the first-chance exception handler and not passed on to the debuggee process. Depending on the exception, the debugger takes some action.
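
As a rough sketch of that decision, the debugger's first-chance handler can be pictured as a switch on the exception code. The two STATUS_ values below are the real Windows codes for breakpoint and single-step exceptions; the function name and the returned strings are just my own illustration, not any real API.

```python
# Sketch of a first-chance handler that keeps debugger-only exceptions
# for itself. The function and its return strings are hypothetical.

STATUS_BREAKPOINT = 0x80000003   # raised when 'int 3' executes
STATUS_SINGLE_STEP = 0x80000004  # raised after a single-stepped instruction


def first_chance(code):
    """Return what the toy debugger does with a first-chance exception."""
    if code == STATUS_BREAKPOINT:
        return "consume: stop at breakpoint"
    if code == STATUS_SINGLE_STEP:
        return "consume: step finished"
    # Anything else belongs to the program being debugged.
    return "pass on to debuggee"
```

An ordinary exception such as an access violation (0xC0000005) falls through to the last line and is handed to the debuggee as usual.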

If the debugger wants to break into a running debuggee, a new thread is created in the debuggee process which just executes the debug break CPU instruction (int 3, apparently). The OS handles the resulting trap and raises one of those special exceptions (STATUS_BREAKPOINT), which is caught by the debugger. While the debugger deals with the event, every thread in the debuggee is kept suspended, and voilà: a break has happened and the debugger is in control.

Breakpoints work similarly. When you set a breakpoint on a line of code, the debugger finds the corresponding instruction (using the debug information emitted by the compiler) and overwrites its first byte with that 'int 3' instruction, remembering the original byte. When execution reaches that address, the 'int 3' runs in place of the original instruction and the same thing happens as above. When the user resumes, the debugger restores the original byte and moves the instruction pointer back to the start of the instruction, so that the real instruction actually gets executed. If the breakpoint is to stay in place, the debugger single steps over that one instruction (see next), so that control comes straight back to it and it can replace the instruction with 'int 3' again. Single stepping is then turned off and control is properly returned to the debuggee.
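
The patch, restore and re-arm dance is easier to follow in code. This is a minimal simulation, assuming the debuggee's code is just a bytearray; a real debugger would read and write the other process's memory instead. The BreakpointTable class and its method names are hypothetical. The only real-world detail is that 'int 3' is the single byte 0xCC on x86.

```python
# Simulation of software breakpoints on a bytearray standing in for
# the debuggee's code. Class and method names are hypothetical.

INT3 = 0xCC  # one-byte x86 encoding of 'int 3'


class BreakpointTable:
    def __init__(self, memory):
        self.memory = memory  # pretend debuggee code
        self.saved = {}       # address -> original byte

    def set(self, address):
        """Patch 'int 3' over the instruction, remembering the original."""
        if address not in self.saved:
            self.saved[address] = self.memory[address]
            self.memory[address] = INT3

    def on_hit(self, ip_after_trap):
        """Handle a breakpoint trap; return the address to resume at."""
        address = ip_after_trap - 1                 # 'int 3' is one byte long
        self.memory[address] = self.saved[address]  # put the real byte back
        return address                              # rewind to re-run it

    def rearm(self, address):
        """After single-stepping the restored instruction, patch again."""
        self.memory[address] = INT3
```

For example, with code = bytearray([0x90, 0x55, 0x90, 0xC3]) (arbitrary stand-in bytes), calling set(1) turns code[1] into 0xCC, on_hit(2) restores 0x55 and says to resume at address 1, and rearm(1) puts the 0xCC back once the original instruction has been stepped over.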

To single step, a CPU flag (the trap flag) is set which causes the 'int 1' interrupt to occur after every instruction. Control then follows the same path as for 'int 3', above. To step over a whole source line, the debugger keeps returning control to the debuggee until all the instructions that correspond to a single C++ line (or whatever language and concept of 'single step' is being debugged) have executed. At the end of a step, execution is paused as for a break. When the debugger is done single stepping, the CPU flag is cleared and control is transferred back to the debuggee.
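
Here is a small sketch of how a source-level step can be built out of instruction-level steps. It assumes straight-line code with no jumps, and it stands in for real debug info with a plain list mapping each instruction to its source line. The function name and that list are my own invention for illustration.

```python
# Toy source-line stepping built from single instruction steps.
# Assumes straight-line code; line_of is a stand-in for debug info.

def step_source_line(line_of, ip):
    """Single-step until the source line changes.

    line_of[i] is the source line that instruction i was compiled from.
    Returns (new_ip, number_of_single_step_traps_taken).
    """
    start_line = line_of[ip]
    trap_flag = True           # "set the CPU flag"
    traps = 0
    while trap_flag:
        ip += 1                # the CPU executes exactly one instruction...
        traps += 1             # ...and then the single-step trap fires
        if ip >= len(line_of) or line_of[ip] != start_line:
            trap_flag = False  # the step is finished: clear the flag
    return ip, traps
```

If line_of is [10, 10, 10, 11, 11, 12], stepping from instruction 0 takes three traps and stops at instruction 3, the first instruction of line 11.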
