The Mangy Yak

Friday, November 03, 2006

On implementing symbolic debuggers for languages that compile to bytecode for a closed-source VM

That's what I want to do. How do you go about implementing a symbolic (aka source-level) debugger for a language that has an open source compiler and gets compiled to bytecode for some virtual machine? If you're using the Java Virtual Machine, that's a piece of cake. You just have to modify the compiler so that it generates the correct mapping between source code and bytecode in a format that is meaningful to the Java Platform Debugger Architecture (JPDA). And bang! All of a sudden, you have support for breakpoints, stepping, and remote debugging - just like that, out of the blue. It might not be as easy if you have some weird language whose control flow doesn't map very well onto that of the bytecode, but it's a feasible approach for many languages.
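To make "mapping between source code and bytecode" a bit more concrete, here is a minimal sketch of the kind of table involved. It is modeled loosely on the JVM's LineNumberTable attribute, where each entry says "the code starting at this bytecode offset belongs to this source line". The class and method names (LineTable, add, lineFor) are made up for illustration and are not part of any real API.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a per-method line table; all names are illustrative. */
class LineTable {
    // Start offset of a bytecode range -> source line that produced it.
    private final TreeMap<Integer, Integer> startPcToLine = new TreeMap<>();

    /** Records that the bytecode starting at startPc came from the given source line. */
    void add(int startPc, int line) {
        startPcToLine.put(startPc, line);
    }

    /** Returns the source line covering the given bytecode offset, or -1 if none does. */
    int lineFor(int pc) {
        // The covering entry is the one with the greatest start offset <= pc.
        Map.Entry<Integer, Integer> entry = startPcToLine.floorEntry(pc);
        return entry == null ? -1 : entry.getValue();
    }
}
```

A debugger uses a table like this in both directions: from a line to an offset when setting a breakpoint, and from an offset back to a line when reporting where execution stopped.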

Now, the approach works because the class file format is open, because the debugger protocol is open, and because the format of the debugging information is open - you can use all of these to your advantage. Suppose, however, that the class file format were open but the debugger protocol were not, and that you didn't have something like the Java Debug Interface to alleviate your pain. Also, suppose that the VM were proprietary and you knew close to nothing about it. Well, that's where I stand. When you have something like that, your only option is to add support for debugging at the bytecode level. Now the question: How do you implement a symbolic debugger at the bytecode level, with no help from the VM at all?

The problem with bytecodes is that they are, from the VM's point of view, application-level code. This means you're pretty much restricted to what applications are allowed to do, which might not be enough when you're trying to build a debugger (unless you're using Smalltalk, LISP, or some other environment with non-mainstream reflective capabilities). Still, I have to build such a debugger - a source-level debugger for a domain-specific language which is based on a closed-source virtual machine with less-than-adequate reflective capabilities.
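For illustration only, here is one generic way a debugger can live entirely in application-level code: the compiler emits a call to a small runtime hook before the code of every source line, and the hook decides whether to stop. This is a sketch of a well-known instrumentation technique, not necessarily one of the ideas I end up trying, and it is written in Java purely for readability (the VM in question is a different one). DebugHook, Listener, atLine and the rest are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical runtime hook that instrumented code calls; all names are illustrative. */
final class DebugHook {
    /** Called when execution reaches a line that has a breakpoint. */
    interface Listener {
        void onBreak(String file, int line);
    }

    private static final Set<String> breakpoints = new HashSet<>();
    private static Listener listener = (file, line) -> {};

    static synchronized void setListener(Listener l) { listener = l; }

    static synchronized void setBreakpoint(String file, int line) {
        breakpoints.add(file + ":" + line);
    }

    static synchronized void clearBreakpoint(String file, int line) {
        breakpoints.remove(file + ":" + line);
    }

    /** The compiler would emit a call to this before the code of each source line. */
    static void atLine(String file, int line) {
        Listener l;
        synchronized (DebugHook.class) {
            if (!breakpoints.contains(file + ":" + line)) {
                return; // common case: no breakpoint here, keep running
            }
            l = listener;
        }
        // Runs on the program's own thread, so the program stays stopped
        // until the listener (say, a debugger console) returns.
        l.onBreak(file, line);
    }
}
```

Note the cost: every executed source line pays for a method call and a set lookup even when no breakpoint is set, which is exactly the kind of performance hit that worries me about doing this without VM support.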

Sounds like fun. :-) I have a couple of ideas in mind, and all of them seem to impact performance badly. I'll have to try them out one by one, and I promise I'll blog about them (not that anyone cares). :-P