Friday, April 20, 2007

Workstation 6.0 and the death of irreproducible bugs

Have you ever dealt with an irreproducible bug? The one that hits once in a blue moon and hides when you try to use any debugging tools? Well, since we also get them in VMware, we decided to do something about it. So we combined the gdb support in Workstation 6.0 with the Record/Replay. The result allows you to record the execution triggering the bug and then debug it with gdb as many times as you want, each time getting 100% reproducibility.

You can use this feature to debug Linux kernel or Linux processes. I'll start with the kernel as it requires less preparation. Download latest build of Workstation here (or get evaluation copy). Add a line enabling debugger connection as described in previous post:

debugStub.listen.guest32=1

Prepare VM for recording (VM > Settings > Options > Snapshot/Replay > Enable execution record and replay). Record VM execution while reproducing a bug. Replay the recording, attach gdb and follow the execution of VM. All the usual gdb features work in Replay mode - breakpoints, ^C, single step, memory inspection, debugging with symbols, etc. One difference is that to preserve determinism debugger won't allow memory or register modifications.

When reproducing a bug you may want to skip the recording up to the point just before things go wrong. We added a few features to help with that. First, we made replay fast by making time run faster. You can increase and decrease the pace of time by using left and right arrows during replay or using this configuration file line (bigger argument - slower replay):

replay.halt_delay = 1000

We also added a command to indicate a recording position:

(gdb) monitor position

and a command that activates a breakpoint at a given position:

(gdb) monitor stopat 10000

For example, when debugging Linux driver issues, I set a breakpoint in die() and similar functions, replay it once and use "monitor position". Then I replay again and use "monitor stopat" at position a few thousand units before die() and step from there.

You can also make debugger jump forward by several units using incremental form of stopat:

(gdb) monitor stopat +100
(gdb) continue

The caveats. Some devices are not supported; no support for 64-bit or SMP. Recording slows down virtual machine (a little) and requires disk space, but not by that much. This feature is experimental but we'll be happy to hear from you if you need help.

Next, I am going to post about application debugging. Debugger lives outside of the virtual machine, so to debug the processes it needs some information about the kernel you are using.

Tuesday, April 17, 2007

Debugging Linux kernels with Workstation 6.0

We just quietly added an exciting feature to Workstation 6.0. I believe it will make WS6 a great tool for Linux kernel development. You can now use gdb on your host to debug the Linux kernel running inside the VM. No kdb, no recompiling and no need for second machine. All you need is a single line in VM's configuration file.

To use the new feature, grab the latest build of Workstation here, or free 30-day evaluation here. Put this line into configuration file of your Linux VM:

debugStub.listen.guest32=1

Now whenever you run the virtual machine, you'll see the following in the vmware.log file (debug builds will also print this message to Host console):

VMware Workstation is listening for debug connection on port 8832.

Run gdb on the Host, reference it to the kernel with symbols and attach to the virtual machine:

% gdb
(gdb) file vmlinux-2.4.21-27.EL.debug
(gdb) target remote localhost:8832

That's it. The VM is blocked now, so you can "continue" it and "^C" back to gdb. Breakpoints, single step, memory inspection - all this works as usual. If you have SMP VM, then each VCPU is mapped on a thread, so use "info threads" and "thread NN" to switch between them.

Debugging the 64-bit kernel works in the same way, except you need to use a different option:

debugStub.listen.guest64=1

and connect to port 8864. Since gdb starts in 32-bit mode by default, you may also need to switch it to i386:x64-64 before connecting:

(gdb) set architecture i386:x86-64
(gdb) target remote localhost:8864

The kernels with symbols are sadly lacking on most distributions, but if you use RHEL then this website may help (look for kernel-debuginfo rpm):

http://people.redhat.com/duffy/debuginfo/index-js.html

The gdb support in WS6 is experimental, so there may be rough edges here and there. Please post on community forums if something doesn't work right or if you have a suggestion:

http://communities.vmware.com/community/vmtn/general/guestdebugmonitor

There are more debugging specific features in WS6 (for example, you can use gdb hand-in-hand with Record/Replay!). I will describe them shortly.

Updated 4/20/07: added explanation of 64-bit support.
Updated 5/14/07: release build prints "waiting for gdb" message into vmware.log only.
Updated 7/24/07: pointers to new build and discussion forum.