Friday, April 20, 2007

Workstation 6.0 and the death of irreproducible bugs

Have you ever dealt with an irreproducible bug? The one that hits once in a blue moon and hides when you try to use any debugging tools? Well, since we also get them in VMware, we decided to do something about it. So we combined the gdb support in Workstation 6.0 with the Record/Replay. The result allows you to record the execution triggering the bug and then debug it with gdb as many times as you want, each time getting 100% reproducibility.

You can use this feature to debug Linux kernel or Linux processes. I'll start with the kernel as it requires less preparation. Download latest build of Workstation here (or get evaluation copy). Add a line enabling debugger connection as described in previous post:

debugStub.listen.guest32=1

Prepare VM for recording (VM > Settings > Options > Snapshot/Replay > Enable execution record and replay). Record VM execution while reproducing a bug. Replay the recording, attach gdb and follow the execution of VM. All the usual gdb features work in Replay mode - breakpoints, ^C, single step, memory inspection, debugging with symbols, etc. One difference is that to preserve determinism debugger won't allow memory or register modifications.

When reproducing a bug you may want to skip the recording up to the point just before things go wrong. We added a few features to help with that. First, we made replay fast by making time run faster. You can increase and decrease the pace of time by using left and right arrows during replay or using this configuration file line (bigger argument - slower replay):

replay.halt_delay = 1000

We also added a command to indicate a recording position:

(gdb) monitor position

and a command that activates a breakpoint at a given position:

(gdb) monitor stopat 10000

For example, when debugging Linux driver issues, I set a breakpoint in die() and similar functions, replay it once and use "monitor position". Then I replay again and use "monitor stopat" at position a few thousand units before die() and step from there.

You can also make debugger jump forward by several units using incremental form of stopat:

(gdb) monitor stopat +100
(gdb) continue

The caveats. Some devices are not supported; no support for 64-bit or SMP. Recording slows down virtual machine (a little) and requires disk space, but not by that much. This feature is experimental but we'll be happy to hear from you if you need help.

Next, I am going to post about application debugging. Debugger lives outside of the virtual machine, so to debug the processes it needs some information about the kernel you are using.

1 comment:

Slava said...

I had to disable additional comments due to spam. Please post your comments and suggestions at our forum:

http://communities.vmware.com/community/vmtn/general/guestdebugmonitor

Thank you.