Tuesday, April 17, 2007

Debugging Linux kernels with Workstation 6.0

We just quietly added an exciting feature to Workstation 6.0. I believe it will make WS6 a great tool for Linux kernel development. You can now use gdb on your host to debug the Linux kernel running inside the VM. No kdb, no recompiling, and no need for a second machine. All you need is a single line in the VM's configuration file.

To use the new feature, grab the latest build of Workstation here, or a free 30-day evaluation here. Put this line into the configuration file of your Linux VM:

debugStub.listen.guest32=1

Now whenever you run the virtual machine, you'll see the following in the vmware.log file (debug builds will also print this message to the Host console):

VMware Workstation is listening for debug connection on port 8832.
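If you script your debugging setup, you may want to pull the port number out of vmware.log instead of hard-coding it. The following is a minimal sketch, assuming the log line keeps the wording quoted above; the function name is hypothetical and not part of any VMware tool.

```python
import re

# Pattern for the log line quoted in the article. The exact wording may
# differ between builds, so treat this as an assumption.
LISTEN_RE = re.compile(r"listening for debug connection on port (\d+)")

def find_debug_port(log_text):
    """Return the debug stub port mentioned in the log text, or None if absent."""
    match = LISTEN_RE.search(log_text)
    return int(match.group(1)) if match else None

sample = "VMware Workstation is listening for debug connection on port 8832.\n"
print(find_debug_port(sample))
```

In practice you would pass the contents of the VM's vmware.log, for example open(path).read(), instead of the sample string.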

Run gdb on the Host, point it at the kernel image with symbols, and attach to the virtual machine:

% gdb
(gdb) file vmlinux-2.4.21-27.EL.debug
(gdb) target remote localhost:8832

That's it. Once gdb attaches, the VM is stopped, so you can "continue" it and "^C" back to gdb. Breakpoints, single-stepping, memory inspection: all of this works as usual. If you have an SMP VM, each VCPU is mapped to a thread, so use "info threads" and "thread NN" to switch between them.

Debugging the 64-bit kernel works in the same way, except you need to use a different option:

debugStub.listen.guest64=1

and connect to port 8864. Since gdb starts in 32-bit mode by default, you may also need to switch it to i386:x86-64 before connecting:

(gdb) set architecture i386:x86-64
(gdb) target remote localhost:8864
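The 32-bit and 64-bit cases differ only in the option name, the port, and the architecture switch. Here is a small sketch that collects the values from this article in one place. The helper name and the table are illustrative, and the symbol file name is whatever kernel image you have.

```python
# Values taken from the article: the .vmx option, the port, and the gdb
# architecture to select before connecting (None means gdb's default is fine).
STUB_SETTINGS = {
    32: ("debugStub.listen.guest32=1", 8832, None),
    64: ("debugStub.listen.guest64=1", 8864, "i386:x86-64"),
}

def debug_setup(bits, symbol_file):
    """Return (vmx_line, gdb_commands) for a 32-bit or 64-bit guest."""
    vmx_line, port, arch = STUB_SETTINGS[bits]
    commands = ["file " + symbol_file]
    if arch is not None:
        commands.append("set architecture " + arch)
    commands.append("target remote localhost:%d" % port)
    return vmx_line, commands

vmx_line, commands = debug_setup(64, "vmlinux")
print(vmx_line)
print("\n".join(commands))
```

The printed gdb commands can be saved to a file and run with gdb -x.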

Kernels with symbols are sadly lacking on most distributions, but if you use RHEL, this website may help (look for the kernel-debuginfo rpm):

http://people.redhat.com/duffy/debuginfo/index-js.html

The gdb support in WS6 is experimental, so there may be rough edges here and there. Please post on the community forums if something doesn't work right or if you have a suggestion:

http://communities.vmware.com/community/vmtn/general/guestdebugmonitor

There are more debugging-specific features in WS6 (for example, you can use gdb hand in hand with Record/Replay!). I will describe them shortly.

Updated 4/20/07: added explanation of 64-bit support.
Updated 5/14/07: release build prints "waiting for gdb" message into vmware.log only.
Updated 7/24/07: pointers to new build and discussion forum.

29 comments:

sg said...

Interesting. I tried this and it seems to work, I'm using RH72 kernel. This is pretty cool.

Manu said...

Geez!! This is cool.. I'll try it and get back.

Nigel Cunningham said...

I tried with a 64 bit kernel and got:

(gdb) target remote localhost:8832
Remote debugging using localhost:8832
Ignoring packet error, continuing...
warning: unrecognized item "timeout" in "qSupported" response
Ignoring packet error, continuing...
Ignoring packet error, continuing...
Ignoring packet error, continuing...
Malformed response to offset query, timeout
(gdb)

Any hints?

Nigel

Slava said...

Hi Nigel, thanks for your comment!

Please try this option instead of debugStub.listen.guest32:

debugStub.listen.guest64=1

and connect to port 8864:

set architecture i386:x86-64
target remote localhost:8864

The gdb connection supports both 32-bit and 64-bit guests, but since the remote protocols are different, we use different options and ports.

pacifist said...

My company has its own internal debugger that targets 32-bit Windows, 64-bit Windows (Itanium and x64), linux, IBM Z/OS, and several other platforms. Will you be publishing this interface at some point so we can add support to our debugger?

Slava said...

Hi pacifist. WS6 uses the gdb remote protocol:

http://sourceware.org/gdb/onlinedocs/gdb_33.html
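For anyone writing their own client, the framing layer of the gdb remote protocol is simple: each packet is sent as $, the payload, #, and a two-digit hex checksum equal to the sum of the payload bytes modulo 256. Below is a minimal sketch of that framing only. It ignores the protocol's escaping and run-length encoding rules, and the function names are my own.

```python
def frame_packet(payload):
    """Wrap a payload as $<payload>#<checksum> (checksum = byte sum mod 256)."""
    checksum = sum(payload.encode("ascii")) % 256
    return "$%s#%02x" % (payload, checksum)

def unframe_packet(packet):
    """Check the framing and checksum of a packet and return its payload."""
    if len(packet) < 4 or not packet.startswith("$") or packet[-3] != "#":
        raise ValueError("not a framed packet")
    payload, received = packet[1:-3], packet[-2:]
    if frame_packet(payload)[-2:] != received.lower():
        raise ValueError("checksum mismatch")
    return payload

print(frame_packet("g"))         # request to read registers
print(unframe_packet("$OK#9a"))  # a typical reply
```

A real client also has to handle the + and - acknowledgement characters that surround each packet.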

I'd like to talk to you about the debugger your company is developing. How can I contact you?

Manish said...

I tried this option in WS6 on both Windows and Linux hosts, but I don't get the waiting message and the VM just boots normally without waiting.

What am I missing?

Manish said...

I forgot to mention, I am using the GA version (#45731) of WS6 on both Windows and Linux hosts.

Manish said...

BTW, I am running RedHat Enterprise Linux v5.0 in the guest.

Slava said...

Manish,

Thanks for noticing this! The release build will not print the message to the console, but you can find the message in the vmware.log file, and you will be able to attach to the running VM with gdb running on the Host. I just double-checked that with WS60 GA build 45731.

The VM waits for the debugger asynchronously, so the VM will run as usual until the debugger attaches. Let me know if this is inconvenient.

Once again, thank you for the comment. I will update the article.

P.S. RHEL5 uses a dynamically relocatable kernel. You may need to tell gdb where its sections are loaded to make symbols match.

Manish said...

Hi Slava,

It did work on both Windows and Linux hosts, as you said, without waiting. But that's good. If you can provide a tunable parameter that determines whether the VM is going to block or not, that might be useful.

Some more questions.

Is this feature going to be available with 'Fusion' on Mac OS X?
Does this feature work with a product like 'Insight' which is a GUI front end for gdb?

Thanks for your help

Manish

olivier said...

Is it possible to debug Windows or the Windows kernel with this functionality?

Slava said...

Olivier,

Yes and no. Since WS6 supports gdb as an external debugger, you can debug the Windows kernel at the instruction level, but symbol-level debugging is not available.

orenl said...

This is indeed very useful!

I tried it already and managed to work with a running Linux kernel. As I work a lot with loadable modules, I would like to be able to debug code in modules. However, I couldn't figure out how to convince gdb to load symbols for loadable modules. Is it possible, and if so, how?

Thanks.. Oren.

Bradley Schatz said...

Do you have hardware assisted breakpoints working? They appear to silently fail for me.

Taking an alternative route, I have also tried manually inserting an INT3 instruction, but this causes a kernel stack fault.

Slava said...

To orenl: I haven't tried it with Linux modules, but generally you need to tell gdb the location of the sections of the module you are interested in. The gdb "add-symbol-file" command does this in addition to providing gdb with symbols. Depending on your kernel version, you can get section information from "insmod -m", /proc/ksyms, etc.
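As a rough illustration of the add-symbol-file step, here is a sketch that builds the command from a table of section addresses. The module name and addresses are made up. On many 2.6 kernels the real values can be read inside the guest from /sys/module/<name>/sections/, which is an assumption on my part and is not mentioned above.

```python
def add_symbol_file_command(module_path, sections):
    """Build a gdb add-symbol-file command from a {section_name: address} mapping.

    The .text address is passed positionally; every other section is
    passed with -s <section> <address>.
    """
    parts = ["add-symbol-file", module_path, "0x%x" % sections[".text"]]
    for name in sorted(sections):
        if name != ".text":
            parts.append("-s %s 0x%x" % (name, sections[name]))
    return " ".join(parts)

# Hypothetical module and load addresses, for illustration only.
example_sections = {".text": 0xF8A1B000, ".data": 0xF8A1D000, ".bss": 0xF8A1D400}
print(add_symbol_file_command("mymod.ko", example_sections))
```

Paste the printed line at the (gdb) prompt after the module has been loaded in the guest, since the addresses change on every load.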

Slava said...

To Bradley: hardware breakpoints should work in WS6 RC2 or later. gdb didn't know about the int3 that you inserted manually, so it passed it to the Guest OS, which couldn't handle it.

I suspect you have hit a bug. I would like to ask you a few questions, but blogspot comments are not really suitable for discussions. Could you please repost the question in Community Forums? I will follow up there. Thank you!

Slava said...

Hi,

Replay debugging now has a dedicated Community Forum section. I would appreciate it if additional comments were posted there, so that other engineers and customers can participate in the discussion too. Thank you.

Ben said...

Hello,

I'm writing a Network Boot Program running in the PXE environment. How do I use VMWare's Record/Replay to debug it?

Thanks

Mrunal Gawade said...

Hi Slava,

I downloaded a 30-day evaluation version for Linux, namely 6.0.3 build 80004. I put the line you specified in the .vmx file and then rebooted the OS, but I do not see the vmware.log message you mentioned. I am running OpenSuse 10.1 with kernel 2.6.18.2-34. I also have a question about the kernel symbol file you mention in the gdb file command. I found an rpm with the symbols for my kernel version and installed it, but I cannot find the file you use in the gdb invocation. Could you tell me more about that, such as where kernel symbol files get installed and what the file would be for the version mentioned above? I am really stuck on an issue related to a kernel panic and need the debug environment as soon as possible.

Thank you,
Mrunal

JS said...

Thank you. This is exactly what I needed.

I would like to debug a Linux kernel but from a Mac OSX machine (Linux kernel will run on a vm).

How can I do that?

Thanks!

Slava said...

I had to disable additional comments due to spam. Please post your comments and suggestions at our forum:

http://communities.vmware.com/community/vmtn/general/guestdebugmonitor

Thank you.

Slava said...

Hi JS. I have a feeling I was already asked the question about OSX in the Forums. The short answer is that Darwin gdb does not seem to like the format of vmlinux. So the easiest solution is to run two Linux VMs, one as the target and one with gdb, and attach gdb from the second VM to the first. The first VM will treat this as a remote gdb connection, so you'll have to add this line to the config file of the first VM (assuming the VM is 32-bit):

debugStub.listen.guest32.remote = "TRUE"

Slava said...

Hi Mrunal, and sorry for the late reply. You have asked several questions and I'd need more information about your environment to diagnose the issue. If it is still a problem, could you please repost in the Forums? Thanks a lot. The URL of the Forum is:

http://communities.vmware.com/community/vmtn/general/guestdebugmonitor

Slava said...

Hi Ben, thanks for your question and sorry for the late reply. Record/Replay debugging is perfect for your scenario. The way to use it is to enable recording in the VM, say, while it is still in the BIOS, and keep recording while your PXE bootloader is running. Once it manifests a bug, you stop recording and debug the recording like you'd debug a kernel issue. One caveat is that if you use 16-bit mode, you'll have to adjust linear addresses in gdb.
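To illustrate the 16-bit caveat, here is a small sketch that assumes the adjustment in question is the usual real-mode translation, where the linear address is the segment shifted left by four bits plus the offset. The function name is hypothetical.

```python
def real_mode_linear(segment, offset):
    """Linear address for a 16-bit real-mode segment:offset pair."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

# Two common ways of addressing the same boot-sector location.
print(hex(real_mode_linear(0x07C0, 0x0000)))
print(hex(real_mode_linear(0x0000, 0x7C00)))
```

The resulting value is what you would use in gdb commands that take an address, for example when setting a breakpoint with break *ADDRESS.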

That was a summary. If you have additional questions, please post in our Forum:

http://communities.vmware.com/community/vmtn/general/guestdebugmonitor

Thanks a lot.