Tuesday, October 16, 2007

Configuring application debugging with Record/Replay

In my previous article I explained how to debug processes running in an Ubuntu 7.04 VM using the Record/Replay technology built into VMware Workstation 6.0.1. This article explains how to use Record/Replay debugging with other Linux distributions.

When debugging an application using Record/Replay, you need to run the debugger on the host, outside the virtual machine. The reason is simple: a debugger running inside the virtual machine would disturb the VM's execution, and replay would no longer be fully deterministic. The downside of running the debugger outside the VM is that it cannot use the guest kernel's services to debug processes.

We solved this problem by teaching our debugger to implement process-level debugging by traversing Linux kernel data structures. Because the Linux kernel evolves rapidly, the layout of these data structures changes quite frequently. This is why we require users to tell us the offsets of some kernel data structure fields with the "monitor linuxoffsets" command. Here is an example of this command for Ubuntu 7.04:

(gdb) monitor linuxoffsets 0x20614,0x80,0,0x68,0x194,0xa4,0x1b0,0x24,0x18,0x28,0x2000,0xc4,0xec,0x10

This line may look cryptic, but its semantics are quite simple. You can see its format by issuing the following command in gdb:

(gdb) monitor help linuxoffsets
Informs debug stub about offsets in Linux kernel. Offsets have to be
set before other monitor commands are used. The format is:

monitor linuxoffsets [-l] <version>,<mm>,<next_task>,<tasks>, \

where each field except version, pgd, fs, threadsize and commsize
are hexadecimal offsets of the field in task_struct, pgd is offset
in mm_struct, rsp0/esp0 and fs are offsets in
thread_struct, version is kernel version and threadsize is
THREAD_SIZE. If some field does not exist, use 0. For example:

monitor linuxoffsets 0x20407,0x2c,0x48,0,0x236,0x6c,0x260,0xc,0 \

You may use getlinuxoffsets and getlinuxoffset.gdb scripts to
obtain offsets from kernel with symbols or kernel source tree.
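
To make the comma-separated values less cryptic, here is a minimal C sketch that splits an offsets line and pairs each value with a field name. The field order is inferred from the help text above and from the order in which the scripts in this article emit their values, so treat the names as an assumption rather than a documented format. parse_linuxoffsets and linuxoffsets_names are hypothetical helpers, not part of gdb or Workstation.

```c
#include <stdio.h>
#include <stdlib.h>

#define LINUXOFFSETS_FIELDS 14

/* Assumed field order, inferred from the help text and the scripts. */
static const char *const linuxoffsets_names[LINUXOFFSETS_FIELDS] = {
    "version", "mm", "next_task", "tasks", "comm", "pid", "thread",
    "pgd", "rsp0/esp0", "fs", "threadsize", "group_leader",
    "thread_group", "commsize"
};

/*
 * Parse a comma-separated list of hexadecimal values into out[].
 * Returns the number of fields parsed, or -1 if the input is malformed
 * or holds more than max fields.
 */
static int parse_linuxoffsets(const char *line, unsigned long *out, int max)
{
    const char *p = line;
    int n = 0;

    for (;;) {
        char *end;
        unsigned long v;

        if (n >= max) {
            return -1;              /* too many fields */
        }
        v = strtoul(p, &end, 16);   /* base 16 accepts an optional 0x prefix */
        if (end == p) {
            return -1;              /* empty or non-numeric field */
        }
        out[n++] = v;
        if (*end == '\0') {
            return n;               /* end of line reached */
        }
        if (*end != ',') {
            return -1;              /* unexpected separator */
        }
        p = end + 1;
    }
}

/* Print one "name = value" line per parsed field. */
static void print_linuxoffsets(const unsigned long *vals, int n)
{
    int i;

    for (i = 0; i < n && i < LINUXOFFSETS_FIELDS; i++) {
        printf("%-12s = %#lx\n", linuxoffsets_names[i], vals[i]);
    }
}
```

Run on the Ubuntu 7.04 line shown earlier, parse_linuxoffsets returns 14 fields, with version 0x20614 first and threadsize 0x2000 in the eleventh position.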

The output mentions two little scripts that can automatically compute the offsets line for you. The first one can be used if you have a Linux kernel compiled with symbols, and the second one works with a Linux source tree. Here is the first script:

------- cut here: getlinuxoffsets.gdb ------------

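# Copyright 2007 VMware, Inc. All rights reserved.
set $linuxVersion=LINUXVERSION
if (uint32_t)0
end
# OFFS <struct> <field>: print the offset of <field> within struct <struct>.
define OFFS
  printf "0x%x,", ((unsigned)&((struct $arg0 *) 0)->$arg1)
end
# The offsets line starts with the kernel version.
printf "0x%x,", $linuxVersion
OFFS task_struct mm
# Versions below 2.4.21 report next_task, later ones report tasks;
# the field that does not apply is printed as 0.
if $linuxVersion < 0x020415
  OFFS task_struct next_task
  printf "0x0,"
else
  printf "0x0,"
  OFFS task_struct tasks
end
OFFS task_struct comm
OFFS task_struct pid
OFFS task_struct thread
OFFS mm_struct pgd
# 64-bit kernels use rsp0, 32-bit kernels use esp0.
if sizeof(void *) == 0x8
  OFFS thread_struct rsp0
else
  OFFS thread_struct esp0
end
OFFS thread_struct fs
# threadsize: 0x2000 before 2.6.0, otherwise the size of the kernel stack.
if $linuxVersion < 0x020600
  printf "0x2000,"
else
  printf "0x%x,", sizeof ((union thread_union *)0)->stack
end
# group_leader and thread_group are reported only for 2.6.17 and later.
if $linuxVersion < 0x020611
  printf "0x0,0x0,"
else
  OFFS task_struct group_leader
  OFFS task_struct thread_group
end
printf "0x%x\n", sizeof ((struct task_struct *)0)->comm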

------- cut here ---------------------------------

Set LINUXVERSION to the version of the kernel you are examining. Invoke the script this way (vmlinux.dbg is a kernel image with symbols):

% gdb --quiet --command getlinuxoffsets.gdb vmlinux.dbg
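
The LINUXVERSION value packs the kernel's major, minor and patch numbers into one hexadecimal constant, the same way the kernel's KERNEL_VERSION macro does. The sketch below shows the arithmetic. The function names are hypothetical and exist only for illustration.

```c
#include <stdio.h>

/* Same packing as the kernel's KERNEL_VERSION(a, b, c) macro. */
static unsigned linux_version_code(unsigned major, unsigned minor,
                                   unsigned patch)
{
    return (major << 16) + (minor << 8) + patch;
}

/*
 * Write the constant to substitute for LINUXVERSION into buf,
 * for example "0x020609" for kernel 2.6.9.
 * Returns the value snprintf returns.
 */
static int format_linux_version(char *buf, size_t len, unsigned major,
                                unsigned minor, unsigned patch)
{
    return snprintf(buf, len, "0x%06x",
                    linux_version_code(major, minor, patch));
}
```

For kernel 2.6.9 this gives 0x020609, and the thresholds used in getlinuxoffsets.gdb (0x020415, 0x020600, 0x020611) correspond to 2.4.21, 2.6.0 and 2.6.17.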

For example, if you are dealing with uniprocessor RHEL4 AS Update 3, this sequence of steps will get you the offsets line:

# Replace LINUXVERSION with 0x020609 in getlinuxoffsets.gdb
% rpm2cpio kernel-debuginfo-2.6.9-34.EL.i686.rpm | cpio -i --make-directories
% gdb --quiet --command getlinuxoffsets.gdb \


That's it. You can feed the resulting line to "monitor linuxoffsets".
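
For intuition about why a debugger running outside the VM needs these offsets, here is a simplified C sketch that walks a circular task list in a simulated 32-bit guest memory image and collects the pids. It is not VMware's implementation: the struct, the helper names and the flat little-endian memory image are all assumptions made for illustration, and real code would also have to translate guest virtual addresses through the page tables (which is what the pgd offset is for).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the offsets line: only the fields used here. */
struct task_offsets {
    uint32_t tasks;  /* offset of the list_head "tasks" in task_struct */
    uint32_t pid;    /* offset of "pid" in task_struct */
};

/* Read a little-endian 32-bit value from the simulated guest memory. */
static uint32_t guest_read32(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr] |
           ((uint32_t)mem[addr + 1] << 8) |
           ((uint32_t)mem[addr + 2] << 16) |
           ((uint32_t)mem[addr + 3] << 24);
}

/* Write a little-endian 32-bit value (used to build test images). */
static void guest_write32(uint8_t *mem, uint32_t addr, uint32_t val)
{
    mem[addr] = (uint8_t)(val & 0xff);
    mem[addr + 1] = (uint8_t)((val >> 8) & 0xff);
    mem[addr + 2] = (uint8_t)((val >> 16) & 0xff);
    mem[addr + 3] = (uint8_t)((val >> 24) & 0xff);
}

/*
 * Walk the circular task list starting at the task_struct at "first".
 * Stores up to max pids in pids[] and returns how many were stored.
 * Stops early if an address falls outside the memory image.
 */
static size_t walk_tasks(const uint8_t *mem, size_t size,
                         const struct task_offsets *o, uint32_t first,
                         uint32_t *pids, size_t max)
{
    size_t n = 0;
    uint32_t task = first;

    while (n < max) {
        uint32_t next;

        if ((size_t)task + o->pid + 4 > size ||
            (size_t)task + o->tasks + 4 > size) {
            break;                  /* field lies outside the image */
        }
        pids[n++] = guest_read32(mem, task + o->pid);

        /*
         * list_head.next points at the "tasks" member of the next
         * task_struct, so subtract the offset to get the struct start.
         */
        next = guest_read32(mem, task + o->tasks);
        if (next < o->tasks) {
            break;                  /* corrupt link */
        }
        task = next - o->tasks;
        if (task == first) {
            break;                  /* back at the start of the circle */
        }
    }
    return n;
}
```

The only kernel-specific knowledge in walk_tasks is the pair of offsets, which is why a new kernel needs a new offsets line but no change to the debugger itself.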

Not all distributions ship kernels with symbols, however. The alternative way of obtaining the offsets line is to use the second script together with the kernel source tree. It actually consists of three files:

------- cut here: getlinuxoffsets ----------------

#!/bin/sh
# Copyright 2007 VMware, Inc. All rights reserved.
# Defaults below are assumptions: gcc, and /usr/src/linux for the tree.
if [ "$CC" = "" ]; then
   CC=gcc
fi
if [ "$1" = "" ]; then
   INCLUDE_PATH=/usr/src/linux/include
else
   INCLUDE_PATH="$1/include"
fi
$CC -c -I "$INCLUDE_PATH" -I "$INCLUDE_PATH"/asm/mach-default \
   getlinuxoffsets2.c && \
$CC -o getlinuxoffsets.tmp getlinuxoffsets1.c getlinuxoffsets2.o && \
./getlinuxoffsets.tmp && \
rm -f getlinuxoffsets.tmp getlinuxoffsets1.o getlinuxoffsets2.o


------- cut here: getlinuxoffsets1.c -------------

/* Copyright 2007 VMware, Inc. All rights reserved. */
#include <stdio.h>
extern unsigned offsets[];
extern unsigned offsets_cnt;
int
main(void)
{
   unsigned i;
   /* Print the values as one comma-separated line. */
   for (i = 0; i < offsets_cnt; i++) {
      printf("%#x%c", offsets[i], (i == offsets_cnt - 1) ? '\n' : ',');
   }
   return 0;
}


------- cut here: getlinuxoffsets2.c -------------

/* Copyright 2007 VMware, Inc. All rights reserved. */
#define __KERNEL__ 1
#define MODULE 1
#include <linux/version.h>
#include <linux/autoconf.h>
#include <linux/types.h>
#define KBUILD_BASENAME "debugstub"
#include <linux/sched.h>
#define OFFS(_st, _fld) ((unsigned)&((struct _st *)0)->_fld)
#define NELEM(_arr) (sizeof(_arr) / sizeof(_arr[0]))
/* Same field order and version cutoffs as getlinuxoffsets.gdb. */
unsigned offsets[] = {
   LINUX_VERSION_CODE,
   OFFS(task_struct, mm),
#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 4, 21)
   OFFS(task_struct, next_task),
   0,
#else
   0,
   OFFS(task_struct, tasks),
#endif
   OFFS(task_struct, comm),
   OFFS(task_struct, pid),
   OFFS(task_struct, thread),
   OFFS(mm_struct, pgd),
#if CONFIG_X86_64
   OFFS(thread_struct, rsp0),
#else
   OFFS(thread_struct, esp0),
#endif
   OFFS(thread_struct, fs),
   THREAD_SIZE,
#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 17)
   0,
   0,
#else
   OFFS(task_struct, group_leader),
   OFFS(task_struct, thread_group),
#endif
   sizeof ((struct task_struct *)0)->comm,
};
unsigned offsets_cnt = NELEM(offsets);
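
The OFFS macro computes a field offset by taking the address of a member through a null pointer. The standard offsetof macro from <stddef.h> yields the same number portably. The sketch below applies it to a small hypothetical structure (demo_task is a stand-in, not the real task_struct) and formats the values the same way getlinuxoffsets1.c prints them. OFFS_PORTABLE and format_offsets are illustrative names, not part of the scripts above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel structure; not the real task_struct. */
struct demo_task {
    long state;
    int pid;
    char comm[16];
};

/* Portable equivalent of the OFFS macro, built on offsetof. */
#define OFFS_PORTABLE(_st, _fld) ((unsigned)offsetof(struct _st, _fld))

/*
 * Format values like getlinuxoffsets1.c does: "%#x" separated by commas
 * and terminated by a newline. Returns the number of characters written,
 * or -1 if buf is too small.
 */
static int format_offsets(char *buf, size_t len, const unsigned *vals,
                          size_t cnt)
{
    size_t used = 0;
    size_t i;

    if (len == 0) {
        return -1;
    }
    buf[0] = '\0';
    for (i = 0; i < cnt; i++) {
        int n = snprintf(buf + used, len - used, "%#x%c", vals[i],
                         (i == cnt - 1) ? '\n' : ',');
        if (n < 0 || (size_t)n >= len - used) {
            return -1;              /* output would be truncated */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Note that "%#x" prints a zero value as a plain 0 without the 0x prefix, which is why the Ubuntu 7.04 offsets line contains a bare 0 for the next_task field.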


Invoke the getlinuxoffsets script as follows (provide the path to the kernel source tree if it is not /usr/src/linux):

% chmod u+x getlinuxoffsets
% ./getlinuxoffsets

Note that these scripts and the Record/Replay feature in Workstation 6.0.1 are not officially supported by VMware. If you have questions or suggestions, the best place to express them is our forum. Thank you.


Yuhong Bao said...

Why not have gdb directly get the offsets from the symbols?

Slava said...

Hi Yuhong. You are right. We could have parsed debug information in the kernel to obtain the symbols. The problem is that sometimes a kernel with debug information is not available, so we had to provide a way to get offsets without it anyway, and adding DWARF parsing to Workstation seemed like an unnecessary level of complexity at that time. The good news is that you only have to generate offsets once for a new kernel.

P.S. I had to disable comments in this blog due to spam that was added to some other posts. Sorry for the inconvenience. Please post questions and suggestions in the forum. Thank you.