Finding the source code line from a kernel stack trace

Given a kernel stack trace like the one below, how do you determine the specific line of code where the problem occurred?

    kernel: [<ffffffff80009a14>] __link_path_walk+0x173/0xfb9
    kernel: [<ffffffff8002cbec>] mntput_no_expire+0x19/0x89
    kernel: [<ffffffff8000eb94>] link_path_walk+0xa6/0xb2
    kernel: [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b
    kernel: [<ffffffff800238de>] __path_lookup_intent_open+0x56/0x97
    kernel: [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14
    kernel: [<ffffffff8001b222>] open_namei+0xea/0x712
    kernel: [<ffffffff8006723e>] do_page_fault+0x4fe/0x874
    kernel: [<ffffffff80027660>] do_filp_open+0x1c/0x38
    kernel: [<ffffffff8001a061>] do_sys_open+0x44/0xbe
    kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0

I have no trouble finding the function call itself. The hard part is translating __link_path_walk plus an offset into an actual line number.

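Assuming this is a standard distribution-provided kernel for which I know the exact version and build number, what is the process for obtaining the necessary metadata and doing the corresponding lookup?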

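Given an unstripped vmlinux with debugging symbols (typically included with the "linux-devel" or "linux-headers" packages matching your kernel version), you can use the addr2line program that ships with binutils to translate addresses into lines in source files.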

Consider this call trace:

    Call Trace:
     [<ffffffff8107bf5d>] ? finish_task_switch+0x3d/0x120
     [<ffffffff815f3130>] __schedule+0x3b0/0x9d0
     [<ffffffff815f3779>] schedule+0x29/0x70
     [<ffffffff815f2ccc>] schedule_hrtimeout_range_clock.part.24+0xdc/0xf0
     [<ffffffff81076440>] ? hrtimer_get_res+0x50/0x50
     [<ffffffff815f2c6f>] ? schedule_hrtimeout_range_clock.part.24+0x7f/0xf0
     [<ffffffff815f2cf9>] schedule_hrtimeout_range_clock+0x19/0x60
     [<ffffffff815f2d53>] schedule_hrtimeout_range+0x13/0x20
     [<ffffffff811a8aa9>] poll_schedule_timeout+0x49/0x70
     [<ffffffff811aa203>] do_sys_poll+0x423/0x550
     [<ffffffff814eaf8c>] ? sock_recvmsg+0x9c/0xd0
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811a8c50>] ? poll_select_copy_remaining+0x140/0x140
     [<ffffffff811aa3fe>] SyS_poll+0x5e/0x100
     [<ffffffff816015d2>] system_call_fastpath+0x16/0x1b

The address in the poll_select_copy_remaining entry can then be resolved to a source line with:

    $ addr2line -e /tmp/vmlinux ffffffff811a8c50
    /tmp/linux-3.15-rc8/fs/select.c:209
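When a trace has many entries, it can be convenient to pull all the addresses out at once and hand them to a single addr2line run. The following is a minimal sketch, not a definitive tool: the names `parse_trace` and `addr2line_command` are made up for illustration, and the regular expression assumes the `[<address>] symbol+0xoffset/0xsize` layout seen in the traces above. The `-f` flag (print function names) is an addition that the command above does not use.

```python
import re

# Matches entries such as "[<ffffffff811aa203>] do_sys_poll+0x423/0x550".
# An optional "?" before the symbol marks a frame the kernel is unsure about.
ENTRY = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_trace(text):
    """Return one dict per stack trace entry found in text (hypothetical helper)."""
    entries = []
    for m in ENTRY.finditer(text):
        entries.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("sym"),
            "offset": int(m.group("off"), 16),
            "size": int(m.group("size"), 16),
            "reliable": m.group("guess") is None,
        })
    return entries


def addr2line_command(vmlinux, entries):
    """Build an addr2line argument list covering every parsed address."""
    addresses = [format(e["addr"], "x") for e in entries]
    return ["addr2line", "-f", "-e", vmlinux] + addresses
```

The resulting list can be passed to `subprocess.run`. Subtracting an entry's offset from its address gives the start of the function, for example 0xffffffff815f3130 - 0x3b0 = 0xffffffff815f2d80 for `__schedule` in the trace above.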

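I don't have anything close to RHEL5 at hand, so the output shown is from Fedora 20, though the process should be mostly the same (the name of the function has changed).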

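You need to install the appropriate kernel-debug-debuginfo package for your kernel (assuming RHEL or a derivative distribution). This package provides a vmlinux image (an uncompressed, unstripped version of the kernel):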

    # rpm -ql kernel-debug-debuginfo | grep vmlinux
    /usr/lib/debug/lib/modules/3.14.7-200.fc20.x86_64+debug/vmlinux

That image can be used directly with gdb:

    # gdb /usr/lib/debug/lib/modules/3.14.7-200.fc20.x86_64+debug/vmlinux
    GNU gdb (GDB) Fedora 7.7.1-13.fc20
    Copyright (C) 2014 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    ...
    Reading symbols from /usr/lib/debug/lib/modules/3.14.7-200.fc20.x86_64+debug/vmlinux...done.
    (gdb) disassemble link_path_walk
    Dump of assembler code for function link_path_walk:
       0xffffffff81243d50 <+0>:     callq  0xffffffff817ea840 <__fentry__>
       0xffffffff81243d55 <+5>:     push   %rbp
       0xffffffff81243d56 <+6>:     mov    %rsp,%rbp
       0xffffffff81243d59 <+9>:     push   %r15
       0xffffffff81243d5b <+11>:    mov    %rsi,%r15
       0xffffffff81243d5e <+14>:    push   %r14
       0xffffffff81243d60 <+16>:    push   %r13
       0xffffffff81243d62 <+18>:    push   %r12
       0xffffffff81243d64 <+20>:    push   %rbx
       0xffffffff81243d65 <+21>:    mov    %rdi,%rbx
       0xffffffff81243d68 <+24>:    sub    $0x78,%rsp
       0xffffffff81243d6c <+28>:    mov    %gs:0x28,%rax
       0xffffffff81243d75 <+37>:    mov    %rax,0x70(%rsp)
       0xffffffff81243d7a <+42>:    xor    %eax,%eax
       0xffffffff81243d7c <+44>:    movzbl (%rdi),%eax
       0xffffffff81243d7f <+47>:    cmp    $0x2f,%al
    ....
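Besides disassembling, gdb can be asked to list the source around a `symbol+offset` location directly, which skips the manual offset matching. The sketch below only builds the command line; the helper name `gdb_list_command` is invented for illustration, and the approach assumes the vmlinux image carries debug information. The `-batch` and `-ex` options and the `list *ADDRESS` form are standard gdb features, though the output you get depends on your gdb and kernel build.

```python
def gdb_list_command(vmlinux, symbol, offset):
    """Build a non-interactive gdb invocation that lists the source
    around symbol+offset (hypothetical helper, for illustration only)."""
    # "list *(sym+0x43)" asks gdb for the source line containing that address.
    location = "*({}+{:#x})".format(symbol, offset)
    return ["gdb", "-batch", "-ex", "list " + location, vmlinux]
```

For instance, `gdb_list_command(path_to_vmlinux, "link_path_walk", 0x43)` yields a command ending in `list *(link_path_walk+0x43)`. Symbols containing dots, such as `schedule_hrtimeout_range_clock.part.24`, may not parse as a gdb expression; in that case passing the raw address (`list *0xffffffff815f2ccc`) is the safer form.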

You can also use objdump(1) on the vmlinux image:

 # objdump -rDlS /usr/lib/debug/lib/modules/3.14.7-200.fc20.x86_64+debug/vmlinux > vmlinux.out 

The flags are:

    -D
    --disassemble-all
        Like -d, but disassemble the contents of all sections, not just those expected to contain instructions.
    -r
    --reloc
        Print the relocation entries of the file. If used with -d or -D, the relocations are printed interspersed with the disassembly.
    -S
    --source
        Display source code intermixed with disassembly, if possible. Implies -d.
    -l
    --line-numbers
        Label the display (using debugging information) with the filename and source line numbers corresponding to the object code or relocs shown. Only useful with -d, -D, or -r.

You can then look up the function in that output:

    ffffffff81243d50 <link_path_walk>:
    link_path_walk():
    /usr/src/debug/kernel-3.14.fc20/linux-3.14.7-200.fc20.x86_64/fs/namei.c:1729
     *
     * Returns 0 and nd will have valid dentry and mnt on success.
     * Returns error and drops reference to input namei data on failure.
     */
    static int link_path_walk(const char *name, struct nameidata *nd)
    {
    ffffffff81243d50:   e8 eb 6a 5a 00          callq  ffffffff817ea840 <__entry_text_start>
    ffffffff81243d55:   55                      push   %rbp
    ffffffff81243d56:   48 89 e5                mov    %rsp,%rbp
    ffffffff81243d59:   41 57                   push   %r15
    ffffffff81243d5b:   49 89 f7                mov    %rsi,%r15
    ffffffff81243d5e:   41 56                   push   %r14
    ffffffff81243d60:   41 55                   push   %r13
    ffffffff81243d62:   41 54                   push   %r12
    ffffffff81243d64:   53                      push   %rbx
    ffffffff81243d65:   48 89 fb                mov    %rdi,%rbx
    ffffffff81243d68:   48 83 ec 78             sub    $0x78,%rsp
    ffffffff81243d6c:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
    ffffffff81243d73:   00 00
    ffffffff81243d75:   48 89 44 24 70          mov    %rax,0x70(%rsp)
    ffffffff81243d7a:   31 c0                   xor    %eax,%eax
    /usr/src/debug/kernel-3.14.fc20/linux-3.14.7-200.fc20.x86_64/fs/namei.c:1733
        struct path next;
        int err;

        while (*name=='/')
    ffffffff81243d7c:   0f b6 07                movzbl (%rdi),%eax
    ffffffff81243d7f:   3c 2f                   cmp    $0x2f,%al
    ffffffff81243d81:   75 10                   jne    ffffffff81243d93 <link_path_walk+0x43>
    ffffffff81243d83:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    /usr/src/debug/kernel-3.14.fc20/linux-3.14.7-200.fc20.x86_64/fs/namei.c:1734
            name++;
    ffffffff81243d88:   48 83 c3 01             add    $0x1,%rbx
    /usr/src/debug/kernel-3.14.fc20/linux-3.14.7-200.fc20.x86_64/fs/namei.c:1733
    static int link_path_walk(const char *name, struct nameidata *nd)
    {
        struct path next;
        int err;

        while (*name=='/')
    ....

and match the offsets against the actual lines of code.
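Matching an offset by eye in a dump of this size is tedious, so the lookup can be scripted. The sketch below is one possible approach, assuming the `objdump -l` layout shown above: a `path:line` marker on its own line, followed by the instructions generated for that source line. The function name `source_line_for` is made up for illustration, and the patterns may need adjusting for other objdump versions.

```python
import re

# A "file:line" marker emitted by objdump -l, optionally with a discriminator.
LINE_MARKER = re.compile(r"^(/\S+):(\d+)(?: \(discriminator \d+\))?$")
# An instruction line: hexadecimal address, a colon, then the encoded bytes.
INSN = re.compile(r"^\s*([0-9a-f]+):\s")


def source_line_for(objdump_text, target):
    """Return (path, line) for the instruction at address target, or None."""
    current = None
    for raw in objdump_text.splitlines():
        marker = LINE_MARKER.match(raw.strip())
        if marker:
            # Remember the most recent source position seen so far.
            current = (marker.group(1), int(marker.group(2)))
            continue
        insn = INSN.match(raw)
        if insn and int(insn.group(1), 16) == target:
            return current
    return None
```

The target address is the function's start plus the offset from the trace. With the excerpt above, link_path_walk starts at 0xffffffff81243d50, so offset 0x2f gives 0xffffffff81243d7f, which falls under the marker for fs/namei.c:1733.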

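If addr2line prints question marks instead of line numbers, or objdump cannot interleave the source code, and you are running a custom kernel, make sure to recompile the kernel with CONFIG_DEBUG_INFO set. You will probably then need to reproduce the bug on the rebuilt kernel.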
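Before rebuilding, it may be worth checking whether the kernel configuration already enables debug information. The following is a small sketch under the assumption that you have the kernel's config file available as text; the helper name `debug_info_enabled` is hypothetical.

```python
def debug_info_enabled(config_text):
    """Return True if the kernel config text sets CONFIG_DEBUG_INFO=y."""
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_DEBUG_INFO is not set".
        if line.strip() == "CONFIG_DEBUG_INFO=y":
            return True
    return False
```

Many distributions install the config as `/boot/config-<release>`, where the release string matches `uname -r`, but that location is an assumption and varies between systems. Reading that file and passing its contents to the function gives a quick yes or no.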