c - How does GDB determine the address to break at when you do "break function-name"?

Question

Welcome To Ask or Share your Answers For Others

c - How does GDB determine the address to break at when you do "break function-name"?

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

c - How does GDB determine the address to break at when you do "break function-name"?

A simple example that demonstrates my issue:

// test.c
#include <stdio.h>

int foo1(int i) {
    i = i * 2;
    return i;
}

void foo2(int i) {
    printf("greetings from foo! i = %i", i);
}

int main() {
    int i = 7;
    foo1(i);
    foo2(i);
    return 0;
}

$ clang -o test -O0 -Wall -g test.c

Inside GDB I do the following and start the execution:

(gdb) b foo1  
(gdb) b foo2

After reaching the first breakpoint, I disassemble:

(gdb) disassemble 
Dump of assembler code for function foo1:
   0x0000000000400530 <+0>:     push   %rbp
   0x0000000000400531 <+1>:     mov    %rsp,%rbp
   0x0000000000400534 <+4>:     mov    %edi,-0x4(%rbp)
=> 0x0000000000400537 <+7>:     mov    -0x4(%rbp),%edi
   0x000000000040053a <+10>:    shl    $0x1,%edi
   0x000000000040053d <+13>:    mov    %edi,-0x4(%rbp)
   0x0000000000400540 <+16>:    mov    -0x4(%rbp),%eax
   0x0000000000400543 <+19>:    pop    %rbp
   0x0000000000400544 <+20>:    retq   
End of assembler dump.

I do the same after reaching the second breakpoint:

(gdb) disassemble 
Dump of assembler code for function foo2:
   0x0000000000400550 <+0>:     push   %rbp
   0x0000000000400551 <+1>:     mov    %rsp,%rbp
   0x0000000000400554 <+4>:     sub    $0x10,%rsp
   0x0000000000400558 <+8>:     lea    0x400644,%rax
   0x0000000000400560 <+16>:    mov    %edi,-0x4(%rbp)
=> 0x0000000000400563 <+19>:    mov    -0x4(%rbp),%esi
   0x0000000000400566 <+22>:    mov    %rax,%rdi
   0x0000000000400569 <+25>:    mov    $0x0,%al
   0x000000000040056b <+27>:    callq  0x400410 <printf@plt>
   0x0000000000400570 <+32>:    mov    %eax,-0x8(%rbp)
   0x0000000000400573 <+35>:    add    $0x10,%rsp
   0x0000000000400577 <+39>:    pop    %rbp
   0x0000000000400578 <+40>:    retq   
End of assembler dump.

GDB obviously uses different offsets (+7 in foo1 and +19 in foo2), with respect to the beginning of the function, when setting the breakpoint. How can I determine this offset by myself without using GDB?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:21:42+0000

gdb uses a few methods to decide this information.

First, the very best way is if your compiler emits DWARF describing the function. Then gdb can decode the DWARF to find the end of the prologue.

However, this isn't always available. GCC emits it, but IIRC only when optimization is used.

I believe there's also a convention that if the first line number of a function is repeated in the line table, then the address of the second instance is used as the end of the prologue. That is if the lines look like:

< function f >
line 23  0xffff0000
line 23  0xffff0010

Then gdb will assume that the function f's prologue is complete at 0xfff0010.

I think this is the mode used by gcc when not optimizing.

Finally gdb has some prologue decoders that know how common prologues are written on many platforms. These are used when debuginfo isn't available, though offhand I don't recall what the purpose of that is.

Categories

c - How does GDB determine the address to break at when you do "break function-name"?

c - How does GDB determine the address to break at when you do "break function-name"?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags