共计 11020 个字符,预计需要花费 28 分钟才能阅读完成。
3-1
使⽤ gcc 编译代码并使⽤ binutils ⼯具对⽣成的⽬标文件和可执⾏文件(ELF 格局)进⾏剖析。具体要求如下:
- 编写⼀个简略的打印“hello world!”的程序源文件:hello.c
- 对源文件进⾏本地编译,⽣成针对⽀持 x86_64 指令集架构处理器的⽬标文件 hello.o。
- 查看 hello.o 的文件的文件头信息。
- 查看 hello.o 的 Section header table。
- 对 hello.o 反汇编,并查看 hello.c 的 C 程序源码和机器指令的对应关系。
简略地打印 “hello world !”
#include <stdio.h>
int main(){printf("hello world !\n");
return 0;
}
编译为指标文件
gcc hello.cc -o hello.o
查看文件头信息
$ readelf -h hello.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1060
Start of program headers: 64 (bytes into file)
Start of section headers: 14712 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 13
Size of section headers: 64 (bytes)
Number of section headers: 31
Section header string table index: 30
查看 Section header table
$ readelf -l hello.o
Elf file type is DYN (Shared object file)
Entry point 0x1060
There are 13 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x00000000000002d8 0x00000000000002d8 R 0x8
INTERP 0x0000000000000318 0x0000000000000318 0x0000000000000318
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000000005f8 0x00000000000005f8 R 0x1000
LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000
0x00000000000001f5 0x00000000000001f5 R E 0x1000
LOAD 0x0000000000002000 0x0000000000002000 0x0000000000002000
0x0000000000000160 0x0000000000000160 R 0x1000
LOAD 0x0000000000002db8 0x0000000000003db8 0x0000000000003db8
0x0000000000000258 0x0000000000000260 RW 0x1000
DYNAMIC 0x0000000000002dc8 0x0000000000003dc8 0x0000000000003dc8
0x00000000000001f0 0x00000000000001f0 RW 0x8
NOTE 0x0000000000000338 0x0000000000000338 0x0000000000000338
0x0000000000000020 0x0000000000000020 R 0x8
NOTE 0x0000000000000358 0x0000000000000358 0x0000000000000358
0x0000000000000044 0x0000000000000044 R 0x4
GNU_PROPERTY 0x0000000000000338 0x0000000000000338 0x0000000000000338
0x0000000000000020 0x0000000000000020 R 0x8
GNU_EH_FRAME 0x0000000000002014 0x0000000000002014 0x0000000000002014
0x0000000000000044 0x0000000000000044 R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_RELRO 0x0000000000002db8 0x0000000000003db8 0x0000000000003db8
0x0000000000000248 0x0000000000000248 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
03 .init .plt .plt.got .plt.sec .text .fini
04 .rodata .eh_frame_hdr .eh_frame
05 .init_array .fini_array .dynamic .got .data .bss
06 .dynamic
07 .note.gnu.property
08 .note.gnu.build-id .note.ABI-tag
09 .note.gnu.property
10 .eh_frame_hdr
11
12 .init_array .fini_array .dynamic .got
反汇编
$ objdump -d hello.o > 3-1.asm
3-1.o: file format elf64-x86-64
Disassembly of section .init:
0000000000001000 <_init>:
1000: f3 0f 1e fa endbr64
1004: 48 83 ec 08 sub $0x8,%rsp
1008: 48 8b 05 d9 2f 00 00 mov 0x2fd9(%rip),%rax # 3fe8 <__gmon_start__>
100f: 48 85 c0 test %rax,%rax
1012: 74 02 je 1016 <_init+0x16>
1014: ff d0 callq *%rax
1016: 48 83 c4 08 add $0x8,%rsp
101a: c3 retq
Disassembly of section .plt:
0000000000001020 <.plt>:
1020: ff 35 9a 2f 00 00 pushq 0x2f9a(%rip) # 3fc0 <_GLOBAL_OFFSET_TABLE_+0x8>
1026: f2 ff 25 9b 2f 00 00 bnd jmpq *0x2f9b(%rip) # 3fc8 <_GLOBAL_OFFSET_TABLE_+0x10>
102d: 0f 1f 00 nopl (%rax)
1030: f3 0f 1e fa endbr64
1034: 68 00 00 00 00 pushq $0x0
1039: f2 e9 e1 ff ff ff bnd jmpq 1020 <.plt>
103f: 90 nop
Disassembly of section .plt.got:
0000000000001040 <__cxa_finalize@plt>:
1040: f3 0f 1e fa endbr64
1044: f2 ff 25 ad 2f 00 00 bnd jmpq *0x2fad(%rip) # 3ff8 <__cxa_finalize@GLIBC_2.2.5>
104b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
Disassembly of section .plt.sec:
0000000000001050 <puts@plt>:
1050: f3 0f 1e fa endbr64
1054: f2 ff 25 75 2f 00 00 bnd jmpq *0x2f75(%rip) # 3fd0 <puts@GLIBC_2.2.5>
105b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
Disassembly of section .text:
0000000000001060 <_start>:
1060: f3 0f 1e fa endbr64
1064: 31 ed xor %ebp,%ebp
1066: 49 89 d1 mov %rdx,%r9
1069: 5e pop %rsi
106a: 48 89 e2 mov %rsp,%rdx
106d: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
1071: 50 push %rax
1072: 54 push %rsp
1073: 4c 8d 05 66 01 00 00 lea 0x166(%rip),%r8 # 11e0 <__libc_csu_fini>
107a: 48 8d 0d ef 00 00 00 lea 0xef(%rip),%rcx # 1170 <__libc_csu_init>
1081: 48 8d 3d c1 00 00 00 lea 0xc1(%rip),%rdi # 1149 <main>
1088: ff 15 52 2f 00 00 callq *0x2f52(%rip) # 3fe0 <__libc_start_main@GLIBC_2.2.5>
108e: f4 hlt
108f: 90 nop
0000000000001090 <deregister_tm_clones>:
1090: 48 8d 3d 79 2f 00 00 lea 0x2f79(%rip),%rdi # 4010 <__TMC_END__>
1097: 48 8d 05 72 2f 00 00 lea 0x2f72(%rip),%rax # 4010 <__TMC_END__>
109e: 48 39 f8 cmp %rdi,%rax
10a1: 74 15 je 10b8 <deregister_tm_clones+0x28>
10a3: 48 8b 05 2e 2f 00 00 mov 0x2f2e(%rip),%rax # 3fd8 <_ITM_deregisterTMCloneTable>
10aa: 48 85 c0 test %rax,%rax
10ad: 74 09 je 10b8 <deregister_tm_clones+0x28>
10af: ff e0 jmpq *%rax
10b1: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
10b8: c3 retq
10b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
00000000000010c0 <register_tm_clones>:
10c0: 48 8d 3d 49 2f 00 00 lea 0x2f49(%rip),%rdi # 4010 <__TMC_END__>
10c7: 48 8d 35 42 2f 00 00 lea 0x2f42(%rip),%rsi # 4010 <__TMC_END__>
10ce: 48 29 fe sub %rdi,%rsi
10d1: 48 89 f0 mov %rsi,%rax
10d4: 48 c1 ee 3f shr $0x3f,%rsi
10d8: 48 c1 f8 03 sar $0x3,%rax
10dc: 48 01 c6 add %rax,%rsi
10df: 48 d1 fe sar %rsi
10e2: 74 14 je 10f8 <register_tm_clones+0x38>
10e4: 48 8b 05 05 2f 00 00 mov 0x2f05(%rip),%rax # 3ff0 <_ITM_registerTMCloneTable>
10eb: 48 85 c0 test %rax,%rax
10ee: 74 08 je 10f8 <register_tm_clones+0x38>
10f0: ff e0 jmpq *%rax
10f2: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
10f8: c3 retq
10f9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
0000000000001100 <__do_global_dtors_aux>:
1100: f3 0f 1e fa endbr64
1104: 80 3d 05 2f 00 00 00 cmpb $0x0,0x2f05(%rip) # 4010 <__TMC_END__>
110b: 75 2b jne 1138 <__do_global_dtors_aux+0x38>
110d: 55 push %rbp
110e: 48 83 3d e2 2e 00 00 cmpq $0x0,0x2ee2(%rip) # 3ff8 <__cxa_finalize@GLIBC_2.2.5>
1115: 00
1116: 48 89 e5 mov %rsp,%rbp
1119: 74 0c je 1127 <__do_global_dtors_aux+0x27>
111b: 48 8b 3d e6 2e 00 00 mov 0x2ee6(%rip),%rdi # 4008 <__dso_handle>
1122: e8 19 ff ff ff callq 1040 <__cxa_finalize@plt>
1127: e8 64 ff ff ff callq 1090 <deregister_tm_clones>
112c: c6 05 dd 2e 00 00 01 movb $0x1,0x2edd(%rip) # 4010 <__TMC_END__>
1133: 5d pop %rbp
1134: c3 retq
1135: 0f 1f 00 nopl (%rax)
1138: c3 retq
1139: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
0000000000001140 <frame_dummy>:
1140: f3 0f 1e fa endbr64
1144: e9 77 ff ff ff jmpq 10c0 <register_tm_clones>
0000000000001149 <main>:
1149: f3 0f 1e fa endbr64
114d: 55 push %rbp
114e: 48 89 e5 mov %rsp,%rbp
1151: 48 8d 3d ac 0e 00 00 lea 0xeac(%rip),%rdi # 2004 <_IO_stdin_used+0x4>
1158: e8 f3 fe ff ff callq 1050 <puts@plt>
115d: b8 00 00 00 00 mov $0x0,%eax
1162: 5d pop %rbp
1163: c3 retq
1164: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
116b: 00 00 00
116e: 66 90 xchg %ax,%ax
0000000000001170 <__libc_csu_init>:
1170: f3 0f 1e fa endbr64
1174: 41 57 push %r15
1176: 4c 8d 3d 3b 2c 00 00 lea 0x2c3b(%rip),%r15 # 3db8 <__frame_dummy_init_array_entry>
117d: 41 56 push %r14
117f: 49 89 d6 mov %rdx,%r14
1182: 41 55 push %r13
1184: 49 89 f5 mov %rsi,%r13
1187: 41 54 push %r12
1189: 41 89 fc mov %edi,%r12d
118c: 55 push %rbp
118d: 48 8d 2d 2c 2c 00 00 lea 0x2c2c(%rip),%rbp # 3dc0 <__do_global_dtors_aux_fini_array_entry>
1194: 53 push %rbx
1195: 4c 29 fd sub %r15,%rbp
1198: 48 83 ec 08 sub $0x8,%rsp
119c: e8 5f fe ff ff callq 1000 <_init>
11a1: 48 c1 fd 03 sar $0x3,%rbp
11a5: 74 1f je 11c6 <__libc_csu_init+0x56>
11a7: 31 db xor %ebx,%ebx
11a9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
11b0: 4c 89 f2 mov %r14,%rdx
11b3: 4c 89 ee mov %r13,%rsi
11b6: 44 89 e7 mov %r12d,%edi
11b9: 41 ff 14 df callq *(%r15,%rbx,8)
11bd: 48 83 c3 01 add $0x1,%rbx
11c1: 48 39 dd cmp %rbx,%rbp
11c4: 75 ea jne 11b0 <__libc_csu_init+0x40>
11c6: 48 83 c4 08 add $0x8,%rsp
11ca: 5b pop %rbx
11cb: 5d pop %rbp
11cc: 41 5c pop %r12
11ce: 41 5d pop %r13
11d0: 41 5e pop %r14
11d2: 41 5f pop %r15
11d4: c3 retq
11d5: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
11dc: 00 00 00 00
00000000000011e0 <__libc_csu_fini>:
11e0: f3 0f 1e fa endbr64
11e4: c3 retq
Disassembly of section .fini:
00000000000011e8 <_fini>:
11e8: f3 0f 1e fa endbr64
11ec: 48 83 ec 08 sub $0x8,%rsp
11f0: 48 83 c4 08 add $0x8,%rsp
11f4: c3 retq
3-2
如下例⼦ C 语⾔代码:
#include <stdio.h> int global_init = 0x11111111; const int global_const = 0x22222222; void main() { static int static_var = 0x33333333; static int static_var_uninit; int auto_var = 0x44444444; printf("hello world!\n"); return; }
请问编译为 .o 文件后,global_init, global_const, static_var, static_var_uninit, auto_var 这些变 量别离寄存在那些 section ⾥,”hello world!\n” 这个字符串⼜在哪⾥?并尝试⽤⼯具查看并验证你的猜想。
global_init
和static_var
在.data
段
Disassembly of section .data:
0000000000004000 <__data_start>:
...
0000000000004008 <__dso_handle>:
4008: 08 40 00 or %al,0x0(%rax)
400b: 00 00 add %al,(%rax)
400d: 00 00 add %al,(%rax)
...
0000000000004010 <global_init>:
4010: 11 11 adc %edx,(%rcx)
4012: 11 11 adc %edx,(%rcx)
0000000000004014 <static_var.2317>:
4014: 33 33 xor (%rbx),%esi
4016: 33 33 xor (%rbx),%esi
global_const
在rodata
段
Disassembly of section .rodata:
0000000000002000 <_IO_stdin_used>:
2000: 01 00 add %eax,(%rax)
2002: 02 00 add (%rax),%al
0000000000002004 <global_const>:
2004: 22 22 and (%rdx),%ah
2006: 22 22 and (%rdx),%ah
2008: 68 65 6c 6c 6f pushq $0x6f6c6c65
200d: 20 77 6f and %dh,0x6f(%rdi)
2010: 72 6c jb 207e <__GNU_EH_FRAME_HDR+0x66>
2012: 64 21 00 and %eax,%fs:(%rax)
static_var_uninit
在bss
段
Disassembly of section .bss:
0000000000004018 <completed.8060>:
4018: 00 00 add %al,(%rax)
...
000000000000401c <static_var_uninit.2318>:
401c: 00 00 add %al,(%rax)
...
auto_var
为局部变量,不在任何 section 中,在代码段中间接应用
4-1
针对
rv32ima
指令集架构,反复3-1
的操作
指令用法与 3-1
基本相同
$riscv64-unknown-elf-gcc 3-1.c -march=rv32ima -mabi=ilp32 -g -Wall -o 4-1.o
$readelf -h 4-1.o
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: RISC-V
Version: 0x1
Entry point address: 0x10090
Start of program headers: 52 (bytes into file)
Start of section headers: 26136 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 2
Size of section headers: 40 (bytes)
Number of section headers: 21
Section header string table index: 20
$readelf -l 4-1.o
Elf file type is EXEC (Executable file)
Entry point 0x10090
There are 2 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00010000 0x00010000 0x0362c 0x0362c R E 0x1000
LOAD 0x00362c 0x0001462c 0x0001462c 0x00858 0x008b0 RW 0x1000
Section to Segment mapping:
Segment Sections...
00 .text .rodata
01 .eh_frame .init_array .fini_array .data .sdata .sbss .bss
$riscv64-unknown-elf-objdump -d 4-1.o > 4-1.asm
- 反汇编节选
4-1.o: file format elf32-littleriscv
Disassembly of section .text:
00010074 <register_fini>:
10074: ffff0797 auipc a5,0xffff0
10078: f8c78793 addi a5,a5,-116 # 0 <register_fini-0x10074>
1007c: 00078863 beqz a5,1008c <register_fini+0x18>
10080: 00000517 auipc a0,0x0
10084: 14050513 addi a0,a0,320 # 101c0 <__libc_fini_array>
10088: 0f00006f j 10178 <atexit>
1008c: 00008067 ret
00010090 <_start>:
10090: 00005197 auipc gp,0x5
10094: db018193 addi gp,gp,-592 # 14e40 <__global_pointer$>
10098: 04418513 addi a0,gp,68 # 14e84 <_edata>
1009c: 09c18613 addi a2,gp,156 # 14edc <__BSS_END__>
100a0: 40a60633 sub a2,a2,a0
100a4: 00000593 li a1,0
100a8: 20c000ef jal ra,102b4 <memset>
100ac: 00000517 auipc a0,0x0
100b0: 11450513 addi a0,a0,276 # 101c0 <__libc_fini_array>
100b4: 0c4000ef jal ra,10178 <atexit>
100b8: 168000ef jal ra,10220 <__libc_init_array>
100bc: 00012503 lw a0,0(sp)
100c0: 00410593 addi a1,sp,4
100c4: 00000613 li a2,0
100c8: 07c000ef jal ra,10144 <main>
100cc: 0c00006f j 1018c <exit>
...
...
00010144 <main>:
10144: ff010113 addi sp,sp,-16
10148: 00112623 sw ra,12(sp)
1014c: 00812423 sw s0,8(sp)
10150: 01010413 addi s0,sp,16
10154: 000137b7 lui a5,0x13
10158: 61878513 addi a0,a5,1560 # 13618 <__modsi3+0x30>
1015c: 304000ef jal ra,10460 <puts>
10160: 00000793 li a5,0
10164: 00078513 mv a0,a5
10168: 00c12083 lw ra,12(sp)
1016c: 00812403 lw s0,8(sp)
10170: 01010113 addi sp,sp,16
10174: 00008067 ret
...
...
qemu-riscv32
模仿运行
$ qemu-riscv32 4-1.o
hello world !
正文完