3-1
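Use gcc to compile code, then use the binutils tools to analyze the generated object file and executable (ELF format). The specific requirements are:
- Write a simple program source file that prints "hello world!": hello.c
- Compile the source file natively to produce the object file hello.o for processors that support the x86_64 instruction set architecture.
- View the file header information of hello.o.
- View the Section header table of hello.o.
- Disassemble hello.o and examine how the C source of hello.c corresponds to the machine instructions.
A simple program that prints "hello world !"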
#include <stdio.h>

int main()
{
    printf("hello world !\n");
    return 0;
}
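Compile to an object file
gcc hello.c -o hello.o
Note: without -c, gcc also runs the linker, so the hello.o produced here is actually a linked executable, which is why the header shown next reports Type: DYN. To stop at a relocatable object file, use gcc -c hello.c -o hello.o.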
View the file header information
$ readelf -h hello.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x1060
  Start of program headers:          64 (bytes into file)
  Start of section headers:          14712 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         13
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 30
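View the Section header table (note: readelf -l, used here, prints the program headers and the section-to-segment mapping; readelf -S hello.o prints the Section header table itself)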
$ readelf -l hello.o

Elf file type is DYN (Shared object file)
Entry point 0x1060
There are 13 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000002d8 0x00000000000002d8  R      0x8
  INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000005f8 0x00000000000005f8  R      0x1000
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x00000000000001f5 0x00000000000001f5  R E    0x1000
  LOAD           0x0000000000002000 0x0000000000002000 0x0000000000002000
                 0x0000000000000160 0x0000000000000160  R      0x1000
  LOAD           0x0000000000002db8 0x0000000000003db8 0x0000000000003db8
                 0x0000000000000258 0x0000000000000260  RW     0x1000
  DYNAMIC        0x0000000000002dc8 0x0000000000003dc8 0x0000000000003dc8
                 0x00000000000001f0 0x00000000000001f0  RW     0x8
  NOTE           0x0000000000000338 0x0000000000000338 0x0000000000000338
                 0x0000000000000020 0x0000000000000020  R      0x8
  NOTE           0x0000000000000358 0x0000000000000358 0x0000000000000358
                 0x0000000000000044 0x0000000000000044  R      0x4
  GNU_PROPERTY   0x0000000000000338 0x0000000000000338 0x0000000000000338
                 0x0000000000000020 0x0000000000000020  R      0x8
  GNU_EH_FRAME   0x0000000000002014 0x0000000000002014 0x0000000000002014
                 0x0000000000000044 0x0000000000000044  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x0000000000002db8 0x0000000000003db8 0x0000000000003db8
                 0x0000000000000248 0x0000000000000248  R      0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
   03     .init .plt .plt.got .plt.sec .text .fini
   04     .rodata .eh_frame_hdr .eh_frame
   05     .init_array .fini_array .dynamic .got .data .bss
   06     .dynamic
   07     .note.gnu.property
   08     .note.gnu.build-id .note.ABI-tag
   09     .note.gnu.property
   10     .eh_frame_hdr
   11
   12     .init_array .fini_array .dynamic .got
Disassemble
$ objdump -d hello.o > 3-1.asm

3-1.o:     file format elf64-x86-64

Disassembly of section .init:

0000000000001000 <_init>:
    1000:   f3 0f 1e fa           endbr64
    1004:   48 83 ec 08           sub    $0x8,%rsp
    1008:   48 8b 05 d9 2f 00 00  mov    0x2fd9(%rip),%rax        # 3fe8 <__gmon_start__>
    100f:   48 85 c0              test   %rax,%rax
    1012:   74 02                 je     1016 <_init+0x16>
    1014:   ff d0                 callq  *%rax
    1016:   48 83 c4 08           add    $0x8,%rsp
    101a:   c3                    retq

Disassembly of section .plt:

0000000000001020 <.plt>:
    1020:   ff 35 9a 2f 00 00     pushq  0x2f9a(%rip)        # 3fc0 <_GLOBAL_OFFSET_TABLE_+0x8>
    1026:   f2 ff 25 9b 2f 00 00  bnd jmpq *0x2f9b(%rip)        # 3fc8 <_GLOBAL_OFFSET_TABLE_+0x10>
    102d:   0f 1f 00              nopl   (%rax)
    1030:   f3 0f 1e fa           endbr64
    1034:   68 00 00 00 00        pushq  $0x0
    1039:   f2 e9 e1 ff ff ff     bnd jmpq 1020 <.plt>
    103f:   90                    nop

Disassembly of section .plt.got:

0000000000001040 <__cxa_finalize@plt>:
    1040:   f3 0f 1e fa           endbr64
    1044:   f2 ff 25 ad 2f 00 00  bnd jmpq *0x2fad(%rip)        # 3ff8 <__cxa_finalize@GLIBC_2.2.5>
    104b:   0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)

Disassembly of section .plt.sec:

0000000000001050 <puts@plt>:
    1050:   f3 0f 1e fa           endbr64
    1054:   f2 ff 25 75 2f 00 00  bnd jmpq *0x2f75(%rip)        # 3fd0 <puts@GLIBC_2.2.5>
    105b:   0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)

Disassembly of section .text:

0000000000001060 <_start>:
    1060:   f3 0f 1e fa           endbr64
    1064:   31 ed                 xor    %ebp,%ebp
    1066:   49 89 d1              mov    %rdx,%r9
    1069:   5e                    pop    %rsi
    106a:   48 89 e2              mov    %rsp,%rdx
    106d:   48 83 e4 f0           and    $0xfffffffffffffff0,%rsp
    1071:   50                    push   %rax
    1072:   54                    push   %rsp
    1073:   4c 8d 05 66 01 00 00  lea    0x166(%rip),%r8        # 11e0 <__libc_csu_fini>
    107a:   48 8d 0d ef 00 00 00  lea    0xef(%rip),%rcx        # 1170 <__libc_csu_init>
    1081:   48 8d 3d c1 00 00 00  lea    0xc1(%rip),%rdi        # 1149 <main>
    1088:   ff 15 52 2f 00 00     callq  *0x2f52(%rip)        # 3fe0 <__libc_start_main@GLIBC_2.2.5>
    108e:   f4                    hlt
    108f:   90                    nop

0000000000001090 <deregister_tm_clones>:
    1090:   48 8d 3d 79 2f 00 00  lea    0x2f79(%rip),%rdi        # 4010 <__TMC_END__>
    1097:   48 8d 05 72 2f 00 00  lea    0x2f72(%rip),%rax        # 4010 <__TMC_END__>
    109e:   48 39 f8              cmp    %rdi,%rax
    10a1:   74 15                 je     10b8 <deregister_tm_clones+0x28>
    10a3:   48 8b 05 2e 2f 00 00  mov    0x2f2e(%rip),%rax        # 3fd8 <_ITM_deregisterTMCloneTable>
    10aa:   48 85 c0              test   %rax,%rax
    10ad:   74 09                 je     10b8 <deregister_tm_clones+0x28>
    10af:   ff e0                 jmpq   *%rax
    10b1:   0f 1f 80 00 00 00 00  nopl   0x0(%rax)
    10b8:   c3                    retq
    10b9:   0f 1f 80 00 00 00 00  nopl   0x0(%rax)

00000000000010c0 <register_tm_clones>:
    10c0:   48 8d 3d 49 2f 00 00  lea    0x2f49(%rip),%rdi        # 4010 <__TMC_END__>
    10c7:   48 8d 35 42 2f 00 00  lea    0x2f42(%rip),%rsi        # 4010 <__TMC_END__>
    10ce:   48 29 fe              sub    %rdi,%rsi
    10d1:   48 89 f0              mov    %rsi,%rax
    10d4:   48 c1 ee 3f           shr    $0x3f,%rsi
    10d8:   48 c1 f8 03           sar    $0x3,%rax
    10dc:   48 01 c6              add    %rax,%rsi
    10df:   48 d1 fe              sar    %rsi
    10e2:   74 14                 je     10f8 <register_tm_clones+0x38>
    10e4:   48 8b 05 05 2f 00 00  mov    0x2f05(%rip),%rax        # 3ff0 <_ITM_registerTMCloneTable>
    10eb:   48 85 c0              test   %rax,%rax
    10ee:   74 08                 je     10f8 <register_tm_clones+0x38>
    10f0:   ff e0                 jmpq   *%rax
    10f2:   66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)
    10f8:   c3                    retq
    10f9:   0f 1f 80 00 00 00 00  nopl   0x0(%rax)

0000000000001100 <__do_global_dtors_aux>:
    1100:   f3 0f 1e fa           endbr64
    1104:   80 3d 05 2f 00 00 00  cmpb   $0x0,0x2f05(%rip)        # 4010 <__TMC_END__>
    110b:   75 2b                 jne    1138 <__do_global_dtors_aux+0x38>
    110d:   55                    push   %rbp
    110e:   48 83 3d e2 2e 00 00  cmpq   $0x0,0x2ee2(%rip)        # 3ff8 <__cxa_finalize@GLIBC_2.2.5>
    1115:   00
    1116:   48 89 e5              mov    %rsp,%rbp
    1119:   74 0c                 je     1127 <__do_global_dtors_aux+0x27>
    111b:   48 8b 3d e6 2e 00 00  mov    0x2ee6(%rip),%rdi        # 4008 <__dso_handle>
    1122:   e8 19 ff ff ff        callq  1040 <__cxa_finalize@plt>
    1127:   e8 64 ff ff ff        callq  1090 <deregister_tm_clones>
    112c:   c6 05 dd 2e 00 00 01  movb   $0x1,0x2edd(%rip)        # 4010 <__TMC_END__>
    1133:   5d                    pop    %rbp
    1134:   c3                    retq
    1135:   0f 1f 00              nopl   (%rax)
    1138:   c3                    retq
    1139:   0f 1f 80 00 00 00 00  nopl   0x0(%rax)

0000000000001140 <frame_dummy>:
    1140:   f3 0f 1e fa           endbr64
    1144:   e9 77 ff ff ff        jmpq   10c0 <register_tm_clones>

0000000000001149 <main>:
    1149:   f3 0f 1e fa           endbr64
    114d:   55                    push   %rbp
    114e:   48 89 e5              mov    %rsp,%rbp
    1151:   48 8d 3d ac 0e 00 00  lea    0xeac(%rip),%rdi        # 2004 <_IO_stdin_used+0x4>
    1158:   e8 f3 fe ff ff        callq  1050 <puts@plt>
    115d:   b8 00 00 00 00        mov    $0x0,%eax
    1162:   5d                    pop    %rbp
    1163:   c3                    retq
    1164:   66 2e 0f 1f 84 00 00  nopw   %cs:0x0(%rax,%rax,1)
    116b:   00 00 00
    116e:   66 90                 xchg   %ax,%ax

0000000000001170 <__libc_csu_init>:
    1170:   f3 0f 1e fa           endbr64
    1174:   41 57                 push   %r15
    1176:   4c 8d 3d 3b 2c 00 00  lea    0x2c3b(%rip),%r15        # 3db8 <__frame_dummy_init_array_entry>
    117d:   41 56                 push   %r14
    117f:   49 89 d6              mov    %rdx,%r14
    1182:   41 55                 push   %r13
    1184:   49 89 f5              mov    %rsi,%r13
    1187:   41 54                 push   %r12
    1189:   41 89 fc              mov    %edi,%r12d
    118c:   55                    push   %rbp
    118d:   48 8d 2d 2c 2c 00 00  lea    0x2c2c(%rip),%rbp        # 3dc0 <__do_global_dtors_aux_fini_array_entry>
    1194:   53                    push   %rbx
    1195:   4c 29 fd              sub    %r15,%rbp
    1198:   48 83 ec 08           sub    $0x8,%rsp
    119c:   e8 5f fe ff ff        callq  1000 <_init>
    11a1:   48 c1 fd 03           sar    $0x3,%rbp
    11a5:   74 1f                 je     11c6 <__libc_csu_init+0x56>
    11a7:   31 db                 xor    %ebx,%ebx
    11a9:   0f 1f 80 00 00 00 00  nopl   0x0(%rax)
    11b0:   4c 89 f2              mov    %r14,%rdx
    11b3:   4c 89 ee              mov    %r13,%rsi
    11b6:   44 89 e7              mov    %r12d,%edi
    11b9:   41 ff 14 df           callq  *(%r15,%rbx,8)
    11bd:   48 83 c3 01           add    $0x1,%rbx
    11c1:   48 39 dd              cmp    %rbx,%rbp
    11c4:   75 ea                 jne    11b0 <__libc_csu_init+0x40>
    11c6:   48 83 c4 08           add    $0x8,%rsp
    11ca:   5b                    pop    %rbx
    11cb:   5d                    pop    %rbp
    11cc:   41 5c                 pop    %r12
    11ce:   41 5d                 pop    %r13
    11d0:   41 5e                 pop    %r14
    11d2:   41 5f                 pop    %r15
    11d4:   c3                    retq
    11d5:   66 66 2e 0f 1f 84 00  data16 nopw %cs:0x0(%rax,%rax,1)
    11dc:   00 00 00 00

00000000000011e0 <__libc_csu_fini>:
    11e0:   f3 0f 1e fa           endbr64
    11e4:   c3                    retq

Disassembly of section .fini:

00000000000011e8 <_fini>:
    11e8:   f3 0f 1e fa           endbr64
    11ec:   48 83 ec 08           sub    $0x8,%rsp
    11f0:   48 83 c4 08           add    $0x8,%rsp
    11f4:   c3                    retq
3-2
Consider the following example C code:
#include <stdio.h>

int global_init = 0x11111111;
const int global_const = 0x22222222;

void main()
{
    static int static_var = 0x33333333;
    static int static_var_uninit;
    int auto_var = 0x44444444;

    printf("hello world!\n");
    return;
}
After compiling to a .o file, in which sections are the variables global_init, global_const, static_var, static_var_uninit and auto_var stored, and where does the string "hello world!\n" end up? Try to check your guess with the tools.
global_init and static_var are in the .data section
Disassembly of section .data:

0000000000004000 <__data_start>:
	...

0000000000004008 <__dso_handle>:
    4008:   08 40 00              or     %al,0x0(%rax)
    400b:   00 00                 add    %al,(%rax)
    400d:   00 00                 add    %al,(%rax)
	...

0000000000004010 <global_init>:
    4010:   11 11                 adc    %edx,(%rcx)
    4012:   11 11                 adc    %edx,(%rcx)

0000000000004014 <static_var.2317>:
    4014:   33 33                 xor    (%rbx),%esi
    4016:   33 33                 xor    (%rbx),%esi
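global_const is in the .rodata section. The string literal is stored there as well: in the dump of .rodata, the bytes starting at 0x2008 (68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00) spell "hello world!" followed by a NUL terminator.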
Disassembly of section .rodata:

0000000000002000 <_IO_stdin_used>:
    2000:   01 00                 add    %eax,(%rax)
    2002:   02 00                 add    (%rax),%al

0000000000002004 <global_const>:
    2004:   22 22                 and    (%rdx),%ah
    2006:   22 22                 and    (%rdx),%ah
    2008:   68 65 6c 6c 6f        pushq  $0x6f6c6c65
    200d:   20 77 6f              and    %dh,0x6f(%rdi)
    2010:   72 6c                 jb     207e <__GNU_EH_FRAME_HDR+0x66>
    2012:   64 21 00              and    %eax,%fs:(%rax)
static_var_uninit is in the .bss section
Disassembly of section .bss:

0000000000004018 <completed.8060>:
    4018:   00 00                 add    %al,(%rax)
	...

000000000000401c <static_var_uninit.2318>:
    401c:   00 00                 add    %al,(%rax)
	...
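auto_var is a local variable, so it is not stored in any data section; its initial value is used directly in the code section, as an immediate operand of an instruction in .text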
4-1
Repeat the steps of 3-1 for the rv32ima instruction set architecture
The commands are used in much the same way as in 3-1
$ riscv64-unknown-elf-gcc 3-1.c -march=rv32ima -mabi=ilp32 -g -Wall -o 4-1.o
$ readelf -h 4-1.o
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           RISC-V
  Version:                           0x1
  Entry point address:               0x10090
  Start of program headers:          52 (bytes into file)
  Start of section headers:          26136 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2
  Size of section headers:           40 (bytes)
  Number of section headers:         21
  Section header string table index: 20
$ readelf -l 4-1.o

Elf file type is EXEC (Executable file)
Entry point 0x10090
There are 2 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00010000 0x00010000 0x0362c 0x0362c R E 0x1000
  LOAD           0x00362c 0x0001462c 0x0001462c 0x00858 0x008b0 RW  0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text .rodata
   01     .eh_frame .init_array .fini_array .data .sdata .sbss .bss
$ riscv64-unknown-elf-objdump -d 4-1.o > 4-1.asm
- Disassembly excerpt
4-1.o:     file format elf32-littleriscv

Disassembly of section .text:

00010074 <register_fini>:
   10074:  ffff0797   auipc   a5,0xffff0
   10078:  f8c78793   addi    a5,a5,-116 # 0 <register_fini-0x10074>
   1007c:  00078863   beqz    a5,1008c <register_fini+0x18>
   10080:  00000517   auipc   a0,0x0
   10084:  14050513   addi    a0,a0,320 # 101c0 <__libc_fini_array>
   10088:  0f00006f   j       10178 <atexit>
   1008c:  00008067   ret

00010090 <_start>:
   10090:  00005197   auipc   gp,0x5
   10094:  db018193   addi    gp,gp,-592 # 14e40 <__global_pointer$>
   10098:  04418513   addi    a0,gp,68 # 14e84 <_edata>
   1009c:  09c18613   addi    a2,gp,156 # 14edc <__BSS_END__>
   100a0:  40a60633   sub     a2,a2,a0
   100a4:  00000593   li      a1,0
   100a8:  20c000ef   jal     ra,102b4 <memset>
   100ac:  00000517   auipc   a0,0x0
   100b0:  11450513   addi    a0,a0,276 # 101c0 <__libc_fini_array>
   100b4:  0c4000ef   jal     ra,10178 <atexit>
   100b8:  168000ef   jal     ra,10220 <__libc_init_array>
   100bc:  00012503   lw      a0,0(sp)
   100c0:  00410593   addi    a1,sp,4
   100c4:  00000613   li      a2,0
   100c8:  07c000ef   jal     ra,10144 <main>
   100cc:  0c00006f   j       1018c <exit>
......
00010144 <main>:
   10144:  ff010113   addi    sp,sp,-16
   10148:  00112623   sw      ra,12(sp)
   1014c:  00812423   sw      s0,8(sp)
   10150:  01010413   addi    s0,sp,16
   10154:  000137b7   lui     a5,0x13
   10158:  61878513   addi    a0,a5,1560 # 13618 <__modsi3+0x30>
   1015c:  304000ef   jal     ra,10460 <puts>
   10160:  00000793   li      a5,0
   10164:  00078513   mv      a0,a5
   10168:  00c12083   lw      ra,12(sp)
   1016c:  00812403   lw      s0,8(sp)
   10170:  01010113   addi    sp,sp,16
   10174:  00008067   ret
......
Run it under emulation with qemu-riscv32
$ qemu-riscv32 4-1.o
hello world !