ELF loading and dynamic linking
Some notes on ELF 🧝 loading and dynamic linking, mainly for the GNU userland (ld.so, libc, libdl) running on top of the Linux kernel. Some prior knowledge of the topic (virtual memory, shared objects, sections) might be useful to understand this.
ELF introduction
The ELF format is a standard file format used for different types of objects:
- executables, kernel images (ET_EXEC);
- shared objects (shared libraries, .so, dynamic libraries, ET_DYN);
- compilation outputs (.o, ET_REL);
- core dumps (ET_CORE).
Static libraries (.a files, archive packages) are not ELF files but archives of .o files.
More information about the ELF format can be found in man elf, elf.h, the System V specification and the LSB (for Linux-specific stuff). The readelf tool can be used to visualise the fields of ELF files. It is very useful to understand what information is in the ELF files, to correlate it with /proc/${pid}/maps and to understand how the loading and linking of programs work on ELF-based systems.
ELF header
The ELF header is defined as:
typedef struct
{
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
ElfXX_Half e_type; /* Object file type */
ElfXX_Half e_machine; /* Architecture */
ElfXX_Word e_version; /* Object file version */
ElfXX_Addr e_entry; /* Entry point virtual address */
ElfXX_Off e_phoff; /* Program header table file offset */
ElfXX_Off e_shoff; /* Section header table file offset */
ElfXX_Word e_flags; /* Processor-specific flags */
ElfXX_Half e_ehsize; /* ELF header size in bytes */
ElfXX_Half e_phentsize; /* Program header table entry size */
ElfXX_Half e_phnum; /* Program header table entry count */
ElfXX_Half e_shentsize; /* Section header table entry size */
ElfXX_Half e_shnum; /* Section header table entry count */
ElfXX_Half e_shstrndx; /* Section header string table index */
} ElfXX_Ehdr;
where XX is either 32 (for ELF-32) or 64 (for ELF-64).
The ElfW(type) macro can be used to refer to the native ELF types:
#define ElfW(type) _ElfW (Elf, __ELF_NATIVE_CLASS, type)
#define _ElfW(e,w,t) _ElfW_1 (e, w, _##t)
#define _ElfW_1(e,w,t) e##w##t
Which is used as:
ElfW(Ehdr)* native_header;
ElfW(Off) native_offset;
ElfW(Addr) native_address;
The ELF header can be read with readelf -h:
$ readelf -h /bin/sh
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x404c
Start of program headers: 64 (bytes into file)
Start of section headers: 123672 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 27
Section header string table index: 26
The e_ident field contains:
- the magic number of ELF files (0x7f454c46, i.e. the byte 0x7f followed by ELF in ASCII);
- the class (32 vs 64 bit);
- the data field (endianness);
- the ELF version;
- the OS/ABI (System V, NetBSD, Linux, Solaris, FreeBSD, etc.);
- the ABI version.
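As a small illustration, here is a sketch of how those e_ident fields could be decoded in C with the constants from elf.h. The function names (elf_ident_is_valid, elf_ident_bits, elf_ident_print) are made up for this example; only the EI_*, ELFMAG, ELFCLASS* and ELFDATA* constants come from elf.h.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the identification bytes start with the ELF magic number. */
int elf_ident_is_valid(const unsigned char *ident)
{
    return memcmp(ident, ELFMAG, SELFMAG) == 0;
}

/* Returns 32 or 64 depending on EI_CLASS, or 0 for an unknown class. */
int elf_ident_bits(const unsigned char *ident)
{
    switch (ident[EI_CLASS]) {
    case ELFCLASS32: return 32;
    case ELFCLASS64: return 64;
    default:         return 0;
    }
}

/* Prints the e_ident fields (raw numeric values for the last three). */
void elf_ident_print(const unsigned char *ident)
{
    printf("magic ok:    %d\n", elf_ident_is_valid(ident));
    printf("class:       ELF%d\n", elf_ident_bits(ident));
    printf("data:        %s\n",
           ident[EI_DATA] == ELFDATA2LSB ? "little endian" :
           ident[EI_DATA] == ELFDATA2MSB ? "big endian" : "unknown");
    printf("version:     %d\n", ident[EI_VERSION]);
    printf("OS/ABI:      %d\n", ident[EI_OSABI]);
    printf("ABI version: %d\n", ident[EI_ABIVERSION]);
}
```

Fed with the 16 bytes shown on the Magic line of the readelf -h output, elf_ident_bits should report 64 and the data field should decode as little endian.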
Static binary
Let's start with the loading of statically linked binaries:
- the kernel maps the program in memory (and the vDSO);
- the kernel sets up the stack and registers (passing information such as the arguments and environment variables) and calls the main program entry point.
The executable is loaded at a fixed address and no relocation is needed.
Mapping the executable in memory
The program headers define the in-memory layout of the program and the location of the information needed for loading, for dynamic linking and more generally at runtime (for dynamic symbol resolution, exception handling, etc.). The program headers are located by the fields e_phoff, e_phentsize and e_phnum of the ELF header. Each program header is defined as:
// The fields are in a slightly different order for Elf32.
typedef struct
{
Elf64_Word p_type; /* Segment type */
Elf64_Word p_flags; /* Segment flags */
Elf64_Off p_offset; /* Segment file offset */
Elf64_Addr p_vaddr; /* Segment virtual address */
Elf64_Addr p_paddr; /* Segment physical address */
Elf64_Xword p_filesz; /* Segment size in file */
Elf64_Xword p_memsz; /* Segment size in memory */
Elf64_Xword p_align; /* Segment alignment */
} Elf64_Phdr;
The readelf -l command can be used to see the program headers. It also tells us which sections are located in which segments by comparing the program headers and the section headers (the sections are explained in the next section).
$ readelf -l /bin/bash-static
Elf file type is EXEC (Executable file)
Entry point 0x403d0e
There are 6 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000001cda14 0x00000000001cda14 R E 200000
LOAD 0x00000000001cde60 0x00000000007cde60 0x00000000007cde60
0x000000000000a900 0x0000000000013720 RW 200000
NOTE 0x0000000000000190 0x0000000000400190 0x0000000000400190
0x0000000000000044 0x0000000000000044 R 4
TLS 0x00000000001cde60 0x00000000007cde60 0x00000000007cde60
0x0000000000000070 0x00000000000000a8 R 8
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x00000000001cde60 0x00000000007cde60 0x00000000007cde60
0x00000000000001a0 0x00000000000001a0 R 1
Section to Segment mapping:
Segment Sections...
00 .note.ABI-tag .note.gnu.build-id .rela.plt .init .plt .text __libc_freeres_fn __libc_thread_freeres_fn .fini .rodata __libc_subfreeres __libc_atexit __libc_thread_subfreeres .eh_frame .gcc_except_table
01 .tdata .init_array .fini_array .jcr .data.rel.ro .got .got.plt .data .bss __libc_freeres_ptrs
02 .note.ABI-tag .note.gnu.build-id
03 .tdata .tbss
04
05 .tdata .init_array .fini_array .jcr .data.rel.ro .got
Each PT_LOAD entry defines a segment which is mapped by the kernel in memory: the kernel maps the relevant part of the file in the virtual address space of the process. Each PT_LOAD entry contains:
- the offset and size of the segment in the file;
- the virtual address of the segment in the virtual address space and its size;
- the access rights (R, W, E for readable, writable, executable);
- the physical address field, which is not used and is the same as the virtual address field.
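To make this more concrete, here is a minimal sketch of walking the program headers of a file using e_phoff, e_phentsize and e_phnum and printing the PT_LOAD entries. The function name print_load_segments is hypothetical. The code assumes that the file has the native ELF class and byte order of the machine it runs on, and it does not try to be robust against malformed files.

```c
#include <elf.h>
#include <link.h>   /* ElfW() */
#include <stdio.h>
#include <string.h>

/* Prints the PT_LOAD entries of a native-class ELF file.
   Returns the number of PT_LOAD entries, or -1 on error. */
int print_load_segments(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    ElfW(Ehdr) eh;
    int count = -1;
    if (fread(&eh, sizeof eh, 1, f) == 1
        && memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0
        && eh.e_ident[EI_CLASS] == (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32)
        && eh.e_phentsize == sizeof(ElfW(Phdr))) {
        count = 0;
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            ElfW(Phdr) ph;
            /* Entry i lives at e_phoff + i * e_phentsize in the file. */
            long pos = (long) (eh.e_phoff + (unsigned long) i * eh.e_phentsize);
            if (fseek(f, pos, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
                count = -1;
                break;
            }
            if (ph.p_type != PT_LOAD)
                continue;
            printf("LOAD offset=%#llx vaddr=%#llx filesz=%#llx memsz=%#llx %c%c%c\n",
                   (unsigned long long) ph.p_offset,
                   (unsigned long long) ph.p_vaddr,
                   (unsigned long long) ph.p_filesz,
                   (unsigned long long) ph.p_memsz,
                   (ph.p_flags & PF_R) ? 'R' : '-',
                   (ph.p_flags & PF_W) ? 'W' : '-',
                   (ph.p_flags & PF_X) ? 'E' : '-');
            count++;
        }
    }
    fclose(f);
    return count;
}
```

Calling print_load_segments("/proc/self/exe") from a program inspects the program itself. The values should match the LOAD lines of readelf -l for the same file.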
We can check this with:
$ gdb --args /bin/bash-static -c ls
(gdb) break main
(gdb) catch syscall execve
(gdb) run
Starting program: /bin/bash-static -c ls
Catchpoint 1 (call to syscall execve), 0x00000000004ffb97 in ?? ()
(gdb) !cat /proc/$(pgrep bash-static)/maps
00400000-005ce000 r-xp 00000000 08:11 524423 /bin/bash-static
007cd000-007d9000 rw-p 001cd000 08:11 524423 /bin/bash-static
007d9000-00805000 rw-p 00000000 00:00 0 [heap]
7ffff7e69000-7ffff7e70000 r--s 00000000 08:11 1192265 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
7ffff7e70000-7ffff7ffb000 r--p 00000000 08:11 787456 /usr/lib/locale/locale-archive
7ffff7ffb000-7ffff7ffd000 r-xp 00000000 00:00 0 [vdso]
7ffff7ffd000-7ffff7fff000 r--p 00000000 00:00 0 [vvar]
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
The memory mappings of the bash-static executable correspond to the PT_LOAD entries of the ELF file.
The first segment is readable and executable but not writable and contains the code (the .text section) and the read-only data (.rodata). This first segment is never modified and is shared by every instance of the program: each instance of the program shares the same physical memory pages for this segment, reducing the memory consumption.
The second segment is writable and contains the initialised (.data) and uninitialised (.bss) variables. The memory pages of this segment are shared with copy-on-write semantics: each memory page is initially shared between all instances; when a process modifies one of those pages, the OS creates a new copy of the page for this process.
There are slight differences between the ranges specified in the ELF file and the ranges in the process: because a virtual address space mapping must be page aligned (4 KiB aligned on AMD-64), the three lowest nibbles of the mapping addresses must be zero (so some extra bytes of the file are mapped in each memory mapping).
The size of this second segment is larger in memory than in the file: the remaining bytes are filled with zeros (.bss).
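The page rounding can be checked with a bit of arithmetic. The sketch below assumes 4 KiB pages; the names (ASSUMED_PAGE_SIZE, struct mapping, segment_mapping) are invented for the example and this is not the kernel's actual code.

```c
#include <stdint.h>

#define ASSUMED_PAGE_SIZE UINT64_C(4096)   /* assumption: 4 KiB pages */

struct mapping {
    uint64_t start;        /* first mapped address */
    uint64_t file_offset;  /* file offset of the mapping */
    uint64_t file_end;     /* end of the file-backed part */
    uint64_t mem_end;      /* end including the zero-filled part */
};

static uint64_t page_floor(uint64_t x)
{
    return x & ~(ASSUMED_PAGE_SIZE - 1);
}

static uint64_t page_ceil(uint64_t x)
{
    return page_floor(x + ASSUMED_PAGE_SIZE - 1);
}

/* Computes the page-aligned ranges covering one PT_LOAD entry. */
struct mapping segment_mapping(uint64_t p_offset, uint64_t p_vaddr,
                               uint64_t p_filesz, uint64_t p_memsz)
{
    struct mapping m;
    m.start = page_floor(p_vaddr);
    m.file_offset = page_floor(p_offset);
    m.file_end = page_ceil(p_vaddr + p_filesz);
    m.mem_end = page_ceil(p_vaddr + p_memsz);
    return m;
}
```

For the second LOAD entry of bash-static, segment_mapping(0x1cde60, 0x7cde60, 0xa900, 0x13720) gives start = 0x7cd000, file_offset = 0x1cd000 and file_end = 0x7d9000, which is the 007cd000-007d9000 mapping at file offset 001cd000 seen in /proc/${pid}/maps. mem_end is 0x7e2000: the pages between file_end and mem_end hold the rest of the .bss and are not backed by the file.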
Sections
The ELF file contains a list of sections as well: the sections are not needed at runtime but can be used to understand the different parts of the binary and are used by debuggers.
The section headers describe the sections in the ELF object. Those headers are located using the e_shoff, e_shentsize and e_shnum fields of the ELF header. Each section header is defined as:
typedef struct
{
ElfXX_Word sh_name; /* Section name (string tbl index) */
ElfXX_Word sh_type; /* Section type */
ElfXX_Word sh_flags; /* Section flags (Elf64_Xword in ELF-64) */
ElfXX_Addr sh_addr; /* Section virtual addr at execution */
ElfXX_Off sh_offset; /* Section file offset */
ElfXX_Word sh_size; /* Section size in bytes (Elf64_Xword in ELF-64) */
ElfXX_Word sh_link; /* Link to another section */
ElfXX_Word sh_info; /* Additional section information */
ElfXX_Word sh_addralign; /* Section alignment (Elf64_Xword in ELF-64) */
ElfXX_Word sh_entsize; /* Entry size if section holds table (Elf64_Xword in ELF-64) */
} ElfXX_Shdr;
readelf -S displays the section headers:
$ readelf -S /bin/bash-static
There are 28 section headers, starting at offset 0x1d8898:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .note.ABI-tag NOTE 0000000000400190 00000190
0000000000000020 0000000000000000 A 0 0 4
[ 2] .note.gnu.build-i NOTE 00000000004001b0 000001b0
0000000000000024 0000000000000000 A 0 0 4
[ 3] .rela.plt RELA 00000000004001d8 000001d8
00000000000001b0 0000000000000018 AI 0 5 8
[ 4] .init PROGBITS 0000000000400388 00000388
000000000000001a 0000000000000000 AX 0 0 4
[ 5] .plt PROGBITS 00000000004003b0 000003b0
0000000000000120 0000000000000000 AX 0 0 16
[ 6] .text PROGBITS 00000000004004d0 000004d0
0000000000159ab4 0000000000000000 AX 0 0 16
[ 7] __libc_freeres_fn PROGBITS 0000000000559f90 00159f90
0000000000000d9c 0000000000000000 AX 0 0 16
[ 8] __libc_thread_fre PROGBITS 000000000055ad30 0015ad30
00000000000000e0 0000000000000000 AX 0 0 16
[ 9] .fini PROGBITS 000000000055ae10 0015ae10
0000000000000009 0000000000000000 AX 0 0 4
[10] .rodata PROGBITS 000000000055ae40 0015ae40
00000000000469e0 0000000000000000 A 0 0 64
[11] __libc_subfreeres PROGBITS 00000000005a1820 001a1820
00000000000000a0 0000000000000000 A 0 0 8
[12] __libc_atexit PROGBITS 00000000005a18c0 001a18c0
0000000000000008 0000000000000000 A 0 0 8
[13] __libc_thread_sub PROGBITS 00000000005a18c8 001a18c8
0000000000000010 0000000000000000 A 0 0 8
[14] .eh_frame PROGBITS 00000000005a18d8 001a18d8
000000000002bff4 0000000000000000 A 0 0 8
[15] .gcc_except_table PROGBITS 00000000005cd8cc 001cd8cc
0000000000000148 0000000000000000 A 0 0 1
[16] .tdata PROGBITS 00000000007cde60 001cde60
0000000000000070 0000000000000000 WAT 0 0 8
[17] .tbss NOBITS 00000000007cded0 001cded0
0000000000000038 0000000000000000 WAT 0 0 8
[18] .init_array INIT_ARRAY 00000000007cded0 001cded0
0000000000000010 0000000000000000 WA 0 0 8
[19] .fini_array FINI_ARRAY 00000000007cdee0 001cdee0
0000000000000010 0000000000000000 WA 0 0 8
[20] .jcr PROGBITS 00000000007cdef0 001cdef0
0000000000000008 0000000000000000 WA 0 0 8
[21] .data.rel.ro PROGBITS 00000000007cdf00 001cdf00
00000000000000e4 0000000000000000 WA 0 0 32
[22] .got PROGBITS 00000000007cdfe8 001cdfe8
0000000000000010 0000000000000008 WA 0 0 8
[23] .got.plt PROGBITS 00000000007ce000 001ce000
00000000000000a8 0000000000000008 WA 0 0 8
[24] .data PROGBITS 00000000007ce0c0 001ce0c0
000000000000a6a0 0000000000000000 WA 0 0 64
[25] .bss NOBITS 00000000007d8780 001d8760
0000000000008d88 0000000000000000 WA 0 0 64
[26] __libc_freeres_pt NOBITS 00000000007e1508 001d8760
0000000000000078 0000000000000000 WA 0 0 8
[27] .shstrtab STRTAB 0000000000000000 001d8760
0000000000000134 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
The GDB info files command can be used to display the different sections of a process.
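The way the section headers and their names are located can be sketched as follows: the header table starts at e_shoff, and the sh_name field of each header is an offset into the string table section whose index is e_shstrndx. The names print_section_names and read_at are hypothetical. The sketch assumes a native-class file, ignores the extended numbering used for very large section counts and truncates names to 63 characters.

```c
#include <elf.h>
#include <link.h>   /* ElfW() */
#include <stdio.h>
#include <string.h>

/* Reads len bytes at file offset off; returns 1 on success, 0 on failure. */
static int read_at(FILE *f, unsigned long long off, void *buf, size_t len)
{
    return fseek(f, (long) off, SEEK_SET) == 0 && fread(buf, 1, len, f) == len;
}

/* Prints the section names of a native-class ELF file.
   Returns the number of sections, or -1 on error. */
int print_section_names(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    ElfW(Ehdr) eh;
    ElfW(Shdr) names;   /* header of the section name string table */
    int count = -1;
    if (read_at(f, 0, &eh, sizeof eh)
        && memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0
        && eh.e_ident[EI_CLASS] == (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32)) {
        if (eh.e_shnum == 0) {
            count = 0;   /* no section headers in this file */
        } else if (eh.e_shentsize == sizeof(ElfW(Shdr))
                   && eh.e_shstrndx < eh.e_shnum
                   && read_at(f, eh.e_shoff
                              + (unsigned long long) eh.e_shstrndx * eh.e_shentsize,
                              &names, sizeof names)) {
            count = 0;
            for (unsigned i = 0; i < eh.e_shnum; i++) {
                ElfW(Shdr) sh;
                char name[64] = "";
                if (!read_at(f, eh.e_shoff
                             + (unsigned long long) i * eh.e_shentsize,
                             &sh, sizeof sh)) {
                    count = -1;
                    break;
                }
                if (sh.sh_name < names.sh_size
                    && fseek(f, (long) (names.sh_offset + sh.sh_name), SEEK_SET) == 0) {
                    /* Read a chunk; the name ends at the first NUL byte. */
                    size_t n = fread(name, 1, sizeof name - 1, f);
                    name[n] = '\0';
                }
                printf("[%2u] %-20s addr=%#llx offset=%#llx size=%#llx\n",
                       i, name,
                       (unsigned long long) sh.sh_addr,
                       (unsigned long long) sh.sh_offset,
                       (unsigned long long) sh.sh_size);
                count++;
            }
        }
    }
    fclose(f);
    return count;
}
```

The output is a rough subset of what readelf -S shows for the same file.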
Calling the program entry point
The kernel then sets up the (first, main) stack for the process and calls the entry point of the program (located with the e_entry field of the ELF header).
The stack of the process and its initial registers are used to pass information to the process such as:
- the arguments;
- the environment variables;
- the auxiliary vector which contains additional information used for bootstrapping the process (such as the location of the vDSO).
More information about the auxiliary vector can be found in man getauxval and the relevant specs. We can display the auxiliary vector at the startup of a dynamically-linked executable by setting LD_SHOW_AUXV=1 in its environment:
$ LD_SHOW_AUXV=1 /bin/true
AT_SYSINFO_EHDR: 0x7fffed9fc000
AT_HWCAP: bfebfbff
AT_PAGESZ: 4096
AT_CLKTCK: 100
AT_PHDR: 0x400040
AT_PHENT: 56
AT_PHNUM: 9
AT_BASE: 0x7f6ba3e9f000
AT_FLAGS: 0x0
AT_ENTRY: 0x401432
AT_UID: 1000
AT_EUID: 1000
AT_GID: 1000
AT_EGID: 1000
AT_SECURE: 0
AT_RANDOM: 0x7fffed8b92d9
AT_EXECFN: /bin/true
AT_PLATFORM: x86_64
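The same values can be read from inside a process with getauxval() (declared in sys/auxv.h). A minimal sketch, where the function name print_auxv_summary is made up for the example:

```c
#include <elf.h>        /* AT_* constants */
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval() */

/* Prints a few auxiliary vector entries of the current process and
   returns the number of program headers (AT_PHNUM). */
unsigned long print_auxv_summary(void)
{
    const char *execfn = (const char *) getauxval(AT_EXECFN);

    printf("AT_PAGESZ: %lu\n", getauxval(AT_PAGESZ));
    printf("AT_PHDR:   %#lx\n", getauxval(AT_PHDR));
    printf("AT_PHENT:  %lu\n", getauxval(AT_PHENT));
    printf("AT_PHNUM:  %lu\n", getauxval(AT_PHNUM));
    printf("AT_BASE:   %#lx\n", getauxval(AT_BASE));
    printf("AT_ENTRY:  %#lx\n", getauxval(AT_ENTRY));
    printf("AT_EXECFN: %s\n", execfn != NULL ? execfn : "(not provided)");
    return getauxval(AT_PHNUM);
}
```

Note that getauxval() returns 0 for entries which are not present, so a zero value is ambiguous for some entries.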
Auxiliary vector in the AMD-64 ABI
For example, the System V ABI AMD64 supplement, in section 3.1 (Process Initialization), specifies that the stack contains:
- the number of arguments argc at *(%rsp);
- pointers to the arguments (and a NULL pointer) starting from *(8+%rsp) until *(8+8*argc+%rsp);
- then the environment pointers (and a NULL pointer);
- then the auxiliary vector entries (terminated by an AT_NULL entry).
The corresponding strings are stored after this on the stack.
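Because the auxiliary vector directly follows the NULL pointer which terminates the environment pointers, it can be found by walking past the environment. The sketch below does this from environ. The function names (auxv_from_envp, auxv_lookup, print_auxv_from_stack) are invented for the example, and it only works under the assumption that the environment block is still the original one on the initial stack (i.e. before any setenv() or putenv() call moved it).

```c
#include <elf.h>
#include <link.h>       /* ElfW() */
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval(), used as a cross-check */

extern char **environ;

/* Assumption: envp still points to the original environment pointers
   on the initial stack. The auxiliary vector starts after their NULL. */
const ElfW(auxv_t) *auxv_from_envp(char **envp)
{
    while (*envp != NULL)
        envp++;
    return (const ElfW(auxv_t) *) (envp + 1);
}

/* Returns the value of the first entry of the given type, or 0. */
unsigned long auxv_lookup(const ElfW(auxv_t) *auxv, unsigned long type)
{
    for (; auxv->a_type != AT_NULL; auxv++)
        if (auxv->a_type == type)
            return auxv->a_un.a_val;
    return 0;
}

/* Compares the manual walk with what getauxval() reports. */
void print_auxv_from_stack(void)
{
    const ElfW(auxv_t) *auxv = auxv_from_envp(environ);
    printf("AT_PAGESZ via stack walk: %lu, via getauxval: %lu\n",
           auxv_lookup(auxv, AT_PAGESZ), getauxval(AT_PAGESZ));
}
```

In real code getauxval() is the safer choice; the manual walk is only there to show the layout.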
Dynamic Binary
When dynamic linking is involved, things are more complicated: the libraries must be mapped in memory and the symbols must be resolved.
The libraries must be loadable anywhere in the process virtual address space and must be relocated. The kernel does not only map the program file in memory but the dynamic linker (a.k.a. the interpreter) as well, which must:
- locate and map all dependencies (as well as the shared objects specified in LD_PRELOAD);
- relocate the files.
This is a very high level overview as I understand it:
- the kernel initialises the process:
  - it maps the main program, the interpreter (dynamic linker) segments and the vDSO in the virtual address space;
  - it sets up the stack (passing the arguments, environment) and calls the dynamic linker entry point;
- the dynamic linker loads the different ELF objects and binds them together:
  - it relocates itself (!);
  - it finds and loads the necessary libraries;
  - it does the relocations (which binds the ELF objects);
  - it calls the initialisation functions of the shared objects (those functions are specified in the DT_INIT and DT_INIT_ARRAY entries of the ELF objects);
  - it calls the main program entry point (the main program entry point is found in the AT_ENTRY entry of the auxiliary vector: it has been initialised by the kernel from the e_entry ELF header field);
- the executable then initialises itself.
Base address
The shared objects are designed to be mapped anywhere in the virtual address space without modification: the read-only segment is mapped unmodified in each instance of the shared object and every instance of the library shares the same memory pages for this segment. For this reason, the virtual addresses found in many ELF data structures (such as the program headers) are expressed as offsets from the base address of the shared object (the address at which the shared object is mapped):
$ readelf -l /lib/x86_64-linux-gnu/libc.so.6 | less
Elf file type is DYN (Shared object file)
Entry point 0x21c50
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x0000000000000230 0x0000000000000230 R E 8
INTERP 0x000000000016bfb0 0x000000000016bfb0 0x000000000016bfb0
0x000000000000001c 0x000000000000001c R 10
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000019e774 0x000000000019e774 R E 200000
LOAD 0x000000000019f740 0x000000000039f740 0x000000000039f740
0x0000000000004ff8 0x00000000000092e0 RW 200000
DYNAMIC 0x00000000001a2ba0 0x00000000003a2ba0 0x00000000003a2ba0
0x00000000000001e0 0x00000000000001e0 RW 8
NOTE 0x0000000000000270 0x0000000000000270 0x0000000000000270
0x0000000000000044 0x0000000000000044 R 4
TLS 0x000000000019f740 0x000000000039f740 0x000000000039f740
0x0000000000000010 0x0000000000000080 R 8
GNU_EH_FRAME 0x000000000016bfcc 0x000000000016bfcc 0x000000000016bfcc
0x0000000000006a24 0x0000000000006a24 R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x000000000019f740 0x000000000039f740 0x000000000039f740
0x00000000000038c0 0x00000000000038c0 R 1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn .rela.plt .plt .text __libc_freeres_fn __libc_thread_freeres_fn .rodata .interp .eh_frame_hdr .eh_frame .gcc_except_table .hash
03 .tdata .init_array __libc_subfreeres __libc_atexit __libc_thread_subfreeres .data.rel.ro .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.gnu.build-id .note.ABI-tag
06 .tdata .tbss
07 .eh_frame_hdr
08
09 .tdata .init_array __libc_subfreeres __libc_atexit __libc_thread_subfreeres .data.rel.ro .dynamic .got
In this example the second LOAD segment of the libc is mapped at 0x000000000039f740 + base_address = 0x000000000039f740 + 0x7f69eab7a000 = 0x7f69eaf19740 (the mapping itself starts at the page boundary just below, 0x7f69eaf19000):
$ cat /proc/$(pgrep sleep)/maps
00400000-00407000 r-xp 00000000 08:01 527401 /bin/sleep
00606000-00607000 r--p 00006000 08:01 527401 /bin/sleep
00607000-00608000 rw-p 00007000 08:01 527401 /bin/sleep
0141f000-01440000 rw-p 00000000 00:00 0 [heap]
7f69eab7a000-7f69ead19000 r-xp 00000000 08:01 2626010 /lib/x86_64-linux-gnu/libc-2.19.so
7f69ead19000-7f69eaf19000 ---p 0019f000 08:01 2626010 /lib/x86_64-linux-gnu/libc-2.19.so
7f69eaf19000-7f69eaf1d000 r--p 0019f000 08:01 2626010 /lib/x86_64-linux-gnu/libc-2.19.so
7f69eaf1d000-7f69eaf1f000 rw-p 001a3000 08:01 2626010 /lib/x86_64-linux-gnu/libc-2.19.so
7f69eaf1f000-7f69eaf23000 rw-p 00000000 00:00 0
7f69eaf23000-7f69eaf43000 r-xp 00000000 08:01 2625993 /lib/x86_64-linux-gnu/ld-2.19.so
7f69eaf85000-7f69eb10e000 r--p 00000000 08:01 2245023 /usr/lib/locale/locale-archive
7f69eb10e000-7f69eb111000 rw-p 00000000 00:00 0
7f69eb141000-7f69eb143000 rw-p 00000000 00:00 0
7f69eb143000-7f69eb144000 r--p 00020000 08:01 2625993 /lib/x86_64-linux-gnu/ld-2.19.so
7f69eb144000-7f69eb145000 rw-p 00021000 08:01 2625993 /lib/x86_64-linux-gnu/ld-2.19.so
7f69eb145000-7f69eb146000 rw-p 00000000 00:00 0
7fffffeaa000-7fffffecb000 rw-p 00000000 00:00 0 [stack]
7ffffff03000-7ffffff05000 r-xp 00000000 00:00 0 [vdso]
7ffffff05000-7ffffff07000 r--p 00000000 00:00 0 [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Some part of the second PT_LOAD segment is read-only (the r--p mapping): this is because of the PT_GNU_RELRO program header. This program header asks the dynamic linker to mark this part of the memory as read-only after the relocation is done.
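The addresses in the listing can be recomputed from the program headers and the base address. The following sketch uses invented names (struct addr_range, runtime_address, relro_pages) and assumes that both ends of the read-only range are rounded down to a page boundary, which I believe is what the GNU dynamic linker does. In this example the end is already page aligned, so the rounding of the end does not matter.

```c
#include <stdint.h>

struct addr_range {
    uint64_t start;
    uint64_t end;
};

/* Runtime address of something whose ELF virtual address is vaddr. */
uint64_t runtime_address(uint64_t base, uint64_t vaddr)
{
    return base + vaddr;
}

/* Pages made read-only for a PT_GNU_RELRO entry.
   Assumption: both ends are rounded down to a page boundary. */
struct addr_range relro_pages(uint64_t base, uint64_t p_vaddr,
                              uint64_t p_memsz, uint64_t page_size)
{
    struct addr_range r;
    r.start = runtime_address(base, p_vaddr) & ~(page_size - 1);
    r.end = runtime_address(base, p_vaddr + p_memsz) & ~(page_size - 1);
    return r;
}
```

With the libc values above, relro_pages(0x7f69eab7a000, 0x39f740, 0x38c0, 4096) gives the range 0x7f69eaf19000-0x7f69eaf1d000, which is exactly the r--p mapping of the libc in /proc/${pid}/maps.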
Mapping the executable in memory
As before, the kernel maps the executable in memory using the PT_LOAD entries:
$ readelf -l /bin/bash
Elf file type is EXEC (Executable file)
Entry point 0x4205bc
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000f1a74 0x00000000000f1a74 R E 200000
LOAD 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
0x0000000000009068 0x000000000000f298 RW 200000
DYNAMIC 0x00000000000f1df8 0x00000000006f1df8 0x00000000006f1df8
0x0000000000000200 0x0000000000000200 RW 8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x00000000000d6af0 0x00000000004d6af0 0x00000000004d6af0
0x000000000000407c 0x000000000000407c R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
0x0000000000000220 0x0000000000000220 R 1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .jcr .dynamic .got
Finding the interpreter and running the interpreter
Finding the interpreter
The location of the dynamic linker (also called the interpreter) to use is hard-coded in the executable: the PT_INTERP entry in the program headers defines the location of this string in the executable file and in the process virtual address space.
$ readelf -l /bin/bash
Elf file type is EXEC (Executable file)
Entry point 0x4205bc
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 8
[...]
Section to Segment mapping:
Segment Sections...
00
01 .interp
[...]
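Reading the interpreter path from a file only needs the program headers: the PT_INTERP entry gives the file offset (p_offset) and the size (p_filesz) of the string. A minimal sketch with a hypothetical function name (read_interp), assuming a native-class file and no particular robustness against malformed input:

```c
#include <elf.h>
#include <link.h>   /* ElfW() */
#include <stdio.h>
#include <string.h>

/* Copies the PT_INTERP string of a native-class ELF file into buf.
   Returns 1 if found, 0 if there is no PT_INTERP entry, -1 on error. */
int read_interp(const char *path, char *buf, size_t buflen)
{
    if (buflen == 0)
        return -1;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    ElfW(Ehdr) eh;
    int result = -1;
    if (fread(&eh, sizeof eh, 1, f) == 1
        && memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0
        && eh.e_ident[EI_CLASS] == (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32)
        && eh.e_phentsize == sizeof(ElfW(Phdr))) {
        result = 0;
        for (unsigned i = 0; i < eh.e_phnum && result == 0; i++) {
            ElfW(Phdr) ph;
            long pos = (long) (eh.e_phoff + (unsigned long) i * eh.e_phentsize);
            if (fseek(f, pos, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
                result = -1;
                break;
            }
            if (ph.p_type != PT_INTERP)
                continue;
            /* The string is stored in the file with its trailing NUL. */
            size_t len = ph.p_filesz < buflen - 1 ? (size_t) ph.p_filesz : buflen - 1;
            if (fseek(f, (long) ph.p_offset, SEEK_SET) != 0
                || fread(buf, 1, len, f) != len) {
                result = -1;
                break;
            }
            buf[len] = '\0';
            result = 1;
        }
    }
    fclose(f);
    return result;
}
```

For /bin/bash as shown in this article, this should yield /lib64/ld-linux-x86-64.so.2. A statically linked executable has no PT_INTERP entry and the function returns 0.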
Mapping the interpreter
The dynamic linker (/lib64/ld-linux-x86-64.so.2) is mapped by the kernel in the virtual address space of the process (using the PT_LOAD entries):
$ readelf -l /lib64/ld-linux-x86-64.so.2
Elf file type is DYN (Shared object file)
Entry point 0x1190
There are 7 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000001fe08 0x000000000001fe08 R E 200000
LOAD 0x0000000000020c00 0x0000000000220c00 0x0000000000220c00
0x00000000000013e4 0x00000000000015a8 RW 200000
DYNAMIC 0x0000000000020e70 0x0000000000220e70 0x0000000000220e70
0x0000000000000170 0x0000000000000170 RW 8
NOTE 0x00000000000001c8 0x00000000000001c8 0x00000000000001c8
0x0000000000000024 0x0000000000000024 R 4
GNU_EH_FRAME 0x000000000001d440 0x000000000001d440 0x000000000001d440
0x000000000000064c 0x000000000000064c R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x0000000000020c00 0x0000000000220c00 0x0000000000220c00
0x0000000000000400 0x0000000000000400 R 1
Section to Segment mapping:
Segment Sections...
00 .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .rela.dyn .rela.plt .plt .text .rodata .eh_frame_hdr .eh_frame
01 .data.rel.ro .dynamic .got .got.plt .data .bss
02 .dynamic
03 .note.gnu.build-id
04 .eh_frame_hdr
05
06 .data.rel.ro .dynamic .got
Calling the interpreter
Now the kernel calls the entry point of the dynamic linker (located by the e_entry field of its ELF header) with the arguments, environment and auxiliary vector:
$ readelf -h /lib64/ld-linux-x86-64.so.2
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1190
Start of program headers: 64 (bytes into file)
Start of section headers: 139456 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 7
Size of section headers: 64 (bytes)
Number of section headers: 23
Section header string table index: 22
The auxiliary vector contains information which will be used by the dynamic linker and the libc. Some interesting values for the dynamic linker are:
- AT_PHDR, the virtual address of the program headers of the executable;
- AT_BASE, the base address of the interpreter/dynamic linker.
AT_PHDR can be used to find the base address of the executable with:
// Simplified code from the GNU dynamic linker source code:
for (ph = phdr; ph < &phdr[phnum]; ++ph)
if (ph->p_type == PT_PHDR)
main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;
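The same computation can be tried from a normal program using getauxval(). This is only a sketch in the spirit of the snippet above and not the dynamic linker's code; the function name main_program_base is made up.

```c
#include <elf.h>
#include <link.h>       /* ElfW() */
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval() */

/* Returns the base address of the main program: the difference between
   where the program headers really are (AT_PHDR) and where the PT_PHDR
   entry says they should be (p_vaddr). Returns 0 without a PT_PHDR entry. */
ElfW(Addr) main_program_base(void)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *) getauxval(AT_PHDR);
    unsigned long phnum = getauxval(AT_PHNUM);

    if (phdr == NULL)
        return 0;
    for (unsigned long i = 0; i < phnum; i++)
        if (phdr[i].p_type == PT_PHDR)
            return (ElfW(Addr)) phdr - phdr[i].p_vaddr;
    return 0;
}
```

For the /bin/bash shown in this article (an ET_EXEC file), AT_PHDR is 0x400040 and the p_vaddr of the PT_PHDR entry is 0x400040 as well, so the base address is 0. For a position-independent executable the result is the (page aligned) address chosen by the kernel.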
Here are some values for a given process:
$ LD_SHOW_AUXV=1 /bin/bash -c "unset LD_SHOW_AUXV; sleep 100000"
AT_SYSINFO_EHDR: 0x7fff5cbfc000
AT_HWCAP: bfebfbff
AT_PAGESZ: 4096
AT_CLKTCK: 100
AT_PHDR: 0x400040
AT_PHENT: 56
AT_PHNUM: 9
AT_BASE: 0x7ffdd94ce000
AT_FLAGS: 0x0
AT_ENTRY: 0x4205bc
AT_UID: 1000
AT_EUID: 1000
AT_GID: 1000
AT_EGID: 1000
AT_SECURE: 0
AT_RANDOM: 0x7fff5ca4ddf9
AT_EXECFN: /bin/bash
AT_PLATFORM: x86_64
We can see that the AT_BASE field is the base address of the dynamic linker and that AT_PHDR points near the beginning of the executable mapping:
$ cat /proc/10130/maps
00400000-004f2000 r-xp 00000000 08:11 526344 /bin/bash
006f1000-006f2000 r--p 000f1000 08:11 526344 /bin/bash
006f2000-006fb000 rw-p 000f2000 08:11 526344 /bin/bash
006fb000-00702000 rw-p 00000000 00:00 0
01729000-01738000 rw-p 00000000 00:00 0 [heap]
7ffdd8ad2000-7ffdd8c71000 r-xp 00000000 08:11 1192272 /lib/x86_64-linux-gnu/libc-2.19.so
7ffdd8c71000-7ffdd8e71000 ---p 0019f000 08:11 1192272 /lib/x86_64-linux-gnu/libc-2.19.so
7ffdd8e71000-7ffdd8e75000 r--p 0019f000 08:11 1192272 /lib/x86_64-linux-gnu/libc-2.19.so
7ffdd8e75000-7ffdd8e77000 rw-p 001a3000 08:11 1192272 /lib/x86_64-linux-gnu/libc-2.19.so
7ffdd8e77000-7ffdd8e7b000 rw-p 00000000 00:00 0
7ffdd8e7b000-7ffdd8e7e000 r-xp 00000000 08:11 1192277 /lib/x86_64-linux-gnu/libdl-2.19.so
7ffdd8e7e000-7ffdd907d000 ---p 00003000 08:11 1192277 /lib/x86_64-linux-gnu/libdl-2.19.so
7ffdd907d000-7ffdd907e000 r--p 00002000 08:11 1192277 /lib/x86_64-linux-gnu/libdl-2.19.so
7ffdd907e000-7ffdd907f000 rw-p 00003000 08:11 1192277 /lib/x86_64-linux-gnu/libdl-2.19.so
7ffdd907f000-7ffdd90a5000 r-xp 00000000 08:11 1180383 /lib/x86_64-linux-gnu/libtinfo.so.5.9
7ffdd90a5000-7ffdd92a4000 ---p 00026000 08:11 1180383 /lib/x86_64-linux-gnu/libtinfo.so.5.9
7ffdd92a4000-7ffdd92a8000 r--p 00025000 08:11 1180383 /lib/x86_64-linux-gnu/libtinfo.so.5.9
7ffdd92a8000-7ffdd92a9000 rw-p 00029000 08:11 1180383 /lib/x86_64-linux-gnu/libtinfo.so.5.9
7ffdd92a9000-7ffdd92cd000 r-xp 00000000 08:11 1183083 /lib/x86_64-linux-gnu/libncurses.so.5.9
7ffdd92cd000-7ffdd94cc000 ---p 00024000 08:11 1183083 /lib/x86_64-linux-gnu/libncurses.so.5.9
7ffdd94cc000-7ffdd94cd000 r--p 00023000 08:11 1183083 /lib/x86_64-linux-gnu/libncurses.so.5.9
7ffdd94cd000-7ffdd94ce000 rw-p 00024000 08:11 1183083 /lib/x86_64-linux-gnu/libncurses.so.5.9
7ffdd94ce000-7ffdd94ee000 r-xp 00000000 08:11 1192269 /lib/x86_64-linux-gnu/ld-2.19.so
7ffdd951c000-7ffdd96a7000 r--p 00000000 08:11 787456 /usr/lib/locale/locale-archive
7ffdd96a7000-7ffdd96ab000 rw-p 00000000 00:00 0
7ffdd96e5000-7ffdd96ec000 r--s 00000000 08:11 1192265 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
7ffdd96ec000-7ffdd96ee000 rw-p 00000000 00:00 0
7ffdd96ee000-7ffdd96ef000 r--p 00020000 08:11 1192269 /lib/x86_64-linux-gnu/ld-2.19.so
7ffdd96ef000-7ffdd96f0000 rw-p 00021000 08:11 1192269 /lib/x86_64-linux-gnu/ld-2.19.so
7ffdd96f0000-7ffdd96f1000 rw-p 00000000 00:00 0
7fff5ca2f000-7fff5ca50000 rw-p 00000000 00:00 0 [stack]
7fff5cbfc000-7fff5cbfe000 r-xp 00000000 00:00 0 [vdso]
7fff5cbfe000-7fff5cc00000 r--p 00000000 00:00 0 [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Q: why is there a gap in the dynamic linker mapping?
Library resolution
The dynamic linker locates and maps all the required shared objects in the process virtual address space. Each ELF object declares the libraries it depends on with DT_NEEDED entries in the dynamic section.
The PT_DYNAMIC program header gives the position of the dynamic (.dynamic) section in the file and in the virtual address space of the process (as an offset from the base address of the ELF object).
$ readelf -l /bin/bash
Elf file type is EXEC (Executable file)
Entry point 0x4205bc
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
[...]
DYNAMIC 0x00000000000f1df8 0x00000000006f1df8 0x00000000006f1df8
0x0000000000000200 0x0000000000000200 RW 8
[...]
Section to Segment mapping:
Segment Sections...
[...]
04 .dynamic
[...]
The content of the dynamic section can be shown by readelf -d:
$ readelf -d /bin/bash
Dynamic section at offset 0xf1df8 contains 27 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libncurses.so.5]
0x0000000000000001 (NEEDED) Shared library: [libtinfo.so.5]
0x0000000000000001 (NEEDED) Shared library: [libdl.so.2]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000c (INIT) 0x41d570
0x000000000000000d (FINI) 0x4b7f34
0x0000000000000019 (INIT_ARRAY) 0x6f1de0
0x000000000000001b (INIT_ARRAYSZ) 8 (bytes)
0x000000000000001a (FINI_ARRAY) 0x6f1de8
0x000000000000001c (FINI_ARRAYSZ) 8 (bytes)
0x000000006ffffef5 (GNU_HASH) 0x400298
0x0000000000000005 (STRTAB) 0x4121f8
0x0000000000000006 (SYMTAB) 0x404b30
0x000000000000000a (STRSZ) 35877 (bytes)
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000000000015 (DEBUG) 0x0
0x0000000000000003 (PLTGOT) 0x6f2000
0x0000000000000002 (PLTRELSZ) 5112 (bytes)
0x0000000000000014 (PLTREL) RELA
0x0000000000000017 (JMPREL) 0x41c178
0x0000000000000007 (RELA) 0x41c0b8
0x0000000000000008 (RELASZ) 192 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000006ffffffe (VERNEED) 0x41c008
0x000000006fffffff (VERNEEDNUM) 2
0x000000006ffffff0 (VERSYM) 0x41ae1e
0x0000000000000000 (NULL) 0x0
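Here is a sketch of how the NEEDED lines can be extracted from a file: find PT_DYNAMIC, find the string table address (DT_STRTAB), translate this virtual address into a file offset using the PT_LOAD entries, then print the string of each DT_NEEDED entry. The names print_needed and vaddr_to_offset are hypothetical. The code assumes a native-class, well-formed file and does no bounds checking, so it should only be used on trusted files.

```c
#include <elf.h>
#include <fcntl.h>
#include <link.h>       /* ElfW() */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Translates a virtual address into a file offset using the PT_LOAD
   entries. Returns (size_t) -1 if the address is not file-backed. */
static size_t vaddr_to_offset(const ElfW(Phdr) *ph, unsigned n, ElfW(Addr) va)
{
    for (unsigned i = 0; i < n; i++)
        if (ph[i].p_type == PT_LOAD && va >= ph[i].p_vaddr
            && va - ph[i].p_vaddr < ph[i].p_filesz)
            return (size_t) (ph[i].p_offset + (va - ph[i].p_vaddr));
    return (size_t) -1;
}

/* Prints the DT_NEEDED entries of a native-class ELF file.
   Returns their number (0 without dynamic section), or -1 on error. */
int print_needed(const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size < (off_t) sizeof(ElfW(Ehdr))) {
        close(fd);
        return -1;
    }
    unsigned char *base = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return -1;

    int count = -1;
    const ElfW(Ehdr) *eh = (const ElfW(Ehdr) *) base;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0
        && eh->e_ident[EI_CLASS] == (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32)) {
        const ElfW(Phdr) *ph = (const ElfW(Phdr) *) (base + eh->e_phoff);
        const ElfW(Dyn) *dyn = NULL;
        size_t strtab = (size_t) -1;

        for (unsigned i = 0; i < eh->e_phnum; i++)
            if (ph[i].p_type == PT_DYNAMIC)
                dyn = (const ElfW(Dyn) *) (base + ph[i].p_offset);
        count = 0;
        /* First pass: locate the dynamic string table in the file. */
        for (const ElfW(Dyn) *d = dyn; d != NULL && d->d_tag != DT_NULL; d++)
            if (d->d_tag == DT_STRTAB)
                strtab = vaddr_to_offset(ph, eh->e_phnum, d->d_un.d_ptr);
        /* Second pass: each DT_NEEDED value is an offset in this table. */
        for (const ElfW(Dyn) *d = dyn;
             d != NULL && strtab != (size_t) -1 && d->d_tag != DT_NULL; d++)
            if (d->d_tag == DT_NEEDED) {
                printf("NEEDED %s\n", (const char *) (base + strtab + d->d_un.d_val));
                count++;
            }
    }
    munmap(base, (size_t) st.st_size);
    return count;
}
```

For /bin/bash as shown above, this should print the four libraries listed by readelf -d, in the same order.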
The dynamic section declares each shared object dependency as a DT_NEEDED entry. The dynamic linker (transitively) finds all those DT_NEEDED entries and maps the corresponding shared objects in the process virtual address space.
If the DT_NEEDED value contains a /, it is treated as a path name.
Otherwise, the file is searched in the following locations:
- in the paths specified by DT_RPATH if any;
- in the paths specified in the LD_LIBRARY_PATH environment variable;
- in the hard-coded paths:
  - /usr/lib/, /lib/, /usr/lib64/, /lib64/;
  - /usr/lib/${architecture}, /lib/${architecture} in multi-arch systems;
  - /usr/lib, /lib.
A suffix can be added after each of those paths based on the processor capabilities. For example, /lib/i386-linux-gnu/i686/cmov/ for a processor with support for i686 features and the cmov (Conditional Move) instructions.
The libraries specified in LD_PRELOAD (and their dependencies) are loaded as well using the same algorithm.
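The search order described above can be sketched as follows. This is a deliberately simplified model and not the algorithm of ld.so: the function names (find_in_dirs, resolve_needed) are made up, the rpath and default_dirs arguments are colon-separated lists supplied by the caller, and the sketch ignores everything not described in this section (capability suffixes, caches, special handling for setuid programs, empty entries in the lists, etc.).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Tries name in each directory of a colon-separated list.
   Returns 1 and stores the path in out if a readable file is found. */
int find_in_dirs(const char *name, const char *dirs, char *out, size_t outlen)
{
    while (dirs != NULL && *dirs != '\0') {
        size_t len = strcspn(dirs, ":");
        if (len > 0) {
            int n = snprintf(out, outlen, "%.*s/%s", (int) len, dirs, name);
            if (n > 0 && (size_t) n < outlen && access(out, R_OK) == 0)
                return 1;
        }
        dirs += len;
        if (*dirs == ':')
            dirs++;
    }
    return 0;
}

/* Simplified lookup of one DT_NEEDED value: a name containing a '/' is
   used as a path, otherwise rpath, LD_LIBRARY_PATH and the default
   directories are tried in this order. Returns 1 if a file is found. */
int resolve_needed(const char *name, const char *rpath,
                   const char *default_dirs, char *out, size_t outlen)
{
    if (strchr(name, '/') != NULL) {
        int n = snprintf(out, outlen, "%s", name);
        return n > 0 && (size_t) n < outlen && access(out, R_OK) == 0;
    }
    return find_in_dirs(name, rpath, out, outlen)
        || find_in_dirs(name, getenv("LD_LIBRARY_PATH"), out, outlen)
        || find_in_dirs(name, default_dirs, out, outlen);
}
```

For example, resolve_needed("libc.so.6", NULL, "/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/lib:/usr/lib", buf, sizeof buf) mimics the lookup of the libc on a multi-arch system like the one used in this article (the directory list is an assumption about the system).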
The ldd tool can be used to find all the ELF objects loaded by the dynamic linker:
$ ldd /bin/bash
linux-vdso.so.1 (0x00007fff88bfc000)
libncurses.so.5 => /lib/x86_64-linux-gnu/libncurses.so.5 (0x00007f6a58816000)
libtinfo.so.5 => /lib/x86_64-linux-gnu/libtinfo.so.5 (0x00007f6a585ec000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f6a583e7000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6a5803e000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6a58a6c000)
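A running process can get a similar list for itself with dl_iterate_phdr() (a GNU extension declared in link.h), which calls a callback for each loaded ELF object with its name and base address. A minimal sketch, where print_object and print_loaded_objects are names chosen for this example:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for dl_iterate_phdr() */
#endif
#include <link.h>
#include <stdio.h>

/* Callback: prints the name and base address of one loaded object. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void) size;
    printf("%s (base %#llx, %u program headers)\n",
           info->dlpi_name[0] != '\0' ? info->dlpi_name : "(unnamed object)",
           (unsigned long long) info->dlpi_addr,
           (unsigned) info->dlpi_phnum);
    (*count)++;
    return 0;   /* 0 means: continue with the next object */
}

/* Prints all loaded ELF objects and returns how many there are. */
int print_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

With glibc the main program is usually reported with an empty name, which is why the sketch prints a placeholder in this case.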
Symbols
The dynamic linker uses symbols to link ELF objects together:
- a component can export a symbol corresponding to a function or a variable which is instantiated in this ELF object;
- another component can reference this symbol because it uses this function/variable;
- the linker matches the symbol names and links the users of the symbol with the definer of the symbol by storing the value associated with the symbol (usually the address of the corresponding function or variable) where the users of the symbol expect to find it.
The .dynsym section (found under DT_SYMTAB in the dynamic section) contains the list of symbols (imported as well as exported) necessary at runtime:
- symbols imported by the ELF object (their Ndx is UND), such as endgrent in this case;
- symbols exported by the ELF object (such as interactive_comments).
$ readelf -s /bin/bash
Symbol table '.dynsym' contains 2291 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND endgrent@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __ctype_toupper_loc@GLIBC_2.3 (3)
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND iswlower@GLIBC_2.2.5 (2)
[...]
17: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab
[...]
1682: 00000000006fae48 0 NOTYPE GLOBAL DEFAULT 25 __bss_start
[...]
2283: 00000000004ccd10 16 OBJECT GLOBAL DEFAULT 15 true_doc
2284: 0000000000496b00 165 FUNC GLOBAL DEFAULT 13 mbsmbchar
2285: 00000000004764c0 47 FUNC GLOBAL DEFAULT 13 sh_wrerror
2286: 00000000004491e0 18 FUNC GLOBAL DEFAULT 13 restore_pgrp_pipe
2287: 00000000006f2e80 4 OBJECT GLOBAL DEFAULT 24 interactive_comments
2288: 00000000004b5a40 490 FUNC GLOBAL DEFAULT 13 tilde_expand_word
2289: 0000000000460600 307 FUNC GLOBAL DEFAULT 13 array_shift
2290: 0000000000700bcc 4 OBJECT GLOBAL DEFAULT 25 history_lines_this_sessio
More information about the different fields of the symbol table is in the appendix.
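As a rough illustration of how the columns printed by readelf -s map onto a symbol table entry, here is a minimal sketch using the macros of elf.h. The function names (symbol_binding() and so on) are made up for this example, and only a few bindings and types are handled.

```c
#include <elf.h>

/* "Bind" column: upper 4 bits of st_info. */
const char *symbol_binding(const Elf64_Sym *sym)
{
  switch (ELF64_ST_BIND(sym->st_info)) {
  case STB_LOCAL:  return "LOCAL";
  case STB_GLOBAL: return "GLOBAL";
  case STB_WEAK:   return "WEAK";
  default:         return "?";
  }
}

/* "Type" column: lower 4 bits of st_info. */
const char *symbol_type(const Elf64_Sym *sym)
{
  switch (ELF64_ST_TYPE(sym->st_info)) {
  case STT_NOTYPE: return "NOTYPE";
  case STT_OBJECT: return "OBJECT";
  case STT_FUNC:   return "FUNC";
  default:         return "?";
  }
}

/* "Ndx" column: an undefined (UND) symbol must be provided by another object. */
int symbol_is_imported(const Elf64_Sym *sym)
{
  return sym->st_shndx == SHN_UNDEF;
}
```

With these helpers, the endgrent entry above (FUNC, GLOBAL, UND) is reported as imported whereas interactive_comments (OBJECT, GLOBAL, section 24) is not.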
Relocation
A given ELF object defines some symbols and imports/uses some others. The dynamic linker needs to connect those references by placing the value of the symbols (typically the effective address of the referenced variable/function) where the ELF object expects to find it. This process of resolving the symbol references is called relocation.
The relocation tables can be shown with readelf -r:
$ readelf -r /bin/bash
Relocation section '.rela.dyn' at offset 0x1c0b8 contains 8 entries:
Offset Info Type Sym. Value Sym. Name + Addend
0000006f1ff8 006800000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
0000006fae80 01a100000005 R_X86_64_COPY 00000000006fae80 stdout + 0
0000006fae88 07de00000005 R_X86_64_COPY 00000000006fae88 stdin + 0
0000006fae90 06bc00000005 R_X86_64_COPY 00000000006fae90 UP + 0
0000006fae98 01e200000005 R_X86_64_COPY 00000000006fae98 __environ + 0
0000006faea0 060100000005 R_X86_64_COPY 00000000006faea0 PC + 0
0000006faec0 042700000005 R_X86_64_COPY 00000000006faec0 BC + 0
0000006faec8 06e400000005 R_X86_64_COPY 00000000006faec8 stderr + 0
Relocation section '.rela.plt' at offset 0x1c178 contains 213 entries:
Offset Info Type Sym. Value Sym. Name + Addend
0000006f2018 000100000007 R_X86_64_JUMP_SLO 0000000000000000 endgrent + 0
0000006f2020 000200000007 R_X86_64_JUMP_SLO 0000000000000000 __ctype_toupper_loc + 0
0000006f2028 000300000007 R_X86_64_JUMP_SLO 0000000000000000 iswlower + 0
0000006f2030 000400000007 R_X86_64_JUMP_SLO 0000000000000000 sigprocmask + 0
0000006f2038 000500000007 R_X86_64_JUMP_SLO 0000000000000000 __snprintf_chk + 0
0000006f2040 000600000007 R_X86_64_JUMP_SLO 0000000000000000 getservent + 0
0000006f2048 000700000007 R_X86_64_JUMP_SLO 0000000000000000 wcscmp + 0
0000006f2050 000800000007 R_X86_64_JUMP_SLO 0000000000000000 putchar + 0
[...]
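In relocatable files (.o), each .rela.foo section defines relocations for the corresponding .foo section. In executables and shared objects, the dynamic relocations are instead gathered in two tables:
- the .rela.dyn section defines the relocations which are applied when the object is loaded (there is no .dyn section);
- the .rela.plt section defines the relocations used by the PLT: their offsets point to slots located just after the DT_PLTGOT address (0x6f2000 here).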
An entry in those tables is defined as:
typedef struct
{
ElfXX_Addr r_offset; /* Address */
ElfXX_Xword r_info; /* Relocation type and symbol index */
ElfXX_Sxword r_addend; /* Addend */
} ElfXX_Rela;
The fields of this table are:
- the offset describes the location of this relocation from the base address;
- the symbol (stored as an index in r_info) and the addend represent the value of the relocation: the dynamic linker resolves the symbol and adds the addend;
- the type (stored in r_info as well) represents the way this value will be stored at this location.
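The following sketch shows how those fields can be decoded and used. It is not the code of the dynamic linker: apply_rela() is a hypothetical helper which only handles a few x86-64 relocation types (with the computations given in the AMD64 supplement of the System V ABI) and expects the caller to have resolved the symbol already.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* r_info packs the symbol index (upper 32 bits) and the type (lower 32 bits). */
uint32_t rela_symbol_index(const Elf64_Rela *rela)
{
  return (uint32_t) ELF64_R_SYM(rela->r_info);
}

uint32_t rela_type(const Elf64_Rela *rela)
{
  return (uint32_t) ELF64_R_TYPE(rela->r_info);
}

/* Store the relocated value at base + r_offset.
   Returns 0 on success, -1 for a type this sketch does not handle. */
int apply_rela(unsigned char *base, const Elf64_Rela *rela, uint64_t symbol_value)
{
  uint64_t value;
  switch (ELF64_R_TYPE(rela->r_info)) {
  case R_X86_64_GLOB_DAT:
  case R_X86_64_JUMP_SLOT:
    value = symbol_value;                                    /* S */
    break;
  case R_X86_64_64:
    value = symbol_value + (uint64_t) rela->r_addend;        /* S + A */
    break;
  case R_X86_64_RELATIVE:
    value = (uint64_t) (uintptr_t) base + (uint64_t) rela->r_addend;  /* B + A */
    break;
  default:
    return -1;
  }
  memcpy(base + rela->r_offset, &value, sizeof value);
  return 0;
}
```

For example, the Info value 000100000007 of the endgrent relocation above decodes to symbol index 1 and type 7 (R_X86_64_JUMP_SLOT), which matches entry 1 of the .dynsym table.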
The dynamic linker finds the relocation table in the dynamic section with DT_RELA (base address of the relocations) and DT_RELASZ (size in bytes): this is usually the .rela.dyn section. The dynamic linker applies all those relocations in all loaded objects.
Another relocation table can be applied lazily on demand (lazy binding). Those relocations are indicated with DT_JMPREL (base address) and DT_PLTRELSZ (size in bytes): this is usually the .rela.plt section. Those relocations are usually deferred (unless lazy binding is disabled by setting LD_BIND_NOW) in order to speed up the initialisation of the program.
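The dynamic section entries can be cross-checked against the readelf -r output with a little arithmetic. The sketch below walks a dynamic section (an array terminated by a DT_NULL entry) and divides a table size by DT_RELAENT. The dyn_find() and rela_count() names are hypothetical, and the sketch assumes RELA-style tables (DT_PLTREL is RELA).

```c
#include <elf.h>
#include <stddef.h>

/* Find the first entry with the given tag; the array ends with DT_NULL. */
const Elf64_Dyn *dyn_find(const Elf64_Dyn *dyn, Elf64_Sxword tag)
{
  for (; dyn->d_tag != DT_NULL; ++dyn)
    if (dyn->d_tag == tag)
      return dyn;
  return NULL;
}

/* Number of relocations in the table whose size is given by `size_tag`
   (DT_RELASZ or DT_PLTRELSZ). Returns 0 if the information is missing. */
size_t rela_count(const Elf64_Dyn *dyn, Elf64_Sxword size_tag)
{
  const Elf64_Dyn *size = dyn_find(dyn, size_tag);
  const Elf64_Dyn *entry_size = dyn_find(dyn, DT_RELAENT);
  if (size == NULL || entry_size == NULL || entry_size->d_un.d_val == 0)
    return 0;
  return (size_t) (size->d_un.d_val / entry_size->d_un.d_val);
}
```

With the values of the bash example, DT_RELASZ gives 192 / 24 = 8 entries and DT_PLTRELSZ gives 5112 / 24 = 213 entries, which are the entry counts reported by readelf for .rela.dyn and .rela.plt.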
Initialisation functions
The linker then calls the initialisation functions of the shared objects. Each function is passed the argc, argv and envp parameters.
They are found and executed with (in this order):
- the DT_PREINIT_ARRAY and DT_PREINIT_ARRAYSZ fields (for the .preinit_array section);
- the DT_INIT field (for the .init section);
- the DT_INIT_ARRAY and DT_INIT_ARRAYSZ fields (for the .init_array section).
The constructors of all dependencies of a shared object are called before the constructor of this shared object.
The initialisation functions of the executable are not called by the dynamic linker but by the __libc_csu_init function (for the GNU libc), a part of the libc which is statically linked in the executable (CSU presumably stands for "C start up", after the csu/ directory of the glibc sources):
const size_t size = __init_array_end - __init_array_start;
for (size_t i = 0; i < size; i++)
(*__init_array_start [i]) (argc, argv, envp);
The preinitialisation functions of the executable, however, are called by the dynamic linker.
See the appendix for how to define initialisation functions.
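As a small illustration of the argc, argv and envp parameters, here is a sketch of a constructor which records its arguments. It relies on the behaviour of the GNU libc described above (other C libraries might not pass any argument to the constructors), and the names used are made up for this example.

```c
#include <stddef.h>

static int seen_argc = -1;
static const char *seen_program = NULL;

/* Runs before main(); with the GNU libc it receives the same
   arguments as main(). */
__attribute__((constructor))
static void record_arguments(int argc, char **argv, char **envp)
{
  (void) envp;  /* unused in this sketch */
  seen_argc = argc;
  seen_program = (argc > 0 && argv != NULL) ? argv[0] : NULL;
}

/* Accessors used to observe what the constructor saw. */
int constructor_seen_argc(void)
{
  return seen_argc;
}

const char *constructor_seen_program(void)
{
  return seen_program;
}
```

Calling constructor_seen_argc() from main() returns the argc of the process: the constructor has already run by the time main() is entered.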
Entry point
The dynamic linker then calls the entry point specified in the ELF header of the executable.
In our bash example, we can check that this entry point is in the .text section and is the _start function (in the symbol table):
$ readelf -h /bin/bash
ELF Header:
[...]
Entry point address: 0x4205bc
[...]
$ readelf -S /bin/bash
There are 27 section headers, starting at offset 0xfaf38:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[...]
[13] .text PROGBITS 000000000041e2f0 0001e2f0
0000000000099c42 0000000000000000 AX 0 0 16
[...]
$ readelf -s /bin/bash
Symbol table '.dynsym' contains 2291 entries:
Num: Value Size Type Bind Vis Ndx Name
[...]
1726: 00000000004205bc 0 FUNC GLOBAL DEFAULT 13 _start
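The check done by hand above is a simple range test on the section header. A minimal sketch (section_contains() is a hypothetical helper, not a function of any library):

```c
#include <elf.h>

/* Is `address` inside the memory range covered by this section? */
int section_contains(const Elf64_Shdr *section, Elf64_Addr address)
{
  return address >= section->sh_addr
      && address - section->sh_addr < section->sh_size;
}
```

With the values of the .text section above (address 0x41e2f0, size 0x99c42), the section ends at 0x4b7f32 and the entry point 0x4205bc falls inside it.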
Program startup
The rest of the initialisation process is not done by the dynamic linker anymore:
- _start calls the libc __libc_start_main;
- __libc_start_main calls the executable __libc_csu_init (statically-linked part of the libc);
- __libc_csu_init calls the executable constructors (and other initialisations);
- __libc_start_main calls the executable main();
- __libc_start_main calls exit() with the value returned by main().
However, the dynamic linker can still be used later for two reasons:
- for lazy binding;
- for dynamic loading of shared objects with dlopen() and dynamic resolution of symbols with dlsym().
Conclusion
Advanced topics not covered here:
- Debugging information (DWARF, .debug_*);
- Dynamic loading with dlopen() and dynamic symbol resolution with dlsym();
- Multiple namespaces with dlmopen();
- Symbol versioning;
- Thread Local Storage (TLS) support (PT_TLS, .tdata, .tbss);
- Audit (see man rtld-audit and the Sun Linkers and Libraries Guide);
- Lazy binding implementation (PLT);
- PIC (-fPIC), PIE (-fPIC -pie) and ASLR (see Position Independent Code in shared libraries and Position Independent Code in shared libraries on x64);
- Symbol interposition.
Appendix: more details
Symbol fields
Symbols have an associated type such as STT_FUNC for functions and STT_OBJECT for data.
The Ndx field is the index of the section the symbol is in (UND for an undefined symbol).
The @ suffix in the symbol names is related to symbol versioning, which is an extension.
Binding
The binding is used by the static linker and defines how the symbols are visible across the different .o files of the same final object:
- STB_LOCAL means that the symbol is only visible inside the .o file (static variables and functions in C);
- STB_GLOBAL means that it is visible outside of the .o file (only one definition may be present when statically linking the .o files together);
- STB_WEAK is similar but allows multiple instances and can be overridden by a STB_GLOBAL in another .o file. Moreover, a reference to such a symbol does not pull its definition out of a static library.
void __attribute__((weak)) foo(void) {}
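Weak symbols can also be used on the referencing side, like the _ITM_deregisterTMCloneTab entry (WEAK, UND) of the bash symbol table. A minimal sketch, assuming GCC or a compatible compiler: optional_hook() is a hypothetical function which no object defines here, so its address is NULL at runtime instead of causing a link error.

```c
#include <stddef.h>

/* Weak reference: the link succeeds even if nothing defines this symbol. */
extern void optional_hook(void) __attribute__((weak));

/* Returns 1 if the hook was found and called, 0 otherwise. */
int call_optional_hook(void)
{
  if (optional_hook != NULL) {
    optional_hook();
    return 1;
  }
  return 0;
}
```

If another object linked in the program (or loaded before the relocation is resolved) defines optional_hook(), the same code calls it without any change.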
Visibility
The visibility defines the visibility of the symbol for the dynamic linker:
- STV_DEFAULT is the default visibility (STB_LOCAL symbols are hidden);
- STV_PROTECTED means that inside this executable/shared-object, this symbol cannot be overridden by a symbol from another executable/shared-object;
- STV_HIDDEN is used in .o files for symbols which must not be used outside of the resulting executable/shared-object. They are transformed into STB_LOCAL/STV_DEFAULT by the link-editor.
In GCC the default visibility can be changed with -fvisibility=hidden for a given file and can be changed on a per-symbol basis with the visibility attribute:
void __attribute__((visibility("default"))) foo(void) {}
Initialisation (and preinitialisation) functions
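#include <stdlib.h>

/* Preinitialisation function (only allowed in executables). */
void pre_init(void) {
  abort();
}

/* Store a pointer to pre_init in the .preinit_array section. */
void (*const preinit_array []) (void)
  __attribute__ ((section (".preinit_array"),
                  aligned (sizeof (void *)))) =
{
  &pre_init
};

/* Initialisation function: the compiler stores a pointer to it in .init_array. */
__attribute__((constructor))
void init(void) {
  abort();
}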
Appendix: dynamic loading and dynamic symbol resolution
The functions related to dynamic loading of libraries (dlopen()) and dynamic symbol resolution (dlsym()) are implemented in libdl.so. The loading and linking of ELF shared-objects and the resolution of the symbols are handled by the dynamic linker: libdl.so delegates most of its job to the dynamic linker.
dlopen()
This is the core of the GNU dlopen() (in dlfcn/dlopen.c):
#ifndef SHARED
# define GLRO(name) _##name
#else
# ifdef IS_IN_rtld
# define GLRO(name) _rtld_local_ro._##name
# else
# define GLRO(name) _rtld_global_ro._##name
# endif
#endif
static void
dlopen_doit (void *a)
{
struct dlopen_args *args = (struct dlopen_args *) a;
if (args->mode & ~(RTLD_BINDING_MASK | RTLD_NOLOAD | RTLD_DEEPBIND
| RTLD_GLOBAL | RTLD_LOCAL | RTLD_NODELETE
| __RTLD_SPROF))
GLRO(dl_signal_error) (0, NULL, NULL, _("invalid mode parameter"));
args->new = GLRO(dl_open) (args->file ?: "", args->mode | __RTLD_DLOPEN,
args->caller,
args->file == NULL ? LM_ID_BASE : NS,
__dlfcn_argc, __dlfcn_argv, __environ);
}
The dynamic linker exposes a set of callbacks to the application in the _rtld_global_ro object:
struct rtld_global_ro {
// [...]
void *(*_dl_open) (const char *file, int mode, const void *caller_dlopen,
Lmid_t nsid, int argc, char *argv[], char *env[]);
void (*_dl_close) (void *map);
// [...]
};
This _rtld_global_ro object is defined in ld.so:
25: 0000000000220cc0 304 OBJECT GLOBAL DEFAULT 15 _rtld_global_ro@@GLIBC_PRIVATE
and used in libdl.so:
13: 0000000000000000 0 OBJECT GLOBAL DEFAULT UND _rtld_global_ro@GLIBC_PRIVATE (7)
dlclose() and dlmopen() use the same mechanism.
dlsym()
The dlsym() function directly uses the _dl_sym() function of ld.so:
static void
dlsym_doit (void *a)
{
struct dlsym_args *args = (struct dlsym_args *) a;
args->sym = _dl_sym (args->handle, args->name, args->who);
}
dlvsym() and dladdr() use the same mechanism.
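From the application side, those functions are used as follows. This is a minimal sketch which assumes a dynamically linked program (older versions of the GNU libc also need -ldl at link time). The symbol_is_resolvable() helper is a made-up name.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if `name` can be resolved from the main program and its
   dependencies, 0 if it cannot, -1 if dlopen() itself failed. */
int symbol_is_resolvable(const char *name)
{
  /* A NULL file name gives a handle on the main program. */
  void *handle = dlopen(NULL, RTLD_LAZY);
  if (handle == NULL)
    return -1;
  int found = dlsym(handle, name) != NULL;
  dlclose(handle);
  return found;
}
```

For instance, malloc is expected to be found through the libc dependency of the program, whereas a name which no loaded object exports is not. Note that a symbol whose value is NULL would be reported as missing by this sketch; dlerror() can be used to tell the two cases apart.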
References
- System V ABI
- System V ABI, AMD64 supplement
- Linux Standard Base
- Understanding ld-linux.so.2
- About ELF Auxiliary Vectors
- Hello from a libc-free world!
- Linux x86 Program Start Up
- How To Write Shared Libraries
- Concise summary of the ELF format
- Inside ELF Symbol Tables
- LSB - Symbol Versioning
- ELF-64 Object File Format
- DT_GNU_HASH
- Linkers and names (DT_SONAME)
- Optimizing Linker Load Times