The ELF file format

Some notes on the ELF 🧝 file format with references, explanations and some examples.

  
  

— John Ronald Reuel Tolkien

The ELF (Executable and Linkable Format) file format is a standard file format for executable files, dynamic libraries[1] (DSOs, .so files), compiled compilation units (.o files) and core dumps. It is used on many platforms[2] including many recent Unix-ish systems (System V, GNU, BSD) and embedded software[3].

You might want to read this document alongside the output of readelf, objdump -D[4], objcopy --dump-section, elfcat[5] and/or a hexadecimal editor. You might want to cross-reference with elf.h, the manpage (man 5 elf) or the ELF specs.

Basic structure

The ELF header is located at the beginning of the ELF file and contains information about the target operating system (OS), architecture, the type of ELF file (executable, dynamic library, etc.) and the location of two important structures within the ELF file defining two views of the ELF file:

Execution view

The execution view is given by the program header table. This table is used (by the kernel, by the dynamic linker, etc.) to create a runtime image of the program in memory:

Linking view

The linking view is given by the section header table which describes the location of the different sections (within the file and within the runtime image of the program).

The .o files generated by the compiler are made of different sections (.text for executable code, .data for initialised global variables, .bss for uninitialised global variables, .rodata for read-only global variables, etc.): the link editor combines different .o files into a single executable or DSO (Dynamic Shared Object) by merging the sections of the different .o files which have the same name, and generates some others (.got, .dynamic, .plt, .got.plt, etc.)[6].

The linking view is not used at runtime: all the information needed at runtime is in the program header table. Some sections are not used at runtime (debugging information, full symbol table) and are not present in the execution view. Those sections and the section header table can be omitted (or stripped) from the ELF file.

If it is present, this extra information can be used by debugging tools (such as GDB), profiling tools, etc. Many tools for inspection and manipulation of ELF files (readelf, objdump) rely on the section header table to work correctly.

Other important structures

The dynamic section contains important information used for dynamic linking.

Symbol tables list the symbols defined and used by the file.

Hash tables are used for efficient lookup of symbols by their name (symbol table entries by symbol name).

Relocation tables list the relocations needed to relocate the ELF file at a different memory address or to link it to other ELF objects.

String tables are lists of strings which are referenced from other places in the ELF file (for section names in the section header table, for symbol names in the symbol tables, etc.).

The GOT (Global Offset Table) is a table filled by the dynamic linker with addresses of functions and variables. The program uses those entries to get the address of variables or functions which could be located in another ELF module.

The PLT (Procedure Linkage Table) contains trampolines: they are stubs for functions which might be located in another ELF module. The program calls those stubs, which call the real function (by dereferencing a corresponding GOT entry). This is used for lazy relocation.

Notes are used to add miscellaneous information (such as GNU ABI (Application Binary Interface) information, GNU build IDs).

ELF header

The ELF header is at the beginning of the ELF file and contains:

The ELF header uses the following structure[7]:

typedef struct {
  unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
  Elf64_Half    e_type;             /* Object file type */
  Elf64_Half    e_machine;          /* Architecture */
  Elf64_Word    e_version;          /* Object file version */
  Elf64_Addr    e_entry;            /* Entry point virtual address */
  Elf64_Off     e_phoff;            /* Program header table file offset */
  Elf64_Off     e_shoff;            /* Section header table file offset */
  Elf64_Word    e_flags;            /* Processor-specific flags */
  Elf64_Half    e_ehsize;           /* ELF header size in bytes */
  Elf64_Half    e_phentsize;        /* Program header table entry size */
  Elf64_Half    e_phnum;            /* Program header table entry count */
  Elf64_Half    e_shentsize;        /* Section header table entry size */
  Elf64_Half    e_shnum;            /* Section header table entry count */
  Elf64_Half    e_shstrndx;         /* Section header string table index */
} Elf64_Ehdr;

readelf -h can display the content of the ELF header.

ELF class

The e_ident[EI_CLASS] field describes the ELF class: 32-bit (ELFCLASS32) or 64-bit (ELFCLASS64) for 32-bit and 64-bit programs respectively.

The ELF structures are different for the two ELF classes: the fields are the same but their types and sometimes their order are different (in order to have packed structures). For example, the ELF header uses the Elf32_Ehdr and Elf64_Ehdr structures for ELFCLASS32 and ELFCLASS64 respectively.

ELF endianness

The e_ident[EI_DATA] field describes the encoding (endianness) of the architecture (either ELFDATA2LSB or ELFDATA2MSB). The fields of the ELF file are encoded in the encoding/endianness of the architecture: you might have to swap the endianness (see endian.h) if you process ELF files from a foreign architecture.
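As a minimal sketch of how a tool might read these two e_ident fields, the following C function checks the ELF magic and reports the class and endianness. It assumes a system which provides elf.h (such as GNU/Linux); the name elf_ident_class_bits is made up for this illustration and is not part of any library.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

/* Hypothetical helper: `ident` points to the first EI_NIDENT bytes of a file.
 * Returns 0 if the bytes do not start with the ELF magic, -1 if the class or
 * the data encoding is unknown, and otherwise the number of bits of the ELF
 * class (32 or 64). On success, *little_endian is set to 1 for ELFDATA2LSB
 * and to 0 for ELFDATA2MSB. */
int elf_ident_class_bits(const unsigned char *ident, int *little_endian)
{
    assert(ident != NULL && little_endian != NULL);

    if (memcmp(ident, ELFMAG, SELFMAG) != 0)
        return 0;

    switch (ident[EI_DATA]) {
    case ELFDATA2LSB: *little_endian = 1; break;
    case ELFDATA2MSB: *little_endian = 0; break;
    default:          return -1;
    }

    switch (ident[EI_CLASS]) {
    case ELFCLASS32: return 32;
    case ELFCLASS64: return 64;
    default:         return -1;
    }
}
```

With the magic bytes 7f 45 4c 46 02 01 01 shown by readelf -h for an x86_64 binary, this reports a 64-bit, little-endian file.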

ELF type

The ELF type is in the e_type field:

A major difference between ET_EXEC and ET_DYN files is that ET_EXEC files are always mapped at a fixed position in the virtual address space. In contrast, ET_DYN files can be relocated anywhere in the virtual address space by applying a constant offset to their virtual addresses[9]: the same .so file can be mapped at different locations in different processes[10]. Usually, the shared object is mapped at address 0 in the ELF file[11].

Normal (ET_EXEC) executables are always mapped at a given location, so the location of their subprograms and global variables is the same in every process. This knowledge can be exploited by an attacker to get control of the process. In order to avoid this, the program can be compiled as a PIE[12] (Position Independent Executable) which can be mapped (relocated) at any address in the process virtual address space. PIEs, being relocatable, are ET_DYN files instead of ET_EXEC files.

The Linux kernel (vmlinux) uses the ET_EXEC type and its loadable modules (.ko files) use the ET_REL type.

Location of the header tables

The locations of the section header table and of the program header table are described in the ELF header:
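As a rough sketch, the file offset of a given section header can be computed from the e_shoff, e_shentsize and e_shnum fields of the ELF header (the same arithmetic applies to program headers with e_phoff, e_phentsize and e_phnum). The function name is hypothetical, elf.h is assumed to be available, and the sketch deliberately ignores the extended numbering scheme used when a file has a very large number of sections.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: file offset of entry `index` of the section header
 * table, or 0 if the file has no section header table or if the index is
 * out of range. Extended section numbering is not handled. */
uint64_t section_header_offset(const Elf64_Ehdr *ehdr, uint16_t index)
{
    assert(ehdr != NULL);

    if (ehdr->e_shoff == 0 || index >= ehdr->e_shnum)
        return 0;

    /* The table is a plain array: base offset + index * entry size. */
    return ehdr->e_shoff + (uint64_t)index * ehdr->e_shentsize;
}
```

The entry size is read from the header instead of using sizeof(Elf64_Shdr) because the format stores it explicitly.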

Section header table

The section header table defines the linking view of the ELF file: each entry defines a section within the file. The compiler generates relocatable objects (.o files) made of different sections (.text, .data, .rodata, .bss, etc.). When the link editor ld combines different relocatable objects into an executable or shared object, it merges the sections with the same name into a single section in the final output. For example, it combines the .text sections (containing the compiled code) of the different .o files into a single .text section.

The section table is an array of section descriptions with the structure:

typedef struct {
  Elf64_Word    sh_name;      /* Section name (string tbl index) */
  Elf64_Word    sh_type;      /* Section type */
  Elf64_Xword   sh_flags;     /* Section flags */
  Elf64_Addr    sh_addr;      /* Section virtual addr at execution */
  Elf64_Off     sh_offset;    /* Section file offset */
  Elf64_Xword   sh_size;      /* Section size in bytes */
  Elf64_Word    sh_link;      /* Link to another section */
  Elf64_Word    sh_info;      /* Additional section information */
  Elf64_Xword   sh_addralign; /* Section alignment */
  Elf64_Xword   sh_entsize;   /* Entry size if section holds table */
} Elf64_Shdr;

The first entry of a section header table is always an empty null section (type SHT_NULL).

readelf -S can display the section header table. readelf -x can be used to get a hexdump of a given ELF section. A raw dump of a section can be produced with objcopy a.out --dump-section .dynstr=/dev/stdout /dev/null | cat. Note that some sections are not visible to objcopy and objdump: you might want to use elfcat[5:1] instead.

Section names

Each section has a name (.text, .data, .rodata, .bss, .got, .plt, etc.): all section names are stored in a string table (.shstrtab). The e_shstrndx field of the ELF header is the index (in the section header table) of the section containing the section names:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
[...]
  Section header string table index: 26

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [26] .shstrtab         STRTAB           0000000000000000  0001e220
       00000000000000f3  0000000000000000           0     0     1

The sh_name field of the section header is the byte offset of the section name within this string table.

Existing sections

Summary of ELF sections

| Section name | Type | Usage (and equivalent runtime description) |
| --- | --- | --- |
| .text | SHT_PROGBITS | Main executable code |
| .data | SHT_PROGBITS | Initialised read and write data |
| .rodata | SHT_PROGBITS | Read only data |
| .bss | SHT_NOBITS | Uninitialised read and write data |
| .data.rel.ro | SHT_PROGBITS | |
| .tdata | SHT_PROGBITS | Initialised thread-local data (part of PT_TLS) |
| .tbss | SHT_NOBITS | Uninitialised thread-local data (part of PT_TLS) |
| .init | SHT_PROGBITS | Initialisation code (usually .init, DT_INIT) |
| .fini | SHT_PROGBITS | Termination code (usually .fini, DT_FINI) |
| .init_array | SHT_INIT_ARRAY | Addresses of initialisation functions (DT_INIT_ARRAY and DT_INIT_ARRAYSZ) |
| .fini_array | SHT_FINI_ARRAY | Addresses of termination functions (DT_FINI_ARRAY and DT_FINI_ARRAYSZ) |
| .ctors | SHT_PROGBITS | Similar to .init_array but old-school |
| .dtors | SHT_PROGBITS | Similar to .fini_array but old-school |
| .dynsym | SHT_DYNSYM | Dynamic symbol table (DT_SYMTAB) |
| .dynstr | SHT_STRTAB | String table for dynamic linking (DT_STRTAB) |
| .symtab | SHT_SYMTAB | Full symbol table |
| .symtab_shndx | SHT_SYMTAB_SHNDX | |
| .strtab | SHT_STRTAB | String table used for the symbol table |
| .relaXXX | SHT_RELA | Relocations for section XXX, with addend |
| .relXXX | SHT_REL | Relocations for section XXX, without addend |
| .rela.dyn | SHT_RELA | Other runtime relocations, with addend |
| .rel.dyn | SHT_REL | Other runtime relocations, without addend |
| .rela.plt | SHT_RELA | PLT relocations, with addend |
| .rel.plt | SHT_REL | PLT relocations, without addend |
| .got | SHT_PROGBITS | Main GOT |
| .got.plt | SHT_PROGBITS | PLT GOT, GOT used by the PLT (lazy relocations) |
| .hash | SHT_HASH | Standard symbol hash table (DT_HASH) |
| .gnu.hash | SHT_GNU_HASH | GNU symbol hash table (DT_GNU_HASH) |
| .gnu.version | SHT_VERSYM | GNU symbol versions (DT_VERSYM) |
| .gnu.version_r | SHT_VERNEED | GNU versions requirements (DT_VERNEED and DT_VERNEED_NUM) |
| .gnu.version_d | SHT_VERDEF | GNU versions definitions (DT_VERDEF and DT_VERDEF_NUM) |
| .debug_info | SHT_PROGBITS | DWARF, Main DWARF section (variables, subprograms, types, etc.) |
| .debug_abbrev | SHT_PROGBITS | DWARF, Type of the nodes in .debug_info |
| .debug_aranges | SHT_PROGBITS | DWARF |
| .debug_line | SHT_PROGBITS | DWARF, Mapping between instruction and source code lines |
| .debug_str | SHT_PROGBITS | DWARF, Strings for DWARF sections |
| .debug_frame | SHT_PROGBITS | DWARF, Stack unwinding information[13] |
| .debug_macro | | Debug macros (GNU extension) |
| .debug_link | | [14] |
| .stab | SHT_PROGBITS | Debugging information in the (old) stab format |
| .stabstr | SHT_PROGBITS | Strings associated with the .stab section |
| .eh_frame | SHT_PROGBITS | Runtime stack unwinding information[13:1] |
| .eh_frame_hdr | SHT_PROGBITS | Header (location and index) of the EH frame table (PT_GNU_EH_FRAME) |
| .shstrtab | SHT_STRTAB | String table for section names |
| .note.XXXX | SHT_NOTE | Note |
| .note.ABI-tag | SHT_NOTE | ABI used in this file (NT_GNU_ABI_TAG) |
| .note.gnu.build-id | SHT_NOTE | Build ID for this build[15] (NT_GNU_BUILD_ID note) |
| .dynamic | SHT_DYNAMIC | Dynamic table, dynamic linking information (PT_DYNAMIC) |
| .interp | SHT_PROGBITS | Interpreter (PT_INTERP) |
| .group | SHT_GROUP | Group of related sections (used for COMDAT) |
| .comment | | |
| .jcr | SHT_PROGBITS | Used for Java (?) |
| .stapsdt.base | | Used for SystemTap SDT |
| .note.stapsdt | | Used for SystemTap SDT |
| .gcc_except_table | SHT_PROGBITS | LSDA (Language Specific Data) for exception handling |
| .gnu.warning | | Warning message when linking against this file[16] |
| .gnu.warning.XXX | SHT_PROGBITS | Warning message when linking against symbol XXX[16:1] |
| .ARM.extab | SHT_PROGBITS | |
| .ARM.exidx | SHT_ARM_EXIDX | |
| .ARM.attributes | SHT_ARM_ATTRIBUTES | |

Section types

For symbol tables (SHT_SYMTAB and SHT_DYNSYM) and the dynamic section (SHT_DYNAMIC), the sh_link field gives the index of the string table used to find the strings referenced in the section.

For symbol hash tables (SHT_HASH and SHT_GNU_HASH) and relocation tables (SHT_RELA and SHT_REL), sh_link gives the index of the associated symbol table.

Section info

For relocation tables, the sh_info field gives the index of the section it applies to. This is mostly relevant for .o files. For executables and DSOs on GNU systems, .rela.dyn uses 0 because it applies to many different sections, and .rela.plt uses the index of the .plt section even though it applies to the .got.plt section.

For symbol tables, sh_info gives the index of the first non-local symbol in the symbol table: it can be used to skip the STB_LOCAL symbols.

Section flags

The sh_flags is a field of flags:

Program header table

The program header table defines the execution view of the ELF file:

The program table is an array of program headers:

typedef struct {
   uint32_t   p_type;   /* Segment type */
   uint32_t   p_flags;  /* Segment flags */
   Elf64_Off  p_offset; /* Segment file offset */
   Elf64_Addr p_vaddr;  /* Segment virtual address */
   Elf64_Addr p_paddr;  /* Segment physical address */
   uint64_t   p_filesz; /* Segment size in file */
   uint64_t   p_memsz;  /* Segment size in memory */
   uint64_t   p_align;  /* Segment alignment */
} Elf64_Phdr;

The program header table can be seen with readelf -l. readelf also shows which sections are located in the region described by each program header entry.

Segments

A PT_LOAD entry describes a segment to load (typically with mmap()) into the program memory. A typical ELF executable or DSO has two such entries describing two segments[18]:

  1. The first one is the text segment. It is executable, readable but not writable and contains code and read-only data (.text, .rodata, .plt, .eh_frame, etc.);
  2. The second one is the data segment. It is readable, writable but not executable and contains the modifiable data (.data, .got, .got.plt, .bss, etc.).

The idea in this separation is that everything which does not need to be written (read-only data, code) should be read-only:

Note: security considerations

Another important property of this design is that executable segments are not writable[20]. If a process has VMAs[21] which are both executable and writable, an attacker might exploit bugs such as buffer overflows in order to write arbitrary code in the program's memory and possibly execute it. If the executable pages are read-only and the writable pages are not executable, the attacker can still try to write arbitrary code but it will not be executable[22].
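A small sketch of how the p_flags field of a program header maps to the permissions discussed here: the first function renders the flags in the rwx style used by /proc/$pid/maps and the second one flags segments which are both writable and executable. It assumes elf.h is available; the function names are invented for this example.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes "r-x", "rw-", etc. into `out`. */
void segment_perms(uint32_t p_flags, char out[4])
{
    assert(out != NULL);
    out[0] = (p_flags & PF_R) ? 'r' : '-';
    out[1] = (p_flags & PF_W) ? 'w' : '-';
    out[2] = (p_flags & PF_X) ? 'x' : '-';
    out[3] = '\0';
}

/* Hypothetical helper: returns 1 if the segment is both writable and
 * executable, which is what the design described above tries to avoid. */
int segment_is_writable_and_executable(uint32_t p_flags)
{
    return (p_flags & PF_W) != 0 && (p_flags & PF_X) != 0;
}
```

For a typical text segment (PF_R | PF_X) this yields "r-x", and for a typical data segment (PF_R | PF_W) it yields "rw-"; neither is reported as writable and executable.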

Example

A simple hello world program:

  Type           Offset             VirtAddr           PhysAddr
                FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                0x00000000000001c0 0x00000000000001c0  R E    8
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                0x00000000000006dc 0x00000000000006dc  R E    200000
  LOAD           0x00000000000006e0 0x00000000006006e0 0x00000000006006e0
                0x0000000000000230 0x0000000000002288  RW     200000
  DYNAMIC        0x00000000000006f8 0x00000000006006f8 0x00000000006006f8
                0x00000000000001d0 0x00000000000001d0  RW     8
  NOTE           0x000000000000021c 0x000000000040021c 0x000000000040021c
                0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x00000000000005b4 0x00000000004005b4 0x00000000004005b4
                0x0000000000000034 0x0000000000000034  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                0x0000000000000000 0x0000000000000000  RW     10

We can see the resulting VMAs[21:1] in /proc/$pid/maps of a corresponding process:

  • The first VMA (Virtual Memory Area) is the text segment.
  • The second VMA is the part of the data segment which is initialised.
  • The third VMA is the part of the data segment which has not been initialised. This is the end of the .bss section. This part is not stored in the ELF file and is thus a separate MAP_ANONYMOUS VMA.
00400000-00401000 r-xp 00000000 08:13 27418661   /home/foo/temp/wait
00600000-00601000 rw-p 00000000 08:13 27418661   /home/foo/temp/wait
00601000-00603000 rw-p 00000000 00:00 0
[...]
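The boundaries of these VMAs can be recomputed from the PT_LOAD entries. The sketch below rounds the segment addresses to page boundaries; it assumes 4 KiB pages (which is what the listing above suggests) and uses made-up names. It is only an approximation of what a real loader does.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 0x1000u /* assumption: 4 KiB pages */

/* Hypothetical result type for this example. */
struct load_ranges {
    uint64_t start;    /* page-aligned start of the mapping */
    uint64_t file_end; /* end of the pages backed by the file */
    uint64_t mem_end;  /* end of the segment, including anonymous pages */
};

static uint64_t page_down(uint64_t addr)
{
    return addr & ~(uint64_t)(PAGE_SIZE_ASSUMED - 1);
}

static uint64_t page_up(uint64_t addr)
{
    return page_down(addr + PAGE_SIZE_ASSUMED - 1);
}

/* Computes the page ranges covered by a PT_LOAD entry from its p_vaddr,
 * p_filesz and p_memsz fields. */
struct load_ranges load_segment_ranges(uint64_t vaddr, uint64_t filesz,
                                       uint64_t memsz)
{
    struct load_ranges r;

    assert(filesz <= memsz);
    r.start = page_down(vaddr);
    r.file_end = page_up(vaddr + filesz);
    r.mem_end = page_up(vaddr + memsz);
    return r;
}
```

For the data segment of the example (p_vaddr 0x6006e0, p_filesz 0x230, p_memsz 0x2288) this gives a file-backed range 0x600000-0x601000 followed by an anonymous range up to 0x603000, which matches the second and third VMAs.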

Read only relocations

On GNU systems, the dynamic linker may be instructed to mprotect() the .got section against write access after the relocation is finished. This improves the security by preventing the poisoning of the (non-PLT) GOT[23] after the relocation is done.

This is enabled with ld -z relro (which generates a PT_GNU_RELRO entry) and disabled explicitly with ld -z norelro. When enabled, PT_GNU_RELRO is present in the program header table and describes a range of memory which the dynamic linker can mprotect() after the (non-lazy) relocation is done (the .got section).

The same example program linked with ld -z relro features the additional PT_GNU_RELRO entry:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000070c 0x000000000000070c  R E    200000
  LOAD           0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x0000000000000230 0x0000000000002258  RW     200000
  DYNAMIC        0x0000000000000e28 0x0000000000600e28 0x0000000000600e28
                 0x00000000000001d0 0x00000000000001d0  RW     8
  NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x00000000000005e4 0x00000000004005e4 0x00000000004005e4
                 0x0000000000000034 0x0000000000000034  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x00000000000001f0 0x00000000000001f0  R      1

This can be seen in /proc/$pid/maps:

00400000-00401000 r-xp 00000000 08:13 27418663   /home/foo/temp/wait2
00600000-00601000 r--p 00000000 08:13 27418663   /home/foo/temp/wait2
00601000-00602000 rw-p 00001000 08:13 27418663   /home/foo/temp/wait2
00602000-00604000 rw-p 00000000 00:00 0
[...]

In addition, ld -z now (DF_BIND_NOW) can be used to disable lazy relocation. By combining the two options, you can get an executable or DSO without .got.plt, and the whole GOT will be read-only after relocation.

Other program header entries

String tables

String tables are lists of strings. They use the SHT_STRTAB section type. Each string in the string table is terminated by a NUL byte and is referenced by its byte offset from the beginning of the table.

The first entry of a string table is always the empty string (the first byte of a string table is always NUL): the empty string can always be designated with the zero offset.

The content of a string section can be displayed with readelf -p .dynstr or with objcopy a.out --dump-section .dynstr=/dev/stdout /dev/null | tr '\000' '\n'.

Usages:

References to string tables:

Example of .shstrtab (x86_64 GNU/Linux)

Section Headers:

 [Nr] Name              Type             Address           Offset
     Size              EntSize          Flags  Link  Info  Align
 [27] .shstrtab         STRTAB           0000000000000000  000008f1
     0000000000000108  0000000000000000           0     0     1
 

File hexdump:

000008b0: 0000 0000 0000 0000 4743 433a 2028 4465  ........GCC: (De
000008c0: 6269 616e 2034 2e39 2e32 2d31 3029 2034  bian 4.9.2-10) 4
000008d0: 2e39 2e32 0047 4343 3a20 2844 6562 6961  .9.2.GCC: (Debia
000008e0: 6e20 342e 382e 342d 3129 2034 2e38 2e34  n 4.8.4-1) 4.8.4
000008f0: 0000 2e73 796d 7461 6200 2e73 7472 7461  ...symtab..strta
00000900: 6200 2e73 6873 7472 7461 6200 2e69 6e74  b..shstrtab..int
00000910: 6572 7000 2e6e 6f74 652e 4142 492d 7461  erp..note.ABI-ta
00000920: 6700 2e6e 6f74 652e 676e 752e 6275 696c  g..note.gnu.buil
00000930: 642d 6964 002e 676e 752e 6861 7368 002e  d-id..gnu.hash..
00000940: 6479 6e73 796d 002e 6479 6e73 7472 002e  dynsym..dynstr..
00000950: 676e 752e 7665 7273 696f 6e00 2e67 6e75  gnu.version..gnu
00000960: 2e76 6572 7369 6f6e 5f72 002e 7265 6c61  .version_r..rela
00000970: 2e64 796e 002e 7265 6c61 2e70 6c74 002e  .dyn..rela.plt..
00000980: 696e 6974 002e 7465 7874 002e 6669 6e69  init..text..fini
00000990: 002e 726f 6461 7461 002e 6568 5f66 7261  ..rodata..eh_fra
000009a0: 6d65 5f68 6472 002e 6568 5f66 7261 6d65  me_hdr..eh_frame
000009b0: 002e 696e 6974 5f61 7272 6179 002e 6669  ..init_array..fi
000009c0: 6e69 5f61 7272 6179 002e 6a63 7200 2e64  ni_array..jcr..d
000009d0: 796e 616d 6963 002e 676f 7400 2e67 6f74  ynamic..got..got
000009e0: 2e70 6c74 002e 6461 7461 002e 6273 7300  .plt..data..bss.
000009f0: 2e63 6f6d 6d65 6e74 0000 0000 0000 0000  .comment........

This string table of section names starts at 0x8f1:

  • the first entry is the empty string for section header number 0 (offset 0);
  • the second entry is the .symtab string (offset 1);
  • the third entry is the .strtab string (offset 9).
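The offsets listed above can be resolved with a very small lookup function. This is a sketch with a hypothetical name (strtab_get) which takes the raw bytes of a string table and a byte offset, and refuses offsets which are out of range or strings which are not terminated inside the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the NUL-terminated string found at byte
 * offset `off` in a string table of `size` bytes, or NULL if the offset is
 * out of range or if the string is not terminated within the table. */
const char *strtab_get(const char *tab, size_t size, size_t off)
{
    assert(tab != NULL);

    if (off >= size)
        return NULL;
    if (memchr(tab + off, '\0', size - off) == NULL)
        return NULL;
    return tab + off;
}
```

With a table starting with the bytes "\0.symtab\0.strtab\0", offset 0 gives the empty string, offset 1 gives .symtab and offset 9 gives .strtab, as in the hexdump above.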

Symbols and the symbol table

What is a symbol?

Symbols are used for linking (by the link editor and the dynamic linker).

The C statement:

extern int foo;

int foo = 3;

defines a global variable associated with the foo symbol[24].

A user of this global variable:

extern int foo;

int foo_updater()
{
  return foo++;
}

will link to the foo symbol.

The linker will bind the user of the global variable with the global variable because they are using the same symbol name.

Symbol tables

The section header table often includes two different symbol tables:

The former can be used by debugging tools and the latter contains the minimum amount of entries for the dynamic linker. For this reason, only the latter is mapped in the process virtual address space and is present in the dynamic table.

The symbol tables are arrays of symbol entries:

typedef struct {
  Elf64_Word    st_name;  /* Symbol name (string tbl index) */
  unsigned char st_info;  /* Symbol type and binding */
  unsigned char st_other; /* Symbol visibility (and 0) */
  Elf64_Section st_shndx; /* Section index */
  Elf64_Addr    st_value; /* Symbol value */
  Elf64_Xword   st_size;  /* Symbol size */
} Elf64_Sym;

At runtime, the dynamic symbol table is given by the dynamic table entry DT_SYMTAB. Its size is not given and can be inferred from the hash table (DT_HASH or DT_GNU_HASH).
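For the standard DT_HASH table, the inference is direct: the table starts with two words, nbucket and nchain, and the chain array has one slot per symbol table entry. The sketch below reads this count and also shows the standard System V hash function used to select a bucket. It assumes 32-bit hash words (as on x86_64) and uses hypothetical function names; the DT_GNU_HASH table does not store the count directly and is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of a DT_HASH table (32-bit words assumed):
 *   nbucket, nchain, bucket[nbucket], chain[nchain]
 * nchain is also the number of entries of the dynamic symbol table. */
uint32_t dt_hash_symbol_count(const uint32_t *hash)
{
    assert(hash != NULL);
    return hash[1]; /* nchain */
}

/* Standard System V ELF hash of a symbol name. The lookup starts at
 * bucket[elf_hash(name) % nbucket] and then follows the chain array. */
uint32_t elf_hash(const char *name)
{
    uint32_t h = 0;

    assert(name != NULL);
    for (const unsigned char *p = (const unsigned char *)name; *p != 0; p++) {
        uint32_t g;

        h = (h << 4) + *p;
        g = h & 0xf0000000u;
        if (g != 0)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}
```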

readelf -s can display the symbol tables.

Symbol type

Section index

Each symbol can be associated with a section (by its index).

Some special values are used:

Visibility and binding

Common visibility and binding combinations

| Binding | Visibility | Meaning |
| --- | --- | --- |
| STB_LOCAL | STV_DEFAULT | Local to relocatable object |
| STB_GLOBAL | STV_HIDDEN | Local to the executable or DSO[25] |
| STB_GLOBAL | STV_DEFAULT | Global (visible in other runtime ELF modules) |

Symbol binding

The symbol binding controls the link-time visibility of the symbol (i.e. outside translation units and within a given runtime ELF object, but not across runtime ELF objects). It is a part of the st_info field.

Symbol visibility

The symbol visibility controls the visibility across executables and DSOs. It is stored in the st_other field. This field is not relevant for STB_LOCAL symbols.
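The binding and the type are packed in st_info and the visibility is stored in the low bits of st_other. The macros of elf.h can be used to unpack them, as in this sketch (the sym_attrs structure and the sym_decode function are hypothetical names for this example):

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical structure holding the unpacked attributes of a symbol. */
struct sym_attrs {
    unsigned bind;       /* STB_LOCAL, STB_GLOBAL, STB_WEAK, ... */
    unsigned type;       /* STT_NOTYPE, STT_OBJECT, STT_FUNC, ... */
    unsigned visibility; /* STV_DEFAULT, STV_HIDDEN, ... */
};

/* Unpacks the binding and type from st_info and the visibility from
 * st_other using the macros provided by elf.h. */
struct sym_attrs sym_decode(const Elf64_Sym *sym)
{
    struct sym_attrs attrs;

    assert(sym != NULL);
    attrs.bind = ELF64_ST_BIND(sym->st_info);
    attrs.type = ELF64_ST_TYPE(sym->st_info);
    attrs.visibility = ELF64_ST_VISIBILITY(sym->st_other);
    return attrs;
}
```

A global function with hidden visibility would for example have st_info set to ELF64_ST_INFO(STB_GLOBAL, STT_FUNC) and st_other set to STV_HIDDEN.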

The different values are:

The STV_HIDDEN visibility can be used to mark symbols which need not be used outside of the DSO:

The visibility of a symbol can be defined in GCC with the visibility attribute:

__attribute__((visibility("hidden"))) int get_answer(void)
{
  return 42;
}

The default visibility can be changed with command-line arguments with recent versions of GCC (gcc -fvisibility=hidden) or with pragmas:

#pragma GCC visibility push(hidden)
int get_answer(void)
{
  return 42;
}
#pragma GCC visibility pop

Relocation tables

The relocation tables are arrays of relocation entries using one of those forms:

typedef struct {
  Elf64_Addr    r_offset;  /* Address */
  Elf64_Xword   r_info;    /* Relocation type and symbol index */
} Elf64_Rel;

typedef struct {
  Elf64_Addr    r_offset;  /* Address */
  Elf64_Xword   r_info;    /* Relocation type and symbol index */
  Elf64_Sxword  r_addend;  /* Addend */
} Elf64_Rela;

The relocations exist in two forms. In both cases an addend is added to the symbol:

readelf -r can display the relocation tables.
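The Info column displayed by readelf -r is the raw r_info field, which packs the symbol index in its upper 32 bits and the relocation type in its lower 32 bits. The following sketch unpacks it with the macros of elf.h; the function names are hypothetical. The value 0x000b00000009 used in the note after the code is the Info value of the R_X86_64_GOTPCREL example shown later in this post.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index (in the associated symbol table) of the symbol
 * referenced by the relocation. */
uint32_t rela_symbol_index(const Elf64_Rela *rela)
{
    assert(rela != NULL);
    return (uint32_t)ELF64_R_SYM(rela->r_info);
}

/* Hypothetical helper: processor-specific relocation type, such as
 * R_X86_64_GOTPCREL. */
uint32_t rela_type(const Elf64_Rela *rela)
{
    assert(rela != NULL);
    return (uint32_t)ELF64_R_TYPE(rela->r_info);
}
```

For r_info = 0x000b00000009, the symbol index is 11 and the type is 9 (R_X86_64_GOTPCREL).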

Relocation address

ET_REL files have one relocation section .rela.foo (or .rel.foo) per relocated section .foo. The r_offset address of the relocation is the offset within the relocated .foo section.

For ET_EXEC and ET_DYN files, there are usually two relocation tables: the normal relocation table .rela.dyn (or .rel.dyn) and the lazy/PLT relocation table .rela.plt (or .rel.plt). The r_offset address of the relocation has a different meaning: it is the (runtime) virtual address of the relocation. The location of the relocation tables is described at runtime in the dynamic section (DT_RELA, DT_REL, DT_RELASZ, DT_RELSZ, DT_RELAENT, DT_RELENT, DT_PLTREL, DT_PLTRELSZ, DT_JMPREL).

GOT

The executable code is (usually) in the read-only segment:

As we do not want to modify the code (in the read-only text segment) in order to share it, the dynamic linker cannot relocate the DSO by patching the addresses of the referenced objects in the executable code. Instead, the address of the object is stored by the dynamic linker in the writable segment and the code fetches this address.

The link editor creates a section in the writable segment, the GOT (.got), containing all the slots for those addresses[27]. It creates relocation entries in order to make the dynamic linker store the suitable values in the GOT.

GOT examples for x86_64

Compilation

For example, this C code:

extern int foo;

int get_foo()
{
  return foo;
}

compiles into this (gcc -S deref.c -o- -fPIC):

get_foo:
        movq    foo@GOTPCREL(%rip), %rax
        movl    (%rax), %eax
        ret

foo@GOTPCREL(%rip) resolves to a memory address (an entry in the GOT) where the address of foo is written: the first instruction loads this address into the %rax register. In the next instruction, the processor fetches the foo variable by dereferencing this address.

Relocatable object

When compiled into a relocatable object, we get this relocation:

Relocation section '.rela.text' at offset 0x250 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000003  000b00000009 R_X86_64_GOTPCREL 0000000000000000 foo - 4

It asks the link editor to generate a GOT entry for the address of foo and fill the relative address of this GOT entry in the instruction (movq foo@GOTPCREL(%rip), %rax). The link editor creates the GOT entry.

An addend of -4 is used because the relative instructions in x86 are using the address of the next instruction as a base address.
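The role of the -4 addend can be checked with a little arithmetic. The link editor stores target + addend - P in the 32-bit field, where P is the address of the field itself, and the processor adds this value to the address of the next instruction, which is P + 4 when the field is the last part of the instruction. The sketch below uses made-up addresses (a .text section at 0x1000 and a GOT entry at 0x3000) and hypothetical function names; only the relocation offset 3 and the addend -4 come from the readelf output above.

```c
#include <assert.h>
#include <stdint.h>

/* Value stored by the link editor in a 32-bit PC-relative field located at
 * `field_addr`: target + addend - field_addr. */
int32_t pc32_field_value(uint64_t target, int64_t addend, uint64_t field_addr)
{
    int64_t value = (int64_t)target - (int64_t)field_addr + addend;

    assert(value >= INT32_MIN && value <= INT32_MAX);
    return (int32_t)value;
}

/* Address computed by the processor for a RIP-relative operand: address of
 * the next instruction plus the signed 32-bit displacement. */
uint64_t rip_relative_target(uint64_t next_insn_addr, int32_t disp)
{
    return next_insn_addr + (uint64_t)(int64_t)disp;
}
```

With the hypothetical addresses, the field is at 0x1003 and the next instruction at 0x1007: pc32_field_value(0x3000, -4, 0x1003) is 0x1ff9 and rip_relative_target(0x1007, 0x1ff9) gives back 0x3000, the address of the GOT entry.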

Shared object

At runtime, the GOT entry needs to be filled by the dynamic linker. In order to do this, the link editor creates a relocation for the GOT entry in the shared-object:

Relocation section '.rela.dyn' at offset 0x458 contains 9 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200990  000800000006 R_X86_64_GLOB_DAT 00000000002009ec foo + 0

This entry sets the third entry in the .got GOT[28]:

[19] .got              PROGBITS         0000000000200980  00000980
     0000000000000030  0000000000000008  WA       0     0     8
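The position of the slot can be derived from the relocation offset and from the address and entry size of the .got section. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: zero-based index of the GOT slot located at address
 * `r_offset`, or -1 if this address is not the start of a slot of the GOT
 * described by `got_addr`, `got_size` and `entsize`. */
int64_t got_slot_index(uint64_t got_addr, uint64_t got_size,
                       uint64_t entsize, uint64_t r_offset)
{
    assert(entsize != 0);

    if (r_offset < got_addr || r_offset - got_addr >= got_size)
        return -1;
    if ((r_offset - got_addr) % entsize != 0)
        return -1;
    return (int64_t)((r_offset - got_addr) / entsize);
}
```

Here the .got section is at 0x200980 with 8-byte entries and the relocation offset is 0x200990, so the index is (0x200990 - 0x200980) / 8 = 2, i.e. the third entry.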

PLT

The Procedure Linkage Table is used for calling functions whose address is not known at link time (because they might be in another shared object or in the executable). The PLT can be disassembled with objdump -D -j .plt.

For example this code:

#include <stdlib.h>

int main(int argc, char** argv)
{
  abort();
  return 0;
}

is compiled into (gcc test.c -S -o- -fPIC -O3):

main:
        subq    $8, %rsp
        call    abort

When disassembling the resulting executable, we find that the call to abort has been replaced by a call to a stub, abort@plt (called a trampoline):

0000000000400410 <main>:
  400410:       48 83 ec 08             sub    $0x8,%rsp
  400414:       e8 c7 ff ff ff          callq  4003e0 <abort@plt>

This trampoline fetches the address of abort in the GOT and jumps to this address:

00000000004003e0 <abort@plt>:
  4003e0:       ff 25 ea 04 20 00       jmpq   *0x2004ea(%rip)  # 6008d0 <_GLOBAL_OFFSET_TABLE_+0x18>
  4003e6:       68 00 00 00 00          pushq  $0x0
  4003eb:       e9 e0 ff ff ff          jmpq   4003d0 <_init+0x28>

All of this is done by the first instruction of this PLT trampoline: the two remaining instructions are used for lazy relocation which is explained afterwards.

A relocation exists in order to store the address of abort in this PLT GOT entry:

Relocation section '.rela.plt' at offset 0x360 contains 3 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000006008d0  000100000007 R_X86_64_JUMP_SLO 0000000000000000 abort +

Lazy relocations

Relocation in dynamic linking can slow down the initialisation of the application: each symbol must be looked up in all loaded DSOs and the executable. In order to speed up the relocation of programs, lazy relocation is used for function calls[29]: the corresponding PLT GOT entry is not filled with the address of the function during process initialisation but only when the function is actually called.

# Special .PLT0 entry:
00000000004003d0 <abort@plt-0x10>:
  4003d0:  ff 35 ea 04 20 00   pushq  0x2004ea(%rip)  # 6008c0 <_GLOBAL_OFFSET_TABLE_+0x8>
  4003d6:  ff 25 ec 04 20 00   jmpq   *0x2004ec(%rip) # 6008c8 <_GLOBAL_OFFSET_TABLE_+0x10>
  4003dc:  0f 1f 40 00

# .PLT1 for abort:
00000000004003e0 <abort@plt>:
  4003e0:  ff 25 ea 04 20 00   jmpq   *0x2004ea(%rip) # 6008d0 <_GLOBAL_OFFSET_TABLE_+0x18>
  4003e6:  68 00 00 00 00      pushq  $0x0
  4003eb:  e9 e0 ff ff ff      jmpq   4003d0 <_init+0x28>
  1. The dynamic linker preinitialises the PLT GOT,
    • the first entry of the PLT GOT is filled by the dynamic linker with the address of _DYNAMIC;
    • the second entry of the PLT GOT is filled by the dynamic linker with a value used by the dynamic linker to recognise this ELF executable or DSO;
    • the third entry of the PLT GOT is filled with the address of a callback of the dynamic linker;
    • the PLT GOT entry for abort@plt is initially filled with the address of its second instruction (0x4003e6);
  2. on the first call of the PLT trampoline abort@plt, a. the first instruction of the trampoline jumps to the second instruction of the trampoline; b. the second instruction of the PLT pushes on the stack the index of this relocation in the relocation table (from DT_JMPREL); c. the third instruction jumps to the first entry of the PLT (.PLT0); d. this entry pushes the second entry of the PLT GOT on the stack (this is used by the dynamic linker to identify this shared-object); e. this entry jumps to the callback of the dynamic linker; f. the dynamic linker does the real relocations,
    • it uses the arguments passed on the stack (identifier of this shared-object or executable and index in the relocation table),
    • it resolves the symbol;
    • it updates the PLT GOT entry with the address of the symbol;
    • it jumps to the address of the symbol in order to execute the function; g. the function is executed;
  3. on other calls, the PLT GOT entry now contains the address of the function and the PLT entry jumps to it directly (instead of jumping to .PLT0 and to the dynamic linker).
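
The sequence above can be mimicked in plain C, with a function pointer playing the role of the PLT GOT entry. This is only a toy model: the names got_slot, resolver_stub, real_target and call_via_plt are hypothetical, and the real mechanism operates on machine code and on the stack rather than on C calls.

```c
/* Hypothetical stand-in for the function being called (e.g. abort). */
int target_calls = 0;
static int real_target(int x) { target_calls++; return x + 1; }

int resolver_calls = 0;
static int resolver_stub(int x);

/* Toy "PLT GOT entry": it initially points back into the lazy path,
   like the real entry which initially holds the address of the pushq. */
static int (*got_slot)(int) = resolver_stub;

/* Toy "dynamic linker callback": resolve, patch the slot, then jump. */
static int resolver_stub(int x)
{
    resolver_calls++;
    got_slot = real_target;  /* update the PLT GOT entry */
    return got_slot(x);      /* execute the function */
}

/* Toy "PLT trampoline": always an indirect jump through the slot. */
int call_via_plt(int x) { return got_slot(x); }
```

Calling call_via_plt() twice goes through resolver_stub only the first time; the second call reaches real_target directly through the patched slot.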

In the section header table:

In the dynamic section:

PLT example for x86_64

Compilation

This time let's compile a function call:

extern int foo(void);

int get_foo()
{
  return foo() + 42;
}

We get this assembly (cc -O3 -S -fpic):

get_foo:
.LFB0:
        subq    $8, %rsp
        call    foo@PLT
        addq    $8, %rsp
        addl    $42, %eax
        ret

The foo@PLT operand asks the assembler to use the address of a PLT entry for the foo function.

Relocatable object

We get this relocation in the relocatable object:

Relocation section '.rela.text' at offset 0x260 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000005  000b00000004 R_X86_64_PLT32    0000000000000000 foo - 4

It asks the link editor to patch the instruction with the 32-bit relative address of the PLT entry for symbol foo. The link editor creates a PLT entry, a corresponding PLT GOT entry (in the .got.plt section) and a relocation entry for this PLT GOT entry (in .rela.plt).
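
As a worked example of what "patching with the relative address" means, here is a minimal sketch of the R_X86_64_PLT32 computation, which the x86-64 psABI gives as L + A - P (address of the PLT entry, plus addend, minus address of the patched field). The function names are hypothetical, and so are the addresses in the usage note: the real values depend on the final layout chosen by the link editor.

```c
#include <stdint.h>

/* Value stored in the 32-bit field: L + A - P. */
int32_t plt32_value(uint64_t plt_entry, int64_t addend, uint64_t place)
{
    return (int32_t)(plt_entry + (uint64_t)addend - place);
}

/* Target of a `call rel32` whose displacement field is at `place`:
   the CPU adds the displacement to the address of the next instruction,
   which starts 4 bytes after the beginning of the field. */
uint64_t call_target(uint64_t place, int32_t disp)
{
    return place + 4 + (uint64_t)(int64_t)disp;
}
```

Assuming .text ends up at 0x700 and the PLT entry for foo at 0x5c0 (both made up), the field at offset 5 sits at 0x705 and plt32_value(0x5c0, -4, 0x705) gives -329; call_target(0x705, -329) then gives back 0x5c0. The addend of -4 compensates for the displacement being relative to the end of the instruction rather than to the field itself.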

Shared object

We get this relocation in the shared-object:

Relocation section '.rela.plt' at offset 0x4f0 contains 3 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200960  000400000007 R_X86_64_JUMP_SLO 0000000000000000 foo + 0

This relocation entry asks the dynamic linker to lazily initialise the PLT GOT entry:

  1. it will first fill the PLT GOT entry with the address of the second instruction of the associated PLT entry;
  2. when the PLT is called, it will call the dynamic linker which will initialise the PLT GOT entry with the address of the foo symbol.
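
The Info column of the readelf outputs above packs the symbol index and the relocation type into a single 64-bit r_info field. A small sketch of the decoding, using the ELF64_R_SYM and ELF64_R_TYPE macros from elf.h (the two wrapper function names are hypothetical):

```c
#include <elf.h>
#include <stdint.h>

/* Upper 32 bits of r_info: index of the symbol in the symbol table. */
uint32_t reloc_symbol_index(uint64_t r_info)
{
    return (uint32_t)ELF64_R_SYM(r_info);
}

/* Lower 32 bits of r_info: relocation type (R_X86_64_* on x86_64). */
uint32_t reloc_type(uint64_t r_info)
{
    return (uint32_t)ELF64_R_TYPE(r_info);
}
```

With the values shown above, 0x000b00000004 decodes to symbol 11 with type R_X86_64_PLT32, and 0x000400000007 decodes to symbol 4 with type R_X86_64_JUMP_SLOT.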

Some x86_64 relocations

Link time relocation:

Runtime relocations:

Some x86 relocations

Hash tables

Standard hash table

The standard hash table is built by the link editor. It is described by the .hash SHT_HASH section and by the DT_HASH entry in the dynamic section. Its structure is quite simple:

// Pseudo-C:
struct {
  Elf32_Word nbucket;          /* Number of buckets */
  Elf32_Word nchain;           /* Number of entries in .dynsym */
  Elf32_Word buckets[nbucket]; /* First entry in the chain */
  Elf32_Word chains[nchain];   /* Next entry in the chain */
};

The lookup looks like this:

#include <elf.h>
#include <stddef.h>
#include <string.h>

/* System V ELF hash function (defined by the gABI, not shown here). */
unsigned long elf_hash(const char* name);

Elf64_Sym const* lookup_symbol(
  const char* symbol,
  Elf64_Sym const* symbol_table,
  const char* string_table,
  Elf32_Word const* hash_table)
{
  Elf32_Word nbucket           = hash_table[0];
  Elf32_Word nchain            = hash_table[1];
  Elf32_Word const* buckets    = hash_table + 2;
  Elf32_Word const* chains     = hash_table + 2 + nbucket;

  if (nbucket == 0)
    return NULL;

  unsigned long hash = elf_hash(symbol);

  // Iterate on the chain (STN_UNDEF marks the end of the chain):
  for (Elf32_Word index = buckets[hash % nbucket];
       index != STN_UNDEF && index < nchain;
       index = chains[index])
    if (strcmp(symbol, string_table + symbol_table[index].st_name) == 0)
      return symbol_table + index;

  return NULL;
}
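
The lookup relies on an elf_hash function which is not shown. A sketch of the System V ELF hash function follows; it takes a const char* to match the call above and keeps the state in 32 bits, which is an implementation choice made here to avoid depending on the width of unsigned long.

```c
#include <stdint.h>

/* System V ELF hash: shift in 4 bits per character and fold the
   top 4 bits back into the low part of the hash. */
unsigned long elf_hash(const char *name)
{
    uint32_t h = 0;
    for (const unsigned char *p = (const unsigned char *)name; *p; p++) {
        h = (h << 4) + *p;
        uint32_t g = h & 0xf0000000u;
        if (g != 0)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}
```

For instance, elf_hash("printf") is 0x077905a6; the bucket is then this value modulo nbucket.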

GNU hash table

The GNU hash table is a more efficient alternative to the standard hash table[30]. Both can be present in the same ELF file but modern GNU ELF files usually only contain the GNU hash table. It is described by the .gnu.hash SHT_GNU_HASH section and by the DT_GNU_HASH entry in the dynamic section.

Main differences:

// Pseudo-C:
struct Gnu_Hash_Header {
  uint32_t nbuckets;
  uint32_t symndx;    /* Index of the first accessible symbol in .dynsym */
  uint32_t maskwords; /* Number of elements in the Bloom Filter */
  uint32_t shift2;    /* Shift count for the Bloom Filter */
  uintXX_t bloom_filter[maskwords];
  uint32_t buckets[nbuckets];
  uint32_t values[dynsymcount - symndx];
};
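
To make the role of maskwords and shift2 more concrete, here is a sketch of the GNU hash function and of the Bloom filter test, following the description in the references of the footnote. It assumes a 64-bit ELF file (64-bit filter words); the function names are hypothetical.

```c
#include <stdint.h>

/* GNU hash function: h = h * 33 + c, starting from 5381. */
uint32_t gnu_hash(const char *name)
{
    uint32_t h = 5381;
    for (const unsigned char *p = (const unsigned char *)name; *p; p++)
        h = h * 33 + *p;
    return h;
}

/* Bloom filter test: returns 0 if the symbol is definitely not in this
   ELF file, 1 if it may be present (the buckets must then be searched). */
int gnu_bloom_may_contain(const uint64_t *bloom_filter, uint32_t maskwords,
                          uint32_t shift2, uint32_t hash)
{
    const uint32_t bits = 64; /* width of a filter word for ELFCLASS64 */
    uint64_t word = bloom_filter[(hash / bits) % maskwords];
    uint64_t mask = ((uint64_t)1 << (hash % bits))
                  | ((uint64_t)1 << ((hash >> shift2) % bits));
    return (word & mask) == mask;
}
```

Two bits are tested per symbol, both taken from the same filter word: this lets the dynamic linker reject most absent symbols without touching the buckets or the symbol table.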

Notes

Each entry of a note section begins with:

typedef struct {
  Elf64_Word n_namesz;  /* Length of the note's name.  */
  Elf64_Word n_descsz;  /* Length of the note's descriptor.  */
  Elf64_Word n_type;    /* Type of the note.  */
} Elf64_Nhdr;

After this comes the note name and the note content:

Padding is used after the name and after the content of the note to ensure 4-byte alignment.

Each note is usually in its own section (.note.XXX) but they are usually grouped in the same program header entry (PT_NOTE). readelf -n can display the notes.
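
The layout described above (header, padded name, padded content) can be sketched as follows. The helper names are hypothetical and the code assumes the 4-byte alignment stated above.

```c
#include <elf.h>
#include <stddef.h>

/* Round up to the next multiple of 4. */
static size_t align4(size_t n) { return (n + 3) & ~(size_t)3; }

/* Total byte size of one note: header + padded name + padded content.
   Adding this to the address of a note gives the address of the next one. */
size_t note_size(const Elf64_Nhdr *nhdr)
{
    return sizeof(*nhdr) + align4(nhdr->n_namesz) + align4(nhdr->n_descsz);
}

/* The name starts right after the header. */
const char *note_name(const Elf64_Nhdr *nhdr)
{
    return (const char *)(nhdr + 1);
}

/* The content starts after the padded name. */
const unsigned char *note_desc(const Elf64_Nhdr *nhdr)
{
    return (const unsigned char *)(nhdr + 1) + align4(nhdr->n_namesz);
}
```

For a note named "GNU" (n_namesz is 4, including the terminating 0 byte) with a 20-byte content, note_size() is 12 + 4 + 20 = 36 bytes.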

GNU notes

GNU notes use the name "GNU" (with a terminating 0 byte) and define the following notes:

CORE notes

See Anatomy of an ELF core file.

LINUX notes

See Anatomy of an ELF core file.

Dynamic section

The dynamic section provides important information for the dynamic linker. A statically linked executable does not have a PT_DYNAMIC entry.

It is an array of entries with the structure:

typedef struct {
  Elf64_Sxword  d_tag;   /* Dynamic entry type */
  union {
    Elf64_Xword d_val; /* Integer value */
    Elf64_Addr d_ptr;  /* Address value */
  } d_un;
} Elf64_Dyn;

readelf -d can display the content of the dynamic section.

The dynamic table is available at runtime through the _DYNAMIC local symbol. A DT_NULL entry marks the end of the dynamic section.
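
Since the table is terminated by a DT_NULL entry, looking up a tag is a simple linear scan. A minimal sketch for the 64-bit variant (the function name find_dynamic_entry is hypothetical):

```c
#include <elf.h>
#include <stddef.h>

/* Return the first entry with the given tag, or NULL if the
   DT_NULL terminator is reached first. */
const Elf64_Dyn *find_dynamic_entry(const Elf64_Dyn *dynamic, Elf64_Sxword tag)
{
    for (const Elf64_Dyn *d = dynamic; d->d_tag != DT_NULL; d++)
        if (d->d_tag == tag)
            return d;
    return NULL;
}
```

In a dynamically linked program, the table passed as the first argument would typically be _DYNAMIC itself; some tags (such as DT_NEEDED) can appear several times, in which case this sketch only returns the first occurrence.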

Shared objects

RPATH

The DT_RUNPATH (and DT_RPATH [32]) entries define additional paths where the shared-objects should be searched.

The dynamic linker (ld.so) recognises several special values in DT_RUNPATH (and DT_RPATH):

The DT_RPATH can be set with ld -rpath='$ORIGIN' (or gcc -Wl,-rpath='$ORIGIN'). ld --enable-new-dtags might be needed to add the DT_RUNPATH entries as well.

Symbols

The type of hash table generated by the link editor can be chosen with ld --hash-style=sysv|gnu|both. By default, the GNU hash table is used on (not-too-old) GNU systems.

Relocations

At runtime there are usually two different relocation tables: the main relocation table and the PLT relocation table.

The main relocation table (.rela.dyn section) is located with DT_RELA (address), DT_RELASZ (byte size of the relocation table), DT_RELAENT (byte size of a relocation entry) for relocation tables with addend. The main relocation table without addend uses DT_REL, DT_RELSZ and DT_RELENT.

Another relocation table (.rela.plt section) is used for the PLT. It is located with: DT_JMPREL (address) and DT_PLTRELSZ (byte size of the relocation table). The DT_PLTREL gives the type of relocation table (either DT_RELA or DT_REL) used for the PLT.

The DT_PLTGOT is the address of the PLT GOT (.got.plt). The dynamic linker needs to know it because the first entries of the PLT GOT are used by the dynamic linker.
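
As an illustration of how these entries fit together, the number of PLT relocations can be derived from DT_PLTRELSZ and DT_PLTREL. This is a sketch for the 64-bit variant; the function name is hypothetical and defaulting to DT_RELA when DT_PLTREL is missing is an assumption of this example.

```c
#include <elf.h>
#include <stddef.h>

/* Number of entries in the PLT relocation table described by
   the dynamic section. */
size_t plt_relocation_count(const Elf64_Dyn *dynamic)
{
    Elf64_Xword size = 0;        /* DT_PLTRELSZ: byte size of the table */
    Elf64_Xword kind = DT_RELA;  /* DT_PLTREL: DT_RELA or DT_REL */

    for (const Elf64_Dyn *d = dynamic; d->d_tag != DT_NULL; d++) {
        if (d->d_tag == DT_PLTRELSZ)
            size = d->d_un.d_val;
        else if (d->d_tag == DT_PLTREL)
            kind = d->d_un.d_val;
    }

    /* Entries with addend (Elf64_Rela) are larger than entries
       without addend (Elf64_Rel). */
    size_t entry_size = (kind == DT_REL) ? sizeof(Elf64_Rel) : sizeof(Elf64_Rela);
    return size / entry_size;
}
```

A .rela.plt table with 3 entries, like the ones shown earlier, has a DT_PLTRELSZ of 3 * 24 = 72 bytes.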

Symbol lookup

Each relocation implies a symbol lookup.

In ELF, symbol resolution uses a mostly[34] flat namespace[35]: a used symbol is not bound to a specific DSO and it is searched in the executable and in all DSOs with a breadth-first search[36] (using the order of the DT_NEEDED entries).

This search is in O(#modules). For each executable or shared-object, a hash table (DT_HASH, DT_GNU_HASH or both) is included in the file (and available at runtime) in order to speed up the symbol lookup.

Flags

DT_FLAGS is a field of flags:

Initialisation and termination functions

Initialisation functions are called in this order:

  1. DT_PREINIT_ARRAY, array (of byte size DT_PREINIT_ARRAYSZ) of preinitialisation function addresses;
  2. DT_INIT, address of an initialisation function (the .init section);
  3. DT_INIT_ARRAY, array (of byte size DT_INIT_ARRAYSZ) of initialisation function addresses.

Termination functions are called in this order:

  1. DT_FINI_ARRAY, array (of byte size DT_FINI_ARRAYSZ) of termination function addresses;
  2. DT_FINI, address of a termination function (the .fini section).
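
From C, these arrays are usually populated indirectly. As a sketch, assuming GCC or Clang on a current GNU/Linux toolchain, functions marked with the constructor and destructor attributes are typically registered in the initialisation and termination arrays respectively (older toolchains used other sections). The names below are hypothetical.

```c
/* Set by the initialisation function, before main() starts. */
int ctor_seen = 0;
/* Set by the termination function, after main() returns or exit(). */
int dtor_seen = 0;

/* Typically ends up as an entry of the DT_INIT_ARRAY array. */
__attribute__((constructor))
static void my_init(void) { ctor_seen = 1; }

/* Typically ends up as an entry of the DT_FINI_ARRAY array. */
__attribute__((destructor))
static void my_fini(void) { dtor_seen = 1; }
```

readelf -d on the resulting executable or shared-object should then show the DT_INIT_ARRAY and DT_FINI_ARRAY entries.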

Debug interface

If a DT_DEBUG entry is present, its value is set by the dynamic linker to the address of a struct r_debug (see link.h):

struct r_debug
{
  int r_version;          /* Version number for this protocol. */
  struct link_map *r_map; /* Head of the chain of loaded objects.  */
  ElfW(Addr) r_brk;
  enum {
    RT_CONSISTENT,        /* Mapping change is complete.  */
    RT_ADD,               /* Beginning to add a new object.  */
    RT_DELETE             /* Beginning to remove an object mapping.  */
  } r_state;
  ElfW(Addr) r_ldbase;    /* Base address the linker is loaded at.  */
};

This can be used to traverse the list of executables and shared-objects (of a given namespace):

struct link_map {
  /* These first few members are part of the protocol with the debugger.
     This is the same format used in SVR4.  */
  ElfW(Addr) l_addr;          /* Difference between the address in the ELF
                                 file and the addresses in memory.  */
  char *l_name;               /* Absolute file name object was found in.  */
  ElfW(Dyn) *l_ld;            /* Dynamic section of the shared object.  */
  struct link_map *l_next, *l_prev; /* Chain of loaded objects.  */
};

The struct link_map can be obtained at runtime with dlinfo(handle, RTLD_DI_LINKMAP, &res).
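
Whatever the way the struct link_map is obtained, walking the chain only uses the l_prev and l_next members. A minimal sketch (the function name count_loaded_objects is hypothetical):

```c
#include <link.h>
#include <stddef.h>

/* Count the objects of the chain containing `map`:
   rewind to the head with l_prev, then follow l_next. */
size_t count_loaded_objects(const struct link_map *map)
{
    if (map == NULL)
        return 0;
    while (map->l_prev != NULL)
        map = map->l_prev;

    size_t count = 0;
    for (; map != NULL; map = map->l_next)
        count++;
    return count;
}
```

The same loop can print l_name and l_addr for each object, which gives roughly the list of loaded executables and shared-objects of the namespace.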

String table

DT_STRTAB and DT_STRSZ give the location and byte size of the string table used by the dynamic section (.dynstr).

Symbol versions

Those entries are GNU extensions for symbol versioning:

Not covered (much) here

GNU symbol versioning

Main structures:

See the LSB.

TLS

The ELF file contains an initialisation image for the TLS data:

See ELF Handling For Thread Local Storage.

COMDAT

COMDAT refers to the ability of the static linker to remove redundant code and data when combining different .o files. This is used in C++ when instantiating templates. In order to do this, the compiler creates dedicated sections for each template instantiation.

For example, this C++ code:

#include <string>

std::string foo(std::string& x)
{
  return x + x;
}

Generates the following sections in the relocatable object:

$ readelf -WS test.o
There are 26 section headers, starting at offset 0xc058:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .group            GROUP           0000000000000000 000040 00000c 04     24  18  4
  [ 2] .text             PROGBITS        0000000000000000 00004c 00002d 00  AX  0   0  1
  [ 3] .rela.text        RELA            0000000000000000 008278 000018 18   I 24   2  8
  [ 4] .data             PROGBITS        0000000000000000 000079 000000 00  WA  0   0  1
  [ 5] .bss              NOBITS          0000000000000000 000079 000000 00  WA  0   0  1
  [ 6] .text._ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EERKS8_SA_ PROGBITS        0000000000000000 000079 000062 00 AXG  0   0  1
  [ 7] .rela.text._ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EERKS8_SA_ RELA            0000000000000000 008290 000060 18   I 24   6  8
  [ 8] .gcc_except_table._ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EERKS8_SA_ PROGBITS        0000000000000000 0000db 000010 00  AG  0   0  1
[...]

Section groups (sections with .group name and SHT_GROUP type) are used to group related sections: the first Elf32_Word of the group section is a set of flags (GRP_COMDAT is used for COMDAT section groups) and the remaining Elf32_Word entries of the section are the indices of the sections belonging to this group.
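
The content of a group section can be decoded with a few lines of C. This is a sketch: the struct group_view type and the parse_group function are hypothetical names, and the data is assumed to be already loaded in memory in native byte order.

```c
#include <elf.h>
#include <stddef.h>

struct group_view {
    int is_comdat;              /* GRP_COMDAT flag set? */
    const Elf32_Word *members;  /* Section indices of the members */
    size_t member_count;
};

/* Decode the content of a SHT_GROUP section of `byte_size` bytes. */
struct group_view parse_group(const Elf32_Word *data, size_t byte_size)
{
    struct group_view group = { 0, NULL, 0 };
    size_t words = byte_size / sizeof(Elf32_Word);
    if (words == 0)
        return group;
    group.is_comdat = (data[0] & GRP_COMDAT) != 0;
    group.members = data + 1;
    group.member_count = words - 1;
    return group;
}
```

In the listing above, the .group section is 0xc bytes long, which leaves room for the flags and two members: presumably sections 6 and 8, the two sections carrying the G flag. This is inferred from the listing and not taken from a dump of the section.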

ARM

References

Authoritative references

Blogs posts, articles, books and such


  1. Static libraries (.a files) are archives of .o files. Different formats exist for them. ↩︎

  2. Notable exceptions are the Apple systems (MacOS X, iOS, Darwin) which use their own Mach-O format (coming from their NeXTSTEP lineage) and Microsoft systems (Windows) which use the PE file (Portable Executable) format (which is based on the old Unix System V COFF format). ↩︎

  3. For example, it is used for ARM-based embedded software. ↩︎

  4. GNU objdump and objcopy both rely on BFD and are unable to see some sections (and can synthesise some others) because of the file-format abstraction of the BFD library. objdump from elfutils (called eu-objdump on some distributions) does not have this limitation (but only has a limited subset of the features of GNU objdump). ↩︎

  5. I wrote this tool because objcopy --dump-section was not completely satisfying. ↩︎ ↩︎

  6. With the GNU BFD linker, the layout of sections after linking is given by a linker script. The default linker script can be seen with ld -verbose. Another linker script can be used with ld -T some_linker_script. ↩︎

  7. The C structures (and the associated comments) are taken from the GNU elf.h file. Only the 64 bit variant is displayed here. ↩︎

  8. This is an extension to the ELF standard not documented in the specification. ↩︎

  9. They are using PIC code (Position Independent Code). They must be compiled with cc -fpic (or -fPIC). ↩︎

  10. In contrast to PE (Portable Executable) files, the (readonly) text segment (such as the code) is shared for all processes (and with the filesystem cache) even if the shared-object is loaded at different addresses. In order to achieve this, the code for shared-objects should be compiled as PIC (Position Independent Code).

    PE files are built with a preferred address and if they must be relocated, the code becomes private to the process. In other words, Windows DLLs (Dynamic-Link Libraries) do not use PIC. ↩︎

  11. Prelinked DSOs are located at a given (non-null) address in the ELF file. ↩︎

  12. They are compiled with cc -fpie (or -fPIE). ↩︎

  13. The .debug_frame DWARF section is used to tell the debugger how to unwind each stack frame.

    The .eh_frame section has been created in order to unwind the stack at runtime. This is used for exception handling.

    The .eh_frame section contains information for unwinding the frame for each instruction address. This is used by the Itanium C++ exception ABI to unwind the stack on exceptions. Its format is based on the .debug_frame DWARF section.

    If .eh_frame is present, the .debug_frame section can be omitted. ↩︎ ↩︎

  14. .gnu_debuglink is used to locate a separate file containing debug information. Another solution is to use a NT_GNU_BUILD_ID note. ↩︎

  15. .note.gnu.build-id describes the build-id used to locate a separate ELF file containing the debug information. This is the NT_GNU_BUILD_ID note. ↩︎

  16. .gnu.warning and .gnu.warning.XXX contain warning messages displayed by the linker when linking against this ELF file or this symbol respectively.

    Example:

    Hex dump of section '.gnu.warning.gets':
    0x00000000 74686520 60676574 73272066 756e6374 the `gets' funct
    0x00000010 696f6e20 69732064 616e6765 726f7573 ion is dangerous
    0x00000020 20616e64 2073686f 756c6420 6e6f7420  and should not
    0x00000030 62652075 7365642e 00                be used..
    
    ↩︎ ↩︎
  17. For relocation sections which apply to a single section, the sh_info field is the index of the target section. ↩︎

  18. As a result, the sections in the ELF files are grouped in three parts:

    1. the sections which belong to the text segment;

    2. the sections which belong to the data segment;

    3. the sections which do not belong to any segment (and are not available/used at runtime).

    ↩︎
  19. This means that there is usually no runtime relocation in the text segment: all the runtime relocations are done in the data segment.

    If the DF_TEXTREL flag (or a DT_TEXTREL dynamic table entry) is present, text relocations are present in this file. ↩︎

  20. This property is so important that the MPROTECT feature of the PaX (a Linux patch) prevents the existence of VMAs which are both executable and writable in most cases in order to enhance security. ↩︎

  21. The VMAs (Virtual Memory Areas) are the different available/mapped regions in the virtual address space. Each VMA has some properties such as:

    • permissions (rwx);

    • whether it is shared with other processes (MAP_SHARED) or private to this process (MAP_PRIVATE);

    • whether it has an associated file (and the offset of the VMA within the file);

    • etc.

    They are created with mmap() (or similar) or directly by the kernel. On Linux, they can be seen in /proc/$pid/maps or with the pmap tool. ↩︎ ↩︎

  22. However they can use other techniques such as GOT infection and ROP (Return Oriented Programming). ↩︎

  23. The PLT GOT is still vulnerable to GOT poisoning. ↩︎

  24. In C, symbols have the name of the corresponding C function or variable on ELF systems.

    In C++, function overloading, templates, namespaces and so on make it more difficult. The name of the object (including the types of its arguments for functions) is mangled to form the symbol. Different name mangling schemes exist, but modern versions of GCC and clang use the name mangling of the C++ Itanium ABI: For example with this ABI, the foo::Something::bar(int) method is mangled into _ZN3foo9Something3barEi. The c++filt program can be used to demangle C++ symbol names (or the __cxa_demangle function). ↩︎

  25. This is what appears in the .o file. In the shared-object or executable, it is converted to STB_LOCAL and STV_DEFAULT. ↩︎

  26. The usage of STV_PROTECTED symbols is not recommended because it slows down the dynamic linkage. ↩︎

  27. In fact, it creates two GOT sections: .got and .got.plt. ↩︎

  28. The address of the GOT entry is 0x200990 and the address of .got is 0x200980: the offset of the GOT entry within .got is 0x200990 - 0x200980 = 0x10 = 16. Each GOT entry is 8 bytes on x86_64 so this is the third entry. ↩︎

  29. The usage of the PLT can be disabled at compile-time (for a given compilation unit) with cc -fno-plt or for a given function with __attribute__((noplt)). This disables lazy binding. ↩︎

  30. See GNU Hash ELF Section by Ali Bahrami and How to write Shared Libraries by Ulrich Drepper. ↩︎

  31. Each shared-object dependency is described with a DT_NEEDED entry. A typical value is libfoo.so.6 (where 6 is a version number). This file is searched in different directories by the dynamic linker. The same shared object can be present in different incompatible versions.

    The link editor ld links against libfoo.so (using the -lfoo flag) which is a symbolic link to the current version of the shared object. Shared objects usually contain a DT_SONAME entry defining the full (shared-object) name (libfoo.so.6) of this shared-object. This value is copied as a DT_NEEDED entry in the dependent ELF objects.

    If no DT_SONAME is present, the link editor creates a DT_NEEDED entry with libfoo.so instead when given the -lfoo flag.

    If a full path to the shared object is given to ld and this shared object does not have DT_SONAME entry, the full path to the shared object will be used in the DT_NEEDED entry. ↩︎

  32. DT_RPATH serves the same purpose but is searched before the LD_LIBRARY_PATH environment variable, which is not considered a good solution. For this reason, DT_RUNPATH was created as a replacement: the values of DT_RUNPATH are searched after the LD_LIBRARY_PATH environment variable. DT_RPATH is deprecated and ignored when DT_RUNPATH is present (and recognised by the dynamic linker). ↩︎

  33. There is no size/number of entries for the symbol table at the program header table level. This is not needed at runtime as the symbol lookup always goes through the hash table. ↩︎

  34. Solaris and GNU systems have the ability to handle different namespaces (see dlmopen()): different shared-objects can be placed in different namespaces. Usually only two namespaces are used: one for the dynamic linker and a second one for the application and the shared-object libraries. ↩︎

  35. This is in contrast with Windows PE (Portable Executable) files and MacOS X which both use a two-level namespace lookup: they import a given symbol from a given DLL (Dynamic-Link Library) or .dylib. ↩︎

  36. This is a simplification. Other things influence the order and the set of ELF modules used for a given lookup: DT_SYMBOLIC, dlopen(), dlmopen() etc.

    dlopen-ed shared-objects and their dependencies are not added to the global scope but only to a local scope (unless RTLD_GLOBAL is used).

    dlmopen() can be used to create separate symbol namespaces with their own sets of ELF shared-objects. ↩︎

  37. The DT_TEXTREL dynamic table entry can be used as well but its usage is deprecated/optional. ↩︎