Anatomy of an ELF core file
The Executable and Linkable Format (ELF) 🧝 is used for compilation outputs (.o files), executables, shared libraries and core dumps. The first three cases are documented in the System V ABI specification and the Tools Interface Standard (TIS) ELF specification, but there does not seem to be much documentation about the use of the ELF format for core dumps. Here are some notes on this.
Prerequisites: some knowledge about ELF files.
Let's create a core dump and look at it:
pid=$(pgrep xchat)
gcore $pid
readelf -a core.$pid
ELF header
Nothing special in the ELF header. e_type=ET_CORE marks the file as a core file:
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          57666560 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         344
  Size of section headers:           64 (bytes)
  Number of section headers:         346
  Section header string table index: 345
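The same check can be done without readelf. Here is a minimal sketch, assuming a little-endian ELF64 file; the function name `parse_elf64_header` and the dictionary keys are mine (the keys follow the field names of `Elf64_Ehdr`):

```python
import struct

ET_CORE = 4  # e_type value used for core files

FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
          "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
          "e_shentsize", "e_shnum", "e_shstrndx")

def parse_elf64_header(data: bytes) -> dict:
    """Decode the fields following e_ident in a little-endian ELF64 header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64, little endian")
    # e_ident takes the first 16 bytes; the remaining fields take 48 bytes.
    values = struct.unpack_from("<HHIQQQIHHHHHH", data, 16)
    return dict(zip(FIELDS, values))
```

With a real file, one would pass the first 64 bytes of `core.$pid` and test `header["e_type"] == ET_CORE`.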
Program headers
Program Headers:
  Type Offset             VirtAddr           PhysAddr           FileSiz            MemSiz             Flags Align
  NOTE 0x0000000000004b80 0x0000000000000000 0x0000000000000000 0x0000000000009064 0x0000000000000000 R     1
  LOAD 0x000000000000dbe4 0x0000000000400000 0x0000000000000000 0x0000000000000000 0x000000000009d000 R E   1
  LOAD 0x000000000000dbe4 0x000000000069c000 0x0000000000000000 0x0000000000004000 0x0000000000004000 RW    1
  LOAD 0x0000000000011be4 0x00000000006a0000 0x0000000000000000 0x0000000000004000 0x0000000000004000 RW    1
  LOAD 0x0000000000015be4 0x0000000001872000 0x0000000000000000 0x0000000000ed4000 0x0000000000ed4000 RW    1
  LOAD 0x0000000000ee9be4 0x00007f248c000000 0x0000000000000000 0x0000000000021000 0x0000000000021000 RW    1
  LOAD 0x0000000000f0abe4 0x00007f2490885000 0x0000000000000000 0x000000000001c000 0x000000000001c000 R     1
  LOAD 0x0000000000f26be4 0x00007f24908a1000 0x0000000000000000 0x000000000001c000 0x000000000001c000 R     1
  LOAD 0x0000000000f42be4 0x00007f24908bd000 0x0000000000000000 0x00000000005f3000 0x00000000005f3000 R     1
  LOAD 0x0000000001535be4 0x00007f2490eb0000 0x0000000000000000 0x0000000000000000 0x0000000000002000 R E   1
  LOAD 0x0000000001535be4 0x00007f24910b1000 0x0000000000000000 0x0000000000001000 0x0000000000001000 R     1
  LOAD 0x0000000001536be4 0x00007f24910b2000 0x0000000000000000 0x0000000000001000 0x0000000000001000 RW    1
  LOAD 0x0000000001537be4 0x00007f24910b3000 0x0000000000000000 0x0000000000060000 0x0000000000060000 RW    1
  LOAD 0x0000000001597be4 0x00007f2491114000 0x0000000000000000 0x0000000000800000 0x0000000000800000 RW    1
  LOAD 0x0000000001d97be4 0x00007f2491914000 0x0000000000000000 0x0000000000000000 0x00000000001a8000 R E   1
  LOAD 0x0000000001d97be4 0x00007f2491cbc000 0x0000000000000000 0x000000000000e000 0x000000000000e000 R     1
  LOAD 0x0000000001da5be4 0x00007f2491cca000 0x0000000000000000 0x0000000000003000 0x0000000000003000 RW    1
  LOAD 0x0000000001da8be4 0x00007f2491ccd000 0x0000000000000000 0x0000000000001000 0x0000000000001000 RW    1
  LOAD 0x0000000001da9be4 0x00007f2491cd1000 0x0000000000000000 0x0000000000008000 0x0000000000008000 R     1
  LOAD 0x0000000001db1be4 0x00007f2491cd9000 0x0000000000000000 0x000000000001c000 0x000000000001c000 R     1
  [...]
The PT_LOAD entries in the program header table describe the Virtual Memory Areas (VMAs) of the process:
- VirtAddr is the virtual address of the beginning of the VMA.
- MemSiz is the size of the VMA in the virtual address space.
- Flags are the permissions of this VMA (read, write, execute).
- Offset is the offset of the corresponding data in the core dump file. This is not the offset in the original mapped file.
- FileSiz is the size of the corresponding data in this core file. Read-only file-mapped VMAs which have the same content as their originating file are not duplicated in the core file. Their FileSiz is 0 and we are expected to look at the original file in order to get the content.
- The name of the mapped file and the offset in this file are not described here but in the PT_NOTE segment (its content is described later).
As these are VMAs, they are aligned on page boundaries.
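To make the field meanings above concrete, here is a small sketch that decodes `Elf64_Phdr` entries and summarises each PT_LOAD as a VMA. It assumes a little-endian ELF64 core; `parse_phdrs` and `describe_load` are illustrative names, and "FileSiz is zero" is used as a rough test for "content not stored in the core file":

```python
import struct
from typing import NamedTuple

PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4
PHDR64 = struct.Struct("<IIQQQQQQ")  # Elf64_Phdr, 56 bytes

class Phdr(NamedTuple):
    p_type: int
    p_flags: int
    p_offset: int
    p_vaddr: int
    p_paddr: int
    p_filesz: int
    p_memsz: int
    p_align: int

def parse_phdrs(data: bytes, e_phoff: int, e_phnum: int) -> list:
    """Decode e_phnum program headers starting at file offset e_phoff."""
    return [Phdr(*PHDR64.unpack_from(data, e_phoff + i * PHDR64.size))
            for i in range(e_phnum)]

def describe_load(ph: Phdr) -> tuple:
    """Return (start, end, perms, content_in_core) for a PT_LOAD entry."""
    perms = "".join(c if ph.p_flags & bit else "-"
                    for c, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X)))
    return ph.p_vaddr, ph.p_vaddr + ph.p_memsz, perms, ph.p_filesz > 0
```

For the first PT_LOAD entry of the dump above, this yields the range 0x400000-0x49d000 with `r-x` permissions and no content in the core file.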
We can compare that with cat /proc/$pid/maps and we find the same information:
00400000-0049d000 r-xp 00000000 08:11 789936 /usr/bin/xchat
0069c000-006a0000 rw-p 0009c000 08:11 789936 /usr/bin/xchat
006a0000-006a4000 rw-p 00000000 00:00 0
01872000-02746000 rw-p 00000000 00:00 0 [heap]
7f248c000000-7f248c021000 rw-p 00000000 00:00 0
7f248c021000-7f2490000000 ---p 00000000 00:00 0
7f2490885000-7f24908a1000 r--p 00000000 08:11 1442232 /usr/share/icons/gnome/icon-theme.cache
7f24908a1000-7f24908bd000 r--p 00000000 08:11 1442232 /usr/share/icons/gnome/icon-theme.cache
7f24908bd000-7f2490eb0000 r--p 00000000 08:11 1313585 /usr/share/fonts/opentype/ipafont-gothic/ipag.ttf
7f2490eb0000-7f2490eb2000 r-xp 00000000 08:11 1195904 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
7f2490eb2000-7f24910b1000 ---p 00002000 08:11 1195904 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
7f24910b1000-7f24910b2000 r--p 00001000 08:11 1195904 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
7f24910b2000-7f24910b3000 rw-p 00002000 08:11 1195904 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
7f24910b3000-7f2491113000 rw-s 00000000 00:04 1409039 /SYSV00000000 (deleted)
7f2491113000-7f2491114000 ---p 00000000 00:00 0
7f2491114000-7f2491914000 rw-p 00000000 00:00 0 [stack:1957]
[...]
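To automate this comparison, a maps line can be split into its fields and matched against VirtAddr and VirtAddr + MemSiz of a PT_LOAD entry. A minimal sketch (`parse_maps_line` is a made-up helper name):

```python
def parse_maps_line(line: str) -> tuple:
    """Split one /proc/<pid>/maps line into (start, end, perms, offset, path)."""
    fields = line.split(None, 5)  # the path may contain spaces, keep it whole
    start, end = (int(part, 16) for part in fields[0].split("-"))
    perms = fields[1]
    offset = int(fields[2], 16)
    path = fields[5].strip() if len(fields) > 5 else ""
    return start, end, perms, offset, path

start, end, perms, offset, path = parse_maps_line(
    "00400000-0049d000 r-xp 00000000 08:11 789936 /usr/bin/xchat")
# First PT_LOAD of the core: VirtAddr 0x400000, MemSiz 0x9d000.
print(start == 0x400000 and end == 0x400000 + 0x9d000)  # True
```

In the excerpts shown here, the `---p` VMAs have no matching PT_LOAD entry, so a comparison script should probably not expect a one-to-one correspondence.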
The first three PT_LOAD entries of the core dump map to the VMAs of the xchat ELF file:
- 00400000-0049d000, the VMA corresponding to the read-only executable segment;
- 0069c000-006a0000, the VMA corresponding to the initialized part of the read-write segment;
- 006a0000-006a4000, the part of the read-write segment which is not in the xchat ELF file (zero-initialized, .bss).
We can compare this to the program headers of the xchat program:
Program Headers:
  Type         Offset             VirtAddr           PhysAddr           FileSiz            MemSiz             Flags Align
  PHDR         0x0000000000000040 0x0000000000400040 0x0000000000400040 0x00000000000001c0 0x00000000000001c0 R E   8
  INTERP       0x0000000000000200 0x0000000000400200 0x0000000000400200 0x000000000000001c 0x000000000000001c R     1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD         0x0000000000000000 0x0000000000400000 0x0000000000400000 0x000000000009c4b4 0x000000000009c4b4 R E   200000
  LOAD         0x000000000009c4b8 0x000000000069c4b8 0x000000000069c4b8 0x0000000000002bc9 0x0000000000007920 RW    200000
  DYNAMIC      0x000000000009c4d0 0x000000000069c4d0 0x000000000069c4d0 0x0000000000000360 0x0000000000000360 RW    8
  NOTE         0x000000000000021c 0x000000000040021c 0x000000000040021c 0x0000000000000044 0x0000000000000044 R     4
  GNU_EH_FRAME 0x0000000000086518 0x0000000000486518 0x0000000000486518 0x0000000000002e64 0x0000000000002e64 R     4
  GNU_STACK    0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RW    10

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07
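The link between the two LOAD segments of xchat and the three VMAs is page rounding. The sketch below is a simplification of what the loader does, assuming 4 KiB pages; `vmas_for_load` is a hypothetical helper, not a real API:

```python
PAGE = 0x1000  # assumed page size (4 KiB)

def page_down(addr: int) -> int:
    return addr & ~(PAGE - 1)

def page_up(addr: int) -> int:
    return (addr + PAGE - 1) & ~(PAGE - 1)

def vmas_for_load(vaddr: int, filesz: int, memsz: int) -> list:
    """Approximate the VMAs created for one PT_LOAD segment of an executable."""
    start = page_down(vaddr)
    file_end = page_up(vaddr + filesz)  # end of the file-backed part
    mem_end = page_up(vaddr + memsz)    # end including zero-filled memory
    vmas = [(start, file_end)]
    if mem_end > file_end:
        vmas.append((file_end, mem_end))  # anonymous pages (.bss)
    return vmas

for lo, hi in vmas_for_load(0x69c4b8, 0x2bc9, 0x7920):
    print(f"{lo:08x}-{hi:08x}")
# 0069c000-006a0000
# 006a0000-006a4000
```

The same rounding applied to the file offset 0x9c4b8 gives 0x9c000, which is the offset shown for the second VMA in /proc/$pid/maps.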
Sections
ELF core dumps are not expected to have section headers. The Linux kernel does not generate section headers when it generates core dumps. GDB generates section headers with the same information as the program headers:
- the SHT_NOBITS sections are not present in the core file but reference parts of other existing files;
- the SHT_PROGBITS sections are present in the core file;
- the SHT_NOTE section header maps to the PT_NOTE program header.
Section Headers:
  [Nr]  Name      Type     Address          Offset   Size             EntSize          Flags Link Info Align
  [ 0]            NULL     0000000000000000 00000000 0000000000000000 0000000000000000       0    0    0
  [ 1]  note0     NOTE     0000000000000000 00004b80 0000000000009064 0000000000000000 A     0    0    1
  [ 2]  load      NOBITS   0000000000400000 0000dbe4 000000000009d000 0000000000000000 AX    0    0    1
  [ 3]  load      PROGBITS 000000000069c000 0000dbe4 0000000000004000 0000000000000000 WA    0    0    1
  [ 4]  load      PROGBITS 00000000006a0000 00011be4 0000000000004000 0000000000000000 WA    0    0    1
  [ 5]  load      PROGBITS 0000000001872000 00015be4 0000000000ed4000 0000000000000000 WA    0    0    1
  [ 6]  load      PROGBITS 00007f248c000000 00ee9be4 0000000000021000 0000000000000000 WA    0    0    1
  [ 7]  load      PROGBITS 00007f2490885000 00f0abe4 000000000001c000 0000000000000000 A     0    0    1
  [ 8]  load      PROGBITS 00007f24908a1000 00f26be4 000000000001c000 0000000000000000 A     0    0    1
  [ 9]  load      PROGBITS 00007f24908bd000 00f42be4 00000000005f3000 0000000000000000 A     0    0    1
  [10]  load      NOBITS   00007f2490eb0000 01535be4 0000000000002000 0000000000000000 AX    0    0    1
  [11]  load      PROGBITS 00007f24910b1000 01535be4 0000000000001000 0000000000000000 A     0    0    1
  [12]  load      PROGBITS 00007f24910b2000 01536be4 0000000000001000 0000000000000000 WA    0    0    1
  [13]  load      PROGBITS 00007f24910b3000 01537be4 0000000000060000 0000000000000000 WA    0    0    1
  [...]
  [345] .shstrtab STRTAB   0000000000000000 036febe4 0000000000000016 0000000000000000       0    0    1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
Notes
The PT_NOTE program header references additional information such as the content of the CPU registers of the different threads, the files associated with each VMA, etc. It is made of a sequence of entries, each introduced by an ElfW(Nhdr) structure (i.e. either Elf32_Nhdr or Elf64_Nhdr), which carry:
- an originator name;
- an originator-specific ID (a 4-byte value);
- a binary content.
typedef struct elf32_note {
  Elf32_Word n_namesz; /* Name size */
  Elf32_Word n_descsz; /* Content size */
  Elf32_Word n_type;   /* Content type */
} Elf32_Nhdr;

typedef struct elf64_note {
  Elf64_Word n_namesz; /* Name size */
  Elf64_Word n_descsz; /* Content size */
  Elf64_Word n_type;   /* Content type */
} Elf64_Nhdr;
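Both variants use three 32-bit words, so one decoder can serve both classes. Below is a sketch of a note iterator. It assumes little-endian data and that the name and the content are each padded to a multiple of 4 bytes, which is what I believe Linux and GDB produce for core files; `iter_notes` is an illustrative name:

```python
import struct

def align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3

def iter_notes(segment: bytes):
    """Yield (name, n_type, desc) for each note in a PT_NOTE segment."""
    pos = 0
    while pos + 12 <= len(segment):
        n_namesz, n_descsz, n_type = struct.unpack_from("<III", segment, pos)
        pos += 12
        # n_namesz counts the terminating NUL of the originator name.
        name = segment[pos:pos + n_namesz].rstrip(b"\0").decode("ascii")
        pos += align4(n_namesz)
        desc = segment[pos:pos + n_descsz]
        pos += align4(n_descsz)
        yield name, n_type, desc
```

The bytes to feed it are the FileSiz bytes found at the Offset of the PT_NOTE program header.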
Here is the content of the notes:
Displaying notes found at file offset 0x00004b80 with length 0x00009064:
  Owner  Data size   Description
  CORE   0x00000088  NT_PRPSINFO (prpsinfo structure)
  CORE   0x00000150  NT_PRSTATUS (prstatus structure)
  CORE   0x00000200  NT_FPREGSET (floating point registers)
  LINUX  0x00000440  NT_X86_XSTATE (x86 XSAVE extended state)
  CORE   0x00000080  NT_SIGINFO (siginfo_t data)
  CORE   0x00000150  NT_PRSTATUS (prstatus structure)
  CORE   0x00000200  NT_FPREGSET (floating point registers)
  LINUX  0x00000440  NT_X86_XSTATE (x86 XSAVE extended state)
  CORE   0x00000080  NT_SIGINFO (siginfo_t data)
  CORE   0x00000150  NT_PRSTATUS (prstatus structure)
  CORE   0x00000200  NT_FPREGSET (floating point registers)
  LINUX  0x00000440  NT_X86_XSTATE (x86 XSAVE extended state)
  CORE   0x00000080  NT_SIGINFO (siginfo_t data)
  CORE   0x00000150  NT_PRSTATUS (prstatus structure)
  CORE   0x00000200  NT_FPREGSET (floating point registers)
  LINUX  0x00000440  NT_X86_XSTATE (x86 XSAVE extended state)
  CORE   0x00000080  NT_SIGINFO (siginfo_t data)
  CORE   0x00000130  NT_AUXV (auxiliary vector)
  CORE   0x00006cee  NT_FILE (mapped files)
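As a sanity check on the note layout, the sizes in this listing can be added up and compared with the length of the PT_NOTE segment. The sketch assumes a 12-byte header per note and 4-byte padding of both the name (including its NUL terminator) and the content:

```python
def note_size(owner: str, descsz: int) -> int:
    """On-disk size of one note: header + padded name + padded content."""
    align4 = lambda n: (n + 3) & ~3
    return 12 + align4(len(owner) + 1) + align4(descsz)

# Sizes copied from the readelf listing above.
per_thread = [("CORE", 0x150), ("CORE", 0x200), ("LINUX", 0x440), ("CORE", 0x80)]
notes = [("CORE", 0x88)] + per_thread * 4 + [("CORE", 0x130), ("CORE", 0x6cee)]

total = sum(note_size(owner, size) for owner, size in notes)
print(hex(total))  # 0x9064
```

The result matches the length reported by readelf (0x9064), which is also the FileSiz of the PT_NOTE program header.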
Most data structures (prpsinfo, prstatus, etc.) are defined in C header files (such as linux/elfcore.h).
Generic process information
The CORE/NT_PRPSINFO entry holds generic process information such as the process state, UID, GID, filename and (part of) its arguments.
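As an illustration, here is how this entry could be decoded on x86-64. The field layout below is my reading of `struct elf_prpsinfo` for 64-bit Linux and should be treated as an assumption; it does at least add up to the 0x88 bytes reported by readelf above. `parse_prpsinfo` and the dictionary keys are illustrative names:

```python
import struct

# Assumed layout: pr_state, pr_sname, pr_zomb, pr_nice, 4 bytes of padding,
# pr_flag, pr_uid, pr_gid, pr_pid, pr_ppid, pr_pgrp, pr_sid,
# pr_fname[16], pr_psargs[80].
PRPSINFO64 = struct.Struct("<BcBb4xQIIiiii16s80s")

def c_string(raw: bytes) -> str:
    """Decode a NUL-padded fixed-size char array."""
    return raw.split(b"\0", 1)[0].decode("utf-8", errors="replace")

def parse_prpsinfo(desc: bytes) -> dict:
    (state, sname, zomb, nice, flag, uid, gid,
     pid, ppid, pgrp, sid, fname, psargs) = PRPSINFO64.unpack_from(desc)
    return {
        "state_char": sname.decode("ascii", errors="replace"),
        "uid": uid, "gid": gid,
        "pid": pid, "ppid": ppid,
        "fname": c_string(fname),
        "psargs": c_string(psargs),
    }
```

The fixed 80-byte `pr_psargs` array is why only part of the arguments is available.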
The CORE/NT_AUXV entry describes the auxiliary vector.
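On a 64-bit target, the auxiliary vector is a list of (type, value) pairs of 64-bit words terminated by an AT_NULL entry. A minimal decoding sketch, assuming little-endian data and a content size that is a multiple of 16 bytes (`parse_auxv` is an illustrative name):

```python
import struct

AT_NULL = 0    # end of vector
AT_PHDR = 3    # address of the program headers of the executable
AT_PAGESZ = 6  # system page size
AT_ENTRY = 9   # entry point of the executable

def parse_auxv(desc: bytes) -> list:
    """Return the (a_type, a_val) pairs of a 64-bit NT_AUXV note."""
    pairs = []
    for a_type, a_val in struct.iter_unpack("<QQ", desc):
        if a_type == AT_NULL:
            break
        pairs.append((a_type, a_val))
    return pairs
```

The 0x130 bytes of the note shown above hold 19 such pairs, terminator included.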
Thread information
Each thread has the following entries:
- CORE/NT_PRSTATUS (PID, PPID, content of the general purpose registers, etc.);
- CORE/NT_FPREGSET (content of the floating point registers);
- LINUX/NT_X86_XSTATE;
- CORE/NT_SIGINFO.
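As an example of what can be pulled out of NT_PRSTATUS, the sketch below reads the thread ID and the general purpose registers. The offsets and the register order are my understanding of `struct elf_prstatus` and `struct user_regs_struct` on x86-64 Linux and are assumptions; they are at least consistent with the 0x150-byte size reported by readelf:

```python
import struct

# Assumed x86-64 Linux layout of struct elf_prstatus.
PR_PID_OFFSET = 32   # pr_pid
PR_REG_OFFSET = 112  # pr_reg, 27 registers of 8 bytes each
REG_NAMES = ("r15", "r14", "r13", "r12", "rbp", "rbx", "r11", "r10",
             "r9", "r8", "rax", "rcx", "rdx", "rsi", "rdi", "orig_rax",
             "rip", "cs", "eflags", "rsp", "ss", "fs_base", "gs_base",
             "ds", "es", "fs", "gs")

def parse_prstatus(desc: bytes) -> tuple:
    """Return (pr_pid, registers) from an x86-64 NT_PRSTATUS content."""
    (pid,) = struct.unpack_from("<i", desc, PR_PID_OFFSET)
    values = struct.unpack_from("<27Q", desc, PR_REG_OFFSET)
    return pid, dict(zip(REG_NAMES, values))
```

With this, `regs["rip"]` and `regs["rsp"]` give the instruction and stack pointers of the thread at the time of the dump.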
For multithreaded processes, there are two approaches:
- either put the information for all threads in the same PT_NOTE, in which case the consumer must guess which entry belongs to which thread (in practice, an NT_PRSTATUS entry starts a new thread);
- or put each thread in a separate PT_NOTE.
See the wording of LLDB source code:
If a core file contains multiple thread contexts then there is two data forms
- Each thread context(2 or more NOTE entries) contained in its own segment (PT_NOTE)
- All thread context is stored in a single segment(PT_NOTE). This case is little tricker since while parsing we have to find where the new thread starts. The current implementation marks beginning of new thread when it finds NT_PRSTATUS or NT_PRPSINFO NOTE entry.
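For the single-segment case, the grouping heuristic can be sketched as follows. This is not LLDB's code, only an illustration of the idea: each NT_PRSTATUS starts a new thread, and the choice of which note types count as per-thread is my assumption based on the listing above:

```python
NT_PRSTATUS = 1
NT_FPREGSET = 2
NT_X86_XSTATE = 0x202
NT_SIGINFO = 0x53494749
PER_THREAD = {NT_PRSTATUS, NT_FPREGSET, NT_X86_XSTATE, NT_SIGINFO}

def group_by_thread(notes) -> tuple:
    """Split (owner, n_type) pairs into process-wide notes and per-thread lists."""
    process_wide, threads = [], []
    for owner, n_type in notes:
        if n_type == NT_PRSTATUS:
            threads.append([(owner, n_type)])      # a new thread starts here
        elif n_type in PER_THREAD and threads:
            threads[-1].append((owner, n_type))    # belongs to the current thread
        else:
            process_wide.append((owner, n_type))   # NT_PRPSINFO, NT_AUXV, NT_FILE...
    return process_wide, threads
```

Applied to the 19 notes of the dump above, this gives 4 threads of 4 notes each and 3 process-wide notes.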
File association
The CORE/NT_FILE entry describes the association between VMAs and files. Each non-anonymous VMA has an entry with:
- the position of the VMA in the virtual address space (Start, End);
- the offset of the VMA within the file (Page Offset);
- the associated file name.
Page size: 1
Start              End                Page Offset
0x0000000000400000 0x000000000049d000 0x0000000000000000 /usr/bin/xchat
0x000000000069c000 0x00000000006a0000 0x000000000009c000 /usr/bin/xchat
0x00007f2490885000 0x00007f24908a1000 0x0000000000000000 /usr/share/icons/gnome/icon-theme.cache
0x00007f24908a1000 0x00007f24908bd000 0x0000000000000000 /usr/share/icons/gnome/icon-theme.cache
0x00007f24908bd000 0x00007f2490eb0000 0x0000000000000000 /usr/share/fonts/opentype/ipafont-gothic/ipag.ttf
0x00007f2490eb0000 0x00007f2490eb2000 0x0000000000000000 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
0x00007f2490eb2000 0x00007f24910b1000 0x0000000000002000 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
0x00007f24910b1000 0x00007f24910b2000 0x0000000000001000 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
0x00007f24910b2000 0x00007f24910b3000 0x0000000000002000 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
0x00007f24910b3000 0x00007f2491113000 0x0000000000000000 /SYSV00000000 (deleted)
0x00007f2491914000 0x00007f2491abc000 0x0000000000000000 /usr/lib/x86_64-linux-gnu/libtcl8.6.so
0x00007f2491abc000 0x00007f2491cbc000 0x00000000001a8000 /usr/lib/x86_64-linux-gnu/libtcl8.6.so
0x00007f2491cbc000 0x00007f2491cca000 0x00000000001a8000 /usr/lib/x86_64-linux-gnu/libtcl8.6.so
0x00007f2491cca000 0x00007f2491ccd000 0x00000000001b6000 /usr/lib/x86_64-linux-gnu/libtcl8.6.so
0x00007f2491cd1000 0x00007f2491cd9000 0x0000000000000000 /usr/share/icons/hicolor/icon-theme.cache
0x00007f2491cd9000 0x00007f2491cf5000 0x0000000000000000 /usr/share/icons/oxygen/icon-theme.cache
0x00007f2491cf5000 0x00007f2491d11000 0x0000000000000000 /usr/share/icons/oxygen/icon-theme.cache
0x00007f2491d11000 0x00007f2491d1d000 0x0000000000000000 /usr/lib/xchat/plugins/tcl.so
[...]
As far as I understand (from the binutils readelf source code), the format of the CORE/NT_FILE entry is:
- number of map entries (32 or 64 bits);
- page size (set to 1 by GDB instead of the real page size, 32 or 64 bits);
- each map entry with the format:
  - start;
  - end;
  - file offset;
- each (null-terminated) path string in order.
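The layout above translates into a short decoder. This sketch assumes a 64-bit little-endian core (every integer is 8 bytes) and that file offsets are expressed in units of the page size field, so that a page size of 1 means plain byte offsets; `parse_nt_file` is an illustrative name:

```python
import struct

def parse_nt_file(desc: bytes) -> list:
    """Return (start, end, byte_offset, path) for each mapping in NT_FILE."""
    count, page_size = struct.unpack_from("<QQ", desc, 0)
    table_end = 16 + 24 * count  # header, then `count` entries of 3 words
    names = desc[table_end:].split(b"\0")
    mappings = []
    for i in range(count):
        start, end, file_ofs = struct.unpack_from("<QQQ", desc, 16 + 24 * i)
        path = names[i].decode("utf-8", errors="replace")
        mappings.append((start, end, file_ofs * page_size, path))
    return mappings
```

Joining the result with the PT_LOAD entries on the start address tells us which file to open, and at which offset, when FileSiz is 0.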