Anatomy of an ELF core file

The ELF format is used for compilation outputs (.o files), executables, shared libraries and core dumps. The first three are documented in the System V ABI specification and the TIS ELF specification, but there does not seem to be much documentation about how the ELF format is used for core dumps. Here are some notes on this.

Let's create a core dump and look at it:

pid=$(pgrep xchat)
gcore $pid
readelf -a core.$pid

ELF header

There is nothing special in the ELF header, except that e_type=ET_CORE marks the file as a core file:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          57666560 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         344
  Size of section headers:           64 (bytes)
  Number of section headers:         346
  Section header string table index: 345
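
To check this field without readelf, here is a minimal Python sketch that decodes the header of a little-endian ELF64 file (which is what the dump above is) and tests e_type. The names parse_elf64_header and is_core are made up for this example; the field layout is that of Elf64_Ehdr.

```python
import struct

ET_CORE = 4  # e_type value of a core file

# Fields that follow the 16-byte e_ident array in an Elf64_Ehdr.
EHDR_FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
               "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
               "e_shentsize", "e_shnum", "e_shstrndx")
EHDR = struct.Struct("<HHIQQQIHHHHHH")


def parse_elf64_header(data):
    """Return the Elf64_Ehdr fields (after e_ident) as a dict."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = ELF64), e_ident[5] the encoding (1 = LSB).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    return dict(zip(EHDR_FIELDS, EHDR.unpack_from(data, 16)))


def is_core(data):
    """True if the ELF header says the file is a core dump."""
    return parse_elf64_header(data)["e_type"] == ET_CORE
```

Only the first 64 bytes of the file are needed, so something like is_core(open(path, "rb").read(64)) is enough, where path is the core file produced by gcore.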

Program headers

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000004b80 0x0000000000000000 0x0000000000000000
                 0x0000000000009064 0x0000000000000000  R      1
  LOAD           0x000000000000dbe4 0x0000000000400000 0x0000000000000000
                 0x0000000000000000 0x000000000009d000  R E    1
  LOAD           0x000000000000dbe4 0x000000000069c000 0x0000000000000000
                 0x0000000000004000 0x0000000000004000  RW     1
  LOAD           0x0000000000011be4 0x00000000006a0000 0x0000000000000000
                 0x0000000000004000 0x0000000000004000  RW     1
  LOAD           0x0000000000015be4 0x0000000001872000 0x0000000000000000
                 0x0000000000ed4000 0x0000000000ed4000  RW     1
  LOAD           0x0000000000ee9be4 0x00007f248c000000 0x0000000000000000
                 0x0000000000021000 0x0000000000021000  RW     1
  LOAD           0x0000000000f0abe4 0x00007f2490885000 0x0000000000000000
                 0x000000000001c000 0x000000000001c000  R      1
  LOAD           0x0000000000f26be4 0x00007f24908a1000 0x0000000000000000
                 0x000000000001c000 0x000000000001c000  R      1
  LOAD           0x0000000000f42be4 0x00007f24908bd000 0x0000000000000000
                 0x00000000005f3000 0x00000000005f3000  R      1
  LOAD           0x0000000001535be4 0x00007f2490eb0000 0x0000000000000000
                 0x0000000000000000 0x0000000000002000  R E    1
  LOAD           0x0000000001535be4 0x00007f24910b1000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  R      1
  LOAD           0x0000000001536be4 0x00007f24910b2000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  RW     1
  LOAD           0x0000000001537be4 0x00007f24910b3000 0x0000000000000000
                 0x0000000000060000 0x0000000000060000  RW     1
  LOAD           0x0000000001597be4 0x00007f2491114000 0x0000000000000000
                 0x0000000000800000 0x0000000000800000  RW     1
  LOAD           0x0000000001d97be4 0x00007f2491914000 0x0000000000000000
                 0x0000000000000000 0x00000000001a8000  R E    1
  LOAD           0x0000000001d97be4 0x00007f2491cbc000 0x0000000000000000
                 0x000000000000e000 0x000000000000e000  R      1
  LOAD           0x0000000001da5be4 0x00007f2491cca000 0x0000000000000000
                 0x0000000000003000 0x0000000000003000  RW     1
  LOAD           0x0000000001da8be4 0x00007f2491ccd000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  RW     1
  LOAD           0x0000000001da9be4 0x00007f2491cd1000 0x0000000000000000
                 0x0000000000008000 0x0000000000008000  R      1
  LOAD           0x0000000001db1be4 0x00007f2491cd9000 0x0000000000000000
                 0x000000000001c000 0x000000000001c000  R      1
[...]

Each PT_LOAD entry in the program headers describes one VMA of the process:

  • VirtAddr is the virtual address of the beginning of the VMA.

  • MemSiz is the size of the VMA in the virtual address space.

  • Flags are the permissions of this VMA (read, write, execute).

  • Offset is the offset of the corresponding data in the core dump file. This is not the offset in the original mapped file.

  • FileSiz is the size of the corresponding data in this core file. Read-only file-mapped VMAs which have the same content as their originating file are not duplicated in the core file: their FileSiz is 0 and we are expected to read the original file to get the content.

  • The name of the mapped file and the offset in this file are not described here but in the PT_NOTE segment (its content is described later).

As these are VMAs, they are aligned on page boundaries.
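
As an illustration of how a consumer might use these fields, here is a small Python sketch which reads the PT_LOAD entries of a little-endian ELF64 core and finds where a virtual address lives in the file. The function names (load_segments, locate) and the result tags ("core", "absent", "unmapped") are invented for this example; phoff and phnum are the "Start of program headers" and "Number of program headers" values of the ELF header.

```python
import struct

PT_LOAD = 1
# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
#             p_filesz, p_memsz, p_align (56 bytes).
PHDR = struct.Struct("<IIQQQQQQ")


def load_segments(data, phoff, phnum):
    """Collect the PT_LOAD entries of the program header table."""
    segs = []
    for i in range(phnum):
        (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, _p_align) = PHDR.unpack_from(data, phoff + i * PHDR.size)
        if p_type == PT_LOAD:
            segs.append({"vaddr": p_vaddr, "memsz": p_memsz,
                         "offset": p_offset, "filesz": p_filesz,
                         "flags": p_flags})
    return segs


def locate(segs, addr):
    """Map a virtual address to an offset in the core file, if dumped."""
    for seg in segs:
        delta = addr - seg["vaddr"]
        if 0 <= delta < seg["memsz"]:
            if delta < seg["filesz"]:
                return ("core", seg["offset"] + delta)
            # Part of a VMA, but its content was not written to the core:
            # it has to be read from the originally mapped file.
            return ("absent", None)
    return ("unmapped", None)
```

With the dump above, an address in the first PT_LOAD (FileSiz is 0) is reported as "absent", while an address in the second one resolves to an offset starting at 0xdbe4.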

We can compare that with cat /proc/$pid/maps and we find the same information:

00400000-0049d000 r-xp 00000000 08:11 789936          /usr/bin/xchat
0069c000-006a0000 rw-p 0009c000 08:11 789936          /usr/bin/xchat
006a0000-006a4000 rw-p 00000000 00:00 0
01872000-02746000 rw-p 00000000 00:00 0               [heap]
7f248c000000-7f248c021000 rw-p 00000000 00:00 0
7f248c021000-7f2490000000 ---p 00000000 00:00 0
7f2490885000-7f24908a1000 r--p 00000000 08:11 1442232 /usr/share/icons/gnome/icon-theme.cache
7f24908a1000-7f24908bd000 r--p 00000000 08:11 1442232 /usr/share/icons/gnome/icon-theme.cache
7f24908bd000-7f2490eb0000 r--p 00000000 08:11 1313585 /usr/share/fonts/opentype/ipafont-gothic/ipag.ttf
7f2490eb0000-7f2490eb2000 r-xp 00000000 08:11 1195904 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
7f2490eb2000-7f24910b1000 ---p 00002000 08:11 1195904 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
7f24910b1000-7f24910b2000 r--p 00001000 08:11 1195904 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
7f24910b2000-7f24910b3000 rw-p 00002000 08:11 1195904 /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
7f24910b3000-7f2491113000 rw-s 00000000 00:04 1409039 /SYSV00000000 (deleted)
7f2491113000-7f2491114000 ---p 00000000 00:00 0
7f2491114000-7f2491914000 rw-p 00000000 00:00 0      [stack:1957]
[...]

The first three PT_LOAD entries of the core dump map to the VMAs of the xchat ELF file:

  • 00400000-0049d000, VMA corresponding to the readonly executable segment;

  • 0069c000-006a0000, VMA corresponding to the initialized part of the read-write segment;

  • 006a0000-006a4000, the part of the read-write segment which is not in the xchat ELF file (zero initialized, .bss).

We can compare this to the program headers of the xchat program:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001c0 0x00000000000001c0  R E    8
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000009c4b4 0x000000000009c4b4  R E    200000
  LOAD           0x000000000009c4b8 0x000000000069c4b8 0x000000000069c4b8
                 0x0000000000002bc9 0x0000000000007920  RW     200000
  DYNAMIC        0x000000000009c4d0 0x000000000069c4d0 0x000000000069c4d0
                 0x0000000000000360 0x0000000000000360  RW     8
  NOTE           0x000000000000021c 0x000000000040021c 0x000000000040021c
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x0000000000086518 0x0000000000486518 0x0000000000486518
                 0x0000000000002e64 0x0000000000002e64  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07
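
The relation between the two LOAD entries of the xchat program and the three VMAs can be checked with a bit of arithmetic. The sketch below is a simplification (it assumes 4 KiB pages and ignores everything else the loader and the dynamic linker do, such as permission changes); expected_vmas is a name made up for this example.

```python
PAGE = 0x1000  # assumption: 4 KiB pages


def page_down(addr):
    """Round an address down to a page boundary."""
    return addr & ~(PAGE - 1)


def page_up(addr):
    """Round an address up to a page boundary."""
    return (addr + PAGE - 1) & ~(PAGE - 1)


def expected_vmas(vaddr, filesz, memsz):
    """Page-aligned ranges covered by one PT_LOAD of an executable.

    The first range is backed by the file; the second one (if any) is
    the zero-initialized remainder, which has no content in the file.
    """
    start = page_down(vaddr)
    file_end = page_up(vaddr + filesz)
    mem_end = page_up(vaddr + memsz)
    vmas = [(start, file_end)]
    if mem_end > file_end:
        vmas.append((file_end, mem_end))
    return vmas


# The two LOAD entries of the xchat program shown above.
text = expected_vmas(0x400000, 0x9c4b4, 0x9c4b4)
data = expected_vmas(0x69c4b8, 0x2bc9, 0x7920)
print([(hex(a), hex(b)) for a, b in text + data])
```

The three ranges obtained this way are 0x400000-0x49d000, 0x69c000-0x6a0000 and 0x6a0000-0x6a4000, which are the first three lines of /proc/$pid/maps.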

Sections

ELF core dumps are not expected to have section headers. The Linux kernel does not generate section headers when it generates core dumps. GDB generates section headers with the same information as the program headers:

  • the SHT_NOBITS sections are not present in the core file but reference parts of other existing files;

  • the SHT_PROGBITS sections are present in the core file;

  • the SHT_NOTE section header maps to the PT_NOTE program header.

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] note0             NOTE             0000000000000000  00004b80
       0000000000009064  0000000000000000   A       0     0     1
  [ 2] load              NOBITS           0000000000400000  0000dbe4
       000000000009d000  0000000000000000  AX       0     0     1
  [ 3] load              PROGBITS         000000000069c000  0000dbe4
       0000000000004000  0000000000000000  WA       0     0     1
  [ 4] load              PROGBITS         00000000006a0000  00011be4
       0000000000004000  0000000000000000  WA       0     0     1
  [ 5] load              PROGBITS         0000000001872000  00015be4
       0000000000ed4000  0000000000000000  WA       0     0     1
  [ 6] load              PROGBITS         00007f248c000000  00ee9be4
       0000000000021000  0000000000000000  WA       0     0     1
  [ 7] load              PROGBITS         00007f2490885000  00f0abe4
       000000000001c000  0000000000000000   A       0     0     1
  [ 8] load              PROGBITS         00007f24908a1000  00f26be4
       000000000001c000  0000000000000000   A       0     0     1
  [ 9] load              PROGBITS         00007f24908bd000  00f42be4
       00000000005f3000  0000000000000000   A       0     0     1
  [10] load              NOBITS           00007f2490eb0000  01535be4
       0000000000002000  0000000000000000  AX       0     0     1
  [11] load              PROGBITS         00007f24910b1000  01535be4
       0000000000001000  0000000000000000   A       0     0     1
  [12] load              PROGBITS         00007f24910b2000  01536be4
       0000000000001000  0000000000000000  WA       0     0     1
  [13] load              PROGBITS         00007f24910b3000  01537be4
       0000000000060000  0000000000000000  WA       0     0     1
[...]
  [345] .shstrtab         STRTAB           0000000000000000  036febe4
       0000000000000016  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Notes

The PT_NOTE program header contains additional information such as the registers of the threads, the files associated with each VMA, etc. It is made of entries (each starting with an ElfW(Nhdr) structure) with:

  • an originator name;

  • an originator-specific type ID (a 4-byte value);

  • some binary content.
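
A note segment can be walked with a few lines of Python. This is a sketch, assuming a little-endian file and the usual 4-byte padding of the name and content fields; iter_notes is a name chosen for this example. The ElfW(Nhdr) header is three 32-bit words: n_namesz, n_descsz and n_type.

```python
import struct


def iter_notes(blob):
    """Yield (name, type, content) for each entry of a PT_NOTE segment."""
    pos = 0
    while pos + 12 <= len(blob):
        namesz, descsz, ntype = struct.unpack_from("<III", blob, pos)
        pos += 12
        # The name is NUL-terminated and padded to a multiple of 4 bytes.
        name = blob[pos:pos + namesz].rstrip(b"\0").decode("ascii", "replace")
        pos += (namesz + 3) & ~3
        # The content is padded in the same way.
        desc = blob[pos:pos + descsz]
        pos += (descsz + 3) & ~3
        yield name, ntype, desc
```

Here blob would be the FileSiz bytes found at the Offset of the PT_NOTE program header (0x9064 bytes at 0x4b80 in the dump above).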

Here is the content of the notes:

Displaying notes found at file offset 0x00004b80 with length 0x00009064:
  Owner                 Data size       Description
  CORE                 0x00000088       NT_PRPSINFO (prpsinfo structure)

  CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
  CORE                 0x00000200       NT_FPREGSET (floating point registers)
  LINUX                0x00000440       NT_X86_XSTATE (x86 XSAVE extended state)
  CORE                 0x00000080       NT_SIGINFO (siginfo_t data)

  CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
  CORE                 0x00000200       NT_FPREGSET (floating point registers)
  LINUX                0x00000440       NT_X86_XSTATE (x86 XSAVE extended state)
  CORE                 0x00000080       NT_SIGINFO (siginfo_t data)

  CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
  CORE                 0x00000200       NT_FPREGSET (floating point registers)
  LINUX                0x00000440       NT_X86_XSTATE (x86 XSAVE extended state)
  CORE                 0x00000080       NT_SIGINFO (siginfo_t data)

  CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
  CORE                 0x00000200       NT_FPREGSET (floating point registers)
  LINUX                0x00000440       NT_X86_XSTATE (x86 XSAVE extended state)
  CORE                 0x00000080       NT_SIGINFO (siginfo_t data)

  CORE                 0x00000130       NT_AUXV (auxiliary vector)
  CORE                 0x00006cee       NT_FILE (mapped files)

Most data structures (prpsinfo, prstatus, etc.) are defined in C header files (such as linux/elfcore.h).

Generic process information

The CORE/NT_PRPSINFO entry defines generic process information such as the process state, UID, GID, filename and (part of) its arguments.
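
As a sketch, the content of this entry can be decoded as follows. The layout is the one of struct elf_prpsinfo as I understand it for x86-64 Linux (it gives 0x88 bytes, which matches the size reported by readelf above); other architectures use different field widths, so treat the format string as an assumption. parse_prpsinfo is a name made up for this example.

```python
import struct

# pr_state, pr_sname, pr_zomb, pr_nice, (padding), pr_flag, pr_uid, pr_gid,
# pr_pid, pr_ppid, pr_pgrp, pr_sid, pr_fname[16], pr_psargs[80]
PRPSINFO = struct.Struct("<BcBb4xQIIiiii16s80s")


def c_string(raw):
    """Decode a fixed-size, NUL-padded C string."""
    return raw.split(b"\0", 1)[0].decode("utf-8", "replace")


def parse_prpsinfo(desc):
    """Decode the content of a CORE/NT_PRPSINFO note (x86-64 layout)."""
    (state, sname, zomb, nice, flag, uid, gid,
     pid, ppid, pgrp, sid, fname, psargs) = PRPSINFO.unpack_from(desc)
    return {"state": state, "sname": sname.decode("ascii", "replace"),
            "zombie": bool(zomb), "nice": nice, "flag": flag,
            "uid": uid, "gid": gid, "pid": pid, "ppid": ppid,
            "pgrp": pgrp, "sid": sid,
            "fname": c_string(fname), "psargs": c_string(psargs)}
```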

The CORE/NT_AUXV entry describes the auxiliary vector.
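
On a 64-bit target the auxiliary vector is a sequence of (type, value) pairs of 64-bit words, terminated by an AT_NULL entry. A minimal sketch for a little-endian dump (parse_auxv is a name chosen for this example; only a few AT_* constants are listed):

```python
import struct

AT_NULL = 0     # end of the vector
AT_PHDR = 3     # address of the program headers of the executable
AT_PAGESZ = 6   # system page size
AT_ENTRY = 9    # entry point of the executable


def parse_auxv(desc):
    """Decode the content of a CORE/NT_AUXV note into (type, value) pairs."""
    entries = []
    for a_type, a_val in struct.iter_unpack("<QQ", desc):
        if a_type == AT_NULL:
            break
        entries.append((a_type, a_val))
    return entries
```

The 0x130 bytes reported by readelf above correspond to 19 such pairs, the terminating AT_NULL included.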

Thread information

Each thread has the following entries:

  • CORE/NT_PRSTATUS (PID, PPID, general purpose registers, etc.);

  • CORE/NT_FPREGSET (floating point registers);

  • LINUX/NT_X86_XSTATE (x86 XSAVE extended state);

  • CORE/NT_SIGINFO.

For multithreaded processes there are two approaches:

  • either put the information of all the threads in the same PT_NOTE, in which case the consumer must guess which entry belongs to which thread (in practice, a NT_PRSTATUS entry starts a new thread);

  • or move each thread in a separate PT_NOTE.

See the wording of the LLDB source code:

If a core file contains multiple thread contexts then there is two data forms

  1. Each thread context(2 or more NOTE entries) contained in its own segment (PT_NOTE)

  2. All thread context is stored in a single segment(PT_NOTE). This case is little tricker since while parsing we have to find where the new thread starts. The current implementation marks beginning of new thread when it finds NT_PRSTATUS or NT_PRPSINFO NOTE entry.
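
For the single PT_NOTE case, the grouping can be sketched like this. This is not the LLDB code, just an illustration of the "NT_PRSTATUS starts a new thread" heuristic; group_by_thread is a made-up name and the input is a list of (owner, type, content) tuples in file order.

```python
NT_PRSTATUS = 1
NT_FPREGSET = 2
NT_X86_XSTATE = 0x202
NT_SIGINFO = 0x53494749

# Entries which are attached to the most recently seen NT_PRSTATUS.
PER_THREAD = {("CORE", NT_FPREGSET),
              ("LINUX", NT_X86_XSTATE),
              ("CORE", NT_SIGINFO)}


def group_by_thread(notes):
    """Split notes into process-wide notes and one list per thread."""
    process_notes, threads = [], []
    for note in notes:
        key = (note[0], note[1])
        if key == ("CORE", NT_PRSTATUS):
            threads.append([note])        # a new thread starts here
        elif threads and key in PER_THREAD:
            threads[-1].append(note)      # belongs to the current thread
        else:
            process_notes.append(note)    # NT_PRPSINFO, NT_AUXV, NT_FILE, ...
    return process_notes, threads
```

Applied to the notes listed above, this would give four threads of four entries each, plus three process-wide entries (NT_PRPSINFO, NT_AUXV and NT_FILE).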

File association

The CORE/NT_FILE entry describes the association between VMAs and files. Each non-anonymous VMA has an entry with:

  • the position of the VMA in the virtual address space (Start, End);

  • the offset of the VMA within the file (Page Offset);

  • the associated file name.

    Page size: 1
                 Start                 End         Page Offset
    0x0000000000400000  0x000000000049d000  0x0000000000000000
        /usr/bin/xchat
    0x000000000069c000  0x00000000006a0000  0x000000000009c000
        /usr/bin/xchat
    0x00007f2490885000  0x00007f24908a1000  0x0000000000000000
        /usr/share/icons/gnome/icon-theme.cache
    0x00007f24908a1000  0x00007f24908bd000  0x0000000000000000
        /usr/share/icons/gnome/icon-theme.cache
    0x00007f24908bd000  0x00007f2490eb0000  0x0000000000000000
        /usr/share/fonts/opentype/ipafont-gothic/ipag.ttf
    0x00007f2490eb0000  0x00007f2490eb2000  0x0000000000000000
        /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
    0x00007f2490eb2000  0x00007f24910b1000  0x0000000000002000
        /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
    0x00007f24910b1000  0x00007f24910b2000  0x0000000000001000
        /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
    0x00007f24910b2000  0x00007f24910b3000  0x0000000000002000
        /usr/lib/x86_64-linux-gnu/gconv/CP1252.so
    0x00007f24910b3000  0x00007f2491113000  0x0000000000000000
        /SYSV00000000 (deleted)
    0x00007f2491914000  0x00007f2491abc000  0x0000000000000000
        /usr/lib/x86_64-linux-gnu/libtcl8.6.so
    0x00007f2491abc000  0x00007f2491cbc000  0x00000000001a8000
        /usr/lib/x86_64-linux-gnu/libtcl8.6.so
    0x00007f2491cbc000  0x00007f2491cca000  0x00000000001a8000
        /usr/lib/x86_64-linux-gnu/libtcl8.6.so
    0x00007f2491cca000  0x00007f2491ccd000  0x00000000001b6000
        /usr/lib/x86_64-linux-gnu/libtcl8.6.so
    0x00007f2491cd1000  0x00007f2491cd9000  0x0000000000000000
        /usr/share/icons/hicolor/icon-theme.cache
    0x00007f2491cd9000  0x00007f2491cf5000  0x0000000000000000
        /usr/share/icons/oxygen/icon-theme.cache
    0x00007f2491cf5000  0x00007f2491d11000  0x0000000000000000
        /usr/share/icons/oxygen/icon-theme.cache
    0x00007f2491d11000  0x00007f2491d1d000  0x0000000000000000
        /usr/lib/xchat/plugins/tcl.so
[...]

As far as I understand (from the binutils readelf source code) the format of the CORE/NT_FILE entry is:

  1. number of map entries (32 or 64 bits);
  2. page size (set to 1 by GDB instead of the real page size, 32 or 64 bits);
  3. each map entry with the format:
    1. start;
    2. end;
    3. file offset;
  4. each (null terminated) path string in order.
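
Following this description, here is a Python sketch which decodes a CORE/NT_FILE entry of a little-endian 64-bit dump (all the words are then 64 bits wide). It reflects my reading of the format above, not an official specification; parse_nt_file is a name made up for this example. The file offset is multiplied by the page size field, which is a no-op for the GDB dump shown here since that field is 1.

```python
import struct


def parse_nt_file(desc):
    """Decode a CORE/NT_FILE note into (start, end, file offset, path) tuples."""
    count, page_size = struct.unpack_from("<QQ", desc, 0)
    ranges = [struct.unpack_from("<QQQ", desc, 16 + 24 * i)
              for i in range(count)]
    # The paths follow the table, NUL-terminated, in the same order.
    paths = desc[16 + 24 * count:].split(b"\0")[:count]
    return [(start, end, offset * page_size, path.decode("utf-8", "replace"))
            for (start, end, offset), path in zip(ranges, paths)]
```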

References