Deep understanding of ELF files

2022-06-25 17:48:00 Akur studio

Getting to know executable programs

A source file goes through the following main steps on its way to becoming an executable program.

After the compiler processes a source file, it produces a relocatable object file, the familiar .o file. The linker then combines one or more .o files into an executable file.

  • Relocatable object file

A .o file is called a relocatable object file. It contains binary code and data in a form that can be combined with other object files to create an executable object file.

Because a .o file is also a kind of ELF file, we can use readelf -h to view its ELF header:

$ readelf -h main.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          960 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         14
  Section header string table index: 13

Compare this output with the file header structure:

typedef struct
{

  unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
  Elf64_Half e_type;   /* Object file type */
  Elf64_Half e_machine;  /* Architecture */
  Elf64_Word e_version;  /* Object file version */
  Elf64_Addr e_entry;  /* Entry point virtual address */
  Elf64_Off e_phoff;  /* Program header table file offset */
  Elf64_Off e_shoff;  /* Section header table file offset */
  Elf64_Word e_flags;  /* Processor-specific flags */
  Elf64_Half e_ehsize;  /* ELF header size in bytes */
  Elf64_Half e_phentsize;  /* Program header table entry size */
  Elf64_Half e_phnum;  /* Program header table entry count */
  Elf64_Half e_shentsize;  /* Section header table entry size */
  Elf64_Half e_shnum;  /* Section header table entry count */
  Elf64_Half e_shstrndx;  /* Section header string table index */
} Elf64_Ehdr;

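The first thing we see is Magic, the magic number. Its length is fixed by the macro #define EI_NIDENT (16), so Magic occupies the first 16 bytes of the ELF file header. The meaning of each byte is as follows:

  • Bytes 0-3 (7f 45 4c 46): the magic number itself, 0x7f followed by the ASCII characters 'E', 'L', 'F'
  • Byte 4 (02): file class, 1 for 32-bit and 2 for 64-bit
  • Byte 5 (01): data encoding, 1 for little endian and 2 for big endian
  • Byte 6 (01): ELF version, always 1 (current)
  • Byte 7 (00): OS/ABI, 0 for UNIX System V
  • Byte 8 (00): ABI version
  • Bytes 9-15: padding, reserved and set to 0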
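The Magic line printed by readelf can also be decoded by hand. The following is a minimal sketch, not code from the author's parser: the names Ident, has_elf_magic, elf_class_name, elf_data_name and kSampleIdent are hypothetical, and the index constants are written out locally to mirror the EI_CLASS and EI_DATA macros from <elf.h>.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Indices into e_ident. The values mirror the EI_* macros in <elf.h>.
constexpr std::size_t kEiClass = 4;  // EI_CLASS
constexpr std::size_t kEiData = 5;   // EI_DATA

// Hypothetical alias for the 16 (EI_NIDENT) identification bytes.
using Ident = std::array<std::uint8_t, 16>;

// True when the first four bytes are 0x7f 'E' 'L' 'F'.
constexpr bool has_elf_magic(const Ident &id)
{
    return id[0] == 0x7f && id[1] == 'E' && id[2] == 'L' && id[3] == 'F';
}

// Byte 4: 1 means a 32-bit file, 2 means a 64-bit file.
constexpr std::string_view elf_class_name(const Ident &id)
{
    switch (id[kEiClass]) {
    case 1: return "ELF32";
    case 2: return "ELF64";
    default: return "invalid class";
    }
}

// Byte 5: 1 means little endian, 2 means big endian.
constexpr std::string_view elf_data_name(const Ident &id)
{
    switch (id[kEiData]) {
    case 1: return "little endian";
    case 2: return "big endian";
    default: return "invalid encoding";
    }
}

// The Magic bytes shown by readelf -h for main.o.
constexpr Ident kSampleIdent = {0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00,
                                0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
```

Applied to kSampleIdent, these helpers report a 64-bit, little-endian ELF file, which matches the Class and Data lines in the readelf output.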

Using readelf -S, we can sketch the rough composition of the relocatable file, ordered by file offset:

[Figure: layout of the relocatable file by file offset]

  • Executable file

We compile the same source code into an executable program, and then use readelf -h to view the header of the executable:

$ readelf -h a.out
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x1060
  Start of program headers:          64 (bytes into file)
  Start of section headers:          14744 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         13
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 30

Using readelf -S, we can see that the sections of the executable file are laid out roughly as follows:

[Figure: section layout of the executable file]

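Comparing the headers of the relocatable file and the executable file, we can see the following differences:

  • Type: REL (relocatable file) versus DYN (shared object file)
  • Entry point address: 0x0 versus 0x1060, since only the executable has an entry point
  • Program headers: none in the relocatable file (start 0, size 0, count 0), versus 13 entries of 56 bytes each starting at offset 64 in the executable
  • Section headers: 14 entries starting at offset 960 versus 31 entries starting at offset 14744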

[IMPORTANT] Everything above is only what the official tools show us. Could the tools be misleading us? Is a real executable actually laid out the way readelf reports?

We can follow readelf's example and parse an ELF file ourselves, using a real ELF file to understand how ELF files are put together.

https://github.com/zzu-andrew/note_book/src/elf_parser/elf_parser.h

// 1. Map the executable file into memory (read-only, private mapping)
mmap_res = ::mmap(nullptr, program_length_, PROT_READ, MAP_PRIVATE, fd_, 0);
if (mmap_res == MAP_FAILED)
{
    ERROR_EXIT("mmap");
}
mmap_program_ = static_cast<std::uint8_t *>(mmap_res);

// 2. The file header sits at the very start of the mapping
const Elf64_Ehdr *file_header = reinterpret_cast<const Elf64_Ehdr *>(mmap_program_);

// 3. Locate the section header table and the section name string table
const Elf64_Shdr *section_table =
    reinterpret_cast<const Elf64_Shdr *>(mmap_program_ + file_header->e_shoff);

// e_shstrndx = 35 for this file. If it is SHN_XINDEX, the real index is
// stored in the sh_link field of section header 0.
size_t section_string_table_index = file_header->e_shstrndx == SHN_XINDEX ?
                                    section_table[0].sh_link :
                                    file_header->e_shstrndx;
const char *section_string_table = reinterpret_cast<const char *>(
    &mmap_program_[section_table[section_string_table_index].sh_offset]);

// The section count is normally e_shnum. Only when e_shnum is 0 is the real
// count stored in the sh_size field of section header 0.
Elf64_Xword section_number = file_header->e_shnum != 0 ?
                             file_header->e_shnum :
                             section_table[0].sh_size;

After these steps, printing the file header gives:

$3 = {e_ident = "\177ELF\002\001\001\000\000\000\000\000\000\000\000", e_type = 3, e_machine = 62, e_version = 1, e_entry = 4512, e_phoff = 64, e_shoff = 36184, e_flags = 0, e_ehsize = 64, e_phentsize = 56,
e_phnum = 13, e_shentsize = 64, e_shnum = 36, e_shstrndx = 35}

Reading the size of the opened file gives the size of the whole executable: fileSize = 38488. From the file header we know:

  • The ELF header size is e_ehsize = 64
  • The program header table offset is e_phoff = 64; each entry is e_phentsize = 56 bytes and there are e_phnum = 13 entries
  • The section header table offset is e_shoff = 36184; each entry is e_shentsize = 64 bytes and there are e_shnum = 36 entries
  • Section address offset: 36184

Start with the ELF header. When we cast the pointer to the start of the file directly to Elf64_Ehdr, the extracted data matches the file exactly, so the header stored at the start of the ELF file is indeed what readelf reports.

Next, working from the offsets, the program header table should follow immediately after the ELF header. It should therefore start 64 bytes past the start of the file, and the value of e_phoff is indeed 64.
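The same offset arithmetic can be written down as a small check. This is only a sketch using the header values printed earlier; the function names follows_elf_header and program_header_table_end are hypothetical and do not come from the author's parser.

```cpp
#include <cstdint>

// True when the program header table starts right where the ELF header ends.
constexpr bool follows_elf_header(std::uint64_t e_phoff, std::uint16_t e_ehsize)
{
    return e_phoff == e_ehsize;
}

// File offset one past the last program header:
// start of the table plus entry size times entry count.
constexpr std::uint64_t program_header_table_end(std::uint64_t e_phoff,
                                                 std::uint16_t e_phentsize,
                                                 std::uint16_t e_phnum)
{
    return e_phoff + std::uint64_t{e_phentsize} * e_phnum;
}
```

With e_phoff = 64, e_phentsize = 56 and e_phnum = 13, the program header table occupies file offsets 64 up to (but not including) 64 + 56 * 13 = 792.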

Now let's verify that the section header table is stored at the tail of the file. From the ELF header we know that each section header is 64 bytes, the section header table starts at offset 36184, and there are 36 section headers. According to the layout above, the section header table is the last part of the file, so the equation fileSize - e_shoff = e_shentsize * e_shnum must hold. Otherwise, the tail of the ELF executable is not entirely occupied by section headers.

38488 - 36184 = 2304 = 36 * 64 (total length of the section header table)

The calculation confirms that the end of the ELF file is indeed filled with section headers. The space occupied by the other sections can be verified in the same way on top of the original program, so we will not check them one by one here.
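The tail check can be expressed as a single predicate. This is a minimal sketch under the assumption that the four values have already been read from the file; the name section_headers_fill_tail is hypothetical.

```cpp
#include <cstdint>

// True when the section header table runs exactly from e_shoff to the end of
// the file, i.e. fileSize - e_shoff == e_shentsize * e_shnum.
constexpr bool section_headers_fill_tail(std::uint64_t file_size,
                                         std::uint64_t e_shoff,
                                         std::uint16_t e_shentsize,
                                         std::uint16_t e_shnum)
{
    // Guard against an offset past the end of the file before subtracting.
    return e_shoff <= file_size &&
           file_size - e_shoff == std::uint64_t{e_shentsize} * e_shnum;
}
```

For the file examined here, section_headers_fill_tail(38488, 36184, 64, 36) holds, while changing any one of the values (for example using 35 section headers) makes it fail.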

-pie && -no-pie

Careful readers may have noticed that readelf -h reports the type of my executable as a shared object. This is because my system is Ubuntu: on many current Ubuntu systems the default compiler passes -pie by default, so the generated executable is marked as a shared object.

A PIE (Position-Independent Executable) sits between a shared library and an ordinary executable: it is a program whose addresses can be relocated at load time, just like a shared library.

PIE was first implemented at Red Hat by adding a -pie option to the linker, so that objects compiled with -fPIE can be linked into a position-independent executable.

A standard executable requires a fixed address and only runs correctly when loaded at that address. PIE lets a program be loaded anywhere in main memory, just like a shared library. This requires compiling the program as position-independent code and linking it as an ELF shared object.

PIE was introduced so that a program can be loaded at a random address. Normally the kernel runs at a fixed address; if position-independent code is used instead, it becomes hard for an attacker to exploit executable code in the system, and attacks such as buffer overflows become much harder to carry out. The cost of this security improvement is very small.
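The e_type values seen in this article (1 for main.o, 3 for the PIE executable) can be mapped back to names with a small helper. This is a sketch only: elf_type_name and may_be_pie are hypothetical names, and the numeric cases mirror ET_REL, ET_EXEC, ET_DYN and ET_CORE from <elf.h>. Note that e_type alone cannot tell a PIE from a shared library, since both use ET_DYN; distinguishing them needs further information from the file, which this sketch does not attempt.

```cpp
#include <cstdint>
#include <string_view>

// Map the e_type field of the ELF header to a readable name.
constexpr std::string_view elf_type_name(std::uint16_t e_type)
{
    switch (e_type) {
    case 1: return "REL (relocatable file)";       // ET_REL
    case 2: return "EXEC (executable file)";       // ET_EXEC, e.g. built with -no-pie
    case 3: return "DYN (shared object or PIE)";   // ET_DYN
    case 4: return "CORE (core file)";             // ET_CORE
    default: return "other";
    }
}

// A PIE is always ET_DYN, but so is every shared library,
// so this is a necessary condition rather than a proof.
constexpr bool may_be_pie(std::uint16_t e_type)
{
    return e_type == 3;
}
```

The gdb output above shows e_type = 3, so elf_type_name reports the DYN case for that file, consistent with what readelf -h printed.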

The complete Linux binary analysis code is available at https://github.com/zzu-andrew/note_book/src/elf_parser

For the full document, see: https://github.com/zzu-andrew/note_book/Linux/Linux Binary analysis .adoc

Copyright notice
This article was written by [Akur studio]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/176/202206251731465019.html