A friend who has been studying information security recently sent me a question to discuss. It looks simple, but it turned out to be interesting, so I'm writing it up here. The question is as follows:
The picture shows a piece of C code followed by three questions. I'll reproduce the code here:
#include <stdio.h>

int main(int argc, char* argv[])
{
    int apple;

    char buf[9];

    gets(buf);

    if (apple == 0x64636261)
    {
        printf("hello world!");
    }

    return 0;
}
I'll write out the three questions as well:
(1) Identify what kind of overflow this is.
(2) Given the address of the variable apple, for example 0x0012ff44, give the address of each character in buf.
(3) The ASCII codes of a, b, c, and d are 0x61, 0x62, 0x63, and 0x64 respectively. Give an input for buf that makes the program print hello world.
What is a buffer
In short, a buffer is an area of memory where data is stored. Depending on how that memory is allocated, it is either stack memory or heap memory.
Stack memory holds local variables, function parameters, and so on. It is also used to preserve context when a function is called, for example by saving the function's return address. The stack is maintained by the CPU; on a 32-bit x86 system it is tracked by the EBP and ESP registers.
Heap memory is requested explicitly by the programmer, for example with malloc in C or new in C++, and the programmer is responsible for releasing it afterward. Stack memory, by contrast, is reclaimed automatically when the function returns.
What is buffer overflow
A buffer overflow usually means that memory gets overwritten: data is written past the end of a buffer and lands on whatever sits next to it. Because buffers live either on the stack or on the heap, buffer overflows are divided into stack overflows and heap overflows. Many early C/C++ library functions do not check memory boundaries; all bounds checking is left to the programmer, so a moment of carelessness can cause an overflow. Today most functions that operate on memory have variants that add such checks, which makes them safer than before.
Some security books claim that the way to avoid buffer overflows is to use heap memory instead of stack memory. That is wrong: improper use of heap memory can also cause overflows, and it carries security risks of its own.
Buffer overflow attack
The essence of a buffer overflow attack is that data gets run as code. In a program with a buffer overflow vulnerability, the attacker places executable code into memory as data and then causes that data to be executed in some specific way, thereby achieving the goal of the attack.
The question
With that groundwork laid, let's turn to the question itself.
The first question asks what kind of overflow the code contains. In the code, the array buf[9] is a buffer, and buf is a local variable, so it is stored on the stack. gets() reads user input but does not check memory boundaries. buf is 9 bytes long, but if the user enters more than that, gets() stores all of it anyway. (In fact a 9-character line is already too long, because gets() also appends a terminating '\0'.) This is a buffer overflow, more specifically a stack overflow. It is possible because C/C++ does not check array bounds at run time: the language does nothing to stop a write past the end of an array, and some code even does this deliberately in order to store data of variable length.
The second question: if the memory address of apple is 0x0012ff44, give the address of each character in buf. A variable is essentially a name given to the first address of a piece of memory, and the type of the variable determines how long that piece of memory is. For example, 0x0012ff44 is a memory address, and we give it the name apple; since apple has type int, the variable is 4 bytes long.
So the question gives us the address of apple and asks us to work out the addresses in buf. We need to know two more things. First, local variables live on the stack, and the stack grows from high addresses toward low addresses. Second, in this example the compiler allocates local variables on the stack in the order they are defined. (The C standard does not require this, but it is what VC6 does in a Debug build, as the demonstration later confirms.) In the code, apple is defined first and buf afterward, so the address of apple is higher than the address of buf, as shown in the picture.
With those two points in mind, what is the address of buf? Start with the memory that apple actually occupies. The address 0x0012ff44 is only the first address of apple, because a single address refers to just one byte of memory. apple is an int and occupies 4 bytes, so it actually occupies the four addresses 0x0012ff44, 0x0012ff45, 0x0012ff46, and 0x0012ff47, with 0x0012ff44 as its first address.
Now for buf. It is defined as char buf[9], so it occupies 9 bytes, and because it is defined after apple, its address on the stack must be lower than that of apple. Can we simply subtract 9 from the address of apple to get the address of buf? Not quite. Although buf occupies 9 bytes, on a 32-bit CPU data in memory is generally aligned to 4 bytes (32 bits is exactly 4 bytes), so the space reserved for buf is rounded up to 12 (0xC) bytes. The first address of buf is therefore 0x0012ff44 - 0xC. The memory layout is shown in the figure below.
In the figure above, the part marked in red is the memory of buf, the part marked in green is the memory of apple, and the white part is the memory used for alignment. Isn't that a waste of memory? Yes! But on a 32-bit system the CPU accesses memory fastest when it is aligned to 4 bytes, so giving up 3 bytes to alignment in exchange for faster reads is worth it. Two phrases come up often when discussing algorithms, "trade space for time" and "trade time for space"; this is clearly a case of trading space for time. As the figure shows, the starting address of buf is 0x0012ff38, so its nine characters occupy 0x0012ff38 through 0x0012ff40, and the three padding bytes occupy 0x0012ff41 through 0x0012ff43.
The third question asks us to make the program print the string "hello world". According to the code, the string is printed only when apple equals 0x64636261, yet nothing in the code ever assigns a value to apple. And what is 0x64636261 anyway? The question gives a hint: 0x61 is the ASCII code of the lowercase letter a, 0x62 is the ASCII code of the lowercase letter b, and so on. In other words, we just need to fill apple with the letters abcd. See the picture below.
All we have to do is give buf more than 9 characters through gets, so that the input overwrites the memory after it. How many characters do we need? buf is 9 bytes long, the alignment padding is 3 bytes, and apple is 4 bytes, so we need 16 characters in total: the first 12 can be anything, and the last 4 must be abcd.
But wait: the code tests apple == 0x64636261, which read left to right looks like dcba, so why is the input abcd? This is a matter of byte order. x86 is little-endian, so the least significant byte of an integer (here 0x61) is stored at the lowest address, which is exactly where the first character of abcd lands. I won't go into more detail here, but once you understand byte order the answer is obvious. Byte order comes up constantly in network programming and reverse engineering; it is the most basic of basics.
Demonstration
I use XP + VC6 for this demonstration. Why VC6? Because newer versions of VS no longer provide the gets function: it is unsafe, so it was removed.
Enter the code above into VC6 and build it in Debug mode. (In a Release build the generated binary is optimized, the memory layout is less obvious, and the overflow would have to be done differently. Since this is an exam question, the simplest setup shows the problem best.)
After compiling, set a breakpoint at gets(), then open the "watch" window and look at the memory addresses of apple and buf, as shown in the picture below.
As you can see, the address of apple is 0x0012ff7c and the address of buf is 0x0012ff70. Surprised? These differ from the addresses in the question! Don't worry. The same program can have different variable addresses on different operating systems (for example XP and Win7), and even on different patch levels of the same system (XP SP2 and XP SP3). What matters are two points: first, the address of apple is higher than the address of buf; second, the two addresses differ by 0xC. On both points, the result matches our earlier analysis.
Then open the "memory" window and look at the memory, as shown in the picture below.
Next, set a breakpoint at the if statement and let the program run so that we can enter input, as shown in the picture below.
I entered twelve 1s, because the first 12 characters can be anything, followed by abcd. After I pressed Enter, the program stopped at the breakpoint on the if statement. Now look at the memory again, as shown in the picture below.
The picture shows that the memory at 0x0012ff7c, the stack space of apple, has been filled with 0x61, 0x62, 0x63, and 0x64. Nothing in the program assigns a value to apple, but by overflowing buf we have overwritten the memory of apple and effectively assigned it a value. Let the program continue and watch the result, as shown in the picture below.
As you can see, the string "hello world" is printed.
Summary
That completes the analysis of the whole question. There is nothing difficult in it, just some basic knowledge. So what practical significance does a question like this have? Take its code as an example. Suppose gets received a password, a particular function ran only if the password was correct, and the result of the check was stored in a flag. Then even without knowing the correct password, couldn't you run that function simply by overflowing the buffer to overwrite the flag? Of course, this is only a simple example. Attacks such as buffer overflow, SQL injection, and XSS all come from the same cause: external input is not checked strictly enough and ends up being treated as code, and that creates the security problem. In essence they are the same. So for programmers the rule is: never trust any external input, and always validate it strictly.