IO model review

2022-07-07 10:26:00 【Zong_ 0915】

Preface
1. Basic concepts of IO
2. IO models
3. Summary
- 3.1 Differences among select, poll, and epoll
- 3.2 The five IO models

Preface

This post is a review based on an article by the teacher known as "Ice River".

1. Basic concepts of IO

First of all, what is IO? Any transfer of data between the core of the computer and other devices is IO. Take disk IO as an example:

Input: reading data from disk into memory.
Output: writing data from memory to disk.

An IO operation initiated through the operating system generally consists of two stages:

IO call: the application process issues a call to the operating system kernel.
IO execution: the operating system kernel completes the IO operation.

The IO execution stage is in turn divided into two phases:

Data preparation phase: the kernel waits for the I/O device to have the data ready.
Data copy phase: the data is copied from the kernel buffer to the user process buffer.

2. IO models

There are five IO models:

2.1 Blocking IO (BIO)

The application process issues an IO call, but the kernel data is not ready, so the application process stays blocked and waits until the kernel data is ready.
Drawback: if the kernel data never becomes ready, the user process stays blocked forever, which wastes performance.
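
A minimal sketch of the blocking behaviour described above, assuming a pair of connected local sockets in place of a real device. The function name `blocking_read_demo` and the 0.2 second delay are made up for this illustration.

```python
import socket
import threading
import time


def blocking_read_demo(delay=0.2):
    """Return (data, seconds spent blocked inside recv)."""
    # Two connected sockets; both are in blocking mode by default.
    reader, writer = socket.socketpair()

    def send_later():
        time.sleep(delay)              # the data is "not ready" during this time
        writer.sendall(b"hello")

    threading.Thread(target=send_later, daemon=True).start()

    start = time.monotonic()
    data = reader.recv(1024)           # blocks until the writer has sent something
    waited = time.monotonic() - start

    reader.close()
    writer.close()
    return data, waited


if __name__ == "__main__":
    data, waited = blocking_read_demo()
    print(data, f"blocked for about {waited:.2f}s")
```

The caller can do nothing else between issuing `recv` and getting the data back, which is exactly the drawback listed above.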

2.2 Non-blocking IO (NIO)

To address the drawback of blocking IO, non-blocking IO builds on it: if the kernel data is not ready, the kernel immediately returns an error to the user process so that it does not have to wait, and the process then retries by polling.
Advantage: compared with blocking IO, the user process does not have to wait, and it does not enter a blocked state just because the kernel data is not ready.
Drawback: frequent polling means frequent system calls, which consume a lot of CPU resources.
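
The polling loop can be sketched as follows, again assuming a local socket pair as a stand-in for a real data source. `nonblocking_read_demo` and its parameters are hypothetical names chosen for this example.

```python
import socket
import threading
import time


def nonblocking_read_demo(delay=0.2, poll_interval=0.01):
    """Return (data, number of attempts that found no data ready)."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)          # recv now returns immediately instead of waiting

    def send_later():
        time.sleep(delay)
        writer.sendall(b"hello")

    threading.Thread(target=send_later, daemon=True).start()

    failed_attempts = 0
    while True:
        try:
            data = reader.recv(1024)   # one system call per attempt
            break
        except BlockingIOError:        # the kernel reports "not ready" right away
            failed_attempts += 1
            time.sleep(poll_interval)  # the process is free to do other work here

    reader.close()
    writer.close()
    return data, failed_attempts


if __name__ == "__main__":
    print(nonblocking_read_demo())
```

Every failed attempt is a wasted system call, and shrinking `poll_interval` to react faster only makes that cost higher.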

2.3 IO multiplexing (BIO)

Frequent polling costs a lot of CPU. So instead, let the kernel notify the application once the data is ready, and only then have the application make the system call. That is IO multiplexing.

Concept: file descriptor (File Descriptor, fd)

A file descriptor is a non-negative integer. When a process opens an existing file or creates a new one, the kernel returns a file descriptor. Reads and writes also use the file descriptor to specify which file to operate on.
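
As a small illustration, Python's `os` module exposes file descriptors directly. The helper name `fd_roundtrip` and the file name `demo.txt` are placeholders for this sketch.

```python
import os
import tempfile


def fd_roundtrip(payload=b"io model review"):
    """Write payload through one fd, read it back through another."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")

        # Creating a new file: the kernel hands back a file descriptor (an int).
        wfd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        os.write(wfd, payload)         # output: memory -> disk, addressed by fd
        os.close(wfd)

        # Opening an existing file also yields a file descriptor.
        rfd = os.open(path, os.O_RDONLY)
        data = os.read(rfd, 1024)      # input: disk -> memory, addressed by fd
        os.close(rfd)

    return wfd, rfd, data


if __name__ == "__main__":
    print(fd_roundtrip())
```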

The core idea of the IO multiplexing model: the system provides a class of functions that can monitor multiple fds at the same time. When any one of them reports that kernel data is ready, the application process issues the system call. Note that the application process still has to issue the system call itself.

IO multiplexing comes in three forms: select, poll, and epoll.

2.3.1 select

Through the select function, the application process can monitor multiple fds at the same time. While select is monitoring them, as soon as the data for any one fd is ready, select returns and reports it as readable, and the application process then issues the request to read the kernel data.
Its drawbacks are as follows:

The number of IO connections it can monitor has an upper limit.
When select returns, the ready descriptors have to be found by traversing the whole fd set (that is, by traversing all streams).
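
A rough sketch of this flow using Python's `select.select`, with three local socket pairs standing in for client connections. `select_demo` is a made-up name. Note that Python's wrapper already filters the ready sockets into a list, so the traversal of the fd set mentioned above is hidden inside the call.

```python
import select
import socket


def select_demo():
    """Watch three sockets with select(); only the one with data comes back."""
    pairs = [socket.socketpair() for _ in range(3)]
    readers = [r for r, _ in pairs]

    pairs[1][1].sendall(b"ping")       # make only the second reader ready

    # Blocks until at least one watched socket is readable (1 second timeout).
    readable, _, _ = select.select(readers, [], [], 1.0)

    # The application still has to issue the read call itself.
    results = [(readers.index(sock), sock.recv(1024)) for sock in readable]

    for r, w in pairs:
        r.close()
        w.close()
    return results


if __name__ == "__main__":
    print(select_demo())
```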

2.3.2 poll

To address the drawbacks of select, poll was introduced. Compared with select, poll removes the limit on the number of connections. However, poll still has to traverse the file descriptors to find the ready sockets.
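
The same scenario can be sketched with `select.poll`, assuming a Unix-like system (the `poll` object is not available on Windows). `poll_demo` is a hypothetical helper name.

```python
import select
import socket


def poll_demo():
    """Register three sockets with poll(); report which index became readable."""
    pairs = [socket.socketpair() for _ in range(3)]
    index_by_fd = {r.fileno(): i for i, (r, _) in enumerate(pairs)}

    poller = select.poll()
    for r, _ in pairs:
        poller.register(r, select.POLLIN)   # no fixed-size fd set involved

    pairs[2][1].sendall(b"pong")            # make only the third reader ready

    events = poller.poll(1000)              # timeout is in milliseconds
    ready = [(index_by_fd[fd], bool(mask & select.POLLIN)) for fd, mask in events]

    for r, w in pairs:
        r.close()
        w.close()
    return ready


if __name__ == "__main__":
    print(poll_demo())
```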

2.3.3 epoll

To solve the remaining problems of select and poll, the epoll multiplexing model appeared. It is implemented in an event-driven way.
Its overall flow looks no different from that of select, so a few extra words:

epoll first registers a file descriptor through the epoll_ctl() function.
Once that fd becomes ready, the kernel uses a callback mechanism to activate it quickly.
The process is notified when it calls epoll_wait(). This event-callback mechanism avoids traversing all file descriptors.
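
The three steps above can be sketched with Python's `select.epoll`, which is only available on Linux. The comments map each Python call to the underlying function named in the text; `epoll_demo` itself is an invented name.

```python
import select
import socket


def epoll_demo():
    """Register three sockets with epoll; read from whichever one fires."""
    pairs = [socket.socketpair() for _ in range(3)]
    by_fd = {r.fileno(): (i, r) for i, (r, _) in enumerate(pairs)}

    ep = select.epoll()
    for r, _ in pairs:
        ep.register(r.fileno(), select.EPOLLIN)   # epoll_ctl: register once

    pairs[0][1].sendall(b"event")                 # make only the first reader ready

    results = []
    for fd, mask in ep.poll(1.0):                 # epoll_wait: returns ready fds only
        index, sock = by_fd[fd]
        results.append((index, sock.recv(1024)))  # the read is still our own call

    ep.close()
    for r, w in pairs:
        r.close()
        w.close()
    return results


if __name__ == "__main__":
    print(epoll_demo())
```

Only the ready descriptors come back from `ep.poll`, so the loop length depends on the number of events rather than on the number of registered sockets.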

Compared with non-blocking IO, IO multiplexing does not need frequent polling calls, because it relies on callbacks. But when the process calls epoll_wait(), it can still block.

Important things are worth saying three times. Multiplexed IO is still synchronous blocking! Synchronous blocking! Synchronous blocking!

So here is what we would want from the next design (I find it easier to understand this way):

Multiplexed IO can single out the IO streams that are ready, which avoids traversing all of them.
The result is delivered through a callback, but the process still has to block while waiting for it.
So the design goal is that the user process does not wait at all: it does something else first and is notified once the callback result is available.

That brings us to the signal-driven IO model.

2.4 Signal-driven IO (NIO)

Building on multiplexing, the application process asks the kernel to signal it when data arrives. The application process is then not blocked and can do other things. When the kernel data is ready, the kernel notifies the application process with a SIGIO signal. As soon as the process receives the signal, it issues the call to fetch the kernel data.
The readiness query here is indeed asynchronous, but the data copy is still synchronous and blocking, so signal-driven IO as a whole is not asynchronous.
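
A Linux-oriented sketch of this model, assuming that `O_ASYNC` and `F_SETOWN` are supported for the socket type in use and that the function runs in the main thread (Python only allows installing signal handlers there). `sigio_demo` is a hypothetical name, and the local socket pair stands in for a real data source.

```python
import fcntl
import os
import signal
import socket
import time


def sigio_demo(timeout=2.0):
    """Return the data read after a SIGIO notification, or None on timeout."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    notified = []

    def on_sigio(signum, frame):
        notified.append(signum)        # keep the handler tiny: just record the signal

    previous = signal.signal(signal.SIGIO, on_sigio)
    try:
        # Ask the kernel to send SIGIO to this process when the fd has activity.
        fcntl.fcntl(reader.fileno(), fcntl.F_SETOWN, os.getpid())
        flags = fcntl.fcntl(reader.fileno(), fcntl.F_GETFL)
        fcntl.fcntl(reader.fileno(), fcntl.F_SETFL, flags | os.O_ASYNC)

        writer.sendall(b"signal")      # stands in for data arriving from elsewhere

        deadline = time.monotonic() + timeout
        while not notified and time.monotonic() < deadline:
            time.sleep(0.01)           # the process is free to do other work here

        # Copy phase: the process itself still has to call recv.
        data = reader.recv(1024) if notified else None
    finally:
        reader.close()                 # close the watched fd before restoring the handler
        writer.close()
        if previous is None:
            previous = signal.SIG_DFL
        signal.signal(signal.SIGIO, previous)
    return data


if __name__ == "__main__":
    print(sigio_demo())
```

The wait for readiness is handled by the signal, but the final `recv` is an ordinary synchronous call, which matches the point made above.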

2.5 Asynchronous IO (AIO)

The application only needs to send one request to the kernel. The kernel then completes both the readiness query and the data copy, and the application does not block waiting for the result.
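
The sketch below only imitates the calling pattern, and it is not kernel asynchronous IO: a worker thread plays the role of the kernel and performs both the wait and the copy, so the caller submits one request and later picks up data that is already in its own memory. The name `async_style_demo` is invented for this illustration.

```python
import socket
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def async_style_demo(delay=0.2):
    """Return (data, units of other work done while the read was in flight)."""
    reader, writer = socket.socketpair()

    def send_later():
        time.sleep(delay)
        writer.sendall(b"done")

    threading.Thread(target=send_later, daemon=True).start()

    with ThreadPoolExecutor(max_workers=1) as pool:
        # One request: the worker (standing in for the kernel) does both
        # the waiting and the copying on the caller's behalf.
        future = pool.submit(reader.recv, 1024)

        other_work = 0
        while not future.done():       # the caller never blocks on the socket
            other_work += 1
            time.sleep(0.01)

        data = future.result()         # the data is already in user memory

    reader.close()
    writer.close()
    return data, other_work


if __name__ == "__main__":
    print(async_style_demo())
```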

To explain BIO, NIO, and AIO:

Synchronous blocking IO (blocking IO), abbreviated BIO.
Synchronous non-blocking IO (non-blocking IO), abbreviated NIO.
Asynchronous non-blocking IO (asynchronous non-blocking IO), abbreviated AIO.

3. Summary

3.1 Differences among select, poll, and epoll

Comparison	select	poll	epoll
Underlying data structure	Array	Linked list	Red-black tree + doubly linked list
How ready fds are found	Traverse all fds	Traverse all fds	Event callback
Time complexity of finding ready fds	`O(n)`	`O(n)`	`O(1)`
Maximum connections	1024 (`Linux`)	Unlimited	Unlimited
`fd` copying	Every call to `select` copies the `fd` set from user space to kernel space	Every call to `poll` copies the `fd` set from user space to kernel space	Each `fd` is copied into the kernel once, when it is registered with `epoll_ctl`, so `epoll_wait` does not copy the whole set again

3.2 The five IO models

IO model	Blocking or non-blocking	Synchronous or asynchronous
Blocking IO	Blocking	Synchronous
Non-blocking IO	Non-blocking	Synchronous
IO multiplexing	Blocking	Synchronous
Signal-driven IO	Non-blocking	Synchronous
Asynchronous IO	Non-blocking	Asynchronous

Blocking vs. non-blocking: put simply, this is about whether you get a reply immediately when you ask for something. If you cannot get a return right away and have to wait, that is blocking. The emphasis is on whether the response is immediate.

Synchronous vs. asynchronous: if you always finish one thing before starting the next, whether or not that involves waiting, that is synchronous. Otherwise it is asynchronous. The emphasis is on whether two things can proceed in parallel.

Now look back at the table above.

Non-blocking IO:

Non-blocking: the user process gets a result immediately (either the data it wants or an error message).
Synchronous: the data copy phase always runs after the readiness-query phase.

IO multiplexing:

Blocking: the user process has to wait for the callback result to come back, and it is blocked while it waits.
Synchronous: the data copy phase always runs after the readiness-query phase.

Signal-driven IO:

Non-blocking: the user process gets a return immediately in the readiness-query phase.
Synchronous: the process has to wait for the kernel to send a signal indicating that the fd is ready. Only then does the user process take the fd and issue the request to copy the data.

Asynchronous IO:

Non-blocking: the user process likewise gets a return immediately.
Asynchronous: both waiting for the data and copying it are handed over to the operating system rather than the user process, so the user process does not need to block and wait.

In the end, for these five IO models, the difference between synchronous and asynchronous comes down to this:

Synchronous: waiting for the data and copying it are two separate stages, each initiated by the application process, so two requests are needed.
Asynchronous: both waiting for the data and copying it are handed over to the operating system, so the application process only needs to initiate one request.

Copyright notice
This article was written by [Zong_ 0915]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/188/202207070757327139.html