IO model review

2022-07-07 10:26:00 Zong_ 0915

Preface

This post reviews an article by teacher "Ice River".

1. Basic concepts of IO

First, what is IO? Any process that moves data between the core of the computer (CPU and memory) and other devices is IO. Take disk IO as an example:

  • Input: read data from disk into memory.
  • Output: write data from memory to disk.

An IO operation carried out through the operating system generally consists of two stages:

  1. IO call: the application process makes a call to the operating system kernel.
  2. IO execution: the operating system kernel completes the IO operation.

The IO execution stage is itself divided into two phases:

  1. Data preparation phase: the kernel waits for the I/O device to have the data ready.
  2. Data copy phase: the data is copied from the kernel buffer to the user process buffer.


2. IO models

There are five types of IO model:

2.1 Blocking IO (BIO)

The application process initiates an IO call, but the kernel data is not ready yet. The application process therefore blocks and keeps waiting until the kernel data is ready.

Drawback: if the kernel data never becomes ready, the user process stays blocked indefinitely and can do no other work in the meantime.
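
The blocking behaviour described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original article: a `socket.socketpair()` stands in for a real connection, a helper thread plays the slow data source, and the function name `blocking_read` is made up for the example.

```python
import socket
import threading
import time


def blocking_read(delay: float = 0.2):
    """Return (data, seconds the caller spent blocked inside recv)."""
    # A connected socket pair stands in for a real network connection.
    reader, writer = socket.socketpair()

    def produce():
        time.sleep(delay)            # the data is "not ready" for this long
        writer.sendall(b"hello")

    producer = threading.Thread(target=produce)
    producer.start()

    start = time.monotonic()
    data = reader.recv(1024)         # blocks until the kernel has data to hand over
    waited = time.monotonic() - start

    producer.join()
    reader.close()
    writer.close()
    return data, waited


data, waited = blocking_read()
print(data, f"blocked for about {waited:.2f}s")
```

While `recv` is waiting, the calling thread cannot do anything else, which is exactly the drawback noted above.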


2.2 Non-blocking IO (NIO)

To address the drawback of blocking IO, non-blocking IO changes one thing: if the kernel data is not ready, the kernel immediately returns an error to the user process so that it does not have to wait. The process then keeps asking by polling.

Advantage: compared with blocking IO, the user process does not have to wait, and it does not enter the blocked state just because the kernel data is not ready.
Drawback: frequent polling means frequent system calls, which consumes a lot of CPU.
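
A rough sketch of the polling loop, again in Python and again with assumed pieces that are not from the original article (a `socket.socketpair()` as the connection, a helper thread as the data source, and the made-up name `poll_until_ready`). In Python the "error returned by the kernel" surfaces as `BlockingIOError`.

```python
import socket
import threading
import time


def poll_until_ready(delay: float = 0.2, interval: float = 0.01):
    """Return (data, number of recv calls that found nothing ready)."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)        # recv now returns at once instead of waiting

    def produce():
        time.sleep(delay)
        writer.sendall(b"hello")

    producer = threading.Thread(target=produce)
    producer.start()

    failed_attempts = 0
    while True:
        try:
            data = reader.recv(1024)
            break
        except BlockingIOError:      # EAGAIN/EWOULDBLOCK: kernel data not ready yet
            failed_attempts += 1
            time.sleep(interval)     # the process could do other work here

    producer.join()
    reader.close()
    writer.close()
    return data, failed_attempts


data, attempts = poll_until_ready()
print(data, "after", attempts, "unsuccessful polls")
```

Every unsuccessful attempt is still a system call, so shrinking `interval` makes the CPU cost mentioned above easy to observe.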


2.3 IO multiplexing (BIO)

Frequent polling costs a lot of CPU. The alternative is to have the kernel notify the application once the data is ready, and only then have the application make the system call. That is IO multiplexing.

Concept: file descriptor (fd)

A file descriptor is a non-negative integer. When a process opens an existing file or creates a new one, the kernel returns a file descriptor. Reads and writes also use the file descriptor to identify the file being read or written.
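
To make the idea of a file descriptor concrete, here is a small Python sketch. It uses a pipe instead of a file on disk so that it leaves nothing behind; the helper name `pipe_roundtrip` is invented for this example.

```python
import os


def pipe_roundtrip(payload: bytes):
    """Send payload through a pipe; return (read_fd, write_fd, data_read)."""
    read_fd, write_fd = os.pipe()                # the kernel hands back two descriptors
    try:
        os.write(write_fd, payload)              # the write names its target by fd
        data = os.read(read_fd, len(payload))    # so does the read
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return read_fd, write_fd, data


r, w, data = pipe_roundtrip(b"abc")
print("descriptors:", r, w, "data:", data)
```

The two descriptors are plain small integers; everything the process does with the pipe goes through those numbers.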

The core idea of the IO multiplexing model: the system provides a family of functions that can monitor multiple fds at the same time. As soon as any one of them reports that its kernel data is ready, the application process issues the system call to read it. Note that the application process still has to issue that system call itself.

There are three IO multiplexing mechanisms: select, poll, and epoll.

2.3.1 select

Through the select function, the application process can monitor multiple fds at the same time. While select is monitoring them, it returns as soon as the data for any one of them is ready, reporting it as readable. The application process then issues a request to read the kernel data.

The drawbacks are as follows:

  1. The number of IO connections it can monitor has an upper limit.
  2. When select returns, the ready fd has to be found by traversing the whole fd set (that is, all streams are traversed).
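
A minimal Python sketch of the select flow, under assumptions that are not part of the original article: several `socket.socketpair()` connections are watched, only one of them receives data, and the function name `ready_with_select` is made up. Python's `select.select` already returns the ready objects as a list, so the explicit scan below is only there to mirror the traversal described in point 2.

```python
import select
import socket


def ready_with_select(num_pairs: int = 3, active: int = 1):
    """Watch several sockets with select(); return the indices found readable."""
    pairs = [socket.socketpair() for _ in range(num_pairs)]
    readers = [r for r, _ in pairs]

    pairs[active][1].sendall(b"ping")            # only one connection gets data

    # Blocks until at least one watched socket is readable (or 1 second passes).
    readable, _, _ = select.select(readers, [], [], 1.0)

    ready_indices = []
    for i, sock in enumerate(readers):           # scan the whole watched set
        if sock in readable:
            sock.recv(1024)                      # the process still does the read itself
            ready_indices.append(i)

    for r, w in pairs:
        r.close()
        w.close()
    return ready_indices


print(ready_with_select())
```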

2.3.2 poll

To address the drawbacks of select, poll was introduced. Compared with select, poll removes the limit on the number of connections. However, poll still has to traverse the file descriptors to find the ready sockets.
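
The same scenario with poll, sketched in Python. It assumes a platform where `select.poll` is available (Linux and most other Unix systems, not Windows); the socket pairs and the name `ready_with_poll` are illustrative only.

```python
import select
import socket


def ready_with_poll(num_pairs: int = 3, active: int = 2):
    """Watch several sockets with poll(); return the indices found readable."""
    pairs = [socket.socketpair() for _ in range(num_pairs)]
    index_by_fd = {r.fileno(): i for i, (r, _) in enumerate(pairs)}

    poller = select.poll()
    for r, _ in pairs:
        poller.register(r, select.POLLIN)        # no fixed-size fd set to outgrow

    pairs[active][1].sendall(b"ping")

    events = poller.poll(1000)                   # timeout in milliseconds
    ready = sorted(index_by_fd[fd] for fd, mask in events if mask & select.POLLIN)

    for r, w in pairs:
        r.close()
        w.close()
    return ready


print(ready_with_poll())
```

Python hands back only the ready `(fd, event)` pairs, but the underlying poll call is still given the full list of registered descriptors on every invocation.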

2.3.3 epoll

To solve the problems of both select and poll, the epoll form of IO multiplexing was introduced. It is implemented in an event-driven way.

The overall flow looks much the same as with select, so a few points are worth adding:

  1. epoll first registers a file descriptor through the epoll_ctl() function.
  2. Once an fd becomes ready, the kernel uses a callback mechanism to activate that fd quickly.
  3. The process is notified when it calls epoll_wait(). Relying on event callbacks in this way avoids traversing all file descriptors.

Compared with non-blocking IO, IO multiplexing does not need frequent polling calls and relies on callbacks instead. However, the process can still block when it calls epoll_wait().
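
The register-then-wait pattern can be sketched with Python's `select.epoll`, which is available on Linux only. As before, the socket pairs and the function name `ready_with_epoll` are assumptions made for the example. In this wrapper, `register` corresponds to epoll_ctl() and `poll` corresponds to epoll_wait().

```python
import select
import socket


def ready_with_epoll(num_pairs: int = 3, active: int = 0):
    """Register several sockets with epoll; return the indices reported ready."""
    pairs = [socket.socketpair() for _ in range(num_pairs)]
    index_by_fd = {r.fileno(): i for i, (r, _) in enumerate(pairs)}

    ep = select.epoll()
    try:
        for r, _ in pairs:
            ep.register(r.fileno(), select.EPOLLIN)   # one-time registration per fd

        pairs[active][1].sendall(b"ping")

        # Returns only the ready fds; the caller still blocks here (timeout in seconds).
        events = ep.poll(1.0)
        ready = sorted(index_by_fd[fd] for fd, mask in events if mask & select.EPOLLIN)
    finally:
        ep.close()
        for r, w in pairs:
            r.close()
            w.close()
    return ready


print(ready_with_epoll())
```

The descriptors are registered once and then only waited on, yet the call to `ep.poll` is still a blocking wait from the point of view of the calling thread.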

This is important enough to say three times: multiplexed IO is still synchronous and blocking! Synchronous and blocking! Synchronous and blocking!

This leads to the following design goal (I find it easier to understand this way):

  1. Multiplexed IO can identify the specific IO streams that are ready, which avoids traversing all of them.
  2. But although the result is obtained through a callback, the process still has to block while waiting for it.
  3. So the design goal is that the user process should not have to wait. It does something else first, and is notified once the callback result is available.

That is where the signal-driven IO model comes in.


2.4 Signal-driven IO (NIO)

Building on multiplexing, the application process first registers with the kernel to be notified by signal. It does not have to block and can do other things in the meantime. When the kernel data is ready, the kernel notifies the application process with a SIGIO signal. As soon as the process receives the signal, it makes the call to fetch the kernel data.

The data status inquiry here is indeed asynchronous, but the data copy phase is still synchronous and blocking, so signal-driven IO as a whole is not asynchronous.
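
A possible Python sketch of signal-driven IO follows. It rests on several assumptions that are not in the original article: it targets Linux, it must run in the main thread (Python only allows signal handlers to be installed there), it uses a `socket.socketpair()` as the connection with a helper thread as the data source, and the name `read_on_sigio` is invented. The descriptor is switched to signal-driven mode with `F_SETOWN` and the `O_ASYNC` flag.

```python
import fcntl
import os
import signal
import socket
import threading
import time


def read_on_sigio(delay: float = 0.1, timeout: float = 2.0):
    """Return (data, iterations of other work done before SIGIO arrived)."""
    reader, writer = socket.socketpair()
    received = []

    def on_sigio(signum, frame):
        # The kernel only says "ready"; the process copies the data out itself.
        try:
            received.append(reader.recv(1024))
        except BlockingIOError:
            pass                                  # notification without data

    previous = signal.signal(signal.SIGIO, on_sigio)
    fcntl.fcntl(reader.fileno(), fcntl.F_SETOWN, os.getpid())   # who gets SIGIO
    flags = fcntl.fcntl(reader.fileno(), fcntl.F_GETFL)
    fcntl.fcntl(reader.fileno(), fcntl.F_SETFL, flags | os.O_ASYNC | os.O_NONBLOCK)

    def produce():
        time.sleep(delay)
        writer.sendall(b"hello")                  # readiness makes the kernel raise SIGIO

    producer = threading.Thread(target=produce)
    producer.start()

    other_work = 0
    deadline = time.monotonic() + timeout
    while not received and time.monotonic() < deadline:
        other_work += 1                           # the process is not blocked meanwhile
        time.sleep(0.01)

    producer.join()
    reader.close()                                # close before restoring the old handler
    writer.close()
    signal.signal(signal.SIGIO, signal.SIG_DFL if previous is None else previous)
    return (received[0] if received else None), other_work
```

Calling `read_on_sigio()` returns the received bytes together with a count of loop iterations, which shows the process doing other work until the signal arrives. The `recv` inside the handler is the synchronous copy step that keeps this model from being fully asynchronous.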


2.5 Asynchronous IO (AIO)

The application only needs to send one request to the kernel. Both the data status inquiry and the data copy are then completed on its behalf, and it never blocks waiting for the result.
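
The Python standard library has no binding for POSIX AIO calls such as aio_read, so the sketch below only imitates the shape of the model and is not kernel-level asynchronous IO: a worker thread stands in for the kernel, performing both the wait and the copy, and the caller is told through a callback once the data is already in hand. The helper names `submit_read` and `demo_async_read` are hypothetical.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def submit_read(executor, fd, nbytes, on_done):
    """Hypothetical helper: start a read, return at once; on_done gets the bytes."""
    future = executor.submit(os.read, fd, nbytes)   # waiting and copying happen elsewhere
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


def demo_async_read():
    read_fd, write_fd = os.pipe()
    done = threading.Event()
    result = []

    def on_done(data):
        result.append(data)        # the data has already been copied when we are told
        done.set()

    with ThreadPoolExecutor(max_workers=1) as executor:
        submit_read(executor, read_fd, 1024, on_done)   # a single request, no waiting
        os.write(write_fd, b"hello")                    # the caller carries on meanwhile
        done.wait(timeout=2.0)

    os.close(read_fd)
    os.close(write_fd)
    return result


print(demo_async_read())
```

The point of the sketch is the calling pattern: one request is issued, and neither the readiness check nor the copy is performed by the requesting code.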


The abbreviations BIO, NIO, and AIO are explained as follows:

  • Synchronous blocking (blocking IO), abbreviated BIO.
  • Synchronous non-blocking (non-blocking IO), abbreviated NIO.
  • Asynchronous non-blocking (asynchronous non-blocking IO), abbreviated AIO.

3. Summary

3.1 Differences between select, poll, and epoll

| Comparison | select | poll | epoll |
|---|---|---|---|
| Underlying data structure | Array | Linked list | Red-black tree + doubly linked list |
| How ready fds are found | Traverse all fds | Traverse all fds | Event callback |
| Time complexity of finding ready fds | O(n) | O(n) | O(1) |
| Maximum number of connections | 1024 (Linux) | Unlimited | Unlimited |
| fd copying | On every call to select, the fds are copied from user space to kernel space | On every call to poll, the fds are copied from user space to kernel space | Each fd is handed to the kernel once, when it is registered through epoll_ctl, so the fds do not need to be copied again on every call |

3.2 The five IO models

| IO model | Blocking state | Synchronization |
|---|---|---|
| Blocking IO | Blocking | Synchronous |
| Non-blocking IO | Non-blocking | Synchronous |
| IO multiplexing | Blocking | Synchronous |
| Signal-driven IO | Non-blocking | Synchronous |
| Asynchronous IO | Non-blocking | Asynchronous |

Blocking vs. non-blocking: put simply, the question is whether a call gets a reply immediately. If it cannot return immediately and has to wait, it is blocking. The emphasis is on whether the response is immediate.

Synchronous vs. asynchronous: if you always finish one thing before starting the next, regardless of whether you have to wait, that is synchronous. Otherwise it is asynchronous. The emphasis is on whether two things can proceed in parallel.

Now look back at the table above:

Non-blocking IO:

  • Non-blocking: the user process gets a result immediately (it may be the data the user wants, or it may be an error message).
  • Synchronous: the data copy phase is always executed after the data inquiry phase.

IO multiplexing:

  • Blocking: the user process has to wait for the callback result to come back, and it is blocked while it waits.
  • Synchronous: the data copy phase is always executed after the data inquiry phase.

Signal-driven IO:

  • Non-blocking: the user process gets a return immediately in the data inquiry phase.
  • Synchronous: the process has to wait for the kernel to send a signal indicating that the fd is ready. Only after receiving it does the user process issue the request that copies the data.

Asynchronous IO:

  • Non-blocking: the user process also gets a return immediately.
  • Asynchronous: the whole of the data waiting and the data copy is handed over to the operating system rather than the user process, so the user process does not need to block and wait.

In the end, for these five IO models, the difference between synchronous and asynchronous comes down to this:

  • Synchronous: data waiting and data copying are two separate stages, each initiated by the application process, so it has to make two requests.
  • Asynchronous: data waiting and data copying are both handed over to the operating system, so the application process only has to make one request.
Copyright notice
This article was written by [Zong_ 0915]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/188/202207070757327139.html