
[Advanced knowledge] How epoll is implemented in a user-mode protocol stack

2022-06-09 02:02:00 InfoQ

Epoll is Linux's IO multiplexing mechanism, and it is currently an essential component for high-performance network IO on Linux. For the kernel implementation, see fs/eventpoll.c.
Why implement epoll yourself? I am currently building a user-mode protocol stack that runs in single-threaded mode: https://github.com/wangbojing/NtyTcp. As for why a user-mode protocol stack is needed, search for the C10M problem.
Because the protocol stack lives in user space, it has to manage high-performance network IO on its own, so epoll has to be implemented by hand as well. Code: https://github.com/wangbojing/NtyTcp/blob/master/src/nty_epoll_rb.c
Before implementing epoll, you first need to understand how the kernel's epoll works. Kernel epoll can be understood from four aspects.
 1. Epoll's data structures: an rbtree for storage, and a ready queue that holds the IOs that are ready.
 2. Epoll's thread safety, its behavior under SMP, and deadlock prevention.
 3. Epoll's kernel callback.
 4. Epoll's LT (level-triggered) and ET (edge-triggered) modes.
The following four sections implement epoll along these lines:
One. Epoll data structures
Epoll mainly consists of two structures: eventpoll and epitem. An epitem corresponds to the event of one IO. For example, an epoll_ctl EPOLL_CTL_ADD operation needs to create an epitem. An eventpoll corresponds to one epoll instance. For example, epoll_create creates an eventpoll.
The definition of epitem:

[Listing: epitem definition; image not preserved]
The data structure is shown in the figure below:

[Figure: epoll data structure; image not preserved]
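As a rough illustration of the two structures described above, here is a minimal C sketch of one possible layout. The field names rdy, rdnum and sockid follow the text; every other name (rb_left, rd_next, mtx, cdmtx, ep_create, ep_destroy) is an assumption made for this example and is not taken from NtyTcp. The rbtree links are only declared, not balanced, and a C11 atomic_flag stands in for the spinlock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* One monitored IO. It lives in the rbtree for its whole lifetime and
 * is additionally linked into the ready list while it has pending events. */
struct epitem {
    struct epitem *rb_left, *rb_right, *rb_parent; /* rbtree links, keyed by sockid */
    int rb_red;                                    /* rbtree node color */
    struct epitem *rd_next;                        /* next item in the ready list */
    int rdy;                                       /* 1 while queued in the ready list */
    int sockid;                                    /* socket id used as the tree key */
    uint32_t events;                               /* event bits (EPOLLIN-style mask) */
    uint64_t data;                                 /* opaque user data */
};

/* One epoll instance, created by the epoll_create analogue below. */
struct eventpoll {
    struct epitem *rbroot;            /* rbtree of every monitored IO */
    int rbcnt;                        /* number of items in the rbtree */
    struct epitem *rd_head, *rd_tail; /* ready list (FIFO) */
    int rdnum;                        /* number of items in the ready list */
    pthread_mutex_t mtx;              /* protects the rbtree */
    atomic_flag lock;                 /* spinlock stand-in, protects the ready list */
    pthread_cond_t cond;              /* epoll_wait sleeps on this */
    pthread_mutex_t cdmtx;            /* mutex paired with cond */
};

/* epoll_create analogue: allocate and initialize an empty instance. */
struct eventpoll *ep_create(void) {
    struct eventpoll *ep = calloc(1, sizeof(*ep));
    if (ep == NULL)
        return NULL;
    pthread_mutex_init(&ep->mtx, NULL);
    pthread_mutex_init(&ep->cdmtx, NULL);
    pthread_cond_init(&ep->cond, NULL);
    atomic_flag_clear(&ep->lock);
    return ep;
}

/* Release an instance; the caller must have removed all epitems first. */
void ep_destroy(struct eventpoll *ep) {
    pthread_cond_destroy(&ep->cond);
    pthread_mutex_destroy(&ep->cdmtx);
    pthread_mutex_destroy(&ep->mtx);
    free(ep);
}
```

Keeping both sets of links inside epitem means an item can sit in the tree and the ready list at the same time without any extra allocation.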
The list stores the IOs that are ready. For a data structure, two operations matter most: insert and remove, and the same goes for the list. When is data inserted into the list?
When an IO becomes ready in the kernel, the epoll_event_callback callback is executed and adds the epitem to the list. When is data removed from the list? When epoll_wait is woken up and resumes running, it copies the epitems in the list one by one into the events parameter.
The rbtree stores all IOs and makes lookup by io_fd fast. Again consider insert and remove. When is an item added to the rbtree? When the application performs an epoll_ctl EPOLL_CTL_ADD operation, the epitem is added to the rbtree. When is it removed? When the application performs an epoll_ctl EPOLL_CTL_DEL operation, the epitem is removed from the rbtree. How, then, do the list and the rbtree achieve thread safety, work under SMP, and avoid deadlock?
Two. Epoll locking mechanism
Epoll needs lock protection in three places: operations on the list, operations on the rbtree, and waiting in epoll_wait. The list uses the finest-grained lock, a spinlock, so that additions under SMP can update the list quickly.
Adding to the list:

[Listing: adding to the ready list; image not preserved]
Line 346: acquire the spinlock.
Line 347: set the epitem's rdy to 1, meaning the epitem is already in the ready queue. If the same event is triggered again later, only the event needs to be changed.
Line 348: add the epitem to the list.
Line 349: increment the rdnum field of eventpoll by 1.
Line 350: release the spinlock.
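The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the listing from nty_epoll_rb.c: a C11 atomic_flag stands in for a spinlock such as pthread_spinlock_t, the ready list is a hand-rolled FIFO, and the names ep_init, ep_mark_ready, rd_head and rd_tail are hypothetical. Merging a repeated event with a bitwise OR is also a choice made for this example.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct epitem {
    struct epitem *rd_next; /* next item in the ready list */
    int rdy;                /* 1 while the item is in the ready list */
    int sockid;
    uint32_t events;        /* pending event bits */
};

struct eventpoll {
    struct epitem *rd_head, *rd_tail; /* ready list (FIFO) */
    int rdnum;                        /* number of ready items */
    atomic_flag lock;                 /* spinlock stand-in */
};

static void ep_spin_lock(atomic_flag *l) {
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire)) {
    }
}

static void ep_spin_unlock(atomic_flag *l) {
    atomic_flag_clear_explicit(l, memory_order_release);
}

void ep_init(struct eventpoll *ep) {
    ep->rd_head = ep->rd_tail = NULL;
    ep->rdnum = 0;
    atomic_flag_clear(&ep->lock);
}

/* Called by the protocol stack when an IO becomes ready. */
void ep_mark_ready(struct eventpoll *ep, struct epitem *epi, uint32_t events) {
    ep_spin_lock(&ep->lock);
    if (epi->rdy) {
        /* Already queued: only update the recorded event. */
        epi->events |= events;
    } else {
        epi->rdy = 1;
        epi->events = events;
        epi->rd_next = NULL;
        if (ep->rd_tail != NULL)
            ep->rd_tail->rd_next = epi;
        else
            ep->rd_head = epi;
        ep->rd_tail = epi;
        ep->rdnum++;
    }
    ep_spin_unlock(&ep->lock);
}
```

The critical section is a handful of pointer writes, which is why a spinning lock is a reasonable fit here.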
Removing from the list:

[Listing: removing from the ready list; image not preserved]
Line 301: acquire the spinlock.
Line 304: compare rdnum with maxevents to avoid overflowing the events array.
Line 307: loop over the list; the loop condition also checks that the list is not empty.
Line 309: get the first node of the list.
Line 310: remove the first node from the list.
Line 311: set the epitem's rdy field to 0, marking the epitem as no longer in the ready queue.
Line 313: copy the epitem's event to the user-space events array.
Line 316: increment the copy count by 1.
Line 317: decrement rdnum in eventpoll by 1.
To avoid contention between cores on an SMP system, a spinlock is used here; a sleeping lock is not suitable.
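The removal steps above might look like the following sketch. As before, the atomic_flag spinlock, the FIFO layout, and the names ep_collect and struct ep_event are assumptions for illustration and not the author's code.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct epitem {
    struct epitem *rd_next; /* next item in the ready list */
    int rdy;                /* 1 while the item is in the ready list */
    int sockid;
    uint32_t events;
};

struct eventpoll {
    struct epitem *rd_head, *rd_tail;
    int rdnum;
    atomic_flag lock; /* spinlock stand-in */
};

/* Hypothetical stand-in for struct epoll_event. */
struct ep_event {
    int sockid;
    uint32_t events;
};

static void ep_spin_lock(atomic_flag *l) {
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire)) {
    }
}

static void ep_spin_unlock(atomic_flag *l) {
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Move up to maxevents ready items into out[]; returns how many were copied. */
int ep_collect(struct eventpoll *ep, struct ep_event *out, int maxevents) {
    int cnt = 0;
    ep_spin_lock(&ep->lock);
    /* Never copy more than the caller's array can hold. */
    int num = ep->rdnum < maxevents ? ep->rdnum : maxevents;
    while (num > 0 && ep->rd_head != NULL) {
        struct epitem *epi = ep->rd_head; /* first node */
        ep->rd_head = epi->rd_next;       /* unlink it */
        if (ep->rd_head == NULL)
            ep->rd_tail = NULL;
        epi->rd_next = NULL;
        epi->rdy = 0;                     /* no longer in the ready list */
        out[cnt].sockid = epi->sockid;    /* copy the event out */
        out[cnt].events = epi->events;
        cnt++;
        num--;
        ep->rdnum--;
    }
    ep_spin_unlock(&ep->lock);
    return cnt;
}
```

Items that do not fit in the caller's array stay queued and are returned by the next call.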
Adding to the rbtree:

[Listing: adding to the rbtree; image not preserved]
Line 149: acquire the mutex.
Line 153: look up whether an epitem for sockid already exists. If it exists, it cannot be added again; if it does not exist, it can be added.
Line 160: allocate the epitem.
Line 167: assign sockid.
Line 168: store the requested event in the epitem's event field.
Line 170: add the epitem to the rbtree.
Line 173: release the mutex.
Removing from the rbtree:

[Listing: removing from the rbtree; image not preserved]
Line 177: acquire the mutex.
Line 181: remove the node for sockid. If it does not exist, the rbtree removal returns -1.
Line 188: free the epitem.
Line 190: release the mutex.
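The add and remove paths above can be sketched together. To keep the example short, POSIX tsearch, tfind and tdelete from <search.h> are swapped in for a hand-written rbtree (POSIX does not specify the balancing scheme; glibc implements these functions with a red-black tree). The names ep_ctl_add, ep_ctl_del, rbroot and rbcnt are hypothetical.

```c
#include <pthread.h>
#include <search.h>
#include <stdint.h>
#include <stdlib.h>

struct epitem {
    int sockid;      /* tree key */
    uint32_t events; /* requested event bits */
};

struct eventpoll {
    void *rbroot;        /* tree root managed by tsearch/tdelete */
    int rbcnt;           /* number of items in the tree */
    pthread_mutex_t mtx; /* protects the tree */
};

/* Order epitems by sockid. */
static int epi_cmp(const void *a, const void *b) {
    int x = ((const struct epitem *)a)->sockid;
    int y = ((const struct epitem *)b)->sockid;
    return (x > y) - (x < y);
}

/* EPOLL_CTL_ADD analogue: returns 0 on success, -1 if sockid already exists. */
int ep_ctl_add(struct eventpoll *ep, int sockid, uint32_t events) {
    struct epitem key = { sockid, 0 };
    int ret = -1;
    pthread_mutex_lock(&ep->mtx);
    if (tfind(&key, &ep->rbroot, epi_cmp) == NULL) {
        struct epitem *epi = malloc(sizeof(*epi));
        if (epi != NULL) {
            epi->sockid = sockid;
            epi->events = events;
            if (tsearch(epi, &ep->rbroot, epi_cmp) != NULL) {
                ep->rbcnt++;
                ret = 0;
            } else {
                free(epi); /* tree node allocation failed */
            }
        }
    }
    pthread_mutex_unlock(&ep->mtx);
    return ret;
}

/* EPOLL_CTL_DEL analogue: returns 0 on success, -1 if sockid is not present. */
int ep_ctl_del(struct eventpoll *ep, int sockid) {
    struct epitem key = { sockid, 0 };
    int ret = -1;
    pthread_mutex_lock(&ep->mtx);
    void *node = tfind(&key, &ep->rbroot, epi_cmp);
    if (node != NULL) {
        struct epitem *epi = *(struct epitem **)node; /* stored key pointer */
        tdelete(epi, &ep->rbroot, epi_cmp);
        free(epi);
        ep->rbcnt--;
        ret = 0;
    }
    pthread_mutex_unlock(&ep->mtx);
    return ret;
}
```

A mutex rather than a spinlock is used here because the tree operations allocate memory and take longer than the ready-list updates.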
Blocking in epoll_wait
epoll_wait blocks using pthread_cond_wait. For the specific implementation, see: https://github.com/wangbojing/NtyTcp/blob/master/src/nty_epoll_rb.c.
Three. Epoll callback
When is the epoll callback executed? This part has to be tied to the TCP protocol stack. The sequence diagram of the TCP protocol stack is shown below; the points where the protocol stack calls back into epoll are numbered 1, 2, 3 and 4 in the figure. The implementation of the TCP protocol stack itself will be covered in a separate article.

[Figure: TCP protocol stack sequence diagram with callback points 1 to 4; image not preserved]
The four steps are described in detail below:
Number 1: During the TCP three-way handshake, after the peer replies with an ACK, the socket enters the rcvd state. The listening socket's event must be set to EPOLLIN, which signals that accept can now be called to pick up the socket.
Number 2: In the established state, after data is received, the socket's event must be set to EPOLLIN.
Number 3: In the established state, when a FIN is received, the socket enters close_wait. The socket's event must be set to EPOLLIN so that the disconnection can be read.
Number 4: Check the socket's send state. If the peer's cwnd > 0, data can be sent, so the socket's event must be set to EPOLLOUT.
Adding the epoll callback at these points lets epoll receive IO events normally.
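The four call sites above can be summarized as a small mapping from protocol-stack events to the socket and event mask passed to the epoll callback. This is only a sketch: the enum, struct notify and stack_event_to_notify are hypothetical names, and EP_IN and EP_OUT are local constants that reuse the numeric values of Linux's EPOLLIN and EPOLLOUT.

```c
#include <stdint.h>

#define EP_IN  0x001u /* same value as Linux EPOLLIN */
#define EP_OUT 0x004u /* same value as Linux EPOLLOUT */

/* The four points at which the stack calls back into epoll. */
enum stack_event {
    EV_HANDSHAKE_DONE = 1, /* number 1: peer's ACK completed the handshake */
    EV_DATA_RECEIVED  = 2, /* number 2: data arrived in established state */
    EV_FIN_RECEIVED   = 3, /* number 3: FIN arrived, socket enters close_wait */
    EV_SEND_READY     = 4  /* number 4: send state allows more data */
};

/* Which socket to notify and with which event bits. */
struct notify {
    int sockid;
    uint32_t events;
};

struct notify stack_event_to_notify(enum stack_event ev,
                                    int listen_sockid, int conn_sockid) {
    struct notify n = { conn_sockid, 0 };
    switch (ev) {
    case EV_HANDSHAKE_DONE:
        /* The listening socket becomes readable so accept can run. */
        n.sockid = listen_sockid;
        n.events = EP_IN;
        break;
    case EV_DATA_RECEIVED:
    case EV_FIN_RECEIVED:
        /* Both data and the disconnection are delivered as readable. */
        n.events = EP_IN;
        break;
    case EV_SEND_READY:
        n.events = EP_OUT;
        break;
    }
    return n;
}
```

Only the first case targets the listening socket; the other three notify the connection's own socket.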
Four. LT and ET
LT (level-triggered) and ET (edge-triggered) are concepts borrowed from electronic signals. If you are not familiar with them, see man epoll. As shown in the figure below:

[Figure: level-triggered vs. edge-triggered; image not preserved]
For example, event = EPOLLIN | EPOLLLT sets the event to EPOLLIN with level triggering: as long as the event is EPOLLIN, the epoll callback can be invoked repeatedly. (EPOLLLT is this article's notation; Linux's <sys/epoll.h> defines only EPOLLET, and level-triggered is the default when that flag is absent.) With event = EPOLLIN | EPOLLET, the callback is triggered when the event changes, for example from EPOLLOUT to EPOLLIN. The change happens only once, so the epoll callback is called only once. The decision is made when the epoll callback is about to be executed: for EPOLLET (edge-triggered), the event is compared with the previous one and the callback is invoked only if it has changed; for EPOLLLT (level-triggered), it is enough to check whether the event is EPOLLIN, and if so the callback can be invoked.
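The decision described above can be written as one small predicate. This is a sketch under assumptions: ep_should_callback and the EP_* constants are hypothetical names (the constants reuse the numeric values of Linux's EPOLLIN, EPOLLOUT and EPOLLET), and level-triggered is modeled as the absence of the ET flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define EP_IN  0x001u      /* same value as Linux EPOLLIN */
#define EP_OUT 0x004u      /* same value as Linux EPOLLOUT */
#define EP_ET  (1u << 31)  /* same value as Linux EPOLLET */

/* interest:   event bits the application registered, optionally with EP_ET
 * prev_state: readiness bits seen at the previous check
 * cur_state:  readiness bits now
 * Returns true if the epoll callback should be invoked. */
bool ep_should_callback(uint32_t interest, uint32_t prev_state,
                        uint32_t cur_state) {
    uint32_t mask = interest & ~EP_ET;

    if (interest & EP_ET) {
        /* Edge-triggered: fire only for bits that just turned on. */
        return ((cur_state & ~prev_state) & mask) != 0;
    }
    /* Level-triggered: fire for as long as the condition holds. */
    return (cur_state & mask) != 0;
}
```

For instance, with interest EP_IN | EP_ET, a change from EP_OUT to EP_IN fires once, while a state that stays at EP_IN does not fire again; without EP_ET the same unchanged state keeps firing.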






Original article: [Advanced Architecture] How epoll is implemented in a user-mode protocol stack

Copyright notice
This article was created by [InfoQ]. Please include the original link when reposting.
https://yzsam.com/2022/159/202206081420176382.html