SysOM case analysis: where did the missing memory go? | OpenAnolis Technology

2022-07-07 16:04:00 InfoQ

By the System Operations and Maintenance SIG

In the article "AK47 Invincible, All Memory Leaks Wiped Out", we shared methods and tools for troubleshooting slab memory leaks. This time we share a "memory leak" case that is better hidden and harder to track down.

1. Symptoms

The customer received a system alarm: on some nodes of a K8S cluster, used memory kept rising. top showed that the processes were not using much memory, so free memory was running low but no consumer of it could be found. The memory had mysteriously disappeared, and we needed to find out where it went.

Running top and sorting the output by memory shows that the largest processes use about 800M each, which does not add up to the 9G reported as used.

2. Problem analysis

2.1 Where is the memory?

Before analyzing the specific problem, let's first classify system memory so that abnormal usage is easier to spot. By the nature of its use, memory can be roughly divided into application memory and kernel memory. These two kinds of usage plus free memory should come close to the total memory, and this split quickly narrows down where the problem lies.
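The accounting idea above can be sketched roughly in Python. This is only an illustration, not how SysOM computes its figures: the choice of /proc/meminfo counters for "application" and "kernel" memory is an assumption, the helper names are hypothetical, and the sample values are made up. Whatever the chosen counters do not explain is a hint that memory may be held by direct page allocations.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def unaccounted_kb(fields):
    """Memory (kB) that none of the chosen counters explain."""
    # Assumption: a rough split into application and kernel counters.
    app = sum(fields.get(k, 0) for k in ("AnonPages", "Cached", "Buffers"))
    kernel = sum(fields.get(k, 0)
                 for k in ("Slab", "KernelStack", "PageTables", "VmallocUsed"))
    return fields["MemTotal"] - fields["MemFree"] - app - kernel


# Made-up sample values, not taken from the case in this article.
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:         1000000 kB
Buffers:          100000 kB
Cached:          3000000 kB
AnonPages:       2000000 kB
Slab:             500000 kB
KernelStack:       20000 kB
PageTables:        50000 kB
VmallocUsed:       30000 kB
"""

print(unaccounted_kb(parse_meminfo(SAMPLE)), "kB unaccounted")
```

On a real node, the text would come from reading /proc/meminfo instead of SAMPLE. A large unexplained remainder does not prove a leak by itself, but it tells you to look beyond processes and slab.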

Here, allocpage refers to memory requested directly from the buddy system through APIs such as __get_free_pages/alloc_pages (it does not include slab and vmalloc).
2.1.1  Memory analysis
Computing application memory and kernel memory separately according to the memory map tells you which part is abnormal, but calculating these metrics is tedious, and many of the values overlap. To address this pain point, the memory dashboard of the SysOM operations and maintenance platform visualizes memory usage and directly reports whether a memory leak exists. In this case, SysOM directly showed a leak in allocpage, with usage close to 6G.

2.1.2 allocpage memory
Since it is allocpage memory that takes up too much, can we check its usage directly from file nodes in sysfs or procfs? Unfortunately not. This memory is obtained when the kernel or a driver directly calls functions such as __get_free_page/alloc_pages to request one or more contiguous pages from the buddy system, and there is no system-level interface for querying the details of its usage. When this kind of memory leaks, memory appears to "vanish into thin air": the leak is hard to notice and its cause is hard to investigate. SysOM covers both the statistics and the root-cause diagnosis for this kind of memory.

So we need to go one step further and use SysAK, the diagnostic tool of SysOM, to dynamically capture how this kind of memory is being used.

2.2 Troubleshooting allocpage memory

2.2.1  Dynamic diagnosis
For kernel memory leaks, we can use the SysAK tool to trace allocations dynamically. Start the command and wait 10 minutes.

sysak memleak -t page -i 600

The diagnosis showed that within 10 minutes, memory allocated by the receive_mergeable function was left unreleased 4919 times, about 300M in total. Based on this, we need to go through the code to confirm whether the memory allocation and release logic around receive_mergeable is correct.
2.2.2 Summary of allocation and release
1) page_to_skb allocates an skb with a 128-byte linear data area each time.

2) The data area is backed by a call to alloc_pages_node, which requests 32k of memory (order=3) from the buddy system in one go.

3) Each skb takes a reference on the 32k head page, so the 32k of memory is released back to the buddy system only when all of those skbs have been freed.

4) receive_mergeable is responsible for allocating the memory but not for releasing it. Only when the application reads the data out of the socket Recv-Q is the reference count of the head page decremented by one. When page refs reaches 0, the memory is released back to the buddy system.
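The reference-counting behavior in the four points above can be modeled with a small Python sketch. This is a toy model, not kernel code: the HeadPage class and pinned_bytes helper are hypothetical names, and sharing one block among four skbs is an arbitrary choice for illustration. It shows why a single unread skb is enough to keep the whole 32k block away from the buddy system.

```python
PAGE_SIZE = 4096
ORDER = 3
BLOCK_BYTES = PAGE_SIZE << ORDER  # one order-3 allocation: 32 KB


class HeadPage:
    """Toy model of a reference-counted order-3 block."""

    def __init__(self):
        self.refs = 0
        self.freed = False

    def get(self):
        # An skb starts pointing into this block.
        self.refs += 1

    def put(self):
        # The application consumed an skb that pointed into this block.
        self.refs -= 1
        if self.refs == 0:
            self.freed = True  # only now does it return to the buddy system


def pinned_bytes(pages):
    """Bytes still held by blocks that have not been freed."""
    return sum(BLOCK_BYTES for p in pages if not p.freed)


# Four skbs share one block and wait in the receive queue.
page = HeadPage()
recv_queue = []
for _ in range(4):
    page.get()
    recv_queue.append(page)

# The application reads three of the four skbs.
for _ in range(3):
    recv_queue.pop(0).put()
print(pinned_bytes([page]))  # the last unread skb still pins the block

# Reading the last skb finally releases the block.
recv_queue.pop(0).put()
print(pinned_bytes([page]))
```

The model makes the failure mode visible: the amount of pinned memory depends on how many blocks still have at least one unread skb, not on how many bytes of payload are actually waiting.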

When the application consumes data slowly, the memory requested by receive_mergeable may not be released in time, and in the worst case a single skb pins 32k of memory. Use sysak skcheck to check for residual data in the socket receive and send queues.

The output shows that only the nginx process has residual data in its receive queue: the Recv-Q of socket fd=11 holds close to 3M of data that has not been received. After directly running kill 146935, system memory returned to normal, so the root cause of the problem is that nginx did not receive its data in time.
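Residual receive-queue data of this kind can also be spotted with the standard ss utility. The sketch below is not sysak skcheck and does not reproduce its output; it only assumes the usual `ss -tnp` column layout (a header row containing Recv-Q), and the function name, addresses, and PIDs in the sample are made up for illustration.

```python
def sockets_with_backlog(ss_output, min_bytes=1):
    """Return (recv_q, line) pairs for sockets whose Recv-Q >= min_bytes."""
    lines = [l for l in ss_output.splitlines() if l.strip()]
    if not lines:
        return []
    # Locate the Recv-Q column from the header instead of hard-coding it.
    col = lines[0].split().index("Recv-Q")
    found = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) > col and fields[col].isdigit():
            recv_q = int(fields[col])
            if recv_q >= min_bytes:
                found.append((recv_q, line.strip()))
    return sorted(found, reverse=True)  # largest backlog first


# Made-up sample in the shape of `ss -tnp` output.
SAMPLE = """\
State  Recv-Q   Send-Q  Local Address:Port  Peer Address:Port   Process
ESTAB  0        0       192.0.2.10:22       192.0.2.20:50000    users:(("sshd",pid=1000,fd=3))
ESTAB  3000000  0       192.0.2.10:80       192.0.2.30:40000    users:(("nginx",pid=2000,fd=11))
"""

for recv_q, line in sockets_with_backlog(SAMPLE):
    print(recv_q, line)
```

For established TCP sockets, Recv-Q is the number of bytes the application has not yet read, so a value that stays large across repeated runs points at a slow or stuck consumer.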

3. Conclusion

After communicating with the business team, we confirmed that a business configuration caused one nginx thread to stop processing data. As a result, the memory allocated by the network card driver was not released in time, and because allocpage memory is not accounted for anywhere, the memory seemed to vanish into thin air.
Conclusion verification
To check whether data really remained in the receive queue, we used the files command of the crash tool to find the sock corresponding to the fd:

socket = file->private_data
sock = socket->sk

Repeated observation showed that the skbs on sk_receive_queue did not change for a long time. This proves that nginx failed to process the skbs on the receive queue in time, so the memory allocated in the network card driver was not released.

4. Open questions about the memory leak

During troubleshooting we also ran into a confusing point: tcp mem as reported by sockstat and skbuff_head_cache as reported by slabtop both looked normal, which further hid the memory occupied by the network stack.

tcp mem = 32204*4K=125M

The number of skbs was between 15,000 and 30,000.

According to the earlier analysis, one skb pins 32k of memory in the worst case, so 20,000 skbs amount to about 600M at most. How could several G be in use? Was the analysis wrong? In fact, an skb may also have several frag pages in its nonlinear area, and each frag page may itself be a compound page.

Reading the skb memory with crash revealed that some skbs had 17 frag pages while carrying only 10 bytes of data.

The order of the frag page is 3, which means that one frag page occupies 32k of memory.

In the extreme case, one skb may occupy (1+17) * 8 = 144 pages. The slabinfo output shows 15033 active skbuff_head_cache objects, so the theoretical maximum total memory = 144 * 15033 * 4K = 8.2G. Consuming 6G of memory in the scenario we encountered is therefore entirely possible.
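The worst-case estimate above can be reproduced with a few lines of Python. The function names are hypothetical; the inputs (17 frag pages, order 3, 15033 active objects, 4K pages, and the earlier 20,000 skbs at 32k each) are the figures quoted in this article.

```python
PAGE_KB = 4  # 4K pages


def worst_case_pages(nr_frags, order):
    """Pages pinned by one skb: one head block plus nr_frags frag blocks."""
    return (1 + nr_frags) * (1 << order)


def total_gib(pages_per_skb, skb_count):
    """Total pinned memory in GiB for skb_count skbs."""
    return pages_per_skb * skb_count * PAGE_KB / (1024 * 1024)


pages = worst_case_pages(17, 3)
print(pages, "pages per skb")
print(round(total_gib(pages, 15033), 2), "GiB upper bound")

# Earlier estimate that counted only the 32k head block of 20,000 skbs.
print(20000 * 32 / 1024, "MiB if only the linear area is counted")
```

Comparing the two results shows why the first estimate was far too low: the frag pages, not the head block, dominate the memory pinned by each skb.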

——  End  ——

Join the OpenAnolis community

To join the WeChat group, add the community assistant "Dragon Lizard Community Little Dragon" (WeChat: openanolis_assis) with the note 【japalura】. To join the DingTalk group, scan the DingTalk group QR code. Developers and users are welcome to join the OpenAnolis community, exchange ideas, promote the development of the community together, and build an active and healthy open source operating system ecosystem!

About the OpenAnolis community

The OpenAnolis community is a non-profit open source community formed by enterprises, institutions of higher learning, scientific research institutions, non-profit organizations, and individuals on the basis of voluntariness, equality, open source, and collaboration. It was founded in September 2020 and aims to build an open source, neutral, and open Linux upstream distribution community and innovation platform.

The short-term goal of the OpenAnolis community is to develop the Anolis operating system (Anolis OS) as a response to the end of CentOS service and to build a community distribution compatible with mainstream international Linux vendors. The medium- and long-term goal is to explore and build a future-oriented operating system, establish a unified open source operating system ecosystem, incubate innovative open source projects, and help the open source ecosystem prosper.

Anolis OS 8.6 has now been released with more self-developed Anolis features. It supports the X86_64, RISC-V, Arm64, and LoongArch architectures, is adapted to chips such as Intel, Zhaoxin, Kunpeng, and Loongson, and provides full-stack support for the Chinese national cryptographic algorithms.

Download: https://openanolis.cn/download

Join us and work together to build an open source operating system for the future! https://openanolis.cn