当前位置:网站首页>JVM garbage collection mechanism and common garbage collectors
JVM garbage collection mechanism and common garbage collectors
2022-06-11 00:16:00 【Xujunsheng】
JVM Garbage collection mechanism and common garbage collectors
The opening
In the previous post , We introduced JVM And garbage collection algorithm . Today we are going to talk about JVM The most important heap memory is how to use the garbage collector for garbage collection , And how to use commands to configure and use these garbage collectors .
Before that , Let's do a brief review of the garbage collection mechanism .
Garbage collection
According to the semantic meaning , Garbage collection , First of all, we need to find the garbage , Then recycle it . however GC The process is just the opposite , It is to find the active object first , Then identify other inactive objects as garbage , Then delete .
So garbage collection Only with active objects , It has nothing to do with the size of the heap .
The implementation details of various garbage collectors are different , But overall , Garbage collectors focus on two things :
- Find all living objects
- Discard the rest , Dead object , Objects that are no longer in use
Mark reachable objects (Marking Reachable Objects)
modern JVM All of the GC Algorithm , The first step is to find all the surviving objects . As shown in the figure below :

- First , Some specific objects are specified as
Garbage Collection Roots(GC Root element ). - secondly ,GC Traverse (traverses) Overall object relationship diagram in memory (object graph), from GC The root element starts scanning , Direct reference to , And other objects ( Through the property field of the object ). all GC All accessed objects are marked as living objects .
- After the marking phase is complete , All living objects are marked ( Living objects are represented in blue in the above figure ). Other objects ( The gray data structure in the figure above ) It's from GC The root element is not reachable , That is, programs can no longer use these unreachable objects (unreachable object). Such objects are considered garbage ,GC They will be cleared in the next phase .
There is one thing to pay attention to during the marking phase : In the marking phase , Need to pause all application threads , To traverse the reference relationships of all objects . Because you can't keep track of changing reference diagrams without pausing . This scenario is called Stop The World( abbreviation STW, The whole line stopped ), The point at which a thread can be safely suspended is called safer (safe point), then ,JVM You can concentrate on cleaning up . The safety point may be triggered by many factors , At present ,GC Is the most common reason for triggering a safety point .
The time this phase is suspended , And heap memory size , The total number of objects is not directly related , It is By the surviving object (alive objects) To determine the number of . therefore Increasing the size of heap memory does not directly affect the time taken by the marking phase .
After the marking phase is complete ,GC Proceed to the next step , Delete unreachable objects .
Delete unreachable objects (Removing Unused Objects)
Various GC The algorithm is slightly different when deleting unreachable objects , But in general, it can be divided into three categories : eliminate (sweeping)、 Arrangement (compacting) And copy (copying).
eliminate (Sweeping)
Mark and Sweep( Mark — eliminate ) The concept of the algorithm is very simple : Just ignore all the garbage . That is, after the marking phase , Memory space occupied by all unreachable objects , Are considered idle , So it can be used to allocate new objects .
This algorithm requires the use of free tables (free-list), To record all free areas , And the size of each area .
Maintaining free tables increases the overhead of object allocation . There is another weakness —— That's it Debris problem .
For example, I applied 1k、2k、3k、4k、5k Of memory .

For some reason 2k and 4k Of memory , No longer use , You need to hand it over to the garbage collector for recycling .

This is the time , I should have enough 6k Of free space . Next , I am going to apply for a 5k Space , As a result, the system told me that I was out of memory .
And the longer the system runs , The more pieces there are .
Arrangement (Compacting)
Mark — eliminate — Sorting algorithm (Mark-Sweep-Compact), All marked objects ( The living object ), Migrate to the beginning of memory space , Eliminated 「 Mark — Clear algorithm 」 The shortcomings of .
The corresponding disadvantage is GC The pause time will increase , Because you need to copy all the objects to another place , Then modify the references to these objects .
The advantages of this algorithm are also obvious , After defragmentation , Assigning new objects is simple , Just go through 「 Pointer collision 」(pointer bumping) that will do . Use this algorithm , The amount of memory space remaining is always clear , No more memory fragmentation problems .
hypothesis Java The memory in the heap is absolutely regular , All used memory is put in one edge , Free memory is put on the other side , There is a pointer in the middle as an indicator of the dividing point , Then the allocated memory is just that A pointer moves a distance equal to the size of the object in the direction of free space , This distribution is called “ Pointer collision ”(Bump The Pointer).
Copy (Copying)
Mark — Copy algorithm (Mark and Copy) and 「 Mark — Sorting algorithm 」(Mark and Compact) Very similar : Both move all living objects .
The difference lies in ,「 Mark — Copy algorithm 」 Is to move memory to another space : Survival zone .
「 Mark — Replication method 」 The advantage is : Tagging and copying can be done at the same time . The drawback is that you need an extra memory section , To store all living objects .
The generational Hypothesis
We briefly introduce some common memory recycling algorithms , at present ,JVM The garbage collector , It is the development of several simple algorithms . Take a brief look at their characteristics :
- Mark - eliminate (Mark-Sweep): The efficiency of general , The disadvantage is that it will cause memory fragmentation .
- Copy algorithm (Copy): Replication algorithm is the most efficient of all algorithms , The disadvantage is that it will cause a certain waste of space .
- Mark - Arrangement (Mark-Compact): The efficiency is worse than the first two , But no space is wasted , It also eliminates the memory fragmentation problem .
therefore , There is no optimal algorithm , Only the most appropriate algorithm .
JVM It's computing nodes , Not storage nodes . The ideal situation , After the object is used up , Its life cycle ends immediately . And those resources that are accessed frequently , We want it to reside in memory .
Studies have shown that , Most of the objects , It can be divided into two categories :
- Most objects have a short life cycle
- Other objects are likely to survive for a long time
Most die fast . We call this hypothesis the weak generational hypothesis (weak generational hypothesis).
Corresponding to this is the strong generational hypothesis (Strong Generational Hypothesis): The more times an object survives the garbage collection process, the more difficult it is for an object to die .
Now the garbage collector , Both physically and logically , Distinguish between these two types of objects :
- We take the area occupied by the fast dead objects , called The younger generation (Young generation)
- Take the area occupied by other living long objects , called Old age (Old generation), The old age is also called... In some places
Tenured Generation.
The younger generation
The garbage collection algorithm used by young generation is replication algorithm . Because the younger generation GC after , Very few objects will survive , It's very efficient to copy this part of the object .
We also learned that the replication algorithm will cause a certain waste of space , So there will be many regions among the young generation .

As shown in the figure , The young generation is divided into :Eden District , Two Survivor District .
In the early Qing Dynasty Eden Distinguish when the match is full , It will trigger the younger generation to GC(Minor GC). The specific process is as follows :
- stay Eden The zone executed the first time GC after , The surviving objects will enter one of them through the replication algorithm Survivor Partition ( hereinafter referred to as from);
- Eden District again GC, take Eden and from Clean up the area together , Live objects are copied to to District ; Next , Just empty from It's OK .
In the process , There is always one Survivor The partition is empty .Eden、from、to The default scale for is 8:1:1, So it will only cause 10% Waste of space . The proportion of , By parameters -XX:SurvivorRatio Configured ( The default is 8).
In general , We just need to understand this level OK 了 . But in an ordinary interview , Another point will often be mentioned , Although the frequency is not too high , It is TLAB, Here we also briefly introduce .
TLAB
TLAB The full name is Thread Local Allocation Buffer,JVM By default, each thread is opened with one buffer Area , Used to speed up object allocation . This buffer Just put it in Eden In the area .
stay Java Many of the objects in the program are small objects and can be lost after use , They don't exist for thread sharing and are suitable for being fast GC, So for small objects, usually JVM Priority will be given to TLAB On , also TLAB The allocation on is thread private , So there is no lock overhead . Therefore, in practice, it is usually more efficient to allocate multiple small objects than to allocate one large object .
This truth and Java In language ThreadLocal similar , Avoid the operation of public areas , And some lock competition .

The allocation of objects takes precedence in TLAB The distribution of , but TLAB It's usually very small , So when the object is relatively large , Will be in Eden The shared area of the zone is allocated .
TLAB It's an optimization technique , Similar optimizations include allocation of objects on the stack ( This leads to the topic of escape analysis , Default on ).
Old age
In the old days 「 Mark - eliminate 」、「 Mark - Arrangement 」 Algorithm , Because the survival rate of the old generation is generally relatively high , The space is relatively large , It's not worth copying , It's better to take the way of local collection .
that , How did the object enter the old age ? There are many ways .
(1) promote (Promotion)
If the object is old enough , Will pass 「 promote 」 Into the old age .
About whether the object is old or not , Through its age (age) To determine the . Whenever it happens Minor GC, The age of the surviving objects will increase 1. Until a certain threshold is reached , They're going to put these “ Obstinate ” To the old age .
If these objects become unreachable , Until the old days GC When , Will be cleaned up .
This threshold , You can use the parameter ‐XX:+MaxTenuringThreshold To configure , The maximum is 15, Because it uses 4bit Stored .
(2) Distribution guarantee
Look at the picture of the young generation , Every time you live , Will be placed in one of the surviving areas , The default scale for this area is 10%.
But we can't guarantee that the number of objects that survive each time is less than 10%, When Survivor Space is not enough , It depends on other memory ( The old age ) Distribution guarantee . This is the time , Objects are also allocated directly to the older generation .
(3) Big objects are distributed directly to the elderly
Objects beyond a certain size will be allocated directly in the older generation . This value is determined by the parameter -XX:PretenureSizeThreshold Configured . The default is 0, It means all of them Eden District Distribution .
(4) Dynamic object age determination
Some garbage collection algorithms , There is no demand for age Must be achieved 15 To rise to the old age , It uses some dynamic calculation methods . such as , If the sum of the sizes of objects of the same age in the surviving area , More than half of the survival zone , Greater than or equal to age We're going to go straight to the older generation .
Card mark (card marking)
We can see , Object references are a huge mesh . Some objects may be in Eden District , Some may be in the older generation , Then this kind of How cross generational references are handled What about ?
because Minor GC It happened alone , If an old age object references it , How to ensure that young people can survive ?
For yes 、 No judgment , We usually use Bitmap( Bitmap ) And bloom filter to speed up the search .
JVM In a similar way . Actually , The old age is divided into many card pages (card page) Of ( The general quantity is 2 Power of power ).
Card table (Card Table) Is a collection used to mark the status of a card page , Each card entry corresponds to a card page .
If the younger generation has a target assignment , And there was an object in the old age that pointed to this new object , Then the memory card page corresponding to the old age object , It will be marked as dirty, The card table only needs very small storage space to keep these states .
Garbage collection , You can read this card and watch first , Make a quick judgment .
HotSpot Garbage collector
So let's start with HotSpot Several garbage collectors , Each type of recycler has its own characteristics . We're in the normal GC To optimize the , Be sure to find out what kind of garbage collector you are using now .
Before that , We put the above generation garbage collection into a big picture , When introducing the following collectors , You can map their positions .
Here we briefly introduce some concepts :
Minor GC: It happened in the younger generation GC.Major GC: It happened in the old days GC.Full GC: Recycle the whole pile of garbage . such as Metaspace Area causes recycling of young and old generations .
The younger generation of garbage collectors
Serial Garbage collector
Serial( Serial ) Is the most basic garbage collector , Using the replication algorithm , Used to be JDK1.3.1 The only garbage collector of the previous generation .
Serial It's a single threaded collector , It will not only use one CPU Or a thread to complete garbage collection , And suspend all user threads during garbage collection , Until the end of garbage collection . This phenomenon is called STW(Stop-The-World).
STW It is automatically initiated and completed by the virtual machine in the background , Stop all the threads that the user works normally when the user is not visible , This is unacceptable for many applications .
Serial Although the garbage collector needs to pause all other working threads during garbage collection , But it's simple and efficient , For a limited single CPU In terms of environment , There is no overhead of thread interaction , Can get the highest single thread garbage collection efficiency , therefore Serial The garbage collector is still Java The virtual machine is running in Client Default younger generation garbage collector in mode .
ParNew Garbage collector
ParNew yes Serial Multithreaded version of . By multiple GC Threads do garbage cleaning in parallel . The cleanup process still needs to stop the user thread .
ParNew Pursuit “ Low pause time ”, And Serial The only difference is the use of multithreading for garbage collection , In a multiple CPU Performance ratio in environment Serial There will be a certain degree of improvement ; But thread switching requires extra overhead , So in the single CPU Not as good as in the environment Serial.
ParNew Collectors are many that run in Server Virtual machine in mode , In especial JDK 7 The preferred new generation collector in previous legacy systems , There is one with the function 、 Performance independent but important reasons are , except Serial Outside the collector , At present, only ParNew Can and CMS The collector works with .
Parallel Scavenge Garbage collector
Parallel Scavenge The collector is a new generation collector , It's also a collector that uses replication algorithms , Is a parallel multithreaded collector ,Java1.8 The default new generation garbage collector .
It focuses on the program achieving a Controllable throughput (Thoughput,CPU Time to run user code /CPU Total time consumed , namely throughput = Run user code time /( Run user code time + Garbage collection time )), Throughput first .
High throughput can be used most efficiently CPU Time , Finish the operation task of the program as soon as possible , It is mainly applicable to tasks that operate in the background without too much interaction with users .
It is associated with ParNew The main difference is :
- Parallel Scavenge: Pursuit CPU throughput , Be able to complete the assigned task in a short time , Suitable for background computing without interaction . Weak interaction and strong computing .
- ParNew: Seeking to reduce user pause time , Suitable for interactive applications . Strong interaction and weak computation .
Garbage collectors in the old days
Serial Old Garbage collector
With the younger generation Serial The garbage collector corresponds to , All single threaded versions , It is also suitable for clients .
Young generation Serial, Using the replication algorithm .
Old age Serial Old , Use the tag - Sorting algorithm .
Parallel Old
Parallel Scavenge The old version , Support multi thread concurrent collection , Based on tags - Finishing algorithm implementation .
In the case of paying attention to throughput or processor resource scarcity , Priority can be given to Parallel Scavenge Add Parallel Old Collector combination .

CMS Garbage collector
CMS(Concurrent Mark Sweep) The collector is to get the shortest GC Collector with pause time as target , It makes user threads and GC Threads can execute concurrently , So in the process of garbage collection, users will not feel obvious stuck .
You know from the name ,CMS It's based on markers - Clear algorithm implementation . Its operation process is more complicated than the previous collectors , The whole process is divided into seven steps , Include :
- Initial marker (Initial Mark)
- Concurrent Tags (Concurrent Mark)
- Concurrent pre cleanup (Concurrent Preclean)
- Concurrent cancelable pre cleanup (Concurrent Abortable Preclean)
- Final marker (Final Remark)
- Concurrent elimination (Concurrent Sweep)
- Concurrent reset (Concurrent Reset)
In the long term ,CMS Garbage collector , Is to be G1 When the garbage collector is replaced . stay Java8 after , Using it will throw a warning .
Java HotSpot 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
G1
G1 The full name is GarbageFirst GC,G1 Our goal is to kill CMS Of , It is also a soft real-time garbage collector . comparison CMS,G1 The use of is more humanized . such as ,CMS The relevant parameters of the garbage collector are 72 individual , and G1 The only parameters for are 26 individual .
G1 stay JDK1.7 Version officially launched , yes JDK 9 Future default garbage collector , To replace the CMS Recyclers .
Other recyclers , It's an overall collection of a certain era , Collection time is naturally difficult to control .G1 Cut the pile into many pieces , Think of each as a small goal , Some goals are easy to achieve .
There's another interview question :G1 Is there a distinction between the younger generation and the older generation ?

As shown in the figure ,G1 There is a Eden Area and Survivor The concept of zone , But they are not continuous in memory , It's made up of small portions .
The size of this small area is fixed , It's called a small pile area (Region). The small pile area can be Eden District , It can also be Survivor District , It can also be Old District . therefore G1 The concepts of the younger generation and the older generation are logical .
Every piece Region, The size is the same , Its value is in 1M To 32M One... Between bytes 2 The power of .
But if my object is too big , One Region What if I can't let it go ? Notice that there is a large yellow area in the picture , Its name is Humongous Region, Larger than Region 50% The object of , Will be assigned here .
Region Size , It can be set through parameters :-XX:G1HeapRegionSize=32M.
that , When it's recycled , Which small heap areas are recycled ? Is it random ?
Of course not . in fact , The small dump area with the most garbage , Will be collected first . This is it. G1 The origin of the name .
Configuration parameters
adopt -XX:+PrintCommandLineFlags Parameters , You can view the current Java The version uses the default garbage collector . such as , In my system Java8 The default collectors are as follows :
java -XX:+PrintCommandLineFlags -version
-XX:InitialHeapSize=265277248 -XX:MaxHeapSize=4244435968 -XX:+PrintCommandLineFlags -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:-UseLargePagesIndividualAllocation -XX:+UseParallelGC
java version "1.8.0_321"
Java(TM) SE Runtime Environment (build 1.8.0_321-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.321-b07, mixed mode)
Here are some configuration parameters :
-XX:+UseSerialGCParameter specifies that both the new generation and the old generation use a serial garbage collector , namely Serial + Serial Old-XX:+UseParNewGCThe new generation uses ParNew, Used in the old days Serial Old-XX:+UseParallelGCThe new generation uses Parallel Scavenge, In the old days, it was used by default Serial Old-XX:+UseParallelOldGCThe new generation uses Parallel Scavenge, Used in the old days Parallel Old-XX:+UseConcMarkSweepGC, Indicates that the younger generation uses ParNew, In the old days CMS-XX:+UseG1GCUse G1 Garbage collector-XX:+UseZGCUse ZGC Garbage collector
In order to make a better impression , Please look at the chart below. . Their relationship is still complex . Pay special attention to -XX:+UseParNewGC This parameter , Already in Java9 Was abandoned in .

There are so many garbage collectors and parameters , Then what exactly do we use ? Where to optimize ?
at present , although Java It's a higher version of , But the most used is Java8. from Java8 Upgrade to a higher version of Java system , There is a certain cost , therefore CMS The garbage collector will last for some time .
The most used garbage collector online , There is CMS and G1, as well as Java8 default Parallel Scavenge.
- CMS Setting parameters of :
-XX:+UseConcMarkSweepGC. - Java 8 Default parameters :
-XX:+UseParallelGC. - Java 13 Default parameters :
-XX:+UseG1GC.
STW
We mentioned above STW, If during garbage collection ( Whether it's marking, sorting, copying ), What if new objects enter ?
In order to ensure that the program will not be disorderly , The best way is to pause all threads of the user . In this period of time , You can't new Object's , Can only wait for . Performance in JVM It's a short Caton , Nothing can be done . This headache , It's called Stop the world. abbreviation STW.
Here's an interview question : GC In the process , Can you still create new objects ?
The answer is No .
Marking stage , Most are to STW Of . If you don't pause the user process , When marking objects , It is possible that other user threads will generate some new objects and references , Cause confusion .
Now the garbage collector , Will try to reduce this process . But even the most advanced ZGC, There will also be a short STW The process . What we have to do is build on the existing infrastructure , Try to reduce GC pause .
Stop the World I believe you have a very deep understanding through the last article
You may be right STW There is no concept of the impact of , Let me give an example to illustrate .
The peak traffic of a highly concurrent service is 10 Ten thousand times / second , In the back 10 A load balancing machine , Then, on average, each machine needs 1w/s. If something happens to a machine during this time STW, lasted 1 second , So it was necessary 10ms You can return 1 Million requests , Need to wait at least 1 Second .
Performance in users , It's the system that gets stuck . If our GC Very often , This kind of Caton will be particularly obvious , Seriously affecting the user experience .
Although I say Java It provides us with a great automatic memory management mechanism , But it can't be abused , Because it has STW Hard wound .
in the final analysis , All kinds of garbage collectors are designed to solve the headache STW problem , Give Way GC Less time , Pause less , More throughput .
Reference material
- Zhi-ming zhou 《 In depth understanding of Java virtual machine 》
- Largo Education 《 Explain profound theories in simple language Java virtual machine 》
边栏推荐
- [JVM] class loading mechanism
- Bluetooth development (8) -- avdtp connection process
- 【JVM】内存模型
- 763. dividing alphabetic intervals
- [database] types of NoSQL database
- array_column() expects parameter 1 to be array, array given
- SQL statement -- enter the month, query the date (month, year, day), and output the month
- 452. 用最少数量的箭引爆气球
- SQL query, subquery as result field
- 对接请求方式
猜你喜欢
![[auto reply Script] happy new year. I typed every word myself, not forwarded it~](/img/53/75d2bbbee653a41e206ebd751fdea1.png)
[auto reply Script] happy new year. I typed every word myself, not forwarded it~

Bluetooth development (6) -- literacy of Bluetooth protocol architecture

VTK example -- three intersecting planes

In the month of safety production, Huangpu launched a publicity campaign for gas safety in shops

【Pygame小游戏】这款“打地鼠”小游戏要火了(来来来)
![[JVM] garbage collection mechanism](/img/61/e7611380954cdcd316dd0e570bdf17.png)
[JVM] garbage collection mechanism

A simple understanding of B tree
![[pyGame] this](/img/7c/adc13c0c87ca31c8581d68e6f21363.jpg)
[pyGame] this "groundhog" game is going to be popular (come on, come on)
![[database] MySQL index interview questions](/img/ff/8713465293f728f57840237242e227.png)
[database] MySQL index interview questions

Things about Bluetooth development (10) -- getting to know ble for the first time
随机推荐
Lambda learning records
Judgment and other issues: how to determine whether the judgment of the procedure is correct?
canvas绘画折线段
MD5Util
【Pygame小游戏】《坦克大战》,那些童年的游戏你还记得几个呢?
VTK example -- three intersecting planes
mysql 数据库 表 备份
SystemVerilog (x) - user defined type
VTK例子--三個相交的平面
[pyGame games] I'm not afraid you can't walk the maze series: the ultimate AI walks the maze. After learning, it will take you to open the door to a new world ~ (with game source code)
About optimizing API interface response speed
基于华为云ECS的目标检测与识别的昇腾AI开发体验【华为云至简致远】
Mysql database table backup
Compared with the "South-to-North Water Transfer", what will the "east to west" of the fire bring to cloud computing?
CSDN daily practice -- half search of ordered table
LeetCode 1996. 游戏中弱角色的数量*
Merge sort
VTK例子--三个相交的平面
[AI card player] for the first time to see such an "exciting" landowner, the key factor for a high winning rate is
[fireworks in the sky] it's beautiful to light up the night sky with gorgeous fireworks. A programmer brought a fireworks show to pay New Year's greetings to everyone~