Understanding Garbage Collectors in One Article

2022-07-05 22:50:00 Thousands of miles in all directions

Contents

1. Overview of garbage collection

1.1 Marking phase

1.1.1. Reference counting

1.1.2. Reachability analysis

1.2 Sweep phase

1.2.1. Mark-Sweep algorithm

1.2.2. Copying algorithm

1.2.3. Mark-Compact algorithm

1.3. Generational collection algorithm

2. Garbage collector

2.1. Performance metrics for evaluating GC

2.2. Overview of different garbage collectors

2.3. Serial collector: serial collection

2.4. ParNew collector: parallel collection

2.5. Parallel collector: throughput first

2.6. CMS collector: low latency

2.7. G1 collector: region-based generational

2.7.1 Incremental collection algorithm

2.7.2. The collection process of the G1 garbage collector

2.8. Garbage collector summary

2.8.1. Summary of the 7 classic garbage collectors

2.8.2. Garbage collector combinations

2.8.3. How to choose a garbage collector

3. Important topics related to garbage collection

3.1. Understanding System.gc()

3.2. References

3.2.1 Strong reference (Strong Reference): never collected

3.2.2 Soft reference (Soft Reference): collected when memory is insufficient

3.2.3 Weak reference (Weak Reference): collected as soon as it is found

3.2.4 Phantom reference (Phantom Reference): tracking object collection

1. Overview of garbage collection

Garbage is any object in a running program that is no longer pointed to by any reference. If such objects are not cleaned up in time, they keep occupying memory, and the available memory is gradually used up. In C and C++, allocating and releasing memory is the programmer's own responsibility, and mistakes lead to all kinds of errors. Java, by contrast, has shipped with its own garbage collector from the very beginning. Garbage collection greatly improves development efficiency and has become standard in modern languages: Python, Go, C#, Ruby and other high-level languages all provide it.

For Java developers, automatic memory management means there is no need to allocate and free memory by hand. Freed from tedious memory management, they can focus on business logic. But automatic memory management is a black box: if you rely too heavily on the "automatic" part, your ability to locate and fix problems such as memory overflow weakens. That is why it is important to understand how the JVM allocates and reclaims memory. Only then can you tune the system sensibly and, when you run into something like an OutOfMemoryError, quickly locate and solve the problem.

To design a garbage collector, we must first answer several questions: what needs to be collected? When should it be collected? How should it be collected?

Garbage collection does not cover every memory area. We mainly care about collection in the method area and the heap: the method area stores class information, while the heap holds the objects that are created, so it has the most content and interacts most closely with the program. The heap is further divided into several areas. The garbage collector can collect the young generation, the old generation, or even the entire heap together with the method area. In terms of frequency, the Young area is collected most often, the Old area less often, and the Perm area (Metaspace) is basically never collected.

The next question is how to know which memory can be reclaimed, that is, how to decide whether an object is garbage. The heap holds almost all Java object instances. Simply put, an object that is no longer referenced by any live object can be regarded as reclaimable garbage.

How is it reclaimed? In practice this cannot be done in a single step; at least two phases are needed: a marking phase and a sweep phase. The marking phase decides whether an object is alive, generally by one of two methods: the reference counting algorithm or the reachability analysis algorithm. The three common reclamation algorithms are Mark-Sweep, Copying, and Mark-Compact. The following sections look at each of these.

1.1 Marking phase

1.1.1. Reference counting

The reference counting algorithm (Reference Counting) is relatively simple: each object carries an integer reference counter that records how many references point to it.

For an object A, whenever any object references A, A's counter is incremented by 1; when a reference goes away, the counter is decremented by 1. Once A's counter reaches 0, A can no longer be used and can be reclaimed. This method is simple to implement, garbage is easy to identify, the check is efficient, and reclamation happens without delay.

But the method also has obvious drawbacks. It needs a separate field to store the counter, which adds space overhead. Every assignment has to update counters with additions and subtractions, which adds time overhead. The biggest problem is that it cannot handle circular references, which is why Java's garbage collectors do not use this algorithm.

A circular reference arises like this: once the pointer p is disconnected, the objects still reference one another in a loop, so their counters never drop to 0, as shown in the figure below.
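The cycle problem can be reproduced with a toy counter. The following is only a sketch: `RcObject` and `RefCountCycleDemo` are hypothetical names invented for this illustration, and the counters are maintained by hand because the JVM itself does not count references.

```java
// Toy object with a hand-maintained reference counter (hypothetical, not JVM behaviour).
class RcObject {
    final String name;
    int refCount = 0;   // the extra per-object counter field
    RcObject field;     // a single outgoing reference

    RcObject(String name) { this.name = name; }

    // Store a reference in this.field and update both counters.
    void setField(RcObject target) {
        if (target != null) target.refCount++;
        if (field != null) field.refCount--;
        field = target;
    }
}

class RefCountCycleDemo {
    // Returns the counters of a and b after the external pointer p is dropped.
    static int[] run() {
        RcObject a = new RcObject("a");
        RcObject b = new RcObject("b");
        a.refCount++;     // external pointer p -> a
        a.setField(b);    // a -> b
        b.setField(a);    // b -> a, closing the loop
        a.refCount--;     // p is disconnected
        // Nothing outside the loop points at a or b any more,
        // yet neither counter reaches 0.
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

Both counters stay at 1 even though the program can no longer reach either object, so a pure reference-counting collector would never reclaim them.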

Reference counting is the reclamation strategy of choice in many languages. Python, which has become even more popular thanks to artificial intelligence, uses reference counting alongside garbage collection. How does Python deal with circular references? By breaking them: either the reference is released explicitly at a suitable time, or weak references from the weakref module are used so that the cycle does not keep the objects alive.

1.1.2. Reachability analysis

Compared with reference counting, the reachability analysis algorithm is also simple to implement and efficient, and it solves the circular reference problem that reference counting cannot, which is why Java and C# use it. This kind of garbage collection is also called tracing garbage collection (Tracing Garbage Collection). The so-called "GC Roots" root set is a set of references that are guaranteed to be live.

Reachability analysis starts from the set of root objects (GC Roots) and searches top-down to determine whether each target object is reachable from that set. Every live object in memory is connected, directly or indirectly, to the root set; the path the search follows is called a reference chain (Reference Chain). An object that is not connected by any reference chain is unreachable, which means it is dead and can be marked as garbage. Only objects that are connected directly or indirectly to the root set are live.
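The search described above is an ordinary graph traversal. Here is a minimal sketch on a hand-built object graph; `Node` and `Reachability` are hypothetical names for this illustration, and a real collector walks actual heap references rather than a list of `Node` objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// A toy heap object that can reference other objects.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

class Reachability {
    // Depth-first search from the roots; returns the names of all reachable nodes.
    static Set<String> reachable(List<Node> roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> stack = new ArrayDeque<>(roots);
        Set<String> names = new TreeSet<>();
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (!seen.add(n)) continue;   // already visited, so cycles terminate
            names.add(n.name);
            for (Node r : n.refs) stack.push(r);
        }
        return names;
    }

    static Set<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);   // root -> a -> b
        c.refs.add(d);   // c and d reference each other ...
        d.refs.add(c);   // ... but no reference chain leads to them from a root
        return reachable(Arrays.asList(a));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the c/d cycle is simply never visited, which is how tracing sidesteps the circular reference problem.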

  What are the root objects? In Java, GC Roots include the following:

  • Objects referenced from the virtual machine stack, for example the parameters and local variables of the methods being called by each thread.

  • Objects referenced by JNI (that is, native methods) in the native method stack.

  • Objects referenced by static class fields in the method area, for example static variables of reference type in a Java class.

  • Objects referenced by constants in the method area, for example references in the string constant pool (String Table).

  • All objects held by a synchronized lock.

  • References held inside the Java virtual machine: the Class objects corresponding to primitive types, some resident exception objects (such as NullPointerException and OutOfMemoryError), and the system class loader.

  • JMXBeans that reflect the state inside the Java virtual machine, callbacks registered in JVMTI, the native code cache, and so on.

Besides these fixed GC Roots, depending on the garbage collector the user has chosen and the memory area currently being collected, other objects can be added "temporarily" to form the complete GC Roots set. Generational collection and partial collection (Partial GC) are examples.

If only one area of the Java heap is collected (typically only the young generation), keep in mind that the memory area is an implementation detail of the virtual machine and is not isolated: objects in this area may be referenced by objects in other areas. The referencing objects from those associated areas must then also be added to the GC Roots set to keep the reachability analysis accurate.

To decide with reachability analysis whether memory can be reclaimed, the analysis must run on a snapshot that guarantees consistency. Otherwise the accuracy of the result cannot be guaranteed. This is an important reason why GC has to "Stop The World". Even the CMS collector, which claims to have almost no pauses, has to pause while enumerating the root nodes.

1.2 Sweep phase

The three most common garbage collection algorithms in the JVM are Mark-Sweep, Copying, and Mark-Compact.

1.2.1. Mark-Sweep algorithm

Once the live and dead objects in memory have been told apart, the GC's next task is to perform the collection: free the memory occupied by useless objects so that there is enough free space to allocate new objects.

Mark-Sweep is a very basic and common garbage collection algorithm. When the available memory in the heap is exhausted, the whole program is stopped (also known as stop the world) and two things are done: first marking, then sweeping.

  • Mark: the collector traverses from the reference roots and marks all referenced objects, usually by recording them as reachable in the object header.

  • Sweep: the collector traverses heap memory linearly from start to end and reclaims every object whose header is not marked as reachable.
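The two steps above can be sketched on a toy heap of numbered slots. This is a simplified model under stated assumptions: `ToyObject` and `MarkSweepHeap` are hypothetical names, the `marked` flag stands in for the mark bit in a real object header, and references are stored as slot indices.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ToyObject {
    boolean marked;                                  // stand-in for the header mark bit
    final List<Integer> refs = new ArrayList<>();    // slot indices of referenced objects
}

class MarkSweepHeap {
    final ToyObject[] slots;
    final List<Integer> freeList = new ArrayList<>();

    MarkSweepHeap(int size) { slots = new ToyObject[size]; }

    // Mark: traverse from the roots and flag every object that is reached.
    void mark(int... roots) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int r : roots) stack.push(r);
        while (!stack.isEmpty()) {
            ToyObject o = slots[stack.pop()];
            if (o == null || o.marked) continue;
            o.marked = true;
            for (int ref : o.refs) stack.push(ref);
        }
    }

    // Sweep: walk the heap linearly, reclaim unmarked slots, reset marks on survivors.
    void sweep() {
        freeList.clear();
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] == null) {
                freeList.add(i);
            } else if (!slots[i].marked) {
                slots[i] = null;
                freeList.add(i);
            } else {
                slots[i].marked = false;
            }
        }
    }

    static List<Integer> demo() {
        MarkSweepHeap heap = new MarkSweepHeap(5);
        for (int i = 0; i < 5; i++) heap.slots[i] = new ToyObject();
        heap.slots[0].refs.add(2);   // root 0 -> 2 -> 4
        heap.slots[2].refs.add(4);
        heap.slots[1].refs.add(3);   // 1 -> 3, but nothing reaches 1
        heap.mark(0);
        heap.sweep();
        return heap.freeList;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The freed slots are not adjacent to each other, which is why this approach has to track free space in a list rather than with a single pointer.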

This method has obvious shortcomings. The Mark-Sweep algorithm is not very efficient, the whole application has to be stopped during GC, which makes for a poor user experience, and the free memory reclaimed this way is not contiguous. The result is memory fragmentation, and a free list has to be maintained.

1.2.2. Copying algorithm

To address the efficiency shortcomings of Mark-Sweep, M. L. Minsky invented the Copying algorithm in 1963 and successfully introduced it into an implementation of Lisp. Its core idea is to divide the memory space into two halves and use only one of them at a time. At collection time the live objects in the half in use are copied into the unused half, all objects in the half in use are then cleared, and the two halves swap roles, which completes the collection.

  The advantages of this approach: there are no separate mark and sweep passes, so it is simple to implement and runs efficiently, and the copied objects are laid out contiguously, so there is no fragmentation problem. The drawback is equally obvious: it needs twice the memory space. The algorithm is mainly used in the young generation, where for typical applications a single collection can usually reclaim 70% - 99% of the memory, so collection is very cost-effective. That is why today's commercial virtual machines use this algorithm to collect the young generation.
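The role swap between the two halves can be sketched as follows. This is a deliberately reduced model: `SemiSpace` is a hypothetical class, objects are plain strings, and liveness is passed in as a ready-made `boolean[]` instead of being discovered by tracing from roots as a real copying collector would do.

```java
import java.util.Arrays;

class SemiSpace {
    String[] from;   // the half currently used for allocation
    String[] to;     // the idle half
    int top;         // bump pointer: next free index in from

    SemiSpace(int halfSize) {
        from = new String[halfSize];
        to = new String[halfSize];
    }

    // Allocation only advances the pointer; no free list is needed.
    int allocate(String obj) {
        from[top] = obj;
        return top++;
    }

    // Copy the survivors into the idle half, wipe the used half, swap roles.
    void collect(boolean[] live) {
        int next = 0;
        for (int i = 0; i < top; i++) {
            if (live[i]) to[next++] = from[i];
        }
        Arrays.fill(from, null);
        String[] tmp = from;
        from = to;
        to = tmp;
        top = next;
    }

    static String demo() {
        SemiSpace heap = new SemiSpace(4);
        heap.allocate("a");
        heap.allocate("b");
        heap.allocate("c");
        heap.allocate("d");
        heap.collect(new boolean[] { true, false, true, false });
        return Arrays.toString(heap.from) + " top=" + heap.top;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The work done in `collect` grows with the number of survivors, not with the size of the half being abandoned, and the survivors end up packed at the start of the new half.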

1.2.3. Mark-Compact algorithm

The efficiency of the copying algorithm rests on the premise that few objects survive and most are garbage. This is often true in the young generation, but in the old generation most objects are usually alive. If the copying algorithm were still used there, the large number of survivors would make copying expensive. The characteristics of old generation collection therefore call for a different algorithm.

Mark-Sweep can indeed be applied to the old generation, but it is not only inefficient, it also leaves memory fragmented after collection, so JVM designers had to improve on it. The Mark-Compact algorithm was the result. It works as follows:

  1. The first phase is the same as in Mark-Sweep: mark all referenced objects starting from the roots.

  2. The second phase compacts all surviving objects toward one end of memory, laid out in order.

  3. Finally, all space beyond the boundary is cleared.
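Phases 2 and 3 of the steps above can be sketched as a sliding compaction over an array. The names `Cell` and `MarkCompact` are hypothetical, the mark phase is assumed to have run already (the `marked` flags are set by hand), and each cell holds at most one reference to keep the example short.

```java
class Cell {
    final String name;
    boolean marked;   // assumed to have been set by a preceding mark phase
    int ref = -1;     // index of one referenced cell, or -1 for none

    Cell(String name, boolean marked) {
        this.name = name;
        this.marked = marked;
    }
}

class MarkCompact {
    // Slides marked cells to the low end of the heap and returns the new allocation pointer.
    static int compact(Cell[] heap) {
        int[] forward = new int[heap.length];
        int next = 0;
        // Pass 1: compute the new address of every live cell.
        for (int i = 0; i < heap.length; i++) {
            forward[i] = (heap[i] != null && heap[i].marked) ? next++ : -1;
        }
        // Pass 2: rewrite references so they point at the new addresses.
        for (Cell c : heap) {
            if (c != null && c.marked && c.ref >= 0) c.ref = forward[c.ref];
        }
        // Pass 3: move live cells in address order and reset their marks.
        for (int i = 0; i < heap.length; i++) {
            if (forward[i] >= 0) {
                heap[forward[i]] = heap[i];
                heap[forward[i]].marked = false;
            }
        }
        // Clear everything beyond the boundary.
        for (int i = next; i < heap.length; i++) heap[i] = null;
        return next;
    }

    static String demo() {
        Cell[] heap = {
            new Cell("A", true), new Cell("B", false), new Cell("C", false),
            new Cell("D", true), new Cell("E", true)
        };
        heap[0].ref = 3;   // A -> D
        heap[4].ref = 0;   // E -> A
        int top = compact(heap);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < top; i++) {
            Cell c = heap[i];
            sb.append(c.name);
            if (c.ref >= 0) sb.append("->").append(heap[c.ref].name);
            sb.append(' ');
        }
        return sb.append("top=").append(top).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Pass 2 is the part that Mark-Sweep never has to do: because objects move, every reference to them must be updated to the new address.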

  The end result of Mark-Compact is equivalent to running Mark-Sweep and then defragmenting memory, so it can also be called the Mark-Sweep-Compact algorithm.

The essential difference between the two is that Mark-Sweep is a non-moving collection algorithm, while Mark-Compact is a moving one. Whether to move surviving objects is a trade-off with both advantages and disadvantages. As you can see, the marked live objects are arranged in order of memory address, while unmarked memory is cleaned up. As a result, when memory has to be allocated to a new object, the JVM only needs to hold the starting address of free memory, which is clearly much less overhead than maintaining a free list.

The advantages of this approach: it eliminates the scattered memory regions left by Mark-Sweep, so allocating memory for a new object only requires the JVM to hold one starting address. It also eliminates the copying algorithm's high cost of halving the usable memory.

The disadvantages: in terms of efficiency, Mark-Compact is slower than the copying algorithm; when an object is moved and other objects reference it, the addresses in those references have to be adjusted as well; and the user application must be suspended for the whole duration of the move, that is, STW.

               | Mark-Sweep                        | Mark-Compact             | Copying
Speed          | Medium                            | Slowest                  | Fastest
Space overhead | Low (but fragments accumulate)    | Low (no fragmentation)   | Usually needs 2x the space of the live objects (no fragmentation)
Moves objects  | No                                | Yes                      | Yes

In terms of efficiency the copying algorithm comes first, but it wastes too much memory.

To balance the three criteria above as far as possible, Mark-Compact is the more even-handed choice, but its efficiency is unsatisfactory: it has one more marking phase than the copying algorithm and one more memory-compaction phase than Mark-Sweep.

Is there no optimal algorithm? Unfortunately not, not even in the latest G1. There is no best algorithm, only the most suitable one, which should be tuned to the actual scenario to minimize the negative effects. This leads to the next topic: generational collection.

1.3. Generational collection algorithm

None of the algorithms above can completely replace the others; each has its own advantages and characteristics. The generational collection algorithm emerged from this.

Generational collection is based on one fact: different objects have different lifetimes. Objects with different lifetimes can therefore be collected in different ways to improve efficiency. In general, the Java heap is divided into the young generation and the old generation so that a different collection algorithm can be used for each according to its characteristics.

While a Java program runs, it creates a large number of objects. Some are tied to business information, such as Session objects in HTTP requests, threads, and Socket connections; these are directly tied to the business and so live longer. Others are mainly temporary variables produced while the program runs and have shorter lifetimes. String objects are an example: because String is immutable, the system produces a great many of them, and some objects can even be reclaimed after being used only once.

At present almost all GCs use generational collection. In HotSpot, which is built on the concept of generations, the memory reclamation algorithm has to take the characteristics of both the young and the old generation into account.

Young generation (Young Gen)

Characteristics of the young generation: the area is smaller than the old generation, objects have short lifetimes and a low survival rate, and collection is frequent.

In this situation the copying algorithm is the fastest. Its cost depends only on the size of the currently surviving objects, so it is well suited to young generation collection. Its low memory utilization is mitigated by HotSpot's design with two survivor spaces.

Old generation (Tenured Gen)

Characteristics of the old generation: the area is large, objects have long lifetimes and a high survival rate, and collection is less frequent than in the young generation. With a large number of objects that have a high survival rate, the copying algorithm is clearly unsuitable. The old generation is usually handled by Mark-Sweep, or by a hybrid of Mark-Sweep and Mark-Compact.

  • The cost of the Mark phase is proportional to the number of surviving objects.

  • The cost of the Sweep phase is proportional to the size of the managed area.

  • The cost of the Compact phase is proportional to the amount of data in the surviving objects.

Take the CMS collector in HotSpot as an example. CMS is based on Mark-Sweep and reclaims objects very efficiently. To deal with fragmentation, CMS uses the Serial Old collector, which is based on the Mark-Compact algorithm, as compensation: when memory reclamation goes badly (a Concurrent Mode Failure caused by fragmentation), Serial Old performs a Full GC to compact the old generation.

2. Garbage collector

Having analyzed the pros and cons of the different garbage collection algorithms, this chapter looks at the common collectors and at how a garbage collector is built in practice. Before that, let's look at some metrics for evaluating garbage collectors.

2.1. Performance metrics for evaluating GC

A garbage collector is evaluated mainly on the following metrics:

  • Throughput: the percentage of total running time spent running user code (total running time = program running time + memory reclamation time).

  • Garbage collection overhead: the ratio of time spent in garbage collection to total running time.

  • Pause time: how long the program's worker threads are suspended while garbage collection runs.

  • Collection frequency: how often collection happens relative to the application's execution.

  • Memory footprint: the amount of memory occupied by the Java heap.

  • Promptness: the time from an object's birth to its reclamation.

Throughput, pause time, and memory footprint together form an "impossible triangle": a good collector usually satisfies at most two of the three. As hardware has developed, a larger memory footprint has become more and more tolerable, and better hardware also reduces the impact of the running collector on the application, which raises throughput. Pauses, by contrast, are becoming ever less tolerable: imagine Taobao starting its Double 11 flash sale and the JVM suddenly spending a few seconds on garbage collection. In short, two things matter most: throughput and pause time.

Throughput is the ratio of CPU time spent running user code to the total CPU time consumed: throughput = user code time / (user code time + garbage collection time). For example, if the virtual machine has run for 100 minutes in total and garbage collection took 1 minute, the throughput is 99%. In this case the application can tolerate longer pauses; high-throughput applications have a longer time baseline, so quick response does not have to be considered.
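The formula can be checked against the 100-minute example above with a few lines of arithmetic. `GcThroughput` is a hypothetical helper written for this illustration; the inputs are the example's numbers (99 minutes of user code plus 1 minute of collection), not measurements.

```java
class GcThroughput {
    // throughput = user code time / (user code time + garbage collection time)
    static double throughput(double userCodeTime, double gcTime) {
        return userCodeTime / (userCodeTime + gcTime);
    }

    public static void main(String[] args) {
        // 100 minutes of total running time, 1 minute of which is garbage collection.
        double t = throughput(99.0, 1.0);
        System.out.printf("throughput = %.2f%n", t);
    }
}
```

Note that the formula says nothing about how that 1 minute is distributed: one long pause and many short pauses give the same throughput.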

Throughput first means keeping the STW time per unit of time as short as possible: 0.3 + 0.3 = 0.6

"Pause time" is the period during which application threads are suspended so that GC threads can run. For example, a 100 ms pause during GC means that no application thread was active during those 100 ms.

Pause time first means making each single STW as short as possible: 0.1 + 0.1 + 0.1 + 0.1 = 0.4

  High throughput is good because it makes end users feel that only application threads are doing "productive" work. Intuitively, the higher the throughput, the faster the program runs.

Low pause time (low latency) is good because, from the end user's point of view, it is always bad for an application to be suspended, whether by GC or for any other reason. Depending on the type of application, sometimes even a brief 200 ms pause can disrupt the end-user experience. A low pause time is therefore very important, especially for interactive applications.

Unfortunately, high throughput and low pause time are competing goals. If throughput comes first, memory has to be reclaimed less often, but then each GC needs a longer pause to do its work. Conversely, if low latency comes first, memory has to be reclaimed more often to shorten each pause, which shrinks the young generation and lowers program throughput.

When designing a GC algorithm, we have to set a goal: an algorithm can target only one of the two (maximum throughput or minimum pause time), or try to find a compromise between them.

The usual aim today is to reduce pause time while keeping maximum throughput as the priority, which is why we often ask operations to add machines.

2.2. Overview of different garbage collectors

Garbage collection is one of Java's signature capabilities and greatly improves development efficiency. It is naturally also a favorite interview topic. Different algorithms suit different scenarios, and different scenarios need different algorithms, so Java has not one garbage collector but a combination of collectors.

The specification does not say much about the garbage collector; it can be implemented by different vendors and different JVM versions. Because JDK versions iterate quickly, Java has spawned many GC versions. Looked at from different angles, collectors can be classified into different types.

By number of threads, collectors can be divided into serial garbage collectors and parallel garbage collectors.

Serial collection means that at any given time only one CPU is used to perform garbage collection; the worker threads are suspended until collection finishes. On platforms where the hardware is not particularly strong, such as a single CPU or a small application memory, the serial collector can outperform parallel and concurrent collectors. Serial collection is therefore the default for JVMs running in Client mode.

On CPUs with strong concurrency capability, the parallel collector pauses for less time than the serial collector. In contrast to serial collection, parallel collection uses multiple CPUs at the same time, which improves application throughput. But just like serial collection it is still exclusive and uses the "Stop-the-World" mechanism.

By working mode, collectors can be divided into concurrent garbage collectors and exclusive garbage collectors. A concurrent garbage collector works alternately with the application threads to minimize application pauses. An exclusive garbage collector (Stop the world), once running, stops all user threads in the application until collection completes.

By how fragmentation is handled, collectors can be divided into compacting and non-compacting garbage collectors. A compacting collector compacts the surviving objects after collection and eliminates the fragments left behind. A non-compacting collector does not.

By the memory area they work on, collectors can also be divided into young generation collectors and old generation collectors.

Among these collectors, the most important are:

  • Serial collectors: Serial, Serial Old

  • Parallel collectors: ParNew, Parallel Scavenge, Parallel Old

  • Concurrent collectors: CMS, G1

The relationship between these seven garbage collectors is:

[Figure: relationship between the seven garbage collectors]

Although we compare the collectors, the point is not to pick out the best one. No collector fits every scenario, so we can only choose the collector that best suits the specific application. The -XX:+PrintCommandLineFlags option prints the command-line flags in effect (including the garbage collector in use).
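As a complement to the command-line flag, the collectors active in a running JVM can also be listed from inside the program through the standard `java.lang.management` API. The class name `ActiveCollectors` is hypothetical; the names printed depend on the JVM, its version, and the flags it was started with, so no particular output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Returns the names of the collectors registered in the current JVM.
    static List<String> names() {
        List<String> out = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.add(gc.getName());
        }
        return out;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() return -1 when undefined.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different collector flags, for example -XX:+UseSerialGC, is a quick way to confirm which young and old generation collectors a given option selects.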

In practice we need to consider which collectors suit which generation. Currently the mapping is:

  • Young generation collectors: Serial, ParNew, Parallel Scavenge;

  • Old generation collectors: Serial Old, Parallel Old, CMS;

  • Whole-heap collector: G1;

These collectors cannot be combined arbitrarily. The common combinations are as follows:

[Figure: supported combinations of garbage collectors]

  1. A line between two collectors means that they can be used together: Serial/Serial Old, Serial/CMS, ParNew/Serial Old, ParNew/CMS, Parallel Scavenge/Serial Old, Parallel Scavenge/Parallel Old, G1;

  2. Serial Old serves as the fallback when CMS suffers a "Concurrent Mode Failure".

  3. (Red dashed line) Because of the cost of maintenance and compatibility testing, the Serial+CMS and ParNew+Serial Old combinations were deprecated in JDK 8 (JEP 173), and support for them was removed completely in JDK 9 (JEP 214).

  4. (Green dashed line) In JDK 14 the combination of Parallel Scavenge and Serial Old GC was deprecated (JEP 366).

  5. (Green dashed box) In JDK 14 the CMS garbage collector was removed (JEP 363).

2.3. Serial collector: serial collection

The Serial collector is the most basic and the oldest garbage collector; before JDK 1.3 it was the only choice for collecting the young generation. Serial is the default young generation collector in HotSpot's Client mode. It uses the copying algorithm, serial collection, and the "Stop-the-World" mechanism to reclaim memory.

Besides the young generation, the Serial collector also provides Serial Old for the old generation. Serial Old likewise uses serial collection and the "Stop the World" mechanism, but its reclamation algorithm is Mark-Compact.

  • Serial Old is the default old generation garbage collector in Client mode.

  • In Server mode, Serial Old has two main uses: ① in combination with the young generation Parallel Scavenge collector; ② as the fallback collection scheme for the old generation CMS collector.

This collector is single-threaded, but "single-threaded" does not only mean that it uses one CPU or one collection thread to do its work. More importantly, while it is collecting, all other worker threads must be suspended until it finishes (Stop The World).

Advantages: it is simple and efficient (compared with the single-threaded operation of other collectors). In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so by concentrating on collection it naturally achieves the highest single-threaded collection efficiency. It is a good choice for virtual machines running in Client mode. In desktop application scenarios, available memory is generally small (tens of MB to one or two hundred MB), so collection completes quickly (tens of ms to a little over 100 ms). As long as collection does not happen frequently, using a serial collector is acceptable.

In the HotSpot virtual machine, the -XX:+UseSerialGC option specifies that both the young and the old generation use serial collectors, that is, Serial GC for the young generation and Serial Old GC for the old generation. That said, this collector is mostly just worth knowing about: it is really only suitable for a single-core CPU, and most applications today are interactive and cannot accept the pauses it causes.

2.4. ParNew collector: parallel collection

If Serial GC is the single-threaded collector of the young generation, then ParNew is the multi-threaded version of the Serial collector. "Par" is short for Parallel, and "New" means that it handles only the young generation.

Apart from performing collection in parallel, ParNew is almost identical to Serial. In the young generation ParNew also uses the copying algorithm and the "Stop-the-World" mechanism. ParNew is the default young generation collector for many JVMs running in Server mode.

The two generations are treated differently here. The young generation is collected frequently, so parallel collection is efficient; the old generation is collected less often, so serial collection saves resources. Since ParNew collects in parallel, can we conclude that it is more efficient than Serial in every scenario? On a multi-CPU machine, ParNew makes full use of multiple CPUs and cores, finishes garbage collection faster, and improves program throughput. On a single CPU, however, ParNew is no more efficient than Serial. Serial collects on one thread, so the CPU does not have to switch tasks frequently, which avoids the extra overhead of multithreaded coordination.

Besides Serial, ParNew is currently the only young-generation collector that can work with CMS. Developers can pass -XX:+UseParNewGC to select ParNew explicitly. The option means that the young generation uses a parallel collector; it does not affect the old generation. -XX:ParallelGCThreads limits the number of GC threads; by default the JVM starts as many threads as there are CPUs, at least on machines with up to eight of them.
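To check which collectors a flag such as -XX:+UseParNewGC actually selected, the running JVM can be asked through the standard management API. The following is a minimal sketch; the class name ActiveCollectors is made up for illustration, and the printed collector names depend on the JVM vendor and version, so none are assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectors {
    /** Returns the names of the collectors the running JVM reports. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Names are JVM- and version-specific, so they are only printed, not compared.
        for (String name : collectorNames()) {
            System.out.println("collector: " + name);
        }
        // The CPU count is the starting point for the default number of GC threads.
        System.out.println("processors: " + Runtime.getRuntime().availableProcessors());
    }
}
```

Running the class once with and once without a collector flag makes the effect of the flag visible.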

2.5. Parallel collector: throughput first

Besides ParNew, HotSpot has a second parallel young-generation collector. Parallel Scavenge also uses the copying algorithm, parallel collection, and the "Stop-the-World" mechanism.

So is the Parallel collector redundant? No. Unlike ParNew, Parallel Scavenge aims for a controllable throughput (Throughput), which is why it is also called the throughput-first collector. Its adaptive tuning strategy is another important difference from ParNew.

High throughput uses CPU time efficiently and completes the program's computation as early as possible, which suits background tasks that need little interaction. It is therefore common in server environments, for example in applications for batch processing, order processing, payroll, and scientific computing.
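As a small worked example, throughput can be read as the share of total run time spent in application code rather than in GC. The sketch below computes that share. It also shows the GC share targeted by the HotSpot option -XX:GCTimeRatio=n, which is 1 / (1 + n); that option is not discussed in the text above and is included only as an illustration. The class and method names are hypothetical.

```java
public class ThroughputExample {
    /** Fraction of the total run time spent running application code. */
    static double throughput(double appMillis, double gcMillis) {
        return appMillis / (appMillis + gcMillis);
    }

    /** GC share targeted by -XX:GCTimeRatio=n, that is 1 / (1 + n). */
    static double gcShareForRatio(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 s of application work and 1 s of GC give 99% throughput.
        System.out.println(throughput(99_000, 1_000)); // 0.99
        // A ratio of 19 asks for at most 1/20 of the time in GC.
        System.out.println(gcShareForRatio(19)); // 0.05
    }
}
```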

In JDK 1.6 the Parallel collector gained Parallel Old, a collector for the old generation that replaces Serial Old. Parallel Old uses the mark-compact algorithm and is likewise based on parallel collection and the "Stop-the-World" mechanism.

Where throughput has priority, the combination of Parallel and Parallel Old reclaims memory very well in Server mode. It is the default collector in Java 8.

2.6. CMS collector: low latency

In the JDK 1.5 era, HotSpot introduced a collector that was close to epoch-making for highly interactive applications: CMS (Concurrent Mark Sweep). It was the first truly concurrent collector in the HotSpot virtual machine, and the first to let garbage collection threads and user threads work at the same time.

CMS focuses on keeping the pauses of user threads during garbage collection as short as possible. The shorter the pause (the lower the latency), the better a collector suits programs that interact with users, because good response times improve the user experience. A large share of Java applications run on the server side of Internet sites or B/S systems. Such applications care most about response time and want the shortest possible pauses, and CMS fits them well.

CMS uses the mark-sweep algorithm, and it still has "Stop-the-World" pauses. Unfortunately, CMS, an old-generation collector, cannot work with Parallel Scavenge, the young-generation collector that already existed in JDK 1.4.0. When CMS collects the old generation in JDK 1.5, the young generation can therefore only use ParNew or Serial. Before G1 appeared CMS was very widely used, and many systems still run CMS GC today.

CMS is more complicated than the earlier collectors. A collection cycle has four main phases: initial mark, concurrent mark, remark, and concurrent sweep.

  • Initial mark (Initial-Mark): all worker threads pause briefly under the "Stop-the-World" mechanism. The only task is to mark the objects directly reachable from GC Roots, after which the suspended application threads resume. Because few objects are directly reachable, this phase is very fast.

  • Concurrent mark (Concurrent-Mark): starting from the objects directly reachable from GC Roots, traverse the whole object graph. This takes a long time but does not pause user threads, which run concurrently with the GC threads.

  • Remark (Remark): during concurrent marking the worker threads run alongside or interleaved with the GC threads, so some marks go stale. This phase pauses the application ("Stop-the-World") to correct the mark records of objects whose state changed while the user program kept running. The pause is usually a little longer than the initial mark but much shorter than the concurrent mark phase.

  • Concurrent sweep (Concurrent-Sweep): remove the objects that the marking phases found dead and free their memory. Live objects are not moved, so this phase can also run concurrently with user threads.

Although CMS collects concurrently (non-exclusively), it still suspends the worker threads with "Stop-the-World" in the initial-mark and remark phases. Those pauses are short. None of the current collectors can do away with "Stop-the-World" completely; they only shorten the pauses as much as possible.

The most time-consuming phases, concurrent mark and concurrent sweep, do not suspend the application, so collection as a whole has low pauses. Because user threads keep running during collection, CMS must also make sure they have enough memory available while it works. CMS therefore cannot wait until the old generation is almost full, as other collectors do. It starts collecting once heap usage reaches a threshold, so that the application still has room to run during the CMS cycle. If the memory reserved during a CMS cycle cannot satisfy the program, a "Concurrent Mode Failure" occurs. The virtual machine then falls back to its backup plan: it temporarily enables the Serial Old collector to redo the old-generation collection, which makes the pause very long.
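The threshold rule can be pictured with a few lines of code. This is only a sketch of the idea, not HotSpot's logic: the class, the method names, and the 70% value are invented for illustration. In a real JVM the threshold is controlled by the option -XX:CMSInitiatingOccupancyFraction.

```java
public class CmsTriggerSketch {
    /** True once old-generation occupancy has reached the given percentage. */
    static boolean shouldStartConcurrentCycle(long usedBytes, long capacityBytes, int thresholdPercent) {
        return usedBytes * 100 >= capacityBytes * thresholdPercent;
    }

    /** Space still available to user threads while the concurrent cycle runs. */
    static long headroomBytes(long usedBytes, long capacityBytes) {
        return capacityBytes - usedBytes;
    }

    public static void main(String[] args) {
        long capacity = 1_000L;
        int threshold = 70; // illustrative value only
        System.out.println(shouldStartConcurrentCycle(699, capacity, threshold)); // false
        System.out.println(shouldStartConcurrentCycle(700, capacity, threshold)); // true
        // If the application allocates more than this before the cycle finishes,
        // the situation corresponds to a "Concurrent Mode Failure".
        System.out.println(headroomBytes(700, capacity)); // 300
    }
}
```

A lower threshold leaves more headroom but makes collections more frequent; a higher one does the opposite.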

CMS uses the mark-sweep algorithm, so the memory freed by each collection is very likely to consist of discontiguous blocks, and some fragmentation is inevitable. When allocating memory for new objects, CMS therefore cannot use bump-the-pointer allocation (Bump the Pointer) and has to allocate from a free list (Free List).
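The difference between the two allocation techniques can be sketched as follows. The two toy allocators work on plain integer addresses; all class and method names are hypothetical, and a real JVM allocator is far more involved (thread-local buffers, alignment, size classes).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class AllocationSketch {
    /** Bump the pointer: free space is one contiguous run, so allocating is an addition. */
    static final class BumpAllocator {
        private final int end;
        private int top;

        BumpAllocator(int start, int end) {
            this.top = start;
            this.end = end;
        }

        /** Returns the start address of the new block, or -1 if it does not fit. */
        int allocate(int size) {
            if (end - top < size) {
                return -1;
            }
            int address = top;
            top += size;
            return address;
        }
    }

    /** Free list: free space is scattered, so allocating means searching for a hole. */
    static final class FreeListAllocator {
        private final List<int[]> holes = new ArrayList<>(); // each entry: {start, length}

        void addHole(int start, int length) {
            holes.add(new int[] {start, length});
        }

        /** First-fit search; returns -1 when no single hole is large enough. */
        int allocate(int size) {
            for (Iterator<int[]> it = holes.iterator(); it.hasNext();) {
                int[] hole = it.next();
                if (hole[1] >= size) {
                    int address = hole[0];
                    hole[0] += size;
                    hole[1] -= size;
                    if (hole[1] == 0) {
                        it.remove();
                    }
                    return address;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(0, 100);
        System.out.println(bump.allocate(30)); // 0
        System.out.println(bump.allocate(30)); // 30

        FreeListAllocator list = new FreeListAllocator();
        list.addHole(0, 20);
        list.addHole(50, 40);
        System.out.println(list.allocate(30)); // 50: the first hole is too small
        System.out.println(list.allocate(25)); // -1: 30 units are free, but split into 20 and 10
    }
}
```

The last call shows the cost of fragmentation: enough memory is free in total, yet no single hole can hold the object.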

Some people ask: since mark-sweep causes memory fragmentation, why not switch the algorithm to mark-compact?

The answer is simple. If memory were compacted during the concurrent sweep, what would happen to the memory that user threads are using? User threads can only keep running if the resources they depend on stay untouched. Mark-compact suits "Stop-the-World" scenarios better.

Advantages of CMS:

  • Concurrent collection

  • Low latency

Disadvantages of CMS:

  • Memory fragmentation. After a concurrent sweep the space available to user threads may be insufficient, and when a large object cannot be allocated, a Full GC has to be triggered early.

  • Sensitivity to CPU resources. The concurrent phases do not pause user threads, but they occupy some threads, which slows the application down and lowers total throughput.

  • Floating garbage cannot be handled, and a "Concurrent Mode Failure" may lead to another Full GC. During concurrent marking the worker threads run alongside or interleaved with the GC threads. Garbage created in that phase is not marked by CMS, so it is not reclaimed in time and its memory is only freed by the next GC.

2.7. G1 collector: region-based generational collection

With the algorithms described so far, the application is in a Stop-the-World state during collection: all application threads are suspended, normal work stops, and everything waits for the collection to finish. If a collection takes too long, the application is suspended for a long time, which seriously hurts the user experience or the stability of the system. Research on real-time garbage collection therefore led to incremental collection (Incremental Collecting), which is also the basic principle behind G1.

2.7.1 Incremental collection algorithm

The basic idea: if handling all garbage at once causes a long pause, let the garbage collection thread and the application threads take turns. Each time, the collection thread collects only a small area of memory and then hands control back to the application threads, and this repeats until the collection is complete. The incremental algorithm still builds on the traditional mark-sweep and copying algorithms. By handling conflicts between threads properly, it lets the collection thread do the marking, sweeping, or copying in stages.

The advantage is that application code also runs intermittently during collection, which reduces pause times. The drawback is just as clear: thread switches and context switches add to the total cost of garbage collection and lower system throughput.

In general, under otherwise equal conditions, a larger heap means a longer GC and a longer pause. To control pause times better, a large memory area is split into small blocks, and each collection reclaims a reasonable number of blocks according to the target pause time instead of the whole heap, which shortens the pause of a single GC.

The generational algorithm splits the heap into two parts by object lifetime, whereas the partitioning algorithm splits the whole heap into contiguous small regions. Each region is used and collected independently, which makes it possible to control how many regions are collected at a time. This, too, is a basic principle of G1.

We already have several powerful GCs, so why release Garbage First (G1)? Because applications handle ever larger and more complex workloads, and GCs that frequently cause STW pauses cannot keep up with real demand, so GC keeps being optimized. G1 (Garbage-First) is a collector introduced after Java 7 update 4. It is meant to adapt to growing memory sizes and processor counts, to reduce pause times (pause time) further, and to keep good throughput at the same time. The official goal of G1 is the highest possible throughput with controllable latency, so it carries the mission and expectations of a "fully featured collector".

Why the name Garbage First (G1)? G1 is a parallel collector that splits heap memory into many independent regions (Region), which need not be physically contiguous. Different Regions represent Eden, Survivor 0, Survivor 1, the old generation, and so on.

G1 deliberately avoids collecting the entire Java heap in one go. It tracks the value of the garbage in each Region (the space a collection would free and an empirical estimate of the time it would take), maintains a priority list in the background, and each time collects the most valuable Regions first within the allowed collection time. Because this approach concentrates on the Regions with the most garbage, it was named Garbage First. G1 is a collector for server-side applications, aimed mainly at machines with multi-core CPUs and large memory. It meets the GC pause-time target with high probability while keeping throughput high. It has been the default collector since JDK 9, replacing the CMS collector and the Parallel + Parallel Old combination, and Oracle officially calls it a "fully featured garbage collector".

Compared with other GC collectors, G1 uses a new partitioning algorithm. Its characteristics are:

1. Better concurrency and parallelism

  • Parallelism: during a G1 collection, several GC threads can work at the same time and use multi-core computing power effectively. User threads are in STW at this point.

  • Concurrency: G1 can alternate with the application, and part of its work runs at the same time as the application. In general, the application is therefore never blocked for the whole collection.

2. Improved generational collection

  • G1 is still a generational collector. It distinguishes the young generation from the old generation, and the young generation still has Eden and Survivor areas. In terms of heap layout, however, it does not require Eden, the young generation, or the old generation to be contiguous, nor does it insist on a fixed size or a fixed number.

  • The heap is divided into a number of regions (Region), which contain the logical young and old generations.

  • Earlier collectors work either in the young generation or in the old generation. G1 takes care of both.

3. Better space compaction

  • CMS: the mark-sweep algorithm leaves memory fragments, and the heap is defragmented only after several GCs.

  • G1 divides memory into Regions and reclaims it with the Region as the basic unit. Between Regions it uses the copying algorithm, while as a whole it can be seen as a mark-compact (Mark-Compact) algorithm. Both avoid memory fragmentation, which benefits long-running programs: allocating a large object does not trigger the next GC early because no contiguous memory can be found. The advantage grows with the size of the Java heap.

4. Predictable pause time model

This is soft real-time (soft real-time) behavior, and it is another big advantage of G1 over CMS. Besides aiming for low pauses, G1 can build a predictable pause time model: users can state explicitly that within any time slice of M milliseconds, garbage collection may take no more than N milliseconds. G1 can build this model because it treats the Region as the smallest unit of a single collection. The memory collected each time is a whole multiple of the Region size, which lets G1 deliberately avoid collecting the entire Java heap.

More specifically, G1 tracks the "value" of the garbage accumulated in each Region, where value combines the space a collection would free and an empirical estimate of the time it would take. It maintains a priority list in the background and, according to the pause time the user allows (set with -XX:MaxGCPauseMillis, default 200 milliseconds), collects the Regions with the highest payoff first. This is where the name "Garbage First" comes from. Partitioning memory into Regions and collecting them by priority lets G1 reach the best possible collection efficiency within a limited time.
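The priority-list idea can be sketched as a greedy selection: rank Regions by how much space they free per millisecond of predicted work, then pick Regions until the pause budget is used up. The Region class, its fields, and the numbers below are assumptions made for this illustration; HotSpot's real prediction model and data structures are more elaborate.

```java
import java.util.ArrayList;
import java.util.List;

public class CollectionSetSketch {
    /** Hypothetical per-Region statistics; not HotSpot's real data structure. */
    static final class Region {
        final String name;
        final long reclaimableBytes;      // space a collection would free
        final double predictedCostMillis; // estimated time to collect this Region

        Region(String name, long reclaimableBytes, double predictedCostMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.predictedCostMillis = predictedCostMillis;
        }

        /** Payoff: bytes freed per millisecond of pause. */
        double value() {
            return reclaimableBytes / predictedCostMillis;
        }
    }

    /** Picks the most valuable Regions whose predicted costs fit into the pause budget. */
    static List<Region> choose(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> byValue = new ArrayList<>(candidates);
        byValue.sort((a, b) -> Double.compare(b.value(), a.value())); // highest value first
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region region : byValue) {
            if (spent + region.predictedCostMillis <= pauseBudgetMillis) {
                chosen.add(region);
                spent += region.predictedCostMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 8_000_000L, 40.0));
        regions.add(new Region("B", 6_000_000L, 10.0));
        regions.add(new Region("C", 2_000_000L, 20.0));
        for (Region region : choose(regions, 50.0)) {
            System.out.println(region.name);
        }
        // Prints B, then A. C does not fit into the remaining budget.
    }
}
```

With a larger budget, for example the 200 millisecond default mentioned above, all three Regions would be selected.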

5. Region partitioning: breaking the whole into parts

G1 divides the whole Java heap into about 2048 independent Regions of equal size. The Region size depends on the actual heap size; it is kept between 1 MB and 32 MB and is always a power of two, that is 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, or 32 MB. It can be set with -XX:G1HeapRegionSize. All Regions have the same size, and the size does not change during the lifetime of the JVM.
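The sizing rule can be turned into a small calculation. The sketch below follows the description in the text (aim for roughly 2048 Regions, power of two, between 1 MB and 32 MB). Rounding down to a power of two is an assumption of this example, and HotSpot's actual heuristic differs in detail between JDK versions.

```java
public class RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final int TARGET_REGION_COUNT = 2048;

    /** Region size for a heap, following the rule described in the text. */
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MB); // never below 1 MB
        long powerOfTwo = Long.highestOneBit(raw);               // round down to a power of two
        return Math.min(powerOfTwo, 32 * MB);                    // never above 32 MB
    }

    public static void main(String[] args) {
        System.out.println(regionSizeBytes(1024 * MB) / MB);       // 1
        System.out.println(regionSizeBytes(4 * 1024 * MB) / MB);   // 2
        System.out.println(regionSizeBytes(12 * 1024 * MB) / MB);  // 4
        System.out.println(regionSizeBytes(128 * 1024 * MB) / MB); // 32
    }
}
```

For very large heaps the upper bound means there are more than 2048 Regions, which is why the count is only approximate.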

The concepts of young and old generation remain, but the two are no longer physically separated. Each is a set of Regions that need not be contiguous, and allocating Regions dynamically provides logical contiguity.

A Region may belong to Eden, Survivor, or the Old/Tenured memory area, but it can play only one role at a time. In the figure, E marks a Region that belongs to Eden, S one that belongs to Survivor, and O one that belongs to Old. Blank cells are unused memory.

G1 also adds a new kind of memory area called Humongous, shown as H blocks in the figure. It mainly stores large objects: an object larger than half a Region goes into an H Region.

Why H exists: large objects in the heap would by default be allocated directly in the old generation, but a short-lived large object there hurts the collector. To solve this, G1 sets aside Humongous Regions for very large objects. If one H Region cannot hold a large object, G1 looks for contiguous H Regions to store it, and to find them it sometimes has to start a Full GC. Most of G1's operations treat H Regions as part of the old generation.
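How many contiguous H Regions a large object needs is a simple ceiling division. The sketch below also computes the space left unused at the end of the last Region, on the assumption that a large object occupies its Regions exclusively. Class and method names are hypothetical.

```java
public class HumongousSketch {
    /** Number of contiguous Regions needed to hold the object (ceiling division). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    /** Space left over at the end of the last Region, assuming Regions are not shared. */
    static long unusedTailBytes(long objectBytes, long regionBytes) {
        return regionsNeeded(objectBytes, regionBytes) * regionBytes - objectBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long region = 4 * mb;
        long object = 10 * mb;
        System.out.println(regionsNeeded(object, region));        // 3
        System.out.println(unusedTailBytes(object, region) / mb); // 2
    }
}
```

The example shows why many large objects are expensive for G1: they need contiguous Regions and can leave part of the last one idle.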

Scenarios for G1: server-side applications on machines with large memory and multiple processors. It mainly serves applications that need low GC latency and have large heaps, for example a heap of about 6 GB or more with predictable pauses below 0.5 seconds. (G1 cleans incrementally, only some Regions at a time rather than all of them, so that no single GC pause gets too long.)

G1 is meant to replace the CMS collector from JDK 1.5. In the following cases G1 is likely to do better than CMS:

  • More than 50% of the Java heap is occupied by live data;

  • The object allocation rate or promotion rate varies greatly;

  • GC pauses are too long (longer than 0.5 to 1 second).

Among the HotSpot collectors, all except G1 use built-in JVM threads for their multithreaded GC work. G1 GC can also use application threads for GC work that runs in the background: when the JVM's GC threads are slow, the system calls on application threads to help speed up collection.

This section ends with one more question: what if an object is referenced from different Regions?

A Region is not isolated. Objects in one Region can be referenced by objects in any other Region. Does deciding whether an object is alive then require scanning the whole Java heap to be accurate? That would clearly be inefficient.

In fact, both G1 and the other generational collectors use a Remembered Set in the JVM to avoid a global scan. Every Region has its own Remembered Set. Each write of reference-type data triggers a Write Barrier that briefly interrupts the operation and checks whether the object the new reference points to lies in a different Region than the reference-type data itself (other collectors check whether a young-generation object is referenced from the old generation). If it does, the reference is recorded through the CardTable in the Remembered Set of the Region that holds the referenced object. During garbage collection the Remembered Set is added to the range of GC root enumeration, which guarantees that nothing is missed without a global scan.
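The interplay of write barrier, card table, and Remembered Set can be modeled in a few lines. This is a toy model under loud assumptions: addresses are plain integers, the Region and card sizes are invented, and the barrier is an ordinary method call, whereas a real JVM emits barrier code around every reference store.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RememberedSetSketch {
    static final int REGION_SIZE = 1024; // hypothetical: addresses per Region
    static final int CARD_SIZE = 128;    // hypothetical: addresses per card

    // For each Region: the cards outside it that contain references into it.
    private final Map<Integer, Set<Integer>> rememberedSets = new HashMap<>();

    static int regionOf(int address) {
        return address / REGION_SIZE;
    }

    static int cardOf(int address) {
        return address / CARD_SIZE;
    }

    /** Write barrier: called for every store of a reference at fieldAddress to targetAddress. */
    void onReferenceWrite(int fieldAddress, int targetAddress) {
        int from = regionOf(fieldAddress);
        int to = regionOf(targetAddress);
        if (from != to) {
            // Cross-Region reference: remember the card that holds the referring field.
            rememberedSets.computeIfAbsent(to, k -> new TreeSet<>()).add(cardOf(fieldAddress));
        }
    }

    /** Cards to scan as additional roots when the given Region is collected. */
    Set<Integer> extraRootCards(int region) {
        return rememberedSets.getOrDefault(region, Collections.emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch();
        heap.onReferenceWrite(100, 2100);  // Region 0 -> Region 2, recorded as card 0
        heap.onReferenceWrite(2200, 2300); // same Region, nothing recorded
        heap.onReferenceWrite(1500, 2050); // Region 1 -> Region 2, recorded as card 11
        System.out.println(heap.extraRootCards(2)); // [0, 11]
        System.out.println(heap.extraRootCards(0)); // []
    }
}
```

When Region 2 is collected, only cards 0 and 11 have to be scanned in addition to the GC roots instead of the whole heap.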

2.7.2. The G1 collection process

The process can be roughly divided into four steps:

  1. Initial mark (Initial Marking): as in other collectors, this only marks the objects directly reachable from GC Roots. It also updates a specific pointer so that user threads running concurrently in the next phase allocate new objects correctly in the available Regions. This phase pauses the threads, but only briefly, and it piggybacks on a Minor GC, so G1 has no extra pause here.

  2. Concurrent mark (Concurrent Marking): as in CMS, reachability analysis starts from GC Roots and recursively scans the whole object graph in the heap to find the objects to reclaim. This phase takes a long time but runs concurrently with the user program. After the scan, the objects recorded in the snapshot whose references changed during the concurrent phase must be processed again.

  3. Final mark (Final Marking): another brief pause of the user threads, used to process the few snapshot records left over from the concurrent phase.

  4. Screening and evacuation (Live Data Counting and Evacuation): update the statistics of each Region, sort the Regions by reclamation value and cost, and draw up a collection plan based on the pause time the user expects. Any number of Regions can be chosen to form the collection set. The live objects of the chosen Regions are copied into empty Regions, and the old Regions are then cleared completely. Because live objects are moved, the user threads must be paused; several collector threads do the work in parallel.

An optional fallback is Full GC. G1 was designed to avoid Full GC, but when that fails, G1 stops the application (Stop-The-World) and collects with a single-threaded algorithm. Performance is then very poor and application pauses can be long. Full GC should be avoided, and once it happens, the configuration needs tuning. When does a Full GC happen? For example, when the heap is too small and G1 has no empty Region available while copying live objects, it falls back to Full GC. Increasing memory solves this case.

G1 Full GC can have two causes: evacuation finds too little to-space for the promoted objects, or space runs out before concurrent processing finishes.

The evacuation phase was in fact also meant to run concurrently with the user program, but that is complicated to implement. Since G1 collects only part of the Regions and the pause time is under the user's control, it was not urgent, and the feature was left to the low-latency collector that came after G1 (ZGC). Besides, G1 is not aimed at low latency alone. Pausing the user threads maximizes collection efficiency, so the design chose to pause them completely in order to protect throughput.

2.8. Garbage collector summary

2.8.1. Summary of the 7 classic garbage collectors

As of JDK 1.8 there are 7 different garbage collectors. Each has its own characteristics, and in practice the choice depends on the situation.

Collector | Type | Generation | Algorithm | Goal | Typical scenario
Serial | Serial | Young | Copying | Response time first | Client mode on a single-CPU machine
ParNew | Parallel | Young | Copying | Response time first | Server mode on multi-CPU machines, used together with CMS
Parallel | Parallel | Young | Copying | Throughput first | Background computation with little interaction
Serial Old | Serial | Old | Mark-compact | Response time first | Client mode on a single-CPU machine
Parallel Old | Parallel | Old | Mark-compact | Throughput first | Background computation with little interaction
CMS | Concurrent | Old | Mark-sweep | Response time first | Internet or B/S services
G1 | Concurrent and parallel | Young and old | Mark-compact, copying | Response time first | Server-side applications

Evolution of GC: Serial => Parallel (parallel) => CMS (concurrent) => G1 => ZGC

2.8.2. Garbage collector combinations

Virtual machines from different vendors and in different versions are implemented very differently. The collectors of the HotSpot virtual machine in JDK 7/8 and their combinations are shown in the figure below.

[Figure: collectors of the HotSpot virtual machine in JDK 7/8 and their possible combinations]

  1. A line between two collectors means they can be used together: Serial/Serial Old, Serial/CMS, ParNew/Serial Old, ParNew/CMS, Parallel Scavenge/Serial Old, Parallel Scavenge/Parallel Old, and G1.

  2. Serial Old also serves as the backup plan when CMS hits a "Concurrent Mode Failure".

  3. (Red dotted line) Because of maintenance and compatibility-testing costs, JDK 8 declared the combinations Serial + CMS and ParNew + Serial Old deprecated (JEP 173), and JDK 9 removed support for them completely (JEP 214).

  4. (Green dotted line) JDK 14 deprecated the combination of Parallel Scavenge and Serial Old (JEP 366).

  5. (Green dashed frame) JDK 14 removed the CMS garbage collector (JEP 363).

2.8.3. How to choose a garbage collector

Configuring the garbage collector is an important choice in JVM tuning. The right collector can improve JVM performance considerably.

How to choose a garbage collector?

  1. First adjust the heap size and let the JVM adapt on its own.

  2. If memory is below 100 MB, use the serial collector.

  3. For a single-core, standalone program with no pause-time requirement, use the serial collector.

  4. With multiple CPUs, a need for high throughput, and pauses longer than 1 second being acceptable, choose the parallel collector or let the JVM choose.

  5. With multiple CPUs and a need for low pauses and fast responses (for example, latency must stay below 1 second, as in Internet applications), use a concurrent collector.

    G1 is the official recommendation and performs well. Internet projects today basically all use G1.
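The rules of thumb above can be written down as a small decision function. This is a deliberately simplified sketch: the class and method are hypothetical, the single-core rule ignores the pause-time condition, and the result is only a starting point for measurement. The three returned options are real HotSpot flags.

```java
public class CollectorChooser {
    /** Suggests a HotSpot collector flag from the rules of thumb in the list above. */
    static String suggestFlag(long heapMb, int cpus, boolean lowPauseRequired) {
        if (heapMb < 100 || cpus <= 1) {
            return "-XX:+UseSerialGC";   // small heap or single core
        }
        if (lowPauseRequired) {
            return "-XX:+UseG1GC";       // multiple CPUs, fast responses needed
        }
        return "-XX:+UseParallelGC";     // multiple CPUs, throughput first
    }

    public static void main(String[] args) {
        System.out.println(suggestFlag(64, 4, true));    // -XX:+UseSerialGC
        System.out.println(suggestFlag(4096, 8, false)); // -XX:+UseParallelGC
        System.out.println(suggestFlag(4096, 8, true));  // -XX:+UseG1GC
    }
}
```

Any suggestion of this kind still has to be checked against GC logs from the real workload.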

Finally, one point needs to be clear:

  1. There is no best collector, and no universal one.

  2. Tuning always targets a specific scenario and specific requirements. No collector settles things once and for all.

Garbage collectors are also a popular interview topic. An interviewer can work step by step from theory to practice and from many angles, and does not necessarily expect the candidate to know everything. Understanding the principles, however, earns extra credit. The more common, basic questions are:

  • What garbage collection algorithms are there? How do you decide whether an object can be reclaimed?

  • What is the basic workflow of a garbage collector?

In addition, pay attention to the commonly used parameters in this chapter on garbage collectors.

3. Important topics related to garbage collection

3.1. Understanding System.gc()

By default, calling System.gc() or Runtime.getRuntime().gc() explicitly triggers a Full GC that collects the old and the young generation together and tries to free the memory occupied by discarded objects. The System.gc() call comes with a disclaimer, though: it does not guarantee that the garbage collector runs (and certainly not that it takes effect immediately).

JVM implementers decide how the GC behaves when System.gc() is called. In general, garbage collection should be automatic and need no manual trigger, which would be far too cumbersome. In some special cases, such as writing a performance benchmark, System.gc() can be called between runs.

public class SystemGCTest {
    public static void main(String[] args) {
        new SystemGCTest(); // this object becomes garbage immediately
        // Ask the JVM to run the garbage collector; it may not do so right away.
        // Runtime.getRuntime().gc() does the same thing.
        System.gc();

        // Run finalize() on objects that have lost all references.
        System.runFinalization();
    }

    // finalize() is deprecated since Java 9; it is used here only to observe collection.
    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("SystemGCTest overrode finalize()");
    }
}

3.2. References

Java has four kinds of references: strong, soft, weak, and phantom. This section looks at each of them in turn.

3.2.1 Strong reference (Strong Reference): never reclaimed

The most common kind of reference in Java programs is the strong reference (typically more than 99% of the references in a system). It is the ordinary object reference we use all the time and also the default reference type.

When the new operator creates an object in Java and the object is assigned to a variable, the variable becomes a strong reference to the object.

A strongly referenced object is reachable, and the garbage collector never reclaims it.

An ordinary object with no other references can be collected as garbage once its reference goes out of scope or the (strong) reference is explicitly set to null. When exactly it is reclaimed depends on the garbage collection strategy.

By contrast, objects held only through soft, weak, or phantom references are softly, weakly, or phantom reachable, and under certain conditions they can all be reclaimed. Strong references are therefore one of the main causes of Java memory leaks.

Strong reference examples

StringBuffer str = new StringBuffer("hello mogublog");
The local variable str points to the heap space of the StringBuffer instance. The instance can be manipulated through str, so str is a strong reference to the StringBuffer instance.

Corresponding memory structure

Now suppose we run another assignment statement:

StringBuffer str1 = str;

Corresponding memory structure

Both references in this example are strong references. Strong references have the following characteristics:

  • A strong reference gives direct access to the target object.

  • The object a strong reference points to is never reclaimed by the system. The virtual machine would rather throw an OOM error than reclaim it.

  • Strong references can cause memory leaks.
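The second characteristic can be checked with a small probe. The sketch below is not from the original article: the class name StrongReferenceDemo and the method survivesGcWhileStronglyHeld are hypothetical. It uses a WeakReference only as an observer that does not keep the object alive, and it assumes nothing about whether the JVM honors System.gc(), since the call is only a request.

```java
import java.lang.ref.WeakReference;

final class StrongReferenceDemo {

    /** Returns true if the object is still present after a GC request while `strong` holds it. */
    static boolean survivesGcWhileStronglyHeld() {
        StringBuffer strong = new StringBuffer("hello");
        // The weak reference is only a probe: it does not keep the object alive.
        WeakReference<StringBuffer> probe = new WeakReference<>(strong);

        System.gc(); // only a request; the JVM may ignore it

        boolean alive = probe.get() != null;
        // `strong` is used after the GC request, so it is still live at that point.
        strong.append(" world");
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive while strongly held: " + survivesGcWhileStronglyHeld());
    }
}
```

Whether or not a collection actually runs, the probe still sees the object, because a live strong reference always wins.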

3.2.2 Soft reference (Soft Reference): reclaimed when memory is insufficient

If an object has only soft references, the garbage collector does not reclaim it while memory is sufficient, and reclaims it when memory runs short. After the softly referenced object is reclaimed, the Java virtual machine adds the soft reference to the reference queue associated with it.

Soft references are often used to implement memory-sensitive caches. If free memory is available, the cache can be kept for the time being, and it is cleaned up when memory runs low. This way the cache can be used without exhausting memory. When the garbage collector decides at some point to reclaim softly reachable objects, it clears the soft references and optionally places them in a reference queue (Reference Queue).

Soft references are similar to weak references, except that the Java virtual machine tries to keep soft references alive longer and clears them only when it is forced to. In one sentence: when memory is sufficient, softly reachable objects are not reclaimed; when memory is insufficient, they are. Since JDK 1.2, the SoftReference class implements soft references:

Object obj = new Object(); // Declare a strong reference
SoftReference<Object> sf = new SoftReference<>(obj);
obj = null; // Drop the strong reference

Soft references describe objects that are useful but not essential. Before the system is about to run out of memory, objects associated only with soft references are included in the collection scope for a second collection pass. If that pass still does not free enough memory, an out-of-memory error is thrown.
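The memory-sensitive cache described above can be sketched as follows. This is a minimal illustration, not the article's code: the class SoftCache and its loader parameter are hypothetical names, the sketch is not thread-safe, and it does not remove map entries whose soft references have been cleared.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader; // recomputes a value that is missing or was cleared

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if the GC has cleared the soft reference. */
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Either never cached, or reclaimed because memory ran short.
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<Integer, String> cache = new SoftCache<>(id -> "value-" + id);
        System.out.println(cache.get(1)); // loaded on first access
        System.out.println(cache.get(1)); // served from the cache unless it was cleared
    }
}
```

The caller always checks the result of get() on the soft reference, because the value can disappear between any two accesses once nothing else holds it strongly.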

3.2.3 Weak reference (Weak Reference): reclaimed as soon as it is found

Weak references also describe non-essential objects. An object that is only weakly referenced survives only until the next garbage collection. During a GC, once a weak reference is found, the object associated with it is reclaimed regardless of whether heap space is sufficient.

However, because garbage collector threads usually have low priority, objects held only by weak references are not always found quickly. In that case, a weakly referenced object can exist for a fairly long time.

Like soft references, weak references can be given a reference queue at construction time. When the weakly referenced object is reclaimed, the weak reference is added to that queue, and the queue can be used to track the reclamation of objects.

Soft and weak references are both well suited to holding cache data that is nice to have but not required. With them, the cached data is reclaimed when the system runs low on memory, so it does not cause a memory overflow, and when memory is plentiful the cached data can stay around for quite a long time and speed up the system.

Since JDK 1.2, the WeakReference class implements weak references:

Object obj = new Object(); // Declare a strong reference
WeakReference<Object> sf = new WeakReference<>(obj);
obj = null; // Drop the strong reference

The biggest difference between weakly referenced and softly referenced objects is that, during a GC, an algorithm must decide whether a softly referenced object should be reclaimed, whereas a weakly referenced object is always reclaimed. Weakly referenced objects are therefore reclaimed by the GC more easily and sooner.
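Tracking reclamation through a reference queue can be sketched like this. The class WeakReferenceQueueDemo and the helper awaitEnqueued are hypothetical names introduced for illustration. Because System.gc() is only a request, the helper retries for a bounded time and may still report false on some JVMs.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class WeakReferenceQueueDemo {

    /** Requests GCs until `ref` shows up in `queue` or the timeout expires. */
    static boolean awaitEnqueued(ReferenceQueue<Object> queue, Reference<?> ref, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (System.currentTimeMillis() < deadline) {
                System.gc(); // a request only
                // remove(timeout) blocks for up to 50 ms waiting for an enqueued reference.
                if (queue.remove(50) == ref) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        WeakReference<Object> weak = new WeakReference<>(obj, queue);

        obj = null; // drop the only strong reference
        boolean enqueued = awaitEnqueued(queue, weak, 2000);

        System.out.println("referent cleared: " + (weak.get() == null));
        System.out.println("reference enqueued: " + enqueued);
    }
}
```

Note that it is the WeakReference object itself that arrives in the queue, not the reclaimed object, so the queue tells you which reference was cleared.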

import java.lang.ref.WeakReference;

public class WeakReferenceTest {
    public static class User {
        public User(int id, String name) {
            this.id = id;
            this.name = name;
        }

        public int id;
        public String name;

        @Override
        public String toString() {
            return "[id=" + id + ", name=" + name + "] ";
        }
    }

    public static void main(String[] args) {
        // Construct a weak reference
        WeakReference<User> userWeakRef = new WeakReference<User>(new User(1, "songhk"));
        // Get the object back from the weak reference
        System.out.println(userWeakRef.get());
        System.gc();
        // Once a GC runs, the object is reclaimed whether or not memory is sufficient
        System.out.println("After GC:");
        // Try again to get the object from the weak reference
        System.out.println(userWeakRef.get());
    }
}

After garbage collection, the weakly referenced object has been cleared:

[id=1, name=songhk] 
After GC:
null
​
Process finished with exit code 0

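Interview question: Have you ever used WeakHashMap?

A WeakHashMap can be used to store image information. Its keys are held through weak references, so entries are reclaimed promptly once their keys are no longer strongly referenced, which helps avoid OOM when memory is low.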
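A small sketch of that behavior follows. The class WeakHashMapDemo, the method sizeWhileKeyHeld, and the sample value are hypothetical. The part after the key is released depends on the JVM actually running a collection, so the sketch prints the size instead of promising a result.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class WeakHashMapDemo {

    /** Returns the map size after a GC request while the key is still strongly held. */
    static int sizeWhileKeyHeld() {
        Map<Object, String> imageInfo = new WeakHashMap<>();
        Object key = new Object();
        imageInfo.put(key, "logo.png, 64x64");
        System.gc(); // a request only
        int size = imageInfo.size();
        // Using `key` here keeps it live across the GC request above.
        return imageInfo.containsKey(key) ? size : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Object, String> imageInfo = new WeakHashMap<>();
        Object key = new Object();
        imageInfo.put(key, "logo.png, 64x64");
        System.out.println("while key is held: " + imageInfo.size());

        key = null; // the entry is now eligible for removal
        for (int i = 0; i < 10 && !imageInfo.isEmpty(); i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("after key is released: " + imageInfo.size());
    }
}
```

Only the keys are weak. String literals make poor keys here, because interned literals stay strongly reachable and their entries are never dropped.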

3.2.4 Virtual reference (Phantom Reference): tracking object reclamation

Also known as a "ghost reference" or "phantom reference", this is the weakest of all reference types. Whether an object has a virtual reference has no effect at all on its life cycle. If an object is held only by virtual references, it is almost the same as having no reference at all: the garbage collector may reclaim it at any time. A virtual reference cannot be used on its own, and it cannot be used to obtain the referenced object: calling get() on a virtual reference always returns null. In other words, we cannot reach our data through a virtual reference. The only purpose of associating a virtual reference with an object is to track the garbage collection process, for example to receive a system notification when the object is reclaimed by the collector.
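Both properties, get() always returning null and notification through a queue, can be seen in a short sketch. The class PhantomReferenceDemo and the method getAlwaysReturnsNull are hypothetical names. The notification step assumes the JVM acts on the System.gc() request within the wait, which is not guaranteed.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

final class PhantomReferenceDemo {

    /** Shows that get() returns null even while the referent is strongly reachable. */
    static boolean getAlwaysReturnsNull() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        PhantomReference<Object> phantom = new PhantomReference<>(obj, queue);
        boolean isNull = phantom.get() == null;
        // `obj` is still strongly reachable here, yet get() gave us nothing.
        return isNull && obj != null;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        // A virtual reference is always created together with a reference queue.
        PhantomReference<Object> phantom = new PhantomReference<>(obj, queue);
        System.out.println("get(): " + phantom.get());

        obj = null; // drop the only strong reference
        System.gc(); // a request only
        // Wait up to one second for the reclamation notification.
        boolean notified = queue.remove(1000) == phantom;
        System.out.println("notified of reclamation: " + notified);
    }
}
```

The queue is the whole point of the design: since the referent can never be read back, the enqueued reference is the only signal the program receives.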

[An interview question that is both obscure and asked very often] What are the differences between strong, soft, weak, and virtual references? What are their specific use cases?

Since JDK 1.2, Java has extended the concept of a reference and divided references into strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and virtual references (Phantom Reference). The strength of these 4 kinds decreases in that order.

Apart from strong references, the other 3 kinds can all be found in the java.lang.ref package. The figure below shows the classes corresponding to these 3 reference types, which developers can use directly in their applications.

[Figure: the classes in java.lang.ref corresponding to the reference types]

Among the subclasses of Reference, only the finalizer reference is visible solely within the package. The other 3 reference types are public and can be used directly in applications:

  • Strong reference (StrongReference): the most traditional definition of a "reference". It is the ordinary reference assignment found throughout program code, such as "Object obj = new Object()". As long as the strong reference relationship exists, the garbage collector never reclaims the referenced object under any circumstances.

  • Soft reference (SoftReference): before the system is about to run out of memory, these objects are included in the collection scope for a second collection pass. If there is still not enough memory after that pass, an out-of-memory error is thrown.

  • Weak reference (WeakReference): an object associated with a weak reference survives only until the next garbage collection. When the garbage collector runs, objects associated with weak references are reclaimed whether or not there is enough memory.

  • Virtual reference (PhantomReference): whether an object has a virtual reference has no effect on its lifetime, and an instance of the object cannot be obtained through a virtual reference. The only purpose of associating a virtual reference with an object is to receive a system notification when the object is reclaimed by the collector.

Original site

Copyright notice
This article was created by [Thousands of miles in all directions]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/186/202207052242445753.html