当前位置:网站首页>Go deep: the evolution of garbage collection
Go deep: the evolution of garbage collection
2022-06-23 22:50:00 【wenxing】
Stop the world It's about garbage collection (Garbage Collection,GC) A topic that can't be avoided , once Go Linguistic GC Mechanisms also threaten the response time of services ——Discord Technical team articles Why Discord is switching from Go to Rust it has been reviewed that Go Language GC The problems brought about by .Go It has been greatly improved through version iteration GC The problem of , Average time STW The time from 100+ms Down to 0.5ms—— What magic is it that makes the world hardly need to pause ? In this paper , We ask questions 、 The solution attempts to sort out the main process of this evolution .
What are the mechanisms for garbage collection , What are the pros and cons of each ?
Garbage collection mainly includes 「 Reference count 」 And 「 track 」 The mechanism of :
- Reference count (Reference Counting): Add a counter to each object , Whenever the object is referenced / When dereferencing , The counter increases automatically / Self reduction 1; When the counter is reset to zero , The object should be recycled .
- track (Tracing): Periodically traverse memory space , Start with several root storage objects and find the storage objects related to them , Then mark the remaining storage objects that are not associated , Finally, reclaim the memory space occupied by these unrelated storage objects .
These two methods have their own advantages and disadvantages :
Reference count | track | |
|---|---|---|
The whole program needs to be suspended | no | yes |
The extra overhead of the runtime | Concurrency control of counters Memory space of the counter | nothing |
Non recyclable conditions | Object circular reference * | nothing |
* notes : Similar to the following code ,person And apartment Circular references between objects of , As a result, it can not be recycled by the garbage collector based on circular reference .
person = Person() apartment = Apartment() person.apartment = apartment apartment.owner = person person = None apartment = None
among , The method of tracking is Java Multiple garbage collection algorithms 、Go Language, etc ; The method of reference counting is swift、Objective-C、Python Etc ,C++ Smart pointers to can also be considered implementation of reference counting —— among Python Provides circular reference detection , and swift、Objective-C Then use ARC(Automatic Reference Counting), The coder needs to judge the reference type , Distinguish between strong and weak references , for example
class Apartment {
weak var owner: Person?
}Go What is the specific use of the tracking garbage collection mechanism ?
Go 1.5 after , Use the three color marking method for garbage collection .
Traditional marking - The idea of sweeping algorithm is 70 The era brought forward . Hang the user program at some point (STW,stop the world), Implement the garbage collection mechanism —— There are two stages :
- Mark : Start marking objects from some root nodes , Step by step through the survival unit , Until no traversable units remain ; This step requires STW, Otherwise, it may be wrong Recycle objects newly created by users .
- Clean : Reclaim the remaining unlabeled units .
Three color marking method is an improvement of the traditional marking stage , Divided into white ( Objects not scanned )、 gray ( Object to be traversed ) And black ( Object traversed ):
// 1. Mark all objects on the stack as gray
// 2. Take out all gray objects and mark them black , Mark the objects referenced by this object as gray , Until there are no gray objects
func mark() {
for object in range getStackObjects() {
markGray(object)
}
for {
object, ok := getNextGrayObject()
if !ok {
break
}
markBlack(object)
markGray(object.GetReferences())
}
}Tricolor marking algorithm can make STW Time is greatly reduced , Make the user program and garbage collection algorithm can be carried out concurrently . actually ,Go 1.10 After the version , Average time STW The time is reduced to 0.5ms:
What makes user programs and tags - Sweeping algorithms cannot be executed concurrently ?
Because the user program interferes with the marking process . In order to avoid wrong cleaning objects , So we need STW.
that ,“ interfere ” What are there ?
- The user created a new object —— The object remains white , May end up being incorrectly recycled ;
- You dereference a white object from a gray object , And make a black object reference it —— The white object will not be scanned , Because the black object means that the relevant reference object has been scanned , Thus, the white object is incorrectly recycled .
Garbage collection algorithm when running in parallel with user program , How to ensure correctness ?
Depending on the write barrier (Write Barrier). The write barrier is intended to be a mechanism within the operating system , It ensures that the process of writing to the storage system is in a specific order ; In the garbage collection algorithm , The writing barrier is The specific code executed at each write .
We open the write barrier during marking , So as to try to avoid the interference of the user program to the marking process .Go 1.8 The previous method was to use Dijkstra stay 78 Method proposed in —— Insert write barrier (Insertion Write Barrier):
Every time a reference occurs , If the referenced object is white , Set it to gray , Be similar to :
func writePointer(slot, ptr) {
if isWhite(ptr) {
markGray(ptr)
}
*slot = ptr // Set references
}This write barrier just solves the interference of the user program mentioned above with the marking process :
- The object newly created by the user , Directly marked as gray , Avoid error recycling ;
- When the parent node of a white object changes from a gray object to a black object , The object is marked gray , It also avoids error recycling .
This is because the insert write barrier satisfies Strong trichromatic invariance :
Black objects don't point to white objects , It only points to gray objects or black objects .
What about the insert write barrier cost?
After the object on the stack is set to black , To avoid error recycling , Or you need to add a write barrier to the objects on the stack ( The overhead of frequent operations increases ), Or you need to rescan the stack (STW).
therefore ,Go stay 1.5 Version to 1.7 edition , After the insert write barrier is turned on , Only the pointer changes on the heap are grayed out , Do not change the pointer on the stack ; Mark the completed STW, The white objects on the stack will be marked again .
Go from 1.5 Every time before STW Time consuming from ~100ms Reduce the cost of this stage ~10ms.
How to further optimize the insert write barrier , Reduce STW Time consuming ?
Introduce here 1990 Year by year Yuasa The proposed delete write barrier (Deletion Write Barrier):
When a reference to an object is deleted , If the object is white , The object is grayed out . The code example is as follows :
The reliability of deleting the write barrier comes from its satisfaction Weak trichromatic invariance :
The white object pointed by the black object must contain an accessible path from the gray object to multiple white objects
This ensures that the white object will delete the reference , Its own and child nodes can always be marked black during the marking phase , So as to avoid the dangling pointer caused by error recycling .
Final ,Go 1.8 Version introduces a hybrid write barrier that combines insert and delete write barriers (Hybrid Write Barrier):
func writePointer(slot, ptr) {
if isWhite(*slot) {
markGray(*slot) // [ Delete the write barrier ] When deleting a reference , Dereferenced objects are marked gray
}
if isCurrentStackGray() {
markGray(ptr) // At present goroutine If the stack has not been scanned , Then the pointer points to the object marked gray
// This is because Go 1.8 It has been set before , When the write barrier is turned on , All new objects are marked black
// So where the pointer is goroutine The stack has not been scanned yet , The pointer is grayed out for further scanning
// If the current pointer is goroutine When it is already black ,
// * The pointer has either been scanned ( gray / black )
// * Or it is a new allocation object ( black )
}
*slot = ptr
}Whether mixed write barriers solve the problem of inserting write barriers ?
The real problem is : Whether mixed write barriers are avoided 1. At the end of the tag STW Then rescan the stack ; 2. Open the write barrier to the objects on the stack ?
Inserting the write barrier requires rescanning the stack , The white object is referenced by the pointer of the black object on the stack ; Now because the write barrier is removed , Such white objects are grayed out . So there is no need to rescan the stack . And notice , The write barrier is to gray the white object without changing the color of the black object on the stack , Therefore, the performance loss of opening the write barrier to the objects on the stack is avoided .
therefore ,Go 1.8 The introduction of hybrid write barriers guarantees performance , It also reduces the cost of rescanning the stack STW expenses . Every time GC Of STW Time from insert write barrier 1.5 Version of ~10ms Down to 1.8 Of ~0.5ms.
Go 1.8 Why do you need to STW?
There are two more stages to be STW:
- GC Preparation before you start , For example, set the write barrier ;
- At the end of the mark , Rescan global variables 、 Scan the system stack 、 End the marking process, etc .
边栏推荐
- SAP mm ml81n creates a service receipt for a purchase order and reports an error - no matching Po items selected-
- Industry 4.0 era: the rise of low code may bring about changes in the pattern of manufacturing industry
- Analysis and application of ThreadLocal source code
- Tencent News's practice based on Flink pipeline model
- Source code analysis of jmeter5.1 core class testcompiler
- Save: software analysis, verification and test platform
- How to access the top-level domain name and automatically jump to the secondary domain name?
- Batch production of plant hangtag
- C language picture transcoding for performance testing
- 蚂蚁集团自研TEE技术通过国家级金融科技产品认证
猜你喜欢

蚂蚁获FinQA竞赛冠军,在长文本数值推理AI技术上取得突破

Slsa: accelerator for successful SBOM

Section 29 basic configuration case of Tianrongxin topgate firewall

Opengauss Developer Day 2022 was officially launched to build an open source database root community with developers

Game security - call analysis - write code
Mysql中的触发器定义及语法介绍

混沌工程,了解一下

In the eyes of the universe, how to correctly care about counting East and West?

脚本之美│VBS 入门交互实战

Chaos engineering, learn about it
随机推荐
PHPMailer 发送邮件 PHP
How to use FTP to upload websites to the web
What should be done when the encryptor fails to authenticate in the new version of easycvr?
Analysis and application of ThreadLocal source code
Problems and solutions of MacOS installation go SQLite3
Hackinglab penetration test question 8:key can't find it again
How to lossless publish API gateway why do you need API gateway?
Save: software analysis, verification and test platform
[tcapulusdb knowledge base] example of deleting data (TDR table)
Flutter Utils
What is the API gateway architecture? What are the common gateway types?
Opengauss Developer Day 2022 was officially launched to build an open source database root community with developers
Semaphore semaphore details
Role of API service gateway benefits of independent API gateway
Change sql- Tencent cloud database tdsql elite challenge - essence Q & A
Remember a compose version of Huarong Road, you deserve it!
sql server常用sql
在宇宙的眼眸下,如何正确地关心东数西算?
[tcapulusdb knowledge base] reading data example (TDR table)
The article "essence" introduces you to VMware vSphere network, vswitch and port group!