Go Language Implementation Principles -- Map Implementation Principle
2022-07-05 23:12:00 【There is too much uncertainty in life】
1. Overview
The map data structure exists in many languages, and its queries are very efficient, with O(1) time complexity. The underlying implementation is a hash table: the hash of the key is computed to locate the bucket, and the value is then searched for within that bucket. The flow is shown in the figure below.

2. Hash collisions
Because the number of buckets is limited while the number of key-value pairs keeps growing, multiple hash values may end up mapping to the same bucket; this is a hash collision. To solve this problem, either chaining (the zipper method) or open addressing can be used.
2.1 The zipper method (chaining)

If multiple elements are placed in one bucket, they are organized as a linked list; after locating the bucket, the list is traversed to find the element.
Disadvantage: because a linked list is used to organize the data, extra pointers are needed, and the CPU cache cannot be used efficiently.
2.2 Open addressing

Unlike the zipper method, elements that fall into the same bucket are stored in an array. Go uses this approach to resolve hash conflicts: lookups traverse the array sequentially, which makes better use of the CPU cache.
3. Common operations
3.1 Declaration and creation
var m1 map[int]string  // m1 is nil; it cannot be written to
var m2 = make(map[int]int) // m2 is not nil; its length is 0
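A minimal sketch of the practical difference (assuming the fmt package is imported): reading from a nil map is allowed and returns the zero value, but writing to a nil map panics.

func nilMapDemo() {
	var m1 map[int]string // nil map
	fmt.Println(m1[1])    // reading a nil map is safe: prints the zero value ""
	fmt.Println(len(m1))  // 0
	// m1[1] = "a"        // would panic: assignment to entry in nil map

	m2 := make(map[int]int) // initialized, empty map
	m2[1] = 666             // writing is fine
	fmt.Println(m2[1])      // 666
}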
3.2 Reading values
var m = make(map[int]int)
v1 := m[1]
fmt.Println(v1) // 0
v2, ok := m[2]
fmt.Println(v2, ok) // 0 false
From the example above, if you read a key that does not exist, you get the zero value of the value type. It is therefore recommended to use the two-value form when reading from a map, because the second return value tells you whether the key actually exists.
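A common idiom that builds on the two-value form is to combine the lookup and the existence check in a single if statement:

if v, ok := m[2]; ok {
	fmt.Println("key 2 exists, value:", v)
} else {
	fmt.Println("key 2 does not exist")
}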
3.3 Assignment and deletion
var m = make(map[int]int)
// assignment: map[key] = value
m[1] = 666
fmt.Println(m[1]) // 666
// deletion: delete(m, key)
delete(m, 1)
fmt.Println(m[1]) // 0
3.4 Key comparability
In Go, a map's keys must be comparable. Apart from slices, functions, and maps, which are not comparable, almost all other types are comparable. It is worth mentioning that the comparability of a struct depends on all of its fields: if every field is comparable, the struct is comparable; otherwise it is not.
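A small sketch of this rule (the type names Point and Path below are made up for illustration): a struct whose fields are all comparable can serve as a key type, while a struct containing a slice cannot, and using it as a key type is rejected at compile time.

type Point struct{ X, Y int }      // all fields comparable -> valid key type

type Path struct{ Points []Point } // contains a slice -> not comparable

func structKeyDemo() {
	m := map[Point]string{}
	m[Point{X: 1, Y: 2}] = "a" // fine
	_ = m

	// var bad map[Path]string // compile error: invalid map key type Path
}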
3.5 Concurrency issues
// Test concurrent read and write.
// Crashes with: fatal error: concurrent map read and map write
func test1() {
	m := make(map[int]int)
	go func() {
		for {
			m[0] = 5
		}
	}()
	go func() {
		for {
			_ = m[1]
		}
	}()
	select {}
}

// Test concurrent reads: runs fine.
func test2() {
	m := make(map[int]int)
	go func() {
		for {
			_ = m[0]
		}
	}()
	go func() {
		for {
			_ = m[1]
		}
	}()
	select {}
}

// Test concurrent writes.
// Crashes with: fatal error: concurrent map writes
func test3() {
	m := make(map[int]int)
	go func() {
		for {
			m[0] = 5
		}
	}()
	go func() {
		for {
			m[1] = 5
		}
	}()
	select {}
}
It is not hard to see from the cases above that in Go a map only supports concurrent reads. The reason for this design is that in most scenarios maps are read far more often than they are written; making every operation mutually exclusive for that little extra safety would cost more than it is worth, so performance was chosen over built-in concurrency safety.
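If a map does need to be read and written concurrently, the usual options are to protect it with a sync.RWMutex, or to use sync.Map for read-heavy workloads. A minimal sketch with sync.RWMutex (assuming the sync package is imported; the SafeMap type and its methods are illustrative names):

type SafeMap struct {
	mu sync.RWMutex
	m  map[int]int
}

func (s *SafeMap) Get(k int) (int, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[k]
	return v, ok
}

func (s *SafeMap) Set(k, v int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

Usage: sm := &SafeMap{m: make(map[int]int)}; sm.Set(1, 666); v, ok := sm.Get(1).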
4. The underlying structure
4.1 Hash table structure
The hmap structure
// A header for a Go map.
type hmap struct {
	// Note: the format of the hmap is also encoded in cmd/compile/internal/reflectdata/reflect.go.
	// Make sure this stays in sync with the compiler's definition.
	count     int // # live cells == size of map. Must be first (used by len() builtin)
	flags     uint8
	B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
	noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
	hash0     uint32 // hash seed

	buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
	oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
	nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)

	extra *mapextra // optional fields
}
With the help of the English comments, the purpose of most fields is clear, so they are not repeated here. The flags field, which carries no comment, records the current state of the map; the concurrent read/write detection mentioned above relies on this field.
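A simplified conceptual sketch of how that detection works (illustrative pseudocode, not the actual runtime source; the flag name hashWriting does exist in runtime/map.go, but the value below is an assumption): the write path sets a writing bit in flags before touching the buckets and clears it afterwards, and both the read and write paths check the bit first.

// Conceptual sketch only; the real runtime calls fatal(), which cannot be recovered.
const hashWriting = 4 // assumed value, for illustration

type hmapSketch struct {
	flags uint8
}

func (h *hmapSketch) assign( /* key, value */ ) {
	if h.flags&hashWriting != 0 {
		panic("concurrent map writes")
	}
	h.flags ^= hashWriting
	// ... perform the actual write into a bucket ...
	h.flags &^= hashWriting
}

func (h *hmapSketch) access( /* key */ ) {
	if h.flags&hashWriting != 0 {
		panic("concurrent map read and map write")
	}
	// ... locate the bucket and read ...
}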
The bucket (bmap) structure
// A bucket for a Go map.
type bmap struct {
	// tophash generally contains the top byte of the hash value
	// for each key in this bucket. If tophash[0] < minTopHash,
	// tophash[0] is a bucket evacuation state instead.
	tophash [bucketCnt]uint8
	// Followed by bucketCnt keys and then bucketCnt elems.
	// NOTE: packing all the keys together and then all the elems together makes the
	// code a bit more complicated than alternating key/elem/key/elem/... but it allows
	// us to eliminate padding which would be needed for, e.g., map[int64]int8.
	// Followed by an overflow pointer.
}
In the source, bucketCnt is 8 at run time. This struct declares only a tophash field, an array of length 8; the comments tell us it stores the top byte of each key's hash. From the English comments that follow, the keys and values are laid out right after it in memory and are accessed by computing offsets. So at run time the effective structure looks like the following.
// A bucket for a Go map (conceptual runtime layout).
type bmap struct {
	tophash  [bucketCnt]uint8
	keys     [bucketCnt]keyType   // the map's key type
	values   [bucketCnt]valueType // the map's value type
	overflow uintptr              // pointer to the overflow bucket
}

The English comments also explain why the data is organized this way: packing all the keys together and then all the values together eliminates the padding that would otherwise be needed, e.g. for map[int64]int8.
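A conceptual sketch of how the keys and values that follow tophash are located (illustrative only, not the actual runtime code, which uses dataOffset and the key/elem sizes recorded in maptype; assumes the unsafe package is imported): the i-th key and value are reached by pointer arithmetic from the start of the bucket.

// keyAt returns a pointer to the i-th key in a bucket that starts at b.
// dataOffset is the size of the tophash array; keySize is the key type's size.
func keyAt(b unsafe.Pointer, dataOffset, keySize, i uintptr) unsafe.Pointer {
	return unsafe.Add(b, dataOffset+i*keySize)
}

// elemAt returns a pointer to the i-th value: all 8 keys come first,
// then all 8 values, which is exactly the packing described in the comment above.
func elemAt(b unsafe.Pointer, dataOffset, keySize, elemSize, i uintptr) unsafe.Pointer {
	return unsafe.Add(b, dataOffset+8*keySize+i*elemSize)
}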
4.2 Overflow buckets
From the structure analysis above, the data in a bucket is stored in arrays of fixed length 8, but as entries are stored into the map, a bucket may need to hold more than 8 of them. In that case Go creates a new bucket, called an overflow bucket; the original bucket and its overflow buckets form a chain. In other words, the original bucket holds a pointer to the overflow bucket.
4.3 Map growth
When a key is used to get a value, the corresponding bucket is located first and its elements are traversed; if the key is not found in that bucket, the overflow buckets, if any, are traversed as well. Traversal is slow, so to keep performance acceptable the number of buckets has to increase. Because a map's buckets are organized in an array, growing requires allocating a larger array and then copying the old buckets into the new one.
Growth conditions
// If we hit the max load factor or we have too many overflow buckets,
// and we're not already in the middle of growing, start growing.
if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
	hashGrow(t, h)
	goto again // Growing the table invalidates everything, so try again
}
From the English comment and the method names, it is not hard to see that map growth is triggered when the load factor reaches a threshold, or when there are too many overflow buckets (roughly, when the number of overflow buckets is on the order of the number of normal buckets), provided the map is not already in the middle of growing.
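A simplified sketch of what the two checks compute (not the actual runtime code; overLoadFactor and tooManyOverflowBuckets are the real function names seen in the snippet above, and the thresholds below follow the constants discussed next):

// Load-factor check: grow when the map holds more than 8 entries and the
// average number of entries per bucket would exceed 6.5.
func overLoadFactorSketch(count int, B uint8) bool {
	return count > 8 && float64(count) > 6.5*float64(uint64(1)<<B)
}

// Overflow-bucket check: grow when the number of overflow buckets is roughly
// as large as the number of normal buckets (B is capped for very large maps).
func tooManyOverflowBucketsSketch(noverflow uint16, B uint8) bool {
	if B > 15 {
		B = 15
	}
	return uint64(noverflow) >= uint64(1)<<B
}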
Load factor
The load factor is calculated as follows:

load factor = number of elements / number of buckets
The load factor gives a rough picture of how much of the allocated space is in use. If the load factor is small, the number of elements is far below the allocated capacity. For example, with a load factor of 1, each bucket holds one element on average; since every bucket can store 8 elements, on average only about 1/8 of the space is used.
// Maximum average load of a bucket that triggers growth is 6.5.
// Represent as loadFactorNum/loadFactorDen, to allow integer math.
loadFactorNum = 13
loadFactorDen = 2
From the source above, the maximum load factor of a map at run time is 6.5, i.e. the result of 13 ÷ 2. In other words, when a map's load factor reaches 6.5, growth is triggered.
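As a worked example: with B = 5 there are 2^5 = 32 buckets, which can hold up to 32 × 8 = 256 entries in the normal buckets; load-factor growth is triggered once an insertion would push the element count above 6.5 × 32 = 208.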
Growth strategy
func hashGrow(t *maptype, h *hmap) {
	// If we've hit the load factor, get bigger.
	// Otherwise, there are too many overflow buckets,
	// so keep the same number of buckets and "grow" laterally.
	bigger := uint8(1)
	if !overLoadFactor(h.count+1, h.B) {
		bigger = 0
		h.flags |= sameSizeGrow
	}
	oldbuckets := h.buckets
	newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)
	// ......
}
The relationship between the B field of the hmap structure and the number of buckets is: number of buckets = 2^B. In other words, every time B increases by 1, the number of buckets doubles.
From the source above, it is not hard to see that if growth is caused by the load factor, bigger is 1, so B increases by 1 and the number of buckets doubles. If growth is caused by too many overflow buckets, bigger is 0 and the number of buckets stays the same; the map "grows" laterally (same-size growth).
4.4 Deletion
When delete(map, key) is used to remove an element from a map, the corresponding slot in the bucket is marked emptyOne. If there are no more values after this slot, it is marked emptyRest instead. The benefit is that a lookup can stop traversing as soon as it encounters emptyRest, as shown in the figure below.