
Multithreading Tutorial (XXVII): CPU Cache and False Sharing

2022-06-11 05:30:00 Have you become a great God today


1. CPU Cache Structure


Viewing the CPU cache with `lscpu`:

$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 142
Model name: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
Stepping: 11
CPU MHz: 1992.002
BogoMIPS: 3984.00
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0

Speed comparison

From CPU to    Approx. clock cycles
Register       1
L1             3~4
L2             10~20
L3             40~45
Memory         120~240

Registers can be thought of as sitting inside the CPU itself; CPU access to registers is the fastest of all.

The length of a clock cycle depends on the CPU clock frequency: at a 4 GHz clock, one cycle lasts about 0.25 ns.
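As a quick check of that arithmetic (a minimal sketch, not part of the original article; the class name is my own):

```java
// Cycle time = 1 / frequency. At 4 GHz that is 0.25 ns per cycle.
public class ClockCycle {
    public static void main(String[] args) {
        double frequencyHz = 4_000_000_000.0; // 4 GHz clock
        double cycleNs = 1e9 / frequencyHz;   // cycle length in nanoseconds
        System.out.println(cycleNs + " ns");  // prints "0.25 ns"
    }
}
```

Multiplying the cycle counts in the table above by this figure shows why a memory access (120~240 cycles, i.e. tens of nanoseconds) is so much more expensive than a register or L1 access.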

Because the CPU and main memory differ greatly in speed, data is pre-read from memory into the cache, which improves efficiency and CPU utilization.

The cache works in units of cache lines. Each cache line corresponds to a block of memory, usually 64 bytes (8 longs).

Adding a cache creates copies of data: the same data may be cached in the cache lines of several different cores.

To keep data consistent, when one CPU core modifies data, the entire corresponding cache line in the other cores must be invalidated.

For a more detailed look at the CPU caching mechanism, see the blog post in the references below; it is well worth reading.

2. False Sharing

As mentioned above, the cache works in cache-line units; each cache line corresponds to a block of memory, usually 64 bytes (8 longs).

But if a piece of data is smaller than 64 bytes, say 32 bytes, one cache line holds 2 pieces of data. If either of them is modified, the entire cache line is invalidated, even for cores that only use the other piece. This is false sharing.
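To make this concrete, here is a minimal sketch (not from the original article; the class, constants, and field names are my own) of two threads hammering adjacent longs. With `PADDING = 1` the two counters sit in the same 64-byte cache line and falsely share it; setting `PADDING = 8` moves them onto separate lines, which typically makes the loops run noticeably faster:

```java
// Two threads each increment their own counter. With PADDING = 1 the
// counters are adjacent longs in one cache line, so every write by one
// thread invalidates the other core's copy of the line. With PADDING = 8
// (8 longs = 64 bytes) the counters land on separate cache lines.
public class FalseSharingDemo {
    static final int ITERATIONS = 50_000_000;
    static final int PADDING = 1; // set to 8 to eliminate false sharing
    static final long[] counters = new long[2 * PADDING];

    public static void main(String[] args) throws InterruptedException {
        Thread t0 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counters[0]++;
        });
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counters[PADDING]++;
        });
        long start = System.nanoTime();
        t0.start(); t1.start();
        t0.join(); t1.join();
        System.out.println("elapsed ms: " + (System.nanoTime() - start) / 1_000_000);
    }
}
```

The exact timings depend on the machine, JIT warm-up, and core scheduling, so treat any single run as a rough indication rather than a benchmark.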

Take the LongAdder introduced in the previous session as an example: Cell is its accumulation unit, and LongAdder maintains several Cells to improve efficiency.


Because the Cells live in an array, they are stored contiguously in memory. One Cell is 24 bytes (a 16-byte object header plus an 8-byte value), so one cache line can hold 2 Cell objects. Here comes the problem:

Core-0 wants to modify Cell[0]

Core-1 wants to modify Cell[1]

No matter which core's modification succeeds, it invalidates the other core's cache line. For example, if Core-0's cache line holds Cell[0]=6000, Cell[1]=8000 and it accumulates to Cell[0]=6001, Cell[1]=8000, Core-1's cache line is invalidated even though Cell[1] did not change.

@sun.misc.Contended solves this problem. Its principle is to add 128 bytes of padding before and after the annotated object or field, so that when the CPU pre-reads the objects into the cache they occupy different cache lines and no longer invalidate each other.
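For illustration, here is a hand-rolled sketch of the same idea (my own class, not JDK code). @sun.misc.Contended is JDK-internal and, outside the java.* packages, only takes effect with the -XX:-RestrictContended JVM flag, so libraries often pad manually instead:

```java
// Manual padding: the long fields on each side keep `value` from
// sharing a 64-byte cache line with a neighbouring counter's hot field.
public class PaddedCounter {
    long p1, p2, p3, p4, p5, p6, p7; // padding before (7 longs = 56 bytes)
    volatile long value;             // the hot, contended field
    long q1, q2, q3, q4, q5, q6, q7; // padding after

    void add(long delta) {
        value += delta;
    }
}
```

One caveat with this naive version: the JIT may eliminate fields it can prove are never used, which is why some libraries hide the padding behind inheritance; @sun.misc.Contended avoids the issue entirely because the JVM lays the padding out itself.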


Reference:

CPU cache


Copyright notice
This article was created by [Have you become a great God today]. When reprinting, please include the original link. Thank you.
https://yzsam.com/2022/03/202203020539055666.html