What is false sharing? Filling in a hole I dug earlier
2022-07-07 22:53:00 【Yes' level training strategy】
Hello everyone, I'm yes.
When I wrote about FastThreadLocal earlier, I dug a hole and left it unfilled.
Ahem, it has been a while, but no harm done. I'll fill it in today.
Let's talk about what false sharing is, and why Netty decided to drop this optimization.
Enough talk, let's start!
What is false sharing?
The term sounds advanced, but it's actually easy to understand.
We all know that a CPU executes far faster than it can fetch data from memory. To narrow this gap, researchers came up with caches. Because of manufacturing process and integration limits, though, cache cannot serve as the medium for main memory, so CPUs typically use a layered cache structure:
L1, L2, and L3 sit between the CPU and main memory. The closer a cache is to the CPU, the faster it is to access and the smaller its capacity.
[Figure: cache sizes on my laptop's CPU]
Access speed: L1 > L2 > L3 > main memory.
L1 and L2 are private to each CPU core. When a core accesses data, it looks in L1 first, then L2, then L3, and finally main memory. So when a value is used in computation over and over, try to keep it in L1 for efficiency.
Looking at this structure, experienced readers will spot the problem that comes with multiple threads sharing memory across cores. This is where the cache coherence protocol MESI comes in. I won't go into the details of the protocol here; a simple example is enough:
When cpu1 and cpu3 both access the same piece of data in main memory, each loads it into its own cache. When cpu1 modifies the data, the copy in cpu3's cache becomes invalid. cpu3 then has to make cpu1 flush the change to main memory and reload the data from main memory; only then is the data correct.
Follow the steps in order and it shouldn't be hard to understand.
Now for the key point. The unit of a CPU cache is the cache line: the CPU does not fetch data from main memory one value at a time, but one line at a time. A line is typically 64 bytes, and that's where the problem comes from.
For example, suppose there is a long array of length 8. Its elements add up to exactly one cache line (8 × 8 bytes = 64 bytes). Now cpu1 frequently updates long[0] and cpu3 frequently updates long[5]. This gets painful.
Because of the cache line mechanism, cpu1 loads the entire array into its cache, and modifying only long[0] still marks the whole line dirty. At that point cpu3's copy of long[5] is invalid, so cpu3 has to make cpu1 flush the change to main memory and then reload long[5] from main memory before it can proceed. If cpu1 modifies long[0] again in the meantime, the whole sequence has to be repeated!
The two cores are clearly modifying different variables, yet they still interfere with each other. This situation is called false sharing!
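To make the arithmetic concrete, here is a small Java sketch that maps array indexes to cache lines. It assumes 64-byte lines and that element 0 starts exactly on a line boundary, which a real JVM does not promise (the array header also takes up space). The class and method names are made up for illustration.

```java
// Sketch only: which cache line does each element of a long[] fall into?
// Assumptions: 64-byte cache lines, element 0 aligned to a line boundary.
class CacheLineMath {
    static final int LINE_BYTES = 64;          // assumed cache line size
    static final int LONG_BYTES = Long.BYTES;  // 8 bytes per long

    // Cache line number of element `index`, counted from the start of the data.
    static int lineOf(int index) {
        return (index * LONG_BYTES) / LINE_BYTES;
    }

    public static void main(String[] args) {
        // long[0] and long[5] land in the same line; long[8] starts the next one.
        System.out.println("long[0] -> line " + lineOf(0));
        System.out.println("long[5] -> line " + lineOf(5));
        System.out.println("long[8] -> line " + lineOf(8));
    }
}
```

Under these assumptions, indexes 0 through 7 all map to the same line, which is why updates to long[0] and long[5] collide.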
How to avoid false sharing?
The solution is simple and crude: padding.
Separate the data that may conflict in memory. Separate it with what? With useless data.
Pad the key data before and after (the figure above only pads after it) with useless data, so that a cache line holds only one piece of meaningful data and the rest is filler. This keeps multiple pieces of meaningful data out of the same cache line. That way, when different CPU cores modify different data, they no longer invalidate each other's cached copies, and false sharing is avoided.
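For the long array case, one way to sketch the same idea is to leave unused slots between the elements that different threads touch. The class below is a hypothetical illustration, not code from Netty or the JDK; it assumes 64-byte cache lines and that each counter is written by only one thread.

```java
// Sketch only: spread logical counters across a long[] so that neighbouring
// counters are 64 bytes apart and cannot share an (assumed) 64-byte cache line.
class StridedCounters {
    static final int STRIDE = 64 / Long.BYTES; // 8 longs per assumed cache line
    private final long[] slots;

    StridedCounters(int counters) {
        // One stride of filler before the first counter and one after the last.
        slots = new long[(counters + 2) * STRIDE];
    }

    // Array index that actually stores the given logical counter.
    static int slotOf(int counter) {
        return (counter + 1) * STRIDE;
    }

    // Not atomic: intended for one writer thread per counter.
    void increment(int counter) { slots[slotOf(counter)]++; }

    long get(int counter) { return slots[slotOf(counter)]; }

    public static void main(String[] args) {
        StridedCounters c = new StridedCounters(2);
        c.increment(0);
        c.increment(1);
        c.increment(1);
        System.out.println(c.get(0) + " " + c.get(1));
    }
}
```

The cost is visible in the constructor: two counters occupy 32 longs instead of 2, which is the space trade-off that padding always carries.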
That is what the strange code in Netty's InternalThreadLocalMap is for.
But with all due respect, maybe I'm just not skilled enough: I can't see which variable this padding is meant to protect.
Sure enough, in the latest version a maintainer marked it as deprecated.
I looked it up on GitHub. The maintainer's reason for deprecating it is as follows:
A plain translation:
- I don't see any real benefit from the padding.
- The only object it might protect is the BitSet, but that is not modified frequently.
- The padding uses long fields, which does not necessarily stop the JVM from fitting the object references above into the alignment gap.
In short, no good use for this padding was found, so it was deprecated and will be removed in a future version.
So Netty is not a good example for demonstrating false sharing (and with that, the hole I dug in the FastThreadLocal post is filled).
That part is done; now let's look at a proper example.
Run the code and see
I wrote an example to show the real difference between padding and not padding.
Two threads each loop 50 million times, modifying two variables, a and b, in the same object. These two variables will most likely sit in the same cache line, which creates a false sharing scenario.
Without padding, it took 1400 ms.
Then we pad with the variables p1-p7 to separate a and b.
As you can see, the result was 380 ms. Padding really does work!
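The original code was shown as screenshots, so here is a minimal reconstruction of the experiment as described. The class names, the use of volatile fields, and the helper methods are my assumptions rather than the original code. The JVM is also free to reorder fields, so the padded layout is likely but not guaranteed, and timings will vary by machine.

```java
// Sketch only: two threads each update their own field of one shared object.
class FalseSharingDemo {
    // Unpadded: a and b probably share a cache line.
    static class Plain {
        volatile long a;   // volatile is an assumption; it forces real memory writes
        volatile long b;
    }

    // Padded: p1..p7 (56 bytes) are declared between a and b.
    static class Padded {
        volatile long a;
        long p1, p2, p3, p4, p5, p6, p7;
        volatile long b;
    }

    // Runs both tasks on separate threads and returns the elapsed milliseconds.
    static long timeMillis(Runnable first, Runnable second) {
        Thread t1 = new Thread(first);
        Thread t2 = new Thread(second);
        long start = System.nanoTime();
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Each field has exactly one writer thread, so the plain ++ is safe here.
    static long runPlain(Plain p, int n) {
        return timeMillis(
            () -> { for (int i = 0; i < n; i++) p.a++; },
            () -> { for (int i = 0; i < n; i++) p.b++; });
    }

    static long runPadded(Padded p, int n) {
        return timeMillis(
            () -> { for (int i = 0; i < n; i++) p.a++; },
            () -> { for (int i = 0; i < n; i++) p.b++; });
    }

    public static void main(String[] args) {
        int n = 50_000_000; // loop count from the article
        System.out.println("unpadded: " + runPlain(new Plain(), n) + " ms");
        System.out.println("padded:   " + runPadded(new Padded(), n) + " ms");
    }
}
```

For trustworthy numbers, a benchmark harness such as JMH would be a better fit than hand-rolled timing; this sketch only mirrors the shape of the experiment.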
Java actually provides an annotation, @Contended, which can be placed on a field to reduce false sharing. You can think of it as having the JVM do the padding for us automatically, so we don't need to write filler variables by hand. Note, however, that the JVM must be started with the -XX:-RestrictContended option for the annotation to take effect on our own classes.
Let's run it and see the result:
Sure enough, it also improves efficiency!
This annotation is also used inside the JDK, for example on CounterCell in ConcurrentHashMap and on Cell in Striped64.
But remember: in your own code it won't work without -XX:-RestrictContended!
Finally
By now you should understand what false sharing is, and that padding can be used to avoid it.
But padding wastes space, so it's not something to apply everywhere.
Only when adjacent fields are updated frequently by different threads is false sharing worth considering; otherwise don't worry about it.
Okay, that's it for today.
I'm yes, from a little bit to a billion. See you next time!