Eventual consistency of the MESI cache protocol in the CPU: why the CPU needs a cache
2022-07-04 16:22:00 【zxhtom】
Preface
- We have already published the lock series: 【How Java objects are laid out in memory】, 【What locks Java has】, and 【synchronized and volatile】. While analyzing how volatile handles instruction reordering in the last of these, I came across an article about CPU cache coherence.
- volatile forbids instruction reordering by means of memory barriers. Its other feature, memory visibility, is achieved through the CPU's MESI protocol.
- When thread A flushes modified data back to main memory, the CPU also notifies the other threads that their copies of that data are invalid and must be fetched again.
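The visibility guarantee described above can be sketched in a few lines of Java. This is a minimal illustration, not code from the original article: the class `VisibilityDemo` and the fields `ready` and `payload` are hypothetical names chosen for the example. The writer stores a plain value and then sets a volatile flag; the reader spins on the flag and is then guaranteed by the Java memory model to see the value written before it.

```java
// Hypothetical demo; VisibilityDemo, ready and payload are illustrative names.
class VisibilityDemo {
    private static volatile boolean ready = false; // volatile flag read by the other thread
    private static int payload = 0;                // plain field, published by the volatile write

    // Starts a reader that spins until it observes ready == true,
    // then returns the payload value the reader saw.
    static int runOnce() {
        ready = false;
        payload = 0;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // busy-wait; the volatile read forces a fresh look at the flag each time
            }
            seen[0] = payload;
        });
        reader.start();
        payload = 42;  // ordinary write ...
        ready = true;  // ... made visible by the volatile write that follows it
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw payload = " + runOnce());
    }
}
```

If `ready` were not volatile, the reader would be allowed to spin forever or to observe a stale `payload`; whether that actually happens depends on the JIT and the hardware.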
What is MESI
- MESI is an acronym of four words (Modified, Exclusive, Shared, Invalid), each naming a state that a cached copy of data can be in. We will use MESI to revisit how volatile achieves memory visibility.
- The CPU does not cache only the data it needs; it caches in blocks (cache lines), with 64 bytes typically being the smallest unit. So when we modify a value in one place, the other values near it in the same block are invalidated as well, which in turn forces other threads to resynchronize their data. This is false sharing. MySQL is designed in a similar way: to make queries convenient, data is divided into pages as the smallest unit, with a page size of 16KB.
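To make the block granularity concrete, here is a small sketch that maps byte addresses to cache-line indexes. It assumes a 64-byte line, which is common but not universal, and the class and method names (`CacheLineIndex`, `lineOf`, `sameLine`) are made up for this example.

```java
// Illustrative only: assumes 64-byte cache lines; real hardware may differ.
class CacheLineIndex {
    static final long LINE_BYTES = 64;

    // Index of the cache line that contains the given byte address.
    static long lineOf(long address) {
        return address / LINE_BYTES;
    }

    // True when two addresses fall into the same cache line,
    // i.e. a write to one invalidates the cached copy of the other.
    static boolean sameLine(long a, long b) {
        return lineOf(a) == lineOf(b);
    }

    public static void main(String[] args) {
        System.out.println(sameLine(0, 8));   // two adjacent longs
        System.out.println(sameLine(56, 64)); // neighbours across a line boundary
    }
}
```

Two adjacent `long` values at addresses 0 and 8 share a line, so they are candidates for false sharing; the values at 56 and 64 sit on either side of a line boundary and are not.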
Why the CPU needs a cache
- Data inside a computer is also stored in blocks. With this layout, fetching data arbitrarily, one piece at a time, would only increase the number of interactions. The CPU is fast, but memory cannot keep up with it, so fetching data in batches is the best approach.
- The same is true of reading bytes in network programming. We normally read 1024 bytes at a time, which reduces the number of network interactions.
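The batching argument above can be illustrated by counting read calls. The sketch below uses an in-memory stream as a stand-in for a network or memory source; `ChunkedReadDemo` and `countReads` are hypothetical names, and the 4096-byte payload is an arbitrary choice for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative only: an in-memory stream stands in for a slower data source.
class ChunkedReadDemo {
    // Counts how many read calls it takes to drain `data` with a buffer of the given size.
    static int countReads(byte[] data, int bufferSize) {
        int calls = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            byte[] buf = new byte[bufferSize];
            while (in.read(buf) != -1) {
                calls++; // one interaction with the source per successful read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        byte[] data = new byte[4096];
        System.out.println("1-byte reads:    " + countReads(data, 1));
        System.out.println("1024-byte reads: " + countReads(data, 1024));
    }
}
```

Draining the same 4096 bytes takes 4096 interactions with a 1-byte buffer and 4 with a 1024-byte buffer, which is the same trade-off a cache line makes between the CPU and main memory.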
The CPU has MESI, so why does Java still need volatile
- First, the Java virtual machine may reorder instructions to improve efficiency. Preventing that reordering is one of the features of volatile.
- What MESI guarantees is coherence for a single location in the cache of a single CPU. volatile, on the other hand, gives a guarantee at the level of the operation as seen by all CPUs. So volatile is still necessary.
In a typical system, there may be several caches (in a multicore system, each core has its own cache) sharing the main memory bus. Each corresponding CPU issues read and write requests, and the purpose of the caches is to reduce the number of times the CPUs read and write shared main memory.
- A cache line in any state other than Invalid can satisfy a CPU read request. An Invalid cache line must first be read from main memory (becoming S or E) to satisfy the CPU's read request.
- A write request can only be executed if the cache line is in the M or E state. If the cache line is in the S state, the copies of that line in other caches must first be changed to the Invalid state (different CPUs are not allowed to modify the same cache line at the same time, not even data at different positions within that line). This is often done with a broadcast, for example a Request For Ownership (RFO).
- A cache may discard a cache line that is not in the M state at any time, changing it to the Invalid state. A cache line in the M state must first be written back to main memory.
- A cache holding a line in the M state must listen for every attempt by other caches to read that line's location in main memory. Such a read must be delayed until the cache has written the line back to main memory and changed its state to S.
- A cache holding a line in the S state must also listen for requests from other caches to invalidate the line or take ownership of it, and must then invalidate the line (Invalid).
- A cache holding a line in the E state must also listen for other caches reading that line from main memory. Once such a read occurs, the line must change to the S state.
- The M and E states are always accurate: they match the true state of the cache line. The S state may be inaccurate. If another cache invalidates its copy of a line in the S state, this cache may in fact be the only one still holding the line, but it does not promote the line to the E state. This is because other caches do not broadcast a notification when they discard a line, and because the cache does not keep count of the number of copies of the line, it could not tell whether it had exclusive ownership even if such a notification existed.
- In this sense the E state is a speculative optimization: if a CPU wants to modify a cache line in the S state, a bus transaction is needed to change all other copies of the line to the Invalid state, whereas modifying a cache line in the E state requires no bus transaction.
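The transition rules listed above can be sketched as a tiny simulation. This is a simplified model written for illustration, not an implementation of any real cache controller: it tracks a single cache line across several cores, stores no data, and only counts bus transactions. The class `MesiSketch` and all of its members are hypothetical names.

```java
import java.util.Arrays;

// Simplified model: one cache line, one MESI state per core, no data payload.
class MesiSketch {
    enum State { M, E, S, I }

    private final State[] states;    // state of the line in each core's cache
    private int busTransactions = 0; // memory fetches and RFO broadcasts

    MesiSketch(int cores) {
        states = new State[cores];
        Arrays.fill(states, State.I);
    }

    State stateOf(int core) { return states[core]; }

    int busTransactions() { return busTransactions; }

    // CPU read: any state other than I is served locally.
    void read(int core) {
        if (states[core] != State.I) {
            return;
        }
        busTransactions++; // the line has to be fetched over the bus
        boolean othersHold = false;
        for (int i = 0; i < states.length; i++) {
            if (i != core && states[i] != State.I) {
                // A snooping M line is written back first; M and E both drop to S.
                states[i] = State.S;
                othersHold = true;
            }
        }
        states[core] = othersHold ? State.S : State.E;
    }

    // CPU write: allowed directly in M or E, otherwise ownership is requested first.
    void write(int core) {
        if (states[core] == State.S || states[core] == State.I) {
            busTransactions++; // Request For Ownership broadcast
            for (int i = 0; i < states.length; i++) {
                if (i != core) {
                    states[i] = State.I; // all other copies become Invalid
                }
            }
        }
        states[core] = State.M; // E -> M needs no bus transaction
    }

    public static void main(String[] args) {
        MesiSketch line = new MesiSketch(2);
        line.read(0);  // core 0: I -> E
        line.write(0); // core 0: E -> M, silently
        line.read(1);  // core 0: M -> S, core 1: I -> S
        line.write(1); // core 1: S -> M, core 0: S -> I
        System.out.println(line.stateOf(0) + " " + line.stateOf(1)
                + " bus=" + line.busTransactions());
    }
}
```

The sequence in `main` shows the point of the E state: the first write by core 0 costs no bus transaction because the line was exclusive, while the write by core 1 from the S state needs an RFO broadcast.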
Example
- As mentioned above, the CPU's cache unit is 64 bytes. Suppose two variables manipulated by multiple Java threads sit in the same block: when one thread modifies variable a, another thread operating on variable b also incurs data synchronization. Here is some code provided by Teacher Ma. I ran it locally, and it's fun.
import lombok.Data; // @Data comes from Lombok and generates setP(...)

@Data
class Store {
    // Seven longs of padding on each side of p.
    private volatile long p1, p2, p3, p4, p5, p6, p7;
    private volatile long p;
    private volatile long p8, p9, p10, p11, p12, p13, p14;
}

public class StoreRW {
    public static Store[] arr = new Store[2];
    public static long COUNT = 1_0000_0000L;

    static {
        arr[0] = new Store();
        arr[1] = new Store();
    }

    public static void main(String[] args) throws InterruptedException {
        // Each thread writes only to the p field of its own Store object.
        final Thread t1 = new Thread(() -> {
            for (long i = 0; i < COUNT; i++) {
                arr[0].setP(i);
            }
        });
        final Thread t2 = new Thread(() -> {
            for (long i = 0; i < COUNT; i++) {
                arr[1].setP(i);
            }
        });
        final long start = System.currentTimeMillis();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        final long end = System.currentTimeMillis();
        // Elapsed time in milliseconds.
        System.out.println(end - start);
    }
}
- The code is simple: two threads each keep writing to their own variable. Now remove the redundant fields from the object, so that Store keeps only the single field p:
@Data
class Store {
    private volatile long p;
}
- Running this version, I found the time was basically stable at about 100 milliseconds. With the 14 unrelated long fields added, the program is stable at about 70 milliseconds. The exact running time depends on the machine, but whatever the configuration, you should see a clear difference between having the 14 variables and not having them.
- This comes down to the CPU's cache unit. With only one field, the two objects in the arr array are very likely to sit in the same cache block, so when thread A operates on its object, thread B has to synchronize as well. Adding the 14 variables guarantees that the p fields of the two objects in arr are not in the same block.
- With the 14 variables, one Store takes up 15*8=120 bytes of fields. However you place two Store objects, their p fields cannot fall in the same 64-byte block, and p still sits in the middle of the padding. That is why this effect appears.
- Some people will find this kind of padding ugly, but it does improve performance. The JDK also provides an annotation for it, @sun.misc.Contended. Note that for application classes the annotation only takes effect when the JVM is started with -XX:-RestrictContended. In my tests, the improvement with the annotation did not seem as large as with the 14 variables.
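The padding arithmetic in this section can be checked with a short sketch. It assumes a 64-byte cache line and, for the unpadded case, an object header of about 16 bytes; both numbers are assumptions for illustration, since the real object layout depends on the JVM. `PaddingMath` and its methods are hypothetical names.

```java
// Illustrative arithmetic only; real object layout depends on the JVM and its settings.
class PaddingMath {
    static final int CACHE_LINE_BYTES = 64;  // assumed line size
    static final int ASSUMED_HEADER_BYTES = 16; // assumed object header size

    // Bytes taken by the given number of long fields.
    static int fieldBytes(int longFields) {
        return longFields * Long.BYTES;
    }

    // The same field in two distinct objects of one class is at least one
    // object size apart. If that distance is below the line size, the two
    // fields may land in the same cache line; otherwise they never can.
    static boolean mayShareLine(int objectBytes) {
        return objectBytes < CACHE_LINE_BYTES;
    }

    public static void main(String[] args) {
        int padded = fieldBytes(15);                      // p plus 14 padding longs
        int bare = fieldBytes(1) + ASSUMED_HEADER_BYTES;  // p plus the assumed header
        System.out.println("padded: " + padded + " bytes, may share line: " + mayShareLine(padded));
        System.out.println("bare:   " + bare + " bytes, may share line: " + mayShareLine(bare));
    }
}
```

With 15 long fields the two `p` values are at least 120 bytes apart and cannot share a 64-byte line; with a single field the objects are small enough that both can end up in one line.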
Summary
- That's all for today's introduction. The main goal was to build an understanding of MESI.