Eventual consistency of the MESI cache protocol in the CPU: why the CPU needs a cache
2022-07-04 01:51:00 【Don't suffer me】
Preface
- Earlier in this series on locks we published 【How Java objects are laid out in memory】, 【What locks Java has】, and 【synchronized and volatile】. While analyzing how volatile prevents instruction reordering in the last of these, I came across an article on CPU cache coherence.
- volatile prevents instruction reordering by means of memory barriers. Its other feature, memory visibility, is implemented through the CPU's MESI protocol.
- When thread A flushes modified data back to main memory, the CPU also notifies the other cores that their copies of that data are invalid and must be fetched again.
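The visibility guarantee described above can be illustrated with a minimal sketch. The names `VolatileFlagDemo` and `readerSeesWrite` are made up for this illustration, and the sketch only shows the Java-level behaviour, not the cache traffic underneath it.

```java
// Illustrative sketch only; VolatileFlagDemo and readerSeesWrite are hypothetical names.
class VolatileFlagDemo {
    // volatile guarantees that the reader thread observes the write made below.
    private static volatile boolean stop = false;

    // Starts a spinning reader, sets the flag, and reports whether the reader terminated.
    static boolean readerSeesWrite() {
        stop = false;
        Thread reader = new Thread(() -> {
            while (!stop) {
                // busy-wait until the write to stop becomes visible
            }
        });
        reader.setDaemon(true); // do not keep the JVM alive if the reader never exits
        reader.start();
        try {
            Thread.sleep(50);   // give the reader time to start spinning
            stop = true;        // volatile write
            reader.join(2000);  // wait up to two seconds for the reader to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !reader.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("reader terminated: " + readerSeesWrite());
    }
}
```

If `volatile` is removed from `stop`, the JIT compiler is allowed to hoist the read out of the loop, and on many JVMs the reader may then spin forever.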
What is MESI
- MESI is the abbreviation of four words (Modified, Exclusive, Shared, Invalid). Each one is a state describing a copy of data held in a cache. Using MESI, we can look again at how volatile achieves memory visibility.
- The CPU does not cache just the data it needs; it caches in blocks (cache lines), and 64 bytes is typically the smallest unit. So when we modify a value in one place, the other values near it are invalidated as well, and other threads then have to synchronize their data. This is false sharing. MySQL is designed in a similar way: to make queries efficient, data is divided into pages as the smallest unit, with a page size of 16KB.
Why the CPU needs a cache
- Data inside the computer is also stored in blocks. Storing it this way means we cannot fetch it arbitrarily, because arbitrary access increases the number of interactions. The CPU is very fast, but memory cannot keep up with it, so fetching data in batches is the best approach.
- The same is true of reading bytes in network development. We normally read 1024 bytes at a time, which reduces the number of network interactions.
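The batching argument can be sketched by counting reads. In the example below the class and method names are hypothetical, and a `ByteArrayInputStream` stands in for a network stream: the same 4096 bytes are read once with a 1-byte buffer and once with a 1024-byte buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative sketch only; ChunkedReadSketch and countReads are hypothetical names.
class ChunkedReadSketch {
    // Returns how many read calls delivered data when draining the input with the given buffer size.
    static int countReads(byte[] data, int bufferSize) {
        byte[] buffer = new byte[bufferSize];
        int reads = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read(buffer) != -1) {
                reads++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reads;
    }

    public static void main(String[] args) {
        byte[] data = new byte[4096];
        System.out.println(countReads(data, 1));    // 4096 reads
        System.out.println(countReads(data, 1024)); // 4 reads
    }
}
```

A cache line plays the same role for the CPU that the larger buffer plays here: one transfer brings in neighbouring data that is likely to be needed soon.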
With MESI in the CPU, why does Java still need volatile
- First, to improve efficiency the Java virtual machine reorders instructions. Preventing that reordering is one of the features of volatile.
- What MESI guarantees is coherence for a single memory location at the level of a single CPU cache. volatile, however, is a guarantee about the operations of all CPUs. That is why volatile is still necessary.
In a typical system, several caches (in a multicore system, each core has its own cache) share the main memory bus. Each CPU issues read and write requests, and the purpose of the caches is to reduce the number of times the CPU reads and writes shared main memory.
- A cache line in any state other than Invalid can satisfy a CPU read request. An Invalid line must first be read from main memory (becoming S or E) before it can satisfy the read.
- A write request can be executed only if the cache line is in the M or E state. If the line is in the S state, the copies in the other caches must first be changed to Invalid (different CPUs are not allowed to modify the same cache line at the same time, even if they modify data at different positions within the line). This is usually done by broadcast, for example with a Request For Ownership (RFO).
- A cache may discard a line that is not in the M state at any time, that is, change it to Invalid. A line in the M state must be written back to main memory first.
- A cache line in the M state must snoop every attempt to read the corresponding main memory location. Such a read must be delayed until the cache has written the line back to main memory and changed its state to S.
- A cache line in the S state must snoop requests from other caches to invalidate the line or to take ownership of it, and must then become Invalid.
- A cache line in the E state must snoop reads of that line by other caches. Once such a read occurs, the line must change to S.
- The M and E states are always accurate: they match the true state of the cache line. The S state may be inaccurate. If another cache discards a line that is in the S state, this cache may in fact hold the only copy, but it does not promote the line to E. Other caches do not broadcast a notification when they discard a line, and since a cache does not keep a count of the copies of a line, it could not determine (even with such notifications) whether it holds the line exclusively.
- In this sense the E state is a speculative optimization: if a CPU wants to modify a cache line in the S state, a bus transaction is needed to change all other copies of the line to Invalid, whereas modifying a line in the E state requires no bus transaction.
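The transitions listed above can be condensed into a small state machine. This is a simplified sketch under stated assumptions, not a model of any real CPU: the class, enum, and method names are invented for illustration, write-backs are mentioned only in comments, and the imprecise S state is not modelled.

```java
// Simplified MESI sketch; all names here are hypothetical and chosen for illustration.
class MesiSketch {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }

    // The local CPU reads the line. othersHoldCopy matters only on a miss (INVALID).
    static State onLocalRead(State s, boolean othersHoldCopy) {
        if (s == State.INVALID) {
            return othersHoldCopy ? State.SHARED : State.EXCLUSIVE;
        }
        return s; // M, E and S satisfy the read directly
    }

    // The local CPU writes the line; after the write the line is always MODIFIED.
    static State onLocalWrite(State s) {
        return State.MODIFIED;
    }

    // A local write needs a bus transaction (such as an RFO) unless the line is already M or E.
    static boolean writeNeedsBusTransaction(State s) {
        return s == State.SHARED || s == State.INVALID;
    }

    // Another cache reads the line (observed by snooping).
    static State onRemoteRead(State s) {
        switch (s) {
            case MODIFIED:  // the line is written back to main memory first
            case EXCLUSIVE:
                return State.SHARED;
            default:
                return s;   // SHARED stays SHARED, INVALID stays INVALID
        }
    }

    // Another cache invalidates the line or requests ownership of it.
    // A MODIFIED line would be written back before being invalidated.
    static State onRemoteWrite(State s) {
        return State.INVALID;
    }

    public static void main(String[] args) {
        State s = onLocalRead(State.INVALID, false);     // miss, no other copies
        System.out.println(s);                           // EXCLUSIVE
        System.out.println(writeNeedsBusTransaction(s)); // false
        s = onLocalWrite(s);
        System.out.println(s);                           // MODIFIED
        System.out.println(onRemoteRead(s));             // SHARED
    }
}
```

The `writeNeedsBusTransaction` method captures the point of the last bullet: a write to an E line is free of bus traffic, while a write to an S line is not.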
Example
- As mentioned above, the CPU caches data in units of 64 bytes. Suppose two variables that multiple Java threads operate on sit in the same block. Then when one thread modifies variable a, another thread that operates on variable b is also drawn into data synchronization. Here is a piece of code provided by the expert Ma Shibing. I ran it locally, and it is fun.
import lombok.Data;

// Lombok's @Data generates the getters and setters (for example setP) used below.
@Data
class Store {
    // Seven longs (56 bytes) of padding before p ...
    private volatile long p1, p2, p3, p4, p5, p6, p7;
    private volatile long p;
    // ... and seven after it, so that two p fields do not share a cache line.
    private volatile long p8, p9, p10, p11, p12, p13, p14;
}

public class StoreRW {
    public static Store[] arr = new Store[2];
    public static long COUNT = 1_0000_0000L;

    static {
        arr[0] = new Store();
        arr[1] = new Store();
    }

    public static void main(String[] args) throws InterruptedException {
        // Each thread writes only to its own Store instance.
        final Thread t1 = new Thread(() -> {
            for (long i = 0; i < COUNT; i++) {
                arr[0].setP(i);
            }
        });
        final Thread t2 = new Thread(() -> {
            for (long i = 0; i < COUNT; i++) {
                arr[1].setP(i);
            }
        });
        final long start = System.currentTimeMillis();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        final long end = System.currentTimeMillis();
        // Elapsed time in milliseconds.
        System.out.println(end - start);
    }
}
- The code is simple: two threads continuously operate on two variables. Now suppose we remove the redundant fields from the object, so that Store keeps only the single field p.
@Data
class Store {
    private volatile long p;
}
- Running the program with this version, the time was basically stable at about 100 milliseconds. After adding back the 14 unrelated long fields, the program was stable at about 70 milliseconds. The running time depends on the configuration of the computer, but whatever the configuration, you should see a difference between the versions with and without the 14 variables.
- This comes down to the CPU's cache unit. With only one field, the two objects in the arr array are most likely in the same cache block, so when thread A operates on one object, thread B has to synchronize. With the 14 extra variables, the two objects in the arr array are guaranteed not to be in the same block.
- With the 14 variables added, the fields of one Store take up 15*8=120 bytes, so two Store objects cannot fit in the same 64-byte block however they are placed. The variable p also sits in the middle, with 56 bytes of padding on each side (assuming the JVM lays the fields out in declaration order). That is why this effect appears.
- Some people find this kind of padding unattractive, but it does improve performance. The JDK also provides an annotation for this purpose, @sun.misc.Contended; in my test, however, the improvement seemed smaller than with the 14 padding variables. Teacher Ma
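The layout argument can be checked with a little arithmetic. The sketch below is an illustration under loud assumptions: a 64-byte cache line, a 16-byte object header (including alignment), and the two Store objects allocated back to back. None of these is guaranteed by the JVM, and the names `PaddingMath` and `sharedAlignments` are hypothetical.

```java
// Illustrative arithmetic only; header size and back-to-back allocation are assumptions.
class PaddingMath {
    static final int LINE_SIZE = 64; // assumed cache line size in bytes
    static final int HEADER = 16;    // assumed object header size, including alignment

    // Counts the 8-byte-aligned start offsets within a line at which two fields
    // that lie `distance` bytes apart end up in the same cache line.
    static int sharedAlignments(int distance) {
        int shared = 0;
        for (int base = 0; base < LINE_SIZE; base += 8) {
            if (base / LINE_SIZE == (base + distance) / LINE_SIZE) {
                shared++;
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        int unpadded = HEADER + 8;    // 24 bytes between the two p fields
        int padded = HEADER + 15 * 8; // 136 bytes between the two p fields
        System.out.println(sharedAlignments(unpadded)); // 5 (of 8 possible alignments)
        System.out.println(sharedAlignments(padded));   // 0
    }
}
```

Under these assumptions the unpadded p fields share a line in 5 of the 8 possible alignments, while the padded ones never do, because any distance of 64 bytes or more rules out sharing a 64-byte line.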
Summary
- That's all for today. The main goal was to build an understanding of MESI.
Author: zxhtom
Link: https://juejin.cn/post/7064365186425028621