Eventual consistency of the MESI cache protocol in the CPU -- why does the CPU need a cache
2022-07-04 01:51:00 【Don't suffer me】
Preface
- We have already published the lock series: 【How Java objects are laid out in memory】, 【What locks Java has】, and 【synchronized and volatile】. While analyzing how volatile prevents instruction reordering in that last chapter, I came across an article on CPU cache coherence.
- volatile forbids instruction reordering by means of memory barriers. Its other property, memory visibility, is implemented on top of the CPU's MESI protocol.
- When thread A flushes modified data back to main memory, the CPU simultaneously notifies the other threads that their copies of that data are invalid and must be fetched again.
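The visibility behavior described above can be sketched in Java. This is a minimal illustration, not code from the original article: the class name `VolatileFlagDemo`, the field `running`, and the 50 ms / 2 s timings are all assumptions chosen for the demo.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names and timings are illustrative assumptions.
class VolatileFlagDemo {
    // Declared volatile so a write by one thread becomes visible to the spinning reader.
    private static volatile boolean running = true;

    /** Starts a spinning reader, clears the flag, and reports whether the reader stopped. */
    static boolean readerSeesUpdate() {
        running = true;
        Thread reader = new Thread(() -> {
            while (running) {
                // busy-wait until the writer's update becomes visible
            }
        });
        reader.setDaemon(true); // never keep the JVM alive because of this demo thread
        reader.start();
        try {
            TimeUnit.MILLISECONDS.sleep(50); // let the reader spin for a while
            running = false;                 // volatile write by the "A" thread
            reader.join(2_000);              // wait at most two seconds for the reader
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !reader.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("reader stopped: " + readerSeesUpdate());
    }
}
```

If `running` were a plain field, the Java memory model would no longer require the reader to observe the update, and the loop might spin forever on some JVMs.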
What is MESI

- MESI is an abbreviation of four words: Modified, Exclusive, Shared and Invalid. Each one is a state describing the copy of the data held on behalf of a thread. We will use MESI to look again at how volatile achieves memory visibility.

- The CPU does not cache only the individual values it needs; it caches in blocks (cache lines), and the smallest unit here is 64 bytes. So when we modify one value, the other values near it in the same block are invalidated as well, which in turn forces other threads to synchronize their data. This is false sharing. MySQL is designed in a similar way: to make queries convenient, data is divided into pages as the smallest unit, with a page size of 16KB.
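To make the block granularity concrete, here is a small sketch that maps byte addresses to cache-line indices. It assumes 64-byte lines, which is typical of current x86-64 CPUs but is not guaranteed on every processor; the class and method names are hypothetical.

```java
// Hypothetical helper; CACHE_LINE_BYTES = 64 is an assumption about the target CPU.
class CacheLineMath {
    static final int CACHE_LINE_BYTES = 64;

    /** Index of the cache line that contains the given non-negative byte address. */
    static long lineIndex(long address) {
        return address / CACHE_LINE_BYTES;
    }

    /** True when both addresses fall into the same cache line. */
    static boolean sameLine(long a, long b) {
        return lineIndex(a) == lineIndex(b);
    }

    public static void main(String[] args) {
        // Two adjacent longs at offsets 0 and 8 share a line: writing one invalidates the other.
        System.out.println(sameLine(0, 8));
        // Offsets 0 and 64 are in different lines, so they cannot falsely share.
        System.out.println(sameLine(0, 64));
    }
}
```

Two variables falsely share exactly when `sameLine` is true for their addresses and different cores write to them.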
Why the CPU needs a cache
- Data inside the computer is also stored in blocks. Storing it this way means we cannot fetch it arbitrarily, because arbitrary access would increase the number of interactions. The CPU is very fast, but memory cannot keep up with it, so fetching data in packaged blocks is the best approach.
- The same applies to reading bytes in network programming. We normally read 1024 bytes at a time, which reduces the number of network interactions.
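The effect of reading in blocks can be sketched by counting read calls. This example uses an in-memory `ByteArrayInputStream` as a stand-in for a network stream, which is an assumption made to keep it self-contained; the class name `ChunkedReadDemo` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo; an in-memory stream stands in for a real socket.
class ChunkedReadDemo {
    /** Counts how many read calls are needed to drain the data with the given buffer size. */
    static int countReads(byte[] data, int bufferSize) {
        int calls = 0;
        byte[] buffer = new byte[bufferSize];
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read(buffer) != -1) {
                calls++; // one interaction with the underlying source
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        byte[] data = new byte[4096];
        System.out.println("1-byte buffer:    " + countReads(data, 1));
        System.out.println("1024-byte buffer: " + countReads(data, 1024));
    }
}
```

With a real socket each call may return fewer bytes than the buffer holds, so the counts are only a lower bound there, but the trend is the same: larger blocks mean fewer interactions.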
The CPU has MESI, so why does Java still need volatile
- First, the Java virtual machine reorders instructions to improve efficiency. Preventing that reordering is one of the features of volatile.
- The CPU's MESI protocol only guarantees that a single location stays coherent as seen by a single CPU, whereas volatile governs the operations of all CPUs. That is why volatile is still necessary.
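The ordering side of volatile can be sketched as follows. The example relies on the Java memory model's rule that a write to a volatile field happens-before every later read of that field; the class name `PublicationDemo` and the value 42 are assumptions for illustration.

```java
// Hypothetical demo of publishing plain data through a volatile guard.
class PublicationDemo {
    private static int payload;            // plain field, written before the guard
    private static volatile boolean ready; // volatile guard

    /** Publishes a payload through the volatile guard and returns what the reader saw. */
    static int publishAndRead() {
        payload = 0;
        ready = false;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the volatile write is observed
            }
            seen[0] = payload; // ordered after the volatile read of ready
        });
        reader.start();
        payload = 42; // 1. plain write
        ready = true; // 2. volatile write: the plain write above may not be moved after it
        try {
            reader.join(); // join makes seen[0] safely readable here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw payload = " + publishAndRead());
    }
}
```

Without volatile on `ready`, the compiler or CPU would be free to reorder the two writes, and the reader could observe `ready == true` while still seeing the old payload.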
In a typical system, several caches (in a multicore system, each core has its own cache) share the bus to main memory. Each cache has a CPU attached that issues read and write requests, and the purpose of the caches is to reduce the number of times the CPUs read and write the shared main memory.
- A cache line in any state other than Invalid can satisfy a CPU read request. An Invalid line must first be fetched from main memory (moving to the S or E state) to satisfy a read.
- A write can only be performed if the cache line is in the M or E state. If the line is in the S state, the copies of that line in all other caches must first be changed to Invalid. (Different CPUs are not allowed to modify the same cache line at the same time, even if they modify data at different locations within the line.) This is usually done with a broadcast, for example a Request For Ownership (RFO).
- A cache may discard a line that is not in the M state at any time, changing it to Invalid. A line in the M state must be written back to main memory first.
- A cache holding a line in the M state must snoop every attempt by other caches to read the corresponding main memory location. Such a read must be delayed until the cache has written the line back to main memory and changed its state to S.
- A cache holding a line in the S state must also listen for requests from other caches to invalidate or take ownership of the line, and move the line to Invalid when one arrives.
- A cache holding a line in the E state must also snoop other caches reading that line from main memory. As soon as such a read occurs, the line must change to the S state.
- The M and E states are always precise: they match the true ownership of the cache line. The S state may be imprecise. If another cache discards a line that is in the S state, this cache may in fact have become the sole owner of the line, but it will not promote the line to the E state. Other caches do not broadcast a notice when they discard a line, and since this cache does not keep a count of the copies, it could not tell whether it holds the line exclusively even if such notices existed.
- In this sense the E state is an opportunistic optimization: if a CPU wants to modify a cache line in the S state, a bus transaction is needed to change all other copies of the line to Invalid, whereas modifying a line in the E state requires no bus transaction.
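The transition rules listed above can be sketched as a tiny simulator that tracks the state of one cache line in each of several caches. This is a simplified model under stated assumptions: the class `MesiSimulator` and its methods are hypothetical, write-backs are only noted in comments, and bus timing is ignored.

```java
import java.util.Arrays;

// Hypothetical simulator: tracks one cache line across N caches.
class MesiSimulator {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }

    private final State[] lines; // lines[i] is the line's state in cache i

    MesiSimulator(int caches) {
        lines = new State[caches];
        Arrays.fill(lines, State.INVALID);
    }

    State stateOf(int cpu) {
        return lines[cpu];
    }

    /** CPU read: M, E and S hit locally; an Invalid line is fetched as S or E. */
    void read(int cpu) {
        if (lines[cpu] != State.INVALID) {
            return;
        }
        boolean othersHoldCopy = false;
        for (int i = 0; i < lines.length; i++) {
            if (i == cpu || lines[i] == State.INVALID) {
                continue;
            }
            // A snooping M line is written back first; M and E both drop to S.
            lines[i] = State.SHARED;
            othersHoldCopy = true;
        }
        lines[cpu] = othersHoldCopy ? State.SHARED : State.EXCLUSIVE;
    }

    /** CPU write: allowed directly in M or E; otherwise other copies are invalidated first. */
    void write(int cpu) {
        if (lines[cpu] != State.MODIFIED && lines[cpu] != State.EXCLUSIVE) {
            // Models the Request For Ownership broadcast.
            for (int i = 0; i < lines.length; i++) {
                if (i != cpu) {
                    lines[i] = State.INVALID;
                }
            }
        }
        lines[cpu] = State.MODIFIED;
    }

    /** Discards the line; an M line would be written back before this point. */
    void evict(int cpu) {
        lines[cpu] = State.INVALID;
    }

    public static void main(String[] args) {
        MesiSimulator sim = new MesiSimulator(2);
        sim.read(0);  // cache 0 becomes EXCLUSIVE
        sim.read(1);  // both become SHARED
        sim.write(0); // cache 0 MODIFIED, cache 1 INVALID
        System.out.println(sim.stateOf(0) + " " + sim.stateOf(1));
    }
}
```

Calling `evict(0)` while both caches hold the line in S leaves cache 1 in S rather than E, which is the imprecision of the Shared state described in the list.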
Case study
- As introduced above, the CPU's cache unit is 64 bytes. If two variables manipulated by Java threads sit in the same block, then when one thread modifies variable a, another thread operating on variable b is also involved in data synchronization. Here is a piece of code provided by Ma Shibing (Teacher Ma). I ran it locally, and it is fun to try.
import lombok.Data;

// Requires Lombok: @Data generates setP() and the other accessors.
@Data
class Store {
    // Seven longs (56 bytes) of padding in front of p
    private volatile long p1, p2, p3, p4, p5, p6, p7;
    // The only field the threads actually write
    private volatile long p;
    // Seven longs (56 bytes) of padding behind p
    private volatile long p8, p9, p10, p11, p12, p13, p14;
}

public class StoreRW {
    public static Store[] arr = new Store[2];
    public static long COUNT = 1_0000_0000L;

    static {
        arr[0] = new Store();
        arr[1] = new Store();
    }

    public static void main(String[] args) throws InterruptedException {
        // Thread 1 only ever writes arr[0].p
        final Thread t1 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (long i = 0; i < COUNT; i++) {
                    arr[0].setP(i);
                }
            }
        });
        // Thread 2 only ever writes arr[1].p
        final Thread t2 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (long i = 0; i < COUNT; i++) {
                    arr[1].setP(i);
                }
            }
        });
        final long start = System.currentTimeMillis();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        final long end = System.currentTimeMillis();
        // Elapsed wall-clock time in milliseconds
        System.out.println(end - start);
    }
}
- The code is simple: two threads keep writing to two variables. Now suppose we remove the redundant fields from the object, so that Store keeps only the single field p:
@Data
class Store {
    private volatile long p;
}
- With only p, the program ran fairly steadily at about 100 milliseconds. After adding the 14 unrelated long fields, it ran steadily at about 70 milliseconds. The exact running time depends on the machine's configuration, but whatever the configuration, you should be able to see a difference between the version with the 14 variables and the version without them.
- This comes down to the CPU's cache unit. With only one field, the two objects in the arr array are most likely in the same cache block, so when thread A operates on one object, thread B has to synchronize. Adding the 14 variables guarantees that the two objects in the arr array are not in the same block.
- With the 14 variables added, one Store takes up 15*8=120 bytes of fields. So however the two Store objects are placed, their p fields cannot be in the same 64-byte block, because p still sits in the middle with 56 bytes of padding on each side. That is why this effect appears.
- Some people find this kind of code unattractive, but it does improve performance. The JDK also provides an annotation for this purpose, @sun.misc.Contended. When I tested it, though, the performance gain seemed smaller than with the 14 variables. One thing worth checking when reproducing this: in JDK 8 the annotation is honored on application classes only when the JVM runs with -XX:-RestrictContended. Teacher Ma
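The padding arithmetic above can be checked with a few lines of Java. This is a sketch under assumptions: 64-byte cache lines, fields laid out in declaration order (which HotSpot does not strictly guarantee), and the object header ignored because it only increases the distance. The class and method names are hypothetical.

```java
// Hypothetical helper for the padding arithmetic; the layout assumptions are stated above.
class PaddingMath {
    static final int CACHE_LINE_BYTES = 64; // assumption about the target CPU

    /**
     * Smallest possible distance in bytes between the p fields of two distinct objects:
     * p itself, plus the trailing padding of one object, plus the leading padding of the next.
     */
    static int minDistanceBetweenP(int paddingLongsPerSide) {
        return Long.BYTES + 2 * paddingLongsPerSide * Long.BYTES;
    }

    /** Two 8-byte-aligned fields can share a line only if they are less than one line apart. */
    static boolean canShareLine(int paddingLongsPerSide) {
        return minDistanceBetweenP(paddingLongsPerSide) < CACHE_LINE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("7 longs per side: distance >= " + minDistanceBetweenP(7)
                + ", can share: " + canShareLine(7));
        System.out.println("no padding:       distance >= " + minDistanceBetweenP(0)
                + ", can share: " + canShareLine(0));
    }
}
```

With seven longs on each side the two p fields are at least 120 bytes apart, which is more than one cache line, so they cannot falsely share regardless of where the objects are allocated.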
Summary
- That's all for today's introduction, which was mainly about understanding MESI.
Author: zxhtom
Link: https://juejin.cn/post/7064365186425028621