Understanding and thinking about multi-core cache coherence
2022-06-13 02:11:00 【Code changes the world CTW】
Some questions
Many blog posts on the Internet bring up MESI and MOESI the moment multi-core cache coherence is mentioned, and then go straight into how those protocols work. But ask yourself: is MESI really the part you don't understand? Do you really need to learn MESI? What you are missing is most likely the architecture, not the details of some protocol!
If you still want to learn MESI, here are a few questions:
(1) Does the ARM architecture really use MESI?
(2) MESI is a protocol, so who maintains it? Some piece of hardware has to implement it. Is it in the ARM core? In CCI-400? In the SCU? In the DSU?
(3) MESI has four states. Where are they recorded?
How to maintain multi-core coherence
There are three mechanisms for maintaining coherence:
- Disabling caches is the simplest mechanism, but it can significantly reduce CPU performance. To reach peak performance, a processor is pipelined to run at a high frequency and runs from caches that provide very low latency. Caching data that is accessed multiple times significantly improves performance and reduces DRAM accesses and power consumption. Marking data as "non-cached" can therefore hurt both performance and power consumption.
- Software-managed coherence is the traditional solution to the data-sharing problem. Here the software (usually a device driver) must clean or flush dirty data from the caches and invalidate stale data in order to share it with other processors or masters in the system. This costs processor cycles, bus bandwidth, and power.
- Hardware-managed coherence offers an alternative that simplifies software. With this solution, any cached data marked as "shared" is always kept up to date automatically. All processors and bus masters in the shared domain see exactly the same value.
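To make the second mechanism concrete, here is a minimal sketch of what "clean, then invalidate" buys us. It is a toy model, not how real ARM cache maintenance works: the `WriteBackCache` class and the `producer` and `consumer` names are hypothetical and exist only for this illustration.

```python
class WriteBackCache:
    """Toy write-back cache in front of a shared memory dict (hypothetical model)."""

    def __init__(self, memory):
        self.memory = memory      # shared "DRAM": address -> value
        self.lines = {}           # cached copies: address -> value
        self.dirty = set()        # addresses modified but not yet written back

    def read(self, addr):
        if addr not in self.lines:            # miss: fetch from memory
            self.lines[addr] = self.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value              # write-back: memory is not updated yet
        self.dirty.add(addr)

    def clean(self, addr):
        """Write a dirty line back to memory (the 'clean' or 'flush' step)."""
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)

    def invalidate(self, addr):
        """Drop the cached copy so the next read refetches from memory."""
        self.lines.pop(addr, None)
        self.dirty.discard(addr)


memory = {0x4000_0000: 1}
producer = WriteBackCache(memory)
consumer = WriteBackCache(memory)

consumer.read(0x4000_0000)            # consumer now holds the old value 1
producer.write(0x4000_0000, 2)        # new value sits only in the producer's cache

stale = consumer.read(0x4000_0000)    # without maintenance the consumer sees old data

producer.clean(0x4000_0000)           # software step 1: write the dirty line back
consumer.invalidate(0x4000_0000)      # software step 2: discard the stale copy
fresh = consumer.read(0x4000_0000)    # refetched from memory

print(stale, fresh)                   # prints: 1 2
```

Every one of those explicit `clean` and `invalidate` calls is work the software has to remember to do, which is exactly the cost in cycles, bandwidth, and power mentioned above.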
However, we use the third option, hardware-managed coherence. That means that as software engineers we don't have to worry about anything, because someone else does the work for us. Even so, we would still like to understand how the hardware does it.
Before getting to the principle, let's set up a scenario:
Suppose a thread is running under an operating system and keeps operating on the memory at address 0x4000_0000 (so of course we expect it to always hit in the cache). Because of system scheduling, the thread may run on cpu0 this time, on cpu1 the next time, and on cpu4 the time after that (this behavior is known as CPU migration).
Or take this scenario:
The Linux kernel defines a global variable, and multiple kernel threads (on multiple CPUs) access it.
In both scenarios, one piece of memory (such as the memory at address 0x4000_0000) is accessed by different ARM cores. The copies of that data in main memory, in the L2 cache of SCU-0, in the L2 cache of SCU-1, and in the L1 caches of the 8 cores can then become inconsistent.
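The migration scenario can be played out in a few lines. This is only a sketch under simplifying assumptions: it models the 8 private L1 caches as plain dicts, ignores the L2 caches entirely, and has no coherence mechanism at all. The `increment_on` helper is a hypothetical name.

```python
ADDR = 0x4000_0000
main_memory = {ADDR: 0}
l1 = {cpu: {} for cpu in range(8)}   # one private L1 per core, no coherence at all

def increment_on(cpu):
    """Run one step of the thread on `cpu`: read-modify-write through its private L1."""
    cache = l1[cpu]
    if ADDR not in cache:
        cache[ADDR] = main_memory[ADDR]   # miss: fill from main memory
    cache[ADDR] += 1                      # write-back cache: memory is left untouched
    return cache[ADDR]

# The scheduler migrates the thread: cpu0, cpu0, cpu1, cpu4.
seen = [increment_on(cpu) for cpu in (0, 0, 1, 4)]
print(seen)                 # prints: [1, 2, 1, 1]  (the thread expected 1, 2, 3, 4)
print(main_memory[ADDR])    # prints: 0
```

After four increments there are four different opinions about the value at 0x4000_0000: 2 in cpu0's cache, 1 in cpu1's and cpu4's, and 0 in main memory.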
Since the data in memory and in the different caches can disagree, the problem has to be solved (this is what maintaining coherence means). So how is it maintained? As stated above, we use hardware-managed coherence, so let's go straight to the answer and see how the hardware does it.
Let's first look at an older diagram (the big.LITTLE architecture):
- The coherence of core 0, core 1, and the SCU-0 cache is maintained by SCU-0. (Does it follow the MESI protocol here? Answer: yes.)
- The coherence of SCU-0, SCU-1, and main memory is maintained by CCI-400. (Does it follow the MESI protocol here? Answer: I don't know either.)

Now look at a newer diagram (the DynamIQ architecture):
- The coherence of the core 0 L2 cache, the core 1 L2 cache, and the DSU-0 L3 cache is maintained by DSU-0. (Does it follow the MESI protocol here? Answer: yes.)
- The coherence of the DSU-0 L3 cache, the DSU-1 L3 cache, the Mali L2 cache, the system cache, and main memory is maintained by CCI-550. (Does it follow the MESI protocol here? Answer: I don't know either.)
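One way to read the DynamIQ bullets is as a tree: the block that keeps two agents coherent is the first block that sits above both of them. The sketch below encodes that reading. The agent labels are hypothetical, and the assignment of `core2` and `core3` to DSU-1 is an assumption made only to have something under DSU-1; the text above does not say which cores it contains.

```python
# Parent links of the DynamIQ-style topology described above.
# Agent names are hypothetical labels chosen for this sketch.
PARENT = {
    "core0": "DSU-0",
    "core1": "DSU-0",
    "core2": "DSU-1",   # assumption: not stated in the text
    "core3": "DSU-1",   # assumption: not stated in the text
    "DSU-0": "CCI-550",
    "DSU-1": "CCI-550",
    "Mali": "CCI-550",
}

def path_to_root(agent):
    """Return the agent followed by every block above it, up to the interconnect."""
    path = [agent]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def coherence_manager(a, b):
    """First block that sits above both agents: the one keeping them coherent here."""
    above_b = set(path_to_root(b)[1:])
    for block in path_to_root(a)[1:]:
        if block in above_b:
            return block
    raise ValueError(f"{a} and {b} share no coherent block")

print(coherence_manager("core0", "core1"))   # prints: DSU-0
print(coherence_manager("core0", "core2"))   # prints: CCI-550
print(coherence_manager("core1", "Mali"))    # prints: CCI-550
```

Seen this way, question (2) from the beginning has a layered answer: which hardware block enforces coherence depends on which two agents are sharing the data.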

Introduction to MESI and MOESI
Since you insist on learning MESI and MOESI, we can't avoid introducing them. Following in the experts' footsteps, the material below is copied from the Internet (note: that's not called copying, that's called learning).
First, the Modified Exclusive Shared Invalid (MESI) protocol defines 4 states:
| MESI State | Definition |
|---|---|
| Modified (M) | The line is valid and has been modified, so it differs from the data in memory. The data exists only in this cache. |
| Exclusive (E) | The line is valid and matches the data in memory. The data exists only in this cache. |
| Shared (S) | The line is valid and matches the data in memory. Multiple caches hold a copy of this line. |
| Invalid (I) | The line is invalid. |
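The table implies rules about which combinations of states one cache line may have across several caches. A small checker makes them explicit. This is a sketch derived only from the definitions in the table; `is_legal` and `memory_is_up_to_date` are hypothetical helper names.

```python
from enum import Enum

class MESI(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"

def is_legal(states):
    """Check the states one cache line holds across all caches against the table."""
    valid = [s for s in states if s is not MESI.I]
    owners = [s for s in valid if s in (MESI.M, MESI.E)]
    # M and E both mean "only in this cache", so no other valid copy may exist.
    if owners:
        return len(valid) == 1
    return True   # any number of Shared copies (or none at all) is fine

def memory_is_up_to_date(states):
    """Memory matches the caches unless some cache holds the line Modified."""
    return MESI.M not in states

print(is_legal([MESI.M, MESI.I, MESI.I]))   # prints: True
print(is_legal([MESI.M, MESI.S]))           # prints: False
print(is_legal([MESI.S, MESI.S, MESI.I]))   # prints: True
```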
Second, many ARM cores define a fifth state, Owned (O), described as Shared Modified, which turns the protocol into MOESI. ARM actually uses a variant of MOESI (what the variant is, how it differs, and what changed, the documentation does not say).
Next, a state diagram shows how a line moves between the four MESI states.
Transitions between MESI states:
Events:
RH = Read Hit
RMS = Read miss, shared
RME = Read miss, exclusive
WH = Write hit
WM = Write miss
SHR = Snoop hit on read
SHI = Snoop hit on invalidate
LRU = LRU replacement
Bus Transactions:
Push = Write cache line back to memory
Invalidate = Broadcast invalidate
Read = Read cache line from memory
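The events and bus transactions in the legend can be turned into a transition table. What follows is a sketch of the common textbook MESI behavior, not a transcription of the diagram, and not a claim about what any ARM core implements. In particular, modeling a write miss as Read plus Invalidate is an assumption, and `TRANSITIONS`, `step`, and `run` are hypothetical names.

```python
# (current state, event) -> (next state, bus transactions issued by this cache)
TRANSITIONS = {
    ("I", "RMS"): ("S", ["Read"]),
    ("I", "RME"): ("E", ["Read"]),
    ("I", "WM"):  ("M", ["Read", "Invalidate"]),   # assumption: fetch, then invalidate others
    ("S", "RH"):  ("S", []),
    ("S", "WH"):  ("M", ["Invalidate"]),
    ("S", "SHR"): ("S", []),
    ("S", "SHI"): ("I", []),
    ("S", "LRU"): ("I", []),
    ("E", "RH"):  ("E", []),
    ("E", "WH"):  ("M", []),                       # silent upgrade: no other copy exists
    ("E", "SHR"): ("S", []),
    ("E", "SHI"): ("I", []),
    ("E", "LRU"): ("I", []),
    ("M", "RH"):  ("M", []),
    ("M", "WH"):  ("M", []),
    ("M", "SHR"): ("S", ["Push"]),                 # dirty data must reach memory first
    ("M", "SHI"): ("I", ["Push"]),
    ("M", "LRU"): ("I", ["Push"]),
}

def step(state, event):
    """Apply one event to one cache line and return (next_state, bus_transactions)."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event} cannot occur in state {state}") from None

def run(events, state="I"):
    """Feed a sequence of events to a line that starts in `state`."""
    log = []
    for event in events:
        state, bus = step(state, event)
        log.append((event, state, bus))
    return state, log

final, log = run(["RME", "WH", "SHR", "SHI"])
for entry in log:
    print(entry)
```

The example walks one line through I → E → M → S → I: an exclusive read miss, a local write, another cache's read (which forces a Push), and finally another cache's invalidate.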
Summary
After reading the material above, let's sum up. When we study cache coherence, what is our biggest confusion or bottleneck? Is it that we don't understand MESI? More likely it is still our understanding of the architecture. Rather than studying MESI, it is better to study how the DSU and CCI-550 work.