Study notes on server SMP, NUMA, and MPP systems
2022-07-06 23:37:00 【Galloping tortoise】
1. Background
For years, commercial processor vendors concentrated on single-core designs, and single-core performance was pushed close to its limit. Simply raising the clock speed of a single-core chip generates too much heat without a matching gain in performance, yet the demand for CPU performance keeps growing faster than CPUs themselves improve.
Clock frequency can be raised by deepening the pipeline, but the extra buffering and poorly controlled leakage current cause power consumption to rise sharply, and the resulting chip may even perform worse than an earlier, lower-frequency CPU. Higher CPU power also makes heat dissipation a more serious problem, to the point where air cooling is no longer sufficient.
This led to a new approach: the multi-core processor. The first multi-core CPU prototype, Hydra, appeared as early as 1996. In 2001 IBM launched POWER4, the first commercial multi-core processor, and in 2005 Intel and AMD began shipping multi-core processors on a large scale.
Multi-core processors are now increasingly common and are widely used in servers, desktops, netbooks, tablets, and mobile phones, as well as in medical devices, defense, and aerospace.
2. The development of multi-core processors
2.1 Classification by architecture
- Homogeneous multi-core architecture: the processors in the system share the same architecture.
- Heterogeneous multi-core architecture: the processors in the system differ in architecture.
A homogeneous multi-core architecture is relatively simple in both hardware and software design and is highly general-purpose.
Heterogeneous multi-core processors include TI's DaVinci platform DM6000 series (ARM9 + DSP), Xilinx's Zynq-7000 series (dual-core Cortex-A9 + FPGA), and the Cell processor (one 64-bit PowerPC core + eight 32-bit coprocessors), among others.
Homogeneous multi-core processors include the Exynos 4412, the Freescale i.MX6 Dual and Quad series, TI's OMAP4460, and Intel's Core Duo and Core 2 Duo.
2.2 Classification by operating mode
From the software point of view, multi-core processors commonly run in one of two modes:
- AMP (asymmetric multiprocessing)
- SMP (symmetric multiprocessing)
2.2.1 SMP mode
The SMP (Symmetric Multi-Processing) operating system architecture is one variant of multi-core processing: a single operating system instance controls all processors, and all processors share memory. In AMP mode each CPU runs its own operating system instance; in SMP mode, by contrast, all CPUs in the system are peers that run one operating system instance together and share system memory and peripherals. Compared with AMP, an SMP operating system offers shared memory, a good performance-to-power ratio, and easy load balancing, so it can make full use of the hardware advantages of a multi-core processor.
In Figure 2-3, the SMP operating system coordinates the work of the two processor cores, and both cores share the same operating system instance in main memory. Although the applications on the two cores use the same addresses, the MMU maps them to different locations in main memory, which isolates the code and data spaces of the two applications from each other.
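The address isolation described above can be illustrated with a toy model. The sketch below is not how a real MMU or any particular operating system is implemented; the class name `ToyMMU`, the 4 KiB page size, and the sample address are assumptions made only for illustration. It shows two applications using the same virtual address while ending up at different physical locations.

```python
# Toy model of per-process address translation. All names here are
# hypothetical and chosen for illustration only.
PAGE_SIZE = 4096  # assumed page size in bytes


class ToyMMU:
    """Maps (process id, virtual page number) to a physical frame number."""

    def __init__(self):
        self.page_tables = {}      # pid -> {virtual page number: frame number}
        self.next_free_frame = 0   # frames are handed out in increasing order

    def map_page(self, pid, virtual_addr):
        """Give `pid` a private frame for the page containing virtual_addr."""
        vpn = virtual_addr // PAGE_SIZE
        table = self.page_tables.setdefault(pid, {})
        if vpn not in table:
            table[vpn] = self.next_free_frame
            self.next_free_frame += 1
        return table[vpn]

    def translate(self, pid, virtual_addr):
        """Translate a virtual address of `pid` into a physical address."""
        vpn, offset = divmod(virtual_addr, PAGE_SIZE)
        frame = self.page_tables[pid][vpn]  # a KeyError stands in for a page fault
        return frame * PAGE_SIZE + offset


mmu = ToyMMU()
VADDR = 0x400000  # both applications use the same virtual address
mmu.map_page(pid=1, virtual_addr=VADDR)
mmu.map_page(pid=2, virtual_addr=VADDR)

phys_a = mmu.translate(1, VADDR + 0x10)
phys_b = mmu.translate(2, VADDR + 0x10)
print(hex(phys_a), hex(phys_b))  # same virtual address, different physical ones
```

Because each process has its own page table, neither application can reach the other's code or data through its own addresses.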
2.2.2 AMP mode
In AMP (Asymmetric Multi-Processing) mode, an RTOS runs one operating system instance on each CPU (the instances are not necessarily identical). Each operating system has its own dedicated memory, and the instances communicate with each other through limited access to shared memory. An AMP operating system structure requires the user to take part in allocating system resources. RTOSes of this type are rarely used; among commercial real-time operating systems, only Wind River's VxWorks provides an AMP-mode configuration.
Figure 2-4 shows a typical AMP system structure. Each CPU runs one operating system instance, and each operating system has its own exclusive resources (at a minimum, its own CPU). Other resources may be shared by the two systems or dedicated to one of them. The allocation is decided by the user and is therefore visible to the user.
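Since the user decides how resources are divided in AMP mode, a simple consistency check on that division can be useful. The following is a minimal sketch under assumed conventions: the function name, the instance names, and the resource names are all hypothetical and do not correspond to any VxWorks configuration format.

```python
def validate_amp_partition(partition):
    """Find resources claimed exclusively by more than one OS instance.

    partition: dict mapping an OS instance name to the set of resource
               names it owns exclusively.
    Returns a list of (resource, first_owner, second_owner) conflicts.
    """
    owners = {}
    conflicts = []
    for os_name, resources in partition.items():
        for res in sorted(resources):  # sorted for a deterministic report
            if res in owners:
                conflicts.append((res, owners[res], os_name))
            else:
                owners[res] = os_name
    return conflicts


# Hypothetical two-core layout: each instance owns its CPU and some devices.
good_layout = {
    "rtos_a": {"cpu0", "uart0", "ram_low"},
    "rtos_b": {"cpu1", "eth0", "ram_high"},
}
# The same layout with uart0 mistakenly assigned to both instances.
bad_layout = {
    "rtos_a": {"cpu0", "uart0", "ram_low"},
    "rtos_b": {"cpu1", "eth0", "ram_high", "uart0"},
}

print(validate_amp_partition(good_layout))
print(validate_amp_partition(bad_layout))
```

Resources that are intentionally shared (such as a mailbox region in shared memory) would be tracked separately and left out of the exclusive sets.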
2.2.3 Summary of SMP and AMP features
SMP: a single operating system instance runs across multiple CPUs, every CPU has the same architecture, and memory and other resources are shared. The defining feature of this structure is that all resources are shared.
AMP: there are multiple CPUs, which may differ in architecture. Each CPU core runs a separate operating system or a separate instance of the same operating system, and each CPU has its own independent resources. The defining feature of this structure is that resources are, for the most part, not shared.
3. The development of server architecture
3.1 Classification by architecture
From the perspective of system architecture, current commercial servers fall roughly into three categories:
- Symmetric multiprocessor architecture (SMP: Symmetric Multi-Processor)
- Non-uniform memory access architecture (NUMA: Non-Uniform Memory Access)
- Massively parallel processing architecture (MPP: Massively Parallel Processing)
Shared-memory multiprocessors come in two models:
- Uniform memory access (Uniform-Memory-Access, UMA) model
- Non-uniform memory access (Nonuniform-Memory-Access, NUMA) model
3.1.1 SMP (Symmetric Multi-Processor)
In a symmetric multiprocessor architecture, multiple CPUs work as peers, with no primary/secondary or master/slave relationship among them. All CPUs share the same physical memory, and every CPU takes the same time to access any memory address, which is why SMP is also called a uniform memory access (UMA) structure. An SMP server can be scaled by adding memory, using faster CPUs, adding CPUs, expanding I/O (more slots and buses), and adding peripherals (usually disk storage).
The main characteristic of an SMP server is sharing: all resources in the system (CPUs, memory, I/O, and so on) are shared. This same characteristic causes the main problem of SMP servers, namely very limited scalability.
In an SMP server every shared component can become a bottleneck when scaling, and memory is the most constrained. Every CPU must reach the same memory through the same memory bus, so memory access conflicts grow rapidly as the number of CPUs increases. CPU resources end up wasted and the effective CPU performance drops sharply. Experiments show that an SMP server uses its CPUs most efficiently with 2 to 4 CPUs.
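As a rough illustration of why a shared memory bus caps SMP scaling, the sketch below uses a deliberately simple saturation model. It assumes, purely for illustration, that each CPU needs the bus for a fixed fraction of its running time, so the bus can keep at most 1/fraction CPUs busy. The function name and the 25% figure are hypothetical and are not taken from the experiments mentioned above.

```python
def smp_effective_cpus(num_cpus, bus_fraction):
    """Upper bound on the number of CPUs doing useful work.

    bus_fraction is the assumed share of each CPU's running time spent on
    the shared memory bus (0 < bus_fraction <= 1). Once the bus is
    saturated, additional CPUs only wait.
    """
    if num_cpus < 1 or not 0 < bus_fraction <= 1:
        raise ValueError("need num_cpus >= 1 and 0 < bus_fraction <= 1")
    return min(float(num_cpus), 1.0 / bus_fraction)


# With an assumed 25% bus share per CPU, the bus saturates at 4 CPUs.
for n in (1, 2, 4, 8, 16):
    print(n, smp_effective_cpus(n, bus_fraction=0.25))
```

Real systems degrade more gradually than this hard cap suggests, because caches and bus arbitration change the picture, but the model captures the basic reason that adding CPUs eventually stops helping.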
3.1.2 NUMA (Non-Uniform Memory Access)
Because SMP scales poorly, people began to look for ways to scale effectively and build large systems, and NUMA is one result of that effort. With NUMA technology, dozens of CPUs (even more than a hundred) can be combined in a single server.
- In the NUMA multiprocessor model (see the figure), access time varies with the location of the memory word being accessed. The shared memory is physically distributed across the local memories of all processors. Together the local memories form a global address space that every processor can access. A processor accesses its local memory quickly, but accessing remote memory that belongs to another processor is slower because of the extra delay through the interconnect network.
- The basic feature of a NUMA server is that it has multiple CPU modules. Each CPU module consists of several CPUs (for example, 4) and has its own local memory, I/O slots, and so on.
The nodes are connected and exchange information through an interconnect module (called a crossbar switch), so every CPU can access the memory of the whole system (an important difference between NUMA and MPP systems). Access to local memory is much faster than access to remote memory (the memory of other nodes in the system), which is where the name non-uniform memory access comes from.
Because of this characteristic, applications should minimize the exchange of information between different CPU modules in order to get the best system performance. NUMA technology addresses the scaling problem of SMP systems and can support hundreds of CPUs in one physical server. Typical NUMA servers include HP's Superdome, the Sun 15K, and the IBM p690.
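The advice to keep traffic inside a CPU module can be made concrete with a weighted average. The sketch below assumes hypothetical latencies of 100 ns for local memory and 300 ns for remote memory; these numbers are illustrative only and do not describe any of the servers named above.

```python
def average_access_ns(local_fraction, local_ns, remote_ns):
    """Average memory latency when `local_fraction` of accesses stay local."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


LOCAL_NS = 100.0   # assumed local-memory latency
REMOTE_NS = 300.0  # assumed remote-memory latency via the interconnect

for fraction in (1.0, 0.9, 0.5, 0.0):
    print(fraction, average_access_ns(fraction, LOCAL_NS, REMOTE_NS))
```

Under these assumptions, moving from 50% local accesses to 90% local accesses cuts the average latency from 200 ns to 120 ns, which is why data placement matters so much on NUMA machines.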
NUMA also has a drawback: because remote memory access has far higher latency than local memory access, system performance does not grow linearly as CPUs are added. When HP released the Superdome server, for example, it published performance values relative to HP's other UNIX servers. The 64-way Superdome (NUMA structure) had a relative performance value of 20, while the 8-way N4000 (shared SMP structure) had a value of 6.3. In other words, 8 times as many CPUs bought only about 3 times the performance.
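The arithmetic behind that comparison can be checked directly. The short sketch below uses only the four figures quoted in the text (relative performance 20 on 64 CPUs versus 6.3 on 8 CPUs); the function name and the notion of "scaling efficiency" as speedup divided by CPU ratio are added here for illustration.

```python
def scaling_efficiency(perf_small, cpus_small, perf_large, cpus_large):
    """Return (speedup, efficiency) of the larger system over the smaller one.

    efficiency is speedup divided by the CPU ratio; 1.0 would mean
    perfectly linear scaling.
    """
    speedup = perf_large / perf_small
    cpu_ratio = cpus_large / cpus_small
    return speedup, speedup / cpu_ratio


speedup, efficiency = scaling_efficiency(6.3, 8, 20, 64)
print(round(speedup, 2), round(efficiency, 2))
```

The speedup comes out a little above 3 for an 8-fold increase in CPUs, so each added CPU contributes well under half of what linear scaling would give.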
3.1.3 MPP (Massively Parallel Processing)
Unlike NUMA, MPP offers another way to scale a system: multiple SMP servers are connected through a node interconnect network and work together on the same task, appearing to the user as a single server system. Its basic feature is that many SMP servers (each called a node) are linked by the node interconnect, and each node accesses only its own local resources (memory, storage, and so on). This is a completely shared-nothing (Share Nothing) structure, so it scales best; in theory its expansion is unlimited, and current technology can interconnect 512 nodes with thousands of CPUs. The industry has no standard for the node interconnect yet; NCR's BYNET and IBM's SP Switch, for example, use different internal mechanisms. The node interconnect is used only inside the MPP server, however, and is transparent to users.
In an MPP system, each SMP node can also run its own operating system, database, and so on. Unlike NUMA, there is no remote memory access problem: the CPUs in one node cannot access the memory of another node. Information is exchanged between nodes over the node interconnect, a process commonly called data redistribution (Data Redistribution).
An MPP server does, however, need a complex mechanism to schedule and balance load and parallel processing across its nodes. Current MPP-based servers often hide this complexity behind system-level software such as a database. For example, NCR's Teradata is a relational database built on MPP technology. When developing applications on it, developers face a single database system regardless of how many nodes make up the back-end server, and do not need to think about how load is scheduled across nodes.
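To make the shared-nothing idea and data redistribution more tangible, here is a minimal single-process sketch. It is not how Teradata or any real MPP database works internally; the function names, the CRC32-based placement rule, and the sample rows are all assumptions. Each "node" is just a Python list that only its own step reads, and redistribution moves every row to the node that owns its key before the per-node aggregation runs.

```python
import zlib
from collections import defaultdict


def node_for(value, num_nodes):
    """Pick the owning node for a key with a deterministic hash (assumed rule)."""
    return zlib.crc32(str(value).encode("utf-8")) % num_nodes


def redistribute(nodes, key):
    """Data redistribution: send each row to the node that owns row[key]."""
    new_nodes = [[] for _ in nodes]
    for local_rows in nodes:
        for row in local_rows:
            new_nodes[node_for(row[key], len(nodes))].append(row)
    return new_nodes


def local_sum(rows, key, value):
    """Aggregation that touches only one node's local rows."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def mpp_group_sum(rows, num_nodes, key, value):
    """Group-by sum over a simulated shared-nothing cluster."""
    nodes = [rows[i::num_nodes] for i in range(num_nodes)]  # arbitrary initial placement
    nodes = redistribute(nodes, key)                        # exchange over the interconnect
    result = {}
    for local_rows in nodes:                                # independent per-node work
        result.update(local_sum(local_rows, key, value))
    return result


sales = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
    {"region": "north", "amount": 1},
    {"region": "west", "amount": 20},
]
print(mpp_group_sum(sales, num_nodes=3, key="region", value="amount"))
```

After redistribution each key lives on exactly one node, so the per-node results can simply be merged. The caller sees one function and one result, which mirrors how system-level software hides the number of nodes from the application developer.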
3.2 Comparison of the architectures' strengths and weaknesses
3.2.1 Performance differences between NUMA, MPP, and SMP
NUMA's node interconnect is implemented inside a single physical server. When a CPU needs to access remote memory it has to wait, which is why a NUMA server's performance does not scale linearly as CPUs are added. MPP's node interconnect is implemented outside the individual SMP servers, through I/O. Each node accesses only its local memory and storage, and the exchange of information between nodes proceeds in parallel with the nodes' own processing, so MPP performance scales almost linearly as nodes are added. In SMP all CPU resources are shared, so linear scaling cannot be achieved at all.
3.2.2 Scalability differences between NUMA, MPP, and SMP
NUMA can in theory scale without limit; the technology is now fairly mature and supports hundreds of CPUs, as in HP's Superdome.
MPP can in theory also scale without limit; the technology is now fairly mature and supports 512 nodes with thousands of CPUs.
SMP scales poorly. CPU utilization is currently best with 2 to 4 CPUs, although IBM's BOOK technology can extend this to 8 CPUs.
An MPP system is built from multiple SMP servers connected through a node interconnect network and working together on the same task.
3.2.3 Application differences between MPP, SMP, and NUMA
Advantages of MPP
An MPP system does not share resources, so it has more resources available than an SMP system, and once the workload reaches a certain scale MPP is more efficient than SMP. Because an MPP system has to pass information between processing units, it reaches high efficiency only when communication time is small, so that its resource advantage can be fully used. In other words, MPP works best when operations are independent of each other and the processing units rarely need to communicate. This is why MPP systems excel at decision support and data mining.
Advantages of SMP
Because an MPP system has to pass information between processing units, its efficiency is somewhat lower than that of SMP, and when communication takes up a lot of time an MPP system cannot make full use of its resource advantage. In typical OLTP programs, where users access one central database, an SMP structure is therefore much more efficient than an MPP structure.
Advantages of the NUMA architecture
A NUMA architecture can integrate many CPUs in one physical server, giving the system high transaction-processing capacity. Because remote memory access has much higher latency than local access, data exchange between different CPU modules should be kept to a minimum. NUMA is therefore better suited to OLTP transaction processing. In a data warehouse environment, heavy and complex data processing inevitably causes a large amount of data exchange, which greatly reduces CPU utilization.
4. Summary
Traditional multi-core computing uses the SMP (Symmetric Multi-Processor) model: multiple processors are connected to a centralized memory and an I/O bus. All processors access the same physical memory, so SMP systems are sometimes called uniform memory access (UMA) systems. Consistency here means that at any moment there is only one value for each item of data in memory, which all processors keep or share. The obvious drawback of SMP is limited scalability: once the memory and I/O interfaces are saturated, adding processors brings no further performance. Its counterpart is the AMP architecture, in which cores have a master/slave relationship, for example one core controlling the work of another; in a multi-core system this can be thought of as a control plane and a data plane.
NUMA is a form of distributed memory access: processors can access different memory addresses at the same time, which greatly improves parallelism. In NUMA the processors are divided into multiple nodes, and each node is allocated its own local memory. The processors in every node can access all of the system's physical memory, but accessing memory in the local node takes much less time than accessing memory in a remote node.
The main advantage of NUMA is scalability. The NUMA architecture was designed to overcome the scalability limits of SMP. With SMP, all memory accesses go over the same shared memory bus. This works well with a relatively small number of CPUs, but not with dozens or even hundreds of CPUs, because they all compete for access to the shared memory bus. NUMA relieves this bottleneck by limiting the number of CPUs on any one memory bus and connecting the nodes with a high-speed interconnect.
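On a Linux machine, the division of CPUs into NUMA nodes can be inspected through sysfs. The sketch below is a minimal example under the assumption that the kernel exposes `/sys/devices/system/node/nodeN/cpulist` files, as Linux does on NUMA-aware builds; the helper names are hypothetical, and on other systems the function simply returns an empty dictionary.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8-11' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_numa_topology(sysfs_root="/sys/devices/system/node"):
    """Return {node id: [cpu ids]}, or {} if no NUMA nodes are exposed."""
    topology = {}
    for node_dir in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        match = re.search(r"node(\d+)$", node_dir)
        cpulist_path = os.path.join(node_dir, "cpulist")
        if match and os.path.isfile(cpulist_path):
            with open(cpulist_path) as f:
                topology[int(match.group(1))] = parse_cpulist(f.read())
    return topology


if __name__ == "__main__":
    for node, cpus in sorted(read_numa_topology().items()):
        print(f"node {node}: CPUs {cpus}")
```

A single-socket desktop typically reports one node containing every CPU, while a multi-socket server reports one node per memory domain, each with its own subset of CPUs.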
Reference:
https://scitechconnect.elsevier.com/asymmetric-multi-processing-amp-vs-symmetric-multi-processing-smp/