当前位置:网站首页>Morsel driven parallelism: a NUMA aware parallel query execution framework
Morsel driven parallelism: a NUMA aware parallel query execution framework
2022-06-09 05:33:00 【Lizhaolong's blog】
This work adopts Creative Commons signature - Noncommercial use - Share in the same way 4.0 International licensing agreement Licensing .
This work ( Zhao Long Li post , from Zhao Long Li A literary creation ), from Zhao Long Li confirm , Reprint please indicate copyright .
introduction
I saw... Some time ago OSAPP in polarDB There is a core data structure in the community to support NUMA Aware The subject of , Although I have already reported ChaosBlade Community do OS Level of fault injection , But I am also very interested in the implementation of the former , If nothing unexpected happens, I will always pay attention to polarDB This PR Implementation process .
I went to Chongqing with Qibao a few days ago , There is no travel plan in the past week , The reply was delayed for several days , So I had a few days' free time to study the knowledge I was interested in , Then I thought of that NUMA Aware The subject of . It took me about a day to review my past NUMA Related issues , Including several articles NUMA The troubleshooting process leading to burrs ,NUMA API Use ( Include migrate_pages and mbind These two system calls , and numactl,numademo,numastat These terminal tools ), several memory policy The practical significance and possibility of NUMA dependent PMU Access method .
In fact, according to the scheduled plan , In the last month of College , I didn't do technical research anymore , But should feel at ease when a lazy dog , Especially today is the Dragon Boat Festival , It is a sure thing to give up and accept teats . however . All of a sudden timeline There is an article in NUMA Relevant papers have not been read , Now that I have learned , Just learn the whole set , This is the origin of this article .
Morsel-Driven Parallelism
Morsel-Driven Parallelism The biggest contribution is to create a new fine-grained parallelism Query Execution Frame and will NUMA The idea is applied to database design .
In my humble opinion , With the continuous enhancement of new hardware , modern NUMA System span node The cost of access is actually getting lower and lower [4][5], And only in the higher order cache Remote access to memory is only possible on misses , It also means that unless false shareding[3] appear , otherwise NUMA It is very difficult for remote access to cause bottlenecks , So the core of this article is to propose a more fine-grained Query Execution frame , In order to adapt the framework is also important relational operators A set of parallel algorithms is designed .
Parallel is mentioned at the beginning of the article Query Execution There are two problems with the development of computer systems ( The increase of computer speed is gradually reflected in the speed of multi-core rather than single core ) There are conflicts :
- To take advantage of multiple cores , All query work must be evenly distributed among hundreds of threads .
- The complexity of modern kernels , As a result, even with accurate data statistics, it is difficult to evenly distribute queries .
The above problems lead to load imbalance and context switching bottleneck among multiple cores , So it is no longer extensible , So this article is actually to solve the above problems , But the solution can be done easily NUMA friendly , In order to solve the imbalance of parallelism in operators .
From the perspective of execution granularity Morsel-Driven Parallelism With the traditional Volcano The model is no different , about plan-driven For example, the optimizer statically determines how many threads should run at query compilation time , Instantiate a query operator plan for each thread , And connect these with the exchange operator . From my immature database experience Volcano In fact, the minimum execution unit is the operator ( Of course, the operator can also be parallel , Suppose the data is distributed at different nodes , This is a typical intra operator parallelism , Of course, this is from the computing node ), It also means in a Query Plan Only in EXCHANGE operator Operator time [1][6] Can be parallel between operators .
In fact, a single operator can be executed in parallel , However, there is no way to achieve uniform distribution among multiple cores ,Morsel-Driven Parallelism Use work stealing to dynamically allocate work between threads ( Thought of the previous thread pool implementation ), This prevents unused due to load imbalance CPU resources , And allow flexibility , That is, you can reallocate between different queries at any time CPU resources .
But from the perspective of pipeline execution Morsel-Driven Parallelism With the traditional Volcano Models are fundamentally different , namely push and pull The difference between , The basic differences are shown in the figure below [13]:
The biggest difference between the two, I think, is that the tradition Volcano Represented by model push In order to Operator Centred , and Morsel-Driven Parallelism Such a model is data centric , The latter is easier to achieve parallel friendliness , Because the executing thread is actually bound to the core , And the data source is obtained by the worker thread itself , This is parallel friendly , This is actually a simple and efficient scheduler .
With ClickHouse For example , Through AST Generate Query Plan after , After some RBO Optimize , take QueryPlan It is transformed into... In the way of post order traversal Pipeline, Generated in this way Pipeline, Actually sum push It's the same , Next Pipeline How to execute is the key to parallel friendliness , This essentially requires a well-designed scheduler ,Morsel-Driven Parallelism The implementation of is excellent .
The basic implementation process is as follows : One Query Execution Will generate a QEPobject, Manage all pipeline-job Dependencies between , This is similar to Volcano Medium query plan, In a pipeline job Execution complete , Will trigger QEPobject Passive state machine , Select the next executable pipeline job, Pass to dispatcher, every last worker Will be bound to a CPU nucleus , And take the initiative from dispatcher Middle pull morsel, Here you can do NUMA Affinity , Priority scheduling can be added , The output results are materialized in NUMA Local storage area in , follow-up pipeline It can be processed based on this .
You can see Morsel-Driven Parallelism In fact, it is to standardize the implementation of operators , That is, all operators use fine-grained elements (morsel) To measure , And the parallelism in the operator is integrated into Query Execution Come on , This can not only achieve uniform load among multiple cores , It can also be done NUMA perception , Even priority scheduling .
Cool!
From a higher point of view ,Morsel Can we extend our thinking to a larger field rather than just an in memory database ? Because we want to introduce IO, So the execution flow needs to use coroutinue To express , however coroutinue The thread is bound to the core , Each core runs n individual coroutine,channel Notify another pipeline, This architecture also looks great (matrixone This is how we play now ). Another way of thinking , In the paper, the overall architecture looks like a three-level control ,worker Class A ,dispatcher Class A ,QEPobject Class A , gradually push, Moving out of the core idea seems to require only modifying the underlying worker That floor , The other two layers are reusable , It looks good, too , But I need pipeline Detailed split of , The implementation is relatively complex .
Of course, the above discussion is actually about the implementation of the scheduler , How to base on Query Execution Fine grained splitting pepeline It is also a problem to be considered , After all, the foundation of parallelism comes from pipeline The combination of ( After the sequence traversal ?), But I am a computing engine Xiaobai , Let's stop this discussion .
Specific solutions can be found in the paper [1][2], Not mentioned here .
summary
The database is extensive and profound , Business ,Query Execution, If you want to make two points clear, you can at least keep them in mind , Not to mention consistency , Multiple copies / How to live , Storage engine , Indexes , performance tuning ,bug screening , Take your time , It is very satisfying for me to figure out a plate .
This article does not represent the general direction of my study , It's just a pastime after dinner . The most important thing at this stage is to practice and use tools , Whether it is the foreseeable specific work needs or , It is also expected to introduce innovative ideas and understanding of chaotic engineering , It is the embodiment of those two points .
By the way, English is also very important , It doesn't mean that it will run out in two years 2333
Reference resources :
- Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age Interpretation of the thesis
- Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age
- from false sharing To cache consistency , This is actually closely related to us
- Corey: an operating system for many cores.
- Everything you always wanted to know about synchronization but were afraid to ask.
- EXCHANGE operator
- OLAP Concurrent execution and scheduling of tasks
- PolarDB-X oriented HTAP Hybrid actuators for
- JOIN Optimization and execution of subqueries
- Parallel queries (Parallel Query)
- PolarDB The past and present life of parallel query
- The depth resolution PolarDB Database parallel query technology
- Push versus pull-based loop fusion in query engines
- Push still Pull, Is this a question ?
边栏推荐
- Analysis of semaphore source code of AQS
- Gstreamer应用开发实战指南(二)
- Missing digit JS in sword finger 0~n-1
- Xtrabackup backup and recovery
- MarathonLb的负载研究
- Youshimu V8 projector opens the "vision" field of high fresh
- Load research of Marathon LB
- 材料之kube-dns.yaml
- Design owlook network novel recommendation system
- AQS 之 ReentrantReadWriteLock 源码分析
猜你喜欢

YOLOv5的Tricks | 【Trick7】指数移动平均(Exponential Moving Average,EMA)

In 2022, the database audit manufacturer will choose cloud housekeeper! Powerful!

Yolov5-6.0系列 | yolov5的模块设计

latex中\cdots后面接上句子,后面的句子格式会乱怎么回事。

力扣今日题-1037. 有效的回旋镖

Li Kou today's question -1037 Effective boomerang

AI video cloud: a good wife in the era of we media

冒泡排序,打印菱形,打印直角三角形,打印倒三角,打印等边三角形,打印九九乘法表

redis 缓存雪崩、穿透、击穿问题
![[it] Fuxin PDF Keeping Tool Selection](/img/1e/87dbd435e830c139bc3d5cf86d6d57.png)
[it] Fuxin PDF Keeping Tool Selection
随机推荐
Tricks | [trick6] learning rate adjustment strategy of yolov5 (one cycle policy, cosine annealing, etc.)
冒泡排序,打印菱形,打印直角三角形,打印倒三角,打印等边三角形,打印九九乘法表
@Differences between jsonformat and @datetimeformat
Li Kou today's question -1037 Effective boomerang
Kube dns yaml
Myql error expression 1 of select list is not in group by claim and contains nonaggregated column
Gstreamer应用开发实战指南(三)
seaweedfs-client适配高版本的seaweedfs服务
Requests segmented downloading of files and multi-threaded downloading
Gstreamer应用开发实战指南(四)
Connecting pyqt5 and SQL Server Databases
validate-npm-package-name
The principle and implementation of lazy image loading
Windows uses php to start ThinkPHP project and deploy configuration
A few minutes to understand the Flink waterline
The 27th issue of product weekly report | members' new interests of black users; CSDN app v5.1.0 release
AspNetPager combines stored procedure paging to speed up access
MySQL add field or create table SQL statement
Alibaba cloud AI training camp -sql basics 5: window functions, etc
In latex, \cdots is followed by a sentence. What's wrong with the format of the following sentence.