当前位置:网站首页>Morsel driven parallelism: a NUMA aware parallel query execution framework

Morsel driven parallelism: a NUMA aware parallel query execution framework

2022-06-09 05:33:00 Lizhaolong's blog

 Insert picture description here This work adopts Creative Commons signature - Noncommercial use - Share in the same way 4.0 International licensing agreement Licensing .

This work ( Zhao Long Li post , from Zhao Long Li A literary creation ), from Zhao Long Li confirm , Reprint please indicate copyright .

introduction

I saw... Some time ago OSAPP in polarDB There is a core data structure in the community to support NUMA Aware The subject of , Although I have already reported ChaosBlade Community do OS Level of fault injection , But I am also very interested in the implementation of the former , If nothing unexpected happens, I will always pay attention to polarDB This PR Implementation process .

I went to Chongqing with Qibao a few days ago , There is no travel plan in the past week , The reply was delayed for several days , So I had a few days' free time to study the knowledge I was interested in , Then I thought of that NUMA Aware The subject of . It took me about a day to review my past NUMA Related issues , Including several articles NUMA The troubleshooting process leading to burrs ,NUMA API Use ( Include migrate_pages and mbind These two system calls , and numactl,numademo,numastat These terminal tools ), several memory policy The practical significance and possibility of NUMA dependent PMU Access method .

In fact, according to the scheduled plan , In the last month of College , I didn't do technical research anymore , But should feel at ease when a lazy dog , Especially today is the Dragon Boat Festival , It is a sure thing to give up and accept teats . however . All of a sudden timeline There is an article in NUMA Relevant papers have not been read , Now that I have learned , Just learn the whole set , This is the origin of this article .

Morsel-Driven Parallelism

Morsel-Driven Parallelism The biggest contribution is to create a new fine-grained parallelism Query Execution Frame and will NUMA The idea is applied to database design .

In my humble opinion , With the continuous enhancement of new hardware , modern NUMA System span node The cost of access is actually getting lower and lower [4][5], And only in the higher order cache Remote access to memory is only possible on misses , It also means that unless false shareding[3] appear , otherwise NUMA It is very difficult for remote access to cause bottlenecks , So the core of this article is to propose a more fine-grained Query Execution frame , In order to adapt the framework is also important relational operators A set of parallel algorithms is designed .

Parallel is mentioned at the beginning of the article Query Execution There are two problems with the development of computer systems ( The increase of computer speed is gradually reflected in the speed of multi-core rather than single core ) There are conflicts :

  1. To take advantage of multiple cores , All query work must be evenly distributed among hundreds of threads .
  2. The complexity of modern kernels , As a result, even with accurate data statistics, it is difficult to evenly distribute queries .

The above problems lead to load imbalance and context switching bottleneck among multiple cores , So it is no longer extensible , So this article is actually to solve the above problems , But the solution can be done easily NUMA friendly , In order to solve the imbalance of parallelism in operators .

From the perspective of execution granularity Morsel-Driven Parallelism With the traditional Volcano The model is no different , about plan-driven For example, the optimizer statically determines how many threads should run at query compilation time , Instantiate a query operator plan for each thread , And connect these with the exchange operator . From my immature database experience Volcano In fact, the minimum execution unit is the operator ( Of course, the operator can also be parallel , Suppose the data is distributed at different nodes , This is a typical intra operator parallelism , Of course, this is from the computing node ), It also means in a Query Plan Only in EXCHANGE operator Operator time [1][6] Can be parallel between operators .

In fact, a single operator can be executed in parallel , However, there is no way to achieve uniform distribution among multiple cores ,Morsel-Driven Parallelism Use work stealing to dynamically allocate work between threads ( Thought of the previous thread pool implementation ), This prevents unused due to load imbalance CPU resources , And allow flexibility , That is, you can reallocate between different queries at any time CPU resources .

But from the perspective of pipeline execution Morsel-Driven Parallelism With the traditional Volcano Models are fundamentally different , namely push and pull The difference between , The basic differences are shown in the figure below [13]:
 Insert picture description here
The biggest difference between the two, I think, is that the tradition Volcano Represented by model push In order to Operator Centred , and Morsel-Driven Parallelism Such a model is data centric , The latter is easier to achieve parallel friendliness , Because the executing thread is actually bound to the core , And the data source is obtained by the worker thread itself , This is parallel friendly , This is actually a simple and efficient scheduler .

With ClickHouse For example , Through AST Generate Query Plan after , After some RBO Optimize , take QueryPlan It is transformed into... In the way of post order traversal Pipeline, Generated in this way Pipeline, Actually sum push It's the same , Next Pipeline How to execute is the key to parallel friendliness , This essentially requires a well-designed scheduler ,Morsel-Driven Parallelism The implementation of is excellent .

The basic implementation process is as follows : One Query Execution Will generate a QEPobject, Manage all pipeline-job Dependencies between , This is similar to Volcano Medium query plan, In a pipeline job Execution complete , Will trigger QEPobject Passive state machine , Select the next executable pipeline job, Pass to dispatcher, every last worker Will be bound to a CPU nucleus , And take the initiative from dispatcher Middle pull morsel, Here you can do NUMA Affinity , Priority scheduling can be added , The output results are materialized in NUMA Local storage area in , follow-up pipeline It can be processed based on this .

You can see Morsel-Driven Parallelism In fact, it is to standardize the implementation of operators , That is, all operators use fine-grained elements (morsel) To measure , And the parallelism in the operator is integrated into Query Execution Come on , This can not only achieve uniform load among multiple cores , It can also be done NUMA perception , Even priority scheduling .

Cool!

From a higher point of view ,Morsel Can we extend our thinking to a larger field rather than just an in memory database ? Because we want to introduce IO, So the execution flow needs to use coroutinue To express , however coroutinue The thread is bound to the core , Each core runs n individual coroutine,channel Notify another pipeline, This architecture also looks great (matrixone This is how we play now ). Another way of thinking , In the paper, the overall architecture looks like a three-level control ,worker Class A ,dispatcher Class A ,QEPobject Class A , gradually push, Moving out of the core idea seems to require only modifying the underlying worker That floor , The other two layers are reusable , It looks good, too , But I need pipeline Detailed split of , The implementation is relatively complex .

Of course, the above discussion is actually about the implementation of the scheduler , How to base on Query Execution Fine grained splitting pepeline It is also a problem to be considered , After all, the foundation of parallelism comes from pipeline The combination of ( After the sequence traversal ?), But I am a computing engine Xiaobai , Let's stop this discussion .

Specific solutions can be found in the paper [1][2], Not mentioned here .

summary

The database is extensive and profound , Business ,Query Execution, If you want to make two points clear, you can at least keep them in mind , Not to mention consistency , Multiple copies / How to live , Storage engine , Indexes , performance tuning ,bug screening , Take your time , It is very satisfying for me to figure out a plate .

This article does not represent the general direction of my study , It's just a pastime after dinner . The most important thing at this stage is to practice and use tools , Whether it is the foreseeable specific work needs or , It is also expected to introduce innovative ideas and understanding of chaotic engineering , It is the embodiment of those two points .

By the way, English is also very important , It doesn't mean that it will run out in two years 2333

Reference resources :

  1. Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age Interpretation of the thesis
  2. Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age
  3. from false sharing To cache consistency , This is actually closely related to us
  4. Corey: an operating system for many cores.
  5. Everything you always wanted to know about synchronization but were afraid to ask.
  6. EXCHANGE operator
  7. OLAP Concurrent execution and scheduling of tasks
  8. PolarDB-X oriented HTAP Hybrid actuators for
  9. JOIN Optimization and execution of subqueries
  10. Parallel queries (Parallel Query)
  11. PolarDB The past and present life of parallel query
  12. The depth resolution PolarDB Database parallel query technology
  13. Push versus pull-based loop fusion in query engines
  14. Push still Pull, Is this a question ?
原网站

版权声明
本文为[Lizhaolong's blog]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/160/202206090527483184.html