How PolarDB-X does distributed database hotspot analysis
2022-07-07 21:40:00 【Alibaba cloud yunqi】
Introduction: PolarDB-X is a cloud-native distributed database that separates computing from storage. In the AUTO mode of PolarDB-X 2.0, the database automatically hash-partitions tables and distributes the data evenly across all data nodes. Ideally, data and traffic are balanced across partitions, so the distributed processing power of multiple nodes is fully used. To get the best results, the database should avoid hot partitions as far as possible, both traffic hotspots and data-volume hotspots. Avoiding hotspots starts with finding hot partitions quickly and easily so that they can be handled in a targeted way. Finding hot partitions quickly and accurately has therefore become an important capability for a distributed database.
Background
PolarDB-X is a distributed database that separates computing from storage. Distributed processing is one of its core features: the compute nodes of a single database instance share all SQL traffic, so different traffic peaks can be met quickly by scaling the nodes out or in.
In the PolarDB-X 1.0 era, users usually split databases and tables by sharding to balance data and traffic across nodes. In this mode, the choice of split key is critical to database performance, and picking the best combination of split keys requires users to know the table structure and data distribution of the business database very well at the moment the tables are created.
To lower the technical barrier to using a distributed database, PolarDB-X 2.0 introduced the concept of transparent distribution. Users no longer need to specify split keys one by one; using the distributed database is as simple as using a standalone MySQL, while the benefits of a distributed database remain available. This is an upgrade in user experience and also a leap in technical architecture and philosophy, from a middleware model to a cloud-native architecture. The database is no longer an advanced technical component that users must look after and maintain, but an on-demand cloud service that lets users fully enjoy the technical dividends of the cloud architecture.
In the AUTO mode of PolarDB-X 2.0, the database automatically hash-partitions tables and distributes the data evenly across all data nodes. In the best case, data and traffic are balanced across partitions, so the distributed processing power of multiple nodes is fully used. To get the best results, the database should avoid hot partitions as far as possible, both traffic hotspots and data-volume hotspots. Avoiding hotspots starts with finding hot partitions quickly and easily so that they can be handled in a targeted way. Finding hot partitions quickly and accurately has therefore become an important capability for PolarDB-X 2.0.
Demonstration
Feature overview
We start with a small range of data, as shown in the figure. The vertical axis shows the hierarchy of logical databases, logical tables, and logical partitions, with partitions sorted by logical sequence number. The horizontal axis represents time. The bar charts at the bottom and on the right of the image show summary data. The bottom bar chart is the vertical summation, that is, the total number of accesses to all partitions at a given time. The bar chart on the right is the horizontal summation, that is, the total number of accesses to one partition over the whole time range.
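The two summary charts can be read as the row and column sums of the heat map grid. The following is a minimal sketch of that idea, not PolarDB-X code: the partition names, the sample numbers, and both function names are made up for illustration.

```python
# Hypothetical sample: each key is a partition, each list holds that
# partition's access counts for consecutive time slots.
heat = {
    "db1.orders.p0": [5, 9, 2],
    "db1.orders.p1": [1, 0, 4],
    "db1.users.p0": [7, 3, 3],
}


def per_partition_totals(grid):
    """Horizontal summation: accesses of one partition over all time slots."""
    return {name: sum(row) for name, row in grid.items()}


def per_slot_totals(grid):
    """Vertical summation: accesses of all partitions in one time slot."""
    return [sum(column) for column in zip(*grid.values())]


right_bars = per_partition_totals(heat)   # data for the chart on the right
bottom_bars = per_slot_totals(heat)       # data for the chart at the bottom
```

The partition with the largest value in `right_bars` is the busiest over the whole window, while a peak in `bottom_bars` marks a moment when the instance as a whole was busiest.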
Storage node perspective
To view hotspots from the perspective of storage nodes, click the "DN View" button at the top. The data is then grouped by storage node, which makes it easy to check whether data is balanced across physical storage nodes and whether any physical storage node is a hotspot.
TPC-C hotspot analysis
Running a TPC-C traffic test produces a complete heat distribution. The figure clearly shows two hot regions in the TPC-C traffic, and data-volume hotspots can also be found by comparing widths on the vertical axis.
Design considerations
1. The presentation should be as simple and understandable as possible
Hotspot data is multi-dimensional and tightly coupled: data volume, traffic, time, the relationships between partitions, the relationships between logical databases/tables and partitions, the relationships between physical nodes and logical databases/tables, and the contrast between hot and cold regions. All of these elements are needed for analysis and none can be left out. The complex information has to be sorted out and presented to users clearly and concisely.
2. Avoid affecting the core functions of the database
Finding hot regions accurately requires collecting the data volume and request volume of the database. Because traffic and data volume change constantly, collection must also run continuously, which means the collection process must not harm the core functions of the database.
3. Reduce the dependence of the implementation path on external components
As the product has developed, PolarDB-X has been deployed in a variety of forms: a public cloud version on Alibaba Cloud, a K8s version for offline PoC scenarios, a lightweight DBStack version for users' private environments, an open source version contributed to the community, and so on. To give as many versions as possible the same capability, the implementation uses as few external components as possible, which minimizes the compatibility problems of multi-form deployment.
4. Control the amount of data collected
Because traffic data is collected continuously, the statistics would in theory grow without bound. The size of the statistics must therefore be limited and the data must age out after a set range, since unbounded data cannot be stored. Keeping the data as small as possible also reduces IO and network pressure and lowers the impact on the core functions of the kernel.
Design scheme
Interaction design
After comparing many chart types and looking at related solutions in the industry, we chose a heat map to display partition heat. The horizontal axis represents time, the vertical axis represents partitions, and the brightness of each rectangle indicates how heavily the partition was accessed. At a glance, the brightest rectangle is the hottest region.
A heat map expresses traffic hotspots well, but how can data-volume hotspots be shown? We make innovative use of the vertical axis: on the vertical axis, every partition has the same height but may have a different width. The more data a partition holds, the wider it is, so comparing widths reveals the partition with the most data at a glance.
With these two basic display elements in place, adding interactions such as animation, zooming, dragging, color adjustment, and hover lets the page express hot partition information clearly and completely.
Data processing
Processing of timeline
Given the display characteristics of a heat map, data is collected once per minute and the collected statistics are kept for at most 7 days. That gives at most 7 × 24 × 60 = 10080 time points, so storing the data over time requires 10080 rows. However, a web page shown in a browser is usually about 1000 px wide. If users want to see the full 7 days of data, each 1 px of width would have to hold about 10 time points, which severely degrades the display. The timeline data must therefore be processed to reduce the density of time points, but without losing data.
An obvious way to reduce the number of time points is to lower the sampling precision, for example by collecting once every 30 minutes. But a user who looks at only 1 hour of data would then see just 2 time points on the page, which is clearly unacceptable. Lowering the sampling frequency thus creates a conflict between the display precision required for short time ranges and the display quality required for long ones.
We therefore chose to tier the timeline: data from further back is kept at lower precision and recent data at higher precision. This also matches how most users work, since the most recent data is the data they want in most detail. The precision becomes one point per minute for the latest 1 hour, one point per 2 minutes for hours 1 to 8, one point per 6 minutes for hours 8 to 24, and one point per 30 minutes from 24 hours to 7 days. The maximum number of time points drops from 10080 to 60 + 210 + 160 + 288 = 718.
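The arithmetic behind the tiered timeline can be checked with a short calculation. This is only a sketch of the numbers given in the text; the names `TIERS` and `points_per_tier` are illustrative and do not come from PolarDB-X.

```python
# (window length in minutes, sampling interval in minutes) for each tier,
# taken from the tiers described in the text.
TIERS = [
    (60, 1),             # latest 1 hour, one point per minute
    (7 * 60, 2),         # hours 1 to 8, one point per 2 minutes
    (16 * 60, 6),        # hours 8 to 24, one point per 6 minutes
    (6 * 24 * 60, 30),   # 24 hours to 7 days, one point per 30 minutes
]


def points_per_tier(tiers):
    """Number of time points each tier holds when it is full."""
    return [window // interval for window, interval in tiers]


flat_points = 7 * 24 * 60          # one point per minute for 7 days
tiered_points = points_per_tier(TIERS)
total_points = sum(tiered_points)

print(flat_points, tiered_points, total_points)
```

With roughly 1000 px of page width, 718 points fit at better than one pixel per point, while the flat layout would need about ten points per pixel.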
The data structure used is a multi-layer ring queue, as shown in the figure below. Each layer inserts new data at its tail. When data must move down, a specified number of entries are taken from the head of the layer, merged, and inserted at the tail of the next layer, and the merged entries are then removed from the head. Each ring has a fixed size: when a ring is full it merges data down into the next ring, and when the last ring is full it simply discards data.
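The multi-layer ring queue can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the PolarDB-X implementation: the class name `TieredRing` is hypothetical, each entry is a single access count rather than a full per-partition record, and "merging" is assumed to mean summing the merged entries.

```python
from collections import deque


class TieredRing:
    """Multi-layer ring queue: full layers merge their oldest entries downward."""

    def __init__(self, layers):
        # layers: list of (capacity, merge_count) pairs, newest layer first.
        # capacity must be at least 1. merge_count is how many head entries
        # are merged into one entry of the next layer.
        self.capacities = [capacity for capacity, _ in layers]
        self.merge_counts = [count for _, count in layers]
        self.rings = [deque() for _ in layers]

    def push(self, value, level=0):
        """Insert a value at the tail of a layer, making room first if full."""
        ring = self.rings[level]
        if len(ring) >= self.capacities[level]:
            self._make_room(level)
        ring.append(value)

    def _make_room(self, level):
        ring = self.rings[level]
        if level == len(self.rings) - 1:
            ring.popleft()  # last ring: the oldest entry is discarded
            return
        count = min(self.merge_counts[level], len(ring))
        # Assumption: merged entries are summed into one coarser entry.
        merged = sum(ring.popleft() for _ in range(count))
        self.push(merged, level + 1)


# Layer sizes follow the tiers in the text: 60 one-minute points, then
# 210 two-minute, 160 six-minute, and 288 thirty-minute points.
history = TieredRing([(60, 2), (210, 3), (160, 5), (288, 1)])
for minute in range(3 * 24 * 60):   # simulate three days of collection
    history.push(1)                 # one access per minute, for illustration
```

Because the merge counts are 2, 3, and 5, an entry covers 1, 2, 6, and 30 minutes in successive layers, and no layer ever holds more than its capacity.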
Processing of partition axis
To avoid depending on external components, collection uses the kernel's own scheduler. The primary compute node starts a collection task every minute, the task is pushed down to each storage node to obtain the raw data, and the results are finally processed on the primary compute node. The performance cost of collection is therefore closely tied to the number of partitions. With few partitions the cost is almost zero, but with a very large number of partitions each storage node returns a large amount of data to the primary compute node, which must parse and sort it, putting heavy pressure on memory and CPU.
The number of collected partitions must therefore be kept within a limit, so that hotspot diagnosis stays available without affecting the performance of the database kernel. Based on visual effect and actual data size, we found that displaying up to 1600 partitions works best. With the default of 16 partitions per table, this supports hotspot analysis for 100 tables, which covers most application scenarios.
Cases with more partitioned tables do occur in practice. For these, when the number of partitions is greater than 1600 and less than 8000, partition statistics can be merged, trading partition precision for support of a larger number of partitions. In theory this already supports hotspot analysis for 1000 tables.
With tens or hundreds of thousands of tables, both the collection process and the front-end display would put heavy resource pressure on the kernel and on the functional path. In such extreme cases hotspot data is not collected by default, but users can dynamically modify database parameters to specify which databases and tables need hotspot analysis, limiting the scope and analyzing on demand.
To sum up, small-scale partition sets are displayed precisely, medium-scale sets are displayed at reduced precision, and very large sets can be displayed for a specified range, which covers many different user needs.
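One way to picture the merging of partition statistics is to fold neighbouring partitions into buckets until the display limit is met. The sketch below assumes that partitions are merged in logical order and that their access counts are summed; the text does not say how PolarDB-X actually merges them, and the function name `merge_partitions` is hypothetical.

```python
import math

MAX_DISPLAYED = 1600   # display limit given in the text


def merge_partitions(counts, max_displayed=MAX_DISPLAYED):
    """Merge neighbouring partitions so at most max_displayed buckets remain.

    counts: access counts of partitions, in logical partition order.
    Assumption: merging sums the counts of adjacent partitions.
    """
    total = len(counts)
    if total <= max_displayed:
        return list(counts)          # small scale: keep full precision
    group_size = math.ceil(total / max_displayed)
    return [sum(counts[i:i + group_size]) for i in range(0, total, group_size)]


# 500 tables with 16 partitions each, one access per partition.
buckets = merge_partitions([1] * (500 * 16))
print(len(buckets), buckets[0])
```

The total access count is preserved, but a hot bucket now only narrows the search to a small group of partitions rather than a single one, which is the precision trade-off the text describes.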
Performance analysis
To test the impact of hotspot analysis on the performance of the database kernel, we ran several groups of TPC-C comparison experiments. The conclusion is that the feature has almost no impact on kernel performance. With the CPU of the PolarDB-X kernel pushed to its maximum load, we tested the extreme case of enabling the feature while continuously refreshing the page to fetch diagnostic results. The impact on performance fluctuated around 1%. Allowing for the normal statistical error of the test process, the feature can be considered to have almost no impact on kernel performance.
Easter eggs
When a user has not created any partitioned tables, the page has no data to display. The conventional approach is to show a line of text such as "No data yet", but then users cannot experience the fun of the feature. To let users enjoy hotspot analysis even before they have data, PolarDB-X plays a small trick on the blank page: using the front-end features of the heat map, it draws a "NO DATA" image, so users can try out hotspot analysis without any data.
When the user's number of partitions exceeds the display limit, a "TOO BIG" image is drawn instead.
Beyond these "mainstream" uses, with a little imagination the heat map can also be put to all kinds of "non-mainstream" uses. For example, you can exploit the color characteristics of the heat map and precisely control partition access to draw a "love" heat map, and become the first person on Earth to successfully confess your love with a database!
Original article: click.aliyun.com/m/100034823…
This article is original content from Alibaba Cloud and may not be reproduced without permission.