当前位置:网站首页>Simple understanding of B tree and b+ tree
Simple understanding of B tree and b+ tree
2022-06-29 23:39:00 【Fairy wants carry】
Preface
First , We know , We use it B Trees B+ The tree is to increase the efficiency of our index ( Increase query efficiency )
We all know Binary search tree The time complexity of the search is O(log N), Its search efficiency is high enough , Then why do you still have B Trees and B+ The emergence of trees ? Is the time complexity of binary search tree smaller than that of binary search tree ?
problem :
This will throw disk read / write IO Efficiency is a problem , as everyone knows ,IO The query efficiency is very low , You need to read the disk and load the data into memory , When a large amount of data is stored , We can't load the received data into memory at once , Instead, it gradually loads into Disk page , Each disk page corresponds to the node of the tree , That is, the side shows the nodes of the tree ( data ) The more , More disk pages ,IO The more ( Back to );——> Tree depth and IO Operate the vibration wave
solve :
1. Each data node stores multiple elements , That is, every time you load , More data , More data on disk pages
2. Using multi tree
B Trees
First, let's talk about his structure directly : A pointer to the current data , A pointer pointing down , Another is data
One m Hierarchical B Tree features : When a node has k A child , For example, the head node has three children , So there must be k-1 Key words , That is, two keywords can divide the keywords of the subtree into 3 A subset of
Inquire about :
Pictured above is an example : If the query value is 5:
First disk IO: Locate in memory ( And 17、35 Compare ), Than 17 Small , The left subtree ;
Second disk IO: Locate in memory ( And 8、12 Compare ), Than 8 Small , The left subtree ;
The third disk IO: Locate in memory ( And 3、5 Compare ), find 5, End .
The whole process , We can see that : The number of comparisons is no less than that of binary search trees , Especially when there are many data in a node , But disk IO The number of times is greatly reduced . The comparison is done in memory ,( Because we compare in memory , And we can have more than one data at a time , data up 了 , So it greatly reduces IO), Compared to disk IO The speed of , The time-consuming comparison is almost negligible . So when the height of the tree is low enough , Can greatly improve efficiency . by comparison , It doesn't matter if there are many elements in the node , Just a few more memory interactions , As long as it does not exceed the size of disk pages .
B+ Trees
structure : Pointer pointing down + data ( Only leaf nodes have ), A linked table is used below , Specifically linked data
1. Yes k The middle node of the subtree contains k Elements (B In the tree is k-1 Elements ), Each element does not hold data , Index only , All the data
Are saved in the leaf node .
2. All leaf nodes contain the information of all elements , And pointers to records containing these elements , The size of the node depends on the leaf itself
Link from small to large .
3. All the intermediate node elements exist in the child nodes at the same time , Is the largest of the child node elements ( Or the smallest ) Elements .
advantage : A node stores content compared to B Trees , One pointer is missing , So every time IO Load data from hard disk into memory , Can load more data than B More trees , The side shows B+ The optimization of the tree
lookup :
And B The difference between trees is , We don't have a satellite search ( That is, the current data record pointed to by the index element ), This means that the same disk page can hold more node elements ——>IO Less operation
What needs to be added is , Clustered index in database (Clustered Index) in , The leaf node directly contains satellite data . In the nonclustered index (NonClustered Index) in , The leaf node has a pointer to the satellite data ;
B+ Tree range lookup
First, find the lower limit of the range by binary search , At this point, the function of our linked list is reflected , The upper limit can be found directly through the link sequence traversal of leaf nodes , Process is simple , Efficient ;
summary
B+ Trees are compared to B The advantages of trees :
1. A single node stores more elements , Make the query IO Fewer times ;
2. All queries need to find leaf nodes , Query performance is stable ;
3. All leaf nodes form an ordered list , Easy range query .
边栏推荐
猜你喜欢
海外数字身份验证服务商ADVANCE.AI入选EqualOcean《2022品牌出海服务市场研究报告》
FPGA开发(2)——IIC通信
Matlab exercises -- program control process exercise
Overseas digital authentication service provider advance AI was selected into the "2022 brand sea Service Market Research Report" of equalocean
Yunhe enmo Guoqiang, identifiez - le, saisissez - le, avant l'ébullition de la base de données nationale
雲和恩墨蓋國强,識別它、抓住它,在國產數據庫沸騰以前
简单理解B树和B+树
InfluxDB时序数据库系统
matlab习题 —— 程序控制流程练习
二叉搜索树 230. 二叉搜索树中第K小的元素 1038. 从二叉搜索树到更大和树
随机推荐
打造一个 API 快速开发平台,牛逼!
Database - playful data -pgsql uses UUID as primary key
招商证券靠谱吗?开股票账户安全吗?
Cacti关于spine轮询的设置
HPE launched ARM CPU general server product
Applet plug-in access, development and precautions
Xutils3 transfer set
Use of jetpack's room in combination with flow
C pointer advanced 2-- > function pointer array callback function simplifies calculator code, and implements qsort function based on callback function simulation
@Scheduled annotation pit, I stepped on it for you
为什么 JSX 语法这么香?
flutter 插件版本冲突的解决方法
除子
Constexpr function
Under the epidemic, I left my job for a year, and my income increased 10 times
“微博评论”的高性能高可用计算架构
Leetcode 1385. 两个数组间的距离值
云服务器的安全设置常识
Solr basic operation 4
剑指 Offer 14- II. 剪绳子 II