当前位置:网站首页>Simple understanding of B tree and b+ tree
Simple understanding of B tree and b+ tree
2022-06-29 23:39:00 【Fairy wants carry】
Preface
First , We know , We use it B Trees B+ The tree is to increase the efficiency of our index ( Increase query efficiency )
We all know Binary search tree The time complexity of the search is O(log N), Its search efficiency is high enough , Then why do you still have B Trees and B+ The emergence of trees ? Is the time complexity of binary search tree smaller than that of binary search tree ?
problem :
This will throw disk read / write IO Efficiency is a problem , as everyone knows ,IO The query efficiency is very low , You need to read the disk and load the data into memory , When a large amount of data is stored , We can't load the received data into memory at once , Instead, it gradually loads into Disk page , Each disk page corresponds to the node of the tree , That is, the side shows the nodes of the tree ( data ) The more , More disk pages ,IO The more ( Back to );——> Tree depth and IO Operate the vibration wave
solve :
1. Each data node stores multiple elements , That is, every time you load , More data , More data on disk pages
2. Using multi tree
B Trees
First, let's talk about his structure directly : A pointer to the current data , A pointer pointing down , Another is data
One m Hierarchical B Tree features : When a node has k A child , For example, the head node has three children , So there must be k-1 Key words , That is, two keywords can divide the keywords of the subtree into 3 A subset of

Inquire about :
Pictured above is an example : If the query value is 5:
First disk IO: Locate in memory ( And 17、35 Compare ), Than 17 Small , The left subtree ;
Second disk IO: Locate in memory ( And 8、12 Compare ), Than 8 Small , The left subtree ;
The third disk IO: Locate in memory ( And 3、5 Compare ), find 5, End .
The whole process , We can see that : The number of comparisons is no less than that of binary search trees , Especially when there are many data in a node , But disk IO The number of times is greatly reduced . The comparison is done in memory ,( Because we compare in memory , And we can have more than one data at a time , data up 了 , So it greatly reduces IO), Compared to disk IO The speed of , The time-consuming comparison is almost negligible . So when the height of the tree is low enough , Can greatly improve efficiency . by comparison , It doesn't matter if there are many elements in the node , Just a few more memory interactions , As long as it does not exceed the size of disk pages .
B+ Trees
structure : Pointer pointing down + data ( Only leaf nodes have ), A linked table is used below , Specifically linked data
1. Yes k The middle node of the subtree contains k Elements (B In the tree is k-1 Elements ), Each element does not hold data , Index only , All the data
Are saved in the leaf node .
2. All leaf nodes contain the information of all elements , And pointers to records containing these elements , The size of the node depends on the leaf itself
Link from small to large .
3. All the intermediate node elements exist in the child nodes at the same time , Is the largest of the child node elements ( Or the smallest ) Elements .
advantage : A node stores content compared to B Trees , One pointer is missing , So every time IO Load data from hard disk into memory , Can load more data than B More trees , The side shows B+ The optimization of the tree

lookup :
And B The difference between trees is , We don't have a satellite search ( That is, the current data record pointed to by the index element ), This means that the same disk page can hold more node elements ——>IO Less operation

What needs to be added is , Clustered index in database (Clustered Index) in , The leaf node directly contains satellite data . In the nonclustered index (NonClustered Index) in , The leaf node has a pointer to the satellite data ;
B+ Tree range lookup
First, find the lower limit of the range by binary search , At this point, the function of our linked list is reflected , The upper limit can be found directly through the link sequence traversal of leaf nodes , Process is simple , Efficient ;

summary
B+ Trees are compared to B The advantages of trees :
1. A single node stores more elements , Make the query IO Fewer times ;
2. All queries need to find leaf nodes , Query performance is stable ;
3. All leaf nodes form an ordered list , Easy range query .
边栏推荐
猜你喜欢

FPGA开发(2)——IIC通信

@Scheduled注解的坑,我替你踩了

这个简单的小功能,半年为我们产研团队省下213个小时

LC:最大子数组和

Halcon实用:焊点检出设计思路

2022年PMP项目管理考试敏捷知识点(5)

Overseas digital authentication service provider advance AI was selected into the "2022 brand sea Service Market Research Report" of equalocean

Software testing interface testing JMeter 5.5 installation tutorial

HPE launched ARM CPU general server product

CE第二次作业
随机推荐
疫情下我离职一年,收入增长了10倍
Cacti最大监控数测试
软件测试 接口测试 Jmeter 5.5 安装教程
Solr基础操作
Implementation of aut, a self-developed transport layer protocol for sound network -- dev for dev column
简单理解B树和B+树
label问题排查:打不开标注好的图像
基金的估值,费用,会计核算
Speech signal processing (II): phonation physiology, auditory physiology and auditory psychology
股票开户安全吗?上海股票开户。
请指教什么是在线开户?另外,手机开户安全么?
Matlab exercises -- program control process exercise
RRDTOOL draws MRTG log data
Regular expressions: characters (2)
【一起上水硕系列】Day 8
Metaq cluster installation test
How to solve the problem that the computer time is not automatically updated after proofreading
绿树公司官方网站
海外数字身份验证服务商ADVANCE.AI入选EqualOcean《2022品牌出海服务市场研究报告》
@Scheduled注解的坑,我替你踩了