当前位置:网站首页>MySQL advanced learning summary 7: MySQL data structure - Comparison of hash index, AVL tree, B tree and b+ tree
MySQL advanced learning summary 7: MySQL data structure - Comparison of hash index, AVL tree, B tree and b+ tree
2022-07-02 18:59:00 【koping_ wu】
Mysql *** 7:Mysql data structure -Hash Indexes 、AVL Trees 、B Trees 、B+ The contrast of trees
1、Mysql Data structure selection analysis
The last one introduced B+ How trees are indexed , In fact, there are other indexing methods , This article is about , By the way innoDB Why didn't you choose other indexing methods at that time , They chose to B+ Tree as the default index .
1.1 Full table traversal
PASS, There is no need to compare this .
1.2 Hash structure
Hash The algorithm is through some deterministic algorithm ( such as MD5, SHA1, SHA2, SHA3) Convert input to output . The same input can always get the same output .
Data structure to speed up search , Common are 2 class :
- Trees . For example, balanced binary search tree , Inquire about / Insert / modify / The average time complexity of deletion is O(log2n)
- Hash . such as HashMap, Inquire about / Insert / modify / The average time complexity of deletion is O(1)
Hash High structural efficiency , Then why should the index structure be designed as a tree ?
- Hash The index can only be equal to (=), It's not equal to (!=), and IN Query for . If carried out Range queries , Then the hash index will degenerate into O(n), And the tree feature can still be maintained O(log2n) Time complexity of .
- Hash The index has another flaw , Its data storage is There is no order Of . stay order by Under the circumstances ,Hash The index also needs to reorder the data .
- In the case of a federated index ,Hash The value is the combined index key and then calculated hash Of , Cannot query a single key or several index keys .
- although Equivalent query ,hash Indexing will be more efficient . But if If there are many duplicate values of index columns , Then the efficiency will be reduced . because hash When the conflict , You need to traverse the row pointer of the bucket to compare , Find the keyword of the query , It's very time consuming .
Hash The index uses the storage engine as shown in the following table , Only Memory Storage engine support .
Mysql Of Memory Storage engine support Hash Indexes , If you need to use the temporary table of the query , You can choose Memory Storage engine . Then set a field to Hash Indexes , When the repeatability of the field is low , And it is often necessary to conduct equivalent query , use Hash Indexing is a good choice .
in addition ,InnoDB Although it does not support Hash Indexes , But it provides adaptation Hash Indexes (Adaptive Hash Index). For example, some data is often accessed , that InnoDB The address of this data page will be stored in Hash In the table . In this way, the next time the same search conditions are performed , You can directly find the location of this page .
Can pass innodb_adaptive_hash_index Variable to see if adaptation is turned on Hash:
mysql> show variables like '%adaptive_hash_index';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| innodb_adaptive_hash_index | ON |
+----------------------------+-------+
1 row in set (0.33 sec)
1.2 Binary search tree
1、 Features of binary search tree
- A node has only 2 Child node
- The left child node < This node , The right child node >= This node
2、 Find the rules - If key Equal to the root node , Return the user data record of the root node
- If key Larger than root , In the right subtree
- If key Less than root , In the left subtree
The disadvantage of binary search tree is that in special cases , The time complexity of finding data will degenerate into O(n). For example, below :
In order to avoid the situation in the above figure , Of course, the core is to reduce disk I/O The number of times , You need to reduce the height of the tree as much as possible .
1.3 AVL Trees
In order to solve the problem that the above binary search tree degenerates into a linked list , A balanced binary search tree is also proposed (Balanced Binary Tree), Also known as AVL Trees :
It's an empty tree , Or the height difference between its left and right subtrees is no more than 1, And the left and right subtrees are both balanced binary trees .

In order to reduce the disk as much as possible I/O The number of times , That is to lower the height of the tree , Therefore, the binary tree can be changed to M Fork tree (M>2), In this way, the height of the tree can become lower .
1.4 B Trees
When mentioned above M Fork tree M When it's more valuable , It can be understood as a B Trees .
Pay attention to distinguish B Trees and B+ The difference between trees .
B In the directory entry record of the tree is It directly contains user record data Of , and B+ Only leaf nodes in the tree contain user record data .
Then why use B+ Trees , Do not apply B Trees ?
- B+ Tree query efficiency is more stable .B+ Only when the tree accesses the leaf node can it find the corresponding data , and B In the tree , Non leaf nodes also store data , This will result in unstable query efficiency .
- B+ Tree query is more efficient . because B The tree also stores the data recorded by the user , So each data page can save less than B+ The number of directory entry records of the tree , therefore B+ The height of the tree will be lower , So the disks needed I/O A little less .
- stay Query range ,B+ Trees will be more efficient . Because all keywords appear in B+ The leaf node of the tree , The data is incremental , Because it can be quickly found through binary search . and B Tree is essentially the search logic of binary search , Therefore, you need to traverse the middle order to complete the search of the query range , Much less efficient .
边栏推荐
- How can retail enterprises open the second growth curve under the full link digital transformation
- Eliminate the yellow alarm light on IBM p750 small computer [easy to understand]
- options should NOT have additional properties
- 如何清理废弃pv和其对应的文件夹
- Troubleshooting: kubectl reports an error validationerror: unknown field \u00a0
- Competence of product manager
- How to delete the border of links in IE? [repeat] - how to remove borders around links in IE? [duplicate]
- Mysql高级篇学习总结8:InnoDB数据存储结构页的概述、页的内部结构、行格式
- R language uses lrtest function of epidisplay package to perform likelihood ratio test on multiple GLM models (logisti regression). Compare whether the performance of the two models is different, and
- R language uses the lsnofunction function function of epidisplay package to list all objects in the current space, except user-defined function objects
猜你喜欢

一款简约PHP个人发卡程序V4.0版本
![[daily question] the next day](/img/8a/18329bd9b4a3a4445c8fbbc1ce562b.png)
[daily question] the next day

300+篇文献!一文详解基于Transformer的多模态学习最新进展

Google's official response: we have not given up tensorflow and will develop side by side with Jax in the future
![27: Chapter 3: develop Passport Service: 10: [registration / login] interface: after the registration / login is OK, save the user session information (uid, utoken) to redis and cookies; (one main poi](/img/b9/2066a13b160252114c2881007094f8.png)
27: Chapter 3: develop Passport Service: 10: [registration / login] interface: after the registration / login is OK, save the user session information (uid, utoken) to redis and cookies; (one main poi

呆错图床系统源码图片CDN加速与破J防盗链功能

CDN acceleration and breaking J anti-theft chain function

【JVM调优实战100例】02——虚拟机栈与本地方法栈调优五例
![[fluent] dart data type (VaR data type | object data type)](/img/1b/fe2529af5f6663fad1fb7861f14ab5.jpg)
[fluent] dart data type (VaR data type | object data type)

Yesterday, Alibaba senior wrote a responsibility chain model, and there were countless bugs
随机推荐
彻底搞懂基于Open3D的点云处理教程!
iptable端口重定向 MASQUERADE[通俗易懂]
【每日一题】第二天
开源物联网平台ThingsBoard的安装
Industrial software lecture - core technology analysis of 3D CAD design software - the second lecture of the Forum
SQL training 2
深度神经网络总结
How to enable the run dashboard function of idea
迷你高尔夫球场:伦敦休闲旅游好去处
The official docker image running container in version 1.5.1 can be set to use MySQL 8 driver?
Web实时通信技术之Websocket
【JVM调优实战100例】02——虚拟机栈与本地方法栈调优五例
promise 和 Observable 的区别
options should NOT have additional properties
AI开发调试系列第二弹:多机分布式调测探索之旅
options should NOT have additional properties
STM32G0 USB DFU 升级校验出错-2
科技公司不同人对Bug的反应 | 每日趣闻
哪个券商公司网上开户佣金低又安全又可靠
页面标题组件