当前位置:网站首页>MySQL advanced learning summary 7: MySQL data structure - Comparison of hash index, AVL tree, B tree and b+ tree
MySQL advanced learning summary 7: MySQL data structure - Comparison of hash index, AVL tree, B tree and b+ tree
2022-07-02 18:59:00 【koping_ wu】
Mysql *** 7:Mysql data structure -Hash Indexes 、AVL Trees 、B Trees 、B+ The contrast of trees
1、Mysql Data structure selection analysis
The last one introduced B+ How trees are indexed , In fact, there are other indexing methods , This article is about , By the way innoDB Why didn't you choose other indexing methods at that time , They chose to B+ Tree as the default index .
1.1 Full table traversal
PASS, There is no need to compare this .
1.2 Hash structure
Hash The algorithm is through some deterministic algorithm ( such as MD5, SHA1, SHA2, SHA3) Convert input to output . The same input can always get the same output .
Data structure to speed up search , Common are 2 class :
- Trees . For example, balanced binary search tree , Inquire about / Insert / modify / The average time complexity of deletion is O(log2n)
- Hash . such as HashMap, Inquire about / Insert / modify / The average time complexity of deletion is O(1)
Hash High structural efficiency , Then why should the index structure be designed as a tree ?
- Hash The index can only be equal to (=), It's not equal to (!=), and IN Query for . If carried out Range queries , Then the hash index will degenerate into O(n), And the tree feature can still be maintained O(log2n) Time complexity of .
- Hash The index has another flaw , Its data storage is There is no order Of . stay order by Under the circumstances ,Hash The index also needs to reorder the data .
- In the case of a federated index ,Hash The value is the combined index key and then calculated hash Of , Cannot query a single key or several index keys .
- although Equivalent query ,hash Indexing will be more efficient . But if If there are many duplicate values of index columns , Then the efficiency will be reduced . because hash When the conflict , You need to traverse the row pointer of the bucket to compare , Find the keyword of the query , It's very time consuming .
Hash The index uses the storage engine as shown in the following table , Only Memory Storage engine support .
Mysql Of Memory Storage engine support Hash Indexes , If you need to use the temporary table of the query , You can choose Memory Storage engine . Then set a field to Hash Indexes , When the repeatability of the field is low , And it is often necessary to conduct equivalent query , use Hash Indexing is a good choice .
in addition ,InnoDB Although it does not support Hash Indexes , But it provides adaptation Hash Indexes (Adaptive Hash Index). For example, some data is often accessed , that InnoDB The address of this data page will be stored in Hash In the table . In this way, the next time the same search conditions are performed , You can directly find the location of this page .
Can pass innodb_adaptive_hash_index Variable to see if adaptation is turned on Hash:
mysql> show variables like '%adaptive_hash_index';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| innodb_adaptive_hash_index | ON |
+----------------------------+-------+
1 row in set (0.33 sec)
1.2 Binary search tree
1、 Features of binary search tree
- A node has only 2 Child node
- The left child node < This node , The right child node >= This node
2、 Find the rules - If key Equal to the root node , Return the user data record of the root node
- If key Larger than root , In the right subtree
- If key Less than root , In the left subtree
The disadvantage of binary search tree is that in special cases , The time complexity of finding data will degenerate into O(n). For example, below :
In order to avoid the situation in the above figure , Of course, the core is to reduce disk I/O The number of times , You need to reduce the height of the tree as much as possible .
1.3 AVL Trees
In order to solve the problem that the above binary search tree degenerates into a linked list , A balanced binary search tree is also proposed (Balanced Binary Tree), Also known as AVL Trees :
It's an empty tree , Or the height difference between its left and right subtrees is no more than 1, And the left and right subtrees are both balanced binary trees .

In order to reduce the disk as much as possible I/O The number of times , That is to lower the height of the tree , Therefore, the binary tree can be changed to M Fork tree (M>2), In this way, the height of the tree can become lower .
1.4 B Trees
When mentioned above M Fork tree M When it's more valuable , It can be understood as a B Trees .
Pay attention to distinguish B Trees and B+ The difference between trees .
B In the directory entry record of the tree is It directly contains user record data Of , and B+ Only leaf nodes in the tree contain user record data .
Then why use B+ Trees , Do not apply B Trees ?
- B+ Tree query efficiency is more stable .B+ Only when the tree accesses the leaf node can it find the corresponding data , and B In the tree , Non leaf nodes also store data , This will result in unstable query efficiency .
- B+ Tree query is more efficient . because B The tree also stores the data recorded by the user , So each data page can save less than B+ The number of directory entry records of the tree , therefore B+ The height of the tree will be lower , So the disks needed I/O A little less .
- stay Query range ,B+ Trees will be more efficient . Because all keywords appear in B+ The leaf node of the tree , The data is incremental , Because it can be quickly found through binary search . and B Tree is essentially the search logic of binary search , Therefore, you need to traverse the middle order to complete the search of the query range , Much less efficient .
边栏推荐
- iptable端口重定向 MASQUERADE[通俗易懂]
- Industrial software lecture - core technology analysis of 3D CAD design software - the second lecture of the Forum
- Matlab中弧度转角度、角度转弧度
- STM32G0 USB DFU 升级校验出错-2
- Leetcode (81) -- search rotation sort array II
- The student Tiktok publicized that his alma mater was roast about "reducing the seal of enrollment". Netizen: hahahahahahahaha
- R language ggplot2 visualization: visualize the line chart and add customized X-axis label information to the line chart using labs function
- Redis(7)----数据库与过期键
- Slam | how to align timestamps?
- options should NOT have additional properties
猜你喜欢

How to play when you travel to Bangkok for the first time? Please keep this money saving strategy

Stratégie touristique d'été de Singapour: un jour pour visiter l'île de San taosha à Singapour

The text editor hopes to mark the wrong sentences in red, and the text editor uses markdown
![27: Chapter 3: develop Passport Service: 10: [registration / login] interface: after the registration / login is OK, save the user session information (uid, utoken) to redis and cookies; (one main poi](/img/b9/2066a13b160252114c2881007094f8.png)
27: Chapter 3: develop Passport Service: 10: [registration / login] interface: after the registration / login is OK, save the user session information (uid, utoken) to redis and cookies; (one main poi

彻底搞懂基于Open3D的点云处理教程!

Leetcode interview question 17.04 Vanishing numbers

一款简约PHP个人发卡程序V4.0版本

迷你高尔夫球场:伦敦休闲旅游好去处

UML class diagram

深度学习数学基础
随机推荐
Responses of different people in technology companies to bugs | daily anecdotes
页面标题组件
Distance measurement - Jaccard distance
产品经理应具备的能力
什么是云原生?这回终于能搞明白了!
R语言dplyr包na_if函数把向量数值中的控制转化为缺失值NA、按照映射规则把指定内容转化为缺失值NA
Which securities company has a low, safe and reliable online account opening commission
promise 和 Observable 的区别
链游系统开发(Unity3D链游开发详情)丨链游开发成熟技术源码
医院在线问诊源码 医院视频问诊源码 医院小程序源码
Mysql高级篇学习总结8:InnoDB数据存储结构页的概述、页的内部结构、行格式
距离度量 —— 杰卡德距离(Jaccard Distance)
全链路数字化转型下,零售企业如何打开第二增长曲线
Competence of product manager
Stretchdibits function
R language ggplot2 visualization: gganimate package creates dynamic histogram animation (GIF) and uses transition_ The States function displays a histogram step by step along a given dimension in the
Uncover the whole link communication process of dewu customer service im
The difference between SLC, MLC, TLC and QLC NAND SSD: which is better?
Slam | how to align timestamps?
哪个券商公司网上开户佣金低又安全又可靠