当前位置:网站首页>MySQL advanced learning summary 7: MySQL data structure - Comparison of hash index, AVL tree, B tree and b+ tree
MySQL advanced learning summary 7: MySQL data structure - Comparison of hash index, AVL tree, B tree and b+ tree
2022-07-02 18:59:00 【koping_ wu】
Mysql *** 7:Mysql data structure -Hash Indexes 、AVL Trees 、B Trees 、B+ The contrast of trees
1、Mysql Data structure selection analysis
The last one introduced B+ How trees are indexed , In fact, there are other indexing methods , This article is about , By the way innoDB Why didn't you choose other indexing methods at that time , They chose to B+ Tree as the default index .
1.1 Full table traversal
PASS, There is no need to compare this .
1.2 Hash structure
Hash The algorithm is through some deterministic algorithm ( such as MD5, SHA1, SHA2, SHA3) Convert input to output . The same input can always get the same output .
Data structure to speed up search , Common are 2 class :
- Trees . For example, balanced binary search tree , Inquire about / Insert / modify / The average time complexity of deletion is O(log2n)
- Hash . such as HashMap, Inquire about / Insert / modify / The average time complexity of deletion is O(1)
Hash High structural efficiency , Then why should the index structure be designed as a tree ?
- Hash The index can only be equal to (=), It's not equal to (!=), and IN Query for . If carried out Range queries , Then the hash index will degenerate into O(n), And the tree feature can still be maintained O(log2n) Time complexity of .
- Hash The index has another flaw , Its data storage is There is no order Of . stay order by Under the circumstances ,Hash The index also needs to reorder the data .
- In the case of a federated index ,Hash The value is the combined index key and then calculated hash Of , Cannot query a single key or several index keys .
- although Equivalent query ,hash Indexing will be more efficient . But if If there are many duplicate values of index columns , Then the efficiency will be reduced . because hash When the conflict , You need to traverse the row pointer of the bucket to compare , Find the keyword of the query , It's very time consuming .
Hash The index uses the storage engine as shown in the following table , Only Memory Storage engine support .
Mysql Of Memory Storage engine support Hash Indexes , If you need to use the temporary table of the query , You can choose Memory Storage engine . Then set a field to Hash Indexes , When the repeatability of the field is low , And it is often necessary to conduct equivalent query , use Hash Indexing is a good choice .
in addition ,InnoDB Although it does not support Hash Indexes , But it provides adaptation Hash Indexes (Adaptive Hash Index). For example, some data is often accessed , that InnoDB The address of this data page will be stored in Hash In the table . In this way, the next time the same search conditions are performed , You can directly find the location of this page .
Can pass innodb_adaptive_hash_index Variable to see if adaptation is turned on Hash:
mysql> show variables like '%adaptive_hash_index';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| innodb_adaptive_hash_index | ON |
+----------------------------+-------+
1 row in set (0.33 sec)
1.2 Binary search tree
1、 Features of binary search tree
- A node has only 2 Child node
- The left child node < This node , The right child node >= This node
2、 Find the rules - If key Equal to the root node , Return the user data record of the root node
- If key Larger than root , In the right subtree
- If key Less than root , In the left subtree
The disadvantage of binary search tree is that in special cases , The time complexity of finding data will degenerate into O(n). For example, below :
In order to avoid the situation in the above figure , Of course, the core is to reduce disk I/O The number of times , You need to reduce the height of the tree as much as possible .
1.3 AVL Trees
In order to solve the problem that the above binary search tree degenerates into a linked list , A balanced binary search tree is also proposed (Balanced Binary Tree), Also known as AVL Trees :
It's an empty tree , Or the height difference between its left and right subtrees is no more than 1, And the left and right subtrees are both balanced binary trees .
In order to reduce the disk as much as possible I/O The number of times , That is to lower the height of the tree , Therefore, the binary tree can be changed to M Fork tree (M>2), In this way, the height of the tree can become lower .
1.4 B Trees
When mentioned above M Fork tree M When it's more valuable , It can be understood as a B Trees .
Pay attention to distinguish B Trees and B+ The difference between trees .
B In the directory entry record of the tree is It directly contains user record data Of , and B+ Only leaf nodes in the tree contain user record data .
Then why use B+ Trees , Do not apply B Trees ?
- B+ Tree query efficiency is more stable .B+ Only when the tree accesses the leaf node can it find the corresponding data , and B In the tree , Non leaf nodes also store data , This will result in unstable query efficiency .
- B+ Tree query is more efficient . because B The tree also stores the data recorded by the user , So each data page can save less than B+ The number of directory entry records of the tree , therefore B+ The height of the tree will be lower , So the disks needed I/O A little less .
- stay Query range ,B+ Trees will be more efficient . Because all keywords appear in B+ The leaf node of the tree , The data is incremental , Because it can be quickly found through binary search . and B Tree is essentially the search logic of binary search , Therefore, you need to traverse the middle order to complete the search of the query range , Much less efficient .
边栏推荐
- The text editor hopes to mark the wrong sentences in red, and the text editor uses markdown
- SAP S/4HANA OData Mock Service 介绍
- How to clean up discarded PVs and their corresponding folders
- 一款简约PHP个人发卡程序V4.0版本
- Web version 3D visualization tool, 97 things programmers should know, AI frontier paper | information daily # 2022.07.01
- Mysql高级篇学习总结7:Mysql数据结构-Hash索引、AVL树、B树、B+树的对比
- 问题包含哪些环节
- Leetcode(81)——搜索旋转排序数组 II
- 阿里三面被面试官狂问Redis,简历上再也不敢写'精通'了
- 深度学习数学基础
猜你喜欢
Thoroughly understand the point cloud processing tutorial based on open3d!
A simple PHP personal card issuing program v4.0
距离度量 —— 杰卡德距离(Jaccard Distance)
UML 类图
CDN acceleration and breaking J anti-theft chain function
After 22 years in office, the father of PowerShell will leave Microsoft: he was demoted by Microsoft for developing PowerShell
300+ documents! This article explains the latest progress of multimodal learning based on transformer
Industrial software lecture - core technology analysis of 3D CAD design software - the second lecture of the Forum
新加坡暑假旅游攻略:一天玩转新加坡圣淘沙岛
深度学习数学基础
随机推荐
Redis (7) -- database and expiration key
R language ggplot2 visual Facet: gganimate package is based on Transition_ Time function to create dynamic scatter animation (GIF)
How to set vscode to delete the whole line shortcut key?
Three ways of function parameter transfer in C language
How to use PS to extract image color and analyze color matching
材质UV遮罩的技巧
R语言使用epiDisplay包的lrtest函数对多个glm模型(logisti回归)执行似然比检验(Likelihood ratio test)对比两个模型的性能是否有差异、广义线性模型的似然比检
Leetcode (81) -- search rotation sort array II
Deep neural network Summary
R语言使用epiDisplay包的lsNoFunction函数列出当前空间中的所有对象、除了用户自定义的函数对象
Meta universe chain game system development (logic development) - chain game system development (detailed analysis)
Leetcode interview question 17.01 Addition without plus sign
【JVM调优实战100例】03——JVM堆调优四例
UML class diagram
options should NOT have additional properties
A simple PHP personal card issuing program v4.0
Typical application of "stack" - expression evaluation (implemented in C language)
Radian to angle, angle to radian in MATLAB
Chain game system development (unity3d chain game development details) - chain game development mature technology source code
Distance measurement - Jaccard distance