Here comes your interviewer , Wearing a plaid shirt , With a beer belly , A middle-aged man whose hairline has moved back seriously .
Holding a thermos cup soaked with Chinese wolfberry , With arms MacBook,MacBook There are company slogans on the :“ I love working overtime ”.

The interview begins , Go straight to the point .
interviewer : You know, MySQL Why is the underlying data structure of the index used B+ Trees ? without B Trees 、 Red black tree or ordinary binary tree ?
I : Who knows what the author thinks about this ? He may have used B+ The tree is used to it , Personal hobbies .
interviewer : You are quite open-minded . Let's stop here for today's interview , I will contact you when there is news later .
Could there be any news later ? When did you contact me on your own initiative ?
The truth is rejected , Instead of learning the eight part essay by heart, he was admitted .
ok , Let me see how Yideng summed it up MySQL The eight part essay of .
I : Need to know MySQL Why is the underlying data structure of the index used B+ Trees , First, we need to know what kind of data structure is more suitable for indexing .
To ensure data security , Generally, the data is stored in the disk . When we need to query data , Need to read disk , Disk is generated IO, Compare memory operations , disk IO The reading speed is very slow .
Because the required data may not be continuous on the disk , One data query requires multiple disks IO, So we need to design the index data structure to reduce the disk as much as possible IO frequency .
Let's take a look at the characteristics of these binary trees , And advantages and disadvantages , You know which data structure is more suitable for indexing .
What is a binary search tree :
- If the left subtree is not empty , Then the value of all nodes in the left subtree is less than the value of its root node ;
- If the right subtree is not empty , Then the value of all nodes in the right subtree is greater than the value of its root node ;
- Left 、 Right subtrees are also binary search trees ;

The time complexity of binary search tree is O(logN), As shown in the figure , Search for at most 3 You can find the required data once .
The ideal is full , Reality is thin . In extreme cases , Binary search tree may degenerate into linear linked list .

The search time complexity of linked list is O(N), At this time, you need at most 7 Times to find the required data .
What to do ? So we thought of adding some restrictions to the binary tree , Balance the left and right subtrees , Then, many balance trees are derived : Balanced binary search tree 、 Red and black trees 、B Trees 、B+ Trees . Let's talk about the advantages and disadvantages of these trees , See which tree is best for indexing .
What is a red-black tree ?
- The nodes are red or black
- The root node is black
- All the leaves are black ( Leaves are NIL node )
- The two child nodes of each red node are black ( There cannot be two consecutive red nodes on all paths from each leaf to the root )
- All paths from any node to each leaf contain the same number of black nodes

See if you've been blindfolded ?
So many complicated rules , This is to ensure that the longest path from the root node to the leaf node does not exceed the shortest path 2 times .
When inserting or deleting nodes , In order to satisfy the red black tree rule , Discoloration and rotation may be required , This is a complex and time-consuming process .
The advantages of red and black trees :
The height of the left and right subtrees is limited , No big difference .
shortcoming :
The rules are complicated , Most people want to understand this thing , It's already very hard , Let alone use .
What is? B Trees ?
We know , The higher the height of the tree , The more you look up , That is, disk IO The more times , The longer it takes ,
Can we find a way to reduce the height of the tree , Turn a binary tree into N Fork tree ? therefore B The tree is coming .
For one m Step B Trees :
- The root node has at least 2 Child node
- Each intermediate node contains k-1 Elements and k Child node , among m/2 <= k <= m
- Each leaf node contains k-1 Elements , among m/2 <= k <= m
- The elements of the intermediate nodes are arranged in ascending order
- All leaf nodes are on the same layer

The root node (8) There are two child nodes , The left child node (3 5) And right child nodes (11 15).
The left child node (3 5) There is 2 Elements and 3 Child node .
The element is 3 and 5, In ascending order .
The child node is (1 2)、(4)、(6 7),
and (1 2) Element in is less than 3,(4) The elements in 3 and 5 middle ,(6 7) The element of is greater than 5, accord with B Tree features .
B What are the advantages of a tree design ?
Lower height , Each node contains multiple elements , When searching, you can load all the elements in a node into memory for comparison , Both improvements greatly reduce the number of disks IO frequency .
What is? B+ Trees ?
Comparison B Trees ,B+ The tree made the following agreement :
Yes k The intermediate nodes of the child nodes have k Elements (B In the tree is k-1 Elements ), That is, the number of child nodes = Element quantity .
Each element does not hold data , Index only , All data is stored in the leaf node .All the intermediate node elements exist in the child nodes at the same time , Is the largest of the child node elements ( Or the smallest ) Elements .
Non leaf nodes only store indexes , Don't save data .(B Both are saved in the tree )
The leaf node contains the information of all elements , And the leaf nodes form an ordered list according to the size of the elements .

B+ What are the advantages of this tree design ?
- Each node stores more elements , Looks like B The trees are fatter , Causes the disk to IO Fewer times .
- Non leaf node does not store data , Store index only , The leaf node stores all the data .
This design causes leaf nodes to be found every time , More stable efficiency , It is convenient for performance optimization . - Leaf nodes are connected by an ordered linked list .
This design facilitates range lookup , Just traverse the adjacent elements in the linked list , No need to traverse binary tree twice .
Obviously ,B Trees and B+ Tree is designed for file retrieval system , It is more suitable for index structure .
interviewer : You have to , Just what you concluded , I don't think so , Come to work tomorrow , Salary double.
This paper summarizes the knowledge points :

Articles are constantly updated , You can search through wechat 「 One light architecture 」 Read more technical dry goods for the first time .









