[hash table basics]
2022-07-28 06:40:00 【Step by step b】
What is a hash table
A hash table is a data structure that is accessed directly by key. In other words, it accesses records by mapping each key to a position in a table, which speeds up lookup. The mapping function is called the hash function, and the array that holds the records is called the hash table.
The official definition may be a little confusing. Put simply, an ordinary array can already be seen as a hash table: the index plays the role of the key.
Hashing is a very common lookup technique. On average, it can find data in O(1) time.
For example, suppose I have a set of data and want to quickly find out whether a given value is in that set. What should I do?
The straightforward approach is to traverse the data, but if there is too much of it, the cost of each traversal becomes unacceptable.
If the data were ordered, you could use binary search or a tree data structure, which is also very efficient. Unfortunately, our data is unordered.
So someone came up with a very clever approach: run a computation on the data to be looked up (hereinafter the key) to obtain an array index, then store the data at that index in the array.
From then on, every lookup computes the index from the key in the same way and reads the array element at that index, which tells us whether the key exists in the array.
A data structure that looks up data this way is called a hash table, and the method that computes the index from the key is called the hash function.
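The idea above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the keys are non-negative integers, `TABLE_SIZE`, `hash_index`, `build_table`, and `contains` are hypothetical names, and the sample keys are chosen so that no two of them land on the same index.

```python
TABLE_SIZE = 10  # hypothetical size chosen for illustration


def hash_index(key: int) -> int:
    """Toy hash function: map an integer key to an array index."""
    return key % TABLE_SIZE


def build_table(keys):
    """Store each key at the index computed from it (no collision handling)."""
    table = [None] * TABLE_SIZE
    for key in keys:
        table[hash_index(key)] = key
    return table


def contains(table, key) -> bool:
    """One hash computation plus one array access, with no traversal."""
    return table[hash_index(key)] == key


table = build_table([21, 34, 47, 58])  # indices 1, 4, 7, 8
print(contains(table, 47))  # True
print(contains(table, 48))  # False
```

The lookup cost does not depend on how many keys are stored, which is where the O(1) average comes from.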
What is a hash function
A hash function converts the given data into an irregular-looking value of fixed length. The converted value can serve as a digest of the data in various scenarios.
We can think of a hash function as a blender: the data is put into the blender, and once the hash function has been computed, the blender outputs an irregular-looking value of fixed length. This output is the "hash value". A hash value is a number, but it is usually written in hexadecimal.
Computers manage all data in binary. Although a hash value is written in hexadecimal, it is still data, so when the computer stores a hash value it manages it in binary like everything else.
Characteristics of hash function
- The length of the hash value is independent of the size of the input data
- The same input always produces the same hash value.
- Similar inputs generally produce very different hash values.
- Completely different inputs may still produce the same hash value. This is called a "hash collision".
- A hash value is irreversible: in practice, the original data cannot be deduced from the hash value.
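Several of these characteristics can be observed directly. The sketch below uses SHA-256 from Python's standard `hashlib` module purely as an example hash function (the article does not name a specific one), and `digest` is a hypothetical helper.

```python
import hashlib


def digest(data: str) -> str:
    """Return the SHA-256 hash of a string as hexadecimal text."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


# Fixed length, regardless of input size.
short = digest("a")
long_ = digest("a" * 10_000)
print(len(short), len(long_))  # 64 64

# Same input, same hash value.
print(digest("hello") == digest("hello"))  # True

# Similar inputs, different hash values.
print(digest("hello") == digest("hellp"))  # False

# The hexadecimal text is just a way of writing one large number.
print(int(short, 16) == int.from_bytes(hashlib.sha256(b"a").digest(), "big"))  # True
```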
Hash Collisions
A hash collision occurs when several different keys are hashed to the same array index.
For example, suppose the three words ear, Duo, and No all get array index 0 after hashing. Because they are three different values, they cannot simply overwrite one another in the array, so we need a way to store all three.
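A collision is easy to reproduce with a toy hash function. The sketch below is only an illustration: `toy_hash`, the table size of 7, and the keys "abc", "bca", and "cab" are hypothetical stand-ins for the three words, chosen because keys made of the same letters always collide under this function.

```python
TABLE_SIZE = 7  # hypothetical table size


def toy_hash(key: str) -> int:
    """Toy hash function: sum of character codes modulo the table size."""
    return sum(ord(ch) for ch in key) % TABLE_SIZE


keys = ["abc", "bca", "cab"]  # same letters, so same character-code sum
indices = [toy_hash(key) for key in keys]
print(indices)  # [0, 0, 0]

# Naively writing each key to its index keeps only the last one.
table = [None] * TABLE_SIZE
for key in keys:
    table[toy_hash(key)] = key
print(table[0])  # cab
```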
There are generally two ways to resolve hash collisions: the zipper method (also known as separate chaining) and the open address method (also known as open addressing).
Zipper method: maintain a linked list at each index where a collision occurs, and put all colliding elements on that linked list in turn.
The colliding keys are placed in the linked list in order, so the next lookup only needs to go to the array element and traverse its linked list.
The key to the zipper method is choosing an appropriate hash table size, so that memory is not wasted on empty array slots and lookups are not slowed down by overly long lists.
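The zipper method can be sketched as follows. This is a minimal sketch, not a production implementation: `ChainedHashTable` is a hypothetical class, a Python list stands in for each linked list, and the hash function is a toy one (sum of character codes modulo the table size) picked so that the three sample keys collide at index 0.

```python
class ChainedHashTable:
    """Hash set that resolves collisions by chaining."""

    def __init__(self, size: int = 7):
        # One chain per array index; a Python list stands in for a linked list.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key: str) -> int:
        # Toy hash function (an assumption for this sketch).
        return sum(ord(ch) for ch in key) % len(self.buckets)

    def add(self, key: str) -> None:
        chain = self.buckets[self._index(key)]
        if key not in chain:  # avoid storing duplicates
            chain.append(key)

    def contains(self, key: str) -> bool:
        # Go to the array element, then traverse only its chain.
        return key in self.buckets[self._index(key)]


table = ChainedHashTable()
for word in ["abc", "bca", "cab"]:
    table.add(word)

print(table.buckets[0])       # ['abc', 'bca', 'cab']
print(table.contains("bca"))  # True
print(table.contains("xyz"))  # False
```

With a table this small every chain grows quickly; a larger table spreads keys over more chains at the cost of more empty slots, which is the size trade-off described above.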
Open address method: a relatively simple way to resolve collisions, and its principle is very simple.
After the first word, ear, has taken index 0, the second word, Duo, searches onward for a free index and stores itself there once it finds one. So Duo ends up at index 1 and No at index 2.
Depending on how the free index is searched for, the open address method comes in several variants. Their principles are almost the same: all of them search for a free index starting from the current one, and differ only in the step they take.
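One common variant is linear probing, which moves forward one index at a time. The sketch below assumes a hypothetical `LinearProbingTable` class, a toy hash function, a fixed table size of 7, and no support for deletion. The keys "abc", "bca", and "cab" are stand-ins that all hash to index 0, so they end up at indices 0, 1, and 2.

```python
class LinearProbingTable:
    """Hash set using open addressing with a probe step of 1."""

    def __init__(self, size: int = 7):
        self.slots = [None] * size

    def _home(self, key: str) -> int:
        # Toy hash function (an assumption for this sketch).
        return sum(ord(ch) for ch in key) % len(self.slots)

    def add(self, key: str) -> int:
        """Store the key and return the index it ended up at."""
        size = len(self.slots)
        start = self._home(key)
        for step in range(size):
            i = (start + step) % size  # wrap around at the end of the array
            if self.slots[i] is None or self.slots[i] == key:
                self.slots[i] = key
                return i
        raise RuntimeError("hash table is full")

    def contains(self, key: str) -> bool:
        size = len(self.slots)
        start = self._home(key)
        for step in range(size):
            i = (start + step) % size
            if self.slots[i] is None:  # a free slot ends the probe sequence
                return False
            if self.slots[i] == key:
                return True
        return False


table = LinearProbingTable()
print([table.add(word) for word in ["abc", "bca", "cab"]])  # [0, 1, 2]
print(table.contains("cab"))  # True
print(table.contains("acb"))  # False
```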
Three common hash structures
- Array (array)
- Set (set)
- Map (map)
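In C++, set and map each come in three variants: std::set, std::multiset, and std::unordered_set for sets, and std::map, std::multimap, and std::unordered_map for maps. They differ in their underlying implementation and in their advantages and disadvantages: the unordered variants are built on hash tables, while the others are ordered containers, typically implemented as red-black trees.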
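The three structures can also be tried out quickly in Python, whose built-in `set` and `dict` are hash-based. This is a rough analogy rather than the C++ containers themselves, and `letter_counts` is a hypothetical helper.

```python
def letter_counts(word: str) -> list:
    """Array as a hash table: the letter's offset from 'a' is the index."""
    counts = [0] * 26
    for ch in word:
        counts[ord(ch) - ord("a")] += 1
    return counts


counts = letter_counts("banana")
print(counts[0], counts[1], counts[13])  # 3 1 2  (a, b, n)

# Set: answers "is this key present?"
seen = {3, 8, 15}
print(8 in seen)  # True
print(9 in seen)  # False

# Map: associates each key with a value.
ages = {"alice": 30, "bob": 25}
print(ages["alice"])  # 30
```

An array works when the keys are small and dense (such as lowercase letters); a set or map is the usual choice when keys are sparse or not integers.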

The disadvantages of hash tables
1. Too many hash collisions slow lookups down, because colliding keys have to be searched one by one.
2. Range lookups are not supported.
For the first disadvantage, there are generally not too many collisions in normal use.
For the second disadvantage, this is an unavoidable weakness of hash tables. When range lookups are needed, a tree data structure strikes a better balance between range search and logarithmic lookup speed.
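The range-lookup limitation can be seen in a small comparison. Python's standard library has no balanced tree, so the sketch below uses a sorted list with `bisect` as a stand-in for an ordered structure; the data and the function names are hypothetical.

```python
import bisect

data = [42, 7, 19, 88, 23, 61]
as_set = set(data)        # hash-based: fast membership, no ordering
as_sorted = sorted(data)  # ordered stand-in for a tree


def range_query_set(lo: int, hi: int) -> list:
    """A hash set has no order, so every element must be checked."""
    return sorted(x for x in as_set if lo <= x <= hi)


def range_query_sorted(lo: int, hi: int) -> list:
    """An ordered structure finds both ends of the range by binary search."""
    left = bisect.bisect_left(as_sorted, lo)
    right = bisect.bisect_right(as_sorted, hi)
    return as_sorted[left:right]


print(range_query_set(20, 70))     # [23, 42, 61]
print(range_query_sorted(20, 70))  # [23, 42, 61]
```

Both return the same answer, but the hash set has to scan all of its elements, while the ordered structure only touches the two boundaries and the matching elements.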
Of course, even when a hash table holds a large amount of data, operating on the records is fast: the average time per operation is a small constant. That is enough to outweigh many of its shortcomings.