当前位置:网站首页>The principle of hash table and the solution of hash conflict
The principle of hash table and the solution of hash conflict
2022-07-27 06:07:00 【The last tripod】
Hash table definition
Ideal search method : Without any comparison , Get the elements to search directly from the table at a time . If you construct a storage structure , By some kind of letter
Count (hashFunc) A one-to-one mapping relationship can be established between the storage location of an element and its key , Then when searching, you can quickly
Find the element .
When in this structure :
Insert elements
According to the key of the element to be inserted , This function calculates the storage location of the element and stores it according to this location
Search element
Do the same calculation for the key of the element , Take the function value as the storage location of the element , Take the element comparison in this position in the structure , if
The keys are equal , Then the search is successfulThis is called hash ( hash ) Method , The conversion function used in the hash method is called hash ( hash ) function , The constructed structure is called a hash table ( Or a hash table )
Hash Collisions
Because of the limited space , Hash functions are bound to have limitations , No matter what hash function it is, there will inevitably be multiple key Map to the same value , When one key When the stored location is mapped to another location key When occupied , Storage conflicts occur , Hash conflict .
Conflict is inevitable , But what we can do is try to reduce the conflict rate
Common hash functions
- Direct customization
Take a linear function of keyword as hash address :Hash(Key)= A*Key + B** advantage : Simple 、 uniform shortcoming : You need to know in advance
Distribution of key words Use scenarios : It is suitable for finding relatively small and continuous cases Interview questions : The first character in the string appears only once - Division and remainder
Set the number of addresses allowed in the hash table to m, Take one not greater than m, But closest to or equal to m The prime number of p As divisor , According to the hash function :
**Hash(key) = key% p(p<=m),** Convert key to hash address - Square with the middle method
Assume keyword is 1234, Yes, its square is 1522756, Extract the middle 3 position 227 As a hash address ; Another example is that the keyword is 4321, Yes
It's Square 18671041, Extract the middle 3 position 671( or 710) As a hash address The square middle method is more suitable for : I don't know the score of keywords
cloth , And the number of digits is not very big
Adjustment of load factor
Load factor definition : Load factor a = The number of elements in the table / The length of the hash table
Because the length of the table is a fixed value , The more elements in the table , The more elements that have hash conflicts , That is to say, the larger the load factor , The greater the probability of conflict .
In general , We need to control the load factor at 0.8 following :![[ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-IdjmqhOJ-1658664731165)(C:\Users\yang\AppData\Roaming\Typora\typora-user-images\image-20220724194527111.png)]](/img/98/dff55662e4a016dd1e09ec00e84663.png)
Resolution of hash conflicts
- Closed hash
It's also called open addressing , When a hash conflict occurs , If the hash table is not full , It means that there must be a space in the hash table , Then you can.
hold key Stored in a conflict location “ next ” Go to an empty place , There are generally two ways to transfer
Linear detection method
Start from where the conflict occurred , Sequential backward detection , Until we find the next empty position .
Here's the picture , If you want to store 44, We found that 44 The position that should be has been 4 Account for the , So we should put 44 Move straight back , Until there is a spare place

* Second detection
The drawback of linear detection is that conflicting data are stacked together , This has something to do with finding the next empty location , Because the way to find an empty position is to
Look back one by one , Therefore, in order to avoid this problem , The way to find the next empty location is :H = ( H0 + i^2 )% m, among :i Is through the hash function Hash(x) Key to element key The calculated position ,
m Is the size of the table .
> ** Studies have shown that :** When the length of the table is prime and the table loading factor a No more than 0.5 when , New entries must be able to insert , And no location will be explored twice . So as long as there are half empty positions in the table , There will be no table full problem . When searching, you can not consider the situation that the table is full , However, when inserting, you must ensure that the load factor of the table a No more than 0.5, If it exceeds the limit, we must consider increasing capacity .
>
> ** therefore : The biggest disadvantage of hash ratio is that the space utilization is relatively low , This is also the defect of hash **
Hash
Hash method is also called chain address method ( Open chain method ), First, we use hash function to calculate hash address for key set , Keys with the same address belong to the same subset , Each subset is called a bucket , The elements in each bucket are linked by a single chain table , The head nodes of each linked list are stored in the hash table .
The above example uses open hash to store as follows

Opening a list is also a common method , stay java Medium HashMap It is stored in the form of an open list .
stay java Fewer conflicts in a certain location are stored in the form of a linked list , When the limit is exceeded , The linked list will be transformed into a red black tree for storage .
At the same time, when the load factor reaches a certain limit , The hash table will be expanded and all elements will be hashed back to their proper positions
边栏推荐
猜你喜欢

浅记一下十大排序

Li Kou 236. the nearest common ancestor of binary tree

Lightroom Classic 2022 v11.4中文版「最新资源」

2022.6.10 stm32mp157 serial port clock learning

Leetcode每日一题30. 串联所有单词的子串

C# Json编码在TCP通讯中的一些使用总结

Unity 菜单界面的简单介绍

STM32 infrared remote control

Day 3. Suicidal ideation and behavior in institutions of higher learning: A latent class analysis

Unity 窗口界面的简单介绍
随机推荐
[first song] rebirth of me in py introductory training (2): formula programming
Weidongshan digital photo frame project learning (I) display ASCII characters on LCD
力扣每日一题leetcode 513. 找树左下角的值
arcgis style样式表文件转换成geoserver sld文件
arcgis for js api(2) 获取要素服务的id集合
PS 2022 updated in June, what new functions have been added
UnityShader-LowPoly
LaTeX中多个公式公用一个序号时
Lightroom Classic 2022 v11.4中文版「最新资源」
文件的路径
韦东山 数码相框 项目学习(四)简易的TXT文档显示器(电纸书)
使用Markdowm
Leetcode每日一题30. 串联所有单词的子串
数据库索引的一些说明以及使用
安装windows下的redis
2021-06-26
Stm32-fsmc extended memory SRAM
安全帽反光衣检测识别数据集和yolov5模型
2022.6.10 STM32MP157串口时钟的学习
acwing每日一题 正方形数组的数目