当前位置:网站首页>Deng Junhui's notes on data structure and algorithm learning - Chapter 9

Deng Junhui's notes on data structure and algorithm learning - Chapter 9

2020-11-09 08:20:00 osc_0m0d4mbq


day58
Chapter nine The dictionary

9. The dictionary

9.b Hash principle

 Insert picture description here  Insert picture description here Hashing- hash
 Insert picture description here The data comes from a fairly large space , But the actual data to be stored and organized is a very small subset of it , Space efficiency is extremely low
 Insert picture description here Compress space
 Insert picture description here  Insert picture description here The pigeon nest problem , take x From a larger definition domain , Mapping to a smaller range of values , There are ways to reduce conflict ( Design better hash function or increase hash table length M), But it can't be avoided ( How to solve ?)


9.c Hash function

 Insert picture description here  Insert picture description here  Insert picture description here  Insert picture description here gcd: The greatest common factor is 1
 Insert picture description here  Insert picture description here  Insert picture description here Take the middle bit , It can make the influence of each digit of the original key code closer to each other , Here's the picture , Square operations can be decomposed into addition operations
 Insert picture description here  Insert picture description here The more random the hash function is , The more irregular , The better .
 Insert picture description here  Insert picture description here key It may not be an integer , You need to convert it to hashcode, Do it again. , Here's the picture
 Insert picture description here  Insert picture description here It is necessary to , If you use a simple calculation method , It's easy to have hash conflicts
 Insert picture description here




9.d Conflict

9.d1 Hash resolves conflicts 1

 Insert picture description here  Insert picture description here  Insert picture description here  Insert picture description here  Insert picture description here  Insert picture description here  Insert picture description here  Insert picture description here Skip when searching , Insert directly when inserting .

9.d2 Hash resolves conflicts 2

 Insert picture description here Closed addressing vs Open addressing
The linear heuristics described above , It's too close to each other , There will be a lot of unnecessary conflicts . You can open the space properly
 Insert picture description here  Insert picture description here Suppose the size of a cache is 1KB, Only the pointer is stored in the bucket (4 byte ), Well, unless 16 Secondary hash conflict , To make the cache invalid , Need extra I/O.
 Insert picture description here  Insert picture description here b+a≥2, yes M The nontrivial factor of , This is related to M Contradict prime numbers .
 Insert picture description here  Insert picture description here For some primes ( for example 7,11) The long table , The two-way search chain works , Some primes ( for example 5,13) The effect of the surface length is not good .
 Insert picture description here  Insert picture description here  Insert picture description here  Insert picture description here




9.e bucket / Count sorting

 Insert picture description here among n Is the number of elements ,[0,M) Is the element range
 Insert picture description here  Insert picture description here  Insert picture description here Where the red line is the integral of the blue line
 Insert picture description here Linear scan over and over again count[], Scan again count[] obtain accum[], You can sort in linear time .



版权声明
本文为[osc_0m0d4mbq]所创,转载请带上原文链接,感谢