Bloom filter
2022-06-30 09:41:00 【Zip-List】
The Bloom filter
Definition
A Bloom filter is a probabilistic data structure characterized by efficient insertion and lookup. It can tell you with certainty that a string definitely does not exist, or that it possibly exists.
Compared with traditional lookup structures (for example hash, set, map and so on), a Bloom filter is more efficient and takes up less space. The drawback is that the result it returns is probabilistic, that is, the result may be wrong, although this error is controllable. It also does not support deletion.
Composition
A bitmap (bit array) + k hash functions
Principle
When an element is added, k hash functions map it to k positions in the bitmap, and those positions are set to 1. On lookup, the same k hash functions are applied again to check whether all k positions are 1. If any position is not 1, the element definitely does not exist; if all of them are 1, the element may exist (this answer can be wrong).
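The add and lookup steps described above can be sketched in a few lines of C. This is only an illustrative sketch, not the author's implementation: the names bloom_t, bloom_init, bloom_add, bloom_maybe_contains and bloom_free are hypothetical, and a seeded FNV-1a hash stands in for whatever k hash functions a real filter would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical minimal Bloom filter: an m-bit bitmap and k seeded hashes. */
typedef struct {
    uint8_t *bits; /* bitmap, m bits packed 8 per byte */
    uint64_t m;    /* number of bits in the bitmap */
    unsigned k;    /* number of hash functions */
} bloom_t;

/* Seeded FNV-1a: a stand-in for any well-mixed hash (an assumption of this sketch). */
static uint64_t seeded_hash(const void *key, size_t len, uint64_t seed)
{
    const uint8_t *p = (const uint8_t *)key;
    uint64_t h = 14695981039346656037ULL ^ (seed * 0x9E3779B97F4A7C15ULL);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Allocate a zeroed bitmap; returns 0 on success, -1 on allocation failure. */
static int bloom_init(bloom_t *bf, uint64_t m, unsigned k)
{
    assert(m > 0 && k > 0);
    bf->bits = calloc((size_t)((m + 7) / 8), 1);
    bf->m = m;
    bf->k = k;
    return bf->bits ? 0 : -1;
}

/* Insert: set the k bits the key maps to. */
static void bloom_add(bloom_t *bf, const void *key, size_t len)
{
    for (unsigned i = 0; i < bf->k; i++) {
        uint64_t pos = seeded_hash(key, len, i) % bf->m;
        bf->bits[pos / 8] |= (uint8_t)(1u << (pos % 8));
    }
}

/* Query: 0 means definitely absent, 1 means possibly present. */
static int bloom_maybe_contains(const bloom_t *bf, const void *key, size_t len)
{
    for (unsigned i = 0; i < bf->k; i++) {
        uint64_t pos = seeded_hash(key, len, i) % bf->m;
        if (!(bf->bits[pos / 8] & (1u << (pos % 8))))
            return 0;
    }
    return 1;
}

/* Release the bitmap. */
static void bloom_free(bloom_t *bf)
{
    free(bf->bits);
    bf->bits = NULL;
}
```

A caller would run bloom_init once, bloom_add for every stored key, and treat a 0 from bloom_maybe_contains as a definite miss. A 1 still has to be confirmed against the real data source.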
Each slot in the bitmap has only two states (0 or 1). Once a slot is set to 1, there is no record of how many times it has been set: we do not know how many strings hash to it or which hash function mapped them there. That is why deletion is not supported.
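To make the problem concrete, here is a small sketch in which two elements share one slot. The slot numbers are made up for illustration (hypothetical positions, not the output of any real hash), and the helper names are hypothetical as well.

```c
#include <assert.h>
#include <stdint.h>

/* Set every listed position in an 8-bit bitmap. */
static void set_bits(uint8_t *bm, const int *pos, int n)
{
    for (int i = 0; i < n; i++)
        *bm |= (uint8_t)(1u << pos[i]);
}

/* Clear every listed position: a naive "delete". */
static void clear_bits(uint8_t *bm, const int *pos, int n)
{
    for (int i = 0; i < n; i++)
        *bm &= (uint8_t)~(1u << pos[i]);
}

/* Return 1 if all listed positions are set, otherwise 0. */
static int all_set(uint8_t bm, const int *pos, int n)
{
    for (int i = 0; i < n; i++)
        if (!(bm & (1u << pos[i])))
            return 0;
    return 1;
}

/* Returns 1 if naively deleting A makes B look absent. */
static int delete_breaks_other_element(void)
{
    const int pos_a[2] = {1, 4}; /* hypothetical slots for element A */
    const int pos_b[2] = {4, 6}; /* hypothetical slots for element B; slot 4 is shared */
    uint8_t bitmap = 0;

    set_bits(&bitmap, pos_a, 2);
    set_bits(&bitmap, pos_b, 2);
    assert(all_set(bitmap, pos_b, 2)); /* B is reported as possibly present */

    clear_bits(&bitmap, pos_a, 2);     /* "delete" A by clearing its slots */
    return !all_set(bitmap, pos_b, 2); /* B has lost slot 4: a false negative */
}
```

Clearing the slots of A also clears the shared slot 4, so B, which was never removed, is now reported as definitely absent. A plain Bloom filter cannot tell whether a slot is still needed by another element.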
In practice, how should a Bloom filter be used? How many hash functions should be chosen, how large a bitmap should be allocated, and how many elements will be stored? And how is the false positive rate controlled? (A Bloom filter can establish that an element definitely does not exist, but not that it definitely exists, so a judgment of "exists" can be wrong; the false positive rate is the probability of such a wrong judgment.)
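These questions are usually answered with the textbook approximations below, which assume independent, uniformly distributed hash functions. They are given here as the standard result, not necessarily in the exact form the author had in mind. Here n is the expected number of elements, p the false positive rate, m the bitmap size in bits and k the number of hash functions.

```latex
p \approx \left(1 - e^{-kn/m}\right)^{k}, \qquad
m = -\frac{n \ln p}{(\ln 2)^{2}}, \qquad
k = \frac{m}{n} \ln 2
```

As a worked example, n = 1,000,000 and p = 0.01 give m ≈ 9.59 million bits (roughly 1.2 MB) and k ≈ 6.64, which is rounded to 7 hash functions.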
In practice, we first fix n (the expected number of elements) and p (the acceptable false positive rate), then compute m (the bitmap size in bits) and k (the number of hash functions) from them. Suitable values can also be picked with the calculator on this website:
https://hur.st/bloomfilter
Once k is known, how do we choose the k hash functions?
// Use a single hash function and pass it different seed values to derive the offsets.
// key, len, Seed, Pos, m and k are supplied by the surrounding code.
#define MIX_UINT64(v) ((uint32_t)(((v) >> 32) ^ (v)))
uint64_t hash1 = MurmurHash2_x64(key, len, Seed);
uint64_t hash2 = MurmurHash2_x64(key, len, MIX_UINT64(hash1)); // hash1 folded to 32 bits is the second seed
for (i = 0; i < k; i++) // k is the number of hash functions
{
    Pos[i] = (hash1 + i*hash2) % m; // m is the bitmap size
}
// This simulates k hash functions; it is the same idea as the double hashing used for open addressing that we covered earlier.
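The fragment above depends on MurmurHash2_x64, which is not shown here. A self-contained sketch of the same double-hashing idea follows. It makes two assumptions: a seeded FNV-1a hash (fnv1a64, a hypothetical helper) replaces MurmurHash2_x64, and the first seed is fixed at 0. The function name bloom_positions is also hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in 64-bit hash (seeded FNV-1a); the original code uses MurmurHash2_x64. */
static uint64_t fnv1a64(const void *key, size_t len, uint64_t seed)
{
    const uint8_t *p = (const uint8_t *)key;
    uint64_t h = 14695981039346656037ULL ^ (seed * 0x9E3779B97F4A7C15ULL);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Fill pos[0..k-1] with k bit positions in [0, m) using double hashing. */
static void bloom_positions(const void *key, size_t len,
                            uint64_t m, unsigned k, uint64_t *pos)
{
    assert(m > 0);
    uint64_t hash1 = fnv1a64(key, len, 0);
    /* Fold hash1 to 32 bits and reuse it as the second seed, as MIX_UINT64 does. */
    uint64_t seed2 = (uint32_t)((hash1 >> 32) ^ hash1);
    uint64_t hash2 = fnv1a64(key, len, seed2);

    for (unsigned i = 0; i < k; i++)
        pos[i] = (hash1 + (uint64_t)i * hash2) % m; /* unsigned wraparound is well defined */
}
```

One caveat with this scheme: if hash2 happens to be a multiple of m, all k positions collapse onto the same slot. Some implementations guard against this, for example by forcing hash2 to be odd when m is a power of two.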