Exploring Redis: the Bloom filter
2022-06-26 05:21:00 【Bronze God】
What is a Bloom filter?
The Bloom filter was proposed by Burton Howard Bloom in 1970. He studied the problem of testing whether an element exists in a massive set, that is, how large a bitmap and how many hash functions are needed. The resulting data structure is called a Bloom filter.
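The sizing question has a commonly cited closed-form answer: for n expected elements and a target false positive probability p, use about m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The sketch below just evaluates those two formulas; the class and method names (`BloomSizing`, `optimalBits`, `optimalHashes`) are made up for illustration and are not taken from any library.

```java
final class BloomSizing {
    /** Bits for n expected elements at false positive probability p: m = -n ln p / (ln 2)^2. */
    static long optimalBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /** Hash functions for m bits and n elements: k = (m / n) ln 2, never fewer than 1. */
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 10_000_000L; // example values, chosen for illustration
        double p = 0.001;
        long m = optimalBits(n, p);
        int k = optimalHashes(n, m);
        System.out.println("bits = " + m + ", hash functions = " + k);
    }
}
```

For 10 million elements at p = 0.001 this works out to roughly 14.4 bits per element (about 17 MiB in total) and 10 hash functions.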
Advantages:
Compared with other data structures, a Bloom filter has major advantages in both space and time. Insertion and lookup both take O(k) time, where k is the number of hash functions, regardless of how many elements are stored.
A Bloom filter does not store the elements themselves, so the data stays confidential.
Disadvantages:
As the number of stored elements grows, the false positive rate grows with it.
Elements are hard to delete.
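The first disadvantage can be made concrete with the usual approximation for the false positive probability after n insertions into m bits with k hash functions: p ≈ (1 - e^(-kn/m))^k. The following is a minimal sketch of that formula only; the class name and the values of m and k are assumptions picked for illustration.

```java
final class FalsePositiveEstimate {
    /** Approximate false positive probability: (1 - e^(-k n / m))^k. */
    static double estimate(long n, long m, int k) {
        return Math.pow(1.0 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        long m = 143_775_876L; // assumed: bits sized for 10 million elements at 0.1%
        int k = 10;            // assumed: matching number of hash functions
        for (long n = 5_000_000L; n <= 40_000_000L; n *= 2) {
            System.out.printf("n = %,d -> p ~ %.6f%n", n, estimate(n, m, k));
        }
    }
}
```

With these assumed values the estimate is about 0.1% at 10 million elements and rises to several percent once twice as many elements have been inserted, which is why the filter should be sized for the expected load up front.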
Using a Bloom filter
I'm using Guava's Bloom filter here, so let's first see how to use it.
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class BloomFilterTest {
    /** Expected number of insertions */
    private static final int expectedInsertions = 10000000;
    /** Target false positive probability */
    private static final double fpp = 0.001;
    /** The Bloom filter */
    private static BloomFilter<Integer> bloomFilter = BloomFilter.create(Funnels.integerFunnel(), expectedInsertions, fpp);

    public static void main(String[] args) {
        // Insert 10 million elements
        for (int i = 0; i < expectedInsertions; i++) {
            bloomFilter.put(i);
        }
        // Probe with another 10 million elements that were never inserted to measure the false positive rate
        int count = 0;
        for (int i = expectedInsertions; i < expectedInsertions * 2; i++) {
            if (bloomFilter.mightContain(i)) {
                count++;
            }
        }
        System.out.println("False positives: " + count);
        System.out.println("False positive rate: " + (100.0 * count / expectedInsertions) + "%");
    }
}
Result: the program prints the number of false positives and the measured rate, which should come out close to the configured fpp of 0.1%.

How it works
We all know HashMap. A Bloom filter is in some ways similar to a HashMap.

However, with a single hash, as in HashMap, the probability of a collision is too high, so we have to improve on it. How? If you feel unsafe at home, you lock the door; if you don't mind the trouble, you add more locks, which lowers the chance that someone else's key can open your door.

That way the probability of a collision becomes small. To check whether an element exists, take out all the keys and try every lock.
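The lock analogy can be turned into a small, self-contained sketch: every element sets several bits (the locks), and a lookup succeeds only if all of its bits are set. This is an illustrative toy, not Guava's implementation; `SimpleBloomFilter`, its use of `String.hashCode()` plus FNV-1a as the two base hashes, and the double-hashing step are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Sets one bit per hash function. */
    void put(String value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(value, i));
        }
    }

    /** false: definitely never added; true: possibly added (false positives can occur). */
    boolean mightContain(String value) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    // i-th bit position via double hashing: (h1 + i * h2), made non-negative, modulo the bit count.
    private int index(String value, int i) {
        int h1 = value.hashCode();
        int h2 = fnv1a(value);
        long combined = (long) h1 + (long) i * h2;
        return (int) ((combined & Long.MAX_VALUE) % numBits);
    }

    // 32-bit FNV-1a, used here only as a cheap second hash.
    private static int fnv1a(String value) {
        int hash = 0x811C9DC5;
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 16777619;
        }
        return hash;
    }
}
```

A caller would create it with something like `new SimpleBloomFilter(1 << 20, 7)`, call `put` for every known element, and treat a `false` from `mightContain` as a definite miss.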
Source code analysis
The source code has four private fields:
// Bit array (bitmap) that stores the mapped data. Its length is calculated from the expected number of insertions and the target false positive rate: the lower the rate, the longer the array.
private final LockFreeBitArray bits;
// Number of hash functions applied per element, calculated from the length of bits and the expected number of insertions: the lower the false positive rate, the more hash functions.
private final int numHashFunctions;
// Funnel for the element type: describes how an object is fed into the hash
private final Funnel<? super T> funnel;
// Hashing strategy
private final BloomFilter.Strategy strategy;
The put process: the element is hashed numHashFunctions times. Each hash yields a position in the bit array, and the bit at that position is set. The method returns whether any bit actually changed.
The mightContain process: the element is hashed numHashFunctions times. Each hash yields a position in the bit array, and the bit at that position is checked. Only if the bit is set at every one of the numHashFunctions positions is the element considered present.
In both methods the positions are derived from a single 128-bit Murmur3 hash, split into two 64-bit halves hash1 and hash2, by starting at hash1 and repeatedly adding hash2.
public <T> boolean put(@ParametricNullness T object, Funnel<? super T> funnel, int numHashFunctions, BloomFilterStrategies.LockFreeBitArray bits) {
    long bitSize = bits.bitSize();
    byte[] bytes = Hashing.murmur3_128().hashObject(object, funnel).getBytesInternal();
    long hash1 = this.lowerEight(bytes);
    long hash2 = this.upperEight(bytes);
    boolean bitsChanged = false;
    long combinedHash = hash1;
    // Hash numHashFunctions times; the mask clears the sign bit so the index is non-negative
    for (int i = 0; i < numHashFunctions; ++i) {
        bitsChanged |= bits.set((combinedHash & Long.MAX_VALUE) % bitSize);
        combinedHash += hash2;
    }
    return bitsChanged;
}

public <T> boolean mightContain(@ParametricNullness T object, Funnel<? super T> funnel, int numHashFunctions, BloomFilterStrategies.LockFreeBitArray bits) {
    long bitSize = bits.bitSize();
    byte[] bytes = Hashing.murmur3_128().hashObject(object, funnel).getBytesInternal();
    long hash1 = this.lowerEight(bytes);
    long hash2 = this.upperEight(bytes);
    long combinedHash = hash1;
    for (int i = 0; i < numHashFunctions; ++i) {
        // If any of the corresponding bits is not set, the element is definitely absent: return false
        if (!bits.get((combinedHash & Long.MAX_VALUE) % bitSize)) {
            return false;
        }
        combinedHash += hash2;
    }
    return true;
}
Practical application scenarios
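(1) Cache penetration: check the Bloom filter before querying the cache and the database, so that requests for keys that definitely do not exist are rejected early.
(2) Spam filtering: add known spam addresses to the Bloom filter. Because false positives are possible, when the filter reports that an address is in the spam list, verify it against the actual list before treating it as spam.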
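Scenario (1) can be sketched as a lookup that consults the filter first. This is a hypothetical illustration, not code from the article: `GuardedLookup` is a made-up class, the `HashMap` stands in for Redis, the `Function` stands in for the database query, and the filter is passed in as a `Predicate<String>` (in practice something like `bloomFilter::mightContain`).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

final class GuardedLookup {
    private final Predicate<String> mightContain;               // the Bloom filter check
    private final Map<String, String> cache = new HashMap<>();  // stand-in for Redis
    private final Function<String, String> database;            // stand-in for the DB; returns null if missing
    int databaseHits = 0;                                        // counts how often the database was queried

    GuardedLookup(Predicate<String> mightContain, Function<String, String> database) {
        this.mightContain = mightContain;
        this.database = database;
    }

    Optional<String> get(String key) {
        if (!mightContain.test(key)) {
            return Optional.empty(); // definitely not stored: skip both cache and database
        }
        String cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached);
        }
        databaseHits++;
        String value = database.apply(key); // may still be null if the filter gave a false positive
        if (value != null) {
            cache.put(key, value);
        }
        return Optional.ofNullable(value);
    }
}
```

The design point is that a negative answer from the filter is always trustworthy, so only a small fraction of requests for nonexistent keys (the false positives) ever reach the database.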