Bloom filter
2022-08-01 14:31:00 【IABQL】
A Bloom filter can screen out requests for data that does not exist in the DB, which effectively reduces the risk of cache penetration.
Let's briefly describe the process:
Hash the data in the DB (usually with several hash functions) and store a 1 at each computed position. When a request comes in, it first checks the Redis cache. If the data is not in the cache, it then checks the Bloom filter: the key is hashed and the values at the resulting positions are read. If they are all 1, the data may exist in the DB, so the request goes on to query the DB; otherwise the DB is not accessed at all. This reduces the number of accesses to the DB.
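The request flow above can be sketched roughly as follows. This is only an illustration under stated assumptions: `lookup`, `cache`, `db`, and `might_contain` are hypothetical names, plain dictionaries stand in for Redis and the database, and a Python set stands in for the Bloom filter.

```python
from typing import Callable, Dict, Optional


def lookup(
    key: str,
    cache: Dict[str, str],
    db: Dict[str, str],
    might_contain: Callable[[str], bool],
) -> Optional[str]:
    """Return the value for `key`, touching the DB only if the filter allows it."""
    # 1. Try the cache first.
    if key in cache:
        return cache[key]
    # 2. Cache miss: ask the Bloom filter before going to the DB.
    if not might_contain(key):
        return None  # definitely not in the DB, so skip the DB entirely
    # 3. The filter says "maybe": query the DB and cache a hit.
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value


# Usage with stand-ins: a set plays the role of the Bloom filter here.
db = {"user:1": "alice"}
known_keys = set(db)
cache: Dict[str, str] = {}
print(lookup("user:1", cache, db, known_keys.__contains__))    # alice
print(lookup("user:999", cache, db, known_keys.__contains__))  # None
```

Unlike the set used here, a real Bloom filter can occasionally answer "maybe" for a key that is not in the DB, which is why step 3 still has to handle a missing value.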
Let's look at how a Bloom filter works in detail:
A Bloom filter consists of a bitmap array whose initial values are all 0, plus N hash functions. Whenever we write data to the database, we also mark it in the Bloom filter. The next time we want to know whether a piece of data is in the database, we only need to query the Bloom filter. If the queried data is not marked, it is not in the database.
A Bloom filter marks a piece of data in 3 steps:
The first step is to hash the data with each of the N hash functions to obtain N hash values;
The second step is to take each of the N hash values from the first step modulo the length of the bitmap array, which gives the position of each hash value in the bitmap array;
The third step is to set the value at each of those positions in the bitmap array to 1.
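The three steps can be sketched as a small class. This is a minimal illustration, not a production implementation: the class and method names are hypothetical, a Python list stands in for the bitmap, and the N hash functions are simulated by prefixing a seed to the input of SHA-256, which is just one convenient choice.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: `size` bits, all initially 0, plus `num_hashes` hash functions."""

    def __init__(self, size: int, num_hashes: int) -> None:
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size  # every position starts at 0

    def _positions(self, item: str) -> list:
        positions = []
        for seed in range(self.num_hashes):
            # Step 1: one hash value per hash function (assumed: seeded SHA-256).
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            hash_value = int.from_bytes(digest, "big")
            # Step 2: take the hash value modulo the array length.
            positions.append(hash_value % self.size)
        return positions

    def add(self, item: str) -> None:
        # Step 3: set every computed position to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # If any position is still 0, the item was never added.
        return all(self.bits[pos] == 1 for pos in self._positions(item))


bf = BloomFilter(size=64, num_hashes=3)
bf.add("x")
print(bf.might_contain("x"))  # True: every position for "x" was just set
```

The query method is named `might_contain` rather than `contains` on purpose: a True result only means that all the positions are set, not that the item was actually added.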
For example, suppose there is a Bloom filter with a bitmap array of length 8 and 3 hash functions.
After the database writes data x, x is marked in the Bloom filter: x is run through the 3 hash functions to obtain 3 hash values, and each of the 3 hash values is taken modulo 8. Assuming the results are 1, 4, and 6, the values at positions 1, 4, and 6 of the bitmap array are set to 1. When the application wants to check whether data x is in the database, it only needs to check, through the Bloom filter, whether the values at positions 1, 4, and 6 of the bitmap array are all 1. If even one of them is 0, data x is considered not to be in the database.
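The example can be traced by hand in a few lines. The positions 1, 4, and 6 are the assumed modulo results from the text, written in directly instead of being computed by real hash functions. The positions are treated as zero-based indexes, and the second item `z` with its positions is made up purely for contrast.

```python
bits = [0] * 8  # bitmap array of length 8, all zeros

x_positions = [1, 4, 6]  # assumed results of hash(x) % 8 for the 3 hash functions
for pos in x_positions:
    bits[pos] = 1

print(bits)  # [0, 1, 0, 0, 1, 0, 1, 0]


def all_set(bits, positions):
    """An item is 'possibly present' only if every one of its positions holds 1."""
    return all(bits[pos] == 1 for pos in positions)


print(all_set(bits, x_positions))  # True: x was marked
z_positions = [1, 2, 6]            # hypothetical positions for some other item z
print(all_set(bits, z_positions))  # False: position 2 is still 0, so z is not in the DB
```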
Because the Bloom filter is based on hash functions, its efficient lookups come with the possibility of hash collisions. For example, data x and data y may both map to positions 1, 4, and 6 even though data y is not actually in the database, which causes a misjudgment (a false positive). Increasing the number of hash functions can reduce such misjudgments, because the more positions that must all be 1, the less likely it is that an unmarked item passes the check. This only helps up to a point, however: each write also sets more positions to 1, so the bitmap array fills up faster and must be made longer to compensate, and every operation becomes slower. The number of hash functions and the array length should therefore be chosen according to actual business needs.
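To get a feel for this trade-off, the sketch below uses the standard textbook approximation for the false positive rate, p ≈ (1 - e^(-kn/m))^k, where m is the array length, n the number of items written, and k the number of hash functions. The formula is not from the original post, and it assumes independent, uniformly distributed hash functions. The function names and the example sizes are illustrative.

```python
import math


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Approximate false positive rate for m bits, n inserted items, k hash functions."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_num_hashes(m: int, n: int) -> int:
    """The k that minimises the approximation above: (m / n) * ln 2, rounded."""
    return max(1, round(m / n * math.log(2)))


m, n = 1000, 100  # illustrative sizes: 1000 bits, 100 items
for k in (1, optimal_num_hashes(m, n), 30):
    print(k, false_positive_rate(m, n, k))
```

With these sizes, going from 1 hash function to the optimum lowers the estimated rate, while 30 hash functions push it well above the single-hash case because the array becomes saturated with 1s.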
So, when the Bloom filter reports that the data exists, this does not prove that the data is in the database. But when the Bloom filter reports that the data does not exist, the data is definitely not in the database.
Original link: https://blog.csdn.net/qq_34827674/article/details/123463175