Bloom filter bloom
2022-08-01 14:31:00 【IABQL】
Use a Bloom filter to reject requests for data that does not exist in the DB, which effectively reduces the risk of cache penetration.
Let's briefly describe the process:
Each record in the DB is hashed (usually with several hash functions), and the bit at each computed position is set to 1. When a request comes in, the application first checks the Redis cache. If the data is not there, it consults the Bloom filter: the key is hashed in the same way and the bits at the resulting positions are read. If they are all 1, the data may exist in the DB, so the DB is queried for the actual record; otherwise the DB is not accessed at all. This reduces the number of requests that reach the DB.
Let's look at how a Bloom filter works in detail:
A Bloom filter consists of a bitmap array whose bits are all initially 0, plus N hash functions. When data is written to the database, it is also marked in the Bloom filter, so the next time we need to know whether the data is in the database, we only need to query the Bloom filter. If the queried data is not marked, it is not in the database.
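The request flow described above can be sketched roughly as follows. This is only an illustration: the `lookup` function, the plain dicts standing in for Redis and the DB, and the exact `set` standing in for the Bloom filter are all assumptions, not part of the original article.

```python
from typing import Callable, Dict, Optional


def lookup(key: str,
           cache: Dict[str, str],
           might_exist: Callable[[str], bool],
           db: Dict[str, str]) -> Optional[str]:
    """Check the cache, then the Bloom filter, and only then the DB."""
    # 1. Cache hit: return immediately.
    if key in cache:
        return cache[key]
    # 2. Filter says "definitely not in the DB": skip the DB entirely.
    if not might_exist(key):
        return None
    # 3. Filter says "maybe": the DB is the source of truth.
    value = db.get(key)
    if value is not None:
        cache[key] = value  # populate the cache for later requests
    return value


# Hypothetical stand-ins for the real Redis cache, Bloom filter, and DB.
db = {"user:1": "alice", "user:2": "bob"}
cache: Dict[str, str] = {}
marked_keys = set(db)  # an exact set used here in place of a Bloom filter

print(lookup("user:1", cache, marked_keys.__contains__, db))    # found via the DB
print(lookup("user:999", cache, marked_keys.__contains__, db))  # rejected by the filter
```

Note that step 3 still has to handle a missing record, because a real Bloom filter can answer "maybe" for data that was never written.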
A Bloom filter marks data in 3 steps:
The first step is to hash the data with each of the N hash functions to obtain N hash values;
The second step is to take each of the N hash values from the first step modulo the length of the bitmap array, which gives the position of each hash value in the bitmap array;
The third step is to set the bit at each of those positions in the bitmap array to 1.
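The three steps above could be sketched like this. The `BloomFilter` class is hypothetical, and deriving the N hash functions by prefixing a counter to the input of SHA-256 is an assumption made for illustration; production implementations often use faster non-cryptographic hashes.

```python
import hashlib
from typing import List


class BloomFilter:
    """A minimal Bloom filter sketch: a bit array plus N salted hashes."""

    def __init__(self, size: int, num_hashes: int) -> None:
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size  # bitmap array, every bit starts at 0

    def positions(self, item: str) -> List[int]:
        result = []
        for i in range(self.num_hashes):
            # Step 1: the i-th "hash function" is SHA-256 with a counter prefix.
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            hash_value = int.from_bytes(digest[:8], "big")
            # Step 2: modulo the array length gives a position in the bitmap.
            result.append(hash_value % self.size)
        return result

    def add(self, item: str) -> None:
        # Step 3: set the bit at every computed position to 1.
        for pos in self.positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # A query checks the same positions; any 0 means "not marked".
        return all(self.bits[pos] == 1 for pos in self.positions(item))


bf = BloomFilter(size=64, num_hashes=3)
bf.add("x")
print(bf.positions("x"), bf.might_contain("x"))
```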
For example, suppose a Bloom filter has a bitmap array of length 8 and 3 hash functions.
After the database writes data x, x is marked in the Bloom filter: it is run through the 3 hash functions to obtain 3 hash values, and each hash value is taken modulo 8. Assuming the results are 1, 4, and 6, the bits at positions 1, 4, and 6 of the bitmap array are set to 1. When the application wants to know whether data x is in the database, it only needs to check through the Bloom filter whether the bits at positions 1, 4, and 6 are all 1. If any one of them is 0, data x is considered not to be in the database.
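The worked example can be reproduced with a few lines of code. The positions 1, 4, and 6 are taken from the text and are treated here as 0-based array indices, which is an assumption; the second item `z` and its positions are made up for illustration.

```python
bits = [0] * 8  # bitmap array of length 8, all zeros


def mark(bitmap, positions):
    """Set the bit at each given position to 1."""
    for p in positions:
        bitmap[p] = 1


def might_contain(bitmap, positions):
    """True only if every checked position holds a 1."""
    return all(bitmap[p] == 1 for p in positions)


x_positions = [1, 4, 6]  # modulo results for x, as assumed in the text
mark(bits, x_positions)
print(bits)                              # [0, 1, 0, 0, 1, 0, 1, 0]
print(might_contain(bits, x_positions))  # True

z_positions = [1, 2, 6]  # hypothetical positions for an item that was never written
print(might_contain(bits, z_positions))  # False, because position 2 is still 0
```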
Because a Bloom filter is built on hash functions, it gains fast lookups at the cost of possible hash collisions. For example, data x and data y may both map to positions 1, 4, and 6, even though y is not in the database, so a query for y produces a false positive. Using more hash functions can reduce false positives, because the chance that every checked position already holds a 1 becomes smaller. However, each extra hash function also sets more bits and adds computation, so a larger array is needed and operations become slower. The number of hash functions and the array length should therefore be chosen according to actual business needs.
So when the Bloom filter reports that the data exists, that does not prove the data is in the database; but when it reports that the data does not exist, the data is definitely not in the database.
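To get a feel for this trade-off, the commonly used approximation for the false-positive rate, (1 - e^(-kn/m))^k, can be evaluated for different settings. The formula and the names `m` (array length in bits), `n` (number of items written), and `k` (number of hash functions) are not from the original article; the helper functions below are a hypothetical sketch.

```python
import math


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Approximate false-positive rate for m bits, n items, k hash functions."""
    return (1.0 - math.exp(-k * n / m)) ** k


def suggested_num_hashes(m: int, n: int) -> int:
    """Number of hash functions that minimises the approximation: (m / n) * ln 2."""
    return max(1, round((m / n) * math.log(2)))


m, n = 1000, 100  # assumed sizes, chosen only for illustration
for k in (1, 3, 7, 15, 30):
    rate = false_positive_rate(m, n, k)
    print(f"k={k:2d}  estimated false-positive rate = {rate:.4f}")
print("suggested k:", suggested_num_hashes(m, n))
```

With the array length fixed, the estimate first falls as k grows and then rises again once too many bits are set, which is why more hash functions generally need to be paired with a larger array.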
Original link: https://blog.csdn.net/qq_34827674/article/details/123463175