What is Redis HyperLogLog? These use cases put a big grin on my face
2022-06-21 20:52:00 【InfoQ】
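- Count an app's daily and monthly active users;
- Count how many different accounts visit a page each day (Unique Visitors, UV);
- Count the number of different terms users search for each day;
- Count the number of distinct registration IPs.
These are all cardinality-counting problems. The sections below implement them with Set, Hash, and Bitmap, then with HyperLogLog, and finish with a Redisson example.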
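Implementing with Set
When a user visits the page, add the user's ID to the page's Set with SADD; duplicate members are ignored:
SADD "Redis Why so soon?:uv" "Xiao Cai chicken" "Xie bage" "Xiao Cai chicken"
(integer) 2
SCARD returns the number of members in the Set, which is the page's UV:
SCARD "Redis Why so soon?:uv"
(integer) 2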
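Implementing with Hash
Use the user ID as the hash field and set its value to 1 with HSET. Repeat visits by the same user only overwrite that field, so HLEN, which returns the number of fields, gives the UV.
HSET "Redis Why so soon?" "Xiao Cai chicken" 1
// Count UV
HLEN "Redis Why so soon?"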
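Implementing with Bitmap
A Bitmap is read and written one bit at a time with GETBIT and SETBIT.
UV counting uses SETBIT and BITCOUNT. Treat the user ID as the bit offset; for a user whose ID is 6, run:
SETBIT "Skillfully use Redis data types to realize 100 million level data statistics" 6 1
BITCOUNT then returns the number of bits set to 1, which is the UV:
BITCOUNT "Skillfully use Redis data types to realize 100 million level data statistics"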
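The offset mapping can be tried locally without a Redis server. The sketch below is only an in-memory analogy, assuming user IDs are small non-negative integers. It uses `java.util.BitSet`; the class and method names (`BitmapUvSketch`, `visit`, `uv`) are made up for illustration and are not part of Redis or Redisson.

```java
import java.util.BitSet;

// In-memory analogy of SETBIT / BITCOUNT; not a Redis client.
class BitmapUvSketch {
    private final BitSet bits = new BitSet();

    // Like "SETBIT key userId 1": marks the user and returns the previous bit value.
    boolean visit(int userId) {
        boolean previous = bits.get(userId);
        bits.set(userId);
        return previous;
    }

    // Like "BITCOUNT key": number of bits set to 1, i.e. the UV.
    int uv() {
        return bits.cardinality();
    }

    public static void main(String[] args) {
        BitmapUvSketch page = new BitmapUvSketch();
        page.visit(6);
        page.visit(6); // a repeat visit sets the same bit again
        page.visit(9);
        System.out.println(page.uv()); // prints 2
    }
}
```

The memory cost depends on the largest offset rather than on the number of visitors, which is why sparse, very large user IDs make this approach expensive.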
HyperLogLog The king plan
HashMapHyperLogLogHyperLogLog0.81%Redis actual combat
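Redis provides three HyperLogLog commands: PFADD, PFCOUNT, and PFMERGE.
PFADD
Add each visiting user's ID to the HyperLogLog with PFADD:
PFADD "Redis Master slave synchronization principle:uv" userID1 userID2 userID3
PFCOUNT
PFCOUNT returns the estimated cardinality, which is the UV:
PFCOUNT "Redis Master slave synchronization principle:uv"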
PFMERGE use cases
Besides `PFADD` and `PFCOUNT` above, `HyperLogLog` also provides `PFMERGE`, which merges multiple HyperLogLogs into one.
Syntax
PFMERGE destkey sourcekey [sourcekey ...]
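PFMERGE merges the source HyperLogLogs into destkey, so a user who appears in several sources is counted only once. For example:
PFADD "Redis data" user1 user2 user3
PFADD "MySQL data" user1 user2 user4
PFMERGE database "Redis data" "MySQL data"
PFCOUNT database // Return value = 4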
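To see why the merged count is 4 rather than 6, the same data can be unioned with exact sets. This is a plain-Java illustration of the semantics only; the class name `MergeSemanticsSketch` and the method `unionSize` are hypothetical, and a real HyperLogLog returns an estimate of this union without storing the members.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Exact-set illustration of what a merged cardinality means; not a HyperLogLog.
class MergeSemanticsSketch {
    // Size of the union of two member lists, counting shared members once.
    static int unionSize(List<String> a, List<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return union.size();
    }

    public static void main(String[] args) {
        List<String> redisData = Arrays.asList("user1", "user2", "user3");
        List<String> mysqlData = Arrays.asList("user1", "user2", "user4");
        System.out.println(unionSize(redisData, mysqlData)); // prints 4
    }
}
```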
Redisson in practice
pom dependency
<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson-spring-boot-starter</artifactId>
    <version>3.16.7</version>
</dependency>
Add data to the Log
// Add a single element
public <T> void add(String logName, T item) {
    RHyperLogLog<T> hyperLogLog = redissonClient.getHyperLogLog(logName);
    hyperLogLog.add(item);
}

// Add a collection of elements to the HyperLogLog
public <T> void addAll(String logName, List<T> items) {
    RHyperLogLog<T> hyperLogLog = redissonClient.getHyperLogLog(logName);
    hyperLogLog.addAll(items);
}
Merge
/**
 * Merge the logs named in otherLogNames into logName.
 *
 * @param logName       the current log
 * @param otherLogNames the other logs to merge into the current log
 * @param <T>           the element type
 */
public <T> void merge(String logName, String... otherLogNames) {
    RHyperLogLog<T> hyperLogLog = redissonClient.getHyperLogLog(logName);
    hyperLogLog.mergeWith(otherLogNames);
}
Count the cardinality
public <T> long count(String logName) {
    RHyperLogLog<T> hyperLogLog = redissonClient.getHyperLogLog(logName);
    return hyperLogLog.count();
}
Unit tests
import java.util.ArrayList;

import lombok.extern.slf4j.Slf4j;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

@Slf4j
@RunWith(SpringRunner.class)
@SpringBootTest(classes = RedissionApplication.class)
public class HyperLogLogTest {

    @Autowired
    private HyperLogLogService hyperLogLogService;

    @Test
    public void testAdd() {
        String logName = "Code byte:Redis Why so soon?:uv";
        String item = "Xiao Cai chicken";
        hyperLogLogService.add(logName, item);
        log.info("Added element [{}] to log [{}].", item, logName);
    }

    @Test
    public void testCount() {
        String logName = "Code byte:Redis Why so soon?:uv";
        long count = hyperLogLogService.count(logName);
        log.info("logName = {} count = {}.", logName, count);
    }

    @Test
    public void testMerge() {
        ArrayList<String> items = new ArrayList<>();
        items.add("Xiao Cai chicken");
        items.add("Xie bage");
        items.add("Chen Xiaobai");

        // Fill a second log, then merge it into the first one.
        String otherLogName = "Code byte:Redis Principle and practice of multithreading model:uv";
        hyperLogLogService.addAll(otherLogName, items);
        log.info("Added {} elements to log [{}].", items.size(), otherLogName);

        String logName = "Code byte:Redis Why so soon?:uv";
        hyperLogLogService.merge(logName, otherLogName);
        log.info("Merged {} into {}.", otherLogName, logName);

        long count = hyperLogLogService.count(logName);
        log.info("Count after merge = {}.", count);
    }
}
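Basic principle
The intuition behind HyperLogLog comes from repeated coin flipping (a Bernoulli process). Flip a fair coin, where heads and tails each have probability 1/2, until the first head appears, and record the number of flips k; if the very first flip is heads, k is 1. Run this n times to get k1, k2 ... kn, and let k_max be the largest of them. 2^k_max can then serve as a rough estimate of n.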
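The idea can be sketched in plain Java. The class below is a toy for illustration, not Redis's implementation: it follows the textbook HyperLogLog recipe of hashing each item, using the top bits of the hash to pick a bucket, and keeping per bucket the largest "position of the first 1 bit" seen, which plays the role of k_max. The hash function (FNV-1a plus a mixing step), the bucket count, and all names (`ToyHyperLogLog`, `hash64`) are assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;

// Toy HyperLogLog for illustration only; not Redis's implementation.
class ToyHyperLogLog {
    private final int p;            // number of hash bits used to pick a bucket
    private final int m;            // number of buckets, m = 2^p
    private final byte[] registers; // per bucket: largest rank seen so far

    ToyHyperLogLog(int p) {
        if (p < 7 || p > 16) throw new IllegalArgumentException("p must be in 7..16");
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    // 64-bit FNV-1a followed by a mixing step to spread the bits.
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33; h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33; h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    void add(String item) {
        long h = hash64(item);
        int bucket = (int) (h >>> (64 - p)); // top p bits choose the bucket
        long rest = h << p;                  // remaining bits, left-aligned
        // Rank = position of the first 1 bit, like "flips until the first head".
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - p + 1);
        if (rank > registers[bucket]) registers[bucket] = (byte) rank;
    }

    // Merging keeps the larger value of each bucket.
    void merge(ToyHyperLogLog other) {
        if (other.m != m) throw new IllegalArgumentException("bucket counts differ");
        for (int i = 0; i < m; i++) {
            registers[i] = (byte) Math.max(registers[i], other.registers[i]);
        }
    }

    long count() {
        double sum = 0;
        int empty = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) empty++;
        }
        double alpha = 0.7213 / (1 + 1.079 / m);  // bias constant for m >= 128
        double estimate = alpha * m * m / sum;    // harmonic-mean estimate
        if (estimate <= 2.5 * m && empty > 0) {
            estimate = m * Math.log((double) m / empty); // small-range correction
        }
        return Math.round(estimate);
    }

    public static void main(String[] args) {
        ToyHyperLogLog a = new ToyHyperLogLog(10);
        ToyHyperLogLog b = new ToyHyperLogLog(10);
        for (int i = 0; i < 60_000; i++) a.add("user-" + i);
        for (int i = 40_000; i < 100_000; i++) b.add("user-" + i);
        a.merge(b);
        // The true number of distinct users is 100000; the output is an estimate.
        System.out.println("estimated distinct users: " + a.count());
    }
}
```

Splitting the input across many buckets and combining them with a harmonic mean is what makes the estimate far more stable than a single 2^k_max; with 1024 buckets the expected standard error is roughly 3%.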
- https://www.zhihu.com/question/53416615
- https://en.wikipedia.org/wiki/HyperLogLog
- How to count users' daily and monthly activities - Redis HyperLogLog Detailed explanation
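Redis's HyperLogLog implementation uses 16384 buckets, that is 2^14. Each bucket needs 6 bits to store its maxbits, so the largest value it can hold is maxbits = 63, and the total memory is 2^14 * 6 / 8 = 12k bytes.
Summary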
This article implemented the counting with Set, Hash, Bitmap, and HyperLogLog:
Hash: simple and exact; fine for small amounts of data, but takes a lot of memory for massive data.
Bitmap: the bitmap approach suits "binary state" statistics (see my earlier article for details); counting across a large number of different pages still takes a lot of memory.
Set: relies on deduplication; when a single Set holds tens of millions of user IDs and there are many pages, it takes too much memory.
HyperLogLog: in Redis, each HyperLogLog key costs only 12 KB of memory and can count close to 2^64 distinct elements. Because HyperLogLog uses the input elements only to compute the cardinality and does not store them, it cannot return the individual elements the way a set can.
HyperLogLog is an algorithm and is not unique to Redis.
- Its purpose is cardinality counting, so it is not a set: it does not store the elements, only how many distinct ones there are.
- It uses very little space and supports very large inputs.
- The core is the cardinality estimation algorithm, which shows mainly in how memory is used during counting and in how merges are handled. The final value carries some error.
Each HyperLogLog key in Redis takes 12 KB of memory to track the cardinality (per the official documentation).
PFADD does not allocate the full 12 KB at once; memory grows gradually as the cardinality increases. PFMERGE stores the merged result in a 12 KB key, which is easy to understand from how merging works: when two HyperLogLogs are merged, the value of each bucket has to be compared separately.
- Error: the estimate is an approximation with a standard error of 0.81%, which is an acceptable range.
Redis optimizes HyperLogLog storage. While the count is small, it uses a sparse representation that takes very little space; once the space used by the sparse representation exceeds a threshold, it is converted in one step to the dense representation, which takes 12 KB.
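The 12 KB and 0.81% figures quoted above follow from the bucket count. The short sketch below reproduces the arithmetic, assuming the standard HyperLogLog error formula 1.04 / sqrt(m) for m buckets, which is not stated in this article; the class and method names (`HllSizing`, `denseBytes`, `standardError`) are hypothetical.

```java
// Back-of-the-envelope sizing for a dense HyperLogLog; names are illustrative.
class HllSizing {
    // Bytes needed for 2^indexBits buckets of bitsPerBucket bits each.
    static long denseBytes(int indexBits, int bitsPerBucket) {
        return (1L << indexBits) * bitsPerBucket / 8;
    }

    // Standard error of the HyperLogLog estimate for 2^indexBits buckets.
    static double standardError(int indexBits) {
        return 1.04 / Math.sqrt(1L << indexBits);
    }

    public static void main(String[] args) {
        System.out.println(denseBytes(14, 6)); // 2^14 * 6 / 8 = 12288 bytes
        System.out.printf("standard error: %.2f%%%n", 100 * standardError(14));
    }
}
```

With 2^14 buckets the formula gives 1.04 / 128, which matches the 0.81% standard error cited for Redis.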
Recommended reading
- Redis in practice: skillfully using data types for billion-level data statistics
- Hardcore | Redis Bloom Filter: principles and practice
- Redis in practice: using Bitmap for billion-level massive data statistics
- Redis in practice: finding nearby people with the Geo type
- Redis distributed locks: correct implementation, evolution, and a summary of Redisson in practice