Rowkey design
2022-07-29 06:28:00 【a568353087】
HBase retrieves data by Rowkey. The system locates the Region that holds a given Rowkey (or Rowkey range), then routes the query to that Region to fetch the data. HBase supports 3 ways of retrieval:
(1) Access by a single Rowkey: a get operation on a given Rowkey value, which returns a single record;
(2) Scan over a Rowkey range: set startRowKey and endRowKey and scan within that range, which returns a batch of records matching the specified conditions;
(3) Full table scan: scan every row in the whole table directly.
HBase retrieval by a single Rowkey is very efficient: it takes less than 1 millisecond, and 1000~2000 records can be fetched per second. Queries on non-key columns, however, are slow.
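The three access patterns can be illustrated without a cluster. The sketch below is a toy in-memory stand-in, not the HBase client API: `SortedTable` and its sample rows are hypothetical, and the scan treats the stop key as exclusive, which is an assumption chosen to mirror the usual HBase default.

```python
import bisect


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by rowkey."""

    def __init__(self, rows):
        self._keys = sorted(rows)   # rowkeys in byte order
        self._rows = dict(rows)

    def get(self, rowkey):
        """(1) Single-rowkey access: returns at most one record."""
        return self._rows.get(rowkey)

    def scan(self, start_row=None, stop_row=None):
        """(2) Range scan over [start_row, stop_row).
        (3) With no bounds this degenerates into a full table scan."""
        lo = 0 if start_row is None else bisect.bisect_left(self._keys, start_row)
        hi = len(self._keys) if stop_row is None else bisect.bisect_left(self._keys, stop_row)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


table = SortedTable({
    b"user001": "alice",
    b"user002": "bob",
    b"user003": "carol",
    b"user010": "dave",
})

single = table.get(b"user002")                 # one record
ranged = table.scan(b"user002", b"user010")    # a batch of adjacent records
full = table.scan()                            # every row
```

Because the keys are sorted, both the get and the range scan only touch the rows they need, while the full scan has to visit everything.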
(2) MemStore caches part of the data in memory. If the Rowkey field is too long, the effective utilization of memory drops and the system can cache less data, which reduces retrieval efficiency. Therefore, the shorter the Rowkey in bytes, the better.
(3) Current operating systems are all 64-bit, with memory aligned on 8 bytes. Keeping the Rowkey within 16 bytes, at an integer multiple of 8 bytes, takes advantage of this characteristic of the operating system.
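As a rough illustration of the length guideline, the sketch below packs two numeric fields into a fixed 16-byte Rowkey and compares it with a verbose text encoding of the same values. The field names (`user_id`, `event_ts_ms`) and the text layout are hypothetical, chosen only for the comparison.

```python
import struct


def make_rowkey(user_id: int, event_ts_ms: int) -> bytes:
    """Pack two unsigned 64-bit fields, big-endian, into exactly 16 bytes.
    Big-endian keeps byte order consistent with numeric order."""
    return struct.pack(">QQ", user_id, event_ts_ms)


binary_key = make_rowkey(42, 1659076080000)

# The same information as a human-readable string takes roughly three times the space.
text_key = f"user-{42:020d}-ts-{1659076080000:020d}".encode()

binary_len = len(binary_key)   # 16 bytes, a multiple of 8
text_len = len(text_key)       # 49 bytes
```

Since the Rowkey is repeated for every cached cell, the saving per key is multiplied by the number of cells held in memory.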
If the Rowkey has no hash field and its first field is time information, all new data accumulates on a single RegionServer and creates a hot spot. During retrieval the load is then concentrated on individual RegionServers, which reduces query efficiency.
A well-designed Rowkey lets us fetch, in one batch, the records of a collection that sit next to each other (they should fall in the same Region), so iterating over the results performs well.
Rowkey uniqueness principle:
The Rowkey must be unique by design. Rowkeys are stored in lexicographic (dictionary) order, so when designing a Rowkey, make full use of this sorting: store data that is frequently read together in one place, and keep together the data that is likely to be accessed soon.
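Lexicographic ordering has practical consequences for how numbers are encoded in a key. The sketch below shows that unpadded numbers sort incorrectly and that zero padding fixes it. The `newest_first_key` helper is an assumption that is not described in the text above: it shows one common way (subtracting the timestamp from a maximum value) to make the most recent rows sort first. The `dev1` prefix and field widths are hypothetical.

```python
MAX_TS = 2**63 - 1  # largest signed 64-bit value, used only as an upper bound


def padded_key(device: str, seq: int) -> bytes:
    """Zero-pad the number so byte order matches numeric order."""
    return f"{device}#{seq:010d}".encode()


def newest_first_key(device: str, ts_ms: int) -> bytes:
    """Assumed pattern: store MAX_TS - ts so later timestamps sort earlier."""
    return f"{device}#{MAX_TS - ts_ms:019d}".encode()


# Without padding, "10" and "100" sort before "9".
unpadded = sorted(f"dev1#{n}".encode() for n in (9, 10, 100))

# With padding, the stored order is 9, 10, 100.
padded = sorted(padded_key("dev1", n) for n in (9, 10, 100))

# With the reversed timestamp, the row for ts=3000 comes first.
recent = sorted(newest_first_key("dev1", ts) for ts in (1000, 2000, 3000))
```

Putting the device identifier first keeps all rows for one device adjacent, so a single range scan can read them together.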
Salting
Salting here is not the salting used in cryptography. It means adding a random number in front of the rowkey: specifically, the rowkey is assigned a random prefix so that it starts differently from the rowkeys written before it. The number of distinct prefixes should equal the number of regions you want to spread the data across. Salted rowkeys are distributed to the regions according to their randomly generated prefixes, which avoids hot spots.
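A minimal sketch of salting, assuming 4 buckets (one per region) and a two-digit text prefix. `NUM_BUCKETS`, the prefix format, and the sample keys are all hypothetical. It also shows the cost on the read side: since the prefix is random, a reader has to try every bucket.

```python
import random

NUM_BUCKETS = 4  # assumed to equal the number of regions to spread over


def salted_key(rowkey: bytes, rng: random.Random) -> bytes:
    """Prepend a randomly chosen bucket prefix to the rowkey."""
    bucket = rng.randrange(NUM_BUCKETS)
    return f"{bucket:02d}-".encode() + rowkey


def candidate_keys(rowkey: bytes):
    """The salt is not recoverable from the rowkey, so a lookup
    must consider the key under every possible prefix."""
    return [f"{b:02d}-".encode() + rowkey for b in range(NUM_BUCKETS)]


rng = random.Random(0)  # seeded only to make the sketch repeatable

# Sequential, time-like keys that would otherwise all land in one region.
written = [salted_key(f"20220729{n:04d}".encode(), rng) for n in range(1000)]

# Count how many keys ended up under each prefix.
per_bucket = {b: 0 for b in range(NUM_BUCKETS)}
for key in written:
    per_bucket[int(key[:2])] += 1
```

The writes spread roughly evenly over the buckets, at the price of multiplying the number of lookups needed for a single-row read.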
Hashing
Hashing gives the same row the same prefix every time. Like salting, it spreads the load across the cluster, but reads stay predictable: with a deterministic hash, the client can reconstruct the complete rowkey and use a get operation to fetch exactly one row.
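A sketch of the hashed-prefix variant, under the same assumptions as a salted layout (4 buckets, two-digit text prefix). The choice of SHA-256 and the `hashed_key` helper are illustrative, not something the text prescribes. Any deterministic hash would serve.

```python
import hashlib

NUM_BUCKETS = 4  # assumed number of regions to spread over


def hashed_key(rowkey: bytes) -> bytes:
    """Derive the prefix from the rowkey itself, so it is reproducible."""
    digest = hashlib.sha256(rowkey).digest()
    bucket = digest[0] % NUM_BUCKETS
    return f"{bucket:02d}-".encode() + rowkey


# Writer and reader compute the same full key independently,
# so a single get is enough to fetch the row.
key_on_write = hashed_key(b"202207290001")
key_on_read = hashed_key(b"202207290001")
```

Unlike a random salt, the prefix here carries no extra state: the reader needs only the original rowkey.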
Reversing
The third way to prevent hot spots is to reverse a fixed-length or numeric rowkey, so that the part that changes most often (the least significant part) comes first. This effectively randomizes the rowkey, but sacrifices its ordering.
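A short sketch of reversal on fixed-width numeric ids. The 8-digit width and the sample range are hypothetical.

```python
def reversed_key(fixed_width_id: str) -> bytes:
    """Reverse the string so the fastest-changing digit leads the key."""
    return fixed_width_id[::-1].encode()


# Five consecutive ids that share the prefix "000010".
sequential = [f"{n:08d}" for n in range(1008, 1013)]
keys = [reversed_key(s) for s in sequential]

# Each key now starts with a different byte, so neighbours are spread apart.
leading_bytes = {k[:1] for k in keys}

# The stored order no longer follows the numeric order of the ids.
stored_order = [k[::-1].decode() for k in sorted(keys)]
```

Reversing the key a second time recovers the original id, so single-row gets still work. What is lost is the ability to range-scan consecutive ids.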