当前位置:网站首页>Huawei cloud gaussdb (for redis) unveils issue 21: using Gauss redis to achieve secondary indexing
Huawei cloud gaussdb (for redis) unveils issue 21: using Gauss redis to achieve secondary indexing
2022-07-06 23:02:00 【Hua Weiyun】
One 、 background
Bring up the index , The first impression is the noun of database , however , gaussian Redis You can also implement secondary indexing !!! gaussian Redis The secondary index in generally uses zset To achieve . gaussian Redis Compared to open source Redis It has higher stability 、 And cost advantage , Use Gauss Redis zset Realize the secondary index of business , You can get a win-win situation of performance and cost .
The essence of index is to use ordered structure to speed up query , So by Zset Structural Gauss Redis It can easily realize the index of numeric type and character type .
• Numeric type index (zset Sort by score ):
• Character type index ( When the scores are the same zset Sort by dictionary order ):
Let's cut into two classic business scenarios , See how to use Gauss Redis To build a stable and reliable secondary index system .
Two 、 Scene one : Complete the dictionary
When Press... In the browser When entering the query , Browsers usually recommend searches with the same prefix according to the possibility , This scenario It can be used gaussian Redis The secondary index function is realized .
2.1 Basic plan
The easiest way is to add each query of the user to the index . When need Conduct User input completion recommend when , Use ZRANGEBYLEX Execute range query that will do . If you don't want to return too many entries , gaussian Redis And support Use LIMIT Option to Reduce the number of results .
• Search the user banana Add to index :
ZADD myindex 0 banana:1
• Suppose the user enters “bit”, And we want to offer the possibility to “bit” Search keywords at the beginning .
ZRANGEBYLEX myindex "[bit" "[bit\xff"
That is to use ZRANGEBYLEX Make a range query , The query range is the string that the user is now entering , And the same string plus a trailing byte 255(\xff). In this way , We can get all strings prefixed with the string typed by the user .
2.2 And frequency Relevant dictionary completion
In practical applications, we usually want to follow the frequency Automatically Sort and complete the entries , Also can Clear out entries that are no longer popular , And automatically adapt to future input . We can still use Gauss Redis Of ZSet Structure to achieve this goal , Just in the index structure , Not only need Store search terms , also need Store the frequency associated with it .
• Search the user banana Add to index
• Judge banana Whether there is
ZRANGEBYLEX myindex "[banana:" + LIMIT 0 1
• hypothesis banana non-existent , add to banana:1, among 1 It's the frequency
ZADD myindex 0 banana:1
• hypothesis banana There is , You need to increase the frequency
if ZRANGEBYLEX myindex "[banana:" + LIMIT 0 1 The frequency returned in is 1
1) Delete old entries :
ZREM myindex 0 banana:1
2) Add one frequency to rejoin :
ZADD myindex 0 banana:2
Please note that , Because there may be concurrent updates , Therefore, it should pass Lua The script sends the above three commands , use Lua script Automatically get the old count and add the entry again after increasing the score .
• Suppose the user enters “banana”, And we want to provide similar Search keywords for . adopt ZRANGEBYLEX The results are sorted by frequency .
ZRANGEBYLEX myindex "[banana:" + LIMIT 0 101) "banana:123"2) "banaooo:1"3) "banned user:49"4) "banning:89"
• Use flow algorithm Clear uncommon input . Select an entry randomly from the returned entries , Score it reduce 1, Then add it again with the new score . however , If the new score is 0, We You need to delete this item from the list Objective .
• If the frequency of randomly selected items is 1, Such as banaooo:1
ZREM myindex 0 banaooo:1
• If the frequency of randomly selected items is greater than 1, Such as banana:123
ZREM myindex 0 banana:123ZADD myindex 0 banana:122
In the long run , The index will contain popular searches , If popular searches change over time , It will also automatically adapt .
3、 ... and 、 Scene two : Multidimensional index
Except for queries on a single dimension , gaussian Redis It also supports retrieval in multidimensional data . for example , Retrieve all ages at 50 to 55 Between the ages of , meanwhile Salary in 70000 to 85000 Between people . The key to realize multi-dimensional secondary index is to convert two-dimensional data into One-dimensional data , Then based on Gauss Redis zset Storage .
from The visual perspective represents a two-dimensional index . The figure below There are some points in space , They represent our data samples , among x and y yes Two variables , Its The maximum values are 400. The blue box in the picture represents our query . We hope Inquire about x Be situated between 50 and 100 Between ,y Be situated between 100 and 300 All points between .
3.1 Data encoding
If the inserted data point is x = 75 and y = 200
1) fill 0( Data maximum is 400, So fill 3 position )
x = 075
y = 200
2) Interleave numbers , With x Represents the leftmost number , With y Represents the leftmost number , And so on , To create a code
027050
If you use 00 and 99 Replace the last two , namely 027000 to 027099,map return x and y, namely :
x = 70-79
y = 200-209
therefore , in the light of x=70-79 and y = 200-209 Two dimensional query , It can be encoded map become 027000 to 027099 One dimensional query , This can be achieved by Gauss Redis Of Zset The structure is easy to realize .
Empathy , We can aim at the last four / 6、 ... and /etc Digit number The same operation , So as to obtain a larger range .
3) Use binary
To obtain finer granularity , Data can be represented in binary , So when replacing numbers , Each time, you will get twice the original search scope . Suppose we only need 9 position ( To indicate at most 400 A number of values ), Our number in binary form will be :
x = 75 -> 001001011
y = 200 -> 011001000
After interweaving ,000111000011001010
Let's look at the use of 0s ad 1s Replace the last 2、4、6、8,... What is our range of bits :
3.2 Add a new element
If the inserted data point is x = 75 and y = 200
x = 75 and y = 200 Binary interleaved code is 000111000011001010,
ZADD myindex 0 000111000011001010
3.3 Inquire about
Inquire about :x Be situated between 50 and 100 Between ,y Be situated between 100 and 300 All points between
Replace from index N Bit will give us a side length of 2^(N/2) Search box . therefore , What we need to do is to check the smaller size of the search box , And check the closest 2 The power of , And constantly divide the remaining space , Then use ZRANGEBYLEX To search .
Here is Sample code :
def spacequery(x0,y0,x1,y1,exp) bits=exp*2 x_start = x0/(2**exp) x_end = x1/(2**exp) y_start = y0/(2**exp) y_end = y1/(2**exp) (x_start..x_end).each{|x| (y_start..y_end).each{|y| x_range_start = x*(2**exp) x_range_end = x_range_start | ((2**exp)-1) y_range_start = y*(2**exp) y_range_end = y_range_start | ((2**exp)-1) puts "#{x},#{y} x from #{x_range_start} to #{x_range_end}, y from #{y_range_start} to #{y_range_end}" # Turn it into interleaved form for ZRANGEBYLEX query. # We assume we need 9 bits for each integer, so the final # interleaved representation will be 18 bits. xbin = x_range_start.to_s(2).rjust(9,'0') ybin = y_range_start.to_s(2).rjust(9,'0') s = xbin.split("").zip(ybin.split("")).flatten.compact.join("") # Now that we have the start of the range, calculate the end # by replacing the specified number of bits from 0 to 1. e = s[0..-(bits+1)]+("1"*bits) puts "ZRANGEBYLEX myindex [#{s} [#{e}" } }endspacequery(50,100,100,300,6)
Four 、 summary
This paper introduces how to pass Gauss Redis Build a secondary index , The secondary index is in e-commerce 、 chart (hexastore)、 Games and other fields have a wide range of application scenarios , gaussian redis There are many similar applications in Xianwang . gaussian Redis Based on the architecture of separation of storage and calculation , Rely on distributed storage pool to ensure strong data consistency , It can easily support the secondary index function , Provide stable and reliable services for enterprise customers 、 Super high concurrency , Core data storage services that can be rapidly and elastically expanded .
appendix
- The author of this article :
Huawei cloud database GaussDB(for Redis) The team
- Hangzhou / Xi'an / Shenzhen resume delivery :
- More product information , Welcome to the official blog :
bbs.huaweicloud.com/blogs/248875
边栏推荐
- ThreadLocal详解
- The application of machine learning in software testing
- The difference between enumeration and define macro
- mysql查看表结构的三种方法总结
- 儿童睡衣(澳大利亚)AS/NZS 1249:2014办理流程
- How to choose indoor LED display? These five considerations must be taken into account
- Project duplicate template
- AcWing 4300. Two operations (minimum number of BFS searches)
- UVa 11732 – strcmp() Anyone?
- 存币生息理财dapp系统开发案例演示
猜你喜欢
Let's see through the network i/o model from beginning to end
企業不想換掉用了十年的老系統
机试刷题1
视图(view)
Method of canceling automatic watermarking of uploaded pictures by CSDN
Matlab tips (27) grey prediction
【全网首发】Redis系列3:高可用之主从架构的
Unified Focal loss: Generalising Dice and cross entropy-based losses to handle class imbalanced medi
On file uploading of network security
Improving Multimodal Accuracy Through Modality Pre-training and Attention
随机推荐
HDU 5077 NAND (violent tabulation)
「小程序容器技术」,是噱头还是新风口?
NPM cannot install sharp
允许全表扫描 那个语句好像不生效set odps.sql.allow.fullscan=true;我
Traversal of a tree in first order, middle order, and then order
存币生息理财dapp系统开发案例演示
【Unity】升级版·Excel数据解析,自动创建对应C#类,自动创建ScriptableObject生成类,自动序列化Asset文件
关于声子和热输运计算中BORN电荷和non-analytic修正的问题
[IELTS speaking] Anna's oral learning record part1
Balanced Multimodal Learning via On-the-fly Gradient Modulation(CVPR2022 oral)
CRMEB商城系统如何助力营销?
Sizeof keyword
MySQL实现字段分割一行转多行的示例代码
Aardio - does not declare the method of directly passing float values
Demonstration of the development case of DAPP system for money deposit and interest bearing financial management
POJ 1258 Agri-Net
MySQL中正则表达式(REGEXP)使用详解
Chapter 19 using work queue manager (2)
CSDN 上传图片取消自动加水印的方法
CUDA exploration