当前位置:网站首页>Huawei cloud gaussdb (for redis) unveils issue 21: using Gauss redis to achieve secondary indexing
Huawei cloud gaussdb (for redis) unveils issue 21: using Gauss redis to achieve secondary indexing
2022-07-06 23:02:00 【Hua Weiyun】
One 、 background
Bring up the index , The first impression is the noun of database , however , gaussian Redis You can also implement secondary indexing !!! gaussian Redis The secondary index in generally uses zset To achieve . gaussian Redis Compared to open source Redis It has higher stability 、 And cost advantage , Use Gauss Redis zset Realize the secondary index of business , You can get a win-win situation of performance and cost .
The essence of index is to use ordered structure to speed up query , So by Zset Structural Gauss Redis It can easily realize the index of numeric type and character type .
• Numeric type index (zset Sort by score ):
.png)
.png)
• Character type index ( When the scores are the same zset Sort by dictionary order ):
.png)
.png)
Let's cut into two classic business scenarios , See how to use Gauss Redis To build a stable and reliable secondary index system .
Two 、 Scene one : Complete the dictionary
When Press... In the browser When entering the query , Browsers usually recommend searches with the same prefix according to the possibility , This scenario It can be used gaussian Redis The secondary index function is realized .
.png)
2.1 Basic plan
The easiest way is to add each query of the user to the index . When need Conduct User input completion recommend when , Use ZRANGEBYLEX Execute range query that will do . If you don't want to return too many entries , gaussian Redis And support Use LIMIT Option to Reduce the number of results .
• Search the user banana Add to index :
ZADD myindex 0 banana:1• Suppose the user enters “bit”, And we want to offer the possibility to “bit” Search keywords at the beginning .
ZRANGEBYLEX myindex "[bit" "[bit\xff"That is to use ZRANGEBYLEX Make a range query , The query range is the string that the user is now entering , And the same string plus a trailing byte 255(\xff). In this way , We can get all strings prefixed with the string typed by the user .
2.2 And frequency Relevant dictionary completion
In practical applications, we usually want to follow the frequency Automatically Sort and complete the entries , Also can Clear out entries that are no longer popular , And automatically adapt to future input . We can still use Gauss Redis Of ZSet Structure to achieve this goal , Just in the index structure , Not only need Store search terms , also need Store the frequency associated with it .
• Search the user banana Add to index
• Judge banana Whether there is
ZRANGEBYLEX myindex "[banana:" + LIMIT 0 1• hypothesis banana non-existent , add to banana:1, among 1 It's the frequency
ZADD myindex 0 banana:1• hypothesis banana There is , You need to increase the frequency
if ZRANGEBYLEX myindex "[banana:" + LIMIT 0 1 The frequency returned in is 1
1) Delete old entries :
ZREM myindex 0 banana:12) Add one frequency to rejoin :
ZADD myindex 0 banana:2Please note that , Because there may be concurrent updates , Therefore, it should pass Lua The script sends the above three commands , use Lua script Automatically get the old count and add the entry again after increasing the score .
• Suppose the user enters “banana”, And we want to provide similar Search keywords for . adopt ZRANGEBYLEX The results are sorted by frequency .
ZRANGEBYLEX myindex "[banana:" + LIMIT 0 101) "banana:123"2) "banaooo:1"3) "banned user:49"4) "banning:89"• Use flow algorithm Clear uncommon input . Select an entry randomly from the returned entries , Score it reduce 1, Then add it again with the new score . however , If the new score is 0, We You need to delete this item from the list Objective .
• If the frequency of randomly selected items is 1, Such as banaooo:1
ZREM myindex 0 banaooo:1• If the frequency of randomly selected items is greater than 1, Such as banana:123
ZREM myindex 0 banana:123ZADD myindex 0 banana:122In the long run , The index will contain popular searches , If popular searches change over time , It will also automatically adapt .
3、 ... and 、 Scene two : Multidimensional index
Except for queries on a single dimension , gaussian Redis It also supports retrieval in multidimensional data . for example , Retrieve all ages at 50 to 55 Between the ages of , meanwhile Salary in 70000 to 85000 Between people . The key to realize multi-dimensional secondary index is to convert two-dimensional data into One-dimensional data , Then based on Gauss Redis zset Storage .
from The visual perspective represents a two-dimensional index . The figure below There are some points in space , They represent our data samples , among x and y yes Two variables , Its The maximum values are 400. The blue box in the picture represents our query . We hope Inquire about x Be situated between 50 and 100 Between ,y Be situated between 100 and 300 All points between .
.png)
3.1 Data encoding
If the inserted data point is x = 75 and y = 200
1) fill 0( Data maximum is 400, So fill 3 position )
x = 075
y = 200
2) Interleave numbers , With x Represents the leftmost number , With y Represents the leftmost number , And so on , To create a code
027050
If you use 00 and 99 Replace the last two , namely 027000 to 027099,map return x and y, namely :
x = 70-79
y = 200-209
therefore , in the light of x=70-79 and y = 200-209 Two dimensional query , It can be encoded map become 027000 to 027099 One dimensional query , This can be achieved by Gauss Redis Of Zset The structure is easy to realize .
.png)
Empathy , We can aim at the last four / 6、 ... and /etc Digit number The same operation , So as to obtain a larger range .
3) Use binary
To obtain finer granularity , Data can be represented in binary , So when replacing numbers , Each time, you will get twice the original search scope . Suppose we only need 9 position ( To indicate at most 400 A number of values ), Our number in binary form will be :
x = 75 -> 001001011
y = 200 -> 011001000
After interweaving ,000111000011001010
Let's look at the use of 0s ad 1s Replace the last 2、4、6、8,... What is our range of bits :
.png)
3.2 Add a new element
If the inserted data point is x = 75 and y = 200
x = 75 and y = 200 Binary interleaved code is 000111000011001010,
ZADD myindex 0 0001110000110010103.3 Inquire about
Inquire about :x Be situated between 50 and 100 Between ,y Be situated between 100 and 300 All points between
Replace from index N Bit will give us a side length of 2^(N/2) Search box . therefore , What we need to do is to check the smaller size of the search box , And check the closest 2 The power of , And constantly divide the remaining space , Then use ZRANGEBYLEX To search .
Here is Sample code :
def spacequery(x0,y0,x1,y1,exp) bits=exp*2 x_start = x0/(2**exp) x_end = x1/(2**exp) y_start = y0/(2**exp) y_end = y1/(2**exp) (x_start..x_end).each{|x| (y_start..y_end).each{|y| x_range_start = x*(2**exp) x_range_end = x_range_start | ((2**exp)-1) y_range_start = y*(2**exp) y_range_end = y_range_start | ((2**exp)-1) puts "#{x},#{y} x from #{x_range_start} to #{x_range_end}, y from #{y_range_start} to #{y_range_end}" # Turn it into interleaved form for ZRANGEBYLEX query. # We assume we need 9 bits for each integer, so the final # interleaved representation will be 18 bits. xbin = x_range_start.to_s(2).rjust(9,'0') ybin = y_range_start.to_s(2).rjust(9,'0') s = xbin.split("").zip(ybin.split("")).flatten.compact.join("") # Now that we have the start of the range, calculate the end # by replacing the specified number of bits from 0 to 1. e = s[0..-(bits+1)]+("1"*bits) puts "ZRANGEBYLEX myindex [#{s} [#{e}" } }endspacequery(50,100,100,300,6)Four 、 summary
This paper introduces how to pass Gauss Redis Build a secondary index , The secondary index is in e-commerce 、 chart (hexastore)、 Games and other fields have a wide range of application scenarios , gaussian redis There are many similar applications in Xianwang . gaussian Redis Based on the architecture of separation of storage and calculation , Rely on distributed storage pool to ensure strong data consistency , It can easily support the secondary index function , Provide stable and reliable services for enterprise customers 、 Super high concurrency , Core data storage services that can be rapidly and elastically expanded .
appendix
- The author of this article :
Huawei cloud database GaussDB(for Redis) The team
- Hangzhou / Xi'an / Shenzhen resume delivery :
- More product information , Welcome to the official blog :
bbs.huaweicloud.com/blogs/248875
边栏推荐
- COSCon'22 社区召集令来啦!Open the World,邀请所有社区一起拥抱开源,打开新世界~
- dockermysql修改root账号密码并赋予权限
- memcached
- Puppeter connects to the existing Chrome browser
- Uniapp setting background image effect demo (sorting)
- DR-Net: dual-rotation network with feature map enhancement for medical image segmentation
- (flutter2) as import old project error: inheritfromwidgetofexacttype
- 浅谈网络安全之文件上传
- three. JS gorgeous bubble effect
- 基于PaddlePaddle平台(EasyDL)设计的人脸识别课堂考勤系统
猜你喜欢

案例推荐丨安擎携手伙伴,保障“智慧法院”更加高效

UE4 blueprint learning chapter (IV) -- process control forloop and whileloop

CUDA exploration

(shuttle) navigation return interception: willpopscope

Let's see through the network i/o model from beginning to end

Improving Multimodal Accuracy Through Modality Pre-training and Attention

Mysql 身份认证绕过漏洞(CVE-2012-2122)
![[unity] upgraded version · Excel data analysis, automatically create corresponding C classes, automatically create scriptableobject generation classes, and automatically serialize asset files](/img/08/1636ae909e7cc70ddf975d2e3122cd.png)
[unity] upgraded version · Excel data analysis, automatically create corresponding C classes, automatically create scriptableobject generation classes, and automatically serialize asset files

Cloud native technology container knowledge points

金融人士必读书籍系列之六:权益投资(基于cfa考试内容大纲和框架)
随机推荐
关于声子和热输运计算中BORN电荷和non-analytic修正的问题
服务器的系统怎么选者
poj 1094 Sorting It All Out (拓扑排序)
NPM cannot install sharp
On the problems of born charge and non analytical correction in phonon and heat transport calculations
MySQL实现字段分割一行转多行的示例代码
Thinkphp5 multi table associative query method join queries two database tables, and the query results are spliced and returned
同构+跨端,懂得小程序+kbone+finclip就够了!
POJ 1258 Agri-Net
Bipartite graph determination
three. JS gorgeous bubble effect
Enterprises do not want to replace the old system that has been used for ten years
Cocoscreator+typescripts write an object pool by themselves
Aardio - construct a multi button component with customplus library +plus
Comparison between variable and "zero value"
室内LED显示屏应该怎么选择?这5点注意事项必须考虑在内
DevSecOps软件研发安全实践——发布篇
QT signal and slot
The statement that allows full table scanning does not seem to take effect set odps sql. allow. fullscan=true; I
memcached