It's here! New features arrive in GaussDB(for Cassandra)
2022-07-07 18:43:00 [Huawei Cloud]
Today, Huawei Cloud GaussDB(for Cassandra) launches a new solution built on the Lucene engine!
The internet and big data are developing rapidly, and data volumes are growing explosively. Driven by business demands for high concurrency, high availability, and high scalability, NoSQL databases have become a hard requirement in more and more business scenarios. When it comes to queries, however, traditional NoSQL databases have clear limitations: strictly speaking, open-source systems such as MongoDB, Cassandra, and HBase do not provide multi-dimensional queries, text retrieval, or statistical analysis over massive data. Most enterprises are still looking for a more complete NoSQL solution.
GaussDB NoSQL, Huawei's cloud-native multi-model database, has a strong ecosystem and supports four engine interfaces: key-value, wide-column, document, and time-series. Among them, the wide-column engine interface, GaussDB(for Cassandra), has released a Lucene secondary index feature. It keeps the existing advantages of NoSQL while supporting a variety of complex query scenarios, comprehensively improving the query experience on massive data. A real treat for our users! You probably have many questions: What is GaussDB(for Cassandra)? How is a secondary index used? How is a Lucene secondary index different? Don't worry, the sections below answer them one by one.

What is GaussDB(for Cassandra)?
GaussDB(for Cassandra) is a distributed cloud database developed in-house by Huawei, built on an architecture that separates compute from storage. On top of high performance, high availability, high reliability, high security, and elastic scaling, it provides service capabilities such as one-click deployment, backup and restore, and monitoring and alerting. It is highly compatible with the open-source Cassandra interface and delivers high read/write performance. It is already widely used in IoT, meteorology, the internet, gaming, and many other fields.
What is a secondary index?
Let's first look at the concept of an index. An index is a storage structure created to speed up data retrieval; it is a design that trades space for time. It works like a book's table of contents: through it you can quickly locate the content you need.
In Cassandra, the Primary Key is itself an index (also called the primary index). At query time, the matching records can be retrieved directly by Primary Key. A secondary index, also called an auxiliary index, helps locate the primary index entry, and the record is then found through the primary index. A secondary index is usually created with the CREATE INDEX statement.
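The two-step lookup described above can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not how Cassandra stores data: the sample rows and the names `primary_index`, `secondary_index`, and `find_by_city` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Toy table keyed by primary key (hypothetical sample data).
primary_index = {
    1: {"name": "alice", "city": "Hangzhou"},
    2: {"name": "bob", "city": "Shenzhen"},
    3: {"name": "carol", "city": "Hangzhou"},
}

# Secondary index on the "city" column: column value -> list of primary keys.
secondary_index = defaultdict(list)
for pk, row in primary_index.items():
    secondary_index[row["city"]].append(pk)


def find_by_city(city):
    """Step 1: secondary index -> primary keys. Step 2: primary keys -> rows."""
    return [primary_index[pk] for pk in secondary_index.get(city, [])]


print([row["name"] for row in find_by_city("Hangzhou")])  # prints ['alice', 'carol']
```

The extra dictionary is the "space" being traded for "time": without it, finding rows by city would require scanning every row in the table.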
What are the pain points of Cassandra's current secondary index?
Native Cassandra implements a secondary index by creating an implicit table whose Primary Key is the indexed column and whose value is the corresponding Primary Key of the original table. The implementation is relatively simple, so it inevitably brings some constraints:
1. The first primary key column can only be queried with "=";
2. The second primary key column can use "=", ">", "<", ">=", and "<=";
3. Index columns support only "=" queries;
4. Columns that are deleted or updated too frequently are not suitable for indexing;
5. High-cardinality columns are not suitable for indexing.
Given these constraints, the query capabilities that Cassandra's secondary index can offer are very limited.
Why Lucene?
Lucene is currently the most popular open-source full-text search engine toolkit. It has the following characteristics:
1. Stable, with high indexing performance;
2. Efficient, accurate, high-performance search algorithms;
3. Rich query types: phrase queries, wildcard queries, approximate queries, range queries, and more;
4. Strong open-source community support and good maintainability.
Integrating the Lucene engine is therefore the best way to shore up Cassandra's weak query capabilities. After all, who would turn down a search engine that is stable, keeps growing, and is continuously updated?
The Lucene engine has powerful inverted-index and columnar-storage capabilities, which give GaussDB(for Cassandra) efficient multi-dimensional queries, text retrieval, statistical analysis, and more. The user experience is similar to the native secondary index, but with richer syntax support.
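To make the idea of an inverted index concrete, here is a minimal Python sketch. It is a teaching model, not Lucene's actual data structures or API: the whitespace tokenizer and the function names `tokenize`, `build_inverted_index`, and `search_all` are assumptions made for this example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive analyzer)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_all(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical sample documents.
docs = {
    1: "Lucene secondary index",
    2: "native secondary index",
    3: "Lucene full text search",
}
index = build_inverted_index(docs)
print(search_all(index, "lucene index"))  # prints [1]
```

Because the index is keyed by term rather than by row, a multi-term text query becomes a set intersection over posting lists instead of a scan over every document.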
After switching to the Lucene secondary index, what changes in my queries?
More flexible querying and filtering:
Queries can omit the PK entirely or supply only part of it, and index columns support operators such as ">", "<", and "in", so users are no longer limited to "=".
Powerful text retrieval:
Text retrieval is what Lucene does best, and it is easy to use: just use the LIKE keyword.
You can do this:
SELECT * FROM example WHERE field LIKE 'test%'; // prefix query
Or this:
SELECT * FROM example WHERE field LIKE 'start*end'; // regex matching
Or this:
SELECT * FROM example WHERE field LIKE '%+lucene +index%'; // full-text search
High-performance, stable support for statistics over data volumes beyond the trillion scale:
select count(*) from example where pk > 1 and expr(lucene_index, 'count');
Multiple deletion methods:
Single-row deletion, partition deletion, and range deletion are supported, covering all kinds of deletion scenarios.
DELETE FROM example WHERE pk1='a' AND field=1; // single-row deletion
DELETE FROM example WHERE pk1='a' AND pk2=5000; // partition deletion
DELETE FROM example WHERE pk1='a' AND pk2=3000 AND ck1=2 AND ck2>'a' AND ck2<'c'; // range deletion
An extended JSON query interface that handles complex query scenarios with ease:
The extended JSON query interface provides rich query syntax and more varied usage. The keywords are listed below:
filter | The JSON search keyword in a query statement |
term | Checks whether a document contains a specific value; the queried value is not tokenized |
match | Tokenizes the queried value and performs a full-text search |
range | Matches a field whose value falls within a specific range (range sub-keys: "eq"/"gte"/"gt"/"lte"/"lt") |
bool | Must be combined with "must", "should", or "must not" to build complex queries |
must | Subquery type of bool; a list that wraps "term", "match", and "range" queries |
should | Subquery type of bool; a list that wraps "term", "match", and "range" queries |
must not | Subquery type of bool; a list that wraps "term", "match", and "range" queries |
For example:
SELECT * FROM example WHERE EXPR(index_field, '{"filter": {"bool": {"should": [{"bool": {"should": [{"bool": {"must": [{"bool": {"should": [{"range": {"ck1": {"lt": 2}, "ck1": {"gte": 4}}}]}}, {"bool": {"should": [{"range": {"field1": {"lt": 2}, "field1": {"gt": 3}}}]}}]}}, {"bool": {"should": [{"term": {"pk1": "a", "pk1": "b", "pk1": "c"}}]}}]}}, {"bool": {"must": [{"range": {"field2": {"gte":5, "lte": 15}, "pk2": {"gt": 2000}}}]}}]}}}');
By combining and nesting conditions, you can build SQL statements tailored to your own business. Up to 200 levels of JSON nesting are supported, so even complex scenarios can be handled!
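Hand-writing deeply nested JSON is error-prone, so client code may prefer to assemble the filter programmatically. The sketch below is one possible approach in Python, not an official client API: the helpers `term`, `range_`, `bool_`, `depth`, and `expr_query` are hypothetical names. The way nesting depth is counted against the 200-level limit is an assumption, and table and column names are interpolated as-is, so they must come from trusted input.

```python
import json

MAX_DEPTH = 200  # nesting limit stated in the article; counting method is assumed


def term(field, value):
    """Exact-value clause; the value is not tokenized."""
    return {"term": {field: value}}


def range_(field, **bounds):
    """Range clause; bounds use the sub-keys eq, gte, gt, lte, lt."""
    unknown = set(bounds) - {"eq", "gte", "gt", "lte", "lt"}
    if unknown:
        raise ValueError(f"unsupported range sub-keys: {sorted(unknown)}")
    return {"range": {field: bounds}}


def bool_(kind, *clauses):
    """Combine clauses; kind is spelled as in the article's keyword list."""
    if kind not in {"must", "should", "must not"}:
        raise ValueError(f"unsupported bool kind: {kind}")
    return {"bool": {kind: list(clauses)}}


def depth(node):
    """Nesting depth of dicts and lists in a JSON-like structure."""
    if isinstance(node, dict):
        return 1 + max((depth(v) for v in node.values()), default=0)
    if isinstance(node, list):
        return 1 + max((depth(v) for v in node), default=0)
    return 0


def expr_query(table, index_column, condition):
    """Render a SELECT statement with an EXPR(...) JSON filter."""
    body = {"filter": condition}
    if depth(body) > MAX_DEPTH:
        raise ValueError("filter exceeds the nesting limit")
    # Double any single quotes so the JSON fits inside a CQL string literal.
    payload = json.dumps(body).replace("'", "''")
    return f"SELECT * FROM {table} WHERE EXPR({index_column}, '{payload}');"


condition = bool_("must", range_("field2", gte=5, lte=15), term("pk1", "a"))
print(expr_query("example", "index_field", condition))
```

Building the filter from Python dictionaries also avoids repeating the same key inside one JSON object, which standard JSON parsers typically collapse to the last value.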
Huawei Cloud GaussDB(for Cassandra) is equipped with the Lucene engine. Through the Lucene secondary index, search capability is pushed down to the underlying layer, fundamentally freeing the application layer from query work. Multi-dimensional queries, text retrieval, statistical analysis, and other capabilities make up for NoSQL's weak query functionality and let enterprises handle complex queries over massive data with confidence. What are you waiting for? Come and try it!
Appendix
Author: the Huawei Cloud Cassandra team
Resume submissions for Hangzhou, Xi'an, and Shenzhen: [email protected]
For more technical articles, follow the Gauss Cassandra official blog: https://bbs.huaweicloud.com/community/usersnew/id_1563519101830986
Gauss Cassandra official home page: https://www.huaweicloud.com/product/gaussdbforcassandra.html