MySQL full-text search
2022-07-05 12:34:00 【just4you】
Overview
Version notes
- In MySQL 5.5, only MyISAM supports full-text search.
- In MySQL 5.6, InnoDB begins to support full-text search.
- In MySQL 5.7, the ngram parser plug-in can be used for full-text search. It supports Chinese, Japanese, and Korean.
Restrictions on full-text search
- A full-text index can only be created on columns of type CHAR, VARCHAR, or TEXT.
- Only the InnoDB and MyISAM engines are supported.
- A single MATCH() can only search the exact set of columns covered by one full-text index. To search several columns together, create one index that covers all of them.
N-Gram parser
- The global variable ngram_token_size sets n, the token size used by the ngram parser. Valid range: 1 to 10; default: 2.
- ngram_token_size is usually set to the length of the shortest term you want to search for. To search for single characters, set it to 1; with the default of 2, a single-character search returns no results.
- Chinese words are at least two characters long, so the default of 2 is recommended.
Parameter settings
Edit the MySQL configuration file (my.ini or my.cnf):
[mysqld]
# Minimum word length indexed by MyISAM full-text indexes
ft_min_word_len=2
# Token size n used by the ngram parser
ngram_token_size=2
Restart the MySQL service after making these changes.
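To confirm that the server actually picked up the new values after the restart, the variables can be read back from any SQL client. This is only a quick sanity check, assuming a MySQL 5.7 or later server where the ngram parser is available:

```sql
# Read back the effective settings after the restart
SHOW VARIABLES LIKE 'ft_min_word_len';
SHOW VARIABLES LIKE 'ngram_token_size';
```

If either statement still shows the old value, the server is probably reading a different configuration file than the one that was edited.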
Full-text index operations
Create
# Create the index when creating the table
CREATE TABLE t_member (
`id` INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
`cn_name` VARCHAR(100),
`remark` TEXT,
FULLTEXT `ft_idx_1`(`cn_name`, `remark`) WITH PARSER ngram
) ENGINE = INNODB;
# Add the index to an existing table
ALTER TABLE t_member ADD FULLTEXT INDEX `ft_idx_1`(`cn_name`,`remark`) WITH PARSER ngram;
# Create the index directly
CREATE FULLTEXT INDEX `ft_idx_1` ON t_member (`cn_name`,`remark`) WITH PARSER ngram;
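To see how the ngram parser splits text into tokens, InnoDB exposes the contents of a full-text index through INFORMATION_SCHEMA. The sketch below makes several assumptions: the sample rows are made up for illustration, the table lives in a database named `test`, and the account is allowed to run SET GLOBAL and read INFORMATION_SCHEMA.

```sql
# Hypothetical sample rows, used only for illustration
INSERT INTO t_member (`cn_name`, `remark`) VALUES
  ('张三', '一般搜索'),
  ('李四', '简单搜索');

# Point the InnoDB full-text diagnostics at the table.
# 'test' is an assumed database name; replace it with your own.
SET GLOBAL innodb_ft_aux_table = 'test/t_member';

# List the tokens produced for the newly inserted rows
SELECT word, doc_id, position
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
ORDER BY doc_id, position;
```

With a token size of 2, the parser emits every run of two adjacent characters, so a string such as 'abcd' is indexed as 'ab', 'bc', and 'cd'. This is why a single-character search finds nothing under the default setting.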
Drop
DROP INDEX `ft_idx_1` ON t_member;
Rebuild
Applies only to the MyISAM engine. Run this after changing the full-text search settings that affect the table.
REPAIR TABLE t_member QUICK;
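For an InnoDB table such as the t_member example, the usual way to rebuild a full-text index is to drop it and create it again. A minimal sketch, reusing the index name and columns from the definitions above:

```sql
# InnoDB: rebuild the full-text index by dropping and re-creating it
ALTER TABLE t_member DROP INDEX `ft_idx_1`;
ALTER TABLE t_member ADD FULLTEXT INDEX `ft_idx_1`(`cn_name`, `remark`) WITH PARSER ngram;
```

On a large table the re-creation step can take a while, so it is best run outside peak hours.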
Using full-text search
Basic syntax
SELECT <column list> FROM <table name> WHERE MATCH(<columns>) AGAINST ('<search keywords>' [search mode]);
Note: the columns listed in MATCH() must be the same as the columns in the full-text index definition.
# The full-text index is defined on `cn_name` and `remark`, so both columns are listed in MATCH
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST('张三');
Relevance score
MATCH ... AGAINST can also be placed in the select list to return the relevance score of each row.
SELECT `cn_name`, `remark`, MATCH(`cn_name`, `remark`) AGAINST('张三') FROM t_member;
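The score becomes more useful when it is combined with a filter and a sort, so that only matching rows come back, best match first. A minimal sketch; the alias `score` and the search term are placeholders chosen for illustration:

```sql
# Return only matching rows, ordered by relevance (highest first)
SELECT `cn_name`, `remark`,
       MATCH(`cn_name`, `remark`) AGAINST('张三') AS score
FROM t_member
WHERE MATCH(`cn_name`, `remark`) AGAINST('张三')
ORDER BY score DESC;
```

Rows that do not match have a score of 0, which is why the WHERE clause is needed to leave them out.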
Full-text search modes
1. Natural language mode (NATURAL LANGUAGE MODE)
The default full-text search mode in MySQL. Operators cannot be used in this mode; it is suited to simple queries.
2. Boolean mode (BOOLEAN MODE)
Operators can be used in this mode to specify that a keyword must appear, must not appear, or should carry a higher or lower weight.
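The mode is selected with a modifier inside AGAINST(). The two queries below are a small sketch against the t_member table, with a placeholder search term:

```sql
# Natural language mode, stated explicitly (same as omitting the modifier)
SELECT `cn_name`, `remark` FROM t_member
WHERE MATCH(`cn_name`, `remark`) AGAINST('搜索' IN NATURAL LANGUAGE MODE);

# Boolean mode: the + operator requires the term to be present
SELECT `cn_name`, `remark` FROM t_member
WHERE MATCH(`cn_name`, `remark`) AGAINST('+搜索' IN BOOLEAN MODE);
```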
The 50% threshold
Discussions of full-text search often mention the "50% threshold". I found the following explanation:
Words that appear in more than half of the rows are ignored. For example, if every row contains the word "this", searching for "this" returns no results. This is useful when the table holds many records, because returning every row is not considered meaningful; in that case "this" is effectively treated as a stopword. But if the table has only two rows, nothing can be found at all, because every word appears in 50% (or more) of the rows. To avoid this, use IN BOOLEAN MODE.
In my own tests with Chinese keywords, however, the query still returned results even when every record contained the keyword. The most likely explanation is that the 50% threshold applies only to MyISAM natural language searches; the t_member table used here is InnoDB, which has no such threshold.
Boolean mode operators
- +: the word must be present (rows without it are ignored).
- -: the word must not be present (rows containing it are excluded).
- >: increases the word's contribution to the relevance score.
- <: decreases the word's contribution to the relevance score.
- ~: turns the word's contribution to relevance from positive to negative. Rows containing the word rank lower but, unlike with "-", they are not excluded.
- *: wildcard. Appended to a word, it matches all words that begin with that word.
- "": the text inside double quotes must match exactly as a phrase, without being split into separate words.
Examples
# No operator: the terms "一般" (general) and "搜索" (search) are ORed
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('一般 搜索' IN BOOLEAN MODE);
# Must contain both "一般" and "搜索"
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+一般 +搜索' IN BOOLEAN MODE);
# Must contain "搜索"; rows that also contain "一般" rank higher
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+搜索 一般' IN BOOLEAN MODE);
# Must contain "一般" and must not contain "搜索"
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+一般 -搜索' IN BOOLEAN MODE);
# Must contain "搜索"; rows that also contain "一般" rank lower than rows that do not
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+搜索 ~一般' IN BOOLEAN MODE);
# Must contain "一般" together with either "简单" (simple) or "搜索"; rows with "一般" and "简单" rank higher than rows with "一般" and "搜索"
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+一般 +(>简单 <搜索)' IN BOOLEAN MODE);
# *: matches rows containing words that begin with "搜索"
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('搜索*' IN BOOLEAN MODE);
# Double quotes enclose a phrase to match; the effect is similar to LIKE '%some words%'. For example, "some words of wisdom" matches, but "some noise words" does not. For Chinese, though, the results seem mediocre.
# Without double quotes: returns results
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('一般 简单 搜索' IN BOOLEAN MODE);
# With double quotes: no results. Chinese text is generally not searched this way.
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('"一般 简单 搜索"' IN BOOLEAN MODE);
Summary
- For small projects, MySQL full-text search should be enough; in many cases even LIKE is enough.
- If you have more demanding full-text search requirements, or simply want to keep up with current technology, use Elasticsearch (ES) :).