Full-text search in MySQL
2022-07-05 12:34:00 【just4you】
Overview
Version notes
- In MySQL 5.5, only MyISAM supports full-text search.
- From MySQL 5.6, InnoDB also supports full-text search.
- From MySQL 5.7 (5.7.6 onward), the built-in ngram parser plugin can be used for full-text search. It supports Chinese, Japanese, and Korean.
Restrictions on full-text search
- A full-text index can only be created on columns of type CHAR, VARCHAR, or TEXT.
- Only the InnoDB and MyISAM engines are supported.
- A MATCH() call uses exactly one full-text index, so to search several columns together, create a single index that covers all of them.
N-Gram parser
- The global variable ngram_token_size sets the size n of the n-grams. Valid values are 1 to 10; the default is 2.
- ngram_token_size is usually set to the length of the shortest term you need to search for. To search for single characters, set it to 1. With the default of 2, a search for a single character returns no results.
- Chinese words are usually at least two characters long, so the default of 2 is recommended.
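As a rough illustration of the points above, the setting currently in effect can be checked with SHOW VARIABLES. The comments show how the ngram parser splits a four-character string for different values of n; the string 'abcd' is only a sample:

```sql
# Check the token size currently in effect
SHOW VARIABLES LIKE 'ngram_token_size';

# How the sample string 'abcd' is tokenized for different sizes:
#   ngram_token_size = 1  ->  'a', 'b', 'c', 'd'
#   ngram_token_size = 2  ->  'ab', 'bc', 'cd'
#   ngram_token_size = 3  ->  'abc', 'bcd'
```

With a size of 2 there is no one-character token in the index, which is why a single-character search finds nothing.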
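Configuration
Edit the MySQL configuration file (my.ini or my.cnf). Both settings are server options, so they go under [mysqld]:
[mysqld]
# Minimum word length for MyISAM full-text indexes
ft_min_word_len=2
# Token size for the ngram parser
ngram_token_size=2
Remember to restart the MySQL service after the change.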
Managing full-text indexes
Create
# Create the index together with the table
CREATE TABLE t_member (
  `id` INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  `cn_name` VARCHAR(100),
  `remark` TEXT,
  FULLTEXT `ft_idx_1`(`cn_name`, `remark`) WITH PARSER ngram
) ENGINE = INNODB;
# Add the index to an existing table with ALTER TABLE
ALTER TABLE t_member ADD FULLTEXT INDEX `ft_idx_1`(`cn_name`,`remark`) WITH PARSER ngram;
# Create the index directly
CREATE FULLTEXT INDEX `ft_idx_1` ON t_member (`cn_name`,`remark`) WITH PARSER ngram;
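To check how the ngram parser actually tokenized a row, the InnoDB full-text cache table can be inspected. This is a minimal sketch: the database name test and the sample row are assumptions made for illustration, so adjust both to your setup.

```sql
# Sample row (hypothetical data, for illustration only)
INSERT INTO t_member (`cn_name`, `remark`) VALUES ('张三', '全文检索测试');

# Point the full-text diagnostic tables at t_member; 'test' is an assumed database name
SET GLOBAL innodb_ft_aux_table = 'test/t_member';

# List the tokens of newly inserted rows held in the full-text index cache
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE ORDER BY doc_id, position;
```

With ngram_token_size=2, each listed token should be two characters long.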
Drop
DROP INDEX `ft_idx_1` ON t_member;
Rebuild
This applies to the MyISAM engine only. Run it after changing the full-text search settings that affect the table.
REPAIR TABLE t_member QUICK;
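For an InnoDB table, one way to rebuild the index after changing a setting such as ngram_token_size is to drop and recreate it. A sketch using the index from the examples above:

```sql
# Rebuild an InnoDB full-text index by dropping and recreating it
ALTER TABLE t_member DROP INDEX `ft_idx_1`;
ALTER TABLE t_member ADD FULLTEXT INDEX `ft_idx_1`(`cn_name`, `remark`) WITH PARSER ngram;
```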
Using full-text search
Basic syntax
SELECT <columns> FROM <table> WHERE MATCH(<indexed columns>) AGAINST ('<search terms>' [<search mode>]);
Note: the columns listed in MATCH must be the same as the columns in the full-text index definition.
# The full-text index is defined on `cn_name` and `remark`, so MATCH lists both columns
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST('Zhang San');
Querying the relevance score
MATCH ... AGAINST can also be placed in the SELECT list to return each row's relevance score.
SELECT `cn_name`, `remark`, MATCH(`cn_name`, `remark`) AGAINST('Zhang San') FROM t_member;
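The score can also be given an alias and used for sorting. A minimal sketch, where the alias score is an arbitrary name chosen here:

```sql
# Return only matching rows, most relevant first
SELECT `cn_name`, `remark`,
       MATCH(`cn_name`, `remark`) AGAINST('Zhang San') AS score
FROM t_member
WHERE MATCH(`cn_name`, `remark`) AGAINST('Zhang San')
ORDER BY score DESC;
```

Rows that do not match have a score of 0, so the WHERE clause filters them out before sorting.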
Full-text search modes
1. Natural language mode (NATURAL LANGUAGE MODE)
The default full-text search mode in MySQL. Operators cannot be used in this mode; it is intended for simple queries.
2. Boolean mode (BOOLEAN MODE)
Operators can be used in this mode to specify that a keyword must appear, must not appear, or should carry more or less weight, and so on.
The 50% threshold
Discussions of full-text search often mention the "50% threshold". One explanation I found reads as follows:
Words that appear in more than half of the rows are dropped from matching. For example, if every row contains the word "this", searching for "this" returns no results. This is useful when there are many records, because the database sees no point in returning every row; "this" is treated almost like a stopword. But if the table has only two rows, nothing can be found at all, because every word appears in 50% (or more) of the rows. To avoid this, use IN BOOLEAN MODE.
In my tests, however, searching with Chinese keywords returned results even when every record contained the keyword (is something wrong with my test data, or does the 50% threshold only apply to English?). A likely explanation is that the threshold applies only to MyISAM indexes in natural language mode; InnoDB, which the test table uses, has no such threshold.
Boolean mode syntax
- +: The word must be present (rows without it are ignored).
- -: The word must not be present (rows containing it are excluded).
- >: Increases the word's contribution to the relevance of a matching row.
- <: Decreases the word's contribution to the relevance of a matching row.
- ~: Turns the word's contribution to relevance from positive to negative. Rows containing the word are not excluded as with "-", but they rank lower.
- ( ): Groups words into a subexpression.
- *: Wildcard, appended to the end of the search word; matches words that begin with it.
- "": A phrase in double quotes must match exactly as written, without being split into separate words.
Examples
# No operator: the words are combined with OR
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('commonly search' IN BOOLEAN MODE);
# Must contain both "commonly" and "search"
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+commonly +search' IN BOOLEAN MODE);
# Must contain "search"; rows that also contain "commonly" rank higher
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+search commonly' IN BOOLEAN MODE);
# Must contain "commonly" and must not contain "search"
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+commonly -search' IN BOOLEAN MODE);
# Must contain "search"; rows that also contain "commonly" rank lower than rows that do not
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+search ~commonly' IN BOOLEAN MODE);
# Must contain "commonly" together with "simple" or "search"; rows with "commonly" and "simple" rank higher than rows with "commonly" and "search"
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('+commonly +(>simple <search)' IN BOOLEAN MODE);
# Asterisk: matches records containing words that begin with "search"
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('search*' IN BOOLEAN MODE);
# Double quotes: the enclosed words are matched as a phrase, similar in effect to LIKE '%some words%'. For example, "some words of wisdom" matches, but "some noise words" does not. For Chinese text, however, the results seem mediocre.
# Without double quotes: returns results
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('commonly simple search' IN BOOLEAN MODE);
# With double quotes: no results; Chinese text is usually not queried this way
SELECT `cn_name`, `remark` FROM t_member WHERE MATCH(`cn_name`, `remark`) AGAINST ('"commonly simple search"' IN BOOLEAN MODE);
Summary
- For small projects, MySQL full-text search should be enough; often even LIKE is enough.
- For more demanding full-text search requirements, or to keep up with current technology, use Elasticsearch (ES) :).