A Systematic Study of Shell Regular Expressions

2022-08-03 14:15:00 【edenliuJ】

What is a regular expression

Basic regular expressions

Extended regular expressions

Perl regular expressions

Metacharacter precedence

The grep command

Summary

What is a regular expression

A regular expression is a tool for describing string-matching rules: it selects the text that conforms to the rules and filters out the text that does not.

Regular expressions have a long history that is closely tied to UNIX. In 1943, two neurophysiologists, Warren McCulloch (born in New Jersey) and Walter Pitts (born in Detroit), developed a mathematical way of describing neural networks, modeling the neurons of the nervous system as small, simple automatic control units. This was the prototype of regular expressions. In 1956, the American mathematician Stephen Cole Kleene described their model using a mathematical notation called regular sets, and in doing so introduced the concept of regular expressions. Later Ken Thompson, the well-known American computer scientist often called the father of UNIX, brought regular expressions into the QED editor and then into ed, another very popular editor. Later UNIX commands such as grep also provided regular expression support. Today regular expressions are widely used on Linux. Common tools that support them include:

The grep command family: matches lines of text
The sed stream editor: transforms an input stream
awk: a string-processing language
more, less: file pagers
ed, vi/vim, etc.: text editors

Learning regular expressions mostly means learning their metacharacters. Metacharacters are characters used to describe other characters: they specify the content, the transformations, and the matching operations of a character expression. A regular expression is a string made up of metacharacters and ordinary characters.

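There are three common kinds of regular expressions:

Basic regular expressions
Extended regular expressions
Perl regular expressions

The sections below introduce each in turn, along with the main metacharacters it supports.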

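Basic regular expressions

Basic regular expressions (Basic Regular Expression, BRE), also called standard regular expressions, are the earliest regular expression specification and support only the most basic set of metacharacters. BRE is one of the two regular expression syntax standards defined by POSIX; the other is extended regular expressions.

The metacharacters defined by basic regular expressions are listed in the following table: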

Basic regular expression metacharacters
Metacharacter	Description
^	Start-of-line anchor
$	End-of-line anchor
.	Single-character matcher: matches any single character, including a space but not a newline
*	A quantifier. The quantifier itself does not stand for any character; it specifies how many times the preceding character must appear for a match. * means the preceding character is matched any number of times, including 0
[]	Character set match: specifies a set of characters. A hyphen can express a range of consecutive digits or letters, e.g. [1-9] matches any single digit from 1 to 9, and [a-f] matches any single letter from a to f
[^]	Negated character set: the opposite of [], matches any character that is not in the set
\(\)	Marks where a subexpression starts and ends. A subexpression can be referenced later in the same regular expression through an escape sequence. At most 9 subexpressions can be defined. For example, "\(love\).*\1" matches a string containing love twice with arbitrary characters in between, where \1 refers back to the subexpression \(love\)
x\{m,n\}	Interval expression: matches a range of repetitions of the character x. x\{n\} means exactly n repetitions, x\{m,\} means at least m repetitions, and x\{m,n\} means between m and n repetitions
\<	Start-of-word anchor (a GNU extension rather than part of POSIX BRE)
\>	End-of-word anchor (a GNU extension rather than part of POSIX BRE)
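
The table above can be tried out directly with grep, which uses BRE by default. The following is a minimal sketch rather than a reference: the sample text and the variable names are made up for illustration, and the \< and \> anchors assume GNU grep.

```shell
#!/bin/sh
# Hypothetical sample input, one entry per line.
sample='love
lovely glove
I love what you love
aaa
ab
b
cat 42'

# ^ and $ anchor the pattern to the start and end of the line.
anchored=$(printf '%s\n' "$sample" | grep '^love$')       # love

# . matches any single character.
dot=$(printf '%s\n' "$sample" | grep '^c.t')              # cat 42

# * repeats the preceding character zero or more times.
star=$(printf '%s\n' "$sample" | grep '^a*b$')            # ab, b

# [0-9] matches one character from the set of digits.
digits=$(printf '%s\n' "$sample" | grep '[0-9]')          # cat 42

# \(...\) defines a subexpression and \1 refers back to it.
backref=$(printf '%s\n' "$sample" | grep '\(love\).*\1')  # lines with love twice

# x\{n\} repeats x exactly n times.
interval=$(printf '%s\n' "$sample" | grep '^a\{3\}$')     # aaa

# \< and \> anchor the pattern to word boundaries (GNU grep).
word=$(printf '%s\n' "$sample" | grep '\<love\>')

printf '%s\n' "$anchored" "$dot" "$star" "$digits" "$backref" "$interval" "$word"
```

Note that the back-reference pattern also matches "lovely glove", because \(love\) matches a substring, not a whole word; the word anchors in the last pattern are what exclude that line.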

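Extended regular expressions

Extended regular expressions (Extended Regular Expression, ERE) support more metacharacters than basic regular expressions, but ERE does not support every metacharacter in the form that BRE uses. The six metacharacters introduced above, ^, $, ., *, [], and [^], are all supported by ERE. The table below highlights the metacharacters that ERE adds.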

Metacharacters added by extended regular expressions
Metacharacter	Description
+	Quantifier with almost the same meaning as *; the difference is that the character preceding + must appear at least once
?	Quantifier that restricts the preceding character to appear at most once
| and ()	The vertical bar | expresses an "or" relationship between several regular expressions, and () denotes a group. The two are often used together. For example, "(ssl|ssh|^yum)" matches a line that contains ssl, contains ssh, or begins with yum

ERE drops the backslash escapes from the subexpression syntax "\(\)" and from the repetition-count syntax "\{m,n\}". In ERE, therefore, these two metacharacters are written as "()" and x{m,n}, without the \ escape character that "\(\)" and x\{m,n\} require in basic regular expressions.
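
As an illustration of the ERE additions, the sketch below runs grep -E over some made-up sample lines (the sample text and variable names are assumptions for this example only) and repeats the interval match in BRE to show the difference in escaping.

```shell
#!/bin/sh
# Hypothetical sample input, one entry per line.
sample='ssl handshake
openssh server
yum install
run yum
color
colour
colouur
aa
aaaa'

# + requires the preceding character at least once.
plus=$(printf '%s\n' "$sample" | grep -E '^colou+r$')     # colour, colouur

# ? allows the preceding character at most once.
opt=$(printf '%s\n' "$sample" | grep -E '^colou?r$')      # color, colour

# | separates alternatives and () groups them.
alt=$(printf '%s\n' "$sample" | grep -E '(ssl|ssh|^yum)') # not "run yum"

# The interval expression needs no backslashes in ERE ...
ere_interval=$(printf '%s\n' "$sample" | grep -E '^a{2,3}$')  # aa

# ... while the same match in BRE escapes the braces.
bre_interval=$(printf '%s\n' "$sample" | grep '^a\{2,3\}$')   # aa

printf '%s\n' "$plus" "$opt" "$alt" "$ere_interval" "$bre_interval"
```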

Perl regular expressions

The metacharacters of Perl regular expressions are largely the same as those of ERE: Perl regular expressions support all of the ERE metacharacters and add some of their own:

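Metacharacters added by Perl regular expressions
Metacharacter	Description
\d	Matches a digit; equivalent to [0-9]
\D	Matches a non-digit; equivalent to [^0-9]
\s	Matches a whitespace character; equivalent to [ \f\n\r\t\v]
\S	Matches a non-whitespace character; equivalent to [^ \f\n\r\t\v]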
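
The Perl character classes can be tried with grep -P. This is only a sketch: it assumes GNU grep built with PCRE support (the -P option is not available in every grep), and the sample text and variable names are invented for the example.

```shell
#!/bin/sh
# Hypothetical sample input; printf expands \t to a tab and \n to a newline.
sample=$(printf 'order 66\nno digits here\ntab\tseparated\nnospace')

# \d matches a digit.
with_digit=$(printf '%s\n' "$sample" | grep -P '\d')      # order 66

# \D matches a non-digit; this keeps lines made of non-digits only.
no_digit=$(printf '%s\n' "$sample" | grep -P '^\D+$')

# \s matches whitespace, which covers both the space and the tab.
with_space=$(printf '%s\n' "$sample" | grep -P '\s')

# \S matches non-whitespace; this keeps lines with no whitespace at all.
no_space=$(printf '%s\n' "$sample" | grep -P '^\S+$')     # nospace

printf '%s\n' "$with_digit" "$no_digit" "$with_space" "$no_space"
```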

Metacharacter precedence

Regular expressions are evaluated from left to right and follow a fixed order of precedence, much like arithmetic operators. Precedence means that when several metacharacters appear together in a regular expression, the higher-priority metacharacters are interpreted first. The following table lists the precedence of commonly used metacharacters, from highest to lowest:

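Regular expression metacharacter precedence
Metacharacter	Description
\	Escape character
[]	Bracket expression
()	Grouping
*, +, ?, {m,n}, {n}, {m,}	Quantifiers
Ordinary characters	Concatenation, in left-to-right order
^, $, \>, \<	Anchors
|	Alternation ("or")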
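
The effect of precedence is easiest to see by comparing patterns with and without explicit grouping. The following sketch uses grep -E on made-up sample lines; the sample text and variable names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sample input, one entry per line.
sample='a
abb
abab
xb
ax
b
xyz'

# Quantifiers bind tighter than concatenation:
# ab* means "a followed by zero or more b", not "zero or more ab".
tight=$(printf '%s\n' "$sample" | grep -E '^ab*$')       # a, abb

# Grouping has higher precedence, so + now applies to the whole "ab".
grouped=$(printf '%s\n' "$sample" | grep -E '^(ab)+$')   # abab

# Alternation binds loosest: ^a|b$ means "starts with a" OR "ends with b".
loose=$(printf '%s\n' "$sample" | grep -E '^a|b$')       # every line except xyz

# With a group, the anchors apply to both alternatives.
anchored=$(printf '%s\n' "$sample" | grep -E '^(a|b)$')  # a, b

printf '%s\n' "$tight" "$grouped" "$loose" "$anchored"
```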

The grep command

In the shell, grep is a command closely tied to regular expressions, and we can use it to test and validate our regular expressions. The syntax of grep is as follows:

grep [options] pattern [file...]

options stands for the command options, pattern is the pattern to match, given as a regular expression string, and file is a list of files. A common approach is to use a | pipe so that the output of the preceding command becomes the input of grep.

grep uses basic regular expressions by default. Adding -E selects extended regular expressions, and -P selects Perl regular expressions.
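
To see how the chosen syntax changes the meaning of a pattern, the sketch below pipes the same made-up input into grep three times with the pattern a+b. The sample text and variable names are assumptions for this example; -F, which matches literally, is included for comparison.

```shell
#!/bin/sh
# Hypothetical sample input, one entry per line.
sample='a+b
aab
b'

# Default (BRE): + is an ordinary character, so only the literal text a+b matches.
bre=$(printf '%s\n' "$sample" | grep 'a+b')       # a+b

# -E (ERE): + is a quantifier, so the pattern means "one or more a, then b".
ere=$(printf '%s\n' "$sample" | grep -E 'a+b')    # aab

# -F: no regular expression at all, the pattern is matched literally.
fixed=$(printf '%s\n' "$sample" | grep -F 'a+b')  # a+b

printf '%s\n' "$bre" "$ere" "$fixed"
```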

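Common grep command options
Option	Description
-c	Print only the number of matching lines, not the matches themselves
-i	Ignore case when matching
-h	When searching multiple files, do not prefix matches with the file name
-l	List only the names of files that contain matching lines, without the content
-n	Show all matching lines along with their line numbers
-s	Suppress messages about files that do not exist or cannot be read
-v	Show only the lines that do not match
-w	Match whole words only
-x	Match whole lines only
-r	Search directories recursively
-q	Print no matching results; only the exit status indicates whether a match was found
-b	Print the byte offset from the start of the file for each matching line
-E	Use extended regular expressions
-P	Use Perl regular expressions
-F	Do not use regular expressions; match the pattern literally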
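
A few of these options are sketched below on a small made-up log. The sample text and variable names are assumptions for illustration, and only options from the table above are used.

```shell
#!/bin/sh
# Hypothetical sample input, one entry per line.
sample='Error: disk full
warning: low memory
error: retry
all good
error'

# -c counts matching lines; the match is case-sensitive by default.
count=$(printf '%s\n' "$sample" | grep -c 'error')       # 2

# -i ignores case, so "Error" is counted as well.
icount=$(printf '%s\n' "$sample" | grep -ci 'error')     # 3

# -n prefixes each matching line with its line number.
numbered=$(printf '%s\n' "$sample" | grep -n 'warning')  # 2:warning: low memory

# -v inverts the match and keeps the lines that do not match.
inverted=$(printf '%s\n' "$sample" | grep -vi 'error')   # warning..., all good

# -x requires the pattern to match the whole line.
whole_line=$(printf '%s\n' "$sample" | grep -x 'error')  # error

# -w requires the match to be a whole word.
whole_word=$(printf '%s\n' "$sample" | grep -w 'low')    # warning: low memory

# -q prints nothing; the exit status alone reports whether a match was found.
if printf '%s\n' "$sample" | grep -q 'good'; then
    found=yes
else
    found=no
fi

printf '%s\n' "$count" "$icount" "$numbered" "$inverted" "$whole_line" "$whole_word" "$found"
```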

Summary

This article explained what regular expressions in the shell are, covering basic regular expressions, extended regular expressions, and Perl regular expressions. Regular expressions still take a lot of practice to master.

Original site

Copyright notice
This article was written by [edenliuJ]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/215/202208031341590681.html