
Regular expressions

2022-07-07 12:57:00 LC181119

1. Basic regular expression metacharacters

1.1 Character matching

  • .            Matches any single character, including a Chinese character in a multibyte locale such as UTF-8
  • []          Matches any single character in the specified set or range
  • [^]        Matches any single character outside the specified set or range
  • [:alnum:]        Letters and digits
  • [:alpha:]        Any letter, upper or lower case
  • [:lower:]        Lowercase letters
  • [:upper:]        Uppercase letters
  • [:blank:]        Blank characters (space and tab)
  • [:space:]        All whitespace: space, tab (horizontal or vertical), newline, carriage return, and so on
  • [:cntrl:]        Non-printable control characters (backspace, delete, bell, ...)
  • [:digit:]        Decimal digits
  • [:xdigit:]        Hexadecimal digits
  • [:graph:]        Printable non-blank characters
  • [:print:]        Printable characters
  • [:punct:]        Punctuation
  • \w        Matches a word character; equivalent to [_[:alnum:]]
  • \W        Matches a non-word character; equivalent to [^_[:alnum:]]
  • \S        Matches any non-whitespace character; equivalent to [^ \f\n\r\t\v]
  • \s        Matches any whitespace character, including space, tab, and form feed; equivalent to [ \f\n\r\t\v]

                   Note: Unicode-aware regular expression engines also match the full-width space character.
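
The classes listed above can be tried out with a short sketch like the following. It assumes GNU grep (`\W` is a GNU extension) and uses made-up sample input. Note that a POSIX class such as `[:upper:]` only has its special meaning inside a bracket expression, so it is written `[[:upper:]]`.

```shell
# Made-up sample input, one item per line.
sample='abc
ABC
123
a_b
x y'

# POSIX classes go inside a bracket expression: [[:upper:]], not [:upper:].
has_upper=$(printf '%s\n' "$sample" | grep '[[:upper:]]')
has_digit=$(printf '%s\n' "$sample" | grep '[[:digit:]]')

# [^[:alnum:]] also matches the underscore; \W (a GNU extension) does not,
# because the underscore counts as a word character.
non_alnum=$(printf '%s\n' "$sample" | grep '[^[:alnum:]]')
non_word=$(printf '%s\n' "$sample" | grep '\W')

printf 'upper: %s\ndigit: %s\n' "$has_upper" "$has_digit"
printf 'non-alnum:\n%s\nnon-word: %s\n' "$non_alnum" "$non_word"
```

Here `a_b` is selected by `[^[:alnum:]]` but not by `\W`, which shows the one-character difference between the two.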

1.2 Number of matches

A quantifier is written after a character and specifies how many times that preceding character may occur.

  • *                  Matches the preceding character any number of times, including 0; greedy, so it matches as much as possible
  • .*                Any string of any length
  • \?                Matches the preceding character 0 or 1 time, i.e. the character is optional
  • \+                Matches the preceding character at least once (>= 1 time)
  • \{n\}            Matches the preceding character exactly n times
  • \{m,n\}        Matches the preceding character at least m and at most n times
  • \{,n\}            Matches the preceding character at most n times (<= n)
  • \{n,\}            Matches the preceding character at least n times
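
As a minimal sketch of these quantifiers, the commands below run a few basic-syntax patterns over a made-up word list. It assumes GNU grep, since `\?` and `\+` are GNU extensions to basic regular expressions. The `-x` option makes the pattern match the whole line, and `-c` prints the number of matching lines.

```shell
# Made-up word list: "g", then zero to four "o", then "d".
words='gd
god
good
goood
gooood'

# o* : zero or more "o" (every line matches).
star_count=$(printf '%s\n' "$words" | grep -c -x 'go*d')
# o\? : zero or one "o".
optional=$(printf '%s\n' "$words" | grep -x 'go\?d')
# o\+ : one or more "o".
plus_count=$(printf '%s\n' "$words" | grep -c -x 'go\+d')
# o\{2,3\} : between two and three "o".
range=$(printf '%s\n' "$words" | grep -x 'go\{2,3\}d')

printf 'star: %s lines\noptional:\n%s\n' "$star_count" "$optional"
printf 'plus: %s lines\nrange:\n%s\n' "$plus_count" "$range"
```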

1.3 Position anchoring

  • ^                          Start-of-line anchor; placed at the far left of the pattern
  • $                          End-of-line anchor; placed at the far right of the pattern
  • ^PATTERN$        The pattern must match the entire line
  • ^$                         Empty line
  • ^[[:space:]]*$        Blank line (empty or containing only whitespace)
  • \< or \b                Start-of-word anchor; placed at the left of the word pattern
  • \> or \b                 End-of-word anchor; placed at the right of the word pattern
  • \<PATTERN\>        Matches a whole word
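
The anchors above can be compared side by side with a sketch like this one. It assumes GNU grep (the word anchors `\<` and `\>` are GNU extensions), and the input lines are made up for illustration; the fourth line is empty and the fifth contains only spaces.

```shell
# Made-up input: line 4 is empty, line 5 holds three spaces.
text=$(printf 'root login\nchroot jail\nrooted tree\n\n   \nlog in as root')

# ^root : "root" at the start of a line.
at_start=$(printf '%s\n' "$text" | grep '^root')
# root$ : "root" at the end of a line.
at_end=$(printf '%s\n' "$text" | grep 'root$')
# \<root\> : "root" as a whole word, so not "chroot" or "rooted".
whole_word=$(printf '%s\n' "$text" | grep '\<root\>')
# ^$ counts only truly empty lines; ^[[:space:]]*$ also counts whitespace-only lines.
empty_count=$(printf '%s\n' "$text" | grep -c '^$')
blank_count=$(printf '%s\n' "$text" | grep -c '^[[:space:]]*$')

printf 'start:\n%s\nend:\n%s\nword:\n%s\n' "$at_start" "$at_end" "$whole_word"
printf 'empty lines: %s, blank lines: %s\n' "$empty_count" "$blank_count"
```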

1.4 Grouping

Grouping: \( \) binds several characters together so that they are treated as a single unit, for example \(root\)\+
Back-reference: the text matched by the pattern inside each pair of grouping parentheses is recorded by the regular expression engine in internal variables, named \1, \2, \3, ...
\1 refers to the text matched by the pattern between the first left parenthesis, counting from the left, and its matching right parenthesis
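
A small sketch of grouping and back-references in basic syntax follows, assuming a grep that supports back-references (GNU grep does) and using made-up input. The point to notice is that `\1` matches the same text the group actually matched, not just anything the group's pattern could match.

```shell
# Made-up input lines.
lines='abab
abba
hello hello world
hello world'

# \(ab\)\{2\} : the group "ab" as a unit, repeated exactly twice.
repeated_group=$(printf '%s\n' "$lines" | grep -x '\(ab\)\{2\}')
# \(hello\) \1 : "hello", a space, then the same text again.
doubled_word=$(printf '%s\n' "$lines" | grep '\(hello\) \1')
# \(.\)\(.\)\2\1 : two characters followed by the same two in reverse order.
mirrored=$(printf '%s\n' "$lines" | grep -x '\(.\)\(.\)\2\1')

printf 'repeated group: %s\ndoubled word: %s\nmirrored: %s\n' \
  "$repeated_group" "$doubled_word" "$mirrored"
```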

1.5 Alternation

Alternation (or): \|

2. Extended regular expression metacharacters

2.1 Character matching

  • .            Matches any single character, including a Chinese character in a multibyte locale such as UTF-8
  • []          Matches any single character in the specified set or range
  • [^]        Matches any single character outside the specified set or range
  • [:alnum:]        Letters and digits
  • [:alpha:]        Any letter, upper or lower case
  • [:lower:]        Lowercase letters
  • [:upper:]        Uppercase letters
  • [:blank:]        Blank characters (space and tab)
  • [:space:]        All whitespace: space, tab (horizontal or vertical), newline, carriage return, and so on
  • [:cntrl:]        Non-printable control characters (backspace, delete, bell, ...)
  • [:digit:]        Decimal digits
  • [:xdigit:]        Hexadecimal digits
  • [:graph:]        Printable non-blank characters
  • [:print:]        Printable characters
  • [:punct:]        Punctuation

2.2 Number of matches

  • *   Matches the preceding character any number of times, including 0
  • ?   0 or 1 time
  • +   1 or more times
  • {n}   Exactly n times
  • {m,n}   At least m and at most n times
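
The same quantifiers in extended syntax need no backslashes. The following sketch assumes `grep -E` and a made-up word list; `-x` again requires the pattern to match the whole line.

```shell
# Made-up word list.
words='gd
god
good
goood'

# o? : zero or one "o".
optional=$(printf '%s\n' "$words" | grep -E -x 'go?d')
# o+ : one or more "o".
plus=$(printf '%s\n' "$words" | grep -E -x 'go+d')
# o{2} : exactly two "o".
exact=$(printf '%s\n' "$words" | grep -E -x 'go{2}d')
# o{1,2} : one or two "o".
range=$(printf '%s\n' "$words" | grep -E -x 'go{1,2}d')

printf 'optional:\n%s\nplus:\n%s\n' "$optional" "$plus"
printf 'exact: %s\nrange:\n%s\n' "$exact" "$range"
```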

2.3 Position anchoring

  • ^ Start of line
  • $ End of line
  • \<, \b Start of word
  • \>, \b End of word

2.4 Grouping and other

  • () Grouping
  • Back-reference: \1, \2, ...
  • | Alternation (or)
  • a|b    # a or b
  • C|cat    # C or cat
  • (C|c)at    # Cat or cat
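
The difference between `C|cat` and `(C|c)at` can be checked with a sketch like the one below, assuming GNU grep and made-up input. Alternation has the lowest precedence, so without parentheses it splits the whole pattern. The last command shows the basic-syntax spelling of the grouped pattern, where `\|` is a GNU extension.

```shell
# Made-up input lines.
animals='Cat
cat
C
bat'

# C|cat : the whole line is either "C" or "cat".
alternation=$(printf '%s\n' "$animals" | grep -E -x 'C|cat')
# (C|c)at : "C" or "c", followed by "at".
grouped=$(printf '%s\n' "$animals" | grep -E -x '(C|c)at')
# The same grouped pattern in basic syntax needs backslashes.
grouped_basic=$(printf '%s\n' "$animals" | grep -x '\(C\|c\)at')

printf 'C|cat:\n%s\n(C|c)at:\n%s\nbasic syntax:\n%s\n' \
  "$alternation" "$grouped" "$grouped_basic"
```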

Copyright notice
This article was written by [LC181119]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/02/202202130616434206.html