Regular Expressions in Shell Scripts

2022-07-04 00:28:00 【Wozkid】

I. Regular expressions

1.1 Definition of regular expressions

A regular expression (often abbreviated in code as regex, regexp, or RE) is a computer science concept: a pattern that describes a set of strings.

Regular expressions are typically used to search for, and replace, text that matches a pattern (a rule).

There is more than one regular expression syntax, and different Linux programs may support different ones. Common tools that use them include:

Tools: grep, sed, awk, egrep

Regular expressions are also commonly used for validation, that is, to check whether a string matches a given format.

A regular expression is made up of ordinary characters and metacharacters.

Ordinary characters: uppercase and lowercase letters, digits, punctuation, and some other symbols.

Metacharacters: characters that have a special meaning in a regular expression. Many of them specify how the preceding character (the character before the metacharacter) may occur in the target text.

Two regular expression syntaxes are commonly used on Linux:

Basic regular expressions: BRE

Extended regular expressions: ERE

1.2 grep

Format:

grep [options]... pattern file

Option Meaning

-E Use extended regular expressions (ERE)

-c Print the number of lines that match the pattern instead of the lines themselves

-i Ignore case when matching

-o Print only the part of each line that matches the pattern

-v Invert the match: print the lines that do not match the pattern

--color=auto Highlight the matched text in color

-n Prefix each output line with its line number
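The difference between counting lines (-c) and counting occurrences (-o) is easy to miss, so here is a minimal sketch. The file name count_demo.txt and its contents are assumptions made up for this illustration; they are not from the original article.

```shell
# Hypothetical sample file; its name and contents are assumptions for illustration.
printf '%s\n' 'the dog and the fox' 'a bird flew' 'then the end' > count_demo.txt

# -c counts matching lines: lines 1 and 3 contain "the".
grep -c 'the' count_demo.txt            # prints 2

# -o prints every match on its own line, so wc -l counts occurrences.
grep -o 'the' count_demo.txt | wc -l    # prints 4

# -v together with -c counts the lines that do not match.
grep -vc 'the' count_demo.txt           # prints 1
```

Piping grep -o into wc -l is a common way to count occurrences, because grep has no single option that does this.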

1.3 Basic regular expressions

Common metacharacters (supported by grep, egrep, sed, and awk):

Metacharacter Meaning

\ (backslash) Escapes a special character so that it is treated literally

^ Matches the beginning of a line. Example: ^tux matches lines that begin with tux

$ Matches the end of a line. Example: tux$ matches lines that end with tux

. Matches any single character except a newline

[list] Matches any one character in list. Examples: go[ola]d, [abc], [a-z], [a-z0-9]

[^list] Matches any one character not in list. Examples: [^a-z], [^0-9], [^A-Z0-9]

* Matches the preceding character or subexpression zero or more times. Examples: goo*d, go.*d

\{n\} Matches the preceding subexpression exactly n times. Examples: go\{2\}d; '[0-9]\{2\}' matches two digits

\{n,\} Matches the preceding subexpression at least n times. Examples: go\{2,\}d; '[0-9]\{2,\}' matches two or more digits

\{n,m\} Matches the preceding subexpression n to m times. Examples: go\{2,3\}d; '[0-9]\{2,3\}' matches two to three digits

Note: with egrep and awk, write {n}, {n,}, and {n,m} without the backslashes.

      
      
       
# grep -E -n 'wo{2}d' test.txt
8:a wood cross!
# grep -E -n 'wo{2,3}d' test.txt
8:a wood cross!
12:#woood #

Anchors

^ : matches the position at the start of the input string
$ : matches the position at the end of the input string

Nonprinting characters

\n : matches a newline
\r : matches a carriage return
\t : matches a tab
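The note about braces can be checked with a short sketch that runs the same interval in both syntaxes. The file brace_demo.txt and its contents are assumptions for this illustration.

```shell
# Hypothetical sample file; its name and contents are assumptions for illustration.
printf '%s\n' 'a wod' 'a wood cross!' '#woood #' > brace_demo.txt

# BRE (plain grep): the braces must be escaped to act as an interval.
grep -n 'wo\{2\}d' brace_demo.txt       # 2:a wood cross!

# ERE (grep -E): the braces are written without backslashes.
grep -E -n 'wo{2}d' brace_demo.txt      # 2:a wood cross!

# In BRE an unescaped brace is a literal character, so this looks for the
# text "wo{2}d" and finds nothing in this file.
grep -n 'wo{2}d' brace_demo.txt || echo 'no match'

# An ERE range of two to three o's matches lines 2 and 3.
grep -E -n 'wo{2,3}d' brace_demo.txt
```

Quoting the pattern in single quotes keeps the shell from touching the backslashes and braces before grep sees them.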

Examples:

Example: show the lines in test.txt that begin with t

      
      
       
# grep -n '^t' test.txt
4:the tongue is boneless but it breaks bones.12!

Example: show the lines in test.txt that end with s
# grep -n 's$' test.txt
9:Actions speak louder than words

Example: show the lines in test.txt that contain w followed by exactly two o's and then d
# grep -n 'wo\{2\}d' test.txt
8:a wood cross!

Example: show the lines in test.txt that contain w followed by two or more o's and then d
# grep -n 'wo\{2,\}d' test.txt
8:a wood cross!
12:#woood #
13:#woooooood

Example: a bracket list matches exactly one of the listed characters (here, go followed by o, s, a, or d)
# grep 'go[osad]' test.txt
google is the best tools for search keyword.

A quantifier applies only to the single character (or subexpression) immediately before it.
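To illustrate that a quantifier binds only to the item directly before it, here is a minimal sketch. The file star_demo.txt and its contents are assumptions for this illustration.

```shell
# Hypothetical sample file; its name and contents are assumptions for illustration.
printf '%s\n' 'god' 'good' 'goood' 'gold' > star_demo.txt

# In goo*d the star applies only to the second o, so "go" is required and
# further o's are optional: god, good and goood match, gold does not.
grep -n 'goo*d' star_demo.txt

# In go.*d the star applies to the dot (any character), so all four lines match.
grep -n 'go.*d' star_demo.txt

# A bracket list stands for exactly one character: only good and gold match.
grep -n 'go[ola]d' star_demo.txt
```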

1.4 Extended regular expressions

Basic regular expressions are usually enough, but extended regular expressions offer a wider set of metacharacters that can simplify a command.

Like basic regular expressions, extended regular expressions have their own metacharacters. The common ones are:

Metacharacter Meaning

+ Matches the preceding character one or more times

? Matches the preceding character zero or one time

| (pipe) Alternation: matches either the expression before it or the one after it

() Groups characters into a single subexpression

()+ Matches one or more repetitions of the group

Examples: +, ?, |, (), ()+ (the original screenshots are not reproduced here)
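Each of the five extended metacharacters can be tried with grep -E. This is a minimal sketch; the file ere_demo.txt and its contents are assumptions for this illustration, not the original article's examples.

```shell
# Hypothetical sample file; its name and contents are assumptions for illustration.
printf '%s\n' 'wd' 'wod' 'wood' 'best' 'test' 'xyzxyzxyz' > ere_demo.txt

# + : one or more o's between w and d (wod, wood).
grep -E -n 'wo+d' ere_demo.txt

# ? : zero or one o between w and d (wd, wod).
grep -E -n 'wo?d' ere_demo.txt

# | : either alternative (best, test).
grep -E -n 'best|test' ere_demo.txt

# () : group the alternatives so that "est" is shared (best, test).
grep -E -n '(b|t)est' ere_demo.txt

# ()+ : one or more repetitions of the group xyz filling the whole line.
grep -E -n '^(xyz)+$' ere_demo.txt      # 6:xyzxyzxyz
```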

1.5 Metacharacter usage examples

1. Find specific characters

Finding specific characters is simple. For example, grep -n 'the' test.txt finds the string "the" in test.txt. The "-n" option shows line numbers, and adding "-i" makes the match case-insensitive (grep -in 'the' test.txt). When color output is enabled (for example with --color=auto), the matched text is shown in red.

To invert the selection, that is, to find the lines that do not contain "the", use grep's "-v" option, combined with "-n" to show line numbers: grep -vn 'the' test.txt
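The three searches described above can be sketched on a small file. The file the_demo.txt and its contents are assumptions for this illustration.

```shell
# Hypothetical sample file; its name and contents are assumptions for illustration.
printf '%s\n' 'The home of football' 'the tongue is boneless' \
    'a wood cross!' 'PI=3.14' > the_demo.txt

# Case-sensitive search with line numbers: only line 2 contains "the".
grep -n 'the' the_demo.txt      # 2:the tongue is boneless

# Case-insensitive search: "The" on line 1 now matches as well.
grep -in 'the' the_demo.txt

# Inverted search: lines 1, 3 and 4 do not contain lowercase "the".
grep -vn 'the' the_demo.txt
```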

2. Use brackets "[]" to match a set of characters

To find both "shirt" and "short", note that the two strings share "sh" and "rt". The command grep -n 'sh[io]rt' test.txt finds both. No matter how many characters appear inside "[]", the brackets match exactly one character, so "[io]" matches either "i" or "o".

To find lines that contain the repeated character "oo", run grep -n 'oo' test.txt.

To find "oo" that is not preceded by "w", use a negated set "[^ ]". For example, grep -n '[^w]oo' test.txt finds strings in test.txt where "oo" is preceded by a character other than "w".

The output of that command still includes "#woood #" and "#woooooood #", even though both contain "w". The highlighted part of the output shows why: in "#woood #" the match is "ooo", where the first "o" is the non-"w" character in front of "oo". The same applies to "#woooooood #".

To require that "oo" is not preceded by a lowercase letter, use grep -n '[^a-z]oo' test.txt, where "a-z" stands for the lowercase letters; uppercase letters are written "A-Z".

To find lines that contain a digit, use grep -n '[0-9]' test.txt.
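The bracket patterns above, including the case where "woood" still matches '[^w]oo', can be reproduced with this minimal sketch. The file set_demo.txt and its contents are assumptions for this illustration; LC_ALL=C is set so that ranges such as a-z cover only ASCII lowercase letters.

```shell
# Byte-wise locale so that [a-z] means exactly the ASCII lowercase letters.
export LC_ALL=C

# Hypothetical sample file; its name and contents are assumptions for illustration.
printf '%s\n' 'he was short and fat.' 'a blue polo shirt' 'google is good' \
    'a wood cross!' '#woood #' 'Foo 12' > set_demo.txt

# [io] matches one character, i or o: lines 1 (short) and 2 (shirt).
grep -n 'sh[io]rt' set_demo.txt

# "oo" preceded by a character other than w: lines 3, 5 and 6.
# Line 5 matches because in "ooo" the first o is the non-w character.
grep -n '[^w]oo' set_demo.txt

# "oo" preceded by something that is not a lowercase letter.
grep -n '[^a-z]oo' set_demo.txt     # 6:Foo 12

# Lines that contain a digit.
grep -n '[0-9]' set_demo.txt        # 6:Foo 12
```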

3. Find the beginning of a line "^" and the end of a line "$"

Basic regular expressions have two anchor metacharacters: "^" (beginning of line) and "$" (end of line). The earlier search for "the" returned many lines that contain "the" anywhere. To return only the lines that begin with "the", use the "^" metacharacter: grep -n '^the' test.txt

Lines that begin with a lowercase letter can be selected with the pattern "^[a-z]", lines that begin with an uppercase letter with "^[A-Z]", and lines that do not begin with a letter with "^[^a-zA-Z]".

The "^" symbol means different things inside and outside the brackets "[]": as the first character inside "[]" it negates the set, and outside "[]" it anchors the match to the beginning of the line. Conversely, to find lines that end with a particular character, use the "$" anchor. For example, grep -n '\.$' test.txt finds lines that end with a period (.). Because the period is itself a metacharacter in regular expressions (described in the next item), it must be escaped with "\" to be treated as an ordinary character.

To find blank lines, run grep -n '^$' test.txt.
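The anchor patterns can be sketched as follows. The file anchor_demo.txt and its contents (including the blank fourth line) are assumptions for this illustration.

```shell
# Byte-wise locale so that [a-z] and [A-Z] do not overlap.
export LC_ALL=C

# Hypothetical sample file with a blank fourth line; contents are assumptions.
printf '%s\n' 'the tongue is boneless.' 'The year ahead' 'a wood cross!' \
    '' '#woood #' > anchor_demo.txt

grep -n '^the' anchor_demo.txt          # 1:the tongue is boneless.
grep -n '^[a-z]' anchor_demo.txt        # lines 1 and 3 start with a lowercase letter
grep -n '^[A-Z]' anchor_demo.txt        # 2:The year ahead
grep -n '^[^a-zA-Z]' anchor_demo.txt    # 5:#woood #

# The period is escaped so that it matches a literal "." at the end of the line.
grep -n '\.$' anchor_demo.txt           # 1:the tongue is boneless.

# Start of line immediately followed by end of line: a blank line.
grep -n '^$' anchor_demo.txt            # 4:
```

Note that the blank line is not reported by '^[^a-zA-Z]', because a bracket expression always needs one character to match.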

4. Find any character "." and repeated characters "*"

As mentioned earlier, the period (.) is a metacharacter that stands for any single character. For example, grep -n 'w..d' test.txt finds four-character strings that start with "w" and end with "d".

In the sample file, the string "wood" matches the pattern "w..d". To match oo, ooo, ooooo, and so on, use the asterisk (*) metacharacter. Note that "*" means zero or more repetitions of the single character before it. "o*" matches zero "o" characters (the empty string) or one or more of them; because the empty string is allowed, grep -n 'o*' test.txt prints every line of the file. With "oo*", the first o is required and the second o may appear zero or more times, so o, oo, ooo, and so on all match.

To find strings that contain at least two o's, run grep -n 'ooo*' test.txt.

To find strings that start with w, end with d, and have at least one o in between, run grep -n 'woo*d' test.txt.

To find strings that start with w and end with d with any characters (or none) in between, run grep -n 'w.*d' test.txt.

To find lines that contain a number, run grep -n '[0-9][0-9]*' test.txt.
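The behavior of "." and "*" described above, including the fact that 'o*' matches every line, can be verified with this minimal sketch. The file dot_demo.txt and its contents are assumptions for this illustration.

```shell
# Hypothetical sample file; its name and contents are assumptions for illustration.
printf '%s\n' 'a wood cross!' '#woood #' 'wd' 'I would say so' 'PI=3.14' > dot_demo.txt

# w, any two characters, d: only "wood" on line 1.
grep -n 'w..d' dot_demo.txt             # 1:a wood cross!

# o* also matches the empty string, so every line is selected.
grep -c 'o*' dot_demo.txt               # prints 5

# Two required o's followed by zero or more: lines 1 and 2.
grep -n 'ooo*' dot_demo.txt

# w, at least one o, then d: lines 1 and 2 ("wd" and "would" do not match).
grep -n 'woo*d' dot_demo.txt

# w, anything (or nothing), d: lines 1 to 4.
grep -n 'w.*d' dot_demo.txt

# One digit followed by zero or more digits.
grep -n '[0-9][0-9]*' dot_demo.txt      # 5:PI=3.14
```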

5. Match a bounded number of repetitions with "{}"

The "." and "*" metacharacters describe zero to unlimited repetitions. To limit the number of repetitions to a range, for example three to five consecutive o's, use the interval expression "{}". In basic regular expressions an unescaped brace is an ordinary character, so the braces must be written with the escape character as "\{" and "\}" to get their special meaning. The usage is as follows.

Find two consecutive o's: grep -n 'o\{2\}' test.txt

Find strings that start with w, end with d, and contain 2 to 5 o's in between: grep -n 'wo\{2,5\}d' test.txt

Find strings that start with w, end with d, and contain 2 or more o's in between: grep -n 'wo\{2,\}d' test.txt
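The three interval forms can be compared side by side. This is a minimal sketch; the file interval_demo.txt and its contents are assumptions for this illustration.

```shell
# Hypothetical sample file; the last line has more than five o's.
printf '%s\n' 'wod' 'a wood cross!' '#woood #' '#woooooood #' > interval_demo.txt

# At least two consecutive o's somewhere on the line: lines 2, 3 and 4.
grep -n 'o\{2\}' interval_demo.txt

# w, two to five o's, d: lines 2 and 3 (line 4 has too many o's).
grep -n 'wo\{2,5\}d' interval_demo.txt

# w, two or more o's, d: lines 2, 3 and 4.
grep -n 'wo\{2,\}d' interval_demo.txt
```

Without anchoring letters such as w and d around it, 'o\{2\}' also matches longer runs of o, because any run of three or more o's contains two consecutive o's.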

II. Text processing tools

2.1 cut: extract columns

Description:

The cut command extracts bytes, characters, or fields from each line of a file and writes them to standard output.

If no file argument is given, cut reads standard input. One of the -b, -c, or -f options must be specified.

Note: cut works best on text whose columns are separated by a single character.

Format: cut [options] file

Common options

Option	Meaning
-b	Select by byte
-c	Select by character; intended for multibyte text such as Chinese
-d	Specify the field delimiter; the default is Tab
-f	Select fields by number; usually used together with -d

Examples:

Example 1: extract the first column of the passwd file

Example 2: extract the first and third columns of the passwd file

Example 3: extract the first through third columns of the passwd file

Example 4: extract the third byte of each line of the who output

Example 5: extract the first character of each line of the name file
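The five examples above can be sketched as follows. The files passwd_demo.txt and name_demo.txt and their contents are assumptions for this illustration, and the byte example uses the sample file instead of who so that the output is predictable.

```shell
# Hypothetical passwd-style sample; contents are assumptions for illustration.
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    'bin:x:1:1:bin:/bin:/sbin/nologin' > passwd_demo.txt
printf '%s\n' 'alice' 'bob' > name_demo.txt

# Example 1: first colon-separated field.
cut -d: -f1 passwd_demo.txt         # root, bin

# Example 2: first and third fields (the delimiter is kept in the output).
cut -d: -f1,3 passwd_demo.txt       # root:0, bin:1

# Example 3: fields one through three.
cut -d: -f1-3 passwd_demo.txt       # root:x:0, bin:x:1

# Example 4: third byte of each line; the same option works on a pipe,
# for example: who | cut -b 3
cut -b 3 passwd_demo.txt            # o, n

# Example 5: first character of each line.
cut -c 1 name_demo.txt              # a, b
```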

2.2 sort: sorting tool

sort sorts the contents of a file line by line. It can also sort according to data type; for example, numbers and text are ordered differently.

Format: sort [options] file

Common options

Option Meaning

-t Specify the field separator; by default, fields are separated by whitespace (Tab or space)

-k Specify the field (key) to sort on

-n Sort numerically; the default is to sort as text

-u Like uniq: output only one of each set of identical lines. Note: lines that differ only by trailing spaces are not treated as duplicates

-r Reverse the order; the default is ascending, and -r gives descending

-o Write the sorted result to the specified file

Examples:

sort passwd.txt    ### With no options, lines are sorted in ascending text order starting from the first column (a to z, top to bottom)

sort -n -t: -k3 passwd.txt    ### Use a colon as the separator and sort numerically on the third column (ascending)

sort -nr -t: -k3 passwd.txt   ### Use a colon as the separator and sort numerically on the third column (descending)

sort -nr -t: -k3 passwd.txt -o passwd.bak    ### Write the sorted result to passwd.bak instead of the screen
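The difference between text order and numeric order is easiest to see on a file whose third column has numbers of different lengths. This is a minimal sketch; the file sort_demo.txt and its contents are assumptions for this illustration.

```shell
# Byte-wise locale for a predictable text order.
export LC_ALL=C

# Hypothetical sample file; its name and contents are assumptions for illustration.
printf '%s\n' 'carol:x:100' 'alice:x:9' 'bob:x:25' 'alice:x:9' > sort_demo.txt

# Text sort on field 3: "100" < "25" < "9", so carol comes first.
sort -t: -k3 sort_demo.txt

# Numeric sort on field 3: 9 < 25 < 100, so alice comes first.
sort -n -t: -k3 sort_demo.txt

# Numeric sort in descending order, written to a file instead of the screen.
sort -nr -t: -k3 -o sort_demo.bak sort_demo.txt

# -u keeps one copy of the duplicated alice line: three lines remain.
sort -u sort_demo.txt
```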

2.3 uniq: remove adjacent duplicate lines

Note: uniq only compares adjacent lines, so it is usually combined with sort: sort first so that identical lines become adjacent, then remove the duplicates. Duplicate lines that are not adjacent are not removed.

Format: uniq [options] file

Common options

Option Meaning

-c Prefix each line with the number of times it occurs

-d Show only duplicated lines

-u Show only lines that are not repeated

Examples:

Example: create a fruit file with 9 lines of content

Example 1: count repeated lines; identical lines that are not adjacent are counted separately

Example 2: combine with sort to get the intended counts

Example 3: combine with sort and show only the duplicated lines

Example 4: combine with sort to remove duplicates

Example 5: sort -u can also be used directly

Example: list the logged-in users
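The examples above can be reproduced with a nine-line file. This is a minimal sketch; the file fruit_demo.txt and its contents are assumptions for this illustration, not the original article's data.

```shell
# Hypothetical nine-line fruit file; contents are assumptions for illustration.
printf '%s\n' apple apple peach pear banana cherry cherry banana orange > fruit_demo.txt

# Example 1: without sorting, the two banana lines are not adjacent,
# so each is reported with a count of 1 (seven output lines in total).
uniq -c fruit_demo.txt

# Example 2: after sorting, identical lines are adjacent and counted together.
sort fruit_demo.txt | uniq -c

# Example 3: only the lines that occur more than once.
sort fruit_demo.txt | uniq -d       # apple, banana, cherry

# Example 4: remove duplicates.
sort fruit_demo.txt | uniq

# Example 5: the same result with sort alone.
sort -u fruit_demo.txt

# Lines that occur exactly once.
sort fruit_demo.txt | uniq -u       # orange, peach, pear

# Logged-in users, one name each, could be listed in a similar way:
#   who | cut -d' ' -f1 | sort -u
```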

2.4 tr: character replacement tool

tr can replace one character with another, delete characters entirely, or squeeze repeated characters.

Format: tr [options]... SET1 [SET2]

tr translates, squeezes, and/or deletes characters from standard input and writes the result to standard output.

Common options

Option Meaning

-d Delete the characters in SET1

-s Squeeze each run of a repeated character into a single occurrence

Examples:

Example 1: convert the lowercase letters a-z in the fruit file to uppercase A-Z

Example 2: replace characters (the replacement is one-to-one, character by character)

Example 3: enclose the character sets in single quotes, including any special characters

Example 4: replace several different characters with a single character

Example 5: delete newlines

Example 6: squeeze repeated p characters, keeping only the first of each run

Example 7: squeeze consecutive newlines into one, which effectively removes blank lines
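The seven examples above can be sketched with input supplied by printf. The input strings and the file tr_demo.txt are assumptions for this illustration.

```shell
# Hypothetical input file; its name and contents are assumptions for illustration.
printf '%s\n' 'apple' 'pear' > tr_demo.txt

# Example 1: lowercase to uppercase; tr reads standard input, hence the redirect.
tr 'a-z' 'A-Z' < tr_demo.txt                # APPLE, PEAR

# Example 2: one-to-one replacement, a becomes A and p becomes P.
printf 'apple\n' | tr 'ap' 'AP'             # APPle

# Example 3: single quotes protect special characters from the shell.
printf 'a*b\n' | tr '*' '-'                 # a-b

# Example 4: map several characters to one; [x*] repeats x to fill SET2.
printf 'apple\n' | tr 'apl' '[x*]'          # xxxxe

# Example 5: delete every newline.
printf 'a\nb\nc\n' | tr -d '\n'             # abc (no trailing newline)

# Example 6: squeeze runs of p into a single p.
printf 'appppple\n' | tr -s 'p'             # aple

# Example 7: squeeze runs of newlines, which removes blank lines.
printf 'a\n\n\nb\n' | tr -s '\n'            # a, b
```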

Copyright notice
This article was written by [Wozkid]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202142011552050.html
