Regular Expressions in Shell Scripts
2022-07-04 00:28:00 【Wozkid】
I. Regular expressions
1.1 Definition of regular expressions
A regular expression (often abbreviated in code as regex, regexp, or RE) is a concept from computer science.
Regular expressions are typically used to search for, and replace, text that matches a pattern (a rule).
There is more than one flavor of regular expression, and different programs on Linux may use different flavors, for example:
Tools: grep, sed, awk, egrep
Regular expressions are usually used as a test, to check whether a string conforms to a given format.
A regular expression is made up of ordinary characters and metacharacters.
Ordinary characters: upper- and lower-case letters, digits, punctuation, and some other symbols
Metacharacters: special characters that carry a special meaning in a regular expression; they specify how the preceding character (the character before the metacharacter) may occur in the target text
Linux commonly uses two regular expression flavors:
Basic regular expressions: BRE
Extended regular expressions: ERE
1.2 grep
Format:
grep [options]... pattern file
-E Enable extended regular expressions (ERE)
-c Count the lines that contain the search string
-i Ignore case, so upper and lower case are treated as the same
-o Print only the part of each line that matches the pattern
-v Invert the match: output the lines that do not contain the search string
--color=auto Highlight the matched text in color
-n Also print the line number
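As a rough illustration of how -c and -o differ, the sketch below uses a made-up sample.txt (an assumption, not the article's test.txt). -c counts matching lines, while -o prints each match on its own line, so piping it to wc -l counts occurrences.

```shell
# Hypothetical sample file, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'foo boo' 'bar' 'one' > "$tmp/sample.txt"

# -c counts the LINES that contain at least one match.
line_count=$(grep -c 'o' "$tmp/sample.txt")

# -o prints every match on its own line; wc -l then counts the occurrences.
match_count=$(grep -o 'o' "$tmp/sample.txt" | wc -l | tr -d ' ')

echo "lines with o: $line_count, occurrences of o: $match_count"
rm -rf "$tmp"
```

Two lines contain an "o", but the letter occurs five times in total, which is why the two numbers differ.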
1.3 Basic regular expressions
Common metacharacters (supported by grep, egrep, sed, awk):
Metacharacter | Meaning |
\ | Escape character: removes the special meaning of the character that follows |
^ | Matches the beginning of a line; ^tux matches lines that start with tux |
$ | Matches the end of a line; tux$ matches lines that end with tux |
. | Matches any single character except a line break |
[list] | Matches one character from list; examples: go[ola]d, [abc], [a-z], [a-z0-9] |
[^list] | Matches one character that is not in list; examples: [^a-z], [^0-9], [^A-Z0-9] |
* | Matches the preceding subexpression zero or more times; examples: goo*d, go.*d |
\{n\} | Matches the preceding subexpression exactly n times; examples: go\{2\}d, '[0-9]\{2\}' matches two digits |
\{n,\} | Matches the preceding subexpression at least n times; examples: go\{2,\}d, '[0-9]\{2,\}' matches two or more digits |
\{n,m\} | Matches the preceding subexpression n to m times; examples: go\{2,3\}d, '[0-9]\{2,3\}' matches two to three digits |
Note: in egrep and awk, {n}, {n,}, and {n,m} are written without the "\" before the braces.
# egrep -n 'wo{2}d' test.txt
8:a wood cross!
# egrep -n 'wo{2,3}d' test.txt
8:a wood cross!
12:#woood #
Anchors
^ : Matches the position where the input string starts
$ : Matches the position where the input string ends
Non-printing characters
\n : Matches a line feed
\r : Matches a carriage return
\t : Matches a tab
Examples:
Example: find the lines in test.txt that begin with t
# grep -n '^t' test.txt
4:the tongue is boneless but it breaks bones.12!
Example: find the lines in test.txt that end with s
# grep -n 's$' test.txt
9:Actions speak louder than words
Example: find the words in test.txt where w is followed by exactly two o's
# grep -E -n 'wo{2}d' test.txt
8:a wood cross!
Example: find the words in test.txt where w is followed by two or more o's
# grep -E -n 'wo{2,}d' test.txt
8:a wood cross!
12:#woood #
13:#woooooood
Example: a bracket expression matches any one character from its list
# grep 'go[osad]' test.txt
google is the best tools for search keyword.
A repetition metacharacter applies to the character immediately before it: it specifies how many times that character may occur.
1.4 Extended regular expressions
Basic regular expressions are usually enough, but sometimes the broader extended regular expressions are needed to simplify a command.
Like basic regular expressions, extended regular expressions include several metacharacters. The most common ones are:
Metacharacter | Effect |
+ | Matches the preceding character one or more times |
? | Matches the preceding character zero or one time |
| (pipe) | Alternation (or): matches any one of several alternatives |
() | Groups characters into a single unit |
()+ | Matches one or more repetitions of a group |
[Figures: one example each for +, ?, |, () and ()+]
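To make the five extended metacharacters concrete, here is a small sketch. The word list is invented for this illustration, and each grep -E call counts the lines that one pattern matches.

```shell
# Hypothetical word list, one entry per line (not taken from the article).
tmp=$(mktemp -d)
printf '%s\n' gd god good goood color colour cat dog ab abab > "$tmp/words.txt"

plus=$(grep -Ec 'go+d' "$tmp/words.txt")      # one or more o: god, good, goood
opt=$(grep -Ec 'colou?r' "$tmp/words.txt")    # optional u: color, colour
alt=$(grep -Ec 'cat|dog' "$tmp/words.txt")    # either alternative: cat, dog
grp=$(grep -Ec 'g(o|oo)d' "$tmp/words.txt")   # group of alternatives: god, good
rep=$(grep -Ec '^(ab)+$' "$tmp/words.txt")    # repeated group: ab, abab

echo "+:$plus ?:$opt |:$alt ():$grp ()+:$rep"
rm -rf "$tmp"
```

Without -E, grep would treat +, ?, |, ( and ) as ordinary characters, so the same patterns would match none of these lines.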
1.5 Metacharacter operation cases
1. Find specific characters
Finding specific characters is simple. A grep search for "the" in test.txt shows where the string occurs; "-n" displays line numbers and "-i" makes the match case-insensitive. In the output, the matched text is shown in red.
For reverse selection, that is, to find the lines that do not contain "the", use grep's "-v" option, together with "-n" to display line numbers.
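The searches described here might look like the following sketch. The four-line sample file is an assumption made for this illustration, not the article's test.txt.

```shell
# Hypothetical sample file for illustration.
tmp=$(mktemp -d)
printf '%s\n' 'He was short and fat.' 'the tongue is boneless' \
  'The year ahead' 'a wood cross!' > "$tmp/sample.txt"

exact=$(grep -n 'the' "$tmp/sample.txt")     # case-sensitive, with line numbers
nocase=$(grep -in 'the' "$tmp/sample.txt")   # also matches "The"
inverse=$(grep -vn 'the' "$tmp/sample.txt")  # lines that do NOT contain "the"

printf 'exact:\n%s\nnocase:\n%s\ninverse:\n%s\n' "$exact" "$nocase" "$inverse"
rm -rf "$tmp"
```

Note that -v inverts the case-sensitive match here, so the line beginning with "The" is still listed among the non-matching lines.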
2. Using brackets "[]" to match a set of characters
To find both "shirt" and "short", note that the two strings share "sh" and "rt", so a pattern such as 'sh[io]rt' matches both. However many characters appear inside "[]", the brackets stand for exactly one character; "[io]" matches either "i" or "o".
To find lines that contain the repeated character "oo", run "grep -n 'oo' test.txt".
To find "oo" that is not preceded by "w", use the negated set "[^ ]". For example, "grep -n '[^w]oo' test.txt" searches test.txt for "oo" preceded by any character other than "w".
The results of this command still include "woood" and "wooooood", even though both contain "w". The highlighted part of the output shows why: in "#woood #" the highlighted text is "ooo", so the final "oo" is preceded by an "o", which satisfies the rule. The same applies to "#woooooood #".
To exclude "oo" preceded by a lowercase letter, use "grep -n '[^a-z]oo' test.txt", where "a-z" stands for the lowercase letters; uppercase letters are written "A-Z".
To find lines that contain digits, use "grep -n '[0-9]' test.txt".
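The bracket patterns discussed in this subsection can be tried on a small file. The sketch below uses an invented sample (an assumption) and sets LC_ALL=C so that [a-z] covers only ASCII lowercase letters.

```shell
export LC_ALL=C   # keep [a-z] limited to ASCII lowercase letters
tmp=$(mktemp -d)
# Hypothetical sample lines, chosen to exercise each pattern.
printf '%s\n' shirt short wood foot '#woood #' Foo 'room 12' > "$tmp/sample.txt"

set_match=$(grep -n 'sh[io]rt' "$tmp/sample.txt")   # shirt and short
not_w=$(grep -n '[^w]oo' "$tmp/sample.txt")         # oo preceded by a non-w character
not_lower=$(grep -n '[^a-z]oo' "$tmp/sample.txt")   # oo preceded by a non-lowercase character
digits=$(grep -n '[0-9]' "$tmp/sample.txt")         # lines containing a digit

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$set_match" "$not_w" "$not_lower" "$digits"
rm -rf "$tmp"
```

"wood" is excluded by '[^w]oo', while "#woood #" is still reported because its last two o's are preceded by another "o".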
3. Matching the beginning "^" and end "$" of a line
Basic regular expressions have two anchoring metacharacters: "^" (beginning of line) and "$" (end of line). In the earlier example, the search for "the" returned many lines that merely contain "the"; to find only the lines that begin with "the", use the "^" metacharacter, as in '^the'.
Lines that begin with a lowercase letter can be matched with "^[a-z]", lines that begin with an uppercase letter with "^[A-Z]", and lines that do not begin with a letter with "^[^a-zA-Z]".
The "^" symbol means different things inside and outside "[]": inside the brackets it negates the set, outside it anchors the match to the beginning of the line. Conversely, to find lines that end with a particular character, use the "$" anchor. For example, the pattern '\.$' finds lines that end with a period (.). Because the period is itself a metacharacter in regular expressions (covered in the next subsection), the escape character "\" is needed to turn it into an ordinary character.
To find blank lines, run "grep -n '^$' test.txt".
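A short sketch of the anchor patterns, run against an invented five-line file (an assumption; the fourth line is deliberately empty):

```shell
export LC_ALL=C   # keep [a-z] and [A-Z] limited to ASCII letters
tmp=$(mktemp -d)
# Hypothetical sample; the empty string produces a blank fourth line.
printf '%s\n' 'the start' 'ends with a dot.' 'Upper case' '' \
  '1 starts with a digit' > "$tmp/sample.txt"

starts_the=$(grep -n '^the' "$tmp/sample.txt")              # begins with "the"
starts_upper=$(grep -n '^[A-Z]' "$tmp/sample.txt")          # begins with an uppercase letter
starts_nonletter=$(grep -n '^[^a-zA-Z]' "$tmp/sample.txt")  # begins with a non-letter
ends_dot=$(grep -n '\.$' "$tmp/sample.txt")                 # ends with a literal period
blank=$(grep -n '^$' "$tmp/sample.txt")                     # empty lines

printf '%s\n' "$starts_the" "$starts_upper" "$starts_nonletter" "$ends_dot" "$blank"
rm -rf "$tmp"
```

The empty line is not matched by '^[^a-zA-Z]', because a bracket expression always needs one character to match; it is found only by '^$'.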
4. Matching any character "." and repeated characters "*"
As mentioned earlier, the period (.) is a metacharacter in regular expressions and stands for any single character. For example, the pattern 'w..d' finds four-character strings that start with "w" and end with "d".
In such a search, the string "wood" matches the rule "w..d". To match oo, ooo, ooooo and so on, the asterisk (*) metacharacter is needed. Note that "*" means zero or more repetitions of the single preceding character. "o*" matches zero (the empty string) or one or more "o" characters; because the empty string is allowed, "grep -n 'o*' test.txt" prints every line of the file. With "oo*", the first o must be present and the second o may occur zero or more times, so o, oo, ooo, and so on all match.
To find strings that contain at least two o's, run "grep -n 'ooo*' test.txt".
To find strings that start with w, end with d, and have at least one o in between, use the pattern 'woo*d'.
To find strings that start with w and end with d with any characters, or none, in between, use the pattern 'w.*d'.
To find lines that contain a run of digits, use the pattern '[0-9][0-9]*'.
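The sketch below counts, for an invented six-line file (an assumption), how many lines each of the "." and "*" patterns above matches.

```shell
tmp=$(mktemp -d)
# Hypothetical sample lines.
printf '%s\n' wd wod wood woood word 'google 123' > "$tmp/sample.txt"

any_two=$(grep -c 'w..d' "$tmp/sample.txt")        # w, any two characters, d: wood, word
zero_or_more=$(grep -c 'o*' "$tmp/sample.txt")     # matches the empty string, so every line
two_plus=$(grep -c 'ooo*' "$tmp/sample.txt")       # at least two o's: wood, woood, google
w_o_d=$(grep -c 'woo*d' "$tmp/sample.txt")         # w, one or more o, d: wod, wood, woood
w_any_d=$(grep -c 'w.*d' "$tmp/sample.txt")        # w ... d with anything in between
digits=$(grep -c '[0-9][0-9]*' "$tmp/sample.txt")  # a run of one or more digits

echo "$any_two $zero_or_more $two_plus $w_o_d $w_any_d $digits"
rm -rf "$tmp"
```

'o*' matching all six lines, including "wd", is the behavior the text warns about: zero repetitions always succeed.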
5. Matching a range of repetitions "{}"
"." and "*" can express zero to infinitely many repetitions. To limit the number of repetitions to a range, for example to find three to five consecutive o's, use the interval expression "{}" of basic regular expressions. In a basic regular expression an unescaped "{" is an ordinary character, so the braces must be written with the escape character "\", as "\{" and "\}". "{}" is used as follows:
Find lines with two consecutive o's: 'o\{2\}'
Find strings that start with w, end with d, and contain 2 to 5 o's in between: 'wo\{2,5\}d'
Find strings that start with w, end with d, and contain 2 or more o's in between: 'wo\{2,\}d'
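A sketch of the three interval queries, again on an invented file (an assumption). It also runs one of them with -E to show that the extended syntax drops the backslashes.

```shell
tmp=$(mktemp -d)
# Hypothetical sample lines; the fifth has seven o's.
printf '%s\n' wd wod wood woood woooooood food > "$tmp/sample.txt"

two_o=$(grep -c 'o\{2\}' "$tmp/sample.txt")        # lines with at least two consecutive o's
bre=$(grep -n 'wo\{2,5\}d' "$tmp/sample.txt")      # BRE: w, 2 to 5 o's, d
ere=$(grep -En 'wo{2,5}d' "$tmp/sample.txt")       # ERE: same pattern, braces unescaped
at_least=$(grep -c 'wo\{2,\}d' "$tmp/sample.txt")  # w, 2 or more o's, d

printf '%s\n%s\n%s\n%s\n' "$two_o" "$bre" "$ere" "$at_least"
rm -rf "$tmp"
```

The single quotes keep the shell from touching the pattern, so the backslashes reach grep unchanged.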
II. Text processing tools
2.1 cut: extracting columns
Description:
The cut command cuts bytes, characters, or fields from each line of a file and writes them to standard output.
If no file argument is given, cut reads standard input. One of the -b, -c, or -f options must be specified.
Note: cut works best on text separated by a single delimiter character
Format: cut [options] file
Common options
Option | Effect |
-b | Select by byte |
-c | Select by character; commonly used for Chinese text |
-d | Specify the field delimiter; the default is tab |
-f | Select fields; usually used together with -d |
Example 1: extract the first column of the passwd file
Example 2: extract the first and third columns of the passwd file
Example 3: extract the first through third columns of the passwd file
Example 4: extract the third byte of each line of the who output
Example 5: extract the first character of each line of the name file
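The five cases can be sketched as follows. The two passwd-style lines are invented for this illustration, and because the output of who differs from machine to machine, a fixed string stands in for it and for the name file.

```shell
tmp=$(mktemp -d)
# Hypothetical passwd-style data (assumption; not a real /etc/passwd).
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' > "$tmp/passwd.txt"

col1=$(cut -d: -f1 "$tmp/passwd.txt")       # first field
col1_3=$(cut -d: -f1,3 "$tmp/passwd.txt")   # first and third fields
col1to3=$(cut -d: -f1-3 "$tmp/passwd.txt")  # first through third fields

# A fixed string replaces the who output and the name file.
third_byte=$(printf 'hello\n' | cut -b 3)   # third byte of each line
first_char=$(printf 'hello\n' | cut -c 1)   # first character of each line

printf '%s\n' "$col1" "$col1_3" "$col1to3" "$third_byte" "$first_char"
rm -rf "$tmp"
```

When several fields are selected, cut joins them with the same delimiter that was given to -d.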
2.2 sort: sorting
sort orders the contents of a file line by line, and it can sort according to different data types; for example, numbers and text sort differently.
Format: sort [options] file
Common options
Option | Effect |
-t | Specify the field separator; by default fields are separated by tabs or spaces |
-k | Specify the field (key) to sort on |
-n | Sort numerically; the default is to sort as text |
-u | Equivalent to uniq: show identical lines only once. Note: a trailing space makes otherwise identical lines differ, so they will not be deduplicated |
-r | Reverse the sort order; the default is ascending, -r is descending |
-o | Write the sorted result to the specified file |
Example:
sort passwd.txt ### With no options, sorts in ascending order from the first column; letters run from a to z, top to bottom
sort -n -t: -k3 passwd.txt ### Use colon as the separator and sort the third column numerically (ascending)
sort -nr -t: -k3 passwd.txt ### Use colon as the separator and sort the third column numerically (descending)
sort -nr -t: -k3 passwd.txt -o passwd.bak ### Write the sorted result to passwd.bak instead of printing it on the screen
2.3 uniq: removing consecutive duplicate lines
Note: uniq only compares adjacent lines, so it is usually combined with sort: sorting makes duplicate lines adjacent, and uniq then removes them. Without sorting, duplicates that are not adjacent are not removed.
Format: uniq [options] file
Common options
Option | Effect |
-c | Count how many times each line occurs |
-d | Show only duplicated lines |
-u | Show only lines that occur once |
Examples:
Example: create a fruit file with 9 lines in total
Example 1: count repeated lines; non-adjacent duplicates are not counted as repeats
Example 2: combine with sort to get the intended counts
Example 3: combine with sort and show only the duplicated lines
Example 4: combine with sort to remove duplicates
Example 5: or use sort -u directly
Example: view logged-in users
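The fruit examples might look like the sketch below. The nine lines of fruit.txt are an assumption made for this illustration, since the original file contents are not shown.

```shell
export LC_ALL=C   # byte-wise sort order, so results do not depend on the locale
tmp=$(mktemp -d)
# Hypothetical 9-line fruit file; "banana" appears twice but not on adjacent lines.
printf '%s\n' apple apple peach pear banana cherry cherry banana orange > "$tmp/fruit.txt"

# Without sort, only adjacent duplicates collapse, leaving 7 lines.
adjacent_only=$(uniq "$tmp/fruit.txt" | wc -l | tr -d ' ')

# With sort, uniq -c counts every occurrence of each line.
banana_count=$(sort "$tmp/fruit.txt" | uniq -c | awk '$2 == "banana" {print $1}')

dups=$(sort "$tmp/fruit.txt" | uniq -d)      # lines that occur more than once
singles=$(sort "$tmp/fruit.txt" | uniq -u)   # lines that occur exactly once
deduped=$(sort "$tmp/fruit.txt" | uniq)      # each distinct line once
sort_u=$(sort -u "$tmp/fruit.txt")           # same result in one step

printf '%s\n' "$adjacent_only" "$banana_count" "$dups" "$singles"
rm -rf "$tmp"
```

For the logged-in users case, the same pipeline shape applies to the first column of the who output, for example who | cut -d' ' -f1 | sort | uniq, whose result depends on the machine.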
2.4 tr: translating characters
tr can replace one character with another, delete characters entirely, or squeeze runs of repeated characters.
Format: tr [options]... SET1 [SET2]
Translates, squeezes, and/or deletes characters from standard input and writes the result to standard output.
Common options
Option | Effect |
-d | Delete characters |
-s | Squeeze each run of a repeated character into a single occurrence |
Examples:
Example 1: convert the lowercase letters a-z in the fruit file to uppercase A-Z
Example 2: replace characters (a one-to-one mapping of letters)
Example 3: enclose the characters to replace in single quotes, including special characters
Example 4: replace several characters with one
Example 5: delete line breaks
Example 6: squeeze repeated p characters, keeping only the first
Example 7: when several consecutive newlines occur, keep only one, which is equivalent to removing blank lines
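A sketch of these tr cases on fixed input strings, which are assumptions chosen for illustration rather than the article's fruit file:

```shell
export LC_ALL=C   # keep a-z and A-Z limited to ASCII letters

upper=$(printf 'apple\n' | tr 'a-z' 'A-Z')        # lowercase to uppercase
mapped=$(printf 'apple\n' | tr 'ap' 'AP')         # one-to-one: a->A, p->P
to_slash=$(printf 'apple\n' | tr 'ap' '//')       # several characters mapped to one
joined=$(printf 'a\nb\n' | tr -d '\n')            # delete line breaks
squeezed=$(printf 'appppple\n' | tr -s 'p')       # squeeze repeated p
no_blank=$(printf 'a\n\n\nb\n' | tr -s '\n')      # squeeze newlines, dropping blank lines

printf '%s\n' "$upper" "$mapped" "$to_slash" "$joined" "$squeezed" "$no_blank"
```

tr reads only standard input, so to process a file its contents are redirected in, as in tr 'a-z' 'A-Z' < fruit.txt.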