Regular expression (4)
2022-07-28 15:37:00 【WHJ226】
A regular expression (abbreviated re) describes a text pattern. It is often used to search for and replace text that follows certain rules.
1. Regular expression syntax
1.1 Line locators
| Character | Description |
| ^ | Matches the start of the string |
| $ | Matches the end of the string (or just before a trailing newline) |
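The two locators can be tried out with a minimal sketch like the one below. The sample text and variable names are made up for illustration; re.search() is used so that only the anchors decide where a match may start or end.

```python
import re

text = "my_school is my_home"  # illustrative sample text

# ^ anchors the pattern to the start of the string
starts_with_my = re.search(r"^my_\w+", text)

# $ anchors the pattern to the end of the string
ends_with_home = re.search(r"\w+home$", text)

# "school" occurs in the text, but not at the start, so ^school fails
no_match = re.search(r"^school", text)

print(starts_with_my.group())  # my_school
print(ends_with_home.group())  # my_home
print(no_match)                # None
```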
1.2 Metacharacters
| Metacharacter | Description |
| . | Matches any character except a newline |
| \w | Matches a letter, digit, underscore, or Chinese character |
| \s | Matches any whitespace character |
| \d | Matches a digit |
| \b | Matches a word boundary (the start or end of a word) |
| \n | Matches a newline |
| \t | Matches a tab |
| \W | Matches any character that is not a letter, digit, underscore, or Chinese character |
| \D | Matches any non-digit character |
| \S | Matches any non-whitespace character |
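As a rough illustration of the table, the sketch below applies several metacharacters with Python's re module. The sample strings and variable names are assumptions chosen for the demo. The last two lines show that \w covers Chinese characters for ordinary (Unicode) str patterns, and that the re.ASCII flag restricts it to ASCII letters, digits, and the underscore.

```python
import re

text = "Room 42\tis open, call 555 0199"  # illustrative sample text

digits = re.findall(r"\d+", text)        # runs of digits
words = re.findall(r"\b\w+\b", text)     # whole words between word boundaries
parts = re.split(r"\s+", text)           # split on runs of whitespace (spaces, tabs)
digits_only = re.sub(r"\D", "", text)    # remove every non-digit character

print(digits)       # ['42', '555', '0199']
print(words)        # ['Room', '42', 'is', 'open', 'call', '555', '0199']
print(parts)        # ['Room', '42', 'is', 'open,', 'call', '555', '0199']
print(digits_only)  # 425550199

unicode_words = re.findall(r"\w+", "学校 my_school")
ascii_words = re.findall(r"\w+", "学校 my_school", re.ASCII)
print(unicode_words)  # ['学校', 'my_school']
print(ascii_words)    # ['my_school']
```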
1.3 Quantifiers
| Quantifier | Description |
| ? | Matches zero or one time |
| + | Matches one or more times |
| * | Matches zero or more times |
| {n} | Matches exactly n times |
| {n,} | Matches n or more times |
| {n,m} | Matches between n and m times |
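The quantifiers in the table can be compared side by side with a small sketch such as the following. All sample strings and variable names are invented for the example.

```python
import re

# ? : the "u" may appear zero or one time
optional_u = re.findall(r"colou?r", "color colour colouur")

# + needs at least one "b"; * also accepts none
one_or_more = re.findall(r"ab+", "a ab abbb")
zero_or_more = re.findall(r"ab*", "a ab abbb")

numbers = "7 42 123 9999"
exactly_four = re.findall(r"\b\d{4}\b", numbers)    # {n}: exactly four digits
two_or_more = re.findall(r"\d{2,}", numbers)        # {n,}: two or more digits
one_to_three = re.findall(r"\b\d{1,3}\b", numbers)  # {n,m}: one to three digits

print(optional_u)    # ['color', 'colour']
print(one_or_more)   # ['ab', 'abbb']
print(zero_or_more)  # ['a', 'ab', 'abbb']
print(exactly_four)  # ['9999']
print(two_or_more)   # ['42', '123', '9999']
print(one_to_three)  # ['7', '42', '123']
```

The \b word boundaries keep {4} and {1,3} from matching a slice of a longer number.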
1.4 Other characters
#[...] Matches any one character in the set, for example [abcde], [123456], [a-zA-Z], [0-9]
#[^...] Matches any character that is not in the set
#a|b Matches a or b
#r or R Raw string prefix: put r or R before the pattern string so that backslashes reach the regex engine unchanged
#.* Greedy matching (matches as many characters as possible)
#.*? Lazy matching (matches as few characters as possible)
2. Match strings
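As a first matching example, the greedy and lazy forms listed in section 1.4 can be compared on the same input. This is only a sketch: the HTML-like sample string and the variable names are assumptions, and the functions re.search() and re.findall() used here are described in the subsections that follow.

```python
import re

html = "<b>bold</b> and <i>italic</i>"  # illustrative sample text

# Greedy: .* runs as far as possible, up to the last ">"
greedy = re.search(r"<.*>", html).group()

# Lazy: .*? stops at the first ">" that completes a match
lazy = re.search(r"<.*?>", html).group()

# The lazy form therefore finds each tag separately
all_tags = re.findall(r"<.*?>", html)

print(greedy)    # <b>bold</b> and <i>italic</i>
print(lazy)      # <b>
print(all_tags)  # ['<b>', '</b>', '<i>', '</i>']
```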
2.1 match()
The match() method matches from the beginning of a string.
The syntax is as follows:
re.match(pattern, string, [flags])
#pattern: the pattern string
#string: the string to match against
#flags: optional flags that control the matching mode, for example:
#  re.S (or re.DOTALL) makes . match every character, including newlines
#  re.I makes matching case-insensitive
#  re.X ignores unescaped whitespace and comments in the pattern string
For example:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text that starts with my_
string1 = 'MY__SCHOOL my_school'  # string to match 1
string2 = ' School MY__SCHOOL my_school'  # string to match 2
match1 = re.match(pattern, string1, re.I)  # match, ignoring case
match2 = re.match(pattern, string2, re.I)  # match, ignoring case
print(match1)
print(match2)
The results are as follows:
<re.Match object; span=(0, 10), match='MY__SCHOOL'>
None
span=(0, 10) gives the position of the match: it covers indices 0 through 9 (the end index 10 is exclusive), and match='MY__SCHOOL' is the matched text. The second result is None because match() only matches at the beginning of the string: string2 does not start with the pattern, so match() stops there and returns None.
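Because match() returns None on failure, calling a method such as group() on the result without checking raises an AttributeError. One way to guard against that is sketched below; the helper name first_word_with_prefix and its prefix parameter are hypothetical, introduced only for this illustration.

```python
import re

def first_word_with_prefix(text, prefix="my_"):
    """Hypothetical helper: return the match at the start of text, or None."""
    # re.escape keeps any special characters in prefix literal
    pattern = re.escape(prefix) + r"\w+"
    m = re.match(pattern, text, re.I)
    if m is None:  # no match at the beginning of the string
        return None
    return m.group()

found = first_word_with_prefix("MY__SCHOOL my_school")
missing = first_word_with_prefix(" School MY__SCHOOL my_school")
print(found)    # MY__SCHOOL
print(missing)  # None
```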
Other uses of the match object returned by match() are as follows:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text that starts with my_
string = 'MY__SCHOOL my_school'  # string to match
match = re.match(pattern, string, re.I)  # match, ignoring case
print('Match result:', match)  # the match object itself
print('Match start position:', match.start())
print('Match end position:', match.end())
print('Match span (tuple):', match.span())
print('String to match:', match.string)
print('Matched data:', match.group())
The results are as follows:
Match result: <re.Match object; span=(0, 10), match='MY__SCHOOL'>
Match start position: 0
Match end position: 10
Match span (tuple): (0, 10)
String to match: MY__SCHOOL my_school
Matched data: MY__SCHOOL
2.2 search()
The search() method scans the entire string and returns the first place where the pattern matches. If a match is found, it returns a match object; otherwise it returns None.
The syntax is as follows:
re.search(pattern, string, [flags])
#pattern: the pattern string
#string: the string to match against
#flags: optional flags that control the matching mode, for example:
#  re.S (or re.DOTALL) makes . match every character, including newlines
#  re.I makes matching case-insensitive
#  re.X ignores unescaped whitespace and comments in the pattern string
For example:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text that starts with my_
string1 = 'MY__SCHOOL my_school'  # string to match 1
string2 = ' School MY__SCHOOL my_school'  # string to match 2
match1 = re.search(pattern, string1, re.I)  # search, ignoring case
match2 = re.search(pattern, string2, re.I)  # search, ignoring case
print(match1)
print(match2)
The results are as follows:
<re.Match object; span=(0, 10), match='MY__SCHOOL'>
<re.Match object; span=(8, 18), match='MY__SCHOOL'>
Other uses of the match object returned by search() are as follows:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text that starts with my_
string = 'MY__SCHOOL my_school'  # string to match
match = re.search(pattern, string, re.I)  # search, ignoring case
print('Match result:', match)  # the match object itself
print('Match start position:', match.start())
print('Match end position:', match.end())
print('Match span (tuple):', match.span())
print('String to match:', match.string)
print('Matched data:', match.group())
The results are as follows:
Match result: <re.Match object; span=(0, 10), match='MY__SCHOOL'>
Match start position: 0
Match end position: 10
Match span (tuple): (0, 10)
String to match: MY__SCHOOL my_school
Matched data: MY__SCHOOL
2.3 findall()
The findall() method searches the entire string for every substring that matches the pattern and returns them as a list.
The syntax is as follows:
re.findall(pattern, string, [flags])
#pattern: the pattern string
#string: the string to match against
#flags: optional flags that control the matching mode, for example:
#  re.S (or re.DOTALL) makes . match every character, including newlines
#  re.I makes matching case-insensitive
#  re.X ignores unescaped whitespace and comments in the pattern string
For example:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text that starts with my_
string1 = 'MY__SCHOOL my_school'  # string to match 1
string2 = ' School MY__SCHOOL my_school'  # string to match 2
match1 = re.findall(pattern, string1)  # find all matches, case-sensitive
match2 = re.findall(pattern, string2, re.I)  # find all matches, ignoring case
print(match1)
print(match2)
The results are as follows:
['my_school']
['MY__SCHOOL', 'my_school']
2.4 sub()
The sub() method replaces the parts of a string that match a pattern.
The syntax is as follows:
re.sub(pattern, repl, string, [count], [flags])
#pattern: the pattern string
#repl: the replacement string
#string: the string to search and replace in
#count: optional, the maximum number of replacements; by default all matches are replaced
#flags: optional flags that control the matching mode
For example:
import re
pattern1 = r'__'
pattern2 = r'oo'
string1 = 'MY__SCHOOL my__school'  # string to match 1
string2 = ' School MY__SCHOOL my__school'  # string to match 2
result1 = re.sub(pattern1, '**', string1)  # replace every '__' with **
result2 = re.sub(pattern1, '**', string1, count=1)  # replace only the first '__' with **
result3 = re.sub(pattern2, '**', string2)  # replace every 'oo' with ** (case-sensitive)
print(result1)
print(result2)
print(result3)
The results are as follows:
MY**SCHOOL my**school
MY**SCHOOL my__school
 Sch**l MY__SCHOOL my__sch**l
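Beyond a fixed replacement string, re.sub() also accepts group references in repl and a function as repl, and re.subn() additionally reports how many replacements were made. The sketch below is only an illustration of those options; the helper upper_match and the variable names are made up for the example.

```python
import re

text = "MY__SCHOOL my__school"

# \1 and \2 in the replacement refer to the captured groups
swapped = re.sub(r"(\w+?)__(\w+)", r"\2__\1", text)

# A function can compute the replacement from each match object
def upper_match(m):
    return m.group().upper()

shouted = re.sub(r"my_+\w+", upper_match, text)

# subn() returns the new string together with the number of replacements
replaced, n = re.subn(r"__", "**", text)

print(swapped)      # SCHOOL__MY school__my
print(shouted)      # MY__SCHOOL MY__SCHOOL
print(replaced, n)  # MY**SCHOOL my**school 2
```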
2.5 replace()
The replace() method also replaces text in a string. It is a method of the string itself rather than of the re module, so the text to be replaced is taken literally and is not treated as a regular expression.
The syntax is as follows:
string.replace(pattern, repl, [count])
#string: the string to search and replace in
#pattern: the substring to be replaced
#repl: the string to replace it with
#count: optional, the maximum number of replacements; by default all occurrences are replaced
For example:
pattern1 = '__'
pattern2 = 'oo'
string1 = 'MY__SCHOOL my__school'  # string to match 1
string2 = ' School MY__SCHOOL my__school'  # string to match 2
result1 = string1.replace(pattern1, '**')  # replace every '__' with **
result2 = string1.replace(pattern1, '**', 1)  # replace only the first '__' with **
result3 = string2.replace(pattern2, '**')  # replace every 'oo' with ** (case-sensitive)
print(result1)
print(result2)
print(result3)
The results are as follows:
MY**SCHOOL my**school
MY**SCHOOL my__school
 Sch**l MY__SCHOOL my__sch**l
3. Split string
The split() method splits a string wherever a regular expression matches and returns the pieces as a list.
The syntax is as follows:
re.split(pattern, string, [maxsplit], [flags])
#pattern: the pattern string
#string: the string to split
#maxsplit: optional, the maximum number of splits
#flags: optional flags that control the matching mode
For example:
import re  # import the module
pattern1 = '[?]'  # separator: ?
pattern2 = '[@]'  # separator: @
pattern3 = r'[?@]'  # separator: ? or @ (inside [...] a | would be a literal character, so it is left out)
string1 = 'MY?SCHOOL?my?school'  # string to split 1
string2 = ' School @MY@SCHOOL?my?@school'  # string to split 2
match1 = re.split(pattern1, string1)  # split string1 on ?
match2 = re.split(pattern1, string2)  # split string2 on ?
match3 = re.split(pattern2, string1)  # split string1 on @
match4 = re.split(pattern2, string2)  # split string2 on @
match5 = re.split(pattern3, string1)  # split string1 on ? or @
match6 = re.split(pattern3, string2)  # split string2 on ? or @
print(match1)
print(match2)
print(match3)
print(match4)
print(match5)
print(match6)
The results are as follows:
['MY', 'SCHOOL', 'my', 'school']
[' School @MY@SCHOOL', 'my', '@school']
['MY?SCHOOL?my?school']
[' School ', 'MY', 'SCHOOL?my?', 'school']
['MY', 'SCHOOL', 'my', 'school']
[' School ', 'MY', 'SCHOOL', 'my', '', 'school']
The empty string in the last result appears because ? and @ are adjacent in string2, so there is nothing between the two separators.
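A few further options of re.split() are sketched below: limiting the number of splits with maxsplit, keeping the separators by wrapping the pattern in a capturing group, and filtering out empty strings produced by adjacent separators. The sample strings and variable names are assumptions made for the example.

```python
import re

text = "MY?SCHOOL?my?school"  # illustrative sample text

# maxsplit=1 splits only at the first separator
limited = re.split(r"[?]", text, maxsplit=1)

# A capturing group keeps the separators in the result list
with_seps = re.split(r"([?@])", "MY?SCHOOL@my")

# Adjacent separators yield an empty string, which can be filtered out
no_empty = [part for part in re.split(r"[?@]", "my?@school") if part]

print(limited)    # ['MY', 'SCHOOL?my?school']
print(with_seps)  # ['MY', '?', 'SCHOOL', '@', 'my']
print(no_empty)   # ['my', 'school']
```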