Regular expression (4)
2022-07-28 15:37:00 【WHJ226】
A regular expression (often abbreviated as re or regex) is commonly used to find and replace text that matches a given pattern. The examples below use Python's re module.
1. Regular expression syntax
1.1 Line locators
| Character | Description |
| ^ | Matches the start of the string |
| $ | Matches the end of the string (or just before a trailing newline) |
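The two locators above can be tried with a minimal sketch like the following. The sample strings are made up for illustration, and re.search() is used here only because it would otherwise look anywhere in the string, which makes the effect of the anchors visible.

```python
import re

# ^ pins the pattern to the start of the string, $ to the end.
starts_with_my = re.search(r'^my_', 'my_school')        # 'my_' is at the start
not_at_start = re.search(r'^my_', 'the my_school')      # 'my_' exists, but not at the start
ends_with_school = re.search(r'school$', 'my_school')   # 'school' is at the end
whole_string = re.search(r'^my_school$', 'my_school')   # pattern must cover the whole string

print(starts_with_my is not None)    # True
print(not_at_start is None)          # True
print(ends_with_school is not None)  # True
print(whole_string is not None)      # True
```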
1.2 Metacharacters
| Metacharacter | Description |
| . | Matches any character except a newline |
| \w | Matches a letter, digit, underscore, or Chinese character (any Unicode word character) |
| \s | Matches any whitespace character |
| \d | Matches a digit |
| \b | Matches a word boundary (the start or end of a word) |
| \n | Matches a newline |
| \t | Matches a tab |
| \W | Matches any character that is not a letter, digit, underscore, or Chinese character |
| \D | Matches any non-digit character |
| \S | Matches any non-whitespace character |
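As a rough illustration of the table above, the sketch below runs several metacharacters over one made-up sample string. The text and variable names are assumptions chosen for the example, and re.findall() is used simply because it lists every match.

```python
import re

text = 'Room 42\tis open\nRoom 7 is closed'

digits = re.findall(r'\d+', text)             # runs of digits
words = re.findall(r'\w+', text)              # runs of word characters
whitespace = re.findall(r'\s', text)          # spaces, the tab and the newline
whole_word = re.findall(r'\bis\b', text)      # 'is' only as a separate word
first_line = re.search(r'.+', text).group()   # '.' stops at the newline
chinese = re.findall(r'\w+', '学校 school')    # \w also covers Chinese characters

print(digits)           # ['42', '7']
print(whole_word)       # ['is', 'is']
print(repr(first_line)) # 'Room 42\tis open'
print(chinese)          # ['学校', 'school']
```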
1.3 Qualifiers
| Qualifier | Description |
| ? | Matches zero or one time |
| + | Matches one or more times |
| * | Matches zero or more times |
| {n} | Matches exactly n times |
| {n,} | Matches n or more times |
| {n,m} | Matches between n and m times |
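One way to see how the qualifiers differ is to test each of them against strings of increasing length. The sketch below is only an illustration: the samples list and the full_matches() helper are hypothetical names, and re.fullmatch() is used so that a pattern counts only when it covers the entire sample.

```python
import re

samples = ['', '1', '12', '123', '1234']

def full_matches(pattern):
    """Return the samples that the pattern matches from start to end."""
    return [s for s in samples if re.fullmatch(pattern, s)]

print(full_matches(r'\d?'))      # ['', '1']
print(full_matches(r'\d+'))      # ['1', '12', '123', '1234']
print(full_matches(r'\d*'))      # ['', '1', '12', '123', '1234']
print(full_matches(r'\d{2}'))    # ['12']
print(full_matches(r'\d{2,}'))   # ['12', '123', '1234']
print(full_matches(r'\d{2,3}'))  # ['12', '123']
```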
1.4 Other characters
#[...] Matches any one character in the set, for example [abcde], [123456], [a-zA-Z], [0-9]
#[^...] Matches any character that is not in the set
#a|b Matches a or b
#r or R Raw-string prefix: putting r or R before the pattern string makes it a Python raw string, so backslashes are not treated as string escapes
#.* Greedy matching (matches as many characters as possible)
#.*? Lazy matching (matches as few characters as possible)
2. Match string
2.1 match()
The match() method matches from the beginning of a string.
The syntax is as follows:
re.match(pattern, string, [flags])
#pattern: the pattern string
#string: the string to match
#flags: optional; flags that control the matching mode, for example:
#  re.S or re.DOTALL makes . match every character, including newlines
#  re.I makes matching case-insensitive
#  re.X ignores unescaped whitespace and comments in the pattern string
For example:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text starting with my_
string1 = 'MY__SCHOOL my_school'  # string to match 1
string2 = ' School MY__SCHOOL my_school'  # string to match 2
match1 = re.match(pattern, string1, re.I)  # match, ignoring case
match2 = re.match(pattern, string2, re.I)  # match, ignoring case
print(match1)
print(match2)
The results are as follows:
<re.Match object; span=(0, 10), match='MY__SCHOOL'>
None
span=(0, 10) gives the position of the match: indices 0 through 9 (the end index 10 is exclusive), and match='MY__SCHOOL' is the matched text. The second result is None because match() only matches at the beginning of the string: when the first characters do not fit the pattern, it stops and returns None.
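Because match() can return None, and because the flags listed earlier can be combined, a small sketch may help. The sample text and variable names below are assumptions made for illustration; flags are combined with the | operator, and re.compile() is used only to keep the multi-line re.X pattern readable.

```python
import re

# Guard against None before calling methods on the result.
m = re.match(r'my_\w+', ' School my_school')
guarded = m.group() if m else None  # no match at index 0, so this stays None

text = 'MY_SCHOOL\nmy_class'

# Without re.S, '.' stops at the newline; with re.S, '.' matches '\n' as well.
no_dotall = re.match(r'my_.+', text, re.I).group()
dotall = re.match(r'my_.+', text, re.I | re.S).group()

# re.X lets the pattern be laid out with whitespace and comments.
verbose = re.compile(r'''
    my_     # literal prefix
    \w+     # one or more word characters
''', re.I | re.X)
verbose_match = verbose.match(text).group()

print(guarded)          # None
print(repr(no_dotall))  # 'MY_SCHOOL'
print(repr(dotall))     # 'MY_SCHOOL\nmy_class'
print(verbose_match)    # MY_SCHOOL
```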
The match object returned by match() offers further attributes and methods:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text starting with my_
string = 'MY__SCHOOL my_school'  # string to match
match = re.match(pattern, string, re.I)  # match, ignoring case
print('Match result:', match)  # the match object itself
print('Match start position:', match.start())
print('Match end position:', match.end())
print('Match span (tuple):', match.span())
print('String being matched:', match.string)
print('Matched data:', match.group())
The results are as follows:
Match result: <re.Match object; span=(0, 10), match='MY__SCHOOL'>
Match start position: 0
Match end position: 10
Match span (tuple): (0, 10)
String being matched: MY__SCHOOL my_school
Matched data: MY__SCHOOL
2.2 search()
The search() method scans the entire string and returns the first place where the pattern matches. If the string contains a match anywhere, a match object is returned; otherwise None is returned.
The syntax is as follows:
re.search(pattern, string, [flags])
#pattern: the pattern string
#string: the string to match
#flags: optional; flags that control the matching mode, for example:
#  re.S or re.DOTALL makes . match every character, including newlines
#  re.I makes matching case-insensitive
#  re.X ignores unescaped whitespace and comments in the pattern string
For example:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text starting with my_
string1 = 'MY__SCHOOL my_school'  # string to match 1
string2 = ' School MY__SCHOOL my_school'  # string to match 2
match1 = re.search(pattern, string1, re.I)  # search, ignoring case
match2 = re.search(pattern, string2, re.I)  # search, ignoring case
print(match1)
print(match2)
The results are as follows:
<re.Match object; span=(0, 10), match='MY__SCHOOL'>
<re.Match object; span=(8, 18), match='MY__SCHOOL'>
The match object returned by search() offers the same attributes and methods:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text starting with my_
string = 'MY__SCHOOL my_school'  # string to match
match = re.search(pattern, string, re.I)  # search, ignoring case
print('Match result:', match)  # the match object itself
print('Match start position:', match.start())
print('Match end position:', match.end())
print('Match span (tuple):', match.span())
print('String being matched:', match.string)
print('Matched data:', match.group())
The results are as follows:
Match result: <re.Match object; span=(0, 10), match='MY__SCHOOL'>
Match start position: 0
Match end position: 10
Match span (tuple): (0, 10)
String being matched: MY__SCHOOL my_school
Matched data: MY__SCHOOL
2.3 findall()
The findall() method searches the entire string for all substrings that match the pattern and returns them as a list.
The syntax is as follows:
re.findall(pattern, string, [flags])
#pattern: the pattern string
#string: the string to match
#flags: optional; flags that control the matching mode, for example:
#  re.S or re.DOTALL makes . match every character, including newlines
#  re.I makes matching case-insensitive
#  re.X ignores unescaped whitespace and comments in the pattern string
For example:
import re  # import the module
pattern = r'my_\w+'  # pattern string: matches text starting with my_
string1 = 'MY__SCHOOL my_school'  # string to match 1
string2 = ' School MY__SCHOOL my_school'  # string to match 2
match1 = re.findall(pattern, string1)  # find all matches, case-sensitive
match2 = re.findall(pattern, string2, re.I)  # find all matches, ignoring case
print(match1)
print(match2)
The results are as follows:
['my_school']
['MY__SCHOOL', 'my_school']
2.4 sub()
The sub() method replaces the parts of a string that match a pattern.
The syntax is as follows:
re.sub(pattern, repl, string, count, flags)
#pattern: the pattern string
#repl: the replacement string
#string: the string to search and replace in
#count: optional; the maximum number of replacements (all matches are replaced by default)
#flags: optional; flags that control the matching mode
For example:
import re  # import the module
pattern1 = r'__'
pattern2 = r'oo'
string1 = 'MY__SCHOOL my__school'  # string to match 1
string2 = ' School MY__SCHOOL my__school'  # string to match 2
result1 = re.sub(pattern1, '**', string1)  # replace every '__' with **
result2 = re.sub(pattern1, '**', string1, count=1)  # replace only the first '__' with **
result3 = re.sub(pattern2, '**', string2)  # replace every lowercase 'oo' with **
print(result1)
print(result2)
print(result3)
The results are as follows:
MY**SCHOOL my**school
MY**SCHOOL my__school
 Sch**l MY__SCHOOL my__sch**l
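The count and flags parameters can also be passed by keyword, which keeps them from being confused with each other. The sketch below is an illustration with made-up replacement text; re.subn() is not covered above and is shown only as a convenient way to see how many replacements were made.

```python
import re

text = 'MY__SCHOOL my__school'

# flags=re.I lets 'school' match both spellings; count=1 stops after the first match.
all_replaced = re.sub(r'school', 'class', text, flags=re.I)
first_only = re.sub(r'school', 'class', text, count=1, flags=re.I)

# re.subn() returns the new string together with the number of replacements.
replaced, n = re.subn(r'__', '-', text)

print(all_replaced)  # MY__class my__class
print(first_only)    # MY__class my__school
print(replaced, n)   # MY-SCHOOL my-school 2
```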
2.5 replace()
The string method replace() also replaces text, but it treats its first argument as a literal substring rather than a regular expression.
The syntax is as follows:
string.replace(pattern, repl, count)
#string: the string to search and replace in
#pattern: the substring to be replaced (taken literally, not as a regular expression)
#repl: the string to replace it with
#count: optional; the maximum number of replacements (all occurrences are replaced by default)
For example:
pattern1 = '__'  # literal substring; the re module is not needed here
pattern2 = 'oo'
string1 = 'MY__SCHOOL my__school'  # string to search 1
string2 = ' School MY__SCHOOL my__school'  # string to search 2
result1 = string1.replace(pattern1, '**')  # replace every '__' with **
result2 = string1.replace(pattern1, '**', 1)  # replace only the first '__' with **
result3 = string2.replace(pattern2, '**')  # replace every lowercase 'oo' with **
print(result1)
print(result2)
print(result3)
The results are as follows:
MY**SCHOOL my**school
MY**SCHOOL my__school
 Sch**l MY__SCHOOL my__sch**l
3. Split string
The split() method splits a string wherever the regular expression matches and returns the pieces as a list.
The syntax is as follows:
re.split(pattern, string, [maxsplit], flags)
#pattern: the pattern string
#string: the string to split
#maxsplit: optional; the maximum number of splits
#flags: optional; flags that control the matching mode
For example:
import re  # import the module
pattern1 = '[?]'  # separator: ?
pattern2 = '[@]'  # separator: @
pattern3 = r'[?|@]'  # separator: ?, | or @ (inside [...] the | is a literal character, so [?@] would be enough here)
string1 = 'MY?SCHOOL?my?school'  # string to split 1
string2 = ' School @MY@SCHOOL?my?@school'  # string to split 2
match1 = re.split(pattern1, string1)  # split string1 on ?
match2 = re.split(pattern1, string2)  # split string2 on ?
match3 = re.split(pattern2, string1)  # split string1 on @
match4 = re.split(pattern2, string2)  # split string2 on @
match5 = re.split(pattern3, string1)  # split string1 on ? or @
match6 = re.split(pattern3, string2)  # split string2 on ? or @
print(match1)
print(match2)
print(match3)
print(match4)
print(match5)
print(match6)
The results are as follows:
['MY', 'SCHOOL', 'my', 'school']
[' School @MY@SCHOOL', 'my', '@school']
['MY?SCHOOL?my?school']
[' School ', 'MY', 'SCHOOL?my?', 'school']
['MY', 'SCHOOL', 'my', 'school']
[' School ', 'MY', 'SCHOOL', 'my', '', 'school']
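The maxsplit parameter from the syntax above is not used in the example, so here is a minimal sketch of it. The sample strings are made up for illustration, and filtering out empty pieces with a list comprehension is just one possible way to handle adjacent separators.

```python
import re

text = 'MY?SCHOOL?my?school'

# maxsplit=1 splits at the first separator only and keeps the rest intact.
limited = re.split(r'[?]', text, maxsplit=1)

# Two separators in a row produce an empty string between them.
pieces = re.split(r'[?@]', 'a?b?@c')
non_empty = [part for part in pieces if part]  # drop the empty pieces

print(limited)    # ['MY', 'SCHOOL?my?school']
print(pieces)     # ['a', 'b', '', 'c']
print(non_empty)  # ['a', 'b', 'c']
```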