Re regular expressions
2022-08-05 07:11:00 【m0_52339560】
Event link: CSDN 21-Day Learning Challenge
Regular expressions are powerful but also fairly complicated, and even with the official documentation open they are not easy to write. This is only a brief record; real proficiency takes practice. After writing part of it, I found that it was a bit difficult to write.
Overview
A regular expression (also called an RE, a regex, or a regex pattern) is essentially a tiny, highly specialized programming language embedded in Python.
Regular expressions are used to parse and process strings. The most common operations are matching, substitution, and splitting.
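As a rough illustration of the three operations just named, the sketch below uses `re.search`, `re.sub`, and `re.split` from the standard library. The sample timestamp string is only illustrative data, not something the original text works with.

```python
import re

text = "2022-08-05 07:11:00"  # illustrative sample string

# Matching: find the first run of digits.
first_number = re.search(r"\d+", text).group()

# Substitution: replace every dash with a slash.
slashed = re.sub(r"-", "/", text)

# Splitting: break the string on dashes, colons, or spaces.
parts = re.split(r"[-: ]", text)

print(first_number)  # 2022
print(slashed)       # 2022/08/05 07:11:00
print(parts)         # ['2022', '08', '05', '07', '11', '00']
```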
Matching
Matching characters is the most important part of a regular expression.
Most letters and characters simply match themselves. For example, the regular expression test matches exactly the string 'test'. Note: unless a flag such as re.IGNORECASE is set, regular expressions are case-sensitive.
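A small sketch of the case-sensitivity rule, assuming the standard `re.IGNORECASE` flag is the "relevant parameter" meant here; the sample strings are made up for illustration.

```python
import re

# By default the pattern 'test' matches only the exact lowercase text.
exact = re.search("test", "a test string")
wrong_case = re.search("test", "a TEST string")

# The re.IGNORECASE flag relaxes case sensitivity.
ignore_case = re.search("test", "a TEST string", re.IGNORECASE)

print(exact.group())        # test
print(wrong_case)           # None
print(ignore_case.group())  # TEST
```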
Besides ordinary characters, regular expressions also have some special characters, called metacharacters. The common ones are:
. ^ $ * + ? { } [ ] \ | ( )
These are introduced one by one below:
.: Matches any character except a newline.
^: Matches the start of the string.
$: Matches the end of the string.
*: Matches 0 or more repetitions of the preceding regular expression, as many as possible.
+: Matches 1 or more repetitions of the preceding regular expression, as many as possible.
?: Matches 0 or 1 repetitions of the preceding regular expression.
{m}: Matches exactly m repetitions of the preceding regular expression; fewer than m causes the match to fail.
{m,n}: Matches from m to n repetitions of the preceding regular expression, taking as many as possible.
{m,n}?: Matches from m to n repetitions of the preceding regular expression, taking as few as possible.
[...]: Matches any single character listed in .... The characters can be listed individually, or a range can be given by joining its first and last characters with -. For example, [a-z] matches all lowercase letters: in terms of ASCII codes, it matches every character from a to z inclusive.
[^...]: Matches any single character not listed in ....
|: In A|B, where A and B can be arbitrary regular expressions, matches either A or B.
(): Grouping. Matches the regular expression inside the parentheses; after the match completes, the text matched by the group can be extracted.
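The list above can be sketched with one short call per metacharacter. The sample strings and variable names are hypothetical and chosen only to make each result easy to check by eye.

```python
import re

# '.' matches any character except a newline.
dot = re.findall(r"a.c", "abc a-c a\nc")            # ['abc', 'a-c']

# '^' and '$' anchor the match to the start and end of the string.
anchored = bool(re.search(r"^ab+c$", "abbbc"))       # True

# '*', '+', '?' and '{m}' control repetition of the preceding item.
star = re.findall(r"ab*", "a ab abbb")               # ['a', 'ab', 'abbb']
plus = re.findall(r"ab+", "a ab abbb")               # ['ab', 'abbb']
optional = re.findall(r"colou?r", "color colour")    # ['color', 'colour']
exactly = re.findall(r"a{3}", "aa aaa")              # ['aaa']

# '{m,n}' is greedy; '{m,n}?' takes as few repetitions as it can.
greedy = re.search(r"a{2,4}", "aaaaa").group()       # 'aaaa'
lazy = re.search(r"a{2,4}?", "aaaaa").group()        # 'aa'

# '[...]' and '[^...]' match one character in, or not in, a set.
in_set = re.findall(r"[a-c]", "abcdef")              # ['a', 'b', 'c']
not_in_set = re.findall(r"[^a-c]", "abcdef")         # ['d', 'e', 'f']

# '|' chooses between alternatives; '()' groups and captures.
either = re.findall(r"cat|dog", "cat, dog, bird")    # ['cat', 'dog']
captured = re.search(r"(\w+)@(\w+)", "user@host").groups()  # ('user', 'host')
```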
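There are also some special sequences:
\d \D \s \S \w \W
Their meanings are:
\d matches any decimal digit; this is equivalent to [0-9].
\D matches any non-digit character; this is equivalent to [^0-9].
\s matches any whitespace character; this is equivalent to [ \t\n\r\f\v].
\S matches any non-whitespace character; this is equivalent to [^ \t\n\r\f\v].
\w matches any alphanumeric character or underscore; this is equivalent to [a-zA-Z0-9_].
\W matches any character that is not alphanumeric or an underscore; this is equivalent to [^a-zA-Z0-9_].
These equivalences hold exactly for bytes patterns or when the re.ASCII flag is set; for str patterns, the sequences also match the corresponding Unicode characters.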
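To make the six sequences concrete, here is a minimal sketch that runs each one over the same made-up sample string (ASCII only, so the bracket equivalents above apply exactly).

```python
import re

sample = "Room 42,\tfloor_3\n"  # illustrative sample string

digits = re.findall(r"\d", sample)       # ['4', '2', '3']
non_digits = re.findall(r"\D+", sample)  # ['Room ', ',\tfloor_', '\n']
whitespace = re.findall(r"\s", sample)   # [' ', '\t', '\n']
non_space = re.findall(r"\S+", sample)   # ['Room', '42,', 'floor_3']
words = re.findall(r"\w+", sample)       # ['Room', '42', 'floor_3']
non_word = re.findall(r"\W+", sample)    # [' ', ',\t', '\n']
```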
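The backslash plague
In a Python string literal, the character \ must be written as \\.
Suppose you want a regex that matches the string '\section'. The regex itself must be \\section, because the backslash has to be escaped for the regex engine, and writing that regex as an ordinary Python string literal doubles each backslash again, giving '\\\\section'.
In a regex that uses backslashes repeatedly, this produces a pile of backslashes and makes the resulting string hard to read. Python's raw string notation, for example r'\\section', avoids the second level of escaping.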
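The two levels of escaping can be checked directly. The sketch below compares an ordinary string literal with Python's raw string notation (the `r'...'` prefix); the sample text is hypothetical.

```python
import re

text = r"see \section for details"  # contains one literal backslash

# Ordinary literal: each backslash is doubled twice, once for the
# Python string literal and once for the regex engine.
plain = re.search("\\\\section", text)

# Raw literal: only the regex-level escaping remains.
raw = re.search(r"\\section", text)

print(plain.group())                  # \section
print(raw.group())                    # \section
print("\\\\section" == r"\\section")  # True
```

Both literals produce the same pattern; the raw form is simply easier to read, which is why regex patterns are usually written with the `r` prefix.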
Using regular expressions
Compiling regular expressions
A regular expression is compiled into a pattern object, and the various operations are then performed through that pattern object.
import re
p = re.compile('ab*')  # compile the regular expression into a pattern object
print(type(p))  # <class 're.Pattern'>
res = p.match('abc')
print(type(res))  # <class 're.Match'>
Performing matches
Once you have an object representing a compiled regular expression, what do you do with it? Pattern objects have several methods and attributes. Only the most important ones are covered here.
| Method / attribute | Purpose |
|---|---|
| match() | Determines whether the regex matches at the beginning of the string. It does not require the whole string to match. |
| search() | Scans through the string, looking for any location where the regex matches. |
| findall() | Finds all substrings where the regex matches and returns them as a list. |
| finditer() | Finds all substrings where the regex matches and returns them as an iterator. |
- match(): matches at the beginning of the string; returns a Match object, or None if there is no match
import re
p = re.compile('[a-z]+')
print(p.match(" "))  # None
m = p.match("tempo")
print(m)  # <re.Match object; span=(0, 5), match='tempo'>
- search(): scans the whole string and returns a Match object for the first location where the pattern matches
p = re.compile('[a-z]+')
m = p.search("::: message")
print(m)  # <re.Match object; span=(4, 11), match='message'>
- findall(): returns a list of all matching strings
p = re.compile(r'\d+')
m = p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
print(m)  # ['12', '11', '10']
- finditer(): returns the match objects as an iterator
p = re.compile(r'\d+')
iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
print(type(iterator))  # <class 'callable_iterator'>
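Because match() and search() are easy to confuse, the following sketch puts them side by side on the same input. It also shows `fullmatch()`, a pattern method not covered in the table above, which is the call to use when the entire string must match.

```python
import re

p = re.compile(r"[a-z]+")

# match() only tries at position 0, so leading punctuation defeats it.
at_start = p.match("::: message")

# search() scans forward and finds the first match anywhere.
anywhere = p.search("::: message")

# match() does not require the pattern to consume the whole string.
prefix = p.match("tempo123")

# fullmatch() succeeds only if the entire string matches.
whole = p.fullmatch("tempo123")

print(at_start)          # None
print(anywhere.span())   # (4, 11)
print(prefix.group())    # tempo
print(whole)             # None
```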
The re.Match objects returned by the code above have several commonly used methods.
| Method / attribute | Purpose |
|---|---|
| group() | Returns the string matched by the regex |
| start() | Returns the starting position of the match |
| end() | Returns the ending position of the match |
| span() | Returns a tuple containing the (start, end) positions of the match |
Example:
import re
p = re.compile('[a-z]+')
print(p.match(" "))  # None
m = p.match("tempo")
print(m)  # <re.Match object; span=(0, 5), match='tempo'>
print(m.group())  # tempo
print(m.start(), m.end())  # 0 5
print(m.span())  # (0, 5)
Grouping
To extract only part of the matched string, use groups.
import re
p = re.compile(r'(\w*)\s(\w*).*')  # raw string, so \w and \s reach the regex engine unchanged
res = p.match('abc word hello')
print(res.group(0))  # abc word hello (the whole match)
print(res.group(1))  # abc
print(res.group(2))  # word
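Beyond numbered groups, Match objects also offer `groups()`, `groupdict()`, and named groups via the `(?P<name>...)` syntax. The sketch below is a variation on the example above; the group names `first` and `second` are made up for illustration.

```python
import re

# Named groups: (?P<name>...) captures like (...) but can be read by name.
p = re.compile(r"(?P<first>\w+)\s(?P<second>\w+)")
m = p.match("abc word hello")

print(m.groups())         # ('abc', 'word')
print(m.group(1, 2))      # ('abc', 'word')
print(m.group("second"))  # word
print(m.groupdict())      # {'first': 'abc', 'second': 'word'}
print(m.span(2))          # (4, 8)
```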
Greedy vs. non-greedy
This section introduces .* and .*?.
.*: matches as much as possible. .*?: matches as little as possible.
import re
s = '<html><head><title>Title</title>'
print(re.match('<.*>', s).group())  # <html><head><title>Title</title>
print(re.match('<.*?>', s).group())  # <html>
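The difference becomes more visible with findall(), which collects every match rather than only the first. This is a minimal sketch reusing the same sample string; the digit example at the end is an extra illustration.

```python
import re

s = '<html><head><title>Title</title>'

# Greedy: '.*' runs to the last '>' in the string, giving a single match.
greedy_tags = re.findall(r"<.*>", s)

# Non-greedy: '.*?' stops at the first '>' it can, giving one match per tag.
lazy_tags = re.findall(r"<.*?>", s)

# The same '?' suffix works on other quantifiers, for example '+?'.
shortest_digits = re.search(r"\d+?", "12345").group()

print(greedy_tags)      # ['<html><head><title>Title</title>']
print(lazy_tags)        # ['<html>', '<head>', '<title>', '</title>']
print(shortest_digits)  # 1
```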
References
- https://docs.python.org/zh-cn/3.8/howto/regex.html#match-versus-search
- https://docs.python.org/zh-cn/3.8/library/re.html#re-syntax
- https://blog.csdn.net/yuan2019035055/article/details/124217883