Make Your Regular Expressions a Hundred Times More Readable
2022-07-29 04:48:00 【Python数据之道】
Author: kingname
Source: 未闻 Code
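Regular expressions are powerful, but once written down they look like a string of emoticons. Come back a month later to an expression you wrote yourself and you may no longer remember what it means. Take this one, for example:
pattern = r"((?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?)(.*?)(?=(?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?(?![^\w\s])|$)"
Is there any way to make a regular expression more readable? One well-known way to make code more readable is to write comments, so can a regular expression carry comments too?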
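For example, take the following sentence:
msg = 'My name is Qingnan, my password is:123kingname456, please keep it secret.'
I want to extract the password 123kingname456, so my regular expression might look like this:
pattern = ':(.*?),'
Could I write it like this instead: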
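pattern = '''
:       # start marker
(.*?)   # any characters, starting right after the start marker
,       # stop at the first ASCII comma
'''
Written this way it is much clearer: the role of each part is spelled out.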
But used as-is, this pattern clearly extracts nothing (the original post shows this in a screenshot).
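The failure is easy to reproduce. The sketch below uses an English rendering of the sample sentence (an assumption for illustration). Without any flag, the newlines, spaces, and `#` comments inside the triple-quoted string are treated as literal characters that must appear in the text, so the search finds nothing.

```python
import re

# English rendering of the article's sample sentence (illustrative only).
msg = 'My name is Qingnan, my password is:123kingname456, please keep it secret.'

# The compact pattern extracts the password as expected.
compact = ':(.*?),'
compact_match = re.search(compact, msg)
print(compact_match.group(1))  # 123kingname456

# The commented, multi-line version used WITHOUT any flag:
# the layout and the comments become part of what must be matched.
commented = '''
:       # start marker
(.*?)   # any characters after the start marker
,       # stop at the first ASCII comma
'''
plain_match = re.search(commented, msg)
print(plain_match)  # None
```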

But while browsing the Python regular expression documentation [1] today, I found something useful: the re.VERBOSE flag.

With it, your regular expressions can carry comments (again shown as a screenshot in the original post).
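A minimal sketch of the commented pattern with `re.VERBOSE` passed as the flag argument; the sample sentence is an English rendering used for illustration. In verbose mode, whitespace in the pattern is ignored and everything from an unescaped `#` to the end of the line is treated as a comment.

```python
import re

# English rendering of the article's sample sentence (illustrative only).
msg = 'My name is Qingnan, my password is:123kingname456, please keep it secret.'

pattern = '''
    :       # start marker
    (.*?)   # any characters, starting right after the start marker
    ,       # stop at the first ASCII comma
'''

# re.VERBOSE makes the regex engine skip the layout whitespace and the comments.
match = re.search(pattern, msg, re.VERBOSE)
password = match.group(1)
print(password)  # 123kingname456
```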

re.VERBOSE can also be abbreviated as re.X (screenshot in the original post).
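Both names refer to the same flag, so either spelling can be passed. The sketch below also illustrates one consequence of verbose mode that is worth keeping in mind: because unescaped whitespace in the pattern is ignored, a literal space has to be escaped or placed inside a character class. The sample strings are made up for illustration.

```python
import re

# re.X is simply the short name for re.VERBOSE.
same_flag = (re.X == re.VERBOSE)
print(same_flag)  # True

text = 'hello world'

# The unescaped space is dropped, so this effectively searches for 'helloworld'.
ignored_space = re.search(r'hello world', text, re.X)
print(ignored_space)  # None

# An escaped space, or a space inside a character class, is kept as a literal.
escaped_space = re.search(r'hello\ world', text, re.X)
class_space = re.search(r'hello[ ]world', text, re.X)
print(escaped_space is not None, class_space is not None)  # True True
```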

With comments added, the complex regular expression from the beginning of this article becomes much more readable:
pattern = r"""
(                               # code (capture)
    # BEGIN multicode
    (?: \( \s* )?               # maybe open paren and maybe space
    # code
    [A-Z]*H                     # prefix
    \d+                         # digits
    [a-z]*                      # suffix
    (?:                         # maybe followed by other codes,
        \s* \+ \s*              # ... plus-separated
        # code
        [A-Z]*H                 # prefix
        \d+                     # digits
        [a-z]*                  # suffix
    )*
    (?: \s* [\):+] )?           # maybe space and maybe close paren or colon or plus
    # END multicode
)
( .*? )                         # message (capture): everything ...
(?=                             # ... up to (but excluding) ...
    # ... the next code
    # BEGIN multicode
    (?: \( \s* )?               # maybe open paren and maybe space
    # code
    [A-Z]*H                     # prefix
    \d+                         # digits
    [a-z]*                      # suffix
    (?:                         # maybe followed by other codes,
        \s* \+ \s*              # ... plus-separated
        # code
        [A-Z]*H                 # prefix
        \d+                     # digits
        [a-z]*                  # suffix
    )*
    (?: \s* [\):+] )?           # maybe space and maybe close paren or colon or plus
    # END multicode
    # (but not when followed by punctuation)
    (?! [^\w\s] )
    # ... or the end
    | $
)
"""
References
[1] Regular expression documentation: https://docs.python.org/3/library/re.html#re.VERBOSE
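As a closing check on the long pattern, the sketch below compiles the one-line version from the start of the article and a commented version side by side and compares their results. The sample string is hypothetical (the article does not say what text the pattern was written for), and the comments in the verbose version are condensed to keep the example short.

```python
import re

# The one-line pattern from the beginning of the article.
compact = r"((?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?)(.*?)(?=(?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?(?![^\w\s])|$)"

# The same pattern laid out for re.VERBOSE.
verbose = r"""
(                               # code (capture)
    (?: \( \s* )?               # maybe open paren and maybe space
    [A-Z]*H \d+ [a-z]*          # prefix, digits, suffix
    (?:                         # maybe followed by other codes,
        \s* \+ \s*              # ... plus-separated
        [A-Z]*H \d+ [a-z]*      # prefix, digits, suffix
    )*
    (?: \s* [\):+] )?           # maybe space and close paren, colon or plus
)
( .*? )                         # message (capture): everything ...
(?=                             # ... up to (but excluding) the next code
    (?: \( \s* )?
    [A-Z]*H \d+ [a-z]*
    (?:
        \s* \+ \s*
        [A-Z]*H \d+ [a-z]*
    )*
    (?: \s* [\):+] )?
    (?! [^\w\s] )               # (but not when followed by punctuation)
    | $                         # ... or the end
)
"""

# Hypothetical input: two codes, each followed by a message.
sample = "H302 Harmful if swallowed H315 Causes skin irritation"

compact_hits = re.findall(compact, sample)
verbose_hits = re.findall(verbose, sample, re.VERBOSE)

# Both spellings describe the same regex, so the results should be identical.
print(compact_hits == verbose_hits)

# Each hit is a (code, message) tuple; keep only the codes.
codes = [code for code, _message in verbose_hits]
print(codes)
```

A comparison like this is a cheap way to confirm that no token was lost or altered while reformatting a dense pattern into its commented form.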