Make Your Regular Expressions a Hundred Times More Readable
2022-07-29 04:48:00 【Python数据之道】
Author: kingname
Source: 未闻 Code
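Regular expressions are powerful, no question, but once written they look like a string of emoticons. Come back a month later to a pattern you wrote yourself and you no longer remember what it means. Take this one, for example:
pattern = r"((?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?)(.*?)(?=(?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?(?![^\w\s])|$)"
Is there any way to make regular expressions more readable? We know that one way to make code more readable is to write comments, so can a regular expression carry comments too?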
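For example, take the following sentence (in English: "My name is Qingnan, my password is:123kingname456, please keep it secret."):
msg = '我叫青南,我的密码是:123kingname456,请注意保密。'
I want to extract the password 123kingname456 from it, so my regular expression might look like this:
pattern = ':(.*?),'
Could I write it like this instead: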
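pattern = '''
: # start marker
(.*?) # any characters, starting right after the start marker
, # stop at the first ASCII comma
'''
Written this way it is much clearer: the role of every part is spelled out.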
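But used as-is, this pattern obviously extracts nothing, because by default the regex engine treats the line breaks, spaces and comment text as literal characters to match.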
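A quick check makes the failure concrete. The sketch below reuses the msg string and both patterns from above (with the comments translated); the variable names compact_result and commented_result are only illustrative.

```python
import re

msg = '我叫青南,我的密码是:123kingname456,请注意保密。'

# The compact pattern finds the text between ':' and the next ','.
compact = ':(.*?),'
compact_result = re.search(compact, msg).group(1)

# The commented, multi-line version compiled WITHOUT any flag:
# newlines, spaces and the "# ..." text are all literal pattern characters,
# so the pattern can only match a string that contains them verbatim.
commented = '''
:        # start marker
(.*?)    # any characters after the start marker, as few as possible
,        # stop at the first comma
'''
commented_result = re.search(commented, msg)

print(compact_result)    # 123kingname456
print(commented_result)  # None
```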

But while browsing the Python regular expression documentation [1] today, I found something handy: the re.VERBOSE flag.

With it, your regular expressions can carry comments.
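As a minimal sketch of what that looks like, the commented pattern from earlier can be passed to re.search together with re.VERBOSE. The msg string comes from the example above; the indentation and the names match and password are just illustrative choices.

```python
import re

msg = '我叫青南,我的密码是:123kingname456,请注意保密。'

# With re.VERBOSE, unescaped whitespace and "# ..." comments in the
# pattern are ignored, so this behaves like the compact ':(.*?),'.
pattern = r'''
    :        # start marker
    (.*?)    # any characters after the start marker, as few as possible
    ,        # stop at the first comma
'''

match = re.search(pattern, msg, flags=re.VERBOSE)
password = match.group(1)
print(password)  # 123kingname456
```

One thing to keep in mind in this mode: a literal space or # that should actually be matched has to be escaped with a backslash or placed inside a character class, otherwise it is ignored like the rest of the layout.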

re.VERBOSE can also be written in its short form, re.X.
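The short form can be checked directly. The sketch below also shows it passed to re.compile and combined with another flag using |. The second pattern and the upper-cased input are made up purely for illustration and are not part of the original example.

```python
import re

msg = '我叫青南,我的密码是:123kingname456,请注意保密。'

# re.X and re.VERBOSE name the same flag.
same_flag = re.X == re.VERBOSE

# The short name works anywhere a flag is accepted, e.g. re.compile.
pattern = re.compile(r'''
    :        # start marker
    (.*?)    # any characters after the start marker, as few as possible
    ,        # stop at the first comma
''', re.X)
password = pattern.search(msg).group(1)

# Flags combine with |, here verbose plus case-insensitive matching
# (hypothetical variation: only the letters of an upper-cased password).
letters_pattern = re.compile(r'''
    :  \d+     # start marker, then the leading digits
    ([a-z]+)   # the letters in the middle (capture)
    \d+  ,     # the trailing digits, then the comma
''', re.X | re.I)
letters = letters_pattern.search(msg.upper()).group(1)

print(same_flag, password, letters)  # True 123kingname456 KINGNAME
```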

With comments added, the complex regular expression from the beginning of this article becomes much more readable:
pattern = r"""
(                                   # code (capture)
    # BEGIN multicode
    (?: \( \s* )?                   # maybe open paren and maybe space
    # code
    [A-Z]*H                         # prefix
    \d+                             # digits
    [a-z]*                          # suffix
    (?:                             # maybe followed by other codes,
        \s* \+ \s*                  # ... plus-separated
        # code
        [A-Z]*H                     # prefix
        \d+                         # digits
        [a-z]*                      # suffix
    )*
    (?: \s* [\):+] )?               # maybe space and maybe close paren or colon or plus
    # END multicode
)
( .*? )                             # message (capture): everything ...
(?=                                 # ... up to (but excluding) ...
    # ... the next code
    # BEGIN multicode
    (?: \( \s* )?                   # maybe open paren and maybe space
    # code
    [A-Z]*H                         # prefix
    \d+                             # digits
    [a-z]*                          # suffix
    (?:                             # maybe followed by other codes,
        \s* \+ \s*                  # ... plus-separated
        # code
        [A-Z]*H                     # prefix
        \d+                         # digits
        [a-z]*                      # suffix
    )*
    (?: \s* [\):+] )?               # maybe space and maybe close paren or colon or plus
    # END multicode
    # (but not when followed by punctuation)
    (?! [^\w\s] )
    # ... or the end
    | $
)
"""
References
[1] Python regular expression documentation: https://docs.python.org/3/library/re.html#re.VERBOSE