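Make Your Regular Expressions a Hundred Times More Readable
2022-07-29 04:48:00 【Python数据之道】
Author: kingname
Source: 未闻 Code
Regular expressions are powerful, no question, but once written they look like a string of emoticons. Come back to an expression you wrote yourself a month later and you may no longer remember what it means. Take this one, for example: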
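pattern = r"((?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?)(.*?)(?=(?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?(?![^\w\s])|$)"
Is there any way to make regular expressions more readable? One well-known way to make code more readable is to write comments, so can a regular expression carry comments too?
For example, take this sentence:
msg = 'My name is Qingnan, my password is:123kingname456, please keep it secret.'
I want to extract the password 123kingname456 from it, so my regular expression might look like this:
pattern = ':(.*?),'
Could I write it like this instead: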
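pattern = '''
: # start marker
(.*?) # any characters, starting right after the start marker
, # stop at the first ASCII comma
'''
Written this way it is much clearer: what each part does is spelled out.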
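But used directly, this pattern obviously extracts nothing: without any flag, the line breaks, spaces, and # comments are all treated as literal characters that must appear in the text.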
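A minimal sketch of that failure, assuming the sample sentence from above (rendered in English here) and plain `re.findall` with no flags. The variable names are only for illustration:

```python
import re

msg = 'My name is Qingnan, my password is:123kingname456, please keep it secret.'

# The compact pattern works as expected.
compact_result = re.findall(':(.*?),', msg)

# The commented multi-line version, used WITHOUT any flag:
# newlines, spaces and the "#" text are all treated as literal characters.
commented_pattern = '''
: # start marker
(.*?) # any characters after the start marker
, # stop at the first comma
'''
commented_result = re.findall(commented_pattern, msg)

print(compact_result)    # ['123kingname456']
print(commented_result)  # []
```

The second call returns an empty list because the sentence contains no newline followed by a colon, which is what the unflagged pattern literally asks for.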

But while browsing the Python regular expression documentation[1] today, I found something useful: the re.VERBOSE flag.

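With it, your regular expression can carry comments: pass re.VERBOSE as the flags argument, and whitespace in the pattern (outside character classes and unless escaped) and everything from a # to the end of the line are ignored.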
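A short sketch of the commented pattern with the flag applied, again assuming the English rendering of the sample sentence. The indentation inside the pattern is optional and only there for alignment:

```python
import re

msg = 'My name is Qingnan, my password is:123kingname456, please keep it secret.'

pattern = '''
    :        # start marker
    (.*?)    # any characters after the start marker
    ,        # stop at the first comma
'''

# The third positional argument of re.findall is `flags`.
result = re.findall(pattern, msg, re.VERBOSE)
print(result)  # ['123kingname456']
```

One thing to keep in mind: because whitespace is ignored in this mode, a literal space in the text has to be written as `\ `, `[ ]`, or `\s`.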

re.VERBOSE can also be written in its short form, re.X.
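As a quick check, the sketch below compares the two names and compiles the same commented pattern with the short form. The use of `re.compile` is just one possible way to pass the flag:

```python
import re

msg = 'My name is Qingnan, my password is:123kingname456, please keep it secret.'

# re.X and re.VERBOSE refer to the same flag.
same_flag = re.X == re.VERBOSE

compiled = re.compile(r'''
    :        # start marker
    (.*?)    # the password
    ,        # stop at the first comma
''', re.X)

result_x = compiled.findall(msg)
print(same_flag)  # True
print(result_x)   # ['123kingname456']
```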

The complicated regular expression at the start of this article becomes much more readable once comments are added:
pattern = r"""
( # code (capture)
# BEGIN multicode
(?: \( \s* )? # maybe open paren and maybe space
# code
[A-Z]*H # prefix
\d+ # digits
[a-z]* # suffix
(?: # maybe followed by other codes,
\s* \+ \s* # ... plus-separated
# code
[A-Z]*H # prefix
\d+ # digits
[a-z]* # suffix
)*
(?: \s* [\):+] )? # maybe space and maybe close paren or colon or plus
# END multicode
)
( .*? ) # message (capture): everything ...
(?= # ... up to (but excluding) ...
# ... the next code
# BEGIN multicode
(?: \( \s* )? # maybe open paren and maybe space
# code
[A-Z]*H # prefix
\d+ # digits
[a-z]* # suffix
(?: # maybe followed by other codes,
\s* \+ \s* # ... plus-separated
# code
[A-Z]*H # prefix
\d+ # digits
[a-z]* # suffix
)*
(?: \s* [\):+] )? # maybe space and maybe close paren or colon or plus
# END multicode
# (but not when followed by punctuation)
(?! [^\w\s] )
# ... or the end
| $
)
"""
References
[1] Python regular expression documentation: https://docs.python.org/3/library/re.html#re.VERBOSE