Make your regular expressions a hundred times more readable
2022-07-29 04:48:00 【The way of Python data】
Author: kingname
Source: Unheard of Code
Regular expressions are powerful, but they read like a string of emoticons. You write one yourself, come back a month later, and can no longer remember what it means. Take this one, for example:
pattern = r"((?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?)(.*?)(?=(?:\(\s*)?[A-Z]*H\d+[a-z]*(?:\s*\+\s*[A-Z]*H\d+[a-z]*)*(?:\s*[\):+])?(?![^\w\s])|$)"
Is there a way to make regular expressions more readable? One well-known way to make code more readable is to write comments, so can regular expressions carry comments too?
For example, take the following sentence:
msg = 'My name is Qingnan, my password is:123kingname456, please keep it confidential.'
I want to extract the password 123kingname456, so my regular expression might look like this:
pattern = ':(.*?),'
Could I write it like this instead?
pattern = '''
:        # start marker
(.*?)    # any characters, starting right after the start marker
,        # stop when an English comma is encountered
'''
Written this way it is much clearer: the role of each part is obvious.
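But used as is, this pattern obviously extracts nothing, because the line breaks, spaces, and # comments are all treated as literal characters to match, as shown in the figure below: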
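The failure can be reproduced with a minimal sketch like the one below. It is reconstructed from the description in the text, not copied from the original screenshot, and the msg string is the example sentence with its spacing tidied up (an assumption about the exact input).

```python
import re

# Example sentence from the article, spacing tidied (assumed exact text).
msg = 'My name is Qingnan, my password is:123kingname456, please keep it confidential.'

# The compact one-line pattern works as expected.
plain = ':(.*?),'
plain_result = re.findall(plain, msg)

# The commented multi-line version, compiled WITHOUT any flag:
# every newline, space, and "#" is part of what must be matched.
commented = '''
:        # start marker
(.*?)    # any characters after the start marker
,        # stop at an English comma
'''
commented_result = re.findall(commented, msg)

print(plain_result)      # ['123kingname456']
print(commented_result)  # []
```

The second search finds nothing because msg contains no newline, so the pattern fails at its very first character.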

But while browsing the Python regular expression documentation [1] today, I found something useful: the re.VERBOSE flag.

With it, your regular expressions can carry comments, as shown in the figure below:
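As a rough stand-in for the lost screenshot, here is a small sketch of the same idea. The msg string is again the tidied example sentence (an assumption), and the variable names are chosen for illustration only.

```python
import re

# Example sentence from the article, spacing tidied (assumed exact text).
msg = 'My name is Qingnan, my password is:123kingname456, please keep it confidential.'

pattern = r'''
:        # start marker
(.*?)    # any characters after the start marker
,        # stop at an English comma
'''

# With re.VERBOSE, whitespace in the pattern is skipped and everything
# from "#" to the end of the line is treated as a comment.
verbose_result = re.findall(pattern, msg, re.VERBOSE)

print(verbose_result)  # ['123kingname456']
```

One consequence of this mode is that a literal space or # inside the pattern has to be escaped (for example `\ ` or `\#`) or placed in a character class such as `[ ]`, otherwise it is ignored.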

re.VERBOSE can also be abbreviated as re.X, as shown in the figure below:
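A short sketch of the abbreviation, under the same assumptions as before (tidied msg string, illustrative variable names): both spellings refer to the same flag, so the results should be identical.

```python
import re

# Example sentence from the article, spacing tidied (assumed exact text).
msg = 'My name is Qingnan, my password is:123kingname456, please keep it confidential.'

pattern = r'''
:        # start marker
(.*?)    # any characters after the start marker
,        # stop at an English comma
'''

long_form = re.findall(pattern, msg, re.VERBOSE)
short_form = re.findall(pattern, msg, re.X)   # same flag, shorter name

print(re.X == re.VERBOSE)       # True
print(short_form == long_form)  # True
```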

With comments added, the complex regular expression from the beginning of this article becomes much more readable:
pattern = r"""
(  # code (capture)
    # BEGIN multicode
    (?: \( \s* )?  # maybe open paren and maybe space
    # code
    [A-Z]*H  # prefix
    \d+  # digits
    [a-z]*  # suffix
    (?:  # maybe followed by other codes,
        \s* \+ \s*  # ... plus-separated
        # code
        [A-Z]*H  # prefix
        \d+  # digits
        [a-z]*  # suffix
    )*
    (?: \s* [\):+] )?  # maybe space and maybe close paren or colon or plus
    # END multicode
)
( .*? )  # message (capture): everything ...
(?=  # ... up to (but excluding) ...
    # ... the next code
    # BEGIN multicode
    (?: \( \s* )?  # maybe open paren and maybe space
    # code
    [A-Z]*H  # prefix
    \d+  # digits
    [a-z]*  # suffix
    (?:  # maybe followed by other codes,
        \s* \+ \s*  # ... plus-separated
        # code
        [A-Z]*H  # prefix
        \d+  # digits
        [a-z]*  # suffix
    )*
    (?: \s* [\):+] )?  # maybe space and maybe close paren or colon or plus
    # END multicode
    # (but not when followed by punctuation)
    (?! [^\w\s] )
    # ... or the end
    | $
)
"""
Reference material
[1] Regular expression documentation: https://docs.python.org/3/library/re.html#re.VERBOSE