Are \w and [A-Za-z0-9_], and \d and [0-9], equivalent?
2022-06-27 21:59:00 【JAPAN_ is_ shit】
When I first started learning regular expressions, I had this doubt: why does Baidu Encyclopedia describe them as equivalent?
To answer it you have to understand the Unicode character set, and in particular where Chinese characters, English letters and digits sit in it. Reference: Unicode encyclopedia of characters
In the Unicode table, Chinese characters occupy the range 4E00-9FA5 (U+4E00 to U+9FA5).
English letters, digits and special symbols belong to the Latin part of Unicode.
So \w is much broader than [A-Za-z0-9_]: for example, it can match the letters of other scripts, and \d can match the digits of other scripts.
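A side-by-side check makes the difference concrete. This is a minimal sketch, assuming Python 3, where str patterns follow Unicode rules by default; the sample string is made up for illustration and is not from the original post.

```python
import re

# Hypothetical sample: ASCII text, Chinese characters, fullwidth digits
text = "abc_123 中文 ４５６"

# \w and \d under the default (Unicode) rules for str patterns
unicode_words = re.findall(r"\w+", text)
unicode_digits = re.findall(r"\d+", text)

# The explicit ASCII-only character classes
ascii_words = re.findall(r"[A-Za-z0-9_]+", text)
ascii_digits = re.findall(r"[0-9]+", text)

print(unicode_words)   # ['abc_123', '中文', '４５６']
print(ascii_words)     # ['abc_123']
print(unicode_digits)  # ['123', '４５６']
print(ascii_digits)    # ['123']
```

The explicit classes only ever see the ASCII run, while \w and \d also pick up the Chinese characters and the fullwidth digits.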
This is not limited to \w and \d. Among the regex metacharacters, \W, \D, \s, \S, \b and \B are also Unicode-aware and work on text in other scripts. So how do you stop them from matching all Unicode characters?
Pass the re.ASCII flag so that they only match ASCII characters.
import re

# Extended Arabic-Indic digits
s = "۱۲۳۴۵۶۷۸۹"
print(s.isdigit())
# True
a = re.match(r'\d+', s)  # \d matches Unicode decimal digits
print(a.group())
# ۱۲۳۴۵۶۷۸۹

# Mongolian
d = 'ᠠᠡᠢᠣᠤᠶᠿ'
b = re.match(r'\w+', d)  # matches letters, digits and underscore
print(b.group())
# ᠠᠡᠢᠣᠤᠶᠿ

b = re.match(r'\D+', d)  # matches non-digits
print(b.group())
# ᠠᠡᠢᠣᠤᠶᠿ

b = re.match(r'\S+', d)  # matches non-whitespace characters
print(b.group())
# ᠠᠡᠢᠣᠤᠶᠿ

a = re.match(r'.+', s)  # . matches any character except a newline
print(a.group())
# ۱۲۳۴۵۶۷۸۹

# Mongolian
d = 'ᠠᠡᠢᠣᠤᠶᠿᠢᠣᠤ'
b = re.findall(r'\bᠠᠡ', d)  # \b matches a word boundary
print(b)
# ['ᠠᠡ']
Once re.ASCII is set, \w no longer matches Mongolian, so the search finds nothing.
# Mongolian
d = 'ᠠᠡᠢᠣᠤᠶᠿᠢᠣᠤ'
b = re.findall(r'\w+', d, re.ASCII)  # \w is limited to ASCII: [a-zA-Z0-9_]
print(b)
# []  nothing matches
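To see the effect of the flag on \d as well, the sketch below runs the same pattern with and without re.ASCII. It also uses the inline form (?a), which Python's re module accepts at the start of a pattern as an equivalent of the flag. The sample string is an assumption for illustration, with the non-ASCII digits written as escapes to avoid display-order confusion.

```python
import re

# Extended Arabic-Indic digits (U+06F1..U+06F3), a space, then ASCII digits
digits = "\u06f1\u06f2\u06f3 456"

default_digits = re.findall(r"\d+", digits)          # Unicode rules (default for str)
ascii_digits = re.findall(r"\d+", digits, re.ASCII)  # restricted to [0-9]
inline_digits = re.findall(r"(?a)\d+", digits)       # same restriction, set inside the pattern

print(len(default_digits))  # 2
print(ascii_digits)         # ['456']
print(inline_digits)        # ['456']
```

The inline form is convenient when the pattern is passed to code that does not expose a flags argument.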