Chinese name extraction (toy code: the accuracy is too low, so treat it as just for fun)
2022-07-02 13:16:00 【Fantasy elves_ cq】
Python official website: https://www.python.org/
Self-study is nothing mysterious. A person always spends more time studying alone than studying in school, and there are always more times without a teacher than with one.
—— Hua Luogeng

- 1. Origin of these notes
- 2. Directory structure
- 3. The code in action
- 4. Complete source code for this exercise

Prompted by this comment, I decided to "risk my neck" and try it myself.
I happen to have a list of the Hundred Family Surnames on hand, so I combined it with a list of "characters commonly used in Chinese given names" to build a toy: "Chinese name extraction".
Directory structure of the "toy"
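The idea can be sketched in a few lines before looking at the full script. This is a minimal illustration, not the code from this post: `SURNAMES`, `NAME_CHARS`, and `candidate_names` are hypothetical names, and the data is a tiny made-up sample rather than the real word lists. It also assumes a given name of one or two characters, whereas the full script always grabs three characters after the surname and trims afterwards.

```python
import re

# Hypothetical sample data; the real script loads much larger lists from files.
SURNAMES = ["诸葛", "刘", "关", "张"]
NAME_CHARS = set("备羽飞亮云德")


def candidate_names(text, surnames=SURNAMES, name_chars=NAME_CHARS):
    """Return substrings that look like a surname plus 1-2 common given-name characters."""
    found = set()
    for surname in surnames:
        # Grab the surname plus up to two following word characters.
        for match in re.finditer(re.escape(surname) + r"(\w{1,2})", text):
            kept = ""
            for ch in match.group(1):
                # Stop at the first character that is not common in given names.
                if ch not in name_chars:
                    break
                kept += ch
            if kept:
                found.add(surname + kept)
    return found


print(candidate_names("刘备与关羽、张飞结义，后请诸葛亮出山。"))
```

The quality of the result depends entirely on the two word lists: a surname character followed by a common character is reported even when it is not a name at all, which is why this can only be a toy.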

Trial run of the code (using two texts, "The romance of The Three Kingdoms.txt" and "Dafeng is a watchman_19.txt", to play with)


Complete source code for this exercise
#!/usr/bin/env python
# coding: utf-8
'''
filename = 're_Chinese_name.py'
author = ' The dream spirit _cq'
time = '2022-06-29'
'''
from os import system
from re import findall  # Load the findall function from the re module.


class re_Chinese_name:
    ''' Extract Chinese names from text. '''

    def __init__(self):
        system('clear')  # Clear the terminal (Unix-like systems only).
        with open('data/firstnames_one_100.txt') as f:
            self.firstnames = f.read().strip().split(',')  # single-character surnames
        with open('data/firstnames_two_85.txt') as f:
            self.firstnames_two = f.read().strip().split(',')  # two-character surnames
        self.firstnames.extend(self.firstnames_two)
        with open('data/boy_names.txt') as f:
            self.names_chr = f.read()
        with open('data/girl_names.txt') as f:
            self.names_chr += f.read()
        # Characters commonly used in given names, with the comma separators removed.
        self.names = "".join(self.names_chr.strip().split(','))
        #input(f"\n\n surname :{self.firstnames}\n The name is written :{self.names_chr}")

    def get_names(self, text):
        ''' Extract names; text is the text from which names are to be extracted. '''
        names = []
        for firstname in self.firstnames:
            if firstname in text:
                # The surname followed by exactly three word characters.
                re_s = f"{firstname}" r'\w{3}'
                #print(re_s) # Debugging statement.
                names.extend(findall(re_s, text))
        print(' Sorting the extracted names …… '.center(39, '~'))
        names = self.isname(list(set(names)))
        return set(names)

    def isname(self, names_list):
        ''' Decide which candidates are Chinese names and trim the extra characters. '''
        names = []
        n = self.names_chr
        for name in names_list:
            if name[:2] in self.firstnames_two:
                # Two-character surname. The candidate is 4 or 5 characters long,
                # so slice from the front rather than relative to the end.
                if name[2] in n and name[3] in n:
                    names.append(name[:4])
                elif name[2] in n and name[3] not in n:
                    names.append(name[:3])
            else:
                # Single-character surname: keep as many leading given-name
                # characters as are found in the common-character list.
                if name[1] in n and name[2] in n and name[3] in n:
                    names.append(name)
                elif name[1] in n and name[2] in n:
                    names.append(name[:3])
                elif name[1] in n:
                    names.append(name[:2])
        return names


if __name__ == '__main__':
    rn = re_Chinese_name()
    names = rn.get_names(open('data/The romance of The Three Kingdoms.txt').read())
    names2 = rn.get_names(open('data/Dafeng is a watchman_19.txt').read())
    print(
        f"\n\n{' re Extract Chinese names '.center(44, '~')}\n\n"
        f"《The romance of The Three Kingdoms》:\n{','.join(names)}\n\n"
        f"《Dafeng is a watchman》 Chapter nineteen:\n{','.join(names2)}\n\n"
    )
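One limitation of searching surname by surname with `findall` is that matches cannot overlap: once three characters after a surname have been consumed, a second name starting inside them is skipped. A possible alternative, sketched here as an assumption rather than as part of the script above, is a single compiled pattern wrapped in a lookahead. `build_surname_pattern` and `find_candidates` are hypothetical helper names, and the surname list is a made-up sample.

```python
import re


def build_surname_pattern(surnames):
    """Compile one lookahead pattern covering every surname (hypothetical helper)."""
    # Longest surnames first, so "诸葛" is tried before its prefix "诸".
    ordered = sorted(set(surnames), key=len, reverse=True)
    alternation = "|".join(re.escape(s) for s in ordered)
    # The lookahead (?=...) consumes nothing, so matches may overlap.
    return re.compile(rf"(?=({alternation})(\w{{1,3}}))")


def find_candidates(text, pattern):
    """Return (surname, following characters) pairs, overlapping ones included."""
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]


pattern = build_surname_pattern(["诸葛", "诸", "司马", "马"])
print(find_candidates("诸葛亮与司马懿对峙", pattern))
```

The candidates still need the same kind of trimming that `isname` performs: in this sample the text after "司马" comes back as three characters, and "马" is also reported as a surname in its own right, so the filtering step matters just as much as the search.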
