Chinese name extraction (toy code: the accuracy is too low, so treat it as just for fun)
2022-07-02 13:16:00 【Fantasy elves_ cq】
Python official website: https://www.python.org/
Self-study is nothing mysterious. Over a lifetime, a person spends far more time studying alone than studying in school, and far more time without a teacher than with one.
—— Hua Luogeng

- 1. Origin of these notes
- 2. Directory structure
- 3. The code in action
- 4. Complete source code for this exercise

Prompted by this comment, I "risked my neck" and gave it a try.
I happened to have a list of the Hundred Family Surnames on hand, so I combined it with a list of characters commonly used in Chinese given names to build a toy: "Chinese name extraction".
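
The idea can be sketched in a few lines. This is only a minimal illustration, not the full script: `SURNAMES`, `NAME_CHARS` and `candidate_names` are hypothetical names, and the tiny lists stand in for the surname and given-name data files.

```python
import re

# Hypothetical miniature stand-ins for the surname and given-name word lists.
SURNAMES = ["刘", "关", "张"]
NAME_CHARS = set("备羽飞云长")


def candidate_names(text):
    """Return surname + one-character given names found in text."""
    found = set()
    for surname in SURNAMES:
        # Grab the single word character that follows each surname occurrence.
        for match in re.findall(re.escape(surname) + r"\w", text):
            # Keep the pair only if that character is a known name character.
            if match[1] in NAME_CHARS:
                found.add(match)
    return found


print(sorted(candidate_names("刘备与关羽、张飞结义。")))
```

A surname followed by a character outside the name list (for example "张三" with the lists above) is simply dropped, which is the same filtering principle the toy relies on.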
Directory structure of the "toy"
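
The source code further down opens four word-list files under a `data/` folder, so a quick check like the one below can confirm the layout before running it. This is a sketch under assumptions: the helper name `missing_data_files` is hypothetical and not part of the original script; only the file names are taken from the script's `open()` calls.

```python
from pathlib import Path

# Word-list file names as opened by the script in this post.
DATA_FILES = [
    "firstnames_one_100.txt",
    "firstnames_two_85.txt",
    "boy_names.txt",
    "girl_names.txt",
]


def missing_data_files(base="data"):
    """Return the expected word-list files that are absent from base."""
    root = Path(base)
    return [name for name in DATA_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    print("missing:", missing if missing else "none")
```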

The code in action (using the two texts "The romance of The Three Kingdoms.txt" and "Dafeng is a watchman_19.txt" as guinea pigs)


Complete source code for this exercise
#!/usr/bin/env python
# coding: utf-8
'''
filename = 're_Chinese_name.py'
author = 'The dream spirit _cq'
time = '2022-06-29'
'''
from os import system
from re import findall  # Load the findall function from the re module.


class re_Chinese_name:
    ''' Extract Chinese names from text. '''

    def __init__(self):
        system('clear')  # Clear the terminal (Unix-like systems).
        # The data files are assumed to be UTF-8, comma-separated lists.
        with open('data/firstnames_one_100.txt', encoding='utf-8') as f:
            self.firstnames = f.read().strip().split(',')
        with open('data/firstnames_two_85.txt', encoding='utf-8') as f:
            self.firstnames_two = f.read().strip().split(',')
        # One combined list: single-character and two-character surnames.
        self.firstnames.extend(self.firstnames_two)
        with open('data/boy_names.txt', encoding='utf-8') as f:
            self.names_chr = f.read()
        with open('data/girl_names.txt', encoding='utf-8') as f:
            self.names_chr += f.read()
        self.names = "".join(self.names_chr.strip().split(','))
        #input(f"\n\n surnames: {self.firstnames}\n name characters: {self.names_chr}")

    def get_names(self, text):
        ''' Extract names; text is the text to extract names from. '''
        names = []
        for firstname in self.firstnames:
            if firstname in text:
                # The surname followed by the next three word characters.
                re_s = firstname + r'\w{3}'
                #print(re_s)  # Debug statement.
                names.extend(findall(re_s, text))
        print(' Sorting the extracted names …… '.center(39, '~'))
        names = self.isname(list(set(names)))
        return set(names)

    def isname(self, names_list):
        ''' Decide whether each candidate is a Chinese name and trim it. '''
        names = []
        n = self.names_chr
        for name in names_list:
            if name[:2] in self.firstnames_two:
                # Two-character surname: accept one or two given-name characters.
                # Explicit slices are used because a candidate may be 4 or 5
                # characters long; the unchecked tail must not be kept.
                if name[2] in n and name[3] in n:
                    names.append(name[:4])
                elif name[2] in n:
                    names.append(name[:3])
            else:
                # Single-character surname: accept one to three given-name characters.
                if name[1] in n and name[2] in n and name[3] in n:
                    names.append(name)
                elif name[1] in n and name[2] in n:
                    names.append(name[:3])
                elif name[1] in n:
                    names.append(name[:2])
        return names


if __name__ == '__main__':
    rn = re_Chinese_name()
    with open('data/The romance of The Three Kingdoms.txt', encoding='utf-8') as f:
        names = rn.get_names(f.read())
    with open('data/Dafeng is a watchman_19.txt', encoding='utf-8') as f:
        names2 = rn.get_names(f.read())
    title = ' Chinese names extracted with re '.center(44, '~')
    print(f"\n\n{title}\n\n《The Romance of the Three Kingdoms》:\n{','.join(names)}"
          f"\n\n《Dafeng is a watchman》, chapter 19:\n{','.join(names2)}\n\n")
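
To look at the trimming rule of `isname` in isolation, here is a small standalone sketch. It rests on assumptions: the function name `trim_candidate` is hypothetical, the tiny character sets replace the real data files, and, unlike the script, it does not check that a single-character surname is actually in the surname list.

```python
# Hypothetical miniature data; the real script loads these from data/*.txt.
DOUBLE_SURNAMES = {"诸葛", "司马"}
NAME_CHARS = set("亮懿备云长")


def trim_candidate(candidate):
    """Keep the surname plus the leading run of given-name characters.

    Returns None when the character right after the surname is not a
    known given-name character.
    """
    if candidate[:2] in DOUBLE_SURNAMES:
        start, max_given = 2, 2  # two-character surname, up to 2 given chars
    else:
        start, max_given = 1, 3  # one-character surname, up to 3 given chars
    end = start
    # Extend the name while the next character is a plausible name character.
    while (end < len(candidate)
           and end - start < max_given
           and candidate[end] in NAME_CHARS):
        end += 1
    return candidate[:end] if end > start else None


print(trim_candidate("诸葛亮曰吾"))
print(trim_candidate("刘备乃汉"))
print(trim_candidate("张飞大怒"))
```

With the tiny sets above, the first two calls are cut down to the surname plus its known given-name characters, and the third returns `None` because 飞 is not in the sample character set. Results on real text depend entirely on how complete the word lists are, which is why the accuracy of this toy stays low.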
