Forward maximum matching method
2022-07-06 21:08:00 【wx5d786476cd8b2】
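class MM(object):
    """Forward maximum matching (FMM) word segmenter backed by a dictionary file."""

    def __init__(self, dic_path):
        self.dictionary = set()
        self.maximum = 0  # length of the longest dictionary word
        # Read the dictionary: one word per line, blank lines are skipped
        with open(dic_path, 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                self.dictionary.add(line)
                if self.maximum < len(line):
                    self.maximum = len(line)

    def cut(self, text):
        result = []
        index = len(text)  # number of characters not yet consumed
        n = 0              # start position of the current window
        while index > 0:
            word = None
            # Try the longest window first, then shrink it one character at a time
            for size in range(self.maximum, 0, -1):
                if index - size < 0:
                    continue  # window would run past the end of the text
                piece = text[n:n + size]
                if piece in self.dictionary:
                    word = piece
                    result.append(word)
                    index -= size
                    n += size
                    break
            if word is None:
                # No dictionary word starts here: skip one character.
                # Note that the skipped character is not added to the result.
                n += 1
                index -= 1
        return result


def main():
    # The dictionary entries must be in the same language as this text,
    # otherwise nothing matches and the result is empty.
    text = " Nanjing Yangtze River Bridge "
    t = MM(r'C:\Users\ljy\Desktop\learning-nlp-master\chapter-3\data\imm_dic.utf8')
    print(len(t.cut(text)))


if __name__ == '__main__':
    main()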
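To try the idea without the dictionary file, here is a minimal self-contained sketch. The function name `forward_max_match`, the small in-memory dictionaries and the sample sentence are all assumptions made for illustration, not part of the code above. Unlike the class above, this variant keeps an unmatched character as a single-character token instead of dropping it, which is a common convention but a deliberate change.

```python
def forward_max_match(text, dictionary):
    """Greedy left-to-right segmentation: always take the longest dictionary word.

    Hypothetical helper for illustration; unmatched characters are emitted
    as single-character tokens rather than skipped.
    """
    # Longest word length bounds the window; at least 1 so the loop always advances
    max_len = max([len(w) for w in dictionary] + [1])
    tokens = []
    start = 0
    while start < len(text):
        # Largest window that still fits in the remaining text
        longest = min(max_len, len(text) - start)
        for size in range(longest, 0, -1):
            piece = text[start:start + size]
            # A single character is accepted even if it is not in the dictionary
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                start += size
                break
    return tokens


# Illustrative dictionaries (assumed, not taken from imm_dic.utf8)
small_dic = {"南京", "南京市", "长江", "长江大桥", "大桥"}
greedy_dic = small_dic | {"南京市长"}

sentence = "南京市长江大桥"
print(forward_max_match(sentence, small_dic))   # ['南京市', '长江大桥']
print(forward_max_match(sentence, greedy_dic))  # ['南京市长', '江', '大桥']
```

The second call shows the known weakness of forward matching: once a longer word such as 南京市长 is in the dictionary, the greedy choice at the start of the sentence forces a poor split of what follows.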