Segment requirement text with jieba and export the words, sorted by frequency, to an Excel file
2022-07-25 22:21:00 【佛系人僧】
Structure of the input Excel data (the code uses the columns 功能 and 需求内容):
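import pandas as pd
import jieba

# Read the requirements sheet (reading .xls files requires the xlrd package)
df = pd.read_excel('xuqiufenxi.xls')
print(df)

# Add a new column to hold the segmentation result
df['fenci'] = ''

# Walk through every row, segment its text, and store the result in the new column
for i in range(len(df)):
    print(i)
    # Use .loc rather than chained indexing (df['fenci'][i] = ...),
    # which can end up writing to a temporary copy instead of df
    df.loc[i, 'fenci'] = ' '.join(jieba.cut(df.loc[i, '需求内容']))
    print(df.loc[i, 'fenci'])

    # Count how many times each word occurs in this row
    word_count = {}
    for word in df.loc[i, 'fenci'].split():
        if word in word_count:
            word_count[word] += 1
        else:
            word_count[word] = 1

    # Convert the word_count dict into a DataFrame
    word_count_df = pd.DataFrame(list(word_count.items()), columns=['word', 'count'])
    # Sort by count in descending order
    word_count_df = word_count_df.sort_values(by='count', ascending=False)
    # Write one Excel file per row, named after that row's 功能 value
    word_count_df.to_excel(f"{df.loc[i, '功能']}.xlsx", index=False)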
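The counting and sorting steps can also be written more compactly with `collections.Counter` from the standard library. The following is only a sketch, not the post's own method: the function name `word_frequencies` and its `tokenize` parameter are assumptions made for illustration, and `str.split` is used as a stand-in tokenizer so the example runs without third-party packages.

```python
from collections import Counter


def word_frequencies(text, tokenize=str.split):
    """Return (word, count) pairs for `text`, most frequent first.

    `tokenize` is any callable mapping a string to an iterable of tokens.
    The default, str.split, is only a stand-in for a real segmenter.
    """
    # Drop empty or whitespace-only tokens, much as ' '.join(...).split() would
    tokens = [tok for tok in tokenize(text) if tok.strip()]
    # most_common() sorts by count, descending; ties keep first-seen order
    return Counter(tokens).most_common()


if __name__ == "__main__":
    sample = "login page login button page login"
    for word, count in word_frequencies(sample):
        print(word, count)
```

With jieba installed, passing `tokenize=jieba.lcut` should produce the same kind of counts as the dictionary loop, and the resulting pairs can be handed to `pd.DataFrame(pairs, columns=['word', 'count'])` before calling `to_excel`.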
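Output: one .xlsx file per row, named after that row's 功能 value, listing each word and its count in descending order of count.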
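Because each output file is named after the row's 功能 value, a value containing characters such as `/` or `:` could make the write fail or land in an unexpected path. Below is a minimal sketch of one way to guard against that. The helper `safe_filename` is hypothetical and not part of the original post, and the set of replaced characters is an assumption based on what Windows disallows in file names.

```python
import re


def safe_filename(name, replacement="_"):
    """Hypothetical helper: make `name` usable as a file name on common systems."""
    # Replace characters that Windows forbids in file names, plus control characters
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', replacement, str(name))
    # Leading/trailing spaces and trailing dots are also troublesome on Windows
    cleaned = cleaned.strip().rstrip(". ")
    # Fall back to a placeholder if nothing usable is left
    return cleaned or "unnamed"


if __name__ == "__main__":
    print(safe_filename("用户/登录"))
```

The call in the script would then become something like `word_count_df.to_excel(f"{safe_filename(df.loc[i, '功能'])}.xlsx", index=False)`. Note that two rows with the same 功能 value would still overwrite each other's file; appending the row index to the name is one possible way around that.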