Segmenting requirement text with jieba and exporting the words, sorted by frequency, to an Excel file
2022-07-25 22:21:00 【佛系人僧】
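Read in the Excel data, which has a 需求内容 (requirement text) column and a 功能 (feature) column, then segment each row and count its words: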
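import pandas as pd
import jieba

# Reading .xls needs the xlrd package; writing .xlsx needs openpyxl
df = pd.read_excel('xuqiufenxi.xls')
print(df)

# Add a new column to hold the segmentation results
df['fenci'] = ''

# Iterate over each row, segment its text, and store the result in the new column
for i in range(len(df)):
    print(i)
    # .loc avoids the chained assignment df['fenci'][i] = ..., which may not write back
    df.loc[i, 'fenci'] = ' '.join(jieba.cut(df.loc[i, '需求内容']))
    print(df.loc[i, 'fenci'])

    # Count how many times each word occurs in this row
    word_count = {}
    for word in df.loc[i, 'fenci'].split():
        if word in word_count:
            word_count[word] += 1
        else:
            word_count[word] = 1

    # Convert the word_count dict into a DataFrame
    word_count_df = pd.DataFrame(list(word_count.items()), columns=['word', 'count'])
    # Sort by count in descending order
    word_count_df = word_count_df.sort_values(by='count', ascending=False)
    # Write one Excel file per row, named after that row's 功能 value
    word_count_df.to_excel(f"{df.loc[i, '功能']}.xlsx", index=False)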
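Output: one .xlsx file per row, named after that row's 功能 value, with a word column and a count column sorted from the most frequent word to the least.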
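The manual dictionary counting in the script can also be written with collections.Counter from the Python standard library. The following is a minimal sketch rather than the author's code: count_words is a hypothetical helper, and the sample string is made-up data standing in for the space-joined result of jieba.cut, so the example runs without jieba or an Excel file.

```python
from collections import Counter


def count_words(segmented_text):
    """Return (word, count) pairs sorted by descending count.

    `segmented_text` is assumed to be a space-separated string,
    like the values stored in the 'fenci' column.
    """
    counts = Counter(segmented_text.split())
    # most_common() orders the pairs from highest count to lowest
    return counts.most_common()


# Hypothetical sample standing in for ' '.join(jieba.cut(...))
sample = "用户 登录 系统 用户 查询 订单 用户 登录"
result = count_words(sample)
print(result)
```

The returned list of pairs can be passed to pd.DataFrame(result, columns=['word', 'count']) and written out with to_excel in the same way as before; since most_common() already sorts by count, the separate sort_values step would not be needed.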