Perform Jieba word segmentation on the requirement content and output Excel documents by word frequency
2022-07-25 22:21:00 【Buddhist monk】
Structure of the input Excel data: the script below expects a 'function' column and a 'Content of requirements' column.
import pandas as pd
import jieba

# Reading .xls files requires the xlrd package; writing .xlsx requires openpyxl
df = pd.read_excel('xuqiufenxi.xls')
print(df)

# Create a new column to store the word segmentation results
df['fenci'] = ''

# Segment the text of each row and store the result in the new column.
# The column names must match the headers in the spreadsheet.
for i in range(len(df)):
    print(i)
    # Use .loc instead of chained indexing so the assignment reaches df itself
    df.loc[i, 'fenci'] = ' '.join(jieba.cut(df.loc[i, 'Content of requirements']))
    print(df.loc[i, 'fenci'])

    # Count how many times each word appears in this row
    word_count = {}
    for word in df.loc[i, 'fenci'].split():
        if word in word_count:
            word_count[word] += 1
        else:
            word_count[word] = 1

    # Convert the word_count dictionary into a DataFrame
    word_count_df = pd.DataFrame(list(word_count.items()), columns=['word', 'count'])
    # Sort by count in descending order
    word_count_df = word_count_df.sort_values(by='count', ascending=False)
    # Write one Excel file per row, named after that row's 'function' value
    word_count_df.to_excel(f"{df.loc[i, 'function']}.xlsx", index=False)
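Output: one Excel file per row, named after that row's 'function' value, listing each word and its count in descending order of frequency.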
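The counting step can also be written with `collections.Counter` from the standard library. The following is only a minimal sketch: `count_words` is a hypothetical helper name, and `sample_tokens` is made-up data standing in for what `jieba.cut()` would return, so that the snippet runs without jieba or an Excel file.

```python
from collections import Counter


def count_words(tokens):
    """Return (word, count) pairs sorted by count, highest first."""
    # Drop empty or whitespace-only tokens before counting.
    cleaned = [t.strip() for t in tokens if t.strip()]
    # most_common() sorts by descending count; ties keep first-seen order.
    return Counter(cleaned).most_common()


# Hypothetical, already-segmented tokens standing in for jieba.cut() output.
sample_tokens = ["用户", "登录", "用户", "注册", " ", "登录", "用户"]
rows = count_words(sample_tokens)
print(rows)
```

The resulting list of pairs can be passed to `pd.DataFrame(rows, columns=['word', 'count'])` and written with `to_excel` as in the script above; since it is already sorted, the separate `sort_values` call is no longer needed.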