Sort according to a number in a string in a column of CSV file
2022-07-06 08:24:00 【Don't seek great wealth, just seek wealth to rival the country】
**The file is shown below:**
Sort the rows by the fourth number in the first-column string. (Note that the Chinese text in the second column must stay matched to its video sequence in the first column.)
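As a minimal sketch of the goal, the snippet below sorts a few made-up rows by the fourth `_`-separated field of the first column. The sample values and the helper name `fourth_field` are hypothetical, since the real file contents are not reproduced here. It also shows why the field is converted with `int`: compared as text, "10" sorts before "2".

```python
# Hypothetical rows: (first column, second column). Not the real file's data.
rows = [
    ("clip_a_b_10_x", "ten"),
    ("clip_a_b_2_x", "two"),
    ("clip_a_b_1_x", "one"),
]


def fourth_field(row):
    """Return the fourth "_"-separated field of the first column as an int."""
    return int(row[0].split("_")[3])


# Numeric sort: each row keeps its own second-column text.
sorted_rows = sorted(rows, key=fourth_field)

# Text sort of the same field, for comparison: "10" lands before "2".
text_sorted = sorted(rows, key=lambda r: r[0].split("_")[3])

print([r[1] for r in sorted_rows])
print([r[1] for r in text_sorted])
```

Because each row is sorted as a whole tuple, the pairing between the two columns cannot drift apart.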
**Solution:**
1. Read the data.
2. Split each string in the first column into a list with split.
3. Combine with zip: pair the split lists with the second column and convert the pairs to a dictionary with dict. A dictionary key must be hashable, so it cannot be a list, while a value can be any object. The second column therefore becomes the key and the split list becomes the value.
4. Sort by value with sorted.
5. Loop over the sorted result with for: join each split list back into a string, append the keys and the joined strings to two empty lists, then zip the two lists again to restore the original column order. (You may ask: why not simply swap the keys and values instead of going to this trouble? As noted in step 3, a dictionary key cannot be a list, so swapping the key-value pairs does not work.)
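The hashability constraint in step 3 can be checked directly. The sketch below uses a hypothetical first-column value and shows that a list is rejected as a dictionary key, while a tuple of the same fields, or the original unsplit string, is accepted.

```python
# Hypothetical first-column value; the real data is not shown here.
parts = "clip_a_b_10_x".split("_")  # a list of fields

# A list is mutable and unhashable, so using it as a key raises TypeError.
try:
    {parts: "ten"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A tuple of the same fields is hashable and works as a key.
by_tuple = {tuple(parts): "ten"}

# So does the original string, which can be split later, when sorting.
by_string = {"_".join(parts): "ten"}

print(list_key_ok, by_tuple, by_string)
```

Keeping the unsplit string as the key and splitting it only inside the sort key removes the need to swap the columns at all.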
**The code is as follows:**
import os
import pandas as pd

# Folder containing the CSV files to sort
path = r"C:\Users\jam96\PycharmProjects\all_module\pandas_test\a"
# Write the results to a "data" folder next to this script
dir_path = os.path.dirname(os.path.abspath(__file__))
result_path = os.path.join(dir_path, "data")
os.makedirs(result_path, exist_ok=True)

files = os.listdir(path)
print(files)
for num, name in enumerate(files, start=1):
    res = pd.read_csv(filepath_or_buffer=os.path.join(path, name), header=None)
    a = res.values[:, 0]
    b = res.values[:, 1]
    # Split each first-column string into a list of fields
    d = [s.split("_") for s in a]
    # Second column as key, split list as value (a list cannot be a key)
    new = dict(zip(b, d))
    # sorted() returns a list of (key, value) tuples
    new1 = sorted(new.items(), key=lambda x: int(x[1][3]))
    k = []
    j = []
    for key, fields in new1:
        k.append(key)
        j.append("_".join(fields))
    # Zip again to restore the original column order
    result = list(zip(j, k))
    pd1 = pd.DataFrame(data=result)
    pd1.to_csv(os.path.join(result_path, "C00" + str(num) + ".csv"), index=False, header=False)
The results are as follows:
Optimizing the code
Later I realized that the code above is redundant and does not need to be this complicated. Part of the reason was that I was not yet fluent with dictionaries. The optimized code is as follows:
import pandas as pd

# Path of the CSV file
path = r"D:\TestSet\csv\dms\abc.csv"
# Read the CSV file
res = pd.read_csv(path, encoding="gbk", header=None)
# Get the first and second columns
a = res.values[:, 0]
b = res.values[:, 1]
# Pair the first column (key) with the second column (value).
# Note: rows with identical first-column strings collapse into one dict entry.
c = dict(zip(a, b))
# Sort by the fourth "_"-separated field of the key; int() makes the comparison numeric
d = sorted(c.items(), key=lambda x: int(x[0].split("_")[3]))
# Write the sorted result to a new CSV
e = pd.DataFrame(data=d)
e.to_csv(path_or_buf=r"C:\Users\xdjiang6\PycharmProjects\ Log results batch modify annotation set \data\a.csv", index=False, header=False)
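One caveat with the dictionary approach is that rows sharing the same first-column string are merged into one entry. As an alternative sketch, not the author's method, the rows can be sorted as plain lists with the standard-library `csv` module, which keeps duplicates. The sample text and the helper name `sort_csv_rows` below are hypothetical, and an in-memory `io.StringIO` stands in for the real file.

```python
import csv
import io


def sort_csv_rows(rows, field_index=3, sep="_"):
    """Sort rows by the integer in one sep-separated field of column 0."""
    return sorted(rows, key=lambda row: int(row[0].split(sep)[field_index]))


# Hypothetical sample data standing in for the real file,
# including a duplicated first-column value.
sample = (
    "clip_a_b_10_x,ten\n"
    "clip_a_b_2_x,two\n"
    "clip_a_b_2_x,two again\n"
    "clip_a_b_1_x,one\n"
)

rows = list(csv.reader(io.StringIO(sample)))
ordered = sort_csv_rows(rows)

# Write the sorted rows back out as CSV text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(ordered)
result_text = out.getvalue()
print(result_text)
```

Since `sorted` is stable, rows with equal numbers keep their original relative order. For a real file, replacing the `io.StringIO` objects with `open(..., newline="")` handles would follow the same pattern.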