Sort a CSV file by a number inside a string column
2022-07-06 08:24:00 【Don't seek great wealth, just seek wealth to rival the country】
**The file is shown below:**
Sort the rows by the fourth number in the first-column string (note that the "Chinese characters" column must stay matched to its video sequence in the first column).
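To make the sort key concrete, here is a minimal sketch. The sample strings are hypothetical, since the real file contents are not reproduced here, and `fourth_number` is a helper name chosen for illustration. It assumes the fields are separated by underscores, as in the code later in this post, and shows why the field is converted to an integer before comparing.

```python
# Hypothetical first-column values; the real file is not shown here.
names = ["a_1_1_10_x", "a_1_1_9_x", "a_1_1_2_x"]

def fourth_number(name):
    """Return the fourth underscore-separated field as an integer."""
    return int(name.split("_")[3])

# Numeric comparison: 2 < 9 < 10
by_number = sorted(names, key=fourth_number)

# Text comparison of the same field: "10" < "2" < "9"
by_text = sorted(names, key=lambda name: name.split("_")[3])

print(by_number)
print(by_text)
```

Without the integer conversion, "10" sorts before "2", which is usually not the order you want for numbered sequences.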
**Solution:**
1. Extract the data.
2. Split each string in the first column into a list.
3. Combine with zip: pair the split lists with the second column. A dictionary key must be hashable and therefore cannot be a list, while a value can be any object, so use the second column as the key and the split first column as the value, then convert the pairs to a dictionary with dict.
4. Sort with sorted, using the value as the sort basis.
5. Loop over the sorted result with for: join each split list back into a string, append the two columns to two empty lists, and zip them again to restore the original column order. (You may ask why not simply swap the keys and values instead of going to this trouble. The first column was split into lists in step 2, and a dictionary key cannot be a list, so swapping the key-value pairs does not work.)
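The restriction mentioned in steps 3 and 5 can be checked with a short sketch. The rows are made-up sample data, and the tuple-based variant is an illustration of an alternative, not the approach this post uses: a list fails as a dictionary key, while a tuple built from the same parts is hashable.

```python
# Hypothetical (first column, second column) rows.
rows = [("a_1_1_10_x", "label C"), ("a_1_1_2_x", "label A")]

# A list cannot be a dictionary key because lists are not hashable.
try:
    {["a", "1"]: "label"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A tuple is hashable, so the split parts can be a key once converted.
by_parts = {tuple(name.split("_")): label for name, label in rows}

# Sort by the fourth part of the key, then join the parts back together.
ordered = sorted(by_parts.items(), key=lambda item: int(item[0][3]))
restored = [("_".join(parts), label) for parts, label in ordered]

print(list_key_ok)
print(restored)
```

With tuple keys the columns never need to be swapped and swapped back, at the cost of one extra conversion per row.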
**The code is as follows:**
import os
import pandas as pd

# Folder that holds the input CSV files
path=r"C:\Users\jam96\PycharmProjects\all_module\pandas_test\a"
# Write the results to a "data" folder next to this script
dir_path=os.path.dirname(os.path.abspath(__file__))
result_path=os.path.join(dir_path,"data")
if not os.path.exists(result_path):
    os.mkdir(result_path)
files=os.listdir(path)
print(files)
num=1
for file_name in files:
    res=pd.read_csv(filepath_or_buffer=os.path.join(path,file_name),header=None)
    a=res.values[:,0]
    b=res.values[:,1]
    # Split every first-column string on "_"
    d=[]
    for text in a:
        c=text.split("_")
        d.append(c)
    # Second column as key, split first column as value
    new=dict(zip(b,d))
    # sorted returns a list of (key, value) tuples
    new1=sorted(new.items(),key=lambda x:int(x[1][3]))
    k=[]
    j=[]
    for item in new1:
        k.append(item[0])
        j.append("_".join(item[1]))
    # Restore the original column order: first column, then second column
    result=list(zip(j,k))
    pd1=pd.DataFrame(data=result)
    pd1.to_csv(result_path+os.path.sep+"C00"+str(num)+".csv",index=False,header=False)
    num+=1
The result of running the code is as follows:

Optimized code
Later I found that the code above is redundant and does not need to be this complicated. The complexity came partly from my lack of proficiency with dictionaries. The optimized code is as follows:
import pandas as pd
# Path to the CSV file
path=r"D:\TestSet\csv\dms\abc.csv"
# Read the CSV file
res=pd.read_csv(path,encoding="gbk",header=None)
# Get the first and second columns
a=res.values[:,0]
b=res.values[:,1]
# Pair each first-column value (key) with its second-column value
c=dict(zip(a,b))
# Sort by the fourth item of the key; int() casts it to an integer so the comparison is numeric
d=sorted(c.items(),key=lambda x:int(x[0].split("_")[3]))
# Write the sorted result to a new CSV file
e=pd.DataFrame(data=d)
e.to_csv(path_or_buf=r"C:\Users\xdjiang6\PycharmProjects\ Log results batch modify annotation set \data\a.csv",index=False,header=False)
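One thing to keep in mind with the dictionary approach is that rows with an identical first-column string collapse into a single entry. As a rough sketch of an alternative, and not the method used above, the rows can be sorted directly without a dictionary. The example below uses only the standard-library csv module on hypothetical in-memory data; `sample` and `sort_rows` are names invented for this illustration.

```python
import csv
import io

# Hypothetical file contents standing in for the real CSV.
sample = "a_1_1_10_x,label C\na_1_1_2_x,label A\na_1_1_2_x,label B\n"

def sort_rows(text):
    """Sort CSV rows by the fourth underscore-separated number in column one."""
    rows = list(csv.reader(io.StringIO(text)))
    return sorted(rows, key=lambda row: int(row[0].split("_")[3]))

sorted_rows = sort_rows(sample)

# Write the sorted rows back out as CSV text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(sorted_rows)
print(out.getvalue())
```

Because sorted is stable, rows that share the same number keep their original relative order, and duplicated first-column strings are all preserved.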