Sorting a CSV file by a number embedded in a string column
2022-07-06 08:24:00 【Don't seek great wealth, just seek wealth to rival the country】
**The file is shown below:**
Sort the rows by the fourth number in the first-column string. Note that the "Chinese characters" column must stay matched to its original video sequence in the first column.
**Solution:**
1. Extract the data.
2. Split each string in the first column into a list.
3. Combine with zip: pair the split lists with the second column. Dictionary keys must be hashable, so a list cannot be a key, while a value can be any object. The split first column therefore becomes the value and the second column becomes the key; dict() converts the pairs into a dictionary.
4. Sort by value using sorted.
5. Loop over the sorted result with for, join each split list back into a string, append the strings and the keys to two empty lists, then zip the lists again to restore the original column order. (A possible question: why not simply reverse the key-value pairs instead of going to this trouble? As noted in step 3, a dictionary key cannot be a list, so reversing the pairs does not work.)
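The five steps above can be sketched on in-memory data before touching any file. This is only an illustration: the sample rows and the names `rows`, `parts`, `mapping`, and `result` are hypothetical, since the real file contents are not shown here.

```python
# Hypothetical rows standing in for the CSV: (first column, second column).
rows = [
    ("1_1_1_12", "乙"),
    ("1_1_1_3", "甲"),
    ("1_1_1_25", "丙"),
]

# Step 1: extract the two columns.
first = [r[0] for r in rows]
second = [r[1] for r in rows]

# Step 2: split every first-column string into a list of fields.
parts = [name.split("_") for name in first]

# Step 3: second column as key (hashable), split list as value.
mapping = dict(zip(second, parts))

# Step 4: sort by the fourth field of the value, compared as an integer.
ordered = sorted(mapping.items(), key=lambda kv: int(kv[1][3]))

# Step 5: join the fields back together and restore the column order.
result = [("_".join(fields), label) for label, fields in ordered]

for line in result:
    print(line)
```

The `int()` call matters: compared as strings, "12" would sort before "3".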
**The code is as follows:**
import os
import pandas as pd

# Folder that holds the source CSV files
path = r"C:\Users\jam96\PycharmProjects\all_module\pandas_test\a"
# Results go into a "data" folder next to this script
dir_path = os.path.dirname(os.path.abspath(__file__))
result_path = os.path.join(dir_path, "data")
if not os.path.exists(result_path):
    os.mkdir(result_path)
files = os.listdir(path)
print(files)
num = 1
for file_name in files:
    res = pd.read_csv(filepath_or_buffer=os.path.join(path, file_name), header=None)
    a = res.values[:, 0]
    b = res.values[:, 1]
    # Split each first-column string into a list of fields
    d = []
    for name in a:
        d.append(name.split("_"))
    # Second column as key, split list as value
    new = dict(zip(b, d))
    # sorted() returns a list of (key, value) tuples
    new1 = sorted(new.items(), key=lambda x: int(x[1][3]))
    k = []
    j = []
    for key, value in new1:
        k.append(key)
        j.append("_".join(value))
    # zip again so the first column comes first, as in the source file
    result = list(zip(j, k))
    pd1 = pd.DataFrame(data=result)
    pd1.to_csv(os.path.join(result_path, "C00" + str(num) + ".csv"), index=False, header=False)
    num += 1
The results are as follows:
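Optimizing the code
Later I found that the code above is redundant. It does not need to be that complicated; the extra steps came from my lack of fluency with dictionaries. Because the split can happen inside the sort key, the first column can stay a plain string and serve directly as the dictionary key. The optimized code is as follows: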
import pandas as pd

# Path of the CSV file
path = r"D:\TestSet\csv\dms\abc.csv"
# Read the CSV file
res = pd.read_csv(path, encoding="gbk", header=None)
# Get the first and second columns
a = res.values[:, 0]
b = res.values[:, 1]
# Pair the first column (key) with the second column (value)
c = dict(zip(a, b))
# Sort by the fourth field of the key; int() casts it so the comparison is numeric
d = sorted(c.items(), key=lambda x: int(x[0].split("_")[3]))
# Write the sorted result to a new CSV file
e = pd.DataFrame(data=d)
e.to_csv(path_or_buf=r"C:\Users\xdjiang6\PycharmProjects\ Log results batch modify annotation set \data\a.csv", index=False, header=False)
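One caveat with both versions: `dict(zip(...))` keeps only the last value for each key, so rows that share a key are silently dropped. If that could happen in your data, a possible alternative is to sort the rows directly and skip the dictionary. The sketch below swaps pandas for the standard-library `csv` module and reads from an in-memory string; the sample content and the helper names `sort_rows` and `to_csv_text` are hypothetical.

```python
import csv
import io

# Hypothetical CSV content; note the repeated first-column value.
sample = "1_1_1_12,乙\n1_1_1_3,甲\n1_1_1_12,丙\n"


def sort_rows(text):
    """Parse CSV text and sort rows by the fourth '_' field of column one."""
    rows = list(csv.reader(io.StringIO(text)))
    # list.sort is stable, so rows with equal numbers keep their file order.
    rows.sort(key=lambda row: int(row[0].split("_")[3]))
    return rows


def to_csv_text(rows):
    """Serialize rows back to CSV text with '\n' line endings."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


sorted_rows = sort_rows(sample)
print(to_csv_text(sorted_rows))
```

For a real file, the same `sort_rows` logic would apply to rows read with `open(path, newline="", encoding=...)`; the encoding depends on the file, as with the `gbk` setting above.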