Sorting a CSV file by a number embedded in the strings of one column
2022-07-06 08:19:00 【不求大富大贵只求富可敌国】
**The file looks like this:**
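Sort the rows by the fourth number in the first column. Note that the "Chinese characters" must keep the same order as the video entries in front of them, so each row has to move as a unit.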
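To make the goal concrete, here is a minimal sketch of the sort key on in-memory data. The sample values are invented for illustration (the real file contents are not reproduced here); the only assumption taken from the code in this post is that the first column is split on `_` and the fourth piece is a number.

```python
# Hypothetical rows: (first column, second column).
rows = [
    ("v_2022_a_10_x", "十"),
    ("v_2022_a_9_x", "九"),
    ("v_2022_a_2_x", "二"),
]

def fourth_number(row):
    """Return the fourth underscore-separated field of the first column as an int."""
    return int(row[0].split("_")[3])

# Numeric sort: 2, 9, 10. Each row moves as a unit, so the pairing is preserved.
sorted_rows = sorted(rows, key=fourth_number)

# Without int(), the fields are compared as strings: "10" < "2" < "9".
string_sorted = sorted(rows, key=lambda row: row[0].split("_")[3])
```

The comparison between `sorted_rows` and `string_sorted` shows why the conversion to `int` matters once the numbers have different digit counts.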
**Approach:**
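1. Extract the data.
2. Split each string in the first column into a list.
3. zip the split lists with the second column. Dict keys must be hashable, so they cannot be lists, while values may be any object, including a list. The split first column therefore becomes the value and the second column becomes the key; dict turns the pairs into a dictionary.
4. Sort by value with sorted.
5. Loop over the sorted result with for, join the split pieces back together, append the two fields to two empty lists, then zip them again to swap the two columns back into their original order. (Some readers may wonder why the key-value pairs are not simply inverted instead of going to this trouble. As noted in step 3, a dict key cannot be a list, so inverting the pairs is not an option.)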
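The hashability constraint mentioned in the steps can be checked directly. This is a small sketch with an invented sample string; the tuple variant at the end is an alternative that this post does not use, shown only to illustrate what "hashable" means here.

```python
# Hypothetical first-column value; str.split returns a list.
parts = "v_2022_a_10_x".split("_")

try:
    {parts: "label"}          # a list is unhashable, so this raises TypeError
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A tuple holding the same pieces is hashable and can be used as a key.
by_tuple = {tuple(parts): "label"}
```

This is why the split list is stored as the dictionary value rather than the key.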
**The code:**
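import os
import pandas as pd

# Directory that contains the input csv files
path = r"C:\Users\jam96\PycharmProjects\all_module\pandas_test\a"
# Results go into a "data" folder next to this script
dir_path = os.path.dirname(os.path.abspath(__file__))
result_path = os.path.join(dir_path, "data")
os.makedirs(result_path, exist_ok=True)
files = os.listdir(path)
print(files)
for num, name in enumerate(files, start=1):
    res = pd.read_csv(filepath_or_buffer=os.path.join(path, name), header=None)
    a = res.values[:, 0]
    b = res.values[:, 1]
    # Split every first-column string into a list of pieces
    d = [s.split("_") for s in a]
    # Second column as key, split list as value
    # (rows that share a second-column value collapse into one entry)
    new = dict(zip(b, d))
    # sorted returns a list of (key, value) tuples
    new1 = sorted(new.items(), key=lambda x: int(x[1][3]))
    k = []
    j = []
    for key, parts in new1:
        k.append(key)
        j.append("_".join(parts))
    # zip again to put the two columns back in their original order
    result = list(zip(j, k))
    pd1 = pd.DataFrame(data=result)
    pd1.to_csv(os.path.join(result_path, "C00" + str(num) + ".csv"), index=False, header=False)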
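The output looks like this:
Optimized code
I later realized that the code above is redundant and more complicated than it needs to be, partly because I was not yet fluent with dictionaries. The optimized code is as follows: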
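import pandas as pd

# Path of the csv file
path = r"D:\TestSet\csv\dms\abc.csv"
# Read the csv file
res = pd.read_csv(path, encoding="gbk", header=None)
# Take the first and second columns
a = res.values[:, 0]
b = res.values[:, 1]
# Pair the first column with the second column
# (rows with identical first-column values collapse into one entry)
c = dict(zip(a, b))
# Sort by the fourth field of the key; int() converts it so the comparison is numeric
d = sorted(c.items(), key=lambda x: int(x[0].split("_")[3]))
# Write the sorted result to a new csv
e = pd.DataFrame(data=d)
e.to_csv(path_or_buf=r"C:\Users\xdjiang6\PycharmProjects\日志结果批量修改标注集\data\a.csv", index=False, header=False)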
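One caveat of routing the rows through a dict is that rows sharing the same key are merged. If that matters, the rows can be sorted directly instead. The following is a sketch of that alternative, not the method used above: it relies only on the standard-library `csv` module and on invented in-memory data in place of a real file.

```python
import csv
import io

# Hypothetical file contents; the repeated first-column value is deliberate.
raw = "v_2022_a_10_x,甲\nv_2022_a_2_x,乙\nv_2022_a_10_x,丙\n"

rows = list(csv.reader(io.StringIO(raw)))

# list.sort() is stable, so rows with equal keys keep their original relative order.
rows.sort(key=lambda row: int(row[0].split("_")[3]))

# Write the sorted rows back out as csv text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)
result = out.getvalue()
```

All three rows survive the sort, including the two that share the same first-column string, because no dictionary is involved.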