Sort according to a number in a string in a column of CSV file
2022-07-06 08:24:00 【Don't seek great wealth, just seek wealth to rival the country】
**The file is shown below:**
Sort the rows by the fourth underscore-separated number in the first-column string. (Note that the Chinese text in the second column must stay matched with its original video sequence in the first column.)
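To make the sorting rule concrete, here is a minimal sketch of extracting the fourth field and comparing it as a number. The sample names and the helper `fourth_number` are hypothetical, since the real file contents are not reproduced here. It also shows why the field should be converted with `int`: compared as text, "10" sorts before "2".

```python
# Hypothetical first-column values; the real file is not shown in this post.
names = ["v_a_x_10_end", "v_a_x_2_end", "v_a_x_1_end"]

def fourth_number(name):
    """Return the fourth "_"-separated field of name as an integer."""
    return int(name.split("_")[3])

# Comparing the field as text gives lexicographic order.
as_text = sorted(names, key=lambda s: s.split("_")[3])
# Comparing it as an integer gives numeric order.
as_number = sorted(names, key=fourth_number)

print(as_text)    # ['v_a_x_1_end', 'v_a_x_10_end', 'v_a_x_2_end']
print(as_number)  # ['v_a_x_1_end', 'v_a_x_2_end', 'v_a_x_10_end']
```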
**Solution:**
1. Extract the data.
2. Split each string in the first column into a list.
3. Combine with zip: pair the split lists with the second column. Dictionary keys must be hashable, so a list cannot be a key, while a value can be any object. The split first column is therefore used as the value and the second column as the key, and dict converts the pairs into a dictionary.
4. Sort by value with sorted.
5. Loop over the sorted result with for: join each split list back into a string, append the two columns to two empty lists, then zip them again to restore the original column order. (You may ask why not simply swap key and value instead of going to this trouble. As noted in step 3, a dictionary key cannot be a list, so swapping the key-value pairs does not work.)
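The steps above can be sketched on in-memory data, without reading a file. The sample rows and all variable names below are hypothetical and only illustrate the flow. The `try` block demonstrates the hashability point from step 3.

```python
# Hypothetical sample rows: (first column, second column).
rows = [("cam_01_a_12", "丙"), ("cam_01_a_3", "甲"), ("cam_01_a_7", "乙")]
first = [r[0] for r in rows]
second = [r[1] for r in rows]

# Step 2: split every first-column string into a list.
split_first = [s.split("_") for s in first]

# A list is unhashable, so using it as a dictionary key raises TypeError.
list_key_rejected = False
try:
    {}[split_first[0]] = second[0]
except TypeError:
    list_key_rejected = True

# Step 3: second column as key, split first column as value.
pairs = dict(zip(second, split_first))
# Step 4: sort by the fourth field of the value, compared as an integer.
ordered = sorted(pairs.items(), key=lambda kv: int(kv[1][3]))
# Step 5: join the pieces back together and restore the column order.
col1 = ["_".join(parts) for _, parts in ordered]
col2 = [label for label, _ in ordered]
result = list(zip(col1, col2))
print(result)
```

Note that this relies on the second-column values being unique: a dictionary keeps only one entry per key, so repeated values would overwrite each other.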
**The specific code is as follows:**
import os
import pandas as pd

# Folder containing the source CSV files
path = r"C:\Users\jam96\PycharmProjects\all_module\pandas_test\a"
# Results go into a "data" folder next to this script
dir_path = os.path.dirname(os.path.abspath(__file__))
result_path = os.path.join(dir_path, "data")
if not os.path.exists(result_path):
    os.mkdir(result_path)
files = os.listdir(path)
print(files)
num = 1
for name in files:
    res = pd.read_csv(filepath_or_buffer=os.path.join(path, name), header=None)
    a = res.values[:, 0]
    b = res.values[:, 1]
    # Split every first-column string on "_"
    d = []
    for s in a:
        d.append(s.split("_"))
    # Second column as key, split first column as value
    new = dict(zip(b, d))
    # sorted() returns a list of (key, value) tuples
    new1 = sorted(new.items(), key=lambda x: int(x[1][3]))
    k = []
    j = []
    for item in new1:
        k.append(item[0])
        j.append("_".join(item[1]))
    # Restore the original column order: first column, then second
    result = list(zip(j, k))
    pd1 = pd.DataFrame(data=result)
    pd1.to_csv(result_path + os.path.sep + "C00" + str(num) + ".csv", index=False, header=False)
    num += 1
The results are as follows:
Optimizing the code
Later I found that the code above is redundant and does not need to be this complicated. It came partly from my not being fluent with dictionaries. The optimized code is as follows:
import pandas as pd

# Path to the CSV file
path = r"D:\TestSet\csv\dms\abc.csv"
# Read the CSV file
res = pd.read_csv(path, encoding="gbk", header=None)
# Get the first and second columns
a = res.values[:, 0]
b = res.values[:, 1]
# Pair the first-column data with the second-column data
c = dict(zip(a, b))
# Sort by the fourth "_"-separated item of the key; int() casts it to an integer so the comparison is numeric
d = sorted(c.items(), key=lambda x: int(x[0].split("_")[3]))
# Write the sorted result to a new CSV
e = pd.DataFrame(data=d)
e.to_csv(path_or_buf=r"C:\Users\xdjiang6\PycharmProjects\ Log results batch modify annotation set \data\a.csv", index=False, header=False)
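Both versions above go through a dictionary, which keeps only one entry per key, so rows that share a key value would collapse into one. As a possible alternative (not the method used in this post), the rows can be sorted directly. The sketch below swaps pandas for the standard-library `csv` module and uses hypothetical in-memory CSV text in place of the real file.

```python
import csv
import io

# Hypothetical CSV content; note the repeated second-column value "A".
csv_text = "cam_01_a_12,C\ncam_01_a_3,A\ncam_01_a_7,A\n"

# Read all rows as lists of strings.
rows = list(csv.reader(io.StringIO(csv_text)))
# Sort by the fourth "_"-separated field of the first column, as an integer.
rows.sort(key=lambda row: int(row[0].split("_")[3]))

# Write the sorted rows back out as CSV text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)
print(out.getvalue())
```

Because each row is sorted as a whole, the two columns stay paired and no row is dropped, even when values repeat.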