Sorting a CSV file by a number embedded in a string column
2022-07-06 08:24:00 【Don't seek great wealth, just seek wealth to rival the country】
**The file is shown below:**

Sort the rows by the fourth number in the first-column string. (Note that the Chinese text in the second column must stay paired with the video sequence that precedes it in the first column.)

**Solution:**
1. Read the data.
2. Split each string in the first column into a list.
3. Combine the split lists with the second column using zip, then convert the pairs to a dictionary with dict. Dictionary keys must be hashable, so a list cannot be a key, while a value can be any object. The split first column therefore becomes the value and the second column becomes the key.
4. Sort by value using sorted.
5. Loop over the sorted result with for: join each split list back into a string, append the joined strings and the keys to two separate empty lists, then zip the two lists again to restore the original column order. (You may ask why not simply swap the key-value pairs instead of going to this trouble. As noted in step 3, a dictionary key cannot be a list, so swapping the pairs does not work.)
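The steps above can be sketched on a small in-memory sample. The rows below are made up for illustration (the real file is not reproduced here), and `sort_rows` is a hypothetical helper name, not something from the original script. The sketch also assumes the second-column values are unique, since they are used as dictionary keys.

```python
# Hypothetical sample rows: (first column, second column).
rows = [
    ("video_a_b_10_x", "ten"),
    ("video_a_b_2_x", "two"),
    ("video_a_b_1_x", "one"),
]


def sort_rows(rows):
    """Sort (name, label) pairs by the fourth "_"-separated field of name."""
    names = [name for name, _ in rows]
    labels = [label for _, label in rows]
    # Step 2: split each first-column string into a list of parts.
    parts = [name.split("_") for name in names]
    # Step 3: a list is unhashable and cannot be a dict key, so the
    # second column becomes the key and the split list the value.
    # Assumption: labels are unique; duplicates would overwrite each other.
    mapping = dict(zip(labels, parts))
    # Step 4: sort by the fourth part, compared as an integer.
    ordered = sorted(mapping.items(), key=lambda kv: int(kv[1][3]))
    # Step 5: re-join the parts and restore the original column order.
    return [("_".join(p), label) for label, p in ordered]


# Shows why the split list cannot serve as the key.
try:
    {["video", "a"]: "one"}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

sorted_rows = sort_rows(rows)
print(sorted_rows)
```

Casting with `int` matters here: compared as strings, `"10"` would sort before `"2"`.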

**The specific code is as follows:**

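import os
import pandas as pd

# Directory containing the input CSV files
path = r"C:\Users\jam96\PycharmProjects\all_module\pandas_test\a"
# Write results to a "data" directory next to this script
dir_path = os.path.dirname(os.path.abspath(__file__))
result_path = os.path.join(dir_path, "data")
if not os.path.exists(result_path):
    os.mkdir(result_path)

files = os.listdir(path)
print(files)
num = 1
for name in files:
    res = pd.read_csv(filepath_or_buffer=os.path.join(path, name), header=None)
    a = res.values[:, 0]
    b = res.values[:, 1]
    # Split each first-column string on "_" into a list of parts
    d = []
    for s in a:
        d.append(s.split("_"))
    # Second column as key, split first column as value
    new = dict(zip(b, d))
    # sorted() returns a list of (key, value) tuples
    new1 = sorted(new.items(), key=lambda x: int(x[1][3]))
    k = []
    j = []
    for item in new1:
        k.append(item[0])
        j.append("_".join(item[1]))
    # zip again so the first column comes first, as in the input
    result = list(zip(j, k))
    pd1 = pd.DataFrame(data=result)
    # Output files are named C001.csv, C002.csv, ...
    pd1.to_csv(os.path.join(result_path, "C00" + str(num) + ".csv"), index=False, header=False)
    num += 1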

The operation results are as follows:

Optimize the code

Later I found that the code above is redundant and does not need to be so complicated. The complexity was partly due to my lack of proficiency with dictionaries. The optimized code is as follows:

import pandas as pd

# Path to the CSV file
path = r"D:\TestSet\csv\dms\abc.csv"
# Read the CSV file (GBK-encoded, no header row)
res = pd.read_csv(path, encoding="gbk", header=None)
# Get the first and second columns
a = res.values[:, 0]
b = res.values[:, 1]
# Pair each first-column string (key) with its second-column value;
# a string is hashable, so it can be used as the key directly
c = dict(zip(a, b))
# Sort by the fourth "_"-separated item of the key; int() makes the comparison numeric
d = sorted(c.items(), key=lambda x: int(x[0].split("_")[3]))
# Write the sorted result to a new CSV
e = pd.DataFrame(data=d)
e.to_csv(path_or_buf=r"C:\Users\xdjiang6\PycharmProjects\ Log results batch modify annotation set \data\a.csv", index=False, header=False)
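
A possible further simplification, offered as a sketch rather than as part of the code above: the sort can stay inside pandas by passing a `key` function to `DataFrame.sort_values` (available in pandas 1.1 and later). This avoids the dictionary, so rows with identical first-column strings are not collapsed. The sample data is made up, the layout (a numeric fourth `_`-separated field in column 0) is assumed from the description, and `sort_by_fourth_field` is a hypothetical helper name.

```python
import io

import pandas as pd

# Made-up CSV content standing in for the real file.
sample = io.StringIO(
    "clip_a_b_10_x,ten\n"
    "clip_a_b_2_x,two\n"
    "clip_a_b_1_x,one\n"
)


def sort_by_fourth_field(df):
    """Return df sorted by the integer value of the 4th "_" field of column 0."""
    return df.sort_values(
        by=0,
        # key receives the whole column as a Series and must return a Series.
        key=lambda col: col.str.split("_").str[3].astype(int),
        kind="mergesort",  # stable: rows with equal numbers keep their order
    ).reset_index(drop=True)


df = pd.read_csv(sample, header=None)
result = sort_by_fourth_field(df)
print(result)
# To write the result without an index or header row:
# result.to_csv("sorted.csv", index=False, header=False)
```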