S series · add data to the text file without adding duplicate values
2022-06-10 04:35:00 【Python advanced】
S stands for "water", and can also be read as Small. In day-to-day work and study I occasionally come across small, interesting operations that I have not seen before. They may not matter much for the problem at hand, but I still want to record them. They may not amount to a full article on their own, so I write them down as a running log.
Series article description :
S series ·<< Article title >>
Platform:
Windows 10.0
Python 3.8
Purpose
Add a new column of data to a text file (which may not exist yet) so that the data in the file contains no duplicates.
Approach
The text file does not exist
data = ['243', '122', '782', '577', '478', '334', '334', '738', '122', '112', '634']
The data to be saved is shown above. The strings '122' and '334' each appear twice, so the list can be deduplicated before saving.
data2 = list(set(data))
The line above uses set to remove duplicates. A set does not preserve order, so if the original order needs to be kept, the result can be sorted by each item's first position in the original list, as follows. Alternatively, you can write your own deduplication function.
data2.sort(key=data.index)
Since the file does not exist yet, it can be opened with mode='w', and encoding='utf-8' can be specified for saving.
with open('test.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(data2) + '\n')
Text file exists
Suppose test.txt already exists with the content written above, and the new data below needs to be added to it.
new = ['243', '122', '989', '989', '577', '159']
Two problems need to be solved here:
The new data itself needs to be deduplicated
The new data also needs to be deduplicated against the existing data
Looking at the data, '989' is duplicated within the new list, while '243', '122' and '577' already exist in the file. So only a single '989' and '159' should be added to test.txt.
new = ['243', '122', '989', '989', '577', '159']
with open('test.txt', encoding='utf-8') as f:
    data_list = []
    r_data = f.readline()
    while r_data:  # readline() returns '' only at the end of the file
        if r_data.strip():  # skip blank lines instead of stopping at them
            data_list.append(r_data.strip())
        r_data = f.readline()
new2 = list(set(new).difference(set(data_list)))
new2.sort(key=new.index)  # restore the order of first appearance in new
with open('test.txt', 'a', encoding='utf-8') as f:
    if new2:  # write nothing (not even an empty line) when there is nothing new
        f.write('\n'.join(new2) + '\n')
The first with open reads the existing contents of the text file. It reads line by line, which reduces memory usage, and strips the newline character from each line as it is read. Reading the whole file at once would still require handling the newline characters, so line-by-line reading is preferred. Blank lines are skipped rather than treated as the end of the file. The data to be added is then compared with the existing data, keeping only the items that appear only in new. The second with open uses mode='a' to append those items to the file, and writes nothing when there is nothing new.
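The set difference followed by sort(key=new.index) works, but list.index rescans the list for every element. As an alternative, here is a minimal sketch of a single-pass filter that handles both problems (duplicates within the new data, and duplicates against the existing data) while keeping the original order. The function name filter_new is made up for this illustration and is not part of the original code.

```python
def filter_new(new, existing):
    """Return the items of `new` that are not in `existing`,
    without duplicates, in order of first appearance."""
    seen = set(existing)          # values already in the file
    result = []
    for item in new:
        if item not in seen:      # neither in the file nor seen earlier in `new`
            seen.add(item)
            result.append(item)
    return result


existing = ['243', '122', '782', '577', '478', '334', '738', '112', '634']
new = ['243', '122', '989', '989', '577', '159']
print(filter_new(new, existing))  # ['989', '159']
```

Each membership test against a set takes roughly constant time, so the whole filter is linear in the length of new.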
Using two separate open calls, one to read and one to write, is a little cumbersome. The mode can be set to a+ instead, which also allows the data to be read.
new = ['243', '122', '989', '989', '577', '159', '777']
with open('test.txt', 'a+', encoding='utf-8') as f:
    f.seek(0)  # Place the file cursor at the beginning of the file
    data_list = []
    r_data = f.readline()
    while r_data:  # readline() returns '' only at the end of the file
        if r_data.strip():  # skip blank lines instead of stopping at them
            data_list.append(r_data.strip())
        r_data = f.readline()
    new2 = list(set(new).difference(set(data_list)))
    new2.sort(key=new.index)  # restore the order of first appearance in new
    if new2:  # write nothing (not even an empty line) when there is nothing new
        f.write('\n'.join(new2) + '\n')
Compared with the previous method, the number of open calls drops to one, and the compare-and-deduplicate code has to move inside the with open block. When a file is opened in 'a' mode, the cursor is placed at the end of the text by default, so new data is added at the end. Each read moves the cursor forward, and repeated readline calls likewise advance it to the end of the file. Because the cursor starts at the end in a+ mode, reading immediately returns none of the existing data. seek must be used first to place the cursor at the beginning of the text. After that, reading returns the existing data as intended, and the data to be added can be compared against it, deduplicated, and written.
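The cursor behaviour described above can be checked with a small experiment. This is only a sketch: the scratch file demo.txt and the temporary directory are assumptions made for the demonstration, not part of the original workflow.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('243\n122\n')

with open(path, 'a+', encoding='utf-8') as f:
    at_end = f.read()        # cursor starts at the end, so nothing is read
    f.seek(0)                # move the cursor back to the beginning
    from_start = f.read()    # now the existing content is returned
    f.write('989\n')         # the cursor is at the end again, so this appends

with open(path, encoding='utf-8') as f:
    final = f.read()

print(repr(at_end))      # ''
print(repr(from_start))  # '243\n122\n'
print(repr(final))       # '243\n122\n989\n'
```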
Summary
This note showed how to add data to an existing text file without introducing duplicates. It started with the case where the file does not exist and built up step by step, ending with a+ mode, which also covers the case where the file does not exist: set the cursor to the beginning, read the existing data, then deduplicate. Other factors were not taken into account, and the code design may contain mistakes.
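Putting the pieces together, the a+ approach could be wrapped in one reusable function. This is a sketch under assumptions rather than the author's code: the name append_unique, its return value, and the temporary file used in the example are all made up for illustration. It relies on dict.fromkeys, which keeps insertion order in Python 3.7 and later, to deduplicate the new items.

```python
import os
import tempfile


def append_unique(path, items, encoding='utf-8'):
    """Append to `path` the items it does not contain yet (hypothetical helper).

    The file is created if it does not exist. Returns the list of items written.
    """
    with open(path, 'a+', encoding=encoding) as f:
        f.seek(0)  # a+ starts at the end of the file; rewind to read it
        seen = {line.strip() for line in f if line.strip()}
        # dict.fromkeys drops duplicates within `items` and keeps their order
        added = [item for item in dict.fromkeys(items) if item not in seen]
        if added:  # avoid writing an empty line when nothing is new
            f.write('\n'.join(added) + '\n')
    return added


# Example with a scratch file that does not exist beforehand
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
data = ['243', '122', '782', '577', '478', '334', '334', '738', '122', '112', '634']
new = ['243', '122', '989', '989', '577', '159']

first = append_unique(path, data)   # file is created, duplicates dropped
second = append_unique(path, new)   # only '989' and '159' are appended
third = append_unique(path, new)    # nothing left to add
print(first, second, third)
```

Calling the function repeatedly with the same data leaves the file unchanged, since every item is already present on the second call.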
Like a fierce Falcon , wild and intractable .
Written 2022.6.7