51 lines of code: a homemade TXT to MySQL tool!
2022-06-29 05:34:00 【cousin】
1. Preface
Hello everyone, I'm Cousin. While browsing Bilibili this morning, I noticed the home page recommending a video from an uploader I recently followed (@ Is my _ Is my _ Is my, referred to below as "Little yes" for convenience). I opened it, and that is how this story began.
(Screenshot used with permission.)
The video covers a task Little yes was given by a teacher: read a txt file and store its contents in MySQL. Coincidentally, two days ago I helped a reader write an Excel to SQL Server tool. At the end of the video, Little yes pointed out two problems with the current Java version:
- It can only read string-type data (I don't fully understand this one; it may refer to the format of the file being read or to the field types in the file).
- The input file and the database configuration cannot be changed dynamically (this could be solved by writing a GUI, or simply a terminal program).
I thought about it and decided I could do this. I shared my approach, staked my claim on the task, and what follows is my implementation.
2. Getting hands-on
All of the source code, the environment, and the test files for this project are open source. If you don't want to follow the implementation, skip ahead to Part 3, which covers using the program directly.
2.0 Environment setup
My setup:
- Python 3.10
- Third-party packages and their versions:
pandas==1.3.5
PyMySQL==1.0.2
SQLAlchemy==1.4.30
PySimpleGUI==4.56.0
To keep the project environment manageable, I usually create and manage virtual environments with pipenv. If you're interested, see the basic pipenv tutorial I wrote earlier.
pipenv install  # create the virtual environment
pipenv shell  # enter the virtual environment
pip install pandas PyMySQL SQLAlchemy PySimpleGUI  # install the required packages inside the virtual environment
exit  # leave the virtual environment (closing the cmd window also works)
2.1 Reading the data
Looking at the sample data, we find two kinds of separators: spaces and tabs (\t). We therefore need to specify both separators when reading. The file also has no header row, so to make processing and storage easier it is best to add one whose column names match the database field names.
The code is as follows:
import pandas as pd
'''
read_csv parameters:
1. The path of the file to read
2. sep: the separator; use | to specify several alternative separators
3. header=None: the file has no header row (by default the first row is used as the header)
4. engine: the parser engine; regular-expression separators are handled by the 'python' engine
'''
data = pd.read_csv('./resources/ctd2020-09-27.txt', sep=' |\t', header=None, engine='python')
data
As you can see, reading the file directly produces two columns that are entirely NaN. This happens because some fields are separated by two spaces. That's not a problem: we drop every column that is all NaN, and once the data reads correctly we add the header. The implementation is as follows:
# Read the file
def get_txt_data(filepath):
    columns = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N']
    data = pd.read_csv(filepath, sep=' |\t', header=None, engine='python')
    # Drop columns whose values are all NaN (harmless if there are no such columns)
    data.dropna(axis=1, how='all', inplace=True)
    # Set the column names
    data.columns = columns
    return data
get_txt_data('./resources/ctd2020-09-27.txt')
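The all-NaN columns come from the separator pattern itself. The sketch below uses Python's re module to illustrate the splitting behaviour (it is not the code pandas runs internally, and the sample line is made up): ' |\t' produces an empty field between two consecutive spaces, while r'\s+' treats any run of whitespace as a single separator.

```python
import re

# A made-up line with a double space and a tab, similar in shape to the sample data
line = "1.5  2.5\t3.5"

# ' |\t' matches exactly one space or one tab, so two spaces give an empty field
fields_single = re.split(" |\t", line)

# r'\s+' matches a whole run of whitespace, so no empty field appears
fields_run = re.split(r"\s+", line)

print(fields_single)  # ['1.5', '', '2.5', '3.5']
print(fields_run)     # ['1.5', '2.5', '3.5']
```

Passing sep=r'\s+' to pd.read_csv may therefore be an alternative to dropping the empty columns afterwards. Whether it suits the real file depends on whether any fields are legitimately empty, so treat it as something to try rather than a guaranteed fix.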
2.2 Data processing
Little yes did not say what data processing is needed (beyond the reading problems mentioned above), so here we simply drop duplicate rows. My next post, on the Excel to SQL Server tool, will cover more data processing (for example, deduplicating by specified fields and converting date formats).
# Data processing
def process_data(data):
    # No columns need special handling, so just deduplicate before storing in the database
    data.drop_duplicates(inplace=True)
    return data
2.3 Storing the data
Because we want to store the data in MySQL, we first need to connect to the database. Here I use sqlalchemy + pymysql to connect to MySQL. The code is as follows:
from sqlalchemy import create_engine
# Connect to the database
def link_mysql(user, password, database):
    # create_engine("dialect+driver://username:password@host:port/database", other parameters)
    engine = create_engine(f'mysql+pymysql://{user}:{password}@localhost:3306/{database}?charset=utf8')
    return engine
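One thing to watch with a URL built by string formatting: a password containing characters such as @ or / would break the URL unless it is escaped first. A minimal sketch, assuming a hypothetical helper named build_mysql_url (not part of the original project) and a made-up password, using the standard library's quote_plus:

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, database, host="localhost", port=3306):
    """Hypothetical helper: build a mysql+pymysql URL with escaped credentials."""
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?charset=utf8"
    )


# Made-up password containing characters that are special inside a URL
url = build_mysql_url("root", "p@ss/word", "sql_study")
print(url)  # mysql+pymysql://root:p%40ss%2Fword@localhost:3306/sql_study?charset=utf8
```

The resulting string can be passed to create_engine in place of the f-string above.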
Then pandas' to_sql function stores DataFrame data in the database quickly and simply. If you're interested, see my earlier article on reading and storing data with Python (about 6,000 words covering a range of methods), which compares the insertion speed of using pymysql directly with that of pandas' to_sql. The description there may not be entirely accurate, so corrections are welcome.
import time
# Store the data
def data_to_sql(data, user='root', password='Zjh!1997', database='sql_study', table='ctd'):
    engine = link_mysql(user, password, database)
    # Call pandas' to_sql to store the data
    t1 = time.time()  # timestamp, in seconds
    print('Data insertion start time: {0}'.format(t1))
    # 1st argument: table name
    # 2nd argument: database connection engine
    # index=False: do not store the DataFrame index
    # if_exists='append': if the table already exists, append the data
    data.to_sql(table, engine, index=False, if_exists='append')
    t2 = time.time()  # timestamp, in seconds
    print('Data insertion end time: {0}'.format(t2))
    print('Successfully inserted %d rows,' % len(data), 'time taken: %.5f seconds.' % (t2 - t1))
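For the timing itself, time.perf_counter is generally a better fit than time.time when measuring short intervals, because it is not affected by system clock adjustments. A small sketch, assuming a hypothetical context manager named timed (not part of the original project):

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Hypothetical helper: print how long the enclosed block took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.5f} seconds")


# Stand-in workload; in the real program this would wrap the data.to_sql(...) call
with timed("insert"):
    sum(range(1000))
```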
Finally, we write a wrapper function that ties the steps above together:
# Store a text file in MySQL
def txt_to_sql(filepath, user='root', password='Zjh!1997', database='sql_study', table='ctd'):
    # Read the file
    data = get_txt_data(filepath)
    # Process the data
    data = process_data(data)
    # Store the data
    data_to_sql(data, user, password, database, table)
2.4 Testing the function
filepath = './resources/ctd2020-09-27.txt'
# Only the file path is specified; the other parameters use their default values, which is convenient for testing
txt_to_sql(filepath)
Then we can write a function that reads from the database to check that the data was really stored:
# Read data from the database
def read_mysql(user='root', password='Zjh!1997', database='sql_study', table='ctd'):
    engine = link_mysql(user, password, database)
    # SQL statement to run
    sql = f'select * from {table} limit 3'
    # 1st argument: the SQL query
    # 2nd argument: the database connection engine
    pd_read_sql = pd.read_sql(sql, engine)
    return pd_read_sql
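Because the table name is interpolated directly into the SQL string, and will later come from a text box in the GUI, it may be worth checking it first. A minimal sketch, assuming a hypothetical helper named safe_table_name (not part of the original project) that accepts only plain identifiers:

```python
import re

# Conservative pattern: letters, digits, and underscores, not starting with a digit
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_table_name(table):
    """Hypothetical helper: return the name unchanged if it is a plain identifier."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")
    return table


sql = f"select * from {safe_table_name('ctd')} limit 3"
print(sql)  # select * from ctd limit 3
```

The pattern is deliberately stricter than what MySQL allows, so a legitimate but unusual table name would be rejected; loosen it only if needed.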
Call the function to check what was stored, and that's it.
At this point the functional part of the program is done. The next step is to write a GUI to make the program easier to use, for example for entering the file path and the database parameters.
2.5 Writing a GUI
Here we use the PySimpleGUI package. As the name suggests, it really is simple; under the hood it is built on tkinter, which ships with Python.
- Import the required packages
# Write the GUI
import PySimpleGUI as sg
# Import the data storage function
from txt_to_sql import txt_to_sql
- Write the GUI layout
# Set up the GUI layout
# Parameters: key is the key under which the input value is returned; target is the element in which the selected value is displayed
# default_text: the default value of the input box
layout = [
    [sg.Text('Read the contents of the specified file, process them, and store them in the specified database table~')],
    [sg.FileBrowse('Click to select a file', key='filepath', target='file'), sg.Text(key='file')],
    [sg.Text('Login username'), sg.InputText(key='user', default_text='root')],
    [sg.Text('Login password'), sg.InputText(key='password', default_text='Zjh!1997')],
    [sg.Text('Database name'), sg.InputText(key='database', default_text='sql_study')],
    [sg.Text('Table name'), sg.InputText(key='table', default_text='ctd')],
    [sg.Button('Start processing'), sg.Button('Exit')]
]
- Create the program window and the business logic
# Create the window
window = sg.Window('Txt To MySQL', layout, default_element_size=(100,))
while True:
    event, values = window.read()  # get the event and the input values
    # print(event)
    if event == 'Start processing':
        # Pass the input values to the data handling function
        txt_to_sql(values['filepath'], values['user'], values['password'], values['database'], values['table'])
    else:
        # event in (None, 'Exit'): the window was closed or Exit was clicked, so shut down the program
        break
window.close()
- The result
In the layout section, layout is a list whose elements are themselves lists, each representing one row of the window. The commonly used layout elements are Text (text display), InputText (input box), Button (ordinary button), and FileBrowse (single file selection).
In the window creation section, the main setting is the default element size, default_element_size. Only the width needs to be set; the height adapts to the layout elements. Getting the input values is also simple: just call read, which returns the event together with a dictionary of input values, so the data is easy to work with.
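Since read hands back a plain dictionary, the inputs can be checked before they are passed on. A minimal sketch, assuming a hypothetical helper named missing_fields (not part of the original project) and the same keys as the layout above:

```python
# Keys used by the layout in this article
REQUIRED_KEYS = ("filepath", "user", "password", "database", "table")


def missing_fields(values):
    """Hypothetical helper: list the required keys whose value is empty or absent."""
    return [key for key in REQUIRED_KEYS if not str(values.get(key) or "").strip()]


# Example dictionary shaped like the values returned by window.read()
values = {"filepath": "", "user": "root", "password": "secret",
          "database": "sql_study", "table": "ctd"}
print(missing_fields(values))  # ['filepath']
```

In the event loop, one option would be to call missing_fields(values) before txt_to_sql and, if the list is not empty, show it with sg.popup instead of starting the import.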
At this point, the whole program is complete. Next comes how to use it directly.
3. Using the program directly
The two code files, with blank lines and comments removed, come to just 51 lines of code. Heh~
3.1 Features
- A GUI that supports selecting a file and entering the database username, password, database name, and table name.
- Reads the specified file, processes the data, and stores it in the specified database table. If the table does not exist, a new table is created to hold the data; otherwise the data is appended to the existing table.
3.2 How to use it
Download the project code: https://github.com/XksA-me/txt-to-mysql
Unzip it and open the python-Jonny folder. It contains all the Python code, the test data, the environment, and a Windows bat file for running the program. The other files are the txt to mysql implementation written in Java by Little yes, along with its configuration files.
Original project address: https://github.com/schatz0-0/txt-to-mysql
Bilibili video for the original project: https://www.bilibili.com/video/BV12b4y1J7pD
Back to the Python version. First, unzip the Python environment archive. It can be extracted directly, with no further installation needed.
The files shown in the screenshot above are:
├── Pipfile                          virtual environment config file (can be ignored)
├── Pipfile.lock                     virtual environment dependency lock file (can be ignored)
├── __pycache__                      (can be ignored)
│   └── txt_to_sql.cpython-310.pyc   (can be ignored)
├── python-Jonny-tJ_VXFMA.7z         (virtual environment archive; just extract it)
├── requirements.txt                 (third-party Python packages required by the project; already installed in the provided virtual environment)
├── resources                        (test data)
│   └── ctd2020-09-27.txt
├── start.bat                        (can be run directly on Windows to start the project)
├── txt_to_sql.py                    (Python code: reading, processing, and storing the data)
└── txt_to_sql_gui.py                (Python code: the GUI; it calls txt_to_sql.py, so this is the only file you need to run)
After extracting the virtual environment, edit start.bat to match your local directories. Its contents are:
@echo off
C:
cd C:\Users\Administrator\Desktop\python-Jonny
C:\Users\Administrator\Desktop\python-Jonny\python-Jonny-tJ_VXFMA\Scripts\python txt_to_sql_gui.py
exit
I don't know batch files very well and only learned this on the spot. Roughly, the script switches to the project directory on the C drive, runs the code with the virtual environment's Python executable, and finally exits.
What you need to change are the directories involved, so that they match your local paths. I wrote this on a cloud server and put it on the C drive (the only drive there); you can put it on another drive for easier management.
After editing, just double-click start.bat to run the project. A black console window (cmd) and the GUI will open. The console shows the program's log output (the print calls in the code, or error messages). In the GUI, first click the button to select the file to store, then enter the database information (default values are filled in), and then click the button that starts processing to run the program and store the data. Click the exit button to close the program.
4. Possible extensions
- Only txt files are supported at the moment, and the data must be in the specified format (separated by spaces or tabs, \t). If there is time and demand, the program could be extended to support several file formats, identified by file suffix.
- The interface is plain. I saw the video from @ Is my _ Is my _ Is my this morning and immediately thought it would be convenient to write in Python. Time was short, so the interface is fairly basic, but for a tool the important thing at the start is getting the functionality working.
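The first extension above, identifying the format by file suffix, could start from something like the sketch below. The helper name separator_for and the suffix-to-separator mapping are assumptions made for illustration, not part of the original project:

```python
from pathlib import Path

# Assumed mapping from file suffix to separator; the entries are illustrative only
SEPARATORS = {
    ".txt": " |\t",  # the space-or-tab format handled in this article
    ".csv": ",",
    ".tsv": "\t",
}


def separator_for(filepath):
    """Hypothetical helper: pick a separator based on the file suffix."""
    suffix = Path(filepath).suffix.lower()
    if suffix not in SEPARATORS:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    return SEPARATORS[suffix]


print(repr(separator_for("./resources/ctd2020-09-27.txt")))  # ' |\t'
```

The returned value could then be passed as the sep argument of pd.read_csv in get_txt_data.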
This project has many shortcomings and plenty of room for improvement. You're welcome to study it and exchange ideas~
I recorded a video today and did some simple editing; tomorrow I will publish a video walkthrough.