Trial of the combination of RDS and crawler
2022-07-03 09:24:00 【zeng31403】
I have a cloud database RDS instance. To get to know RDS as a user, and to build a crawler on the host that stores its data automatically, I am trying RDS today.
As before, I am crossing the river by feeling for the stones, one step at a time.
Log in to the RDS database
- First log in from the console
- Enter the instance: MySQL 8.0, 20 GB
- Set the intranet and Internet whitelists
First add the intranet to the whitelist. Following the help text beside the field, I tried adding 0.0.0.0/0, which by default covers the intranet hosts.
For the Internet, first click to apply for an Internet address. It was assigned after a short while.
Both the intranet and the Internet now have an address. With these two addresses, in theory, the instance can be reached from both the intranet and the Internet.
- Create an account and log in to check
Next, try a mysql client. It connected, and I started creating tables. I am still not used to working on the command line.
The client works. Now try connecting from local Python, which requires installing pymysql first:
pip install pymysql
- Start writing the crawler code. This time the main goal is to crawl the car model information from the Bitauto site (car.bitauto.com).
The URL rule is http://car.bitauto.com/tree_chexing/ + type + "_" + id
However, type and id come from the brand information, which I crawled once before.
- Import the modules
import pymysql
import requests as rq
import re
import bs4
import json
Connecting to the database is still very simple.
Fill in the intranet address as the host, or the Internet address; both are available. The next step reads the information that was just stored back from the database. Space is limited, so only this part is shown.
The crawler mainly uses requests. I am still practicing BeautifulSoup.
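The snippet that follows uses a `db` connection object that the post never shows being created. Here is a minimal sketch of that step, assuming pymysql's standard `connect()` keyword arguments. The helper names, host names, account, and database name are all hypothetical placeholders, not values from the post.

```python
def pick_host(inside_cloud, intranet_host, internet_host):
    """Use the intranet address on a host inside the cloud network,
    and the Internet address everywhere else."""
    return intranet_host if inside_cloud else internet_host


def build_connect_kwargs(host, user, password, database, port=3306):
    """Collect the keyword arguments passed to pymysql.connect()."""
    return {
        "host": host,
        "port": port,          # 3306 is the default MySQL port
        "user": user,
        "password": password,
        "database": database,
        "charset": "utf8mb4",  # full UTF-8, so Chinese brand names survive
    }


def open_connection(kwargs):
    """Open the connection; pymysql is imported here so that the
    helpers above can be used without the driver installed."""
    import pymysql
    return pymysql.connect(**kwargs)


# Hypothetical usage (placeholder addresses and credentials):
# host = pick_host(False, "rds-intranet.example.com", "rds-internet.example.com")
# db = open_connection(build_connect_kwargs(host, "crawler", "secret", "cars"))
```

Keeping the address choice in one small function means the same script can run on the cloud host and on a local machine by changing a single flag.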
# Read the brand information from the database into a dictionary
dict_car_brand = {}
if bl_get_dict_car_brand is False:
    # Create the cursor before the try block so that finally can always close it
    cursor = db.cursor()
    try:
        # Execute the SQL statement
        sql = "SELECT id,type,name,url FROM car_brand"
        cursor.execute(sql)
        # Fetch all records
        results = cursor.fetchall()
        for row in results:
            int_id = row[0]
            str_type = row[1]
            str_name = row[2]
            str_url = row[3]
            # Add the brand to the dictionary, keyed by its id as a string
            dict_car_brand[str(int_id)] = {"type": str_type, "name": str_name, "url": str_url}
        bl_get_dict_car_brand = True
    except pymysql.MySQLError as err:
        print("Error: unable to fetch data:", err)
    finally:
        cursor.close()
# print(dict_car_brand)
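Once `dict_car_brand` is filled, the URL rule given earlier (base address, then type, an underscore, and id) can be applied to every brand. The following is only a sketch of that step: the function names are hypothetical, and the sample `type` value in the usage comment is made up for illustration.

```python
BASE_URL = "http://car.bitauto.com/tree_chexing/"


def build_model_url(brand_type, brand_id):
    """Apply the rule: base address + type + "_" + id."""
    return f"{BASE_URL}{brand_type}_{brand_id}"


def build_model_urls(dict_car_brand):
    """Map each brand id to the URL of its model page.

    Expects the structure built above:
    {"<id>": {"type": ..., "name": ..., "url": ...}}
    """
    return {
        brand_id: build_model_url(info["type"], brand_id)
        for brand_id, info in dict_car_brand.items()
    }


# Hypothetical usage with an invented brand record:
# build_model_urls({"9": {"type": "mb", "name": "example", "url": ""}})
```

Each resulting URL could then be passed to `rq.get()` for downloading; the parsing of the returned page is not shown in the post and is left out here as well.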