1. Get data - requests.get()
2022-07-31 04:52:00 【m0_54861649】
1. How a crawler works
Get data – parse data – extract data – store data
2. Get data
In essence, we send a request to the server through a URL, and the server wraps the relevant content in a Response object and returns it to us. This is done with requests.get(). The Response object we get back has four commonly used attributes: status_code, content, text, and encoding.
3. requests.get()
import requests  # Import the requests module
res = requests.get('url')  # Request data from the server ('url' is a placeholder for a real address); the server's reply comes back as a Response object
print(type(res))  # Terminal display: <class 'requests.models.Response'>
This means res is an object of the requests.models.Response class.
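As a quick way to see the four attributes side by side, the sketch below gathers them into a dictionary. The helper name summarize_response and the SimpleNamespace stand-in are assumptions made for illustration; the stand-in only lets the sketch run without a network connection. In a real crawler, res would come from requests.get().

```python
from types import SimpleNamespace


def summarize_response(res, preview_chars=20):
    """Collect the four commonly used attributes of a Response-like object."""
    return {
        "status_code": res.status_code,            # numeric status, e.g. 200
        "encoding": res.encoding,                  # encoding used to build res.text
        "content_length": len(res.content),        # size of the raw body in bytes
        "text_preview": res.text[:preview_chars],  # start of the decoded body
    }


# Hypothetical stand-in for a real Response, so the sketch runs offline.
body = "Hello, crawler!"
fake_res = SimpleNamespace(
    status_code=200,
    encoding="utf-8",
    content=body.encode("utf-8"),
    text=body,
)

summary = summarize_response(fake_res)
print(summary)
```

The same helper should work on a real response, because it only reads the four attributes described above.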
3. response.status_code
Usage: print(variable.status_code)
It checks whether the request was answered correctly. A status code of 200 means the request succeeded.
The response status code indicates how the server handled the request. For example, 200 means the server responded successfully, 403 means access is forbidden, 404 means the page was not found, and 408 means the request timed out. A browser acts on the status code accordingly. In a crawler, the status code tells us the state of the server: if it is 200, continue processing the data; otherwise, skip the response.
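The rule "process on 200, skip otherwise" can be sketched with the standard library alone. The function names describe_status and should_process are hypothetical and chosen for this example; http.HTTPStatus is Python's built-in table of standard status codes.

```python
from http import HTTPStatus


def describe_status(code):
    """Return the standard reason phrase for an HTTP status code."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # The code is not one of the standard values known to http.HTTPStatus.
        return "Unknown status code"


def should_process(code):
    """Follow the rule from the text: only a 200 response is processed."""
    return code == 200


for code in (200, 403, 404, 408):
    action = "process" if should_process(code) else "skip"
    print(code, describe_status(code), "->", action)
```

In a crawler, the value passed in would be res.status_code. The requests library also provides res.raise_for_status(), which raises an exception for 4xx and 5xx responses, as an alternative to checking the code by hand.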
4. response.content
response.content returns the content of the Response object as binary data, which is suitable for downloading pictures, audio, and video. Example:
import requests
# Image address
url = 'https://img1.baidu.com/it/u=2076064484,1314795796&fm=253&fmt=auto&app=120&f=JPEGw=531&h=309'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Return the content of the Response object as binary data
pic = res.content
# Create a picture file named picture.jpg. Picture content must be written in binary mode ('wb').
with open(r'C:\Users\Avery\Desktop\test\picture.jpg', 'wb') as f:
    # Write the binary content of pic to f
    f.write(pic)
With that, the picture is downloaded successfully!
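The download step can be combined with the status check from the previous section. The following is a minimal sketch, not the author's code: save_response_content is a hypothetical helper, and the SimpleNamespace objects are stand-ins for real responses so that the example runs without a network connection.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_response_content(res, path):
    """Write res.content to path in binary mode; return the bytes written.

    Nothing is written (and 0 is returned) unless the status code is 200.
    """
    if res.status_code != 200:
        return 0
    # Path.write_bytes opens the file in binary mode, like open(path, 'wb').
    return Path(path).write_bytes(res.content)


# Hypothetical stand-ins for real responses, so the sketch runs offline.
ok_res = SimpleNamespace(status_code=200, content=b"\xff\xd8\xff\xe0 fake image bytes")
missing_res = SimpleNamespace(status_code=404, content=b"Not Found")

with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "picture.jpg"
    written = save_response_content(ok_res, target)
    skipped = save_response_content(missing_res, Path(folder) / "missing.jpg")
    saved_bytes = target.read_bytes()

print(written, skipped)
```

Skipping the write for non-200 responses avoids saving an error page under a .jpg name.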
5. response.text
response.text returns the content of the Response object as a string, which is suitable for downloading text and web page source code. Here's an example:
import requests
# Article address
url = 'https://localprod.pandateacher.com/python-manuscript/crawler-html/sanguo.md'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Return the content of the Response object as a string
novel = res.text
# Print the first 170 characters
print(novel[0:170])
6. response.encoding
response.encoding lets us set the encoding used to decode the Response object. Here's an example:
import requests
# Article address
url = 'https://localprod.pandateacher.com/python-manuscript/crawler-html/sanguo.md'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Set the encoding of the response to utf-8
res.encoding = 'utf-8'
# Return the content of the Response object as a string
novel = res.text
# Print the first 170 characters
print(novel[0:170])
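To see why the encoding matters, it helps to remember that res.text is essentially res.content decoded with res.encoding. The sketch below reproduces that idea with plain bytes, so it runs without a network connection. The helper decode_body and the sample title are assumptions chosen for illustration.

```python
def decode_body(raw, encoding):
    """Decode raw bytes into a string, replacing bytes that cannot be decoded."""
    return raw.decode(encoding, errors="replace")


# Stand-in for res.content: the UTF-8 bytes of a short Chinese title.
raw = "三国演义".encode("utf-8")

wrong = decode_body(raw, "ISO-8859-1")  # wrong encoding gives garbled text
right = decode_body(raw, "utf-8")       # matching encoding restores the title

print(repr(wrong))
print(right)
```

If the text printed by a crawler looks garbled like the first result, setting res.encoding to the page's real encoding before reading res.text is usually the fix.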