1. Get data - requests.get()
2022-07-31 04:52:00 【m0_54861649】
1. How a crawler works
A crawler works in four steps: get data, parse data, extract data, store data.
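The four steps can be sketched as four small functions. This is only an illustration under stated assumptions: the function names, the sample HTML, and the choice of the page title as the thing to extract are all hypothetical, and the parsing step uses the standard library's html.parser, which this article does not cover.

```python
from html.parser import HTMLParser
from pathlib import Path


def get_data(url):
    """Step 1: fetch the page source (needs network access and requests)."""
    import requests  # imported here so the other steps run without it
    res = requests.get(url, timeout=10)
    return res.text


class TitleParser(HTMLParser):
    """Step 2: walk the HTML and collect the text inside <title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def parse_data(html):
    parser = TitleParser()
    parser.feed(html)
    return parser


def extract_data(parser):
    """Step 3: pull out the piece we care about."""
    return "".join(parser.title_parts).strip()


def store_data(text, path):
    """Step 4: save the result to a file."""
    Path(path).write_text(text, encoding="utf-8")


# Hypothetical sample page, used here instead of a live request.
sample_html = "<html><head><title> Demo page </title></head><body></body></html>"
title = extract_data(parse_data(sample_html))
print(title)  # Demo page
```

With a real page, the first step would be get_data(url) in place of the sample string, and store_data(title, 'title.txt') would finish the chain.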

2. Get data
Getting data means sending a request to a server through a URL. The server wraps the relevant content in a Response object and returns it to us. requests.get() does this for us. The Response object we get back has four commonly used attributes: status_code, content, text, and encoding.
3. requests.get()
import requests  # Import the requests module
# Request data from the server; the server's reply comes back as a Response object.
# 'url' is a placeholder: replace it with a real address.
res = requests.get('url')
print(type(res))  # Terminal display: <class 'requests.models.Response'>
This means res is an object of the requests.models.Response class.
3. response.status_code
Usage: print(variable.status_code), where variable is the Response object.
It checks whether the request was answered correctly. A status code of 200 means the request succeeded.

The response status code indicates how the server responded to the request. For example, 200 means the server responded successfully, 403 means access is forbidden, 404 means the page was not found, and 408 means the request timed out. A browser handles the response according to the status code. A crawler can likewise use the status code to judge the server's state: if it is 200, continue processing the data; otherwise skip the response.
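The "continue on 200, otherwise skip" rule can be sketched as below. The helper names and the FakeResponse class are hypothetical. They stand in for a real Response so the example runs without a network connection. The status phrases come from the standard library's http.HTTPStatus, which raises ValueError for codes it does not know.

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return a short label such as '404 Not Found' for a status code."""
    return f"{status_code} {HTTPStatus(status_code).phrase}"


def body_if_ok(res):
    """Return res.text when the status code is 200, otherwise None."""
    if res.status_code == 200:
        return res.text
    print("Skipping response:", describe_status(res.status_code))
    return None


class FakeResponse:
    """Hypothetical stand-in for a Response, so the sketch runs offline."""

    def __init__(self, status_code, text=""):
        self.status_code = status_code
        self.text = text


for code in (200, 403, 404, 408):
    print(describe_status(code))

print(body_if_ok(FakeResponse(200, "hello")))  # hello
print(body_if_ok(FakeResponse(404)))           # None, after a "Skipping" message
```

In a real crawler the argument would be the object returned by requests.get(url). It has the same status_code and text attributes.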
4. response.content
response.content returns the content of the Response object as binary data (bytes), which is suitable for downloading pictures, audio, and video. Example:
import requests
# Image address
url = 'https://img1.baidu.com/it/u=2076064484,1314795796&fm=253&fmt=auto&app=120&f=JPEGw=531&h=309'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Return the content of the Response object as binary data
pic = res.content
# Save the image as picture.jpg. Image content must be written in binary mode ('wb').
with open(r'C:\Users\Avery\Desktop\test\picture.jpg', 'wb') as f:
    # Write the binary content of pic to f
    f.write(pic)
The picture is now downloaded and saved.
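A slightly more defensive variant is sketched below. It checks the status code before saving and builds the path with pathlib so the code is not tied to one Windows folder. The function names, the 10-second timeout, and the sample bytes are assumptions made for illustration, not part of the original example.

```python
import tempfile
from pathlib import Path


def save_bytes(data, path):
    """Write binary data to path, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path


def download(url, path):
    """Fetch url and save the body only if the server answered 200."""
    import requests  # needs the requests package and network access
    res = requests.get(url, timeout=10)
    if res.status_code != 200:
        return None
    return save_bytes(res.content, path)


# Offline demonstration with a few made-up bytes instead of a real image.
fake_image = b"\xff\xd8\xff\xe0"
with tempfile.TemporaryDirectory() as folder:
    saved = save_bytes(fake_image, Path(folder) / "test" / "picture.jpg")
    round_trip = saved.read_bytes()

print(round_trip == fake_image)  # True
```

For a real download, call download(url, 'picture.jpg') with the image address. It returns None when the status code is not 200.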
5. response.text
response.text returns the content of the Response object as a string, which is suitable for downloading text and webpage source code. Here's an example:
import requests
# Article address
url = 'https://localprod.pandateacher.com/python-manuscript/crawler-html/sanguo.md'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Return the content of the Response object as a string
novel = res.text
# Print the first 170 characters
print(novel[0:170])
6. response.encoding
response.encoding lets us set the encoding that requests uses when it decodes the Response object's content into text. Example:
import requests
# Article address
url = 'https://localprod.pandateacher.com/python-manuscript/crawler-html/sanguo.md'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Set the encoding of the response to utf-8
res.encoding = 'utf-8'
# Return the content of the Response object as a string
novel = res.text
# Print the first 170 characters
print(novel[0:170])
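Why the encoding matters can be seen without any network access. Roughly speaking, res.text is res.content decoded with res.encoding, so the wrong encoding gives garbled text. The sketch below imitates that with plain bytes. The sample string is an assumption chosen for illustration. As far as I know, requests falls back to ISO-8859-1 when a text response declares no charset, which is the usual reason to set res.encoding by hand.

```python
# A sample Chinese string and the bytes a server would send for it in UTF-8.
original = "三国演义"
raw = original.encode("utf-8")  # plays the role of res.content

# Decoding with the wrong encoding produces garbled text...
garbled = raw.decode("iso-8859-1")
# ...while decoding with the right one restores the original.
fixed = raw.decode("utf-8")

print(len(raw))             # 12 (four characters, three bytes each)
print(garbled == original)  # False
print(fixed == original)    # True
```

Setting res.encoding = 'utf-8' before reading res.text corresponds to the second decode above.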