1. Get data - requests.get()
2022-07-31 04:52:00 【m0_54861649】
1. How a crawler works
Get data – parse data – extract data – store data
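The four steps can be sketched as four small functions. This is a minimal illustration under stated assumptions, not a full crawler: the function names, the line-based "parsing", and the keyword filter are all made up for this example, and a real web page would usually need a proper HTML parser.

```python
import requests


def get_data(url, timeout=10):
    """Step 1, get data: request the page and return its text."""
    res = requests.get(url, timeout=timeout)
    res.raise_for_status()  # raise an exception for 4xx/5xx responses
    return res.text


def parse_data(text):
    """Step 2, parse data: split the raw text into stripped, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def extract_data(lines, keyword):
    """Step 3, extract data: keep only the lines that contain the keyword."""
    return [line for line in lines if keyword in line]


def store_data(lines, path):
    """Step 4, store data: write the selected lines to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))


def crawl(url, keyword, path):
    """Run the four steps in order (hypothetical helper for illustration)."""
    lines = extract_data(parse_data(get_data(url)), keyword)
    store_data(lines, path)
    return len(lines)
```

The rest of this article covers only the first step, getting data.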

2. Get data
In essence, we send a request to the server through a URL, and the server wraps the relevant content in a Response object and returns it to us. This is done with requests.get(). The Response object we get back has four commonly used attributes: status_code, content, text, and encoding.
3. requests.get()
import requests  # Import the requests module
res = requests.get('url')  # Request data from the server ('url' is a placeholder for a real address); the server's reply comes back as a Response object
print(type(res))  # Terminal display: <class 'requests.models.Response'>
This means res is an object of the requests.models.Response class.
3. response.status_code
Usage: print(variable.status_code), where variable holds the Response object.
It checks whether the request received a correct response. A status code of 200 means the request succeeded.

The response status code indicates how the server handled the request. For example, 200 means the server responded successfully, 403 means access is forbidden, 404 means the page was not found, and 408 means the request timed out. A browser acts on the status code accordingly. In a crawler, the status code tells us the state of the server: if it is 200, continue processing the data; otherwise, skip the response.
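The "continue on 200, otherwise skip" rule might look like the sketch below. The helper names fetch_if_ok and describe_status, the 10-second timeout, and the small lookup table are assumptions made for illustration; they are not part of requests itself.

```python
import requests

# The status codes mentioned above, with their standard reason phrases.
STATUS_MEANINGS = {
    200: "OK",
    403: "Forbidden",
    404: "Not Found",
    408: "Request Timeout",
}


def describe_status(code):
    """Return a short description for the codes discussed in this article."""
    return STATUS_MEANINGS.get(code, "other")


def fetch_if_ok(url, timeout=10):
    """Return the Response when the status code is 200, otherwise None."""
    try:
        res = requests.get(url, timeout=timeout)
    except requests.RequestException as exc:
        # Covers connection errors, timeouts, malformed URLs, and so on.
        print(f"request failed: {exc}")
        return None
    if res.status_code == 200:
        return res
    print(f"skipping {url}: {res.status_code} ({describe_status(res.status_code)})")
    return None
```

A caller can then write res = fetch_if_ok(url) and process the data only when res is not None.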
4. response.content
response.content returns the content of the Response object as binary data, which is suitable for downloading pictures, audio, and video. Example:
import requests
# Image address
url = 'https://img1.baidu.com/it/u=2076064484,1314795796&fm=253&fmt=auto&app=120&f=JPEGw=531&h=309'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Return the content of the Response object as binary data
pic = res.content
# Save the image as picture.jpg. Image content must be written in binary mode, 'wb'.
with open(r'C:\Users\Avery\Desktop\test\picture.jpg', 'wb') as f:
    # Write the binary content of pic to f
    f.write(pic)
In this way, our pictures are downloaded successfully!
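The same download can be wrapped in a reusable function. This is a sketch, not the only way to do it: save_bytes and download_file are hypothetical names, and checking the status with raise_for_status() before writing is a design choice added here so that an error page is not saved as if it were an image.

```python
from pathlib import Path

import requests


def save_bytes(data, path):
    """Write binary data to path, creating missing folders; return the file size."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)  # equivalent to open(path, 'wb') followed by write()
    return path.stat().st_size


def download_file(url, path, timeout=10):
    """Fetch url and save the binary body (res.content) to path."""
    res = requests.get(url, timeout=timeout)
    res.raise_for_status()  # stop here on 4xx/5xx instead of saving an error page
    return save_bytes(res.content, path)
```

For example, download_file(url, 'picture.jpg') would save the image next to the script and return the number of bytes written.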
5. response.text
response.text returns the content of the Response object as a string, which is suitable for downloading text and webpage source code. Here's an example:
import requests
# Article address
url = 'https://localprod.pandateacher.com/python-manuscript/crawler-html/sanguo.md'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Return the content of the Response object as a string
novel = res.text
# Print the first 170 characters
print(novel[0:170])
6. response.encoding
response.encoding lets us set the encoding of the Response object. An example:
import requests
# Article address
url = 'https://localprod.pandateacher.com/python-manuscript/crawler-html/sanguo.md'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Set the encoding of the response to utf-8 (do this before reading res.text)
res.encoding = 'utf-8'
# Return the content of the Response object as a string
novel = res.text
# Print the first 170 characters
print(novel[0:170])
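Setting the encoding matters because res.text is produced by decoding the raw bytes in res.content with res.encoding. When a server sends text without declaring a charset, requests may fall back to ISO-8859-1, which garbles Chinese text. The sketch below shows the effect offline on a short sample string. decode_body and fetch_text are hypothetical helpers, and decode_body only approximates what res.text does.

```python
import requests


def decode_body(content, encoding):
    """Decode raw bytes with the given encoding, roughly as res.text does."""
    return content.decode(encoding, errors="replace")


def fetch_text(url, encoding=None, timeout=10):
    """Fetch url and return its text, decoded with an explicit or guessed encoding."""
    res = requests.get(url, timeout=timeout)
    res.raise_for_status()
    # Without an explicit encoding, use the one requests guesses from the body.
    res.encoding = encoding or res.apparent_encoding
    return res.text


# The same UTF-8 bytes decoded with the wrong and the right encoding.
raw = "三国演义".encode("utf-8")
garbled = decode_body(raw, "ISO-8859-1")
readable = decode_body(raw, "utf-8")
print(repr(garbled))
print(readable)
```

If the printed text of a page looks like the garbled variant, setting res.encoding to the page's real encoding before reading res.text usually fixes it.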