1. Get data - requests.get()
2022-07-31 04:52:00 【m0_54861649】
1. How a crawler works
Get data – parse data – extract data – store data

2. Get data
In essence, we send a request to the server through a URL, and the server wraps the relevant content in a Response object and returns it to us. This is done with requests.get(). The Response object we get back has four commonly used attributes: status_code, content, text, and encoding.
3. requests.get()
import requests  # Import the requests module
res = requests.get('url')  # Request data from the server; the result returned by the server is a Response object
print(type(res))  # Terminal display: <class 'requests.models.Response'>
This means res is an object of the requests.models.Response class.
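The four attributes mentioned above can be inspected side by side. The following is a minimal sketch, not a definitive implementation: the helper names `summarize` and `fetch_summary` and the 10-second timeout are assumptions made for illustration, and `requests` is imported inside the function only so that `summarize` can be tried without a network connection.

```python
def summarize(res):
    """Collect the four commonly used Response attributes into a dict.

    `res` is expected to behave like a requests Response object.
    (summarize is a hypothetical helper, not part of requests.)
    """
    return {
        "status_code": res.status_code,   # HTTP status code, e.g. 200
        "encoding": res.encoding,         # encoding used to build res.text
        "n_bytes": len(res.content),      # size of the raw binary body
        "n_chars": len(res.text),         # length of the decoded string
    }


def fetch_summary(url, timeout=10):
    """Send a GET request and summarize the response (hypothetical helper)."""
    import requests  # third-party package; imported lazily on purpose

    res = requests.get(url, timeout=timeout)
    return summarize(res)
```

Calling `fetch_summary(url)` with a real address returns one small dict per request, which makes it easy to see how content (bytes) and text (characters) differ in length for non-ASCII pages.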
3. response.status_code
Usage: print(variable.status_code)
It checks whether the request received a correct response. A status code of 200 means the request succeeded.

The response status code indicates the server's response to the request. For example, 200 means the server responded successfully, 403 means access is forbidden, 404 means the page was not found, and 408 means the request timed out. The browser handles the response according to the status code. In a crawler, we can judge the state of the server from the status code: if it is 200, continue processing the data; otherwise, skip the response.
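The "process on 200, otherwise skip" rule can be sketched with the standard library alone. The function names `describe_status` and `should_process` are hypothetical; in a real crawler the code would come from res.status_code.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the standard reason phrase for an HTTP status code."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # The code is not one the standard library knows about.
        return "Unknown"


def should_process(code: int) -> bool:
    """Follow the rule from the text: only handle the data on 200."""
    return code == 200


for code in (200, 403, 404, 408):
    action = "process" if should_process(code) else "skip"
    print(code, describe_status(code), "->", action)
# 200 OK -> process
# 403 Forbidden -> skip
# 404 Not Found -> skip
# 408 Request Timeout -> skip
```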
4. response.content
response.content returns the content of the Response object as binary data, which is suitable for downloading pictures, audio, and video. Example:
import requests
# Image address
url = 'https://img1.baidu.com/it/u=2076064484,1314795796&fm=253&fmt=auto&app=120&f=JPEGw=531&h=309'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Return the content of the Response object as binary data
pic = res.content
# Create a picture file named picture.jpg. The picture content must be written in binary mode ('wb').
with open(r'C:\Users\Avery\Desktop\test\picture.jpg', 'wb') as f:
    # Write the binary content of pic to f
    f.write(pic)
With that, the picture is downloaded successfully!
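Why binary mode matters can be checked without any network access. In the sketch below, `save_binary` is a hypothetical helper and `fake_content` stands in for res.content; it holds the 8-byte signature that starts every PNG file, which includes the bytes \r\n that text mode could alter on some platforms.

```python
import tempfile
from pathlib import Path


def save_binary(data: bytes, path) -> int:
    """Write raw bytes to `path` in 'wb' mode; return the bytes written."""
    with open(path, "wb") as f:
        return f.write(data)


# Stand-in for res.content (assumption: a real response would supply this).
fake_content = b"\x89PNG\r\n\x1a\n"

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "picture.png"
    written = save_binary(fake_content, target)
    roundtrip = target.read_bytes()

print(written, roundtrip == fake_content)
```

Because the file is opened with 'wb', the bytes on disk are identical to the bytes received, which is what an image viewer needs.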
5. response.text
response.text returns the content of the Response object as a string, which is suitable for downloading text and webpage source code. Here's an example:
import requests
# Article address
url = 'https://localprod.pandateacher.com/python-manuscript/crawler-html/sanguo.md'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Return the content of the Response object as a string
novel = res.text
# Print the first 170 characters
print(novel[0:170])
6. response.encoding
response.encoding lets us set the encoding used to decode the Response object. Example:
import requests
# Article address
url = 'https://localprod.pandateacher.com/python-manuscript/crawler-html/sanguo.md'
# Send the request and put the returned result in the variable res
res = requests.get(url)
# Set the encoding of the response to utf-8
res.encoding = 'utf-8'
# Return the content of the Response object as a string
novel = res.text
# Print the first 170 characters
print(novel[0:170])
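The effect of choosing the encoding can be reproduced with plain bytes. This is a sketch under stated assumptions: `raw` stands in for res.content, and ISO-8859-1 is used as the "wrong" encoding because it is the one requests falls back to when a text response declares no charset in its headers.

```python
# Stand-in for res.content: UTF-8 bytes of a short Chinese title.
raw = "三国演义".encode("utf-8")

# Decoding with the wrong encoding does not fail, it just yields garbage.
wrong = raw.decode("iso-8859-1")

# Decoding with the right encoding, which is what setting
# res.encoding = 'utf-8' makes res.text do.
right = raw.decode("utf-8")

print(len(raw), len(wrong), len(right))
print(right)
```

Each of the four characters takes three bytes in UTF-8, so the wrong decoding produces twelve meaningless characters instead of four readable ones. Garbled text like that in res.text is the usual sign that res.encoding needs to be set.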