Request attribute in crawler
2022-07-27 00:21:00 【For a long time, the duck will become a goose】
Reviewing web crawlers.
My learning sources are Wind Change Programming (风变编程)
and the lovely teacher Song Tian of Beijing University of Technology!
Basic introduction
Three steps
Fetch the web page, parse the web page, store the data
Send a network request to get the page content; extract the desired data from the HTML source code; store the parsed data
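The three steps can be sketched roughly as follows. This is only an illustration, not the course's code: the names fetch, parse_links, and store are hypothetical, link addresses are an arbitrary choice of "desired data", and the standard-library html.parser and csv modules stand in for whatever parsing and storage tools are used later.

```python
import csv
import io
from html.parser import HTMLParser


def fetch(url):
    """Step 1: send a GET request and return the page source as a string."""
    # requests is third-party; imported here so steps 2 and 3 run without it.
    import requests
    res = requests.get(url, timeout=10)
    res.raise_for_status()  # raise an error for 4xx/5xx responses
    return res.text


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def parse_links(html):
    """Step 2: extract the desired data (here, link addresses) from HTML."""
    parser = LinkParser()
    parser.feed(html)
    return parser.links


def store(links, fileobj):
    """Step 3: write the parsed data as one CSV row per link."""
    writer = csv.writer(fileobj)
    writer.writerow(["link"])
    for link in links:
        writer.writerow([link])


# Steps 2 and 3 on a hard-coded page, so no network access is needed.
sample_html = '<html><body><a href="/a">A</a><a href="/b">B</a></body></html>'
links = parse_links(sample_html)
buffer = io.StringIO()
store(links, buffer)
print(links)
```

With a real site, the string returned by fetch(url) would be passed to parse_links in place of sample_html, and an opened file would replace the in-memory buffer.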
Using the requests library to send a network request
import requests
# Call requests.get() with the URL as its argument and assign the result to the variable res
# ('URL' is a placeholder; replace it with the address of the page)
res = requests.get('URL')
# Look at the contents of res
print(res)
# Use type() to see the type of the result returned by requests.get()
res_type = type(res)
print(res_type)
# The result returned by requests.get() is an object of the requests.models.Response class
# Print the status_code attribute of the Response object, i.e. the HTTP status code
print(res.status_code)
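The status code is just an integer, so it helps to know what it means. A small sketch using only the standard library's http.HTTPStatus; the helper names describe_status and is_success are hypothetical and not part of requests.

```python
from http import HTTPStatus


def describe_status(code):
    """Return the code with its standard reason phrase, e.g. '200 OK'."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        # HTTPStatus raises ValueError for codes it does not know.
        return f"{code} (unrecognised status code)"


def is_success(code):
    """2xx codes mean the request succeeded."""
    return 200 <= code < 300


for code in (200, 404, 503):
    print(describe_status(code), is_success(code))
```

With a real response, res.status_code would be passed in place of the hard-coded numbers.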
Response.text attribute
text = res.text
Response.text returns the source code of the web page as a string.
Use the Response.text attribute to get the actual content of the page.
Response.encoding attribute
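The variable res from earlier is an object of a class, and objects have many attributes.
The encoding attribute sets the character encoding used to decode the response into res.text: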
res.encoding = 'utf-8'
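Why the encoding matters can be seen without any network request. As far as I understand, res.text is the raw response bytes decoded with res.encoding, so a wrong encoding produces garbled text. The sketch below imitates this with plain bytes; the sample string and variable names are assumptions made up for illustration.

```python
# Bytes as a server might send them: the text '你好' encoded as UTF-8.
raw = "你好".encode("utf-8")

# Decoding with the wrong encoding gives garbled text (mojibake).
garbled = raw.decode("iso-8859-1")

# Decoding with the right encoding recovers the original text.
fixed = raw.decode("utf-8")

print(repr(garbled))
print(fixed)
```

If res.text looks garbled in the same way, setting res.encoding to the page's real encoding before reading res.text is the usual fix.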

The difference between HTTP and HTTPS
With the HTTP protocol, data is not encrypted, so account names, passwords, personal information, and other data sent over the network are at serious risk if they are intercepted.
The HTTPS protocol encrypts the data on top of HTTP, so it is safer and more reliable than HTTP.
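In a crawler, the protocol is visible in the scheme at the start of the URL. A minimal sketch that checks it with the standard library; the function name uses_https and the example addresses are hypothetical.

```python
from urllib.parse import urlsplit


def uses_https(url):
    """Return True when the URL's scheme is https (encrypted transport)."""
    return urlsplit(url).scheme == "https"


print(uses_https("https://example.com/login"))
print(uses_https("http://example.com/login"))
```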
Checking the robots protocol
Just append /robots.txt to the site's root address
User-agent: *
Disallow: /
# * means all crawlers; Disallow: / means none of them may crawl anything on this site
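The same check can be done in code. Below is a small sketch using the standard library's urllib.robotparser; the rules are passed in as lines so no network access is needed, and the helper names robots_url and allowed, the crawler name, and the example.com addresses are all hypothetical.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Build the robots.txt address for the site that hosts page_url."""
    return urljoin(page_url, "/robots.txt")


def allowed(robots_lines, user_agent, url):
    """Check whether user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# The rules shown above: every crawler is barred from the whole site.
block_all = ["User-agent: *", "Disallow: /"]
# A milder, made-up rule set that only bars one directory.
block_private = ["User-agent: *", "Disallow: /private/"]

print(robots_url("https://example.com/news/page.html"))
print(allowed(block_all, "my-crawler", "https://example.com/news/page.html"))
print(allowed(block_private, "my-crawler", "https://example.com/news/page.html"))
print(allowed(block_private, "my-crawler", "https://example.com/private/data"))
```

For a live site, RobotFileParser can also download the file itself via set_url() and read() instead of parse().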