Lesson 3 urllib
2022-06-25 20:43:00 【Osmanthus rice wine balls】
1. Wrapping a web page's source code in a response object
import urllib.request

# Send a GET request
response = urllib.request.urlopen("http://www.baidu.com")  # the result is wrapped in a response object
print(response.read().decode('utf-8'))  # decode the returned bytes as UTF-8 so Chinese characters are not garbled, then print the page source
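`read()` returns raw bytes, so the text only becomes readable once it is decoded with the page's character set. A minimal sketch, assuming a hypothetical helper `decode_body` (not part of urllib) that uses the declared charset when there is one and falls back to UTF-8. It runs offline on hand-made bytes rather than a real response:

```python
def decode_body(raw, charset=None):
    """Decode response bytes; decode_body is an illustrative helper, not a urllib API."""
    # Fall back to UTF-8 when no charset is given; replace undecodable bytes
    # instead of raising, so a wrong guess does not crash the crawler.
    return raw.decode(charset or "utf-8", errors="replace")


# Hand-made bytes standing in for response.read().
utf8_bytes = "百度一下".encode("utf-8")
gbk_bytes = "百度一下".encode("gbk")

print(decode_body(utf8_bytes))
print(decode_body(gbk_bytes, "gbk"))
```

With a real response, the declared charset can be read with `response.headers.get_content_charset()`, which returns `None` when the server does not name one.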
# Send a POST request (used, for example, to simulate a login with a username and password)
# httpbin.org is used here as the test endpoint
import urllib.parse  # URL-encodes key-value pairs
data = bytes(urllib.parse.urlencode({"hello": "world"}), encoding="utf-8")  # form data: URL-encode the key-value pairs, then convert the string to UTF-8 bytes
response = urllib.request.urlopen("http://httpbin.org/post", data=data)
print(response.read().decode('utf-8'))
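To make the form encoding concrete, here is a small offline sketch of what `urlencode` produces and of how passing `data` changes the request method. The extra `user` field and its value are illustrative, not from the lesson:

```python
import urllib.parse
import urllib.request

form = {"hello": "world", "user": "alice"}  # "user"/"alice" are made-up example values
encoded = urllib.parse.urlencode(form)      # str in key=value&key=value form
data = encoded.encode("utf-8")              # urlopen expects bytes, not str

print(encoded)

# Building a Request does not touch the network; it only describes the request.
get_req = urllib.request.Request("http://httpbin.org/get")
post_req = urllib.request.Request("http://httpbin.org/post", data=data)

print(get_req.get_method())   # no data, so the method defaults to GET
print(post_req.get_method())  # data present, so the method defaults to POST
```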
2. Handling timeouts
import urllib.error
import urllib.request

try:
    response = urllib.request.urlopen("http://httpbin.org/get", timeout=0.01)  # give up if there is no response within 0.01 seconds
    print(response.read().decode('utf-8'))
except urllib.error.URLError as e:
    print("time out!")
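`URLError` covers many failures besides timeouts, and a timeout that happens while waiting for or reading the response may surface as a bare `socket.timeout` rather than a `URLError`. A minimal sketch, assuming two hypothetical helpers `is_timeout` and `fetch` (neither is a urllib API), that separates timeouts from other errors:

```python
import socket
import urllib.error
import urllib.request


def is_timeout(exc):
    """Return True if exc represents a timed-out request (illustrative helper)."""
    if isinstance(exc, socket.timeout):
        return True
    # A timeout while connecting is wrapped: URLError.reason holds the original error.
    return isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, socket.timeout)


def fetch(url, timeout=5.0):
    """Return the decoded body, or None if the request timed out."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except (urllib.error.URLError, socket.timeout) as e:
        if is_timeout(e):
            return None
        raise  # anything else (DNS failure, HTTP 404, ...) is not a timeout
```

Calling `fetch("http://httpbin.org/get", timeout=0.01)` would then return `None` on a timeout and re-raise other errors instead of reporting every failure as "time out!".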
3. Request headers (pretending to be a browser)
url = "https://httpbin.org/post"
headers = {"User-Agent": "……"}
data = bytes(urllib.parse.urlencode({"hello": "world"}), encoding="utf-8")
req = urllib.request.Request(url=url, data=data, headers=headers, method="POST")  # build a request object that looks like it came from a real browser
response = urllib.request.urlopen(req)  # send the prepared request
print(response.read().decode("utf-8"))
To find your User-Agent (and the other header key-value pairs), open the browser's developer tools and look at a request's headers in the Network panel.
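As an offline sketch of what the custom header changes, the snippet below builds a `Request` without sending it and compares its User-Agent with the one urllib uses by default. The value `ExampleBrowser/1.0` is a placeholder, not a real browser string:

```python
import urllib.parse
import urllib.request

url = "https://httpbin.org/post"
headers = {"User-Agent": "ExampleBrowser/1.0"}  # placeholder; paste a real browser value here
data = urllib.parse.urlencode({"hello": "world"}).encode("utf-8")

req = urllib.request.Request(url=url, data=data, headers=headers, method="POST")

# Request stores header names capitalized, so the lookup key is "User-agent".
print(req.get_header("User-agent"))
print(req.get_method())

# Without a custom header, urllib announces itself as "Python-urllib/<version>",
# which some sites use to recognize and refuse scripted clients.
default_ua = dict(urllib.request.build_opener().addheaders)["User-agent"]
print(default_ua)
```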
4. Getting the data
# Crawl the web pages
def getData(baseurl):
    datalist = []
    for i in range(0, 10):  # call the page-fetching function 10 times, once per page
        url = baseurl + str(i * 25)
        html = askURL(url)  # save the page's source code
    return datalist
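The loop above pages through a listing by appending an offset that grows by 25 each time. A minimal offline sketch of just that URL arithmetic, assuming a hypothetical helper `build_page_urls` and a placeholder base URL (the lesson does not name the real one):

```python
def build_page_urls(baseurl, pages=10, step=25):
    """Return the URL for each page; build_page_urls is an illustrative helper."""
    # Page i starts at item i * step, e.g. 0, 25, 50, ...
    return [baseurl + str(i * step) for i in range(pages)]


# Placeholder base URL ending in the offset parameter.
urls = build_page_urls("https://example.com/list?start=")

for u in urls[:3]:
    print(u)
print(len(urls), "pages, last offset:", urls[-1].rsplit("=", 1)[1])
```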
# Get the content of the page at a given URL
def askURL(url):
    head = {
        "User-Agent": "……"
    }  # disguise the request by simulating a browser's header information
    request = urllib.request.Request(url, headers=head)  # visit the URL with these headers
    html = ""  # returned unchanged if the request fails
    try:
        response = urllib.request.urlopen(request)  # fetch the whole page
        html = response.read().decode("utf-8")  # read the page source
    except urllib.error.URLError as e:  # catch the error
        if hasattr(e, "code"):
            print(e.code)  # print the HTTP status code to see what went wrong
        if hasattr(e, "reason"):
            print(e.reason)  # print the reason for the failure
    return html
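The two `hasattr` checks exist because `HTTPError` (a subclass of `URLError`) carries both a status `code` and a `reason`, while a plain `URLError` has only a `reason`. A small offline sketch that constructs both error types by hand, assuming a hypothetical helper `describe_error` and a made-up URL:

```python
import io
import urllib.error
from email.message import Message


def describe_error(e):
    """Summarize a urllib error the same way askURL inspects it (illustrative helper)."""
    parts = []
    if hasattr(e, "code"):    # only HTTPError has a status code
        parts.append(f"HTTP status {e.code}")
    if hasattr(e, "reason"):  # both URLError and HTTPError have a reason
        parts.append(f"reason: {e.reason}")
    return ", ".join(parts)


# Errors built by hand so the example needs no network access.
http_err = urllib.error.HTTPError(
    "http://example.com/missing", 404, "Not Found", Message(), io.BytesIO()
)
url_err = urllib.error.URLError("connection refused")

print(describe_error(http_err))
print(describe_error(url_err))
```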