Lesson 3 urllib
2022-06-25 20:43:00 【Osmanthus rice wine balls】
1. Wrapping a web page's source code in an object
import urllib.request

# Send a GET request
response = urllib.request.urlopen("http://www.baidu.com")  # the result is wrapped in a response object
# read() returns bytes; decode('utf-8') turns them into text so that Chinese characters display correctly
print(response.read().decode('utf-8'))  # print the page source
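To see why the `decode('utf-8')` step matters, here is a minimal sketch that runs without a network connection. It opens a `data:` URL instead of a real site; that substitution, and the sample text, are assumptions made only so the example is reproducible.

```python
import urllib.parse
import urllib.request

# Hypothetical stand-in for a real page: a data: URL carrying UTF-8 text.
text = "你好, urllib"
url = "data:text/plain;charset=utf-8," + urllib.parse.quote(text)

with urllib.request.urlopen(url) as response:
    raw = response.read()          # bytes, exactly as they arrived

decoded = raw.decode("utf-8")      # str, with the Chinese characters intact
print(type(raw).__name__, type(decoded).__name__)
print(decoded)
```

Printing `raw` directly would show escaped bytes rather than readable characters, which is the problem the decode call avoids.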
# Send a POST request (used, for example, to simulate a login with a user name and password)
# httpbin.org is used as the test server.
import urllib.parse  # parser used to encode key/value pairs

# Form data: urlencode() turns the key/value pairs into a query string,
# and bytes(..., encoding="utf-8") packs that string into bytes.
data = bytes(urllib.parse.urlencode({"hello": "world"}), encoding="utf-8")
response = urllib.request.urlopen("http://httpbin.org/post", data=data)
print(response.read().decode('utf-8'))
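The form-encoding step can be inspected on its own, without sending anything. The sketch below shows what `urlencode` produces and how it becomes the bytes passed as `data`; the extra `page` field is only an illustration and is not part of the lesson.

```python
import urllib.parse

form = {"hello": "world", "page": 2}       # "page" is an illustrative extra field
encoded = urllib.parse.urlencode(form)     # a str of key=value pairs joined by &
data = encoded.encode("utf-8")             # bytes, the type urlopen expects for data=

print(encoded)
print(data)
# Round trip: parse_qs recovers the pairs (values come back as lists of str).
print(urllib.parse.parse_qs(encoded))
```

`encoded.encode("utf-8")` is equivalent to the `bytes(..., encoding="utf-8")` call used above.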
2. Handling timeouts
import socket
import urllib.error

try:
    # Give up if the server has not answered within 0.01 seconds
    response = urllib.request.urlopen("http://httpbin.org/get", timeout=0.01)
    print(response.read().decode('utf-8'))
except (urllib.error.URLError, socket.timeout) as e:
    print("time out!")
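A timeout can surface in two ways: wrapped in a `URLError` when it happens while connecting, or as a bare `socket.timeout` when the server accepts the connection but answers too slowly. The helper below is a sketch that handles both; the name `fetch_or_none` is hypothetical and not part of the lesson.

```python
import socket
import urllib.error
import urllib.request


def fetch_or_none(url, timeout):
    """Return the decoded body, or None on a timeout or a URLError."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except socket.timeout:
        # Raised directly when the connection succeeds but no response arrives in time.
        print("timed out while waiting for the response")
    except urllib.error.URLError as e:
        # Connection-level failures, including timeouts while connecting.
        print("request failed:", e.reason)
    return None
```

A call such as `fetch_or_none("http://httpbin.org/get", timeout=0.01)` would be expected to return `None`, since very few servers answer within 10 milliseconds.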
3. Request headers (posing as a browser)
url = "https://httpbin.org/post"
headers = {"User-Agent": "……"}
data = bytes(urllib.parse.urlencode({"hello": "world"}), encoding="utf-8")
# Build a Request object that carries the headers, so the request looks like it comes from a real browser
req = urllib.request.Request(url=url, data=data, headers=headers, method='POST')
response = urllib.request.urlopen(req)  # send the prepared request
print(response.read().decode("utf-8"))
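A `Request` object can be inspected before it is sent, which is a convenient way to check that the headers and body are what you intended. This sketch builds the same request as above without calling `urlopen`; the User-Agent value is a placeholder chosen for illustration, not a real browser string.

```python
import urllib.parse
import urllib.request

url = "https://httpbin.org/post"
headers = {"User-Agent": "example-agent/0.1"}   # placeholder value, an assumption
data = urllib.parse.urlencode({"hello": "world"}).encode("utf-8")

req = urllib.request.Request(url=url, data=data, headers=headers, method="POST")

print(req.get_method())               # the HTTP method that will be sent
print(req.get_header("User-agent"))   # header names are stored capitalized as "User-agent"
print(req.data)                       # the encoded form body
```

Note that `Request` normalizes header names with `str.capitalize()`, so the lookup key is `"User-agent"` rather than `"User-Agent"`.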
How to find your User-Agent (the key/value pair to put in headers):
Open the browser's developer tools and look at the request headers of any request in the Network panel.
4. Getting the data
# Crawl the web pages
def getData(baseurl):
    datalist = []
    for i in range(0, 10):  # fetch page information 10 times, one call per page
        url = baseurl + str(i * 25)
        html = askURL(url)  # save the source code of the page
    return datalist
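The paging logic in getData can be checked separately from the network code. The sketch below lists the URLs the loop would visit, assuming a base URL that ends in a `start=` offset parameter; that URL and the helper name `page_urls` are hypothetical.

```python
def page_urls(baseurl, pages=10, step=25):
    """Build one URL per page by appending the start offset to baseurl."""
    return [baseurl + str(i * step) for i in range(pages)]


# Hypothetical base URL ending in a "start=" query parameter.
urls = page_urls("https://example.com/list?start=")
print(urls[0])     # first page, offset 0
print(urls[-1])    # last page, offset 225
print(len(urls))   # 10 pages in total
```

With 10 pages and a step of 25, the offsets run from 0 to 225, so the loop covers 250 items.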
# Get the content of the page at a given URL
def askURL(url):
    head = {
        "User-Agent": "……"
    }  # disguise: simulate a browser's header information
    request = urllib.request.Request(url, headers=head)  # visit the url carrying the headers
    html = ""  # returned unchanged if the request fails
    try:
        response = urllib.request.urlopen(request)  # get the whole page
        html = response.read().decode("utf-8")  # read the page source
    except urllib.error.URLError as e:  # catch the error
        if hasattr(e, "code"):
            print(e.code)  # print the HTTP status code to see what went wrong
        if hasattr(e, "reason"):
            print(e.reason)  # print the reason for the failure
    return html
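The two `hasattr` checks exist because `HTTPError` (a subclass of `URLError`) carries both a status `code` and a `reason`, while a plain `URLError` has only a `reason`. The sketch below shows the difference using errors constructed by hand; the helper name `describe_error` and the example URL are hypothetical.

```python
import io
import urllib.error


def describe_error(err):
    """Summarise a urllib error using the same hasattr checks as askURL."""
    parts = []
    if hasattr(err, "code"):
        parts.append(f"HTTP status {err.code}")
    if hasattr(err, "reason"):
        parts.append(f"reason: {err.reason}")
    return "; ".join(parts)


# Errors built by hand for illustration; in real use urlopen() raises them.
http_err = urllib.error.HTTPError(
    "https://example.com/missing", 404, "Not Found", None, io.BytesIO(b"")
)
url_err = urllib.error.URLError("connection refused")

print(describe_error(http_err))   # HTTP status 404; reason: Not Found
print(describe_error(url_err))    # reason: connection refused
```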