Web crawler practice
2022-07-05 21:49:00 【Computer Trainee】
Web crawler practice: using the urllib library
Hello! Welcome to Computer Trainee's blog.
If you want to learn more about this topic, follow the blogger, Computer Trainee, to discuss questions.
Basic use of the urllib library
# -*- coding=utf-8 -*-
# ------------ Basic use of urllib ----------------
import urllib.request
# Target URL
url = 'http://www.baidu.com'
# Send a GET request to the server and get back a response object
response = urllib.request.urlopen(url)
# read() returns the whole body as bytes; decode() turns the bytes into a str.
# utf-8 is the charset the page declares: <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
content = response.read().decode('utf-8')
# Print the content
print(content)
# The body is a stream that can be consumed only once. After the read() above,
# calls 1-4 below have nothing left to read; call urlopen(url) again before trying one.
# 1. Read the entire body as bytes
content1 = response.read()
# 2. Read the first 5 bytes
content2 = response.read(5)
# 3. Read one line
content3 = response.readline()
# 4. Read all remaining lines into a list
content4 = response.readlines()
# 5. Read the response headers as a list of (name, value) tuples
content5 = response.getheaders()
# Read the status code; 200 means the request succeeded
code = response.getcode()
# Read the URL that was actually fetched
urll = response.geturl()
# ------------------------------------------------
The code above covers basic usage. Run it yourself to see the results, and leave a comment if you have questions.
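One detail of the read methods is worth seeing in isolation: the response body is a stream, so once read() has consumed it, a second read() finds nothing left. The sketch below shows that, and also reads the charset from the Content-Type header instead of hard-coding utf-8. The helper names read_twice and fetch_text are hypothetical, chosen only for this illustration.

```python
from urllib.request import urlopen


def read_twice(url):
    """Read the body two times in a row and return both results."""
    # 'with' closes the response when the block ends
    with urlopen(url) as response:
        first = response.read()   # consumes the whole body
        second = response.read()  # nothing is left to read
    return first, second


def fetch_text(url, fallback_charset='utf-8'):
    """Fetch url and return the body decoded to a str."""
    with urlopen(url) as response:
        # Use the charset from the Content-Type header if the server sent one,
        # otherwise fall back to the assumed default
        charset = response.headers.get_content_charset() or fallback_charset
        # Read the body once and keep the bytes
        body = response.read()
    return body.decode(charset)
```

Calling fetch_text('http://www.baidu.com') should return the same page text as the read().decode('utf-8') line above, provided the server reports utf-8 or no charset at all.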
Downloading Baidu images and videos with urllib
- Find the link address of the video or image
For an image, right-click it, copy the image address, and paste it into the program.
For a video, the link address is located as shown in the figure below (left-click through the steps):
Find src; its value is the link address of the video. Copy it into the program.
- Download with urlretrieve
urlretrieve is the download function. Usage: urlretrieve(link address of the video or image, 'save path')
# ---------------- Download pictures and videos ----------------------
from urllib.request import urlopen, urlretrieve
url = 'http://www.baidu.com'
# Copy image address
url_img = 'https://gimg2.baidu.com/image_search/src=http%3A%2F%2Fimg.jj20.com%2Fup%2Fallimg%2F4k%2Fs%2F02%2F2109242129504953-0-lp.jpg&refer=http%3A%2F%2Fimg.jj20.com&app=2002&size=f9999,10000&q=a80&n=0&g=0n&fmt=jpeg?sec=1646899077&t=6585c66ba1ae162ac19a4665f8120aea'
# urlretrieve(address to download from, 'path to save the file to'); the target folder must already exist
urlretrieve(url_img, 'F://python- introduction / Reptile practice /girl.jpg')
# Inspect the page and copy the address of the video; see the crawler notes for the detailed steps
url_video = 'https://vd4.bdstatic.com/mda-kkefq6gkpfrcniwa/sc/cae_h264_nowatermark/1605409952/mda-kkefq6gkpfrcniwa.mp4?v_from_s=hkapp-haokan-nanjing&auth_key=1644309721-0-0-3296c254900172089ec79be4faf7c27e&bcevod_channel=searchbox_feed&pd=1&pt=3&logid=0721124788&vid=5479348676568395032&abtest=100534_1&klogid=0721124788'
urlretrieve(url_video, 'F://python- introduction / Reptile practice / beauty .mp4')
# -------------------------------------------------
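urlretrieve writes to the save path as given and does not create missing folders, so the download fails if the folder is not there yet. A minimal sketch of a wrapper that creates the folder first is shown below; the function name download is hypothetical and not part of urllib.

```python
from pathlib import Path
from urllib.request import urlretrieve


def download(url, save_path):
    """Download url to save_path, creating the target folder if needed."""
    target = Path(save_path)
    # Create the parent folders up front; no error if they already exist
    target.parent.mkdir(parents=True, exist_ok=True)
    # urlretrieve returns a tuple: (local file name, response headers)
    filename, headers = urlretrieve(url, str(target))
    return Path(filename)
```

With this helper, the image above could be saved with download(url_img, 'downloads/girl.jpg'), where downloads is only an example folder name.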
Customizing the request header: getting past UA anti-crawling checks
- Right-click on the page, click Inspect, and refresh the page
Click Network, click the entry for the site in the request list, scroll down to User-Agent (UA), and copy its value into the program.
# ----------------- Request object customization ---------------------
from urllib.request import urlopen, Request
url = 'http://www.baidu.com/'
# Set request header
header = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36'
}
# Build a request that carries the browser's User-Agent header
request = Request(url=url, headers=header)
# Send the request and get the response
response = urlopen(request)
# Read the response body; the server reports Content-Type: text/html;charset=utf-8
content = response.read().decode('utf-8')
print(content)
# ---------------------------------------------------
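The reason a custom header helps is that urllib identifies itself as Python-urllib unless told otherwise, which a site can easily recognize and reject. The sketch below shows how to check both the default value and the value a customized Request will send, without making any network call. The helper names default_user_agent and make_request are hypothetical; the UA string is the one copied in the example above.

```python
from urllib.request import Request, build_opener

# UA string copied from the browser, as in the example above
BROWSER_UA = ('Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36')


def default_user_agent():
    """Return the User-Agent urllib sends when none is set."""
    # A fresh opener carries its default headers in addheaders
    return dict(build_opener().addheaders)['User-agent']


def make_request(url, user_agent=BROWSER_UA):
    """Build a Request that presents itself with the given User-Agent."""
    return Request(url=url, headers={'User-Agent': user_agent})


request = make_request('http://www.baidu.com/')
# Request stores header names capitalized, hence 'User-agent'
print(default_user_agent())
print(request.get_header('User-agent'))
```

The first printed line starts with Python-urllib/ followed by the Python version; the second is the browser string, which is what the server sees once the request is passed to urlopen.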
That covers the hands-on introduction to web crawlers. If you have any questions, contact the author.
Follow the author for more hands-on programming walkthroughs.