Crawler practice
2022-07-05 21:49:00 【Computer Trainee】
Crawler practice: using the urllib library
Hello! Welcome to the Computer Trainee blog.
If you want to learn more about this topic, you can follow the blogger Computer Trainee and discuss any questions with them.
Basic use of the urllib library
# -*- coding=utf-8 -*-
# ------------ Basic use of urllib ----------------
import urllib.request
# The URL to request
url = 'http://www.baidu.com'
# Send the request; urlopen returns a response object
response = urllib.request.urlopen(url)
# read() returns the body as bytes and decode() turns them into text. UTF-8 is the
# charset the page declares in <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
content = response.read().decode('utf-8')
# Print the content
print(content)
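# The response body is a stream that can be read only once, and content above
# already consumed it, so request the page again before trying the reads below.
# Options 1-4 share that stream: try them one at a time on a fresh response.
response = urllib.request.urlopen(url)
# 1. Read the entire body as bytes
content1 = response.read()
# 2. Read the next 5 bytes (the first 5 on a fresh response)
content2 = response.read(5)
# 3. Read one line
content3 = response.readline()
# 4. Read all remaining lines into a list
content4 = response.readlines()
# 5. Read the response headers
content5 = response.getheaders()
# Read the status code; 200 means the request succeeded
code = response.getcode()
# Read the URL of the retrieved resource
urll = response.geturl()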
# ------------------------------------------------
The code above covers basic usage. Run it yourself to see the results, and leave a comment if you have questions.
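As a complement to the calls above, here is a minimal sketch of reading a response more defensively. The helper name `fetch_text`, the 10-second timeout, and the UTF-8 fallback are assumptions for illustration and are not part of the original post. It closes the connection with a `with` block, decodes with the charset from the Content-Type header when one is declared, and reads a second time to show that the body can only be consumed once.

```python
import urllib.request


def fetch_text(url, default_encoding="utf-8", timeout=10):
    """Return (decoded body, bytes left over after the first read).

    Hypothetical helper: the name, timeout and fallback encoding are
    illustrative assumptions.
    """
    # The with block closes the connection even if reading fails
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Prefer the charset declared in the Content-Type header, if any
        charset = response.headers.get_content_charset() or default_encoding
        body = response.read()
        # The stream is already exhausted, so this second read is empty
        leftover = response.read()
    return body.decode(charset), leftover
```

For example, `text, leftover = fetch_text('http://www.baidu.com')` should leave `leftover` as empty bytes, because the first `read()` already consumed the whole body.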
Downloading Baidu images and videos with urllib
- Find the link address of the video or image
For an image, right-click it, copy the image address, and paste it directly into the program.
For a video, follow the steps shown in the figure below (all left clicks):
Find src; its value is the link address of the video. Copy it into the program.
- Use urlretrieve to download
urlretrieve is the download function. Usage: urlretrieve(link address of the video or image, 'save path')
# ---------------- Download pictures and videos ----------------------
from urllib.request import urlopen, urlretrieve
url = 'http://www.baidu.com'
# Copy image address
url_img = 'https://gimg2.baidu.com/image_search/src=http%3A%2F%2Fimg.jj20.com%2Fup%2Fallimg%2F4k%2Fs%2F02%2F2109242129504953-0-lp.jpg&refer=http%3A%2F%2Fimg.jj20.com&app=2002&size=f9999,10000&q=a80&n=0&g=0n&fmt=jpeg?sec=1646899077&t=6585c66ba1ae162ac19a4665f8120aea'
# Arguments: the download URL, then the path to save the file to (the folder must already exist)
urlretrieve(url_img, 'F://python-introduction/Reptile practice/girl.jpg')
# Inspect the page and copy the video's address; see the crawler notes for the detailed steps
url_video = 'https://vd4.bdstatic.com/mda-kkefq6gkpfrcniwa/sc/cae_h264_nowatermark/1605409952/mda-kkefq6gkpfrcniwa.mp4?v_from_s=hkapp-haokan-nanjing&auth_key=1644309721-0-0-3296c254900172089ec79be4faf7c27e&bcevod_channel=searchbox_feed&pd=1&pt=3&logid=0721124788&vid=5479348676568395032&abtest=100534_1&klogid=0721124788'
urlretrieve(url_video, 'F://python-introduction/Reptile practice/beauty.mp4')
# -------------------------------------------------
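Because urlretrieve fails when the target folder does not exist, a small wrapper can create it first. The sketch below is an illustration only: the helper name `download` and its behavior of creating parent folders are assumptions, not something the original post does.

```python
from pathlib import Path
from urllib.request import urlretrieve


def download(url, target):
    """Download url to target, creating missing parent folders first.

    Hypothetical helper for illustration; returns the saved path.
    """
    target = Path(target)
    # Create the destination folder (and any parents) if it is missing
    target.parent.mkdir(parents=True, exist_ok=True)
    # urlretrieve returns a (filename, headers) tuple
    saved_path, _headers = urlretrieve(url, str(target))
    return Path(saved_path)
```

With this in place, a call such as `download(url_img, 'downloads/girl.jpg')` would save the image under a `downloads` folder (a hypothetical path) even if that folder did not exist beforehand.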
Customizing the request header: getting past User-Agent (UA) anti-crawling checks
- Right-click the page, click Inspect, and refresh the page
Open the Network tab, click the request for the site, scroll down to User-Agent, and copy its value into the program.
# ----------------- Request object customization ---------------------
from urllib.request import urlopen, Request
url = 'http://www.baidu.com/'
# Set request header
header = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36'
}
# Build a request object that carries the custom headers
request = Request(url=url, headers=header)
# Send the request to the server and get the response
response = urlopen(request)
# Read the response body and decode it; the page declares Content-Type: text/html;charset=utf-8
content = response.read().decode('utf-8')
print(content)
# ---------------------------------------------------
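To see why the custom header matters, the following sketch compares the User-Agent that urllib sends by default with the one stored on a customized Request. It makes no network calls. The helper names `default_user_agent` and `build_request` are assumptions for illustration; note that Request stores header names in capitalized form, so the key becomes `User-agent`.

```python
import urllib.request
from urllib.request import Request


def default_user_agent():
    """Return the User-Agent urllib sends when none is set (hypothetical helper)."""
    # A default opener carries a ('User-agent', 'Python-urllib/x.y') header
    return dict(urllib.request.build_opener().addheaders)["User-agent"]


def build_request(url, user_agent):
    """Build a Request carrying a browser-style User-Agent (hypothetical helper)."""
    return Request(url=url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36")
    request = build_request("http://www.baidu.com/", ua)
    # The default value identifies the client as a Python script
    print("default:", default_user_agent())
    # Request normalizes the header name to 'User-agent'
    print("custom: ", request.get_header("User-agent"))
```

The default value starts with `Python-urllib/`, which makes a script easy for a server to recognize; sending a browser's User-Agent string instead is what the header customization above achieves.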
That covers the hands-on basics of getting started with crawlers. If you have any questions, feel free to contact the author.
Follow the author to see more hands-on programming walkthroughs.