Web Scraping Exercise (2)
2022-07-05 19:05:00 【InfoQ】
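"""
Target site: https://www.1ppt.com/moban/
Requirements:
1. Crawl the HTML source of this listing, following the pagination
2. Save each page locally, paying attention to the encoding
"""
'''
1. Analyze the site:
https://www.1ppt.com/moban/ page 1
https://www.1ppt.com/moban/ppt_moban_2.html page 2
https://www.1ppt.com/moban/ppt_moban_3.html page 3
'''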
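import urllib.request

start = int(input("Enter the start page: "))  # convert the input to int
end = int(input("Enter the end page: "))
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'
}
for n in range(start, end + 1):
    # Per the analysis above, page 1 is the index URL itself;
    # only page 2 onwards follows the ppt_moban_{n}.html pattern.
    if n == 1:
        url = 'https://www.1ppt.com/moban/'
    else:
        url = 'https://www.1ppt.com/moban/ppt_moban_{}.html'.format(n)
    print(url)
    req = urllib.request.Request(url, headers=headers)  # build the request object
    response = urllib.request.urlopen(req)  # send the request
    # Open with 'w' so a re-run overwrites the file instead of appending a second copy
    with open(f'第{n}页.html', 'w', encoding='gb2312') as f:
        f.write(response.read().decode('gb2312'))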
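The two parts of the exercise that are easiest to get wrong are the page-1 URL and the encoding. The sketch below isolates both so they can be checked without touching the network. The helper names `page_url` and `decode_html` are hypothetical and not part of the original script. It also assumes the page declares its charset in a `<meta>` tag near the top of the HTML, and it swaps `gb2312` for `gbk` when decoding, since GBK extends GB2312 and tolerates characters the stricter codec would reject.

```python
import re

BASE = "https://www.1ppt.com/moban/"


def page_url(n: int) -> str:
    """Return the listing URL for page n (hypothetical helper)."""
    if n < 1:
        raise ValueError("page numbers start at 1")
    # Page 1 is the bare index URL; later pages use ppt_moban_{n}.html
    return BASE if n == 1 else f"{BASE}ppt_moban_{n}.html"


def decode_html(raw: bytes, default: str = "gb2312") -> str:
    """Decode raw HTML bytes using the charset declared in the page, if any.

    Assumption: the declaration appears within the first 1024 bytes.
    """
    match = re.search(rb'charset=["\']?([\w-]+)', raw[:1024], re.IGNORECASE)
    name = match.group(1).decode("ascii").lower() if match else default
    if name == "gb2312":
        name = "gbk"  # GBK extends GB2312, so it is the more forgiving choice
    try:
        return raw.decode(name, errors="replace")
    except LookupError:
        # The page declared a codec Python does not know; fall back to GBK
        return raw.decode("gbk", errors="replace")


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(page_url(n))
    sample = '<meta charset="gb2312"><title>模板</title>'.encode("gb2312")
    print(decode_html(sample))
```

In the loop above, `page_url(n)` could replace the URL-building branch and `decode_html(response.read())` could replace the hard-coded `.decode('gb2312')`. Writing the result with `encoding='utf-8'` would then be an option too, provided the `<meta>` charset in the saved file is updated to match.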