Web Scraping Exercise (2)
2022-07-05 19:05:00 【InfoQ】
"""
Target site: https://www.1ppt.com/moban/
Requirements:
1. Crawl the HTML source of this listing, page by page
2. Save each page locally, paying attention to the encoding
"""
'''
1. Analyze the site:
https://www.1ppt.com/moban/ page 1
https://www.1ppt.com/moban/ppt_moban_2.html page 2
https://www.1ppt.com/moban/ppt_moban_3.html page 3
'''
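import urllib.request

start = int(input("Enter the first page: "))  # input() returns a str, so convert to int
end = int(input("Enter the last page: "))
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'
}
for n in range(start, end + 1):
    # Per the analysis above, page 1 is the directory URL; later pages use ppt_moban_{n}.html
    if n == 1:
        url = 'https://www.1ppt.com/moban/'
    else:
        url = 'https://www.1ppt.com/moban/ppt_moban_{}.html'.format(n)
    print(url)
    req = urllib.request.Request(url, headers=headers)  # build the request object
    response = urllib.request.urlopen(req)  # send the request
    # Decode and save as gb2312; mode 'w' keeps reruns from appending duplicate content
    with open(f'第{n}页.html', 'w', encoding='gb2312') as f:
        f.write(response.read().decode('gb2312'))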
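
The URL pattern from the analysis and the encoding step can also be pulled into two small helpers, which makes them easy to check without touching the network. This is only a sketch: `BASE_URL`, `page_url`, and `decode_page` are hypothetical names that do not appear in the original script, and the fallback to `gbk` is an assumption (GBK is a superset of GB2312, so it may help when a page contains characters outside GB2312).

```python
# Hypothetical helpers; names are illustrative and not part of the original script.
BASE_URL = 'https://www.1ppt.com/moban/'


def page_url(n: int) -> str:
    """Return the listing URL for page n (1-based), following the pattern analyzed above."""
    if n < 1:
        raise ValueError('page numbers start at 1')
    if n == 1:
        # Page 1 is the directory URL itself
        return BASE_URL
    return f'{BASE_URL}ppt_moban_{n}.html'


def decode_page(raw: bytes) -> str:
    """Decode bytes as gb2312; on failure, retry with gbk and replace undecodable bytes."""
    try:
        return raw.decode('gb2312')
    except UnicodeDecodeError:
        # Assumption: the page is in the GB family, so the gbk superset is a reasonable fallback
        return raw.decode('gbk', errors='replace')
```

Inside the loop, `url = page_url(n)` and `f.write(decode_page(response.read()))` would replace the corresponding lines. If the fallback is used, the output file should probably be opened with `encoding='gbk'` as well, since writing non-GB2312 characters to a gb2312 file would raise an error.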