Web Scraping Exercise (2)
2022-07-05 19:05:00 【InfoQ】
"""
Target site: https://www.1ppt.com/moban/
Requirements:
1. Crawl the HTML source of this listing page by page.
2. Save each page locally, paying attention to the encoding.
"""
'''
1. Analyze the site's URL pattern:
https://www.1ppt.com/moban/ page 1
https://www.1ppt.com/moban/ppt_moban_2.html page 2
https://www.1ppt.com/moban/ppt_moban_3.html page 3
'''
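import urllib.request

start = int(input("Enter the start page: "))  # convert the input to int
end = int(input("Enter the end page: "))
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'
}
for n in range(start, end + 1):
    # Per the analysis above, page 1 has no numeric suffix;
    # later pages follow the ppt_moban_{n}.html pattern.
    if n == 1:
        url = 'https://www.1ppt.com/moban/'
    else:
        url = 'https://www.1ppt.com/moban/ppt_moban_{}.html'.format(n)
    print(url)
    req = urllib.request.Request(url, headers=headers)  # build the request object
    response = urllib.request.urlopen(req)  # send the request
    # Decode and save as GB2312; 'w' avoids appending duplicate content on reruns.
    with open(f'page_{n}.html', 'w', encoding='gb2312') as f:
        f.write(response.read().decode('gb2312'))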
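The URL pattern from the analysis and the "mind the encoding" requirement can each be isolated in a small helper, which makes them testable without touching the network. The following is a minimal sketch, not the original author's code: the names `build_page_url` and `decode_html` are hypothetical, and falling back to GB18030 (a superset of GB2312) when strict GB2312 decoding fails is an assumption about how to handle pages that contain characters outside GB2312.

```python
BASE_URL = 'https://www.1ppt.com/moban/'


def build_page_url(n: int) -> str:
    """Return the listing URL for page n (hypothetical helper)."""
    if n < 1:
        raise ValueError('page numbers start at 1')
    # Page 1 is the bare listing URL; later pages use ppt_moban_{n}.html.
    if n == 1:
        return BASE_URL
    return f'{BASE_URL}ppt_moban_{n}.html'


def decode_html(raw: bytes, encodings=('gb2312', 'gb18030')) -> str:
    """Try each encoding in order (assumed fallback chain).

    If none decodes cleanly, decode with the last encoding and
    replace the undecodable bytes instead of raising.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return raw.decode(encodings[-1], errors='replace')


if __name__ == '__main__':
    # Print the URLs for the first three pages; no request is sent.
    for n in range(1, 4):
        print(build_page_url(n))
```

In the loop above, `build_page_url(n)` could replace the inline URL construction and `decode_html(response.read())` could replace the direct `.decode('gb2312')` call. If the text is decoded through the fallback, it would also need to be written with an encoding that can represent it, such as `encoding='gb18030'` or `encoding='utf-8'`.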