Web crawler exercises (II)
2022-07-05 19:22:00 【InfoQ】
"""
Target site: https://www.1ppt.com/moban/
Crawling requirements:
1. Page through the listing and crawl the source code of each page
2. Save it locally, paying attention to the encoding
"""
'''
1. Analyze the site's URL pattern:
https://www.1ppt.com/moban/ first page
https://www.1ppt.com/moban/ppt_moban_2.html second page
https://www.1ppt.com/moban/ppt_moban_3.html third page
'''
import urllib.request

start = int(input("Enter the start page: "))  # input() returns a str, so convert to int
end = int(input("Enter the end page: "))
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'
}
for n in range(start, end + 1):
    # Per the URL analysis, page 1 is the bare listing URL; later pages use ppt_moban_<n>.html
    if n == 1:
        url = 'https://www.1ppt.com/moban/'
    else:
        url = 'https://www.1ppt.com/moban/ppt_moban_{}.html'.format(n)
    print(url)
    req = urllib.request.Request(url, headers=headers)  # Build the request object with a browser User-Agent
    response = urllib.request.urlopen(req)  # Send the request
    # Decode the bytes as gb2312 and write the file in the same encoding;
    # mode 'w' overwrites, so re-running does not append duplicate copies
    with open(f'page_{n}.html', 'w', encoding='gb2312') as f:
        f.write(response.read().decode('gb2312'))
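
The URL pattern from the analysis and the encoding step can also be factored into small helpers. This is a minimal sketch, not the original author's code: page_url, decode_page, and fetch are hypothetical names, the 10-second timeout is an arbitrary choice, and decoding with gb18030 plus errors="replace" is an assumption that the site serves a GB-family encoding, as the gb2312 used in the script implies.

```python
import urllib.request

# Listing root, taken from the URL analysis in the script's notes.
BASE_URL = "https://www.1ppt.com/moban/"


def page_url(n: int) -> str:
    """Return the listing URL for page n (1-based)."""
    if n < 1:
        raise ValueError("page numbers start at 1")
    # Page 1 is the bare listing URL; later pages follow ppt_moban_<n>.html.
    return BASE_URL if n == 1 else f"{BASE_URL}ppt_moban_{n}.html"


def decode_page(raw: bytes) -> str:
    """Decode page bytes, assuming a GB-family encoding (an assumption)."""
    # gb18030 is a superset of gb2312, so it decodes everything gb2312 can;
    # errors="replace" substitutes undecodable bytes instead of raising.
    return raw.decode("gb18030", errors="replace")


def fetch(url: str, headers: dict, timeout: float = 10.0) -> bytes:
    """Download one page and return its raw bytes."""
    req = urllib.request.Request(url, headers=headers)
    # The context manager closes the connection once the body is read.
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return response.read()
```

For example, page_url(1) returns the bare listing URL and page_url(2) returns the ppt_moban_2.html address, so the loop no longer needs a special case for the first page.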