Beginner crawler notes (collecting data)
2022-08-04 15:39:00 【Sweat always outweighs talent】
import urllib.request
import urllib.error


def main():
    # 1. Crawl the pages (each page is parsed as it is fetched)
    baseurl = 'https://movie.douban.com/top250?start='
    datalist = getData(baseurl)
    # 2. Save the data (not implemented yet)
    print()


# Crawl the pages
def getData(baseurl):
    # Fetch one page of data at a time, looping over all 10 pages
    datalist = []
    for i in range(0, 10):
        url = baseurl + str(i * 25)  # each page lists 25 entries
        html = askURL(url)
    return datalist


# Request a web page
def askURL(url):
    # Send a browser-like User-Agent with the request
    header = {"User-Agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.134 Mobile Safari/537.36 Edg/103.0.1264.77"}
    request = urllib.request.Request(url, headers=header)
    html = ""
    try:
        response = urllib.request.urlopen(request)
        html = response.read().decode()
        print(html)
    except urllib.error.URLError as e:
        # HTTPError carries a status code; every URLError carries a reason
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, "reason"):
            print(e.reason)
    return html


if __name__ == '__main__':
    main()

So far the code only handles collecting the data. It is not finished yet, and I will keep updating it!

(The tutorial this is based on comes from Bilibili (B站). If anything here infringes, please send me a private message and I will delete it.)
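The loop in getData builds one URL per page by appending a multiple of 25 to the base URL, but it does not yet keep the HTML it fetches. Here is a minimal sketch of how the page URLs could be built and the fetched pages collected. The names `page_urls`, `collect_pages`, and `fake_fetch` are hypothetical helpers for illustration and are not part of the original code; `fetch` stands in for askURL so the sketch runs without network access.

```python
def page_urls(baseurl, pages=10, page_size=25):
    """Build the URL for each results page (start=0, 25, 50, ...)."""
    return [baseurl + str(i * page_size) for i in range(pages)]


def collect_pages(baseurl, fetch, pages=10, page_size=25):
    """Call fetch(url) for every page and keep the non-empty HTML strings."""
    htmls = []
    for url in page_urls(baseurl, pages, page_size):
        html = fetch(url)
        if html:  # askURL returns "" when the request fails
            htmls.append(html)
    return htmls


if __name__ == "__main__":
    # Offline stand-in for askURL (an assumption made for this demo only).
    def fake_fetch(url):
        return "<html>" + url + "</html>"

    base = "https://movie.douban.com/top250?start="
    print(page_urls(base)[:3])
    print(len(collect_pages(base, fake_fetch)))
```

In the real crawler, askURL would be passed in place of fake_fetch, and the returned list would be the input to the parsing step that has not been written yet.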








