My Crawler Notes (7): Using a Crawler to Add +1 to Blog Views
2022-07-27 00:19:00 【睡醒继续做梦】
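I was bored and browsing videos when I came across yet another tutorial on downloading images with a crawler. It looked pretty simple, which is how this post came about.
Those tutorials all scrape images, but it occurred to me that the same approach might work for bumping the view count on my own blog.
Given the blog homepage, the script visits each post URL found on it in turn, which should increase the view count.
The theory holds, so on to practice.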
import requests
from bs4 import BeautifulSoup
import time
# Override the default request headers (User-Agent string found online); otherwise the site rejects the request
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}
count = 1
url = "https://blog.csdn.net/qq_53950686"  # using my own blog homepage as the example
respon = requests.get(url, headers=headers)
respon.encoding = 'utf-8'
print(respon.status_code)
# Parse the returned HTML text into a tree that can be searched
code = BeautifulSoup(respon.text, "html.parser")
# Literally what it says: the page title
print(code.title)
# ↓ Find the <a> tags inside <div class="mainContent"></div>
# print(code.find("div", class_="mainContent"))  # uncomment and look at the output to see how this works
html = code.find("div", class_="mainContent").find_all("a")
# print(html)  # uncomment and look at the output to see how this works
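while True:
    for a in html:
        blog_url = a.get('href')  # the post URL is in the href attribute
        # print(blog_url)
        # Skip <a> tags with no href or a non-absolute link; requests.get() would raise on them
        if not blog_url or not blog_url.startswith('http'):
            continue
        # The request below simulates clicking into the post
        resp = requests.get(blog_url, headers=headers)
        resp.encoding = 'utf-8'
        code_blog = BeautifulSoup(resp.text, "html.parser")
        print(code_blog.title)
        print('Success #' + str(count))
        count += 1
        time.sleep(5)  # going too fast gets you blocked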
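The loop above requests every href found inside the mainContent div as-is. As a rough sketch of a stricter filter, the helper below resolves relative links, drops non-HTTP links, removes duplicates, and keeps only URLs that look like posts. The function name collect_post_urls is made up for illustration, and the "/article/details/" path check is an assumption about how post URLs on this site look, not something the script above verifies.

```python
from urllib.parse import urljoin, urlsplit


def collect_post_urls(hrefs, base_url):
    """Return unique absolute post URLs, in first-seen order."""
    seen = set()
    urls = []
    for href in hrefs:
        if not href:
            # <a> tags without an href attribute yield None
            continue
        absolute = urljoin(base_url, href)  # resolves relative links
        parts = urlsplit(absolute)
        if parts.scheme not in ("http", "https"):
            # drops javascript:, mailto: and similar links
            continue
        if "/article/details/" not in parts.path:
            # assumption: post pages contain this path segment
            continue
        if absolute not in seen:
            seen.add(absolute)
            urls.append(absolute)
    return urls
```

With the names from the script, this could be called as collect_post_urls((a.get('href') for a in html), url), and the for loop would then iterate over the returned list instead of over html.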
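Known issues
① When I click into a post manually, the URL ends with a (?spm=) query parameter. I looked it up: this is called event tracking (埋点), which roughly means the page records where the visit came from. I don't know whether it poses any risk.
② No exception handling has been added.
③ A +1 on the view count mostly just looks a bit nicer and isn't of much real use. Impressions are probably what matters, but I don't know the details.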
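For issue ②, one possible shape for the missing exception handling is sketched below. It is only an illustration, not part of the original script: visit_all and its fetch, delay, and sleep parameters are hypothetical names. It catches OSError because requests.RequestException is a subclass of IOError, which is the same class as OSError in Python 3, so a failed request is logged and the loop moves on to the next URL instead of crashing.

```python
import time


def visit_all(urls, fetch, delay=5.0, sleep=time.sleep):
    """Call fetch(url) for each URL; log failures and keep going.

    Returns a (succeeded, failed) tuple of counts.
    """
    succeeded = failed = 0
    for u in urls:
        try:
            fetch(u)
        except OSError as exc:
            # requests.RequestException derives from IOError (OSError),
            # so timeouts and connection errors from requests land here.
            failed += 1
            print(f"failed: {u} ({exc})")
        else:
            succeeded += 1
            print(f"ok: {u}")
        sleep(delay)  # pause between requests, as in the original loop
    return succeeded, failed
```

With requests, fetch could be a small function that calls requests.get(u, headers=headers, timeout=10) and then resp.raise_for_status(), so that timeouts and HTTP error statuses are both counted as failures.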
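So once again I accidentally picked up a bit more crawler knowledge, and got another filler post out of it.
I'm a crawler beginner, so please point out any mistakes.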