Scraping the novel Romance of the Three Kingdoms (三国演义)
2022-08-02 08:35:00 【赵颂@】
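import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

# Scrape every chapter title and chapter body of Romance of the Three Kingdoms
# from https://www.shicimingju.com/book/sanguoyanyi.html
if __name__ == '__main__':
    headers = {
        "User-Agent": UserAgent().chrome
    }
    get_url = 'https://www.shicimingju.com/book/sanguoyanyi.html'
    # Send the request and keep the raw response bytes,
    # so that BeautifulSoup can detect the page encoding itself
    page_text = requests.get(url=get_url, headers=headers).content
    # Parse the chapter titles and chapter links out of the index page
    # 1. Instantiate a BeautifulSoup object and load the HTML into it
    soup = BeautifulSoup(page_text, 'lxml')
    # print(soup)
    # 2. Select the list items that hold each chapter title and detail-page URL
    list_data = soup.select('.book-mulu > ul > li')
    # The with block closes the file even if a request fails partway through
    with open('./sanguo.text', 'w', encoding='utf-8') as fp:
        for li in list_data:
            title = li.a.text
            # urljoin avoids a doubled slash when the href already starts with '/'
            detail_url = urljoin('https://www.shicimingju.com/', li.a['href'])
            # Request the detail page for this chapter
            detail_text = requests.get(url=detail_url, headers=headers).content
            detail_soup = BeautifulSoup(detail_text, 'lxml')
            # Extract the chapter body
            content = detail_soup.find('div', class_='chapter_content').text
            # Persist the chapter to disk
            fp.write(title + ":" + content + "\n")
            print(title, 'download complete')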
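The script hands BeautifulSoup bytes rather than an already decoded string. The sketch below illustrates why that matters, assuming the pages are UTF-8 and that requests may fall back to ISO-8859-1 when a text response declares no charset. The sample string is only an illustration and is not fetched from the site.

```python
# Hypothetical sample text standing in for a chapter title on the page.
original = "第一回 宴桃园豪杰三结义"
raw_bytes = original.encode("utf-8")  # the bytes a UTF-8 page would contain

# Decoding with the wrong codec does not raise an error, because
# ISO-8859-1 maps every byte value to a character; it produces mojibake.
garbled = raw_bytes.decode("iso-8859-1")

# Re-encoding with the same codec restores the original bytes exactly,
# which can then be decoded (or parsed) as UTF-8.
recovered = garbled.encode("iso-8859-1")

print(garbled == original)                    # False
print(recovered == raw_bytes)                 # True
print(recovered.decode("utf-8") == original)  # True
```

The round trip only works when the text was decoded as ISO-8859-1 in the first place. If the server does declare a charset, encoding Chinese text as ISO-8859-1 raises UnicodeEncodeError, so reading the raw bytes directly is the less fragile option.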









