A Simple, Beginner-Friendly Example of Requesting and Parsing a Web Page
2022-07-05 12:40:00 【南湖渔歌】
The sample page, saved as test.html:
<html>
<head>
<meta http-equiv=Content-Type content="text/html;charset=utf-8">
<title>Page title</title>
</head>
<body>
<h1>Heading 1</h1>
<h2>Heading 2</h2>
<h3>Heading 3</h3>
<h4>Heading 4</h4>
<div id="content" class="default">
<p>Paragraph</p>
<a href="http://www.baidu.com">Baidu</a> <br/>
<a href="http://www.crazyant.net">Crazy Ant</a> <br/>
<a href="http://www.iqiyi.com">iQIYI</a> <br/>
<img src="https://www.python.org/static/img/python-logo.png"/>
</div>
</body>
</html>

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup

# Read the sample page from disk
with open('./test.html', encoding='utf-8') as fin:
    html_doc = fin.read()

soup = BeautifulSoup(html_doc, 'html.parser')

# Print the tag name, href and text of every link on the page
links = soup.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

# Print the src of the first image on the page
img = soup.find('img')
print(img['src'])
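
In the snippet above, `link['href']` raises `KeyError` if an `<a>` tag has no `href` attribute, and `soup.find('img')` returns `None` when the page has no image. A minimal defensive sketch follows. The inline HTML string and the helper names `extract_links` and `first_image_src` are illustrative assumptions, not part of the original example.

```python
from bs4 import BeautifulSoup

# Hypothetical inline sample so the sketch runs without test.html.
SAMPLE_HTML = """
<div id="content">
<a href="http://www.baidu.com">Baidu</a>
<a>no href here</a>
</div>
"""


def extract_links(html):
    """Return (href, text) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, 'html.parser')
    pairs = []
    for link in soup.find_all('a'):
        href = link.get('href')  # .get() returns None instead of raising KeyError
        if href is not None:
            pairs.append((href, link.get_text()))
    return pairs


def first_image_src(html):
    """Return the src of the first <img>, or None if there is no image."""
    soup = BeautifulSoup(html, 'html.parser')
    img = soup.find('img')  # find() returns None when nothing matches
    return img.get('src') if img is not None else None


print(extract_links(SAMPLE_HTML))
print(first_image_src(SAMPLE_HTML))
```

Whether to skip incomplete tags or fail loudly depends on the page being parsed; for a quick exploration script the direct `link['href']` form is often fine.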
# Upgraded version: restrict the search to one block of the page
from bs4 import BeautifulSoup

with open('./test.html', encoding='utf-8') as fin:
    html_doc = fin.read()

soup = BeautifulSoup(html_doc, 'html.parser')

# First locate the enclosing block
div_node = soup.find('div', id='content')
print(div_node)
print("#" * 50)

# Then search for links and the image inside that block only
links = div_node.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

img = div_node.find('img')
print(img['src'])
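
The same "find the block first, then search inside it" idea can also be written with CSS selectors. The sketch below is one possible alternative, not the original author's method; it assumes a Beautiful Soup 4 installation where `select()` and `select_one()` are available, and it uses a hypothetical inline page with one extra link outside the block to show the scoping.

```python
from bs4 import BeautifulSoup

# Hypothetical inline sample; the "outside" link is added for illustration.
SAMPLE_HTML = """
<html><body>
<a href="http://example.com/outside">outside</a>
<div id="content" class="default">
<a href="http://www.baidu.com">Baidu</a>
<a href="http://www.iqiyi.com">iQIYI</a>
<img src="https://www.python.org/static/img/python-logo.png"/>
</div>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

# 'div#content a' matches only <a> tags nested inside <div id="content">.
hrefs = [a['href'] for a in soup.select('div#content a')]
print(hrefs)

# select_one() returns the first match, or None if nothing matches.
img = soup.select_one('div#content img')
print(img['src'] if img is not None else None)
```

The selector form is more compact, while the two-step `find` / `find_all` form keeps the intermediate `div_node` around for printing and inspection.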
