A Simple, Beginner-Friendly Example of Requesting and Parsing a Page
2022-07-05 12:40:00 【南湖渔歌】
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>Page title</title>
</head>
<body>
    <h1>Heading 1</h1>
    <h2>Heading 2</h2>
    <h3>Heading 3</h3>
    <h4>Heading 4</h4>
    <div id="content" class="default">
        <p>Paragraph</p>
        <a href="http://www.baidu.com">Baidu</a> <br/>
        <a href="http://www.crazyant.net">Crazy Ant</a> <br/>
        <a href="http://www.iqiyi.com">iQIYI</a> <br/>
        <img src="https://www.python.org/static/img/python-logo.png"/>
    </div>
</body>
</html>

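# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup

# Read the sample page saved as test.html
with open('./test.html', encoding='utf-8') as fin:
    html_doc = fin.read()

soup = BeautifulSoup(html_doc, 'html.parser')

# Print the tag name, href, and text of every link in the document
links = soup.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

# Print the src of the first image in the document
img = soup.find('img')
print(img['src'])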
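The title mentions page requests, while the scripts here read a saved file. A minimal sketch of the request step is given below, assuming the standard library's `urllib.request` as the HTTP client (the original does not say which client is intended). `fetch_html` is a hypothetical helper name, and the demo uses a `data:` URL only so that it runs without network access; a real `http://` or `https://` URL is passed the same way.

```python
from urllib.parse import quote
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download a page and return its body as text.

    Hypothetical helper, not part of the original scripts.
    """
    with urlopen(url, timeout=timeout) as resp:
        # Use the charset declared in the Content-Type header, else assume UTF-8
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Offline demo: a data: URL carrying one link from the sample page
demo_url = "data:text/html;charset=utf-8," + quote(
    '<a href="http://www.baidu.com">Baidu</a>'
)
html_doc = fetch_html(demo_url)
print(html_doc)
```

The returned string can be handed to `BeautifulSoup(html_doc, 'html.parser')` in place of the text read from `test.html`.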
# Improved version:
from bs4 import BeautifulSoup

with open('./test.html', encoding='utf-8') as fin:
    html_doc = fin.read()

soup = BeautifulSoup(html_doc, 'html.parser')
div_node = soup.find('div', id='content')  # find the enclosing block first
print(div_node)
print("#" * 50)

# Search only inside that block
links = div_node.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

img = div_node.find('img')
print(img['src'])
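Both scripts assume that the `div`, the links, and the image all exist. `find()` returns `None` when nothing matches, and indexing a tag with a missing attribute raises `KeyError`, so real pages often need a guard. The sketch below shows one way to add it; `extract_links_and_image` and its `block_id` parameter are hypothetical names introduced for illustration, and the inline `sample` string stands in for `test.html`.

```python
from bs4 import BeautifulSoup


def extract_links_and_image(html_doc, block_id="content"):
    """Return (links, image_src) found inside the element with the given id.

    links is a list of (href, text) tuples; image_src is None when the
    block contains no <img> with a src attribute.
    Hypothetical helper, not part of the original scripts.
    """
    soup = BeautifulSoup(html_doc, "html.parser")
    block = soup.find(id=block_id)
    if block is None:
        # find() returns None when nothing matches
        return [], None

    # href=True keeps only <a> tags that actually carry an href attribute
    links = [
        (a["href"], a.get_text(strip=True))
        for a in block.find_all("a", href=True)
    ]

    img = block.find("img", src=True)
    return links, (img["src"] if img is not None else None)


sample = """
<div id="content" class="default">
  <a href="http://www.baidu.com">Baidu</a>
  <a>no href here</a>
  <img src="https://www.python.org/static/img/python-logo.png"/>
</div>
"""
links, image_src = extract_links_and_image(sample)
print(links)
print(image_src)
print(extract_links_and_image("<p>no such block</p>"))
```

Returning empty results instead of raising keeps a crawl running when one page deviates from the expected layout; raising an exception would be the better choice when a missing block should stop the run.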
