Simple page request and parsing cases
2022-07-05 13:01:00 【South Lake Fishing Song】
<html>
<head>
<meta http-equiv=Content-Type content="text/html;charset=utf-8">
<title> Webpage title </title>
</head>
<body>
<h1> title 1</h1>
<h2> title 2</h2>
<h3> title 3</h3>
<h4> title 4</h4>
<div id="content" class="default">
<p> The paragraph </p>
<a href="http://www.baidu.com"> Baidu </a> <br/>
<a href="http://www.crazyant.net"> Crazy ant </a> <br/>
<a href="http://www.iqiyi.com"> Iqiyi </a> <br/>
<img src="https://www.python.org/static/img/python-logo.png"/>
</div>
</body>
</html>
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup

# Read the sample page (test.html)
with open('./test.html', encoding='utf-8') as fin:
    html_doc = fin.read()

soup = BeautifulSoup(html_doc, 'html.parser')

# Print the tag name, target URL and text of every link
links = soup.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

# Print the source URL of the first image
img = soup.find('img')
print(img['src'])
# Upgraded version: locate the enclosing block first, then search inside it
from bs4 import BeautifulSoup

with open('./test.html', encoding='utf-8') as fin:
    html_doc = fin.read()

soup = BeautifulSoup(html_doc, 'html.parser')

div_node = soup.find('div', id='content')  # find the large block first
print(div_node)
print("#" * 50)

# Search only within the div
links = div_node.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

img = div_node.find('img')
print(img['src'])
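
The upgraded version above assumes that the div exists and that every link has an href. On a real page either assumption can fail: find() returns None when nothing matches, and link['href'] raises KeyError when the attribute is missing. Below is a minimal sketch of a more defensive variant. The helper name extract_links, the inline SAMPLE_HTML string, and the extra anchor without an href are assumptions added for illustration and are not part of the original post.

```python
from bs4 import BeautifulSoup

# Inline excerpt of the sample page so the sketch runs without test.html.
# The <a name="no-href"> element is a hypothetical addition for illustration.
SAMPLE_HTML = """
<div id="content" class="default">
<p> The paragraph </p>
<a href="http://www.baidu.com"> Baidu </a> <br/>
<a href="http://www.crazyant.net"> Crazy ant </a> <br/>
<a name="no-href"> Anchor without href </a>
<img src="https://www.python.org/static/img/python-logo.png"/>
</div>
"""


def extract_links(html_doc, container_id):
    """Return (href, text) pairs for links inside the div with the given id."""
    soup = BeautifulSoup(html_doc, 'html.parser')
    container = soup.find('div', id=container_id)
    if container is None:  # find() returns None when nothing matches
        return []
    # href=True keeps only <a> tags that actually carry an href attribute
    return [(a['href'], a.get_text(strip=True))
            for a in container.find_all('a', href=True)]


print(extract_links(SAMPLE_HTML, 'content'))
print(extract_links(SAMPLE_HTML, 'missing'))
```

Returning an empty list for a missing container keeps the calling loop simple. Whether a missing block should instead be treated as an error depends on the page being scraped.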