Simple page request and parsing cases
2022-07-05 13:01:00 【South Lake Fishing Song】
<html>
<head>
<meta http-equiv=Content-Type content="text/html;charset=utf-8">
<title> Webpage title </title>
</head>
<body>
<h1> title 1</h1>
<h2> title 2</h2>
<h3> title 3</h3>
<h4> title 4</h4>
<div id="content" class="default">
<p> The paragraph </p>
<a href="http://www.baidu.com"> Baidu </a> <br/>
<a href="http://www.crazyant.net"> Crazy ant </a> <br/>
<a href="http://www.iqiyi.com"> Iqiyi </a> <br/>
<img src="https://www.python.org/static/img/python-logo.png"/>
</div>
</body>
</html>

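# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup

# Read the sample page above, saved as test.html
with open('./test.html', encoding='utf-8') as fin:
    html_doc = fin.read()

soup = BeautifulSoup(html_doc, 'html.parser')

# Print the tag name, target URL and text of every link in the page
links = soup.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

# Print the source URL of the first image in the page
img = soup.find('img')
print(img['src'])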
# Upgraded version: locate the enclosing block first, then search inside it
from bs4 import BeautifulSoup

with open('./test.html', encoding='utf-8') as fin:
    html_doc = fin.read()

soup = BeautifulSoup(html_doc, 'html.parser')
div_node = soup.find('div', id='content')  # find the large block first
print(div_node)
print("#" * 50)

# Only links and images inside the block are considered from here on
links = div_node.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

img = div_node.find('img')
print(img['src'])
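
The scripts above assume that the block, every link's href and the image's src are all present. Otherwise link['href'] raises KeyError, and div_node is None when the block is missing. Below is a minimal sketch of a more defensive variant. The helper name extract_from_block, the inline SAMPLE_HTML string, the example.com link and the anchor without an href are illustrative assumptions and are not part of the original page.

```python
from bs4 import BeautifulSoup

# Inline sample modelled on test.html so the sketch needs no file.
# The example.com link and the <a> without href are hypothetical additions.
SAMPLE_HTML = """
<html><body>
<a href="http://example.com/outside">Outside link</a>
<div id="content" class="default">
  <p>The paragraph</p>
  <a href="http://www.baidu.com">Baidu</a><br/>
  <a href="http://www.crazyant.net">Crazy ant</a><br/>
  <a>No href here</a>
  <img src="https://www.python.org/static/img/python-logo.png"/>
</div>
</body></html>
"""


def extract_from_block(html, block_id):
    """Return (links, image_srcs) found inside the element with the given id."""
    soup = BeautifulSoup(html, 'html.parser')
    block = soup.find(id=block_id)
    if block is None:  # block not present: nothing to extract
        return [], []
    # href=True / src=True keep only tags that actually carry the attribute
    links = [(a['href'], a.get_text(strip=True))
             for a in block.find_all('a', href=True)]
    images = [img['src'] for img in block.find_all('img', src=True)]
    return links, images


links, images = extract_from_block(SAMPLE_HTML, 'content')
for href, text in links:
    print(href, text)
print(images)
```

Filtering with href=True skips anchors that have no target, and the early return avoids calling find_all on None. Links outside the block are ignored, as in the upgraded version.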
