A Simple Guide to Using robobrowser
2022-07-28 18:05:00 [Huawei Cloud]
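A brief introduction to robobrowser:
In short, robobrowser is a lightweight headless browsing and automated testing library. It is similar to selenium but runs less conspicuously, because it never opens a browser window. Its tagline is "RoboBrowser, your friendly neighborhood web scraper!", and it is written in pure Python.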
Project page: open
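Installation
Installing with pip is recommended, since it is the simplest approach.
robobrowser depends on two other libraries, bs4 and lxml, so both need to be installed as well.
Open a command-line window and enter: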
pip install lxml
pip install bs4
pip install robobrowser
Wait for the installation to finish.
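Before starting the interpreter by hand, a quick check can confirm that the three packages are visible to Python. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of robobrowser.

```python
import importlib.util

# Packages the installation steps above are expected to provide.
REQUIRED = ("lxml", "bs4", "robobrowser")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Note that find_spec only locates a package without importing it, so this check does not reveal the werkzeug import error that can occur when robobrowser is actually imported.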
Verify the installation:
Open a command-line window and type python to start the interpreter.
Then enter: import robobrowser
If the following error appears:
  File "D:\python3.8\lib\site-packages\robobrowser\browser.py", line 8, in <module>
    from werkzeug import cached_property
ImportError: cannot import name 'cached_property' from 'werkzeug' (D:\python3.8\lib\site-packages\werkzeug\__init__.py)
open this file:
D:\python3.8\lib\site-packages\werkzeug\__init__.py
and add the line from werkzeug.utils import cached_property to fix it:
from .serving import run_simple as run_simple
from .test import Client as Client
from .wrappers import Request as Request
from .wrappers import Response as Response

# Explicit import
from werkzeug.utils import cached_property  # this is the added line

__version__ = "2.1.2"
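Editing a file inside site-packages is undone whenever werkzeug is reinstalled or upgraded. As an alternative, the same name can be attached to the werkzeug module at runtime, before robobrowser is imported. This is a sketch of that workaround, not a fix endorsed by either project; the function name patch_werkzeug_cached_property is hypothetical.

```python
import importlib


def patch_werkzeug_cached_property():
    """Expose werkzeug.utils.cached_property as werkzeug.cached_property.

    Returns True if the attribute is available after the call,
    or False if werkzeug is not installed.
    """
    try:
        werkzeug = importlib.import_module("werkzeug")
        utils = importlib.import_module("werkzeug.utils")
    except ImportError:
        return False
    # Only add the attribute when this werkzeug version no longer exports it.
    if not hasattr(werkzeug, "cached_property"):
        werkzeug.cached_property = utils.cached_property
    return True


patched = patch_werkzeug_cached_property()
print("werkzeug patched:", patched)
```

The call has to run before the first import robobrowser in the program, because the failing from werkzeug import cached_property statement executes when robobrowser is loaded.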
Simple usage
Simulate a Baidu search by submitting the search form:
from time import sleep

from robobrowser import RoboBrowser

home_url = 'http://www.baidu.com'

# parser: the HTML parser used by BeautifulSoup
# Officially recommended: lxml
rb = RoboBrowser(history=True, parser='lxml')

# Open the target site
rb.open(home_url)
# print(rb.parsed)

# Get the form object
bd_form = rb.get_form()
print(bd_form)
bd_form['wd'].value = "robobrowser"

# Submit the form to simulate a search
rb.submit_form(bd_form)
# print(rb.parsed)
sleep(1)

# Inspect the results
result_elements = rb.select(".result")
print(result_elements)
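The elements returned above are already bs4 Tag objects, so the usual BeautifulSoup API
from bs4 import BeautifulSoup
can be used to parse them further,
for example to extract each title and link:
for index, element in enumerate(result_elements):
    title = element.find("a").text
    href = element.find("a")['href']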
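The loop above raises an error if a result element contains no link, since find("a") then returns None. The sketch below shows one way to guard against that. It runs offline against a small hypothetical HTML snippet (SAMPLE_HTML, standing in for a real results page) and uses the built-in html.parser so that it does not depend on lxml; extract_links is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a search results page.
SAMPLE_HTML = """
<div class="result"><h3><a href="https://example.com/a">First result</a></h3></div>
<div class="result"><h3><a href="https://example.com/b">Second result</a></h3></div>
<div class="result"><p>No link here</p></div>
"""


def extract_links(result_elements):
    """Return (title, href) pairs, skipping elements without a usable <a> tag."""
    pairs = []
    for element in result_elements:
        anchor = element.find("a")
        if anchor is None or not anchor.has_attr("href"):
            continue
        pairs.append((anchor.get_text(strip=True), anchor["href"]))
    return pairs


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
links = extract_links(soup.select(".result"))
for index, (title, href) in enumerate(links):
    print(index, title, href)
```

The same extract_links function should accept the result_elements list returned by rb.select(".result"), since both are lists of bs4 Tag objects.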
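Other operations
Follow a link (follow_link expects the link's <a> element, a bs4 Tag, rather than a bare URL string):
rb.follow_link(first_href)
Print the current URL after navigating:
print(rb.url)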
For more operations, see the official documentation.