A Simple Guide to Using robobrowser
2022-07-28 18:05:00 [Huawei Cloud]
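A brief introduction to robobrowser:
In short, robobrowser is a lightweight headless browsing library for automated testing and scraping. It is similar to selenium but less conspicuous, because it never opens a browser window: it is written in pure Python and works on top of requests and BeautifulSoup instead of driving a real browser, which also means it does not execute JavaScript. Its own tagline is "RoboBrowser, your friendly neighborhood web scraper!";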
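Installation
Installing with pip is recommended; it is simple!
robobrowser depends on other libraries, bs4 and lxml (the parser used below), so install both of them as well;
Open a command-line window and enter: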
pip install lxml
pip install bs4
pip install robobrowser
Wait for the installation to finish;
To verify it:
Open a command-line window and run python to start the interpreter;
Enter: import robobrowser
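The same check can also be scripted. The following is a minimal sketch, not part of robobrowser itself: the helper name check_imports is hypothetical, and it only uses the standard-library importlib module to try each import and report the error message instead of stopping at the first failure.

```python
import importlib


def check_imports(names):
    """Try importing each module; map name -> None on success, or the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            # Covers both "module not installed" and "cannot import name ..." errors
            results[name] = str(exc)
    return results


# The three packages installed in the previous step
for name, error in check_imports(["lxml", "bs4", "robobrowser"]).items():
    print(f"{name}: {'OK' if error is None else 'FAILED: ' + error}")
```

Any package reported as FAILED either needs to be installed again or, for robobrowser, may be hitting an import error inside one of its dependencies.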
If the following error appears
  File "D:\python3.8\lib\site-packages\robobrowser\browser.py", line 8, in <module>
    from werkzeug import cached_property
ImportError: cannot import name 'cached_property' from 'werkzeug' (D:\python3.8\lib\site-packages\werkzeug\__init__.py)

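the cause is that recent werkzeug releases no longer re-export cached_property from the top-level package, while robobrowser still imports it from there. To fix it, open the file:
D:\python3.8\lib\site-packages\werkzeug\__init__.py
and add the line from werkzeug.utils import cached_property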
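Editing a file inside site-packages is undone whenever werkzeug is reinstalled or upgraded. As an alternative, the missing name can be patched in at runtime before robobrowser is imported. This is only a sketch under that assumption: the helper names patch_missing_attr and patch_werkzeug are hypothetical, and the approach relies on werkzeug.utils still providing cached_property.

```python
import importlib
import types


def patch_missing_attr(package: types.ModuleType, name: str,
                       source: types.ModuleType) -> bool:
    """Copy `name` from `source` onto `package` if `package` lacks it.

    Returns True if a patch was applied, False if nothing needed doing.
    """
    if hasattr(package, name):
        return False
    setattr(package, name, getattr(source, name))
    return True


def patch_werkzeug() -> bool:
    """Expose cached_property at werkzeug's top level, if werkzeug is installed."""
    try:
        werkzeug = importlib.import_module("werkzeug")
        utils = importlib.import_module("werkzeug.utils")
    except ImportError:
        return False  # werkzeug is not installed; nothing to patch
    return patch_missing_attr(werkzeug, "cached_property", utils)


# Call this before the first `import robobrowser` in the program
patch_werkzeug()
```

Because `from werkzeug import cached_property` simply looks the name up on the already-loaded module, setting the attribute first is enough for the later import to succeed, and the installed package stays untouched.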
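With the manual edit applied, the import section of the file looks like this:
from .serving import run_simple as run_simple
from .test import Client as Client
from .wrappers import Request as Request
from .wrappers import Response as Response

# Explicit import
from werkzeug.utils import cached_property  # the import added as the fix

__version__ = "2.1.2"
Basic usage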
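Simulate a Baidu search by submitting the form:
from time import sleep

from robobrowser import RoboBrowser

home_url = 'http://www.baidu.com'

# parser: the HTML parser used by BeautifulSoup
# officially recommended: lxml
rb = RoboBrowser(history=True, parser='lxml')

# Open the target site
rb.open(home_url)
# print(rb.parsed)

# Get the form object
bd_form = rb.get_form()
print(bd_form)
bd_form['wd'].value = "robobrowser"

# Submit the form to simulate a search
rb.submit_form(bd_form)
# print(rb.parsed)
sleep(1)

# Inspect the results
result_elements = rb.select(".result")
print(result_elements)
The results obtained above can be handled with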
from bs4 import BeautifulSoup
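to parse them (each element returned by rb.select is already a bs4 Tag);
for example, to get each title and link:
for index, element in enumerate(result_elements):
    title = element.find("a").text
    href = element.find("a")["href"]
    print(index, title, href)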
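The loop above raises an error as soon as one result has no link inside it. A slightly more defensive version might look like the sketch below. The function name extract_links and the sample HTML are made up for illustration; the only calls used on each element are find, get, and get_text from the BeautifulSoup Tag API, and the demo at the end runs only if bs4 is installed.

```python
def extract_links(result_elements):
    """Return (title, href) pairs from elements that expose the bs4 Tag API."""
    links = []
    for element in result_elements:
        anchor = element.find("a")
        if anchor is None:
            continue  # this result block contains no <a> tag
        href = anchor.get("href")
        if href is None:
            continue  # an <a> tag without an href attribute
        links.append((anchor.get_text(strip=True), href))
    return links


try:
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None  # demo is skipped when bs4 is missing

if BeautifulSoup is not None:
    # Made-up markup standing in for a search results page
    sample_html = (
        '<div class="result"><a href="https://example.com"> Example </a></div>'
        '<div class="result">no link here</div>'
    )
    soup = BeautifulSoup(sample_html, "html.parser")
    print(extract_links(soup.select(".result")))
```

With the browser from the search example, the call would be extract_links(result_elements).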
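Other operations
Follow a link (follow_link expects the link element itself, for example an <a> tag taken from the results, not a bare href string):
rb.follow_link(first_link)
Get the URL of the page the browser is currently on:
print(rb.url)
For more operations, refer to the official documentation;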
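The link-following step can be wrapped in a small helper. This is a sketch only: follow_first_link is a hypothetical name, the selector passed to it is an assumption about the page markup, and the browser argument is expected to behave like a RoboBrowser instance (it uses select, follow_link, and url).

```python
def follow_first_link(browser, selector="a"):
    """Follow the first element matched by `selector` that has an href.

    Returns the URL the browser ends up on, or None if nothing matched.
    """
    for link in browser.select(selector):
        if link.get("href"):
            # follow_link takes the link element, not the href string
            browser.follow_link(link)
            return browser.url
    return None
```

With the browser from the search example, a call might look like follow_first_link(rb, ".result a"). Because the browser was created with history=True, rb.back() should then return to the results page.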