Selenium crawl notes
2022-06-24 20:36:00 【Yu Xu】
Import the third-party library selenium.
import selenium
from selenium import webdriver
Download the driver that matches your browser:
edge:https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/
chrome:https://code.google.com/p/chromedriver/downloads/list
firefox:https://github.com/mozilla/geckodriver/releases/
IE: Selenium.WebDriver.IEDriver 4.0.0 (NuGet Gallery)
The download is a compressed folder. Open it and you will find msedgedriver.exe. Copy this file to a drive other than C:, then add its location to the system environment variables.
To configure the environment variable, go to "This PC > right-click Properties > About > Advanced system settings > Advanced > Environment Variables > System variables > Path".
Add the folder that contains msedgedriver.exe there (Path entries are folders, not files), then click OK.
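Before launching the browser, it can help to confirm that the driver really is visible on PATH. The snippet below is a minimal sketch using only the standard library; the helper name find_driver is made up for illustration and is not part of Selenium.

```python
import shutil


def find_driver(name="msedgedriver"):
    """Return the full path of the driver executable if it is on PATH, else None."""
    # shutil.which searches the directories listed in the PATH environment
    # variable; on Windows it also tries the extensions in PATHEXT (e.g. .exe).
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("msedgedriver was not found on PATH")
    else:
        print("msedgedriver found at", path)
```

If this reports that the driver was not found right after you edited the environment variables, try restarting the terminal or IDE: a running process keeps the PATH it was started with.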
# Create a browser object. I use the Edge browser here; if you use Chrome, replace Edge with Chrome, and likewise for Firefox. The first letter must be capitalized!
driver = webdriver.Edge()
driver.get('https://www.taobao.com/?spm=a21bo.jianhua.201857.1.5af911d9NTiGPH')
# Maximize the browser window
driver.maximize_window()
Running this raises an error at driver = webdriver.Edge().
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the specified file .
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\learn\ test .py", line 4, in <module>
driver = webdriver.Edge()
File "D:\ Study \pycharm practice \learn\lib\site-packages\selenium\webdriver\edge\webdriver.py", line 62, in __init__
super(WebDriver, self).__init__(DesiredCapabilities.EDGE['browserName'], "ms",
File "D:\ Study \pycharm practice \learn\lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 90, in __init__
self.service.start()
File "D:\ Study \pycharm practice \learn\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'msedgedriver' executable needs to be in PATH. Please download from https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/
The message says the driver needs to be on PATH, yet I thought I had already configured the path. Later I found out that the driver path has to be passed to webdriver.Edge() as an object.
So the code has to be changed to this.
# Replace edge here with the name of your own browser; the module name is lowercase
from selenium.webdriver.edge.service import Service
# Use Service() to wrap the driver path and assign it to the variable s. The r prefix makes this a raw string, so the backslash is not treated as an escape character
s = Service(r'D:\msedgedriver.exe')
# service is a parameter of Edge(). To see how it is used, hold Ctrl and left-click Edge in the editor to open the corresponding source file
driver = webdriver.Edge(service=s)
driver.get('https://www.taobao.com/?spm=a21bo.jianhua.201857.1.5af911d9NTiGPH')
# Maximize the browser window
driver.maximize_window()
Then run the code and Taobao opens. One thing to note here: the site behaves differently for a person browsing and for code driving the browser:
1. When a person visits the site, they can search and select items, and the account login window only pops up when they go to complete the purchase;
2. When code drives the browser, the login window pops up as soon as the product is entered in the search bar and searched.
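One way a script could notice the second case is to inspect the address the browser ends up on after searching. The sketch below assumes the login prompt shows up as a redirect to a host that starts with "login." (for example login.taobao.com). This is my assumption: the notes above only describe a pop-up window. The helper name looks_like_login_page is hypothetical.

```python
from urllib.parse import urlparse


def looks_like_login_page(url, login_host_prefix="login."):
    """Return True if the host part of url starts with login_host_prefix.

    The prefix is an assumption for illustration, not something the site documents.
    """
    # hostname is None for strings without a network location, so fall back to "".
    host = urlparse(url).hostname or ""
    return host.startswith(login_host_prefix)
```

With a live session this would be called as looks_like_login_page(driver.current_url). If the login prompt turns out to be an overlay on the same page rather than a redirect, this check will not detect it.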
First, let's write the code that enters the content to search for.
One more thing to note here:
Normally find_element_by_xpath() is used to get page elements, but PyCharm showed the following warning at the bottom:
# A different method is needed here; add this import: from selenium.webdriver.common.by import By
# Using find_element_by_xpath() is not recommended; use find_element() instead
find_element_by_* commands are deprecated. Please use find_element() instead
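The rename the warning asks for is mechanical, and it can be written out as a lookup table. The sketch below is my own illustration, not part of Selenium: the table LEGACY_TO_STRATEGY and the helper to_locator are hypothetical names. The strategy strings are the values of the By constants in selenium.webdriver.common.by (for example, By.XPATH is the string "xpath"); they are spelled out so that the snippet runs without Selenium installed.

```python
# Deprecated helper name -> locator strategy string (the value of the matching By constant).
LEGACY_TO_STRATEGY = {
    "find_element_by_id": "id",                                  # By.ID
    "find_element_by_xpath": "xpath",                            # By.XPATH
    "find_element_by_name": "name",                              # By.NAME
    "find_element_by_tag_name": "tag name",                      # By.TAG_NAME
    "find_element_by_class_name": "class name",                  # By.CLASS_NAME
    "find_element_by_css_selector": "css selector",              # By.CSS_SELECTOR
    "find_element_by_link_text": "link text",                    # By.LINK_TEXT
    "find_element_by_partial_link_text": "partial link text",    # By.PARTIAL_LINK_TEXT
}


def to_locator(legacy_name, value):
    """Translate a deprecated find_element_by_* call into a (strategy, value) pair."""
    try:
        strategy = LEGACY_TO_STRATEGY[legacy_name]
    except KeyError:
        raise ValueError(f"unknown legacy method: {legacy_name}") from None
    return strategy, value
```

The resulting pair can be unpacked straight into the new call, e.g. driver.find_element(*to_locator("find_element_by_xpath", '//*[@id="J_TSearchForm"]/div[1]/button')). In real code, writing By.XPATH directly is clearer than using the table.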
# That is, find_element_by_xpath('the element you are looking for') == find_element(By.XPATH, 'the element you are looking for')
Here XPath is used to get the search box element, and then a random delay of 1 to 3 seconds is set.
import random
import time
from selenium.webdriver.common.by import By
driver.find_element(By.XPATH, '//*[@id="J_TSearchForm"]/div[1]/button').click()
time.sleep(random.randint(1, 3))
Then get the search button, and likewise set a random delay of 1 to 3 seconds.
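Since the same "act, then wait a random moment" step repeats, it can be pulled into a small helper. This is a minimal sketch with two hypothetical function names, random_pause and click_then_pause. It uses random.uniform rather than the random.randint above, so the pause is not limited to whole seconds. The locator strategy is passed as the string "xpath", which is the value of By.XPATH, only so that the snippet has no Selenium import of its own.

```python
import random
import time


def random_pause(low=1.0, high=3.0):
    """Sleep for a random duration between low and high seconds and return it."""
    delay = random.uniform(low, high)
    time.sleep(delay)
    return delay


def click_then_pause(driver, xpath, low=1.0, high=3.0):
    """Click the element matched by xpath, then wait a random interval."""
    # "xpath" is the string value of By.XPATH; with Selenium imported,
    # driver.find_element(By.XPATH, xpath) is the usual spelling.
    driver.find_element("xpath", xpath).click()
    return random_pause(low, high)
```

With the driver created earlier, the search button step would read click_then_pause(driver, '//*[@id="J_TSearchForm"]/div[1]/button').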