Selenium crawls Baidu pictures
2022-07-05 13:48:00 【Weichi Begonia】
# coding=utf-8
"""Download 20 images from Baidu image search."""
import time

import requests
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys


def download_img(kw):
    # Open the browser
    browser = webdriver.Chrome()
    time.sleep(3)
    # Open the Baidu image search page
    url = r'https://image.baidu.com/'
    browser.get(url)
    time.sleep(3)
    # Type the keyword into the search box
    keyword = browser.find_element(By.ID, 'kw')
    keyword.send_keys(kw)  # type the keyword
    keyword.send_keys(Keys.ENTER)  # press Enter to search
    time.sleep(3)
    # Set the image size filter
    size = browser.find_element(By.ID, 'sizeFilter')
    big_size = browser.find_element(By.XPATH, '/html/body/div[1]/div[4]/div[2]/div/div[2]/div/div[1]')
    ActionChains(browser).click(size).move_to_element(big_size).click().perform()
    # Click the first image
    first_pic = browser.find_element(By.XPATH, '//*[@id="imgid"]/div/ul/li[1]/div/a/img')
    ActionChains(browser).click(first_pic).perform()  # perform() runs the queued actions in order
    time.sleep(5)
    # Switch to the newly opened window
    browser.switch_to.window(browser.window_handles[1])
    for i in range(20):
        # Get the current image
        pic = browser.find_element(By.XPATH, '//*[@id="currentImg"]')
        src = pic.get_attribute('src')
        r = requests.get(src)
        title = browser.find_element(By.CLASS_NAME, 'pic-title')  # also print the image title
        print(i, ' ', title.text)
        if r.status_code == 200:
            # Save the image to a file
            file_name = r'D:\1. learning\python_web\pachong\img\{}.jpg'.format(i)
            with open(file_name, 'wb') as f:
                f.write(r.content)  # use r.text for text; use r.content for binary data such as images
        # Switch to the next image
        next_btn = browser.find_element(By.CLASS_NAME, 'img-next')
        next_btn.click()
        time.sleep(5)
    # Close the browser when finished
    browser.quit()


if __name__ == '__main__':
    download_img('Mickey Mouse')
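The save step in the script writes every file as `.jpg` into a folder that must already exist. As a minimal sketch of one way to make that step more forgiving, the helper below creates the folder when it is missing and picks the extension from the response's content type. The names `save_image` and `EXTENSIONS` are hypothetical and not part of the original script, and the mapping only covers a few common formats.

```python
from pathlib import Path

# Hypothetical mapping from content type to file extension; extend as needed.
EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/webp": ".webp",
}


def save_image(data, directory, index, content_type="image/jpeg"):
    """Write image bytes to <directory>/<index><ext> and return the path."""
    # Drop any parameters after ';' and normalise case before the lookup.
    mime = content_type.split(";")[0].strip().lower()
    # Fall back to .jpg for unknown types, as the original script assumes.
    ext = EXTENSIONS.get(mime, ".jpg")
    target_dir = Path(directory)
    # Create the folder (and any parents) if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / "{}{}".format(index, ext)
    # Binary write, equivalent to open(path, 'wb') followed by f.write(data).
    path.write_bytes(data)
    return path
```

Inside the loop, the call could look like `save_image(r.content, img_dir, i, r.headers.get('Content-Type', 'image/jpeg'))`, where `img_dir` stands for whichever folder you save to.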