Use Selenium to scrape the annual box office data from Yien
2022-07-03 06:15:00 【Black~boy】
1. Overview
1.1 Selenium
Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as a real user would operate it. Supported browsers include IE (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google Chrome, Opera, and Edge. Its main uses are browser compatibility testing, which checks that an application works well across different browsers and operating systems, and functional testing, which builds regression tests to verify software functionality and user requirements. It also supports automatically recording actions and generating test scripts in languages such as .NET, Java, and Perl. (Source: Baidu Baike)
2. How the crawl works
Selenium is used to scrape the data from the website, and the results are saved to a MySQL database.
3. Preparation
3.1 WebDriver: works like a driver (the principle is as follows)
WebDriver implementations are browser-specific: each browser has its own WebDriver. Chrome, for example, uses chromedriver.
Note: the WebDriver version must match the browser version!
3.2 selenium library
Install the selenium library, for example with pip: pip install selenium
3.3 MySQL database installation
For installation details, see a MySQL installation tutorial.
3.4 MySQL-to-Python connection library (plays a role similar to WebDriver)
Many connection libraries are available; see the link below for details.
Connection Library
This example uses pymysql, which can be installed with pip: pip install pymysql
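The script in section 4 inserts nine values per row into a table named data, but the post does not show how that table was created. The following is a minimal sketch of one way to set it up. The column names (year, field_1 to field_8) and the VARCHAR(255) type are placeholders assumed for illustration, not the author's schema, and build_create_table_sql and create_table are hypothetical helpers.

```python
# Sketch only: the column names below are hypothetical placeholders,
# not the schema used in the original post.
COLUMNS = ["year"] + [f"field_{n}" for n in range(1, 9)]  # 1 year + 8 scraped fields


def build_create_table_sql(table="data"):
    """Return a CREATE TABLE statement with one VARCHAR column per value."""
    cols = ", ".join(f"`{name}` VARCHAR(255)" for name in COLUMNS)
    return f"CREATE TABLE IF NOT EXISTS `{table}` ({cols}) DEFAULT CHARSET=utf8"


def create_table(db):
    """Run the statement on an open pymysql connection (e.g. from pymysql.connect)."""
    with db.cursor() as cursor:
        cursor.execute(build_create_table_sql())
    db.commit()


print(build_create_table_sql())
```

Calling create_table(db) once before the crawl, with db being the connection returned by pymysql.connect, should be enough; storing everything as text keeps the sketch simple, at the cost of converting numbers later.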
3.5 re (regular expression) library
A regular expression is a special sequence of characters that makes it easy to check whether a string matches a given pattern.
Python has included the re module since version 1.5; it provides Perl-style regular expression patterns.
The re module gives Python full regular expression support.
The compile function builds a regular expression object from a pattern string and optional flags. That object has a set of methods for regular expression matching and substitution.
The re module also provides module-level functions that do exactly what these methods do; they take the pattern string as their first argument.
4. Code example
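To make the difference between the compiled-object methods and the module-level functions concrete, here is a small illustrative sketch. The sample string is made up and stands in for the text of a table body; it uses the same split pattern as the script in section 4.

```python
import re

# Made-up sample text standing in for the text of an HTML table body
sample = "1 FilmA Drama 100\n2 FilmB Comedy 80"

# Module-level function: the pattern string is the first argument
tokens = re.split(r"[\n ]", sample)

# The same split through a compiled pattern object
splitter = re.compile(r"[\n ]")
tokens_compiled = splitter.split(sample)

# Checking whether a string matches a pattern: exactly four digits
is_year = re.fullmatch(r"\d{4}", "2021") is not None

print(tokens)
print(tokens == tokens_compiled, is_year)
```

Both calls return the same list of tokens; compiling is mainly worthwhile when the same pattern is reused many times.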
import re
import time

import pymysql
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.select import Select

# The database name and password are whatever you configured yourself
db = pymysql.connect(host='127.0.0.1', port=3306, user='root', password='123456',
                     database='dianying', charset='utf8')
driver = webdriver.Chrome()
driver.get('https://www.endata.com.cn/BoxOffice/BO/Year/index.html')

# Locate the year drop-down; find_element(By.XPATH, ...) replaces
# find_element_by_xpath, which newer Selenium releases no longer provide
sel_el = driver.find_element(By.XPATH, '//*[@id="OptionDate"]')
sel = Select(sel_el)
for i in range(len(sel.options)):
    sel.select_by_index(i)
    time.sleep(2)  # give the table time to reload after switching year
    table2 = driver.find_element(By.XPATH, '/html/body/section[1]/div/div[2]/div/div/div[2]/table/tbody')
    ss = table2.text
    # Split the table text on newlines and spaces into a flat list of fields
    ss1 = re.split(r'[\n ]', ss)
    cursor = db.cursor()
    # Each row has 8 fields; take at most 25 rows and never read past the end of the list
    for j in range(min(25, len(ss1) // 8)):
        # 2021 - i assumes the first option in the drop-down is 2021
        cursor.execute('INSERT INTO data VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)',
                       (str(2021 - i), *ss1[j * 8:j * 8 + 8]))
    db.commit()  # commit once per year
    print("==================================")
db.close()
driver.quit()  # quit() closes the browser and also stops the driver process
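The row-building step of the script can be tried without a browser or a database. The sketch below is one possible way to isolate it, assuming (as the script does) that every row has eight space-separated fields; rows_from_table_text and build_params are hypothetical helper names, and the sample text is made up.

```python
import re


def rows_from_table_text(text, width=8):
    """Split table text on newlines/spaces and group the tokens into rows of `width`."""
    tokens = [t for t in re.split(r"[\n ]", text) if t]  # drop empty tokens
    usable = len(tokens) - len(tokens) % width           # ignore a trailing partial row
    return [tuple(tokens[k:k + width]) for k in range(0, usable, width)]


def build_params(year, text, width=8):
    """Prefix every row with the year so it matches the 9-placeholder INSERT."""
    return [(str(year),) + row for row in rows_from_table_text(text, width)]


# Made-up sample: two rows of eight space-separated fields
sample = "1 a b c d e f g\n2 h i j k l m n"
params = build_params(2021, sample)
print(params)
```

The resulting list could then be passed to cursor.executemany with the same INSERT statement, which sends all rows for a year in one call. Note that this approach, like the original split, breaks if a field such as a film title itself contains a space.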
5. Result

6. Notes
If this post infringes any rights, please contact [email protected] to have it removed.