51job data analysis positions
2022-07-04 10:45:00 【She was your flaw】
import csv
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Last results page to crawl (1 = first page only)
last_page = 1
# Common CSS prefix shared by every field inside a job card
job_list = ('body > div:nth-child(4) > div.j_result > div > div.leftbox'
            ' > div:nth-child(4) > div.j_joblist > div')

# Launch the browser
browser = webdriver.Chrome()
# Open the website
browser.get('https://www.51job.com/')
# Set the window size
browser.set_window_size(1000, 800)
# Implicit wait: set once, it applies to every later element lookup
browser.implicitly_wait(10)
# Find the search box
search_input = browser.find_element(By.ID, 'kwdselectid')
# Enter the search keyword
search_input.send_keys('Data analysis')
# Press Enter to submit the search
search_input.send_keys(Keys.ENTER)

# Open the CSV file once; newline='' prevents blank rows on Windows
with open('51job data analysis positions.csv', 'a', encoding='utf-8', newline='') as file:
    writer = csv.writer(file)
    # Crawl the selected pages
    for page in range(1, last_page + 1):
        # Give the results a second to render
        time.sleep(1)
        # Position
        titles = browser.find_elements(By.CSS_SELECTOR, job_list + ' > a > p.t > span.jname.at')
        # Company name
        names = browser.find_elements(By.CSS_SELECTOR, job_list + ' > div.er > a')
        # Company type
        businesses = browser.find_elements(By.CSS_SELECTOR, job_list + ' > div.er > p.int.at')
        # Job requirements
        claims = browser.find_elements(By.CSS_SELECTOR, job_list + ' > a > p.info > span.d.at')
        # Release time
        release_times = browser.find_elements(By.CSS_SELECTOR, job_list + ' > a > p.t > span.time')
        # zip stops at the shortest list, so a missing field drops trailing rows
        for title, name, business, claim, release_time in zip(titles, names, businesses, claims, release_times):
            print('Position:', title.text)
            print('Company name:', name.text)
            print('Company type:', business.text)
            print('Requirements:', claim.text)
            print('Release time:', release_time.text)
            # Save the crawled row to the CSV file
            writer.writerow([title.text, name.text, business.text, claim.text, release_time.text])
        if page < last_page:
            # Click "next page"
            browser.find_element(By.XPATH, '/html/body/div[2]/div[3]/div/div[2]/div[4]/div[2]/div/div/div/ul/li[last()]/a').click()

# Close the browser
browser.quit()
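
The CSV step above can be pulled out into a small helper so that the file gets a header row and can be read back for analysis. This is only a sketch: the column names in HEADER and the function names append_rows and read_rows are assumptions made for illustration, since the original script writes no header and defines no helpers.

```python
import csv
import os

# Hypothetical column names; the original script writes no header row.
HEADER = ['title', 'company', 'company_type', 'requirements', 'release_time']


def append_rows(path, rows):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline='' lets the csv module control line endings (no blank rows on Windows).
    with open(path, 'a', encoding='utf-8', newline='') as file:
        writer = csv.writer(file)
        if write_header:
            writer.writerow(HEADER)
        writer.writerows(rows)


def read_rows(path):
    """Read the file back as a list of dicts keyed by the header row."""
    with open(path, encoding='utf-8', newline='') as file:
        return list(csv.DictReader(file))
```

In the crawl loop, the text of each element could be collected into a list of rows and passed to append_rows once per page, which keeps the file handling separate from the browser logic.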