51job data analysis job postings
2022-07-04 10:45:00 【She was your flaw】
import csv
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
# Launch the browser
browser = webdriver.Chrome()
# Open the 51job home page
browser.get('https://www.51job.com/')
# Set the window size
browser.set_window_size(1000, 800)
# Implicit wait: element lookups retry for up to 10 seconds
browser.implicitly_wait(10)
# Locate the search box
search_input = browser.find_element(By.ID, 'kwdselectid')
# Type the search keyword
search_input.send_keys('Data analysis')
# Press Enter to submit the search
search_input.send_keys(Keys.ENTER)
# Number of result pages to scrape
max_page = 1
# CSS selector prefix shared by every field of a job card in the result list
job_card = ('body > div:nth-child(4) > div.j_result > div > div.leftbox '
            '> div:nth-child(4) > div.j_joblist > div')
# Open the CSV file once and append one row per job posting
with open('51job_data_analysis_positions.csv', 'a', encoding='utf-8', newline='') as file:
    writer = csv.writer(file)
    for page in range(1, max_page + 1):
        # Pause for a second so the result list can finish rendering
        time.sleep(1)
        # Position
        titles = browser.find_elements(By.CSS_SELECTOR, job_card + ' > a > p.t > span.jname.at')
        # Company name
        names = browser.find_elements(By.CSS_SELECTOR, job_card + ' > div.er > a')
        # Company type
        company_types = browser.find_elements(By.CSS_SELECTOR, job_card + ' > div.er > p.int.at')
        # Job requirements
        requirements = browser.find_elements(By.CSS_SELECTOR, job_card + ' > a > p.info > span.d.at')
        # Release time
        release_times = browser.find_elements(By.CSS_SELECTOR, job_card + ' > a > p.t > span.time')
        for title, name, company_type, requirement, release_time in zip(
                titles, names, company_types, requirements, release_times):
            print('Position:', title.text)
            print('Company name:', name.text)
            print('Company type:', company_type.text)
            print('Requirements:', requirement.text)
            print('Release time:', release_time.text)
            # Save the scraped record as a CSV row
            writer.writerow([title.text, name.text, company_type.text,
                             requirement.text, release_time.text])
        # Go to the next page unless this was the last one requested
        if page < max_page:
            browser.find_element(By.XPATH, '/html/body/div[2]/div[3]/div/div[2]/div[4]/div[2]/div/div/div/ul/li[last()]/a').click()
# Close the browser when finished
browser.quit()
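
The script appends rows to the CSV file without a header row, so the columns are unlabeled when the file is opened later. A minimal sketch of one way to handle this is below: it writes a header only when the file is new or empty, then appends the rows. The function name `append_rows` and the column names in `FIELDNAMES` are hypothetical, chosen for illustration; they are not part of the original script.

```python
import csv
import os

# Hypothetical column names (an assumption; the original script writes no header)
FIELDNAMES = ['position', 'company', 'company_type', 'requirements', 'release_time']


def append_rows(path, rows, fieldnames=FIELDNAMES):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    rows = list(rows)
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline='' prevents the csv module from producing blank lines on Windows
    with open(path, 'a', encoding='utf-8', newline='') as file:
        writer = csv.writer(file)
        if is_new:
            writer.writerow(fieldnames)
        writer.writerows(rows)
    # Return the number of data rows written (header not counted)
    return len(rows)
```

In the scraping loop, the text of each element could be collected into a list of rows per page and passed to `append_rows` in a single call, which also keeps file handling out of the inner loop.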