51job data analysis positions
2022-07-04 10:45:00 【She was your flaw】
import csv
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
# Launch the browser
browser = webdriver.Chrome()
# Open the website
browser.get('https://www.51job.com/')
# Set the window size
browser.set_window_size(1000, 800)
# Implicit wait: allow up to 10 seconds for elements to appear
browser.implicitly_wait(10)
# Find the search box
search_input = browser.find_element(By.ID, 'kwdselectid')
# Enter the search keyword
search_input.send_keys('Data analysis')
# Press Enter to submit the search
search_input.send_keys(Keys.ENTER)
# Select crawl data page
# Number of result pages to crawl
last_page = 1
for page in range(1, last_page + 1):
    browser.implicitly_wait(10)
    # Sleep for a second so the result list can render
    time.sleep(1)
    # Position
    titles = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > a > p.t > span.jname.at')
    # Company name
    names = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > div.er > a')
    # Company type
    business = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > div.er > p.int.at')
    # Conditions
    claims = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > a > p.info > span.d.at')
    # Release time
    release_times = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > a > p.t > span.time')
    for title, name, busines, claim, release_time in zip(titles, names, business, claims, release_times):
        print('Position:', title.text)
        print('Company name:', name.text)
        print('Company type:', busines.text)
        print('Conditions:', claim.text)
        print('Release time:', release_time.text)
        # Append the record to the CSV file; newline='' avoids blank rows on Windows
        with open(r'51job Data analysis position.csv', 'a', encoding='utf-8', newline='') as file:
            writer = csv.writer(file)
            writer.writerow([title.text, name.text, busines.text, claim.text, release_time.text])
    if page < last_page:
        # Click the "next page" link
        browser.find_element(By.XPATH, '/html/body/div[2]/div[3]/div/div[2]/div[4]/div[2]/div/div/div/ul/li[last()]/a').click()
    else:
        break
# Close the browser when crawling is finished
browser.quit()
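The script reopens the CSV file for every record and never writes a header row. One possible refinement, shown here only as a sketch, is to collect the rows of a page first and append them in a single call. The helper name `append_rows` and the column names in `HEADER` are assumptions made for illustration; they do not come from the original script.

```python
import csv
import os

# Hypothetical column names, chosen to match the five scraped fields
HEADER = ['position', 'company', 'company_type', 'conditions', 'release_time']


def append_rows(path, rows, header):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline='' lets the csv module control line endings (no blank rows on Windows)
    with open(path, 'a', encoding='utf-8', newline='') as file:
        writer = csv.writer(file)
        if write_header:
            writer.writerow(header)
        writer.writerows(rows)
```

Inside the page loop, the rows could be built with a list comprehension over the same `zip(...)` (taking `.text` from each element) and passed to `append_rows` once per page, so the file is opened once per page instead of once per record.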