51job data analysis positions
2022-07-04 10:45:00 【She was your flaw】
import csv
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
# Launch the browser
browser = webdriver.Chrome()
# Open the website
browser.get('https://www.51job.com/')
# Set the window size
browser.set_window_size(1000, 800)
# Implicit wait: poll for up to 10 seconds when locating elements
browser.implicitly_wait(10)
# Find the search box
search_input = browser.find_element(By.ID, 'kwdselectid')
# Enter the search keyword
search_input.send_keys('Data analysis')
# Press Enter to submit the search
search_input.send_keys(Keys.ENTER)
# Number of result pages to crawl
last_page = 1
for page in range(1, last_page + 1):
    # Sleep for a second so the result list can finish loading
    time.sleep(1)
    # Position
    titles = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > a > p.t > span.jname.at')
    # Company name
    names = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > div.er > a')
    # Company type
    business = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > div.er > p.int.at')
    # Job conditions
    claims = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > a > p.info > span.d.at')
    # Release time
    release_times = browser.find_elements(By.CSS_SELECTOR,
        'body > div:nth-child(4) > div.j_result > div > div.leftbox > div:nth-child(4) > div.j_joblist > div > a > p.t > span.time')
    # Save the crawled data to a CSV file; the file is opened once per page,
    # and newline='' prevents blank lines between rows on Windows
    with open(r'51job Data analysis position.csv', 'a', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        for title, name, busines, claim, release_time in zip(titles, names, business, claims, release_times):
            print('Position:', title.text)
            print('Company name:', name.text)
            print('Company type:', busines.text)
            print('Conditions:', claim.text)
            print('Release time:', release_time.text)
            writer.writerow([title.text, name.text, busines.text, claim.text, release_time.text])
    # Go to the next page unless this was the last page to crawl
    if page < last_page:
        # Click next
        browser.find_element(By.XPATH, '/html/body/div[2]/div[3]/div/div[2]/div[4]/div[2]/div/div/div/ul/li[last()]/a').click()
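The script above appends rows to the CSV file without a header row, so repeated runs produce a file whose columns are unlabeled. A minimal sketch of one way to handle this, assuming a hypothetical helper named `append_rows` (not part of the original script) that writes the header only when the file is new or empty:

```python
import csv
import os


def append_rows(path, header, rows):
    """Append rows to a CSV file, writing the header only if the file is new or empty.

    `append_rows` is a hypothetical helper for illustration, not part of the original script.
    """
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline='' lets the csv module control line endings (avoids blank lines on Windows)
    with open(path, 'a', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        if needs_header:
            writer.writerow(header)
        writer.writerows(rows)
```

Inside the page loop, the rows could then be collected as lists of `.text` values and passed in a single call, for example `append_rows('jobs.csv', ['Position', 'Company name', 'Company type', 'Conditions', 'Release time'], rows)`, where `jobs.csv` and the column labels are placeholders.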