A basic web crawler code tutorial
2022-07-03 06:42:00 【pjiang000】
import os

import requests
import xlwt
from lxml import etree

# Site whose menu and roster tables the XPath expressions below target.
base_url = 'https://www.basketball-reference.com'
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36',
}

# Make sure the output directory for the per-team .xls files exists.
os.makedirs('files', exist_ok=True)

# Fetch the home page and read the team links and team names from the site menu.
response = requests.get(base_url, headers=headers)
html = etree.HTML(response.text)
url_list = html.xpath('//*[@id="site_menu"]/ul/li[2]/div/a/@href')
team_my_names = html.xpath('//*[@id="site_menu"]/ul/li[2]/div/a/text()')
teams_list = url_list
# print(teams_list)
for h in range(len(teams_list)):
    # One workbook per team, saved as files/<team name>.xls.
    base_url = 'https://www.basketball-reference.com' + teams_list[h]
    xls = xlwt.Workbook()
    sht1 = xls.add_sheet('Sheet1')

    # Fetch the team page and read the player links and names from the roster table.
    response = requests.get(base_url, headers=headers)
    html = etree.HTML(response.text)
    url_list = html.xpath('//*[@id="roster"]/tbody/tr/td[1]/a/@href')
    names = html.xpath('//*[@id="roster"]/tbody/tr/td[1]/a')
    # print(url_list)
    # print(names)
    names = [a.text for a in names]
    # Player links are site-relative, so prepend the host.
    url_list = ['https://www.basketball-reference.com' + u for u in url_list]

    print(team_my_names[h], end=": ")
    file_name = 'files/' + team_my_names[h] + ".xls"
    count = 0  # next free row in the sheet
    print()
    for i in range(len(url_list)):
        print(names[i], end=',')
        # Fetch the player page and read the seasons and teams from the per-game table.
        response = requests.get(url_list[i], headers=headers)
        html = etree.HTML(response.text)
        year_lst = html.xpath('//*[@id="per_game"]/tbody/tr/th/a/text()')
        team_name = html.xpath('//*[@id="per_game"]/tbody/tr/td[@data-stat="team_id"]//text()')
        for j in range(len(year_lst)):
            # One row per season: player name, season, team.
            sht1.write(count, 0, names[i])
            sht1.write(count, 1, year_lst[j])
            # Write 'NULL' when there is no team entry for this season.
            sht1.write(count, 2, team_name[j] if j < len(team_name) else 'NULL')
            count += 1
    xls.save(file_name)
    print()
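The innermost loop pairs each season with a team and falls back to 'NULL' when the team list is shorter than the season list. The sketch below isolates that step so it can be checked without hitting the site. The helper names `build_rows` and `rows_to_csv` are hypothetical, the sample values are made-up placeholders, and the standard-library `csv` module stands in for `xlwt` purely to keep the example dependency-free.

```python
import csv
import io


def build_rows(player, seasons, teams, missing="NULL"):
    """Pair each season with its team; use `missing` when `teams` runs out."""
    rows = []
    for j, season in enumerate(seasons):
        # j < len(teams) is the safe bound: index len(teams) itself does not exist.
        team = teams[j] if j < len(teams) else missing
        rows.append((player, season, team))
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["player", "season", "team"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder data: three seasons but only two team entries.
    sample = build_rows(
        "Example Player",
        ["2019-20", "2020-21", "2021-22"],
        ["AAA", "BBB"],
    )
    print(rows_to_csv(sample), end="")
```

Keeping the pairing logic in a small function like this makes the bounds check easy to test on its own; the scraped lists from the player page could then be passed in unchanged.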