Scraping all the advisor information from a web page
2022-06-27 07:44:00 【超自然祈祷】
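I've been looking for an advisor, and none of them want to take a part-time student. Sigh, do I really have to go through every advisor's research area all over again...
Fine: if you're heartless, I won't invest any feelings either. Let's fire up Python.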

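I added up the head counts on the right with a calculator: 309 in total. My priority is reaching the goal, not having code do everything, so first I scrape all the links that carry a person's name.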

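I can see there are tags, but using them would need XPath or something, so I just take the laziest route and pick the links out with a regex.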
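# -*- coding: utf-8 -*-
import csv
import re

import requests

URL = "https://aipt.ucas.ac.cn/index.php/zh-cn/jsdw/graduateteacher"
res = requests.get(URL)
print(res)  # shows the HTTP status of the response
# Capture each http:// link up to, but not including, the closing quote
REX = r'(http://.*?)"'
ls = re.findall(REX, res.text)
# In the page source each link is followed by: " target="_blank"
with open('res.csv', mode='a', newline='', encoding='utf-8-sig') as f:
    csv_writer = csv.writer(f, delimiter=',')
    # One link per row, so the file can be read back line by line later
    for link in ls:
        csv_writer.writerow([link])

Not bad: the run gave exactly 309 links. Everything after those looked like the same links repeated, so I deleted it.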
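The manual cleanup of repeated links could also be done in code before writing the file. The following is only a sketch: `dedupe_keep_order` is a hypothetical helper name and the sample URLs are made up for illustration. It relies on `dict` preserving insertion order (Python 3.7+).

```python
def dedupe_keep_order(links):
    """Return the links with repeats removed, keeping first-seen order."""
    # dict keys are unique and remember insertion order
    return list(dict.fromkeys(links))


# Made-up sample standing in for the list returned by re.findall
sample = [
    "http://example.com/teacher/a",
    "http://example.com/teacher/b",
    "http://example.com/teacher/a",
]
unique = dedupe_keep_order(sample)
print(len(sample), "->", len(unique))
```

Passing the deduplicated list to the CSV writer instead of the raw one would avoid editing res.csv by hand afterwards.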

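Now the same trick again over the freshly saved res.csv: loop through it and store each page's text in a second file, res1.csv. After that I can just search for the keywords I care about in WPS to find a teacher.
(Note: strip the \n from each line. This is a trap: otherwise text comes back empty, and it cost me two hours.)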
# -*- coding: utf-8 -*-
import csv

import requests

# res.csv was written as utf-8-sig (with a BOM), so read it back the same way
with open('E:\\1projects\\PY\\PY\\res.csv', 'r', encoding='utf-8-sig') as f:
    lines = f.readlines()

with open('res1.csv', mode='a', newline='', encoding='utf-8-sig') as f:
    csv_writer = csv.writer(f, delimiter=',')
    for line in lines:
        url = line.replace("\n", "")  # strip the trailing \n
        print(url)
        res = requests.get(url)
        print(res)
        # print(res.text)
        # Store the link next to the page text it returned
        csv_writer.writerow([url, res.text])
print("done")

Then I just filter in WPS and click through to each matching teacher's homepage individually.
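The keyword search done in WPS could also be scripted. This is a minimal sketch under assumptions: `load_rows` and `match_urls` are hypothetical helper names, the file is assumed to hold two columns (link, page text) as written above, and the sample rows and keyword are invented for illustration.

```python
import csv


def load_rows(path):
    """Read (url, page_text) rows from a CSV laid out like res1.csv."""
    # A whole HTML page can exceed the csv module's default field size limit
    csv.field_size_limit(10_000_000)
    with open(path, newline='', encoding='utf-8-sig') as f:
        return [row for row in csv.reader(f) if len(row) >= 2]


def match_urls(rows, keywords):
    """Return the URLs whose page text contains at least one keyword."""
    return [row[0] for row in rows if any(k in row[1] for k in keywords)]


# Invented in-memory rows standing in for load_rows('res1.csv')
rows = [
    ["http://example.com/t1", "<html>research: speech recognition</html>"],
    ["http://example.com/t2", "<html>research: computer vision</html>"],
]
print(match_urls(rows, ["speech"]))
```

With the real file, `match_urls(load_rows('res1.csv'), [...])` would give the list of homepages to open by hand.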
