How can you get a full year of weather data for a city for free? Rainfall, temperature, humidity, solar radiation, and more
2022-08-03 12:02:00 【地理遥感生态网】
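Meteorological data has long been a high-value resource and is widely used in research across many fields. It covers indicators such as air temperature, air pressure, relative humidity, precipitation, evaporation, wind direction and speed, and sunshine duration. A dataset that contains all of these indicators is hard to obtain, however, and even once obtained it cannot be shared freely.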
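To crawl data at scale you need to write your own crawler. I previously wrote one for Shenzhen, and it crawls Shenzhen weather data with essentially no problems.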
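import calendar
import csv
import re

import demjson  # third-party JSON decoder, used below via demjson.decode
import requests

# Browser-like User-Agent sent with every request.
headers = {
    'User-Agent': (
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 '
        '(KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
    ),
}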


def get_url(date):
    """Build the historic-weather URL for a date string such as '20170601'."""
    url = 'https://www.timeanddate.com/scripts/cityajax.php?n=china/shenzhen&mode=historic'
    url += '&hd=' + date
    url += '&month=' + str(int(date[4:6]))  # month without a leading zero
    url += '&year=' + date[:4] + '&json=1'
    return url


def crawl_single_day(date):
    """Yield one row per observation for a single day.

    date is a string in 'YYYYMMDD' form, e.g. '20170601'.
    """
    # The timeout keeps a stalled connection from hanging the whole crawl.
    response = requests.get(get_url(date), headers=headers, timeout=30)
    response_list = demjson.decode(response.text)
    for weather in response_list:
        cells = weather['c']
        w_time = re.search(r'^\d+:\d+', cells[0]['h']).group(0)
        w_temperature = re.search(r'^-?\d+', cells[2]['h']).group(0)
        # Keep the description up to the first full stop.
        w_weather = re.search(r'^(.*?)\.', cells[3]['h']).group(1)
        if cells[4]['h'] == 'No wind':
            w_wind_speed = '0'
        else:
            w_wind_speed = re.search(r'^\d+', cells[4]['h']).group(0)
        # The wind direction is carried in the title attribute of the cell's HTML.
        w_wind_direction = re.search(r'title="(.*?)"', cells[5]['h']).group(1)
        w_humidity = cells[6]['h']
        w_barometer = re.search(r'^\d+', cells[7]['h']).group(0)
        w_visibility = cells[8]['h']
        if w_visibility != 'N/A':
            w_visibility = re.search(r'^\d+', w_visibility).group(0)
        yield [date, w_time, w_temperature, w_weather, w_wind_speed,
               w_wind_direction, w_humidity, w_barometer, w_visibility]


def crawl_single_month(year, month):
    """Yield every observation row for one month.

    year and month are ints, e.g. year=2017, month=6.
    """
    _, num_days = calendar.monthrange(year, month)
    for day in range(1, num_days + 1):
        # Zero-pad month and day to build a 'YYYYMMDD' string.
        yield from crawl_single_day(f'{year}{month:02d}{day:02d}')


if __name__ == "__main__":
    header = ('date time temperature weather wind_speed wind_direction '
              'humidity barometer visibility').split()

    # July to December 2017, one request per day.
    with open('weather0.csv', 'w', encoding='utf-8', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(header)
        for month in range(7, 13):
            writer.writerows(crawl_single_month(2017, month))

    # A single day.
    with open('weather1.csv', 'w', encoding='utf-8', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(header)
        writer.writerows(crawl_single_day('20210401'))

The CSV file produced by crawling the Shenzhen weather data for 20210401 is shown in the figure below:
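
Because the crawler writes one row per observation, a common next step is to reduce those rows to per-day figures. The following is a minimal sketch of that step, not part of the original crawler: the function name `daily_temperature_summary` and the sample rows are hypothetical, and it only assumes the column names written by the header row above.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def daily_temperature_summary(rows):
    """Return {date: (min, mean, max)} of temperature from dict-like rows."""
    by_day = defaultdict(list)
    for row in rows:
        try:
            by_day[row['date']].append(float(row['temperature']))
        except (KeyError, ValueError):
            continue  # skip rows with a missing or non-numeric temperature
    return {
        day: (min(temps), round(mean(temps), 1), max(temps))
        for day, temps in sorted(by_day.items())
    }


# Hypothetical rows in the same column layout as the crawler's CSV output;
# the values are made up for illustration and are not real observations.
SAMPLE_CSV = (
    'date,time,temperature,weather,wind_speed,wind_direction,humidity,barometer,visibility\n'
    '20210401,00:00,22,Passing clouds,7,East,83%,1012,N/A\n'
    '20210401,00:30,21,Passing clouds,9,East,88%,1012,N/A\n'
    '20210401,01:00,23,Clear,6,East,83%,1011,16\n'
    '20210402,00:00,20,Fog,0,North,94%,1013,2\n'
)

summary = daily_temperature_summary(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for day, (t_min, t_mean, t_max) in summary.items():
    print(day, t_min, t_mean, t_max)
```

To run it on a crawled file instead of the sample, open the file with `open('weather1.csv', encoding='utf-8', newline='')` and pass `csv.DictReader(file)` to the function.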
Of course, if you need larger volumes of data, you can obtain meteorological data through the 地理遥感生态网 platform.
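The meteorological data published on the 地理遥感生态网 platform covers indicators including air temperature, air pressure, relative humidity, precipitation, evaporation, wind direction and speed, sunshine duration, and solar radiation.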
Level-1 directory   File name                                       Element
PRS                 SURF_CLI_CHN_MUL_DAY-PRS-10004-YYYYMM.TXT       station pressure
TEM                 SURF_CLI_CHN_MUL_DAY-TEM-12001-YYYYMM.TXT       air temperature
RHU                 SURF_CLI_CHN_MUL_DAY-RHU-13003-YYYYMM.TXT       relative humidity
PRE                 SURF_CLI_CHN_MUL_DAY-PRE-13011-YYYYMM.TXT       precipitation
EVP                 SURF_CLI_CHN_MUL_DAY-EVP-13240-YYYYMM.TXT       evaporation
WIN                 SURF_CLI_CHN_MUL_DAY-WIN-11002-YYYYMM.TXT       wind direction and speed
SSD                 SURF_CLI_CHN_MUL_DAY-SSD-14032-YYYYMM.TXT       sunshine duration
GST                 SURF_CLI_CHN_MUL_DAY-GST-12030-0cm-YYYYMM.TXT   0 cm ground temperature
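
The file names listed above follow a regular pattern (element code, numeric code, then `YYYYMM`), so the twelve monthly files for a year can be enumerated programmatically. The sketch below is only an illustration: the helper names `monthly_file_names` and `monthly_paths` are hypothetical, and it assumes each file sits directly inside its level-1 directory, which the listing suggests but does not state.

```python
from pathlib import PurePosixPath

# Level-1 directory -> code segment, copied from the file names listed above.
ELEMENT_CODES = {
    'PRS': '10004',      # station pressure
    'TEM': '12001',      # air temperature
    'RHU': '13003',      # relative humidity
    'PRE': '13011',      # precipitation
    'EVP': '13240',      # evaporation
    'WIN': '11002',      # wind direction and speed
    'SSD': '14032',      # sunshine duration
    'GST': '12030-0cm',  # 0 cm ground temperature
}


def monthly_file_names(element, year):
    """Return the 12 monthly file names for one element and one year."""
    code = ELEMENT_CODES[element]
    return [
        f'SURF_CLI_CHN_MUL_DAY-{element}-{code}-{year}{month:02d}.TXT'
        for month in range(1, 13)
    ]


def monthly_paths(element, year):
    """Join each file name to its level-1 directory (assumed layout)."""
    return [PurePosixPath(element) / name
            for name in monthly_file_names(element, year)]


names = monthly_file_names('TEM', 2017)
print(names[0])
print(monthly_paths('GST', 2017)[-1])
```

Looking up the code in a dictionary keeps the `GST` entry, whose code segment carries the extra `0cm` depth suffix, from needing a special case.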
Please like, bookmark, and follow. The data can be obtained as follows:
