PyTorch: changing the GPU index when loading a saved model, using map_location
2022-07-03 13:00:00 【子燕若水】
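When PyTorch saves a trained model, it also records the device each tensor lived on during training (for example the GPU cuda:0, or the CPU). When the model is loaded again, PyTorch by default puts the weights back on the device they were saved from.
However, the devices available at test time do not always match those used for training.
For example, host A has four GPUs, and we train and save the model on cuda:3.
At test time we need to run the model on a customer's host B, which has only one GPU: cuda:0.
Loading the model the default way then fails: PyTorch reports that it cannot find the GPU device, or raises some other error.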
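The mismatch between host A and host B can be sketched in a few lines. This is only an illustration: `device_available` is a hypothetical helper, not part of PyTorch, and the GPU count is passed in by hand (on a real machine it would come from `torch.cuda.device_count()`).

```python
def device_available(tag: str, gpu_count: int) -> bool:
    """Return True if a saved device tag exists on a host with `gpu_count` GPUs.

    Hypothetical helper for illustration; PyTorch's own validation differs.
    """
    if tag == "cpu":
        return True
    if tag.startswith("cuda"):
        # A bare "cuda" tag is treated as index 0 in this sketch.
        _, _, index = tag.partition(":")
        return int(index or 0) < gpu_count
    return False


print(device_available("cuda:3", gpu_count=4))  # host A → True
print(device_available("cuda:3", gpu_count=1))  # host B → False
```

A checkpoint tagged cuda:3 therefore has nowhere to go on host B unless the load remaps it.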
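In this case the load has to remap devices, by telling torch.load how to map GPU devices.
According to the PyTorch documentation, the map_location argument can direct the model's tensors to a specific target GPU at load time.
The loading methods are: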
# Default: restore each tensor to the device it was saved from
>>> torch.load('tensors.pt')
# 1. Load all tensors onto GPU 0
>>> torch.load('tensors.pt', map_location=torch.device('cuda:0'))
# 2. Load all tensors onto GPU 1
>>> torch.load('tensors.pt', map_location=lambda storage, loc: storage.cuda(1))
# 3. Map tensors from GPU 1 to GPU 0
>>> torch.load('tensors.pt', map_location={'cuda:1':'cuda:0'})
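To make the difference between these forms concrete, here is a simplified pure-Python model of how each kind of map_location picks a target device tag. It is a sketch, not PyTorch's code: `resolve_target` is a hypothetical name, and the callable here takes only the saved tag, whereas the real callable receives `(storage, location)` and returns a storage.

```python
from typing import Callable, Dict, Optional, Union

MapLocation = Union[None, str, Dict[str, str], Callable[[str], Optional[str]]]


def resolve_target(saved_tag: str, map_location: MapLocation) -> str:
    """Return the device tag a tensor saved under `saved_tag` would end up on."""
    if map_location is None:
        # Default: go back to the device recorded in the file.
        return saved_tag
    if isinstance(map_location, str):
        # A single device: everything goes there.
        return map_location
    if isinstance(map_location, dict):
        # Only tags that appear as keys are remapped.
        return map_location.get(saved_tag, saved_tag)
    # Callable: a None result means "fall back to the default".
    result = map_location(saved_tag)
    return saved_tag if result is None else result


print(resolve_target("cuda:3", "cuda:0"))              # → cuda:0
print(resolve_target("cuda:1", {"cuda:1": "cuda:0"}))  # → cuda:0
print(resolve_target("cuda:3", {"cuda:1": "cuda:0"}))  # → cuda:3
```

Under this simplified model, the dict form only remaps tags that appear as keys, so {'cuda:1': 'cuda:0'} leaves a checkpoint saved on cuda:3 untouched.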
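Testing showed:
Method 1 did not load onto the target GPU at all: the model was loaded back onto whichever GPU it had been trained on, so specifying the device had no effect.
Method 3 failed immediately with an error at location.startswith('cuda'): AttributeError: 'NoneType' object has no attribute 'startswith'. Reading the code suggests this is a bug in torch itself, which is frustrating.
Method 2 works: all tensors are loaded onto GPU 1 as expected.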
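The callable form that worked can be wrapped so the same script also runs on a machine with no GPU. This is a minimal sketch: `make_map_location` is a hypothetical helper, and the checkpoint is a small in-memory dictionary standing in for a real model file.

```python
import io

import torch


def make_map_location(gpu_index: int = 0):
    """Build a map_location callable: GPU `gpu_index` if CUDA is usable, else CPU."""
    if torch.cuda.is_available():
        # Same form as method 2: move every storage to the chosen GPU.
        return lambda storage, loc: storage.cuda(gpu_index)
    # Storages are deserialized on the CPU first, so returning them keeps them there.
    return lambda storage, loc: storage


# Round trip through an in-memory buffer so the sketch needs no file on disk.
buffer = io.BytesIO()
torch.save({"weight": torch.arange(4.0)}, buffer)
buffer.seek(0)

state = torch.load(buffer, map_location=make_map_location(0))
print(state["weight"].device)
```

With a real checkpoint, the buffer would be replaced by the path to the saved file, and `gpu_index` by whichever GPU exists on the target host.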
————————————————
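Copyright notice: this is an original article by CSDN blogger "Icoding_F2014", licensed under CC 4.0 BY-SA. When reposting, please include the original source link and this notice.
Original link: https://blog.csdn.net/jmh1996/article/details/111041108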