PyTorch: changing the GPU index when loading a saved model (setting map_location)
2022-07-03 13:00:00 【子燕若水】
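When PyTorch saves a trained model, it also records the device each tensor lived on during training (for example the GPU cuda:0, or cpu). When the saved model is loaded again, PyTorch by default puts the weights back onto that same device.
However, the devices available at test time do not always match those used for training.
For example, host A has four GPUs; we train the model on cuda:3 and save it.
At test time we need to run the model on a customer's host B, which has only one GPU: cuda:0.
If the model is loaded the default way, PyTorch fails, typically complaining that the saved GPU device cannot be found, or raising some other error.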
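Before loading on a new host, it can help to check whether the device used for training exists there at all. The following is a minimal sketch, assuming only `torch.device`, `torch.cuda.is_available()` and `torch.cuda.device_count()`; the helper name `device_is_available` is made up for this illustration and is not part of PyTorch.

```python
import torch


def device_is_available(name: str) -> bool:
    """Return True if the device named e.g. 'cpu' or 'cuda:3' exists on this host.

    Hypothetical helper for illustration; only CPU and CUDA are handled.
    """
    device = torch.device(name)
    if device.type == "cpu":
        return True
    if device.type == "cuda":
        if not torch.cuda.is_available():
            return False
        # A bare 'cuda' has no index; treat it as GPU 0 for this check.
        index = device.index if device.index is not None else 0
        return index < torch.cuda.device_count()
    return False  # other backends are out of scope for this sketch


# On host B from the example, 'cuda:3' would be reported as unavailable.
for name in ("cpu", "cuda:0", "cuda:3"):
    print(name, device_is_available(name))
```

If the training device is reported as unavailable, the checkpoint has to be remapped at load time.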
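In that case the load needs a remapping step: tell torch.load how to map the saved devices onto the ones that exist.
According to the PyTorch documentation, the tensors of a model can be directed to a specific target GPU at load time.
The ways to load are: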
# Default: load tensors onto the devices they were saved from
>>> torch.load('tensors.pt')
# 1. Load all tensors onto GPU 0
>>> torch.load('tensors.pt', map_location=torch.device('cuda:0'))
# 2. Load all tensors onto GPU 1
>>> torch.load('tensors.pt', map_location=lambda storage, loc: storage.cuda(1))
# 3. Map tensors from GPU 1 to GPU 0
>>> torch.load('tensors.pt', map_location={'cuda:1':'cuda:0'})
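Testing showed:
Method 1 did not load onto the target GPU at all: the model was still loaded onto whichever GPU it had been trained on, so the remapping failed.
Method 3 raised an error straight away: location.startswith('cuda'): AttributeError: 'NoneType' object has no attribute 'startswith'. Reading the code, this looks like a bug in torch itself. Very annoying.
Method 2 worked: all tensors were loaded onto cuda:1 as expected.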
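Since the callable form of map_location (method 2) was the one that worked in these tests, it can be wrapped in a small helper that also falls back to the CPU when no GPU is present. This is a sketch under stated assumptions, not a definitive implementation: the name `load_remapped`, the `gpu_index` parameter and the toy checkpoint are invented for illustration, and the CPU fallback `lambda storage, loc: storage` follows the pattern shown in the PyTorch documentation.

```python
import os
import tempfile

import torch


def load_remapped(path: str, gpu_index: int = 0):
    """Load a checkpoint, moving every tensor to one GPU, or to the CPU if none exists.

    Hypothetical helper for illustration; gpu_index is the target GPU on this host.
    """
    if torch.cuda.is_available():
        # Same callable form as method 2: move each storage to the chosen GPU.
        def remap(storage, location):
            return storage.cuda(gpu_index)
    else:
        # Storages are deserialized on the CPU first, so returning them unchanged keeps them there.
        def remap(storage, location):
            return storage
    return torch.load(path, map_location=remap)


# Round trip with a toy checkpoint standing in for a real model file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tensors.pt")
    torch.save({"weight": torch.arange(6.0).reshape(2, 3)}, path)
    state = load_remapped(path, gpu_index=0)

print(state["weight"].device)
```

For the scenario above, calling `load_remapped(path, gpu_index=0)` on host B would place a checkpoint trained on cuda:3 onto the only available GPU.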
————————————————
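Copyright notice: this is an original article by CSDN blogger 「Icoding_F2014」, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/jmh1996/article/details/111041108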