PyTorch: changing the GPU index when loading a saved model (the map_location setting)

2022-07-03 13:00:00 【子燕若水】

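When PyTorch saves a trained model, it also records the device each tensor lived on during training (for example the GPU cuda:0, or cpu). When the saved model is loaded again, PyTorch by default restores the weights onto the same device index that was used for training.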

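However, the devices available at test time do not always match those used for training.
For example, host A has four GPUs, and we train and save the model on cuda:3.
At test time the model has to run on a customer's host B, which has only one GPU: cuda:0.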

If the model is loaded the default way on host B, PyTorch reports that the GPU device cannot be found, or fails with some other error.

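In that case the load needs a remapping step: pass torch.load a device mapping through map_location.

According to the PyTorch documentation, the tensors of a model can be directed to a specific target GPU at load time.
The ways of loading are: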

# Default: load each tensor onto the device it was saved from
>>> torch.load('tensors.pt')
# 1. Load all tensors onto GPU 0
>>> torch.load('tensors.pt', map_location=torch.device('cuda:0'))
# 2. Load all tensors onto GPU 1
>>> torch.load('tensors.pt', map_location=lambda storage, loc: storage.cuda(1))
# 3. Map tensors from GPU 1 to GPU 0
>>> torch.load('tensors.pt', map_location={'cuda:1':'cuda:0'})
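As a minimal sketch of how the map_location argument can be used, the following round-trips a small state dict through an in-memory buffer and loads it onto a device chosen at run time. The helper name pick_device and the nn.Linear stand-in model are illustrative assumptions, not part of the original post. The example falls back to the CPU so that it also runs on a host without a GPU.

```python
import io

import torch
import torch.nn as nn


def pick_device() -> torch.device:
    """Hypothetical helper: use cuda:0 if this host has a GPU, else the CPU."""
    return torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


# Stand-in for a trained model; any nn.Module would do.
model = nn.Linear(4, 2)

# Save the weights to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)

# Device form: load every tensor onto a device that exists on this host,
# regardless of which device it was saved from.
target = pick_device()
buffer.seek(0)
state_dict = torch.load(buffer, map_location=target)

# Callable form: the function receives (storage, location_tag).
# Returning the storage unchanged keeps every tensor on the CPU.
buffer.seek(0)
cpu_state_dict = torch.load(buffer, map_location=lambda storage, loc: storage)

# Copy the loaded weights into a fresh model and move it to the target device.
restored = nn.Linear(4, 2)
restored.load_state_dict(state_dict)
restored.to(target)
```

Choosing the device at run time, rather than hard-coding an index such as cuda:3, is what lets the same loading code run on both host A and host B.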

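What the author found in testing:
Method 1 did not load onto the target GPU at all. The tensors were loaded onto whichever GPU the model had been trained on, so the setting had no effect.
Method 3 raised an error right away at location.startswith('cuda'): AttributeError: 'NoneType' object has no attribute 'startswith'. After reading the code, the author concluded that this is a bug in torch itself, which was frustrating.
Method 2 worked: all tensors were loaded onto cuda:1 as expected.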
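One option that sidesteps the saved device tag entirely is to load everything onto the CPU first and then move the model to whatever device the current host has. The sketch below is offered under that assumption and is not something the original post tested. The helper name load_weights_portably and the nn.Linear stand-in are hypothetical.

```python
import io

import torch
import torch.nn as nn


def load_weights_portably(source, model: nn.Module, device: torch.device) -> nn.Module:
    """Hypothetical helper: load weights onto the CPU, then move the model."""
    # map_location="cpu" places every tensor on the CPU,
    # whatever device tag was stored with it.
    state_dict = torch.load(source, map_location="cpu")
    model.load_state_dict(state_dict)
    return model.to(device)


# Stand-in for a checkpoint written on another machine.
trained = nn.Linear(3, 1)
checkpoint = io.BytesIO()
torch.save(trained.state_dict(), checkpoint)
checkpoint.seek(0)

# Use the only GPU on this host if there is one, otherwise stay on the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = load_weights_portably(checkpoint, nn.Linear(3, 1), device)
```

Because the load step never touches a GPU, it does not depend on the training-time index (cuda:3 in the example above) existing on the test host.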

————————————————
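Copyright notice: this is an original article by CSDN blogger 「Icoding_F2014」, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source together with this notice.
Original link: https://blog.csdn.net/jmh1996/article/details/111041108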

Copyright notice
This article was written by [子燕若水]. Please include the original link when reposting. Thank you.
https://xiaoiedu.blog.csdn.net/article/details/125577206
