Project scenario: [with ERRTYPE = cudaError] CUDA failure 999: unknown error
2022-08-02 02:20:00 【mtl1994】
Project scenario: [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 999: unknown error ; GPU=24
An older program needed to be upgraded. It previously ran on CUDA 10.2.
Problem description:
Environment
CUDA 11.2 (previously 10.2)
onnxruntime-gpu 1.10
python 3.9.7
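
When comparing an environment before and after an upgrade like this one, it can help to record the versions programmatically. The following is a minimal sketch, not part of the original setup: the helper names `installed_version` and `environment_report` are hypothetical, and the package list is only a guess based on the packages that appear in this post.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a pip package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("onnxruntime-gpu", "onnxruntime", "torch")):
    """Build one line per component: the Python version, then each package."""
    lines = ["python " + sys.version.split()[0]]
    for name in packages:
        lines.append(f"{name} {installed_version(name) or 'not installed'}")
    return lines


print("\n".join(environment_report()))
```

Note that this only reports Python-side packages; the CUDA toolkit and driver versions have to be checked separately.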

When starting the program:
Traceback (most recent call last):
File "/home/aiuser/cover/liheng-foggun/app.py", line 15, in <module>
model = DetectMultiBackend(weights=config.paddle.model_file)
File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/aiuser/cover/liheng-foggun/models/yolo.py", line 37, in __init__
self.session = onnxruntime.InferenceSession(weights, providers=['CUDAExecutionProvider'])
File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 335, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 379, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:122 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 999: unknown error ; GPU=24 ; hostname=aiserver-sl-01 ; expr=cudaSetDevice(info_.device_id);
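
The tail of the RuntimeError above carries the useful details: the CUDA error code, the GPU index, the hostname, and the expression that failed (here `cudaSetDevice`, which means the failure happened while the session was being set up). As a rough sketch for pulling those fields out of a log, the hypothetical helper `parse_cuda_failure` below assumes the message keeps the `key=value ;` layout seen in this traceback, which may differ in other onnxruntime versions.

```python
import re

# Tail of the error message, copied from the traceback in this post.
MESSAGE = (
    "CUDA failure 999: unknown error ; GPU=24 ; "
    "hostname=aiserver-sl-01 ; expr=cudaSetDevice(info_.device_id);"
)


def parse_cuda_failure(message):
    """Extract the error code, description and key=value fields, or None."""
    match = re.search(r"CUDA failure (\d+): ([^;]+?)\s*;", message)
    if match is None:
        return None
    info = {"code": int(match.group(1)), "description": match.group(2).strip()}
    # The remaining fields are separated by ';' and look like key=value.
    for part in message[match.end():].split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:
            info[key] = value
    return info


print(parse_cuda_failure(MESSAGE))
```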
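Cause analysis:
1. At first I assumed it was an onnxruntime-gpu version problem, but after upgrading to 1.12 the error was still there.
2. Posts online suggested it was an incompatibility problem.
3. I tried reinstalling the driver. After uninstalling 11.2, nvidia-smi showed that the old 10.2 driver was still present.
4. So the cause was that the previous driver had not been uninstalled cleanly.
Solution:
1. Uninstall 10.2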
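
A leftover installation of this kind can be spotted before uninstalling anything. The sketch below lists versioned toolkit directories and asks `nvidia-smi` for the driver version. It assumes toolkits live under `/usr/local/cuda-*` (the location used by the uninstaller path in this post), and the function names are hypothetical.

```python
import glob
import shutil
import subprocess


def list_cuda_toolkits(pattern="/usr/local/cuda-*"):
    """Return the versioned CUDA toolkit directories matching the pattern."""
    return sorted(glob.glob(pattern))


def driver_version():
    """Ask nvidia-smi for the driver version; None if it cannot be queried."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # One line is printed per GPU; the driver version is the same for all.
    return lines[0].strip() if lines else None


print("toolkits:", list_cuda_toolkits())
print("driver:", driver_version())
```

If more than one toolkit directory shows up after an upgrade, the older one is a candidate for the cleanup described in the solution.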
sudo /usr/local/cuda-10.2/bin/cuda-uninstaller
2. Install the new driver
# Offline install of 515.57
sudo ./NVIDIA-Linux-x86_64-515.57.run -no-x-check -no-nouveau-check
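
After reinstalling the driver, one way to confirm the fix is to create the session again and check which providers it actually ended up with. This is a minimal sketch under assumptions: `try_create_session`, `create_cuda_session` and `report` are hypothetical helpers, onnxruntime is assumed to be installed, and the session is built the same way as in the traceback.

```python
def try_create_session(model_path, create_session):
    """Call create_session(model_path); return (session, None) or (None, error)."""
    try:
        return create_session(model_path), None
    except Exception as exc:  # the CUDA failure in this post surfaced as RuntimeError
        return None, str(exc)


def create_cuda_session(model_path):
    """Build a session with the CUDA provider, as models/yolo.py does."""
    import onnxruntime  # imported lazily; assumed to be installed

    return onnxruntime.InferenceSession(
        model_path, providers=["CUDAExecutionProvider"]
    )


def report(model_path, create_session=create_cuda_session):
    """Return a one-line summary of whether the session could be created."""
    session, error = try_create_session(model_path, create_session)
    if error is not None:
        return "session creation failed: " + error
    return "session created, providers: " + ", ".join(session.get_providers())
```

Calling `print(report("model.onnx"))` with the path to a real model (the file name here is a placeholder) should list `CUDAExecutionProvider` once the driver problem is resolved.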