Project scenario: with ERRTYPE = cudaError, CUDA failure 999: unknown error
2022-08-02 02:20:00 【mtl1994】
Project scenario: [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 999: unknown error ; GPU=24
An older program needed to be upgraded. It previously ran on CUDA 10.2.
Problem description:
Environment
cuda 11.2 (previously 10.2)
onnxruntime-gpu 1.10
python 3.9.7
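
The versions listed above can be confirmed from inside the interpreter before digging further. The following is a minimal sketch, not part of the original program: `collect_versions` is a hypothetical helper name, and it only relies on `onnxruntime.__version__`, `onnxruntime.get_device()` and `onnxruntime.get_available_providers()`. Note that `CUDAExecutionProvider` showing up in the provider list only indicates that the installed package was built with CUDA support; it does not prove that the driver on the machine is usable.

```python
import importlib
import platform


def collect_versions():
    """Return the Python and onnxruntime details relevant to this error."""
    info = {"python": platform.python_version()}
    try:
        ort = importlib.import_module("onnxruntime")
    except ImportError:
        info["onnxruntime"] = None  # package not installed in this environment
        return info
    info["onnxruntime"] = ort.__version__
    info["device"] = ort.get_device()                  # "GPU" for a GPU build
    info["providers"] = ort.get_available_providers()  # providers compiled into the package
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

If `onnxruntime` reports `None` here, the wrong conda environment is probably active, which is worth ruling out before touching the driver.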

When starting the program:
Traceback (most recent call last):
  File "/home/aiuser/cover/liheng-foggun/app.py", line 15, in <module>
    model = DetectMultiBackend(weights=config.paddle.model_file)
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/aiuser/cover/liheng-foggun/models/yolo.py", line 37, in __init__
    self.session = onnxruntime.InferenceSession(weights, providers=['CUDAExecutionProvider'])
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 335, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 379, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:122 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 999: unknown error ; GPU=24 ; hostname=aiserver-sl-01 ; expr=cudaSetDevice(info_.device_id);
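Cause analysis:
1. At first I suspected the onnxruntime-gpu version, but upgrading to 1.12 produced the same error.
2. Posts online suggested it was a compatibility problem.
3. I tried reinstalling the driver. After uninstalling CUDA 11.2, nvidia-smi showed that the old 10.2 driver was still present.
4. The cause was that the old driver had never been uninstalled cleanly.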
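
A quick way to spot a leftover installation like this is to list the versioned toolkit directories on disk. The sketch below assumes the default `/usr/local` install prefix (the same prefix that appears in the uninstall path used in this post); `find_cuda_toolkits` is a hypothetical helper name, and finding two directories is only a hint, not proof, that an old install is interfering.

```python
from pathlib import Path


def find_cuda_toolkits(root="/usr/local"):
    """List versioned CUDA toolkit directories (cuda-10.2, cuda-11.2, ...)."""
    base = Path(root)
    if not base.is_dir():
        return []
    # Only directories count; the unversioned "cuda" symlink does not match "cuda-*".
    return sorted(p.name for p in base.glob("cuda-*") if p.is_dir())


if __name__ == "__main__":
    found = find_cuda_toolkits()
    print("CUDA toolkits found:", found or "none")
    if len(found) > 1:
        print("More than one toolkit is installed; an old one may be left over.")
```

Passing a different `root` makes the helper usable on machines where CUDA was installed to a custom prefix.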
Solution:
1. Uninstall 10.2
sudo /usr/local/cuda-10.2/bin/cuda-uninstaller
2. Install the new driver
# Offline install of 515.57
sudo ./NVIDIA-Linux-x86_64-515.57.run -no-x-check -no-nouveau-check
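
After the installer finishes, it is worth checking that the new driver is the one actually loaded before restarting the application. The following is a rough sketch rather than a definitive check: it calls `nvidia-smi --query-gpu=driver_version --format=csv,noheader`, and the helper names and the `EXPECTED_DRIVER` constant are assumptions introduced here for illustration.

```python
import shutil
import subprocess

EXPECTED_DRIVER = "515.57"  # the driver version installed in this post


def parse_driver_versions(text):
    """Return the distinct driver versions found in nvidia-smi's CSV output."""
    return sorted({line.strip() for line in text.splitlines() if line.strip()})


def installed_driver_versions():
    """Query nvidia-smi for each GPU's driver version; None if it cannot run."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return parse_driver_versions(result.stdout)


if __name__ == "__main__":
    versions = installed_driver_versions()
    if versions is None:
        print("nvidia-smi is missing or failed; the driver may not be loaded.")
    elif versions == [EXPECTED_DRIVER]:
        print("Driver", EXPECTED_DRIVER, "is active.")
    else:
        print("Unexpected driver version(s):", versions)
```

Once the expected version is reported, creating the `onnxruntime.InferenceSession` with `CUDAExecutionProvider` again is the real test of whether the error is gone.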