Project scenario: [with ERRTYPE = cudaError] CUDA failure 999: unknown error
2022-08-02 02:20:00 【mtl1994】
Project scenario: [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 999: unknown error ; GPU=24
An older program needed to be upgraded. It previously ran on CUDA 10.2.
Problem description:
Environment
cuda 11.2 (previously 10.2)
onnxruntime-gpu 1.10
python 3.9.7

When the program starts, it fails with:
Traceback (most recent call last):
  File "/home/aiuser/cover/liheng-foggun/app.py", line 15, in <module>
    model = DetectMultiBackend(weights=config.paddle.model_file)
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/aiuser/cover/liheng-foggun/models/yolo.py", line 37, in __init__
    self.session = onnxruntime.InferenceSession(weights, providers=['CUDAExecutionProvider'])
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 335, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 379, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:122 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 999: unknown error ; GPU=24 ; hostname=aiserver-sl-01 ; expr=cudaSetDevice(info_.device_id);
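Cause analysis:
1. At first I suspected the onnxruntime-gpu version, but upgrading to 1.12 produced the same error.
2. Posts online attributed the error to an incompatibility.
3. I tried reinstalling the driver. After uninstalling 11.2, nvidia-smi showed that the old driver from the 10.2 installation was still present.
4. The cause was that the previous driver had never been removed cleanly.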
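The check in step 3 can be scripted. The following is a minimal sketch, assuming the usual `nvidia-smi` banner layout with `Driver Version:` and `CUDA Version:` fields; the helper names `parse_nvidia_smi_header` and `read_nvidia_smi` are hypothetical, and the sample banner line is illustrative rather than captured from the machine in this post.

```python
import re
import subprocess

# Matches the banner line printed near the top of `nvidia-smi` output.
_HEADER = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_nvidia_smi_header(text):
    """Return (driver_version, cuda_version) as strings, or None if not found."""
    match = _HEADER.search(text)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


def read_nvidia_smi():
    """Run nvidia-smi and return its stdout (needs the NVIDIA driver tools)."""
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative banner line (an assumption, not output from the author's server).
sample = "| NVIDIA-SMI 515.57       Driver Version: 515.57       CUDA Version: 11.7     |"
print(parse_nvidia_smi_header(sample))
```

On a real machine, `parse_nvidia_smi_header(read_nvidia_smi())` would report which driver is actually loaded, which is how a leftover driver from an earlier installation shows up.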
Solution:
1. Uninstall 10.2
sudo /usr/local/cuda-10.2/bin/cuda-uninstaller
2. Install the new driver
# Offline install of 515.57
sudo ./NVIDIA-Linux-x86_64-515.57.run -no-x-check -no-nouveau-check
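After the new driver is installed, it is worth confirming that the CUDA version reported by `nvidia-smi` is at least the toolkit version the program uses (11.2 here). A rough sketch of that comparison follows; the function names are hypothetical, and the version strings passed in are example values, not measurements from this setup.

```python
def version_tuple(version):
    """Turn a dotted version string such as '11.2' into (11, 2)."""
    return tuple(int(part) for part in version.split("."))


def driver_covers_toolkit(driver_cuda, toolkit_cuda):
    """True when the driver's reported CUDA version is >= the toolkit's."""
    return version_tuple(driver_cuda) >= version_tuple(toolkit_cuda)


# Example values only: a driver reporting CUDA 11.7 against toolkit 11.2,
# then a hypothetical older driver reporting CUDA 10.2 against the same toolkit.
print(driver_covers_toolkit("11.7", "11.2"))
print(driver_covers_toolkit("10.2", "11.2"))
```

Comparing integer tuples instead of raw strings avoids ordering mistakes such as "9.2" sorting after "11.2".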