CUDA programming: atomicAdd() fails with error MSB3721, return code 1
2022-07-05 04:31:00 【Stanford rabbit】
Problem: the atomic operation atomicAdd() fails with error MSB3721, return code 1.
Problem description: Today I was writing CUDA code to accelerate the normal-direction computation that follows 3D point cloud reconstruction. One step computes the average point spacing once the point-to-neighbor distances have been collected. This requires multiple threads to add up the distances between each point and its neighbors, and then divide the sum by the number of points.
Without the atomic operation atomicAdd(), the sum will inevitably be wrong, because many threads update the same accumulator at once. The idea is similar to locking the accumulated variable in an OpenMP parallel for loop so that every addition is applied correctly.
Because the point distances are of type double, I naively called atomicAdd() on them directly, and the build failed with error MSB3721, return code 1. After searching the official documentation, I learned that atomicAdd() on double-precision values is only supported on devices of compute capability 6.0 or higher. My aging GT720 only has compute capability 3.5, so the code cannot compile. That is the root cause.
Direct cause: atomicAdd() on double is only supported on devices of compute capability 6.0 or higher, and my device is too low-end.
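To confirm whether a given GPU is below the 6.0 threshold, the compute capability can be queried at runtime. The following is a minimal sketch, assuming device 0 is the GPU of interest. The helper name `supportsNativeDoubleAtomicAdd` is made up for illustration and is not part of CUDA.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: native atomicAdd(double*, double) requires
// compute capability 6.0 or higher, so only the major version matters.
bool supportsNativeDoubleAtomicAdd(int major) {
    return major >= 6;
}

int main() {
    cudaDeviceProp prop;
    // Device index 0 is an assumption; multi-GPU machines may need another index.
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    std::printf("%s: compute capability %d.%d\n",
                prop.name, prop.major, prop.minor);
    std::printf("native double atomicAdd: %s\n",
                supportsNativeDoubleAtomicAdd(prop.major) ? "yes" : "no");
    return 0;
}
```

On a card like the GT720 described here, this should report compute capability 3.5, which means the software fallback is needed.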
Solution: Fortunately, the official documentation also gives older GPUs like this one a way out. With the code below, devices of compute capability below 6.0 can perform atomic addition on double.
[Note: copy the code below rather than the snippet from the official site; otherwise the compiler reports a redefinition error.]
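// Define the fallback only in device passes targeting compute capability < 6.0.
// __CUDA_ARCH__ is undefined in the host pass, so a bare "__CUDA_ARCH__ < 600"
// test is true there and the definition clashes with the toolkit's declaration.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 600
__device__ double atomicAdd(double* address, double val)
{
    // Reinterpret the double as a 64-bit integer so atomicCAS can operate on it.
    unsigned long long int* address_as_ull =
        (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        // Store the new sum only if the value is still the one we read.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val +
                            __longlong_as_double(assumed)));
        // Note: uses integer comparison to avoid hang in case of NaN (since NaN != NaN)
    } while (assumed != old);
    // Like the built-in atomicAdd, return the value held before the addition.
    return __longlong_as_double(old);
}
#endif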
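For context, the averaging step described above might use this fallback as follows. This is only a sketch under stated assumptions: the kernel `sumDistances`, the host wrapper `meanDistance`, and the sample data are hypothetical, the toolkit is assumed to be CUDA 8.0 or later (where the headers already declare the double overload), and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Fallback from the post, compiled only for device code below compute capability 6.0.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 600
__device__ double atomicAdd(double* address, double val)
{
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);  // integer compare, so NaN cannot cause a hang
    return __longlong_as_double(old);
}
#endif

// Hypothetical kernel: each thread adds one distance to a shared accumulator.
__global__ void sumDistances(const double* dist, int n, double* sum)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAdd(sum, dist[i]);
    }
}

// Hypothetical host wrapper: returns the mean of the given distances.
double meanDistance(const std::vector<double>& hostDist)
{
    int n = static_cast<int>(hostDist.size());
    if (n == 0) return 0.0;

    double* devDist = nullptr;
    double* devSum = nullptr;
    cudaMalloc(&devDist, n * sizeof(double));
    cudaMalloc(&devSum, sizeof(double));
    cudaMemcpy(devDist, hostDist.data(), n * sizeof(double),
               cudaMemcpyHostToDevice);
    cudaMemset(devSum, 0, sizeof(double));  // all-zero bytes represent 0.0

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    sumDistances<<<blocks, threads>>>(devDist, n, devSum);

    double sum = 0.0;
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&sum, devSum, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(devDist);
    cudaFree(devSum);
    return sum / n;
}

int main()
{
    std::vector<double> distances = {1.0, 2.0, 3.0, 4.0};  // made-up sample data
    std::printf("mean distance = %f\n", meanDistance(distances));
    return 0;
}
```

Funnelling every thread through one accumulator serializes the additions, so for large point clouds a per-block reduction followed by a single atomicAdd per block is usually the faster design.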
If this method works for you, please follow and bookmark this post!