CUDA programming: atomic operation atomicAdd() reports error MSB3721, return code 1
2022-07-05 04:31:00 【Stanford rabbit】
Problem: the atomic operation atomicAdd() fails to compile with error MSB3721, return code 1.
Problem description: Today I was writing CUDA code to accelerate normal-direction computation after 3D point cloud reconstruction. One step computes the average point spacing once the point-to-point distances have been collected. This requires many threads to add the distances between each point and its neighbors into one shared sum, which is then divided by the number of points.
Without the atomic operation atomicAdd(), the sum will inevitably be wrong, because several threads read and write the same variable at the same time. The idea is similar to protecting the accumulated variable in an OpenMP parallel for loop so that the addition stays correct.
Because the point distances are of type double, I naively called atomicAdd() on them directly, and the build failed with error MSB3721, return code 1 (this is the error Visual Studio shows when the nvcc command itself fails). After searching the official documentation I learned that atomicAdd() for double-precision values is only supported on devices with compute capability 6.0 or higher. My machine has an old GT720 with compute capability 3.5, so the code could not compile. That is the root cause.
Direct cause: atomicAdd() on double is only supported on devices with compute capability 6.0 or higher, and my device is too old.
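To find out which side of the 6.0 boundary a GPU falls on, the compute capability can be read at run time. The following is a minimal sketch using the CUDA runtime call cudaGetDeviceProperties; the function name hasNativeDoubleAtomicAdd is hypothetical and not part of CUDA. Note that the compile error itself depends on the architecture the project is built for (for example the code generation setting in Visual Studio), which is normally chosen to match the installed device.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: returns true if device `dev` reports compute
// capability 6.0 or higher, i.e. it supports atomicAdd() on double natively.
// Returns false if the device properties cannot be queried.
bool hasNativeDoubleAtomicAdd(int dev)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
        return false;
    }
    // prop.major is the major compute capability, e.g. 3 for a 3.5 device.
    return prop.major >= 6;
}
```

Calling hasNativeDoubleAtomicAdd(0) checks the first GPU in the system; for a compute capability 3.5 device such as the one described here it would return false.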
Solution: the official documentation also gives owners of old GPUs a way out. With the code below, devices with compute capability lower than 6.0 can perform atomic addition on double.
[Note: copy the code below. Do not use the code from the official site directly, otherwise the compiler reports a redefinition error.]
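// Define the fallback only in device compilation passes that target compute
// capability below 6.0. In the host pass __CUDA_ARCH__ is undefined, so a plain
// "#if __CUDA_ARCH__ < 600" would also define the function there and clash with
// the declaration in the CUDA headers.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 600
__device__ double atomicAdd(double* address, double val)
{
    // Reinterpret the double as a 64-bit integer so that atomicCAS can be used.
    unsigned long long int* address_as_ull =
        (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        // Write the new sum only if *address still holds the value we read.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val +
                            __longlong_as_double(assumed)));
        // Note: uses integer comparison to avoid hang in case of NaN (since NaN != NaN)
    } while (assumed != old);
    // Return the value stored before the addition, like the built-in atomicAdd.
    return __longlong_as_double(old);
}
#endif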
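The averaging step described at the start of the post could be sketched as follows. This is only an illustration under assumptions, not the original project code: the names atomicAddDouble, sumDistances and meanDistance are hypothetical, the distances are assumed to be already computed on the host, and error checking of the CUDA calls is omitted for brevity. The helper is given its own name so that it cannot collide with the atomicAdd overload above.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper: native atomicAdd on compute capability >= 6.0,
// atomicCAS loop (same technique as above) on older devices.
__device__ double atomicAddDouble(double* address, double val)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
    return atomicAdd(address, val);
#else
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);
    return __longlong_as_double(old);
#endif
}

// Each thread adds one distance to the shared sum.
__global__ void sumDistances(const double* dist, int n, double* sum)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAddDouble(sum, dist[i]);
    }
}

// Host wrapper: copies the distances to the GPU, sums them atomically,
// and divides by the number of values. Returns 0.0 for an empty input.
double meanDistance(const std::vector<double>& h_dist)
{
    int n = (int)h_dist.size();
    if (n == 0) return 0.0;

    double* d_dist = nullptr;
    double* d_sum = nullptr;
    cudaMalloc(&d_dist, n * sizeof(double));
    cudaMalloc(&d_sum, sizeof(double));
    cudaMemcpy(d_dist, h_dist.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemset(d_sum, 0, sizeof(double));  // all-zero bytes represent 0.0

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    sumDistances<<<blocks, threads>>>(d_dist, n, d_sum);

    double h_sum = 0.0;
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&h_sum, d_sum, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_dist);
    cudaFree(d_sum);
    return h_sum / n;
}
```

Having every thread update a single address serializes the additions, so for large point clouds a per-block reduction followed by one atomic add per block would likely be faster; the version here simply favors clarity.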
If this method works for you, please follow, like, and bookmark!