Tips from the Day 1 Practice of the 2022 CUDA Summer Training Camp
2022-07-29 09:46:00 【Huawei Cloud】

Previously:
2022 CUDA Summer Training Camp, Day 1 practice: https://bbs.huaweicloud.com/blogs/364478
TIPS1: Running sudo in Jupyter Lab when it asks for a password
Jupyter Lab is a good thing. It makes programming visual.
However, if you run a command like this:
!sudo /usr/local/cuda/bin/nvprof --print-api-trace ./hello_cuda
That goes badly:

Because sudo asks for the password, but the notebook gives you nowhere to type it.
So the asterisk to the left of the cell just stays lit, instead of turning into a number as it does for other cells once they finish.
However, there is a workaround. You can try this:
!echo nano|sudo -S /usr/local/cuda/bin/nvprof --print-api-trace ./hello_cuda
This pipes the password (here, nano) into sudo. The -S option tells sudo to read the password from standard input instead of the terminal.
You'll find that, surprisingly, it runs!


TIPS2: A mistake that deepened my understanding of CUDA asynchronous execution
While running the CUDA helloworld in Jupyter Lab (sorry, this is the third helloworld), Zhang Xiaobai noticed something strange: sometimes the program printed output after being compiled and run, and sometimes it printed nothing. Under nvprof, however, there was always output.

So Zhang Xiaobai consulted the teacher :

The teacher pointed out that the code was missing a synchronization step: the call to cudaDeviceSynchronize();
CUDA kernel launches are asynchronous. The launch returns to the host as soon as the instruction has been issued, without waiting for the kernel to finish. If the host does not then call cudaDeviceSynchronize() to wait for the device, it is a race: sometimes the kernel finishes and its output appears before the host program ends, and sometimes the host program has already exited by the time the kernel would have printed. This is what caused the strange behavior Zhang Xiaobai observed: sometimes there was a result, and sometimes there was none.
With synchronization added, the code is as follows:
#include <stdio.h>

__global__ void hello_from_gpu()
{
    printf("Hello World from the GPU!\n");
}

int main(void)
{
    hello_from_gpu<<<2,2>>>();
    cudaDeviceSynchronize();
    return 0;
}

The results are as follows:

This time, the output is printed in every case: when compiling with nvcc and the -run option, when running the binary separately after compilation, and when running it under nvprof.
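The synchronized version ignores the return values of the CUDA calls, so a failed launch would still produce no output and no explanation. The following is a minimal sketch of the same program with error checking added. The CHECK_CUDA macro is a hypothetical helper introduced here for illustration, not part of the original code or of the CUDA toolkit. cudaGetLastError(), cudaDeviceSynchronize() and cudaGetErrorString() are standard CUDA runtime calls.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (an assumption, not from the original post):
// print a message and exit if a CUDA runtime call returns an error.
#define CHECK_CUDA(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",           \
                    __FILE__, __LINE__, cudaGetErrorString(err_)); \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

__global__ void hello_from_gpu()
{
    // Each of the 2 x 2 = 4 threads prints one line.
    printf("Hello World from the GPU! (block %d, thread %d)\n",
           blockIdx.x, threadIdx.x);
}

int main(void)
{
    // The launch is asynchronous: control returns to the host immediately.
    hello_from_gpu<<<2, 2>>>();

    // Reports errors from the launch itself (for example, a bad configuration).
    CHECK_CUDA(cudaGetLastError());

    // Blocks until the kernel has finished, and reports errors raised while
    // it ran. Without this call the host may exit before anything is printed.
    CHECK_CUDA(cudaDeviceSynchronize());

    return 0;
}
```

Assuming the file is saved as hello_cuda.cu, it can be built and run the same way as before, for example with nvcc hello_cuda.cu -o hello_cuda -run. The order of the four printed lines is not guaranteed.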
So it is just as well: making the mistake once leaves a deep impression. The error of forgetting to add synchronization should not happen again... (or maybe it will!)
(End of article. Thank you for reading.)