
2022 CUDA Summer Training Camp: Day 1 Practice

2022-07-29 10:27:00 [Huawei Cloud]

Zhang Xiaobai had already managed to run a first CUDA program, but only knew that it worked, not why it worked. The CUDA training camp is meant to fill in the why.

The CPU and its memory are called the "host" (HOST); the GPU and its video memory are called the "device" (DEVICE).

Executing CUDA code involves the following steps:

In short: copy the input data from host to device (host_to_device), compute in parallel on the device, then copy the results back from device to host (device_to_host).
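The three steps can be sketched with a small vector addition. This is only an illustrative sketch, not the training camp's own code: the names add_vectors and run_vector_add and the size N are hypothetical, while cudaMalloc, cudaMemcpy and cudaFree are the standard CUDA runtime calls.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define N 8  // hypothetical problem size, chosen only for illustration

// Kernel: each thread adds one pair of elements.
__global__ void add_vectors(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Runs the three steps and returns 0 if every result is correct.
int run_vector_add(void)
{
    float h_a[N], h_b[N], h_c[N];
    for (int i = 0; i < N; ++i) {
        h_a[i] = (float)i;
        h_b[i] = 2.0f * (float)i;
    }

    float *d_a = NULL, *d_b = NULL, *d_c = NULL;
    size_t bytes = N * sizeof(float);
    if (cudaMalloc(&d_a, bytes) != cudaSuccess ||
        cudaMalloc(&d_b, bytes) != cudaSuccess ||
        cudaMalloc(&d_c, bytes) != cudaSuccess) {
        return 1;
    }

    // Step 1: host_to_device
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Step 2: parallel computation on the device (1 block of N threads)
    add_vectors<<<1, N>>>(d_a, d_b, d_c, N);

    // Step 3: device_to_host (this blocking copy waits for the kernel to finish)
    cudaError_t err = cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    if (err != cudaSuccess) return 1;

    // Verify on the host: i + 2i must equal 3i.
    for (int i = 0; i < N; ++i) {
        if (h_c[i] != 3.0f * (float)i) return 1;
    }
    return 0;
}

int main(void)
{
    int status = run_vector_add();
    printf("%s\n", status == 0 ? "vector add OK" : "vector add FAILED");
    return status;
}
```

The host arrays (h_ prefix) and device arrays (d_ prefix) are kept separate on purpose, because host code cannot dereference device pointers directly.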

A CUDA program is essentially a C program with extensions. Source files use the .cu suffix, and header files use .cuh.

Besides ordinary C syntax, a .cu program contains some CUDA-specific parts. For example, a function can be prefixed with one of three qualifiers: __global__, __host__, or __device__.

For __global__, the training camp says:

The "execution configuration", as we will see, is the part written between <<< and >>>.

This qualifier declares a C function as a kernel function. A kernel is normally launched from host code, but it executes only on the device (DEVICE).

For __host__, it says:

For __device__, it says:

My own understanding: these qualifiers specify where each piece of code runs, so the program knows whether a function executes on the host or on the device.
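To make the three qualifiers concrete, here is a minimal sketch. The function names square, square_plus_one, compute and run_qualifier_demo are hypothetical and chosen only for illustration; the qualifier semantics in the comments follow the standard CUDA rules.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// __host__ __device__: compiled for both sides, callable from CPU and GPU code.
__host__ __device__ int square(int x)
{
    return x * x;
}

// __device__: runs on the GPU and is callable only from GPU code.
__device__ int square_plus_one(int x)
{
    return square(x) + 1;
}

// __global__: a kernel, launched from the host and executed on the device.
__global__ void compute(int *out, int x)
{
    *out = square_plus_one(x);
}

// __host__ (the default when no qualifier is given): an ordinary CPU function.
// Returns the value computed on the GPU, or -1 on a CUDA error.
__host__ int run_qualifier_demo(void)
{
    int *d_out = NULL;
    int h_out = 0;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;

    compute<<<1, 1>>>(d_out, 3);

    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}

int main(void)
{
    printf("square(3) on the CPU: %d\n", square(3));
    printf("square_plus_one(3) on the GPU: %d\n", run_qualifier_demo());
    return 0;
}
```

Calling square_plus_one directly from main would be rejected by the compiler, since a __device__ function cannot be called from host code.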

Take a simple Hello World program as an example:

    #include <stdio.h>

    void hello_from_cpu()
    {
        printf("Hello World from the CPU!\n");
    }

    int main(void)
    {
        hello_from_cpu();
        return 0;
    }

To run it on the GPU, only two steps are needed:

(1) Rename the function hello_from_cpu to hello_from_gpu and prefix it with __global__ to define it as a kernel function.

(2) When calling it in main, add the execution configuration <<< >>>. With <<<1,1>>> the kernel body runs once (1 block of 1 thread); with <<<2,4>>> it runs 2 x 4 = 8 times (2 blocks of 4 threads each).

Here is the code after these changes:

    #include <stdio.h>

    __global__ void hello_from_gpu()
    {
        printf("Hello World from the GPU!\n");
    }

    int main(void)
    {
        hello_from_gpu<<<1,1>>>();
        // Wait for the kernel to finish so its printf output is flushed before exit.
        cudaDeviceSynchronize();
        return 0;
    }
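To see the effect of a larger execution configuration such as <<<2,4>>>, the following sketch has every thread print its own block and thread index and also counts how many times the kernel body ran. The names hello_with_ids and count_hello_threads are hypothetical; blockIdx, threadIdx and atomicAdd are standard CUDA built-ins.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Each thread prints its indices and increments a shared counter.
__global__ void hello_with_ids(int *counter)
{
    printf("Hello World from block %d, thread %d!\n", blockIdx.x, threadIdx.x);
    atomicAdd(counter, 1);
}

// Launches the kernel with the given configuration and returns how many
// threads executed it, or -1 on a CUDA error.
int count_hello_threads(int blocks, int threads_per_block)
{
    int *d_counter = NULL;
    int h_counter = 0;
    if (cudaMalloc(&d_counter, sizeof(int)) != cudaSuccess) return -1;
    cudaMemcpy(d_counter, &h_counter, sizeof(int), cudaMemcpyHostToDevice);

    hello_with_ids<<<blocks, threads_per_block>>>(d_counter);
    cudaDeviceSynchronize();  // wait for the kernel and flush its printf output

    cudaError_t err = cudaMemcpy(&h_counter, d_counter, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return err == cudaSuccess ? h_counter : -1;
}

int main(void)
{
    int n = count_hello_threads(2, 4);
    printf("kernel body ran %d times\n", n);
    return n == 8 ? 0 : 1;
}
```

The order in which the eight greeting lines appear is not guaranteed, because blocks are scheduled independently.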

.cu code must be compiled with nvcc, and the compile options depend on the architecture of the target GPU.
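One way to find out which architecture value applies to a given machine is to query the compute capability at run time. The sketch below is an assumption-laden illustration: the helper names format_arch_flag and print_arch_flags and the file name arch_query.cu are hypothetical, while cudaGetDeviceCount, cudaGetDeviceProperties and the nvcc -arch option are standard.

```cuda
// Build (file name is hypothetical): nvcc arch_query.cu -o arch_query
#include <stdio.h>
#include <cuda_runtime.h>

// Builds an nvcc architecture name such as "sm_75" from a compute capability.
void format_arch_flag(int major, int minor, char *buf, size_t len)
{
    snprintf(buf, len, "sm_%d%d", major, minor);
}

// Prints a suggested -arch value for every visible GPU.
// Returns the number of devices found, or -1 on a CUDA error.
int print_arch_flags(void)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) return -1;

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) return -1;

        char arch[16];
        format_arch_flag(prop.major, prop.minor, arch, sizeof(arch));
        printf("device %d: %s, compile with -arch=%s\n", dev, prop.name, arch);
    }
    return count;
}

int main(void)
{
    return print_arch_flags() >= 0 ? 0 : 1;
}
```

For a GPU with compute capability 7.5, for instance, the program suggests -arch=sm_75, which can then be passed to nvcc when compiling other .cu files.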