CUDA basic knowledge
2022-07-04 03:29:00 【ngsford】
GPU computing suits problems with simple logic whose calculations are independent and can run in parallel. The CPU suits problems with complex logic whose calculations depend on one another.
The GPU is designed for high concurrent throughput, while the CPU is designed for low latency. Accordingly, the CPU has multi-level caches and a powerful control unit, and the purpose of the multi-level cache is to reduce latency. The GPU instead has a large number of compute units, a relatively simple control unit, and comparatively little cache.
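To make "simple logic, independent calculations" concrete, here is a minimal sketch of element-wise vector addition. The names `vector_add` and `add_on_gpu` and the block size of 256 are assumptions chosen for illustration, and error checking is left out to keep it short.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread handles exactly one element, and no thread depends on another.
__global__ void vector_add(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // the last block may contain surplus threads
        out[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper (error checking omitted for brevity).
// Assumes a and b have the same length.
std::vector<float> add_on_gpu(const std::vector<float>& a,
                              const std::vector<float>& b) {
    const int n = static_cast<int>(a.size());
    std::vector<float> out(n);
    if (n == 0) return out;

    const size_t bytes = n * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;                         // threads per block (assumed)
    const int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return out;
}
```

Calling `add_on_gpu(a, b)` from a host program returns the element-wise sum. Because every element is computed independently, the work spreads across as many threads as the GPU can schedule.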
1. Memory model
Hardware side:
SP: thread processor. It has its own registers and local memory, which only that SP can access.
SM: multi-core processor, made up of multiple SPs and shared memory. Shared memory can be accessed by all SPs in the SM.
GPU: the graphics card, made up of multiple SMs and global memory. Global memory can be accessed by all SPs.
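The three memory levels above can be seen together in one kernel. The following is a sketch only: the names `block_sum` and `block_sums_on_gpu`, and the block size of 256, are assumptions for illustration. Each block sums its slice of a global array, using shared memory as scratch space.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Assumed block size; must be a power of two for the reduction below.
constexpr int kThreadsPerBlock = 256;

__global__ void block_sum(const int* in, int* block_totals, int n) {
    // Shared memory: one copy per block, visible to every thread in that block.
    __shared__ int partial[kThreadsPerBlock];

    // Per-thread variables: private to the thread, normally held in registers.
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // Global memory: `in` and `block_totals` are visible to every thread on the GPU.
    partial[tid] = (i < n) ? in[i] : 0;
    __syncthreads();

    // Tree reduction inside shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            partial[tid] += partial[tid + stride];
        }
        __syncthreads();
    }
    if (tid == 0) {
        block_totals[blockIdx.x] = partial[0];
    }
}

// Hypothetical host wrapper (error checking omitted): returns one total per block.
std::vector<int> block_sums_on_gpu(const std::vector<int>& in) {
    const int n = static_cast<int>(in.size());
    const int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;
    std::vector<int> totals(blocks);
    if (n == 0) return totals;

    int *d_in = nullptr, *d_totals = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_totals, blocks * sizeof(int));
    cudaMemcpy(d_in, in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    block_sum<<<blocks, kThreadsPerBlock>>>(d_in, d_totals, n);

    cudaMemcpy(totals.data(), d_totals, blocks * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_totals);
    return totals;
}
```

Threads in different blocks cannot see each other's `partial` array, which is why each block writes its own total back to global memory.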
Software side:
thread: a thread, corresponding to an SP.
block: a thread block, made up of multiple threads, corresponding to an SM.
grid: made up of multiple blocks. One GPU can have multiple grids.
warp: a thread bundle, made up of 32 threads. In general, the block size is a multiple of 32.
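As a rough illustration of how these units relate at launch time, the sketch below rounds a block size up to a whole number of warps, computes how many blocks cover a problem of size n, and records which warp each thread of a one-dimensional block belongs to. The helper names (`round_up_to_warp`, `blocks_for`, `warp_ids_for_block`) are hypothetical, and the default warp size of 32 is taken from the text above.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Round a block size up to a whole number of warps (32 threads, per the text above).
int round_up_to_warp(int threads, int warp_size = 32) {
    return ((threads + warp_size - 1) / warp_size) * warp_size;
}

// Number of blocks needed so that blocks * threads_per_block >= n.
int blocks_for(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}

// In a one-dimensional block, consecutive groups of warpSize threads form a warp.
__global__ void record_warp_ids(int* warp_ids) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    warp_ids[i] = threadIdx.x / warpSize;  // warpSize is a built-in device variable
}

// Hypothetical host wrapper (error checking omitted): launches a single block
// and returns the warp id of each of its threads.
std::vector<int> warp_ids_for_block(int threads_per_block) {
    std::vector<int> ids(threads_per_block);
    if (threads_per_block == 0) return ids;

    int* d_ids = nullptr;
    cudaMalloc(&d_ids, threads_per_block * sizeof(int));
    record_warp_ids<<<1, threads_per_block>>>(d_ids);
    cudaMemcpy(ids.data(), d_ids, threads_per_block * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_ids);
    return ids;
}
```

For example, a request for 100 threads per block is rounded up to 128, that is, four full warps, and 1000 elements then need 8 such blocks. A block size that is not a multiple of 32 still runs, but its last warp is only partly filled.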
Thread index:
This concept amounts to computing the index of the current thread within the defined computing unit. Indices start from zero. The key to understanding the calculation is spatial imagination: one dimension is a line, two dimensions are a plane, and three dimensions are a solid.
dim3 grid1(2, 1, 1); // x=2, y=1, z=1: one-dimensional, y and z are 1
dim3 grid2(4, 2, 1); // x=4, y=2, z=1: two-dimensional, z is 1
dim3 grid3(2, 3, 4); // x=2, y=3, z=4: three-dimensional, none of x, y, z is 1
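One common way to turn these coordinates into a single zero-based index is to flatten them with x varying fastest, then y, then z. The sketch below applies that to both the block within the grid and the thread within the block. The names `flatten`, `record_global_index`, and `global_indices` are hypothetical, and other orderings are equally valid as long as they are used consistently.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Flatten a 3D coordinate (x, y, z) inside a box of size (dim_x, dim_y, *):
// x runs along a line, y selects the line in a plane, z selects the plane.
__host__ __device__ inline unsigned flatten(unsigned x, unsigned y, unsigned z,
                                            unsigned dim_x, unsigned dim_y) {
    return x + y * dim_x + z * dim_x * dim_y;
}

__global__ void record_global_index(int* out) {
    // Index of this block within the grid.
    unsigned block_id = flatten(blockIdx.x, blockIdx.y, blockIdx.z,
                                gridDim.x, gridDim.y);
    // Index of this thread within its block.
    unsigned thread_in_block = flatten(threadIdx.x, threadIdx.y, threadIdx.z,
                                       blockDim.x, blockDim.y);
    unsigned threads_per_block = blockDim.x * blockDim.y * blockDim.z;

    // Global index: skip all earlier blocks, then add the offset inside this one.
    unsigned global_id = block_id * threads_per_block + thread_in_block;
    out[global_id] = static_cast<int>(global_id);
}

// Hypothetical host wrapper (error checking omitted): every slot starts at -1,
// so a slot that no thread maps to would stay -1.
std::vector<int> global_indices(dim3 grid, dim3 block) {
    const size_t total = static_cast<size_t>(grid.x) * grid.y * grid.z *
                         block.x * block.y * block.z;
    std::vector<int> out(total, -1);
    if (total == 0) return out;

    int* d_out = nullptr;
    cudaMalloc(&d_out, total * sizeof(int));
    cudaMemset(d_out, 0xFF, total * sizeof(int));  // all bytes 0xFF gives -1 per int
    record_global_index<<<grid, block>>>(d_out);
    cudaMemcpy(out.data(), d_out, total * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return out;
}
```

With the two-dimensional grid `dim3(4, 2, 1)` from above, the block at x=3, y=1 gets block index 3 + 1 * 4 = 7. Launching with that grid and a block of `dim3(8, 4, 1)` gives 256 threads, each writing to its own slot 0 through 255.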