CUDA basic knowledge
2022-07-04 03:29:00 【ngsford】
GPUs suit problems with simple logic whose computations are independent of one another and can run in parallel. CPUs suit problems with complex logic whose computations depend on one another.
GPUs are designed for high-throughput concurrency, whereas CPUs are designed for low latency. Accordingly, a CPU has multi-level caches and a powerful control unit; the multi-level cache exists to reduce latency. A GPU, by contrast, has a large number of compute units, a relatively simple control unit, and comparatively little cache.
1. Memory model
Hardware side:
SP (streaming processor): a thread processor with its own registers and local memory. Registers and local memory can be accessed only by the SP that owns them.
SM (streaming multiprocessor): a multi-core processor made up of multiple SPs plus shared memory. Shared memory can be accessed by all SPs in the same SM.
GPU: the graphics card, made up of multiple SMs plus global memory. Global memory can be accessed by all SPs.
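The three memory levels above can be seen together in one small kernel. The following is a minimal sketch, not a tuned implementation: the names `blockSum`, `blockSums`, and `kThreadsPerBlock` are made up for illustration, the block size of 64 is an assumption, the input is assumed to be non-empty, and error checking on the CUDA calls is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed block size; the reduction loop below needs a power of two.
constexpr int kThreadsPerBlock = 64;

// Each block sums kThreadsPerBlock consecutive inputs into one output value.
__global__ void blockSum(const float* in, float* out, int n) {
    // Shared memory: one copy per block, visible to every thread in that block.
    __shared__ float tile[kThreadsPerBlock];

    // Local variables are private to each thread and normally live in registers.
    int tid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + threadIdx.x;

    // `in` and `out` point to global memory, which every thread can access.
    tile[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();  // wait until the whole tile has been loaded

    // Tree reduction carried out entirely in shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) tile[tid] += tile[tid + stride];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = tile[0];
}

// Hypothetical host helper: copies data in, launches the kernel, copies sums out.
std::vector<float> blockSums(const std::vector<float>& in) {
    const int n = static_cast<int>(in.size());
    const int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;
    std::vector<float> out(blocks, 0.0f);

    float* devIn = nullptr;
    float* devOut = nullptr;
    cudaMalloc(&devIn, n * sizeof(float));
    cudaMalloc(&devOut, blocks * sizeof(float));
    cudaMemcpy(devIn, in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<blocks, kThreadsPerBlock>>>(devIn, devOut, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(out.data(), devOut, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(devIn);
    cudaFree(devOut);
    return out;
}

int main() {
    std::vector<float> ones(256, 1.0f);
    std::vector<float> sums = blockSums(ones);
    for (size_t b = 0; b < sums.size(); ++b)
        printf("block %zu sum = %.0f\n", b, sums[b]);
    return 0;
}
```

With an input of all ones, each block's sum should equal the block size, which makes it easy to check that the shared-memory reduction behaves as intended.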
Software side:
thread: a single thread; corresponds to an SP.
block: a thread block, made up of multiple threads; corresponds to an SM.
grid: made up of multiple blocks. One GPU can have multiple grids.
warp: a bundle of 32 threads. In general, the number of threads in a block is a multiple of 32.
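As a rough illustration of how these software concepts show up in a launch, the sketch below starts a one-dimensional grid of blocks and has every thread record which warp of its block it belongs to. The names `recordWarp` and `warpIds`, the element count of 200, and the block size of 64 are assumptions chosen for the example; error checking is omitted.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread stores the index of its warp within its own block.
__global__ void recordWarp(int* warpOfThread, int n) {
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    // The grid may contain more threads than elements, so guard the write.
    if (gid < n) warpOfThread[gid] = threadIdx.x / warpSize;
}

// Hypothetical host helper: launches enough blocks to cover n elements.
std::vector<int> warpIds(int n, int threadsPerBlock) {
    // Round up so that blocks * threadsPerBlock >= n.
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    std::vector<int> host(n, -1);

    int* dev = nullptr;
    cudaMalloc(&dev, n * sizeof(int));
    recordWarp<<<blocks, threadsPerBlock>>>(dev, n);
    cudaMemcpy(host.data(), dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}

int main() {
    // Assumed sizes: 200 elements, 64 threads per block (two warps of 32).
    std::vector<int> w = warpIds(200, 64);
    printf("thread 0 -> warp %d, thread 32 -> warp %d, thread 64 -> warp %d\n",
           w[0], w[32], w[64]);
    return 0;
}
```

Choosing a block size that is a multiple of 32 keeps every warp full. With a size such as 48, the second warp of each block would have only 16 active threads.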
Thread index:
This concept amounts to computing the index of the current thread within the defined launch configuration; indices start from zero. The key to understanding the calculation is spatial intuition: one dimension is a line, two dimensions are a plane, and three dimensions are a volume.
dim3 grid1(2, 1, 1); // x=2, y=1, z=1: one-dimensional, y and z are 1
dim3 grid2(4, 2, 1); // x=4, y=2, z=1: two-dimensional, z is 1
dim3 grid3(2, 3, 4); // x=2, y=3, z=4: three-dimensional, none of x, y, z is 1
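One common way to turn the line, plane, and volume picture into a single zero-based index is to flatten the block position inside the grid first, then the thread position inside the block. The sketch below does this for the `grid3(2, 3, 4)` shape above. The block shape `dim3 block(4, 2, 1)` and the names `flatIndex`, `writeIndex`, and `indicesCoverRange` are assumptions introduced for illustration, and error checking is omitted.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Flattens a 3D block position and a 3D thread position into one global index.
// x varies fastest, then y, then z.
__host__ __device__ unsigned flatIndex(dim3 grid, dim3 block, uint3 bIdx, uint3 tIdx) {
    unsigned blockId = bIdx.x + bIdx.y * grid.x + bIdx.z * grid.x * grid.y;
    unsigned threadInBlock = tIdx.x + tIdx.y * block.x + tIdx.z * block.x * block.y;
    unsigned threadsPerBlock = block.x * block.y * block.z;
    return blockId * threadsPerBlock + threadInBlock;
}

// Every thread writes its own flattened index into the slot it maps to.
__global__ void writeIndex(int* out) {
    unsigned i = flatIndex(gridDim, blockDim, blockIdx, threadIdx);
    out[i] = static_cast<int>(i);
}

// Hypothetical check: true if the indices are exactly 0 .. total-1, each hit once.
bool indicesCoverRange(dim3 grid, dim3 block) {
    const unsigned total = grid.x * grid.y * grid.z * block.x * block.y * block.z;
    std::vector<int> host(total, -1);

    int* dev = nullptr;
    cudaMalloc(&dev, total * sizeof(int));
    cudaMemset(dev, 0xFF, total * sizeof(int));  // every int starts as -1
    writeIndex<<<grid, block>>>(dev);
    cudaMemcpy(host.data(), dev, total * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    for (unsigned i = 0; i < total; ++i)
        if (host[i] != static_cast<int>(i)) return false;
    return true;
}

int main() {
    dim3 grid3(2, 3, 4);  // grid shape from the text: 24 blocks
    dim3 block(4, 2, 1);  // assumed block shape: 8 threads per block
    // Last block (1, 2, 3) and last thread (3, 1, 0) give the largest index.
    unsigned last = flatIndex(grid3, block, make_uint3(1, 2, 3), make_uint3(3, 1, 0));
    printf("largest index = %u\n", last);
    printf("indices cover range: %s\n", indicesCoverRange(grid3, block) ? "yes" : "no");
    return 0;
}
```

For these shapes, the last block has flattened id 1 + 2*2 + 3*2*3 = 23 and its last thread has local id 3 + 1*4 = 7, so the largest global index is 23*8 + 7 = 191, one less than the 192 threads launched.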