CUDA in Detail: GPU Architecture
2022-07-29 02:36:00 【Autumn ink】
Each thread has its own private local memory (Local Memory). Each thread block has shared memory (Shared Memory), which is visible to all threads in the block and has the same lifetime as the block. In addition, all threads can access global memory (Global Memory), as well as two read-only memory spaces: constant memory (Constant Memory) and texture memory (Texture Memory). The memory hierarchy matters for program optimization, but it is not discussed in depth here.
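As a minimal sketch of how these memory spaces appear in code, the kernel below touches per-thread variables, shared memory, global memory, and constant memory. The names `blockSum`, `scaledSumOfOnes`, `kScale`, and `kThreadsPerBlock` are made up for illustration, texture memory is left out, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kThreadsPerBlock = 256;  // hypothetical block size (power of two)

// Constant memory: read-only inside kernels, written by the host.
__constant__ float kScale;

// Each block sums its slice of `in`, scaled by kScale, into out[blockIdx.x].
__global__ void blockSum(const float* in, float* out, int n) {
    // Shared memory: one copy per thread block, lives as long as the block.
    __shared__ float tile[kThreadsPerBlock];

    // Ordinary local variables are private to each thread.
    int tid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + tid;

    // `in` and `out` point to global memory, reachable by every thread.
    tile[tid] = (gid < n) ? in[gid] * kScale : 0.0f;
    __syncthreads();

    // Tree reduction within the block, using shared memory only.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) tile[tid] += tile[tid + stride];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = tile[0];
}

// Host helper: sums n values of 1.0f, each scaled by `scale`, on the GPU.
float scaledSumOfOnes(int n, float scale) {
    int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;
    float* in = nullptr;
    float* out = nullptr;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, blocks * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    cudaMemcpyToSymbol(kScale, &scale, sizeof(float));
    blockSum<<<blocks, kThreadsPerBlock>>>(in, out, n);
    cudaDeviceSynchronize();

    // Combine the per-block partial sums on the host.
    float total = 0.0f;
    for (int b = 0; b < blocks; ++b) total += out[b];
    cudaFree(in);
    cudaFree(out);
    return total;
}

int main() {
    printf("sum = %f\n", scaledSumOfOnes(1000, 2.0f));
    return 0;
}
```

The shared `tile` array is what lets threads of one block cooperate. Threads in different blocks can only exchange data through global memory.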

One of the core hardware components of a GPU is the SM which, as mentioned earlier, stands for Streaming Multiprocessor. The main components of an SM include CUDA cores, shared memory, and registers. An SM can execute hundreds of threads concurrently, and the degree of concurrency depends on the resources the SM has. When a kernel is executed, the thread blocks in its grid are assigned to SMs. A given thread block is scheduled on exactly one SM, while an SM can usually host several thread blocks at once, depending on its capacity. The thread blocks of a single kernel may therefore be spread across multiple SMs: the grid is only a logical layer, whereas the SM is the physical layer of execution. The SM uses the SIMT (Single-Instruction, Multiple-Thread) architecture. The basic execution unit is
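The SM resources described above can be inspected at run time. The following is a rough sketch that uses the CUDA runtime calls `cudaGetDeviceProperties` and `cudaOccupancyMaxActiveBlocksPerMultiprocessor`. The helper `smCount`, the kernel `emptyKernel`, the device index 0, and the block size of 256 are assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical do-nothing kernel, used only as input to the occupancy query.
__global__ void emptyKernel() {}

// Returns the number of SMs on `device`, or -1 if the query fails.
int smCount(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;
    return prop.multiProcessorCount;
}

int main() {
    const int device = 0;       // assumption: the first GPU in the system
    const int blockSize = 256;  // assumption: threads per block

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        fprintf(stderr, "no usable CUDA device\n");
        return 1;
    }
    printf("device:               %s\n", prop.name);
    printf("SM count:             %d\n", smCount(device));
    printf("max threads per SM:   %d\n", prop.maxThreadsPerMultiProcessor);
    printf("shared memory per SM: %zu bytes\n", prop.sharedMemPerMultiprocessor);
    printf("registers per SM:     %d\n", prop.regsPerMultiprocessor);

    // How many blocks of this kernel can be resident on one SM at once.
    int blocksPerSM = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, emptyKernel,
                                                  blockSize, 0);
    printf("resident blocks per SM at %d threads/block: %d\n",
           blockSize, blocksPerSM);
    return 0;
}
```

The reported number of resident blocks per SM varies with the kernel's register and shared memory usage, which is one way to see that concurrency is bounded by SM resources.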