[GPU Parallel Computing] GPU parallel computing with OpenCL & OpenCLUtility
2022-07-30 04:38:00 【JinSu_】
Background

CPU: few compute cores, so its capacity for large-scale parallel computation is very limited; it excels at flow control and logic processing.
GPU: many compute cores, suited to compute-intensive, data-parallel tasks.
Heterogeneous computing: the CPU handles complex logic and flow control, and calls on the GPU for parallel computation when a large amount of uniformly typed data needs to be processed.
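Differences between OpenCL and CUDA & downloads
OpenCL (Open Computing Language) is the first cross-platform open standard for parallel programming of heterogeneous systems (systems that may be composed of CPUs, GPUs, or other types of processor architectures).
CUDA (Compute Unified Device Architecture) is a computing platform from the graphics card vendor NVIDIA. The architecture lets GPUs solve complex computational problems. It comprises the CUDA instruction set architecture (ISA) and the parallel compute engine inside the GPU. Developers can write programs for the CUDA architecture in C, C++, and FORTRAN (or OpenCL), and the resulting programs can run with very high performance on CUDA-capable processors.
CUDA vs. OpenCL: the former is a mature development platform with a complete toolkit, targeting a single vendor (NVIDIA); the latter is an open standard. OpenCL is an API and sits at the first level; the CUDA architecture sits one level higher, and on this architecture APIs such as OpenCL or DX11, as well as C, Fortran, and DX11 compute, can all be supported.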

Choose & Download Intel SDK for OpenCL Applications

CUDA Toolkit Archive | NVIDIA Developer
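Basic OpenCL concepts
Platform & device
A platform can be thought of as an implementation of the OpenCL API provided by a particular vendor. Once a platform is selected, generally only the devices that platform supports can be used. As things currently stand, choosing Intel's OpenCL SDK means computing only on Intel CPUs, while choosing AMD's APP SDK allows computing on AMD CPUs and AMD GPUs.
Generally, once company A's platform has been selected, company B's platform cannot be used to communicate with it.
A platform consists of two parts, the host and the device:
Host: usually the CPU, which plays the role of organizer.
Device: usually called the compute device. A device has one or more compute units, and each compute unit contains one or more processing elements. A program that runs on the device is called a kernel function. (When writing kernel functions, CUDA code is usually written directly inside the program, whereas OpenCL code is written in a separate file with the .cl suffix, which the host code reads in and then executes.)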

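Context
The context defines the entire OpenCL runtime environment, including the device, kernel, program object, memory object, and command queue:
Device: the compute device that the OpenCL program calls.
Kernel: the entry function of the program that performs operations on the device; it is invoked from the host.
Program object: the source code (.cl file) and executable of the kernel program.
Memory object: the variables the compute device needs to execute the OpenCL program.
Command queue: controls details such as how and when kernels are executed.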
Kernel & program object
Work-item: the same concept as a thread in CUDA.
Many work-items (threads) execute the same kernel function. Each work-item has a unique, fixed ID, which is generally used to select the data it should process. A work-item is an abstract unit of computation; it does not map one-to-one onto the compute units allocated at the physical level, and a single physical compute unit can also process several work-items in turn.
Work-group: the same concept as a thread block in CUDA.
Many work-items make up a work-group, and the work-items within a work-group can communicate and cooperate with one another.

Kernel launch interface: clEnqueueNDRangeKernel

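This is the same concept as using "<<<block, threads>>>" in CUDA to specify the number of threads a kernel schedules; it defines how work-groups are organized.
It has three important parameters:
global_work_offset: the offset applied to global_id
global_work_size: the total number of work-items
local_work_size: a virtual partitioning that defines the number of work-items in each work-group
An example: getting the current x coordinate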
get_global_id(0) = get_group_id(0) * get_local_size(0) + get_local_id(0) + get_global_offset(0) ;
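The identity above can be checked with a small plain-Python model of a 1D NDRange. This is only a sketch: `enumerate_work_items` is a hypothetical helper, not an OpenCL call, and it assumes uniform work-groups (global_work_size is a multiple of local_work_size).

```python
from itertools import product


def enumerate_work_items(global_work_size, local_work_size, global_work_offset=0):
    """Model a 1D NDRange: yield (group_id, local_id, global_id) for every work-item."""
    if global_work_size % local_work_size != 0:
        raise ValueError("this sketch assumes global_work_size is a multiple of local_work_size")
    num_groups = global_work_size // local_work_size
    for group_id, local_id in product(range(num_groups), range(local_work_size)):
        # Same arithmetic as the get_global_id(0) identity in the text.
        global_id = group_id * local_work_size + local_id + global_work_offset
        yield group_id, local_id, global_id


# 8 work-items split into 2 work-groups of 4, with global IDs shifted by 16.
items = list(enumerate_work_items(global_work_size=8, local_work_size=4, global_work_offset=16))
global_ids = [global_id for _, _, global_id in items]
print(global_ids)
```

With these values the number of work-groups is 8 / 4 = 2, and every work-item still receives a distinct global ID even though the local IDs repeat in each group.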
Kernel function & kernel & kernel launch
A simple kernel: multiply the data in the input 2D array by 2 and write the result into the output array.
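As a rough illustration of what such a kernel does per work-item, here is a plain-Python stand-in. It is not OpenCL code: `scale_by_two_kernel` and `launch_2d` are hypothetical names, the arguments `gid_x` and `gid_y` play the role of get_global_id(0) and get_global_id(1), and the 2D array is assumed to be stored row-major in a flat buffer.

```python
def scale_by_two_kernel(gid_x, gid_y, width, src, dst):
    """Work done by one work-item: double a single element of the 2D array."""
    idx = gid_y * width + gid_x  # row-major index into the flat buffer
    dst[idx] = src[idx] * 2


def launch_2d(kernel, global_size, *args):
    """Sequential stand-in for a 2D NDRange launch; a real device runs work-items in parallel."""
    size_x, size_y = global_size
    for gid_y in range(size_y):
        for gid_x in range(size_x):
            kernel(gid_x, gid_y, *args)


width, height = 4, 3
src = [float(i) for i in range(width * height)]
dst = [0.0] * (width * height)
launch_2d(scale_by_two_kernel, (width, height), width, src, dst)
print(dst)
```

Each work-item touches exactly one element, so no work-item depends on another; that independence is what makes the task data-parallel.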


An OpenCL wrapper library: OpenCLUtility


https://registry.khronos.org/OpenCL/specs/opencl-cplusplus-1.2.pdf
https://github.com/smistad/OpenCLUtilityLibrary
Example: A + B = C

Here an OpenCL program that computes A + B = C is developed on top of OpenCLUtility. The process is similar to the figure above.
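The computation itself can be sketched in plain Python. This does not use the OpenCLUtility API or any OpenCL call: the host-side steps (creating buffers, copying data, launching, reading back) are replaced by ordinary lists, and `round_up`, `vector_add_kernel`, and `run_vector_add` are hypothetical names. It also shows one common convention, assumed here, of padding the global size to a multiple of the work-group size and guarding the extra work-items.

```python
def round_up(value, multiple):
    """Smallest multiple of `multiple` that is >= value."""
    return ((value + multiple - 1) // multiple) * multiple


def vector_add_kernel(gid, n, a, b, c):
    """Work done by one work-item; gid plays the role of get_global_id(0)."""
    if gid < n:  # padded work-items beyond the data do nothing
        c[gid] = a[gid] + b[gid]


def run_vector_add(a, b, local_work_size=4):
    """Sequential stand-in for the host program: allocate C, launch, return the result."""
    n = len(a)
    if len(b) != n:
        raise ValueError("A and B must have the same length")
    c = [0.0] * n
    global_work_size = round_up(n, local_work_size)
    for gid in range(global_work_size):
        vector_add_kernel(gid, n, a, b, c)
    return c, global_work_size


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
c, global_work_size = run_vector_add(a, b)
print(c, global_work_size)
```

With 6 elements and a work-group size of 4, the global size is padded to 8; the two extra work-items fail the `gid < n` check and leave the buffers untouched.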





Example: first-order feature computation for 3D data

An open-source Python package for extracting radiomic features from medical imaging:
Radiomic Features — pyradiomics v3.0.1.post15+g2791e23 documentation
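A few first-order features (energy, mean, variance, range) reduce to sums, minima, and maxima over all voxels, which suits a two-stage reduction: each work-group produces partial results, and the host combines them. The sketch below models that in plain Python. It is not the pyradiomics implementation or the author's kernel; `partial_sums`, `first_order_features`, and `group_size` are hypothetical, and energy is taken as the plain sum of squares with no voxel shift applied.

```python
def partial_sums(chunk):
    """What one work-group would produce: count, sum, sum of squares, min, max."""
    return (len(chunk), sum(chunk), sum(v * v for v in chunk), min(chunk), max(chunk))


def first_order_features(volume, group_size=4):
    """volume is a nested list indexed [z][y][x]; group_size models the work-group size."""
    voxels = [v for plane in volume for row in plane for v in row]
    if not voxels:
        raise ValueError("volume must contain at least one voxel")

    # Stage 1: each chunk stands in for one work-group reducing its own voxels.
    chunks = [voxels[i:i + group_size] for i in range(0, len(voxels), group_size)]
    partials = [partial_sums(chunk) for chunk in chunks]

    # Stage 2: the host combines the per-group partial results.
    count = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    energy = sum(p[2] for p in partials)
    minimum = min(p[3] for p in partials)
    maximum = max(p[4] for p in partials)
    mean = total / count
    variance = energy / count - mean * mean  # E[x^2] - (E[x])^2
    return {
        "energy": energy,
        "mean": mean,
        "variance": variance,
        "minimum": minimum,
        "maximum": maximum,
        "range": maximum - minimum,
    }


volume = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]  # 2 x 2 x 2 voxels
features = first_order_features(volume)
print(features)
```

Computing the variance from the sum of squares keeps everything in a single pass, at the cost of some numerical stability when the mean is large relative to the spread; a second pass over the data using the already-known mean is the more robust alternative.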