[Using the GPU for parallel computing] GPU parallel computing with OpenCL & OpenCLUtility
2022-07-30 04:38:00 【JinSu_】
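Background

CPU: few compute cores, so its capacity for large-scale parallel computation is very limited; it is good at flow control and logic processing.
GPU: many compute cores; well suited to compute-intensive, data-parallel tasks.
Heterogeneous computing: the CPU handles complex logic and flow control, and calls on the GPU for parallel computation when a large amount of uniformly typed data needs to be processed.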
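Differences between OpenCL and CUDA & downloads
OpenCL (Open Computing Language) is the first open, cross-platform standard for parallel programming of heterogeneous systems (systems that may combine CPUs, GPUs, or other types of processor architectures).
CUDA (Compute Unified Device Architecture) is a computing platform released by the GPU vendor NVIDIA. The architecture enables GPUs to solve complex computational problems. It includes the CUDA instruction set architecture (ISA) and the parallel compute engine inside the GPU. Developers can write programs for the CUDA architecture in C, C++, and FORTRAN (or OpenCL), and the resulting programs run with very high performance on CUDA-capable processors.
Of CUDA and OpenCL, the former is a mature development platform with a complete toolkit that targets a single vendor (NVIDIA), while the latter is an open standard. OpenCL is an API and sits at the API level; the CUDA architecture sits one level above it. On top of that architecture, APIs such as OpenCL or DX11 compute, as well as languages such as C and Fortran, can all be supported.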

Choose & Download Intel SDK for OpenCL Applications

CUDA Toolkit Archive | NVIDIA Developer
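Basic OpenCL concepts
platform & device
A platform can be regarded as one vendor's implementation of the OpenCL API. Once a platform is selected, generally only the devices supported by that platform can be used. As things stand, choosing Intel's OpenCL SDK means computing only on Intel CPUs, while choosing AMD's APP SDK allows computing on AMD CPUs and AMD GPUs.
In general, once company A's platform has been selected, it cannot communicate with company B's platform.
A platform consists of two parts, the host and the device:
Host: the host is usually the CPU and plays the role of organizer.
Device: usually called the compute device. A device has one or more compute units, and each compute unit contains one or more processing elements. A program that runs on the device is called a kernel function. (When writing kernel functions, CUDA kernels are usually written directly in the program source, whereas OpenCL kernels are written in a separate file with the .cl suffix, which the host code reads in and then executes.)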

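Context
The context defines the entire OpenCL runtime environment, including devices, kernels, program objects, memory objects, and the command queue (CommandQueue):
Device: the compute device that the OpenCL program calls.
Kernel: the entry function of the program that runs on the device; it is invoked from the host.
Program object: the source code (.cl file) and the executable of the kernel program.
Memory object: the variables a compute device needs to execute an OpenCL program.
Command queue: the queue controls details such as how and when kernels are executed.
Kernel & program object
Work-item: the same concept as a thread in CUDA.
Many work-items (threads) execute the same kernel function. Each work-item has a unique, fixed ID, which is typically used to select the data it should process. A work-item is an abstract unit of computation; it does not map one-to-one onto the compute units allocated at the physical level, and a single physical compute unit may execute several work-items in turn.
Work-group: the same concept as a thread block in CUDA.
Many work-items form a work-group, and the work-items within a work-group can communicate and cooperate with each other.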
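The cooperation between work-items inside a work-group can be illustrated with a small CPU emulation. The sketch below is plain Python, not OpenCL: the function name `work_group_sums` and the tree-reduction scheme are assumptions chosen for illustration, and the list `local_mem` only stands in for the local memory that the work-items of one group would share on a real device.

```python
def work_group_sums(data, local_size):
    """Emulate one partial sum per work-group via a tree reduction.

    Hypothetical helper: `local_mem` plays the role of the memory shared
    by the work-items of a single work-group.
    """
    if local_size <= 0 or local_size & (local_size - 1):
        raise ValueError("local_size must be a power of two")
    if len(data) % local_size:
        raise ValueError("len(data) must be a multiple of local_size")

    partial_sums = []
    for group_id in range(len(data) // local_size):
        # Each work-item copies its own element into the group's local memory.
        base = group_id * local_size
        local_mem = [data[base + local_id] for local_id in range(local_size)]

        stride = local_size // 2
        while stride > 0:
            # On a device, a barrier would separate consecutive steps so that
            # every work-item sees the results of the previous step.
            for local_id in range(stride):
                local_mem[local_id] += local_mem[local_id + stride]
            stride //= 2

        # Work-item 0 of the group holds the group's result.
        partial_sums.append(local_mem[0])
    return partial_sums


if __name__ == "__main__":
    print(work_group_sums(list(range(8)), 4))  # prints [6, 22]
```

Each group produces its result independently of the others, which is why the host (or a second kernel launch) usually has to combine the per-group partial sums afterwards.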

Kernel launch interface: clEnqueueNDRangeKernel

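This is the same concept as specifying the number of threads a kernel schedules with "<<<block,threads >>>" in CUDA; it defines how the work-groups are organized.
Three parameters are important here:
global_work_offset: the offset applied to global_id
global_work_size: the total number of work-items
local_work_size: a virtual partition that defines the number of work-items in each work-group
For example, to get the current x coordinate: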
get_global_id(0) = get_group_id(0) * get_local_size(0) + get_local_id(0) + get_global_offset(0) ;
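The relation above can be checked on the CPU by enumerating every work-item of a one-dimensional range. This is a minimal sketch in plain Python, not OpenCL host code; `WorkItem`, `enumerate_work_items`, and `satisfies_identity` are hypothetical names, and the sketch assumes that global_work_size is a multiple of local_work_size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkItem:
    global_id: int  # what get_global_id(0) would return
    group_id: int   # what get_group_id(0) would return
    local_id: int   # what get_local_id(0) would return


def enumerate_work_items(global_work_offset, global_work_size, local_work_size):
    """List the IDs of every work-item in a 1D range (illustrative only)."""
    if local_work_size <= 0 or global_work_size % local_work_size != 0:
        raise ValueError("global_work_size must be a multiple of local_work_size")
    items = []
    for index in range(global_work_size):
        items.append(
            WorkItem(
                global_id=global_work_offset + index,
                group_id=index // local_work_size,
                local_id=index % local_work_size,
            )
        )
    return items


def satisfies_identity(item, global_work_offset, local_work_size):
    """Check global_id == group_id * local_size + local_id + global_offset."""
    return item.global_id == (
        item.group_id * local_work_size + item.local_id + global_work_offset
    )


if __name__ == "__main__":
    offset, global_size, local_size = 10, 8, 4
    for item in enumerate_work_items(offset, global_size, local_size):
        assert satisfies_identity(item, offset, local_size)
        print(item)
```

With an offset of 10, a global size of 8, and a local size of 4, there are two work-groups, and the global IDs run from 10 to 17.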
Kernel functions & kernels & kernel launch
A simple kernel: multiply the data in the input 2D array by 2 and write the result into the output array.
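A minimal sketch of what such a kernel could look like, together with a CPU emulation of its launch. The OpenCL C source is held in a Python string purely for reference and is not compiled here; the kernel name `multiply_by_two`, its `width` argument, and the helper `run_multiply_by_two` are assumptions for illustration. The nested loops stand in for a two-dimensional NDRange in which each (x, y) pair is one work-item.

```python
# Illustrative OpenCL C source (hypothetical names; not compiled by this script).
KERNEL_SOURCE = """
__kernel void multiply_by_two(__global const float* input,
                              __global float* output,
                              const int width)
{
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    output[y * width + x] = input[y * width + x] * 2.0f;
}
"""


def multiply_by_two_kernel(gid_x, gid_y, input_buf, output_buf, width):
    """Python mirror of the kernel body for one work-item."""
    index = gid_y * width + gid_x
    output_buf[index] = input_buf[index] * 2.0


def run_multiply_by_two(rows):
    """Emulate a 2D launch with global size (width, height) on the CPU."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    # Flatten to a 1D buffer, as a device buffer would be laid out.
    input_buf = [float(value) for row in rows for value in row]
    output_buf = [0.0] * len(input_buf)
    for gid_y in range(height):
        for gid_x in range(width):
            multiply_by_two_kernel(gid_x, gid_y, input_buf, output_buf, width)
    # Reshape the flat output buffer back into rows.
    return [output_buf[y * width:(y + 1) * width] for y in range(height)]


if __name__ == "__main__":
    print(run_multiply_by_two([[1, 2], [3, 4]]))  # prints [[2.0, 4.0], [6.0, 8.0]]
```

On a real device the loop iterations run in no guaranteed order, which is safe here because every work-item writes to its own output element.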


An OpenCL wrapper library: OpenCLUtility


https://registry.khronos.org/OpenCL/specs/opencl-cplusplus-1.2.pdf
https://github.com/smistad/OpenCLUtilityLibrary
Example: A + B = C

Here an OpenCL program that computes A + B = C is developed on top of OpenCLUtility. The process is similar to the figure above.
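The data flow of such a program can be sketched without any OpenCL dependency. The code below is a plain Python emulation and deliberately does not call the OpenCLUtility API; `round_up`, `vector_add_kernel`, and `run_vector_add` are hypothetical names. It also shows one common convention, which is an assumption here and not something the example above prescribes: the global size is rounded up to a multiple of the work-group size, and the kernel guards against the padding work-items.

```python
def round_up(value, multiple):
    """Smallest multiple of `multiple` that is >= value."""
    return ((value + multiple - 1) // multiple) * multiple


def vector_add_kernel(global_id, a, b, c, n):
    """Python mirror of a vector-add kernel body for one work-item."""
    if global_id < n:  # padding work-items beyond the data do nothing
        c[global_id] = a[global_id] + b[global_id]


def run_vector_add(a, b, local_size=4):
    """Emulate the host side: allocate C, launch the kernel, return C."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    n = len(a)
    c = [0] * n  # stands in for the output buffer on the device
    global_size = round_up(n, local_size)
    for global_id in range(global_size):  # one iteration per work-item
        vector_add_kernel(global_id, a, b, c, n)
    return c


if __name__ == "__main__":
    print(run_vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
    # prints [11, 22, 33, 44, 55]
```

With five elements and a work-group size of four, eight work-items are launched and the last three fall through the bounds check.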





Example: first-order feature computation for 3D data

An open-source Python package for extracting radiomic features from medical imaging:
Radiomic Features — pyradiomics v3.0.1.post15+g2791e23 documentation
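A few first-order features can be computed from per-slice partial results, which is the shape such a computation tends to take when it is split across work-groups. This is a minimal sketch in plain Python under stated assumptions: it treats every voxel of the volume as part of the region of interest, uses the textbook definitions (energy as the sum of squares, variance as the mean squared deviation from the mean), and ignores the masking and configuration options that pyradiomics provides. The function names are hypothetical.

```python
import math


def slice_partials(slice_2d):
    """Partial results for one z-slice: (count, sum, sum of squares, min, max)."""
    values = [float(v) for row in slice_2d for v in row]
    return (
        len(values),
        sum(values),
        sum(v * v for v in values),
        min(values),
        max(values),
    )


def first_order_features(volume):
    """Combine per-slice partials into a few first-order features.

    `volume` is a non-empty nested list indexed as volume[z][y][x].
    """
    partials = [slice_partials(slice_2d) for slice_2d in volume]
    count = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    energy = sum(p[2] for p in partials)
    minimum = min(p[3] for p in partials)
    maximum = max(p[4] for p in partials)

    mean = total / count
    # E[X^2] - E[X]^2: fits a reduction well, but is less numerically
    # stable than a two-pass computation.
    variance = energy / count - mean * mean
    return {
        "mean": mean,
        "energy": energy,
        "variance": variance,
        "minimum": minimum,
        "maximum": maximum,
        "range": maximum - minimum,
        "root_mean_squared": math.sqrt(energy / count),
    }


if __name__ == "__main__":
    volume = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    for name, value in first_order_features(volume).items():
        print(f"{name}: {value}")
```

Only sums, sums of squares, minima, and maxima cross slice boundaries, so each slice can be reduced independently before the final combination on the host.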