[GPU Parallel Computing] OpenCL & OpenCLUtility: GPU Parallel Computing
2022-07-30 04:38:00 【JinSu_】
Background

CPU: fewer compute cores. It is very limited for large-scale parallel computation, but good at flow control and logic processing.
GPU: many more compute cores. It suits compute-intensive, data-parallel tasks.
Heterogeneous computing: the CPU handles complex logic and flow control, and calls the GPU for parallel computation when a large amount of uniformly typed data needs to be processed.
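OpenCL vs. CUDA: differences & downloads
OpenCL (Open Computing Language) is the first cross-platform open standard for parallel programming on heterogeneous systems (systems that may be made up of CPUs, GPUs, or other processor architectures).
CUDA (Compute Unified Device Architecture) is a computing platform from the GPU vendor NVIDIA. The architecture lets GPUs solve complex computational problems. It includes the CUDA instruction set architecture (ISA) and the parallel compute engine inside the GPU. Developers can write programs for the CUDA architecture in C, C++, and Fortran (or OpenCL), and those programs run with very high performance on CUDA-capable processors.
Of CUDA and OpenCL, the former is a mature development platform with a complete toolkit that targets a single vendor (NVIDIA), while the latter is an open standard. OpenCL is an API and sits at the first level. The CUDA architecture sits one level higher: on top of it, APIs such as OpenCL and DX11, as well as C, Fortran, and DX11 compute, can all be supported.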

Choose & Download Intel SDK for OpenCL Applications

CUDA Toolkit Archive | NVIDIA Developer
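Basic OpenCL concepts
platform & device
A platform can be thought of as one vendor's implementation of the OpenCL API. Once a platform is selected, generally only the devices that platform supports can be used. As things currently stand, choosing Intel's OpenCL SDK means computing only on Intel CPUs, while choosing AMD's APP SDK allows computing on AMD CPUs and AMD GPUs.
In general, once company A's platform is selected, it cannot communicate with company B's platform.
A platform consists of two parts, the host and the device:
Host: usually the CPU. It plays the role of organizer.
Device: usually called the compute device. A device has one or more compute units, and each compute unit contains one or more processing elements. A program that runs on the device is called a kernel function. (On writing kernels: in CUDA they are usually written directly in the program source, while in OpenCL they are written in a separate file with the .cl suffix, which the host code reads in and then executes.)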

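Context
The context defines the entire OpenCL runtime environment, including the device, kernel, program object, memory object, and command queue (CommandQueue):
Device: the compute device that the OpenCL program calls.
Kernel: the entry function of the program that performs the computation on the device. It is invoked from the host.
Program object: the source code of the kernel program (.cl file) and its executable.
Memory object: the variables the compute device needs in order to execute the OpenCL program.
Command queue (CommandQueue): the queue controls details such as how and when a kernel executes.
Kernel & program object
Work-item: the same concept as a thread in CUDA.
Many work-items (threads) execute the same kernel function. Each work-item has a unique, fixed ID, and that ID is generally used to select the data the work-item should process. A work-item is an abstract unit of computation. It does not map one-to-one onto the compute units allocated at the physical level, and one physical compute unit may also process several work-items.
Work-group: the same concept as a thread block in CUDA.
Many work-items make up a work-group, and the work-items within a work-group can communicate and cooperate with one another.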

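Kernel launch interface: clEnqueueNDRangeKernel

This is the same concept as CUDA's "<<<blocks, threads>>>" syntax, which specifies how many threads a kernel launch schedules. It defines how work-groups are organized.
Three parameters are important:
global_work_offset: the offset applied to global_id
global_work_size: the total number of work-items
local_work_size: a virtual partition that defines the number of work-items in each work-group
Example: getting the current x coordinate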
get_global_id(0) = get_group_id(0) * get_local_size(0) + get_local_id(0) + get_global_offset(0) ;
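The relationship above can be checked with a small CPU-side emulation. This is only a sketch in plain Python, not OpenCL host code: the function name enumerate_work_items is hypothetical, only one dimension is modelled, and it assumes global_work_size is a multiple of local_work_size (uniform work-groups).

```python
# CPU-side emulation of a 1-D NDRange; illustrative only, no OpenCL API is called.

def enumerate_work_items(global_work_offset, global_work_size, local_work_size):
    """Yield (group_id, local_id, global_id) for every work-item in one dimension."""
    if global_work_size % local_work_size != 0:
        # Assumption of this sketch: work-groups are uniform in size.
        raise ValueError("global_work_size must be a multiple of local_work_size")
    num_groups = global_work_size // local_work_size
    for group_id in range(num_groups):
        for local_id in range(local_work_size):
            # Same formula as in the text, for dimension 0.
            global_id = group_id * local_work_size + local_id + global_work_offset
            yield group_id, local_id, global_id


items = list(enumerate_work_items(global_work_offset=4,
                                  global_work_size=8,
                                  local_work_size=4))
for group_id, local_id, global_id in items:
    print(f"group {group_id}, local {local_id} -> global {global_id}")
```

With an offset of 4, the eight work-items are split into two work-groups of four, and their global IDs start at 4 rather than 0.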
Kernel function & kernel & kernel launch
A simple kernel: multiply the data in the input 2D array by 2 and write the result into the output array.
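The original kernel listing did not survive on this page, so what follows is a hedged reconstruction of what such a kernel might look like, not the author's code. The kernel name scale_by_two, its parameters, and the row-major layout are assumptions. The OpenCL C source is kept as a string and is not compiled here. The Python functions mirror it on the CPU, with one call per work-item.

```python
# Hypothetical OpenCL C kernel, kept as a string; it is NOT compiled in this sketch.
KERNEL_SOURCE = """
__kernel void scale_by_two(__global const float* input,
                           __global float* output,
                           const int width)
{
    const int x = get_global_id(0);   // column handled by this work-item
    const int y = get_global_id(1);   // row handled by this work-item
    output[y * width + x] = input[y * width + x] * 2.0f;
}
"""


def scale_by_two_work_item(x, y, input_flat, output_flat, width):
    """Python mirror of the kernel body for the single work-item at (x, y)."""
    output_flat[y * width + x] = input_flat[y * width + x] * 2.0


def launch_2d(width, height, input_flat):
    """Emulate a 2-D NDRange of width x height work-items, run sequentially."""
    output_flat = [0.0] * (width * height)
    for y in range(height):          # plays the role of get_global_id(1)
        for x in range(width):       # plays the role of get_global_id(0)
            scale_by_two_work_item(x, y, input_flat, output_flat, width)
    return output_flat


width, height = 3, 2
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # row-major array with 2 rows and 3 columns
result = launch_2d(width, height, data)
print(result)
```

On a real device the two loops disappear: the runtime schedules one work-item per (x, y) pair, and each work-item touches exactly one element.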


An OpenCL wrapper library: OpenCLUtility


https://registry.khronos.org/OpenCL/specs/opencl-cplusplus-1.2.pdf
https://github.com/smistad/OpenCLUtilityLibrary
Example: A + B = C

Here an OpenCL program is developed on top of OpenCLUtility to compute A + B = C. The flow is similar to the figure above.
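The host program itself is not reproduced on this page, so the sketch below does not use OpenCLUtility or any OpenCL call. It only emulates, in plain Python, the per-work-item logic of an element-wise addition. The names round_up, vector_add_work_item, and vector_add are hypothetical. It also illustrates one common pattern, assumed here rather than taken from the original: padding global_work_size up to a multiple of local_work_size and guarding the kernel body with a bounds check.

```python
# CPU-side emulation of an element-wise A + B = C launch; no OpenCL API is used.

def round_up(value, multiple):
    """Smallest multiple of `multiple` that is >= value."""
    return ((value + multiple - 1) // multiple) * multiple


def vector_add_work_item(global_id, a, b, c, n):
    """Body of the emulated kernel: one work-item adds one pair of elements."""
    if global_id < n:                 # padded work-items beyond n do nothing
        c[global_id] = a[global_id] + b[global_id]


def vector_add(a, b, local_work_size=4):
    """Emulate the launch: pad the global size, then run every work-item in turn."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    n = len(a)
    c = [0.0] * n                     # stands in for the output buffer
    global_work_size = round_up(n, local_work_size)
    for global_id in range(global_work_size):
        vector_add_work_item(global_id, a, b, c, n)
    return c, global_work_size


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
c, global_work_size = vector_add(a, b)
print(c, global_work_size)
```

Five elements with a work-group size of 4 lead to 8 scheduled work-items, and the last three fall outside the data and are skipped by the guard.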





Example: first-order feature computation for 3D data

pyradiomics: an open-source Python package for extracting radiomic features from medical images
Radiomic Features — pyradiomics v3.0.1.post15+g2791e23 documentation
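To make the idea concrete, here is a minimal sketch of how a few first-order features could be computed with a GPU-style two-stage reduction, emulated on the CPU in plain Python. It is not the author's implementation and it does not call pyradiomics. The function names are hypothetical, no region-of-interest mask is applied, and the definitions used are assumptions of this sketch: energy as the sum of squared voxel values, and variance as the population variance.

```python
# Two-stage reduction emulated on the CPU; each "work-group" reduces one chunk.

def partial_sums(values, group_size):
    """Stage 1: each emulated work-group returns (sum, sum of squares, min, max)."""
    partials = []
    for start in range(0, len(values), group_size):
        chunk = values[start:start + group_size]
        partials.append((sum(chunk),
                         sum(v * v for v in chunk),
                         min(chunk),
                         max(chunk)))
    return partials


def first_order_features(volume, group_size=4):
    """Stage 2: combine the per-group partial results on the host side."""
    flat = [v for plane in volume for row in plane for v in row]  # flatten z, y, x
    partials = partial_sums(flat, group_size)
    n = len(flat)
    total = sum(p[0] for p in partials)
    energy = sum(p[1] for p in partials)
    minimum = min(p[2] for p in partials)
    maximum = max(p[3] for p in partials)
    mean = total / n
    variance = energy / n - mean * mean   # E[x^2] - (E[x])^2, population variance
    return {"energy": energy, "mean": mean, "variance": variance,
            "minimum": minimum, "maximum": maximum, "range": maximum - minimum}


# A tiny 2 x 2 x 2 volume holding the values 1..8.
volume = [[[1.0, 2.0], [3.0, 4.0]],
          [[5.0, 6.0], [7.0, 8.0]]]
features = first_order_features(volume)
print(features)
```

The per-group stage is the part that would map onto work-groups on a device. The final combination over a handful of partial results is cheap enough to do on the host.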