High-performance computing frameworks for image processing
2022-06-12 10:04:00 【zhoukehu91】
| Platform | Framework | Introduction |
| --- | --- | --- |
| GPU | NPP | NVIDIA Performance Primitives: NVIDIA's GPU-accelerated library for image, video, and signal processing. It is installed automatically with the CUDA environment. Calling NPP functions removes the need to hand-write CUDA kernel functions, which speeds up development. |
| GPU | CUDA | Compute Unified Device Architecture: a parallel computing platform and programming model developed by NVIDIA. It uses the processing power of the graphics processor (GPU) and can greatly improve computing performance. You need to learn its development language, CUDA C, before developing with it. |
| CPU | IPP | Integrated Performance Primitives: Intel's high-performance multimedia function library. It contains many functions optimized at a low level and covers a variety of applications, including image processing. Its interface is similar to that of NPP. For an introduction to the image processing functions IPP provides, see the referenced blog. |
| CPU | TBB | Threading Building Blocks: a C++ template library developed by Intel for parallel programming. It offers a higher level of abstraction than raw threads and is mainly used for multithreaded acceleration on multi-core CPUs. |
In summary:
1) NPP and IPP are both prepackaged function libraries that mainly provide general-purpose algorithms, such as filtering and color space conversion in image processing. NPP provides parallel acceleration on the GPU; IPP provides multithreaded parallel acceleration on the CPU.
2) CUDA and TBB are the parallel development frameworks for the GPU and the CPU, respectively. A typical case is the pixel-by-pixel for loop in image processing: with CUDA you write a kernel function so that many CUDA cores process pixels in parallel, while TBB's interfaces (such as parallel_for) spread the same work across multiple CPU cores.
3) During development, I first used OpenMP to accelerate image processing on the CPU, but CPU usage was high and the processing speed did not improve. I then switched to TBB, which achieved the expected result. IPP and TBB can both be downloaded from Intel's website.
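
The row-wise split described in point 2 can be sketched without either framework. The following is a minimal illustration that uses `std::thread` from the C++ standard library, not TBB or CUDA; `ForEachRowParallel` and `InvertGray` are hypothetical names introduced only for this example. It divides the rows into fixed bands, whereas TBB's scheduler also balances the load between threads at run time.

```cpp
#include <algorithm>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not a TBB or CUDA API): split rows [0, height) into
// contiguous bands and run fn(rowBegin, rowEnd) on each band in its own thread.
void ForEachRowParallel(int height, int numThreads,
                        const std::function<void(int, int)>& fn)
{
    if (height <= 0)
        return;
    numThreads = std::max(1, std::min(numThreads, height));
    const int band = (height + numThreads - 1) / numThreads;  // rows per thread, rounded up
    std::vector<std::thread> workers;
    for (int begin = 0; begin < height; begin += band)
    {
        const int end = std::min(begin + band, height);
        workers.emplace_back(fn, begin, end);
    }
    for (std::thread& t : workers)
        t.join();
}

// Example per-pixel operation: invert an 8-bit grayscale image in place.
// Each thread writes only to its own rows, so no locking is needed.
void InvertGray(std::vector<unsigned char>& pixels, int width, int height, int numThreads)
{
    unsigned char* p = pixels.data();
    ForEachRowParallel(height, numThreads, [p, width](int rowBegin, int rowEnd) {
        for (int i = rowBegin; i < rowEnd; ++i)
            for (int j = 0; j < width; ++j)
                p[i * width + j] = static_cast<unsigned char>(255 - p[i * width + j]);
    });
}
```

Because every output pixel depends only on the input pixel at the same position, the bands are independent, which is what makes this kind of loop a good fit for both CUDA kernels and TBB's `parallel_for`.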
Finally, here is the key code snippet that the author accelerated with TBB. It performs color correction on a color image. On a Xeon E3-1230 v2 platform (4 cores, 8 threads), the execution speed of the algorithm improved significantly. The code is as follows:

    // Color correction for an 8-bit, 3-channel color image (OpenCV BGR order).
    // nR, nG and nB are per-channel gains in percent, clamped to [0, 100].
    #include <opencv2/core.hpp>
    #include <tbb/blocked_range2d.h>
    #include <tbb/parallel_for.h>

    using cv::Mat;

    void ColorCorrect8UC3(Mat source, Mat& dst, int nR, int nG, int nB)
    {
        dst = source.clone();
        if ((nR == 100) && (nG == 100) && (nB == 100))
            return;
        // clone() returns a continuous buffer, so the flat indexing below is valid.
        Mat src = source.clone();
        if (nR < 0)
            nR = 0;
        if (nR > 100)
            nR = 100;
        if (nG < 0)
            nG = 0;
        if (nG > 100)
            nG = 100;
        if (nB < 0)
            nB = 0;
        if (nB > 100)
            nB = 100;
        int width = src.cols;
        int height = src.rows;
        unsigned char* pSrc = src.ptr();
        unsigned char* pDst = dst.ptr();
        // parallel_for combined with blocked_range2d is well suited to image processing.
        // blocked_range2d constructor arguments:
        // (row begin, row end, row grain size, column begin, column end, column grain size)
        tbb::parallel_for(tbb::blocked_range2d<int>(0, height, 1, 0, width, 1),
            [&](const tbb::blocked_range2d<int>& r)
        {
            for (int i = r.rows().begin(); i < r.rows().end(); ++i)
            {
                for (int j = r.cols().begin(); j < r.cols().end(); ++j)
                {
                    // Channel offsets: +2 is red, +1 is green, +0 is blue.
                    pDst[i*width * 3 + j * 3 + 2] = (unsigned char)(pSrc[i*width * 3 + j * 3 + 2] * nR / 100.0);
                    pDst[i*width * 3 + j * 3 + 1] = (unsigned char)(pSrc[i*width * 3 + j * 3 + 1] * nG / 100.0);
                    pDst[i*width * 3 + j * 3 + 0] = (unsigned char)(pSrc[i*width * 3 + j * 3 + 0] * nB / 100.0);
                }
            }
        });
    }
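
To check that a parallel version produces the same pixels as a plain loop, it can help to compare it against a serial reference. The sketch below is a dependency-free version under the same assumptions as the function above (interleaved BGR, 8 bits per channel, one continuous buffer); `ColorCorrectReference` is a hypothetical name and is not part of the original code.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical serial reference: scales each channel by its gain in percent.
// src and dst each hold width * height * 3 bytes in B, G, R order.
void ColorCorrectReference(const unsigned char* src, unsigned char* dst,
                           int width, int height, int nR, int nG, int nB)
{
    // Clamp the gains to [0, 100], as the parallel function does.
    nR = std::max(0, std::min(nR, 100));
    nG = std::max(0, std::min(nG, 100));
    nB = std::max(0, std::min(nB, 100));

    const std::size_t count = static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    for (std::size_t k = 0; k < count; ++k)
    {
        // Same arithmetic as the parallel loop: divide by 100.0, then truncate.
        dst[k * 3 + 0] = static_cast<unsigned char>(src[k * 3 + 0] * nB / 100.0);
        dst[k * 3 + 1] = static_cast<unsigned char>(src[k * 3 + 1] * nG / 100.0);
        dst[k * 3 + 2] = static_cast<unsigned char>(src[k * 3 + 2] * nR / 100.0);
    }
}
```

Running both versions on the same input and comparing the output buffers byte by byte is a simple way to confirm that the parallel split introduced no indexing errors.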