Principles and examples of OpenMP tasks
2022-06-12 07:57:00 【jackniu123】
Personal understanding
Since version 3.0, OpenMP has been moving toward a task-based model. The task mechanism is central to it: tasks can be defined explicitly, and any code in a parallel region that is not placed inside an explicit task is in fact executed as an implicit task.
Conceptually there are two pools: a thread pool and a task pool. Idle threads wait in the thread pool for work, while explicit and implicit tasks sit in the task pool until a thread picks them up.
The task directive explicitly defines a task. It is typically used for irregular loops (loops that parallel for does not fit) and for recursive functions. Each task is executed by a single thread. When a thread encounters a task directive, it may execute the task immediately itself, or the task may be placed in the task pool to be picked up by another thread.
The taskwait directive explicitly waits until the child tasks that the current task has generated so far have finished. It is used for synchronization.
final: when the final condition is true, the task becomes a final task, and every task created inside it is executed immediately by the encountering thread instead of being deferred. In spirit it works like if.
if: when the if condition is false, the task is not deferred to the task pool; the thread that encounters it executes it immediately.
Both final and if keep overly fine-grained tasks out of the task pool, where they would waste resources. Coarse-grained tasks are preferable because they amortize the overhead of task creation, scheduling, and synchronization.
The official sample
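#include <stdio.h>
#include <omp.h>

int fib(int n)
{
    int i, j;
    if (n < 2)
        return n;
    else
    {
        /* i and j are local to this call, so they must be declared shared
           for the child tasks to write their results back. */
        #pragma omp task shared(i) firstprivate(n)
        i = fib(n - 1);
        #pragma omp task shared(j) firstprivate(n)
        j = fib(n - 2);
        /* Wait for both child tasks before reading i and j. */
        #pragma omp taskwait
        return i + j;
    }
}

int main()
{
    int n = 10;
    omp_set_dynamic(0);
    omp_set_num_threads(4);
    #pragma omp parallel shared(n)
    {
        /* Only one thread starts the recursion; the others execute the tasks it generates. */
        #pragma omp single
        printf ("fib(%d) = %d\n", n, fib(n));
    }
}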
Application to bitonic merge sort
The idea behind bitonic merge sort is not complicated, but it is still tricky to understand; it consists of two recursive parts.
The first step splits the sequence into two halves and sorts the first half in ascending order and the second half in descending order, which yields a bitonic sequence. The second step uses the properties of a bitonic sequence to turn it into a fully ordered one. Detailed explanations can be found on CSDN and Zhihu.
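#include <stdio.h>
#include <stdlib.h>
#include <algorithm>
#include <functional>
#include <omp.h>

typedef std::function<bool(int, int)> Func;
// op[0] orders a pair ascending, op[1] descending; both are filled in by main().
Func op[2];

void to_one_sequence(int* a, int n, int selector);

// Sorts a[0..n) in the direction chosen by selector (0: ascending, 1: descending).
// n must be a power of two.
void bitonicSort(int* a, int n, int selector)
{
    if (n <= 1)
        return;
    // Sort the two halves in opposite directions so that together they form
    // a bitonic sequence. The if clause keeps small subproblems from being deferred.
    #pragma omp task if(n > 1024)
    bitonicSort(a, n / 2, selector);
    #pragma omp task if(n > 1024)
    bitonicSort(a + n / 2, n / 2, selector ^ 1);
    #pragma omp taskwait
    to_one_sequence(a, n, selector);
}

// Merges the bitonic sequence a[0..n) into a fully ordered one.
void to_one_sequence(int* a, int n, int selector)
{
    if (n <= 1)
        return;
    // Compare each element with its partner half a sequence away and swap if needed.
    const Func& judge = op[selector];
    for (int i = 0; i < n / 2; i++) {
        if (judge(a[i], a[i + n / 2]))
            std::swap(a[i], a[i + n / 2]);
    }
    // Both halves are now bitonic and independent, so they can be merged as tasks.
    #pragma omp task if(n > 1024)
    to_one_sequence(a, n / 2, selector);
    #pragma omp task if(n > 1024)
    to_one_sequence(a + n / 2, n / 2, selector);
    #pragma omp taskwait
}

int main()
{
    // Each predicate returns true when the pair (a, b) has to be swapped.
    Func big = [](int a, int b) -> bool { return a <= b; };
    Func small = [](int a, int b) -> bool { return a >= b; };
    op[0] = small;  // smaller element first: ascending
    op[1] = big;    // larger element first: descending

    const int maxn = 1 << 22;  // 2^22 elements; the length must be a power of two
    int* a = new int[maxn];
    // rand() is not guaranteed to be thread-safe, so the array is filled serially.
    for (int i = 0; i < maxn; i++)
        a[i] = rand() % 10000;

    omp_set_dynamic(1);
    //omp_set_num_threads(4);
    double begin = omp_get_wtime();
    #pragma omp parallel
    {
        #pragma omp single
        {
            printf("%d\n", omp_get_num_threads());
            bitonicSort(a, maxn, 0);
        }
    }
    printf("%lf\n", omp_get_wtime() - begin);

    // Verify that the result is in ascending order.
    bool ok = true;
    for (int i = 1; i < maxn; i++) {
        if (a[i] < a[i - 1]) {
            ok = false;
            break;
        }
    }
    printf("%d\n", ok);
    delete[] a;
    return 0;
}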