Allocated memory, std::transform, and AVX 256-bit instructions: usage examples and a speed comparison
2022-07-30 06:00:00 【ZHY.Spiritual】
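The example throughout is the pairwise (cross) subtraction of two one-dimensional arrays: every element of the first array minus every element of the second.
1. Operating directly on allocated memory through pointers. For the data, load your own simulated values.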
int m_iMaxDiff;                      // largest |difference| to count; must be set before use
QMap<long long, int> m_mStatisTau;   // histogram: difference -> number of occurrences
int iSize1 = 7000;
int iSize2 = 7000;
long long *pDataStop1 = new long long[iSize1];   // fill both arrays with your own data
long long *pDataStop2 = new long long[iSize2];
long long iStop2;
long long iValue;
// Unrolled by 4 over pDataStop1, so iSize1 must be a multiple of 4.
for(int j = 0; j < iSize1; j += 4) {
    for(int k = 0; k < iSize2; k++) {
        iStop2 = pDataStop2[k];
        iValue = pDataStop1[j] - iStop2;
        if(std::abs(iValue) <= m_iMaxDiff) {
            m_mStatisTau[iValue] += 1;
        }
        iValue = pDataStop1[j + 1] - iStop2;
        if(std::abs(iValue) <= m_iMaxDiff) {
            m_mStatisTau[iValue] += 1;
        }
        iValue = pDataStop1[j + 2] - iStop2;
        if(std::abs(iValue) <= m_iMaxDiff) {
            m_mStatisTau[iValue] += 1;
        }
        iValue = pDataStop1[j + 3] - iStop2;
        if(std::abs(iValue) <= m_iMaxDiff) {
            m_mStatisTau[iValue] += 1;
        }
    }
}
// Release the arrays once they are no longer needed.
delete[] pDataStop1;
delete[] pDataStop2;

This takes roughly 310 ms to run.
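The loop above reads four elements of pDataStop1 per step, so it relies on iSize1 being a multiple of 4. For reference, here is a minimal non-unrolled sketch of the same computation that works for any size, together with a small timing helper. This is not the author's code: std::map stands in for QMap so the sketch builds without Qt, the names crossDiffScalar and elapsedMs are hypothetical, and the article does not say how its timings were taken, so the std::chrono measurement is only one plausible way to reproduce them.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <map>

// Assumption: std::map replaces the article's QMap so no Qt is needed.
using Histogram = std::map<long long, int>;

// Count every difference a[j] - b[k] whose magnitude is at most maxDiff.
// No unrolling, so n1 does not have to be a multiple of 4.
inline void crossDiffScalar(const long long* a, std::size_t n1,
                            const long long* b, std::size_t n2,
                            long long maxDiff, Histogram& hist) {
    assert(maxDiff >= 0);
    for (std::size_t j = 0; j < n1; ++j) {
        for (std::size_t k = 0; k < n2; ++k) {
            const long long diff = a[j] - b[k];
            if (std::llabs(diff) <= maxDiff) {
                ++hist[diff];
            }
        }
    }
}

// Hypothetical helper: wall-clock duration of one call, in milliseconds.
template <typename F>
double elapsedMs(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as elapsedMs([&] { crossDiffScalar(a, n1, b, n2, maxDiff, hist); }) returns the elapsed time for one pass. Absolute numbers depend on the data, compiler flags, and machine, so they will not necessarily match the figures quoted in this post.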
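2. std::transform with an execution policy. The policy overloads require C++17 and may run multithreaded internally; of the available policies, std::execution::par_unseq was the fastest.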
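// Requires <algorithm>, <execution>, <vector> and <QVector>.
// iSize, vDataStop1, vDataStop2 and CalculatePoor are not defined in this excerpt.
for(int i = 0; i < iSize; i++) {
    int iSize1 = vDataStop2[i].size();
    std::vector<long long> vOut(iSize1);
    for(int j = 0; j < vDataStop1[i].size(); j++) {
        // Broadcast the current element so that std::transform can pair it
        // with every element of vDataStop2[i].
        QVector<long long> vStop1(iSize1);
        std::fill(vStop1.begin(), vStop1.end(), vDataStop1[i][j]);
        std::transform(std::execution::par_unseq, vDataStop2[i].begin(),
                       vDataStop2[i].end(), vStop1.begin(), vOut.begin(), CalculatePoor);
    }
}

With vectors this takes roughly 490 ms; with raw pointers, roughly 410 ms. See my previous blog post, which covers this in more detail.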
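The listing above fills a temporary buffer with one repeated value on every inner iteration, only so that the two-range form of std::transform has a second input. A minimal sketch of an alternative is shown below: the one-range form with a lambda that captures the value. It assumes CalculatePoor is a plain subtraction (stop1 minus stop2, as in approach 1), which this post does not confirm, and the name diffAgainstAll is hypothetical. The execution policy is left out so that the sketch builds without a parallel backend; on some toolchains the C++17 policy overloads need an extra library (libstdc++, for example, relies on TBB). With such a backend available, including <execution> and passing std::execution::par_unseq as the first argument should give the parallel variant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Write stop1 - b[k] into out[k] for every k.
// Assumption: the article's CalculatePoor performs this subtraction.
// The lambda captures stop1, so no broadcast buffer has to be filled first.
inline void diffAgainstAll(long long stop1,
                           const std::vector<long long>& b,
                           std::vector<long long>& out) {
    assert(out.size() >= b.size());
    std::transform(b.begin(), b.end(), out.begin(),
                   [stop1](long long stop2) { return stop1 - stop2; });
}
```

Dropping the per-iteration allocation and std::fill removes work that does not exist in the pointer version at all, which may account for part of the gap between the timings, though that has not been measured here.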
3. AVX 256-bit instruction set (the 64-bit integer subtraction used here requires AVX2)
// Requires <immintrin.h>; on GCC/Clang build with AVX2 enabled (e.g. -mavx2).
__m256i m1, m2;
long long re[4];
// Four elements of pDataStop1 per step, so iSize1 must be a multiple of 4.
for(int j = 0; j < iSize1; j += 4) {
    m1 = _mm256_set_epi64x(pDataStop1[j],pDataStop1[j+1],pDataStop1[j+2], pDataStop1[j+3]);
    for(int k = 0; k < iSize2; k++) {
        // Broadcast pDataStop2[k] into all four 64-bit lanes.
        m2 = _mm256_set1_epi64x(pDataStop2[k]);
        __m256i l1 = _mm256_sub_epi64(m1, m2);
        // Copy the four differences out. The original l1.m256i_i64[n] member
        // access only compiles with MSVC; a store works on every compiler.
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(re), l1);
        if(std::abs(re[3]) <= m_iMaxDiff) {
            m_mStatisTau[re[3]] += 1;
        }
        if(std::abs(re[2]) <= m_iMaxDiff) {
            m_mStatisTau[re[2]] += 1;
        }
        if(std::abs(re[1]) <= m_iMaxDiff) {
            m_mStatisTau[re[1]] += 1;
        }
        if(std::abs(re[0]) <= m_iMaxDiff) {
            m_mStatisTau[re[0]] += 1;
        }
    }
}

This takes roughly 320 ms to run.
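The AVX time is essentially the same as the scalar time, which suggests the subtraction itself is probably not the bottleneck: the comparison and the map update are still done one element at a time in both versions. The sketch below restates the AVX2 approach in a self-contained form under several assumptions. std::map replaces QMap, all function names are hypothetical, the AVX2 path is compiled only for GCC/Clang on x86 (using the target attribute so that no -mavx2 flag is needed), and a runtime check falls back to a scalar loop when AVX2 is unavailable. It also adds a scalar tail so the first array need not be a multiple of 4 long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// The AVX2 path is compiled only for GCC/Clang on x86 (an assumption of this
// sketch); everywhere else the scalar loop is used.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define SKETCH_HAS_AVX2 1
#else
#define SKETCH_HAS_AVX2 0
#endif

// Assumption: std::map replaces the article's QMap so no Qt is needed.
using Histogram = std::map<long long, int>;

// Record one difference if its magnitude is within the limit.
inline void countDiff(long long diff, long long maxDiff, Histogram& hist) {
    if (std::llabs(diff) <= maxDiff) {
        ++hist[diff];
    }
}

// Scalar reference, also used as the fallback path.
inline void crossDiffReference(const long long* a, std::size_t n1,
                               const long long* b, std::size_t n2,
                               long long maxDiff, Histogram& hist) {
    for (std::size_t j = 0; j < n1; ++j) {
        for (std::size_t k = 0; k < n2; ++k) {
            countDiff(a[j] - b[k], maxDiff, hist);
        }
    }
}

#if SKETCH_HAS_AVX2
// Four 64-bit subtractions per instruction; counting stays scalar.
__attribute__((target("avx2")))
inline void crossDiffAvx2(const long long* a, std::size_t n1,
                          const long long* b, std::size_t n2,
                          long long maxDiff, Histogram& hist) {
    long long lanes[4];
    std::size_t j = 0;
    for (; j + 4 <= n1; j += 4) {
        // Unaligned load of a[j..j+3].
        const __m256i va =
            _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + j));
        for (std::size_t k = 0; k < n2; ++k) {
            const __m256i vb = _mm256_set1_epi64x(b[k]);   // broadcast b[k]
            const __m256i vd = _mm256_sub_epi64(va, vb);   // a[j..j+3] - b[k]
            _mm256_storeu_si256(reinterpret_cast<__m256i*>(lanes), vd);
            for (long long diff : lanes) {
                countDiff(diff, maxDiff, hist);
            }
        }
    }
    // Scalar tail for the last n1 % 4 elements.
    crossDiffReference(a + j, n1 - j, b, n2, maxDiff, hist);
}
#endif

// Pick the AVX2 path when the CPU supports it, otherwise the scalar loop.
inline void crossDiff(const long long* a, std::size_t n1,
                      const long long* b, std::size_t n2,
                      long long maxDiff, Histogram& hist) {
    assert(maxDiff >= 0);
#if SKETCH_HAS_AVX2
    if (__builtin_cpu_supports("avx2")) {
        crossDiffAvx2(a, n1, b, n2, maxDiff, hist);
        return;
    }
#endif
    crossDiffReference(a, n1, b, n2, maxDiff, hist);
}
```

Both paths should produce the same histogram for the same input, which makes crossDiffReference a convenient check on the vector path before comparing timings.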