[MindSpore Ascend] [Custom operator] Why does repeatedly assigning values to one tensor affect another tensor?
2022-07-25 00:12:00 【Xiaole happy】
1. A Tensor is taken as input, copied into the UB (unified buffer), and then copied from the UB to the output. Between these two copies, another, unrelated Tensor is written to many times, and this changes the output produced from the first Tensor.
2. With MAX_ITER = 1000000 in the file, the output is all zeros.
3. With MAX_ITER = 1 in the file, the output is correct.
[Screenshots: erroneous output; correct output]
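The two outcomes described above can also be told apart programmatically rather than by eye. The following is a minimal sketch, not part of the original post: it assumes the expected result is the first row of the input (which is what the final data_move in the code further down appears to copy), it reuses the relative-error formula of the post's hfe lambda in plain Python, and the names classify and tol are hypothetical.

```python
def hfe(x, y, eps=1e-2):
    """Symmetric relative error, same formula as the hfe lambda in the post."""
    return max(abs(a - b) / (abs(a) + abs(b) + eps) for a, b in zip(x, y))


def classify(expected, actual, tol=1e-3):
    """Label a 1-D result as 'all_zero', 'match' or 'mismatch'.

    tol is an arbitrary illustrative threshold, not a value from the post.
    """
    if all(v == 0 for v in actual):
        return "all_zero"
    return "match" if hfe(expected, actual) < tol else "mismatch"


# Stand-in values; in the post these would be np0[0] and ret0.
expected = [0.5, -1.25, 2.0, 0.75]
print(classify(expected, [0.0, 0.0, 0.0, 0.0]))
print(classify(expected, list(expected)))
```

Because both functions only iterate over their arguments, they should also accept 1-D NumPy arrays such as np0[0] and ret0. Note that an input row that is itself all zeros would be reported as all_zero.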
Code:
import numpy as np
import mindspore as ms
import te
import topi
import te.tik

# Symmetric relative error between two arrays (defined but not used below).
hfe = lambda x, y, eps=1e-2: np.max(np.abs(x - y) / (np.abs(x) + np.abs(y) + eps))
ms.context.set_context(mode=ms.context.GRAPH_MODE, device_target="Ascend")

# Registration info for the custom TBE operator.
cus_S0_op_info = ms.ops.TBERegOp("CusS0") \
    .fusion_type("OPAQUE") \
    .async_flag(False) \
    .binfile_name("S0.so") \
    .compute_cost(10) \
    .kernel_name("CusS0Impl") \
    .partial_flag(True) \
    .input(0, "input1", False, "required", "all") \
    .output(0, "output", False, "required", "all") \
    .dtype_format(ms.ops.DataType.F32_Default, ms.ops.DataType.F32_Default) \
    .dtype_format(ms.ops.DataType.F16_Default, ms.ops.DataType.F16_Default) \
    .get_op_info()


@ms.ops.op_info_register(cus_S0_op_info)
def CusS0Impl(input1, output, kernel_name="CusS0Impl"):
    tik_instance = te.tik.Tik()
    input_shape = input1.get("shape")
    M = input_shape[0]
    N = input_shape[1]
    # Burst lengths for data_move, rounded up to units of 16 float16 elements.
    bLength_N = ((N - 1) // 16) + 1
    bLength = ((M * N - 1) // 16) + 1
    MAX_ITER = 1000000
    input1 = tik_instance.Tensor("float16", (M, N), name="input1", scope=te.tik.scope_gm)
    output = tik_instance.Tensor("float16", (M,), name="output", scope=te.tik.scope_gm)
    # First copy: global memory -> UB.
    input1_ub = tik_instance.Tensor("float16", (M, N), name="input1_ub", scope=te.tik.scope_ubuf)
    tik_instance.data_move(input1_ub, input1, 0, 1, bLength, 0, 0)
    # Unrelated UB tensor that is overwritten MAX_ITER times.
    tmp_vector = tik_instance.Tensor("float16", (M,), name="tmp_vector", scope=te.tik.scope_ubuf)
    repeat = tik_instance.Scalar('int32')
    r = ((M - 1) // 128) + 1
    repeat.set_as(r)
    with tik_instance.for_range(0, MAX_ITER) as i:
        tik_instance.vec_dup(128, tmp_vector, 0.0, repeat, 8)
    # Second copy: UB -> output (the first N elements of input1_ub).
    tik_instance.data_move(output, input1_ub, 0, 1, bLength_N, 0, 0)
    tik_instance.BuildCCE(kernel_name=kernel_name, inputs=[input1], outputs=[output])


class CusS0(ms.ops.PrimitiveWithInfer):
    @ms.ops.prim_attr_register
    def __init__(self):
        self.CusS0Impl = CusS0Impl
        self.init_prim_io_names(inputs=['input1'], outputs=['output'])

    def infer_shape(self, input1):
        shape = [input1[1]]
        return shape

    def infer_dtype(self, input1):
        return input1


if __name__ == '__main__':
    np0 = np.random.randn(128, 128).astype(np.float16)
    op = CusS0()
    ret0 = op(ms.Tensor(np0)).asnumpy()
    print(np0)
    print(ret0)

Nothing looks obviously wrong in the code; I only suspect there may be a pitfall involving the two scalars.
repeat = tik_instance.Scalar('int32')
r = ((M-1)//128)+1
repeat.set_as(r)
with tik_instance.for_range(0,MAX_ITER) as i:
    tik_instance.vec_dup(128, tmp_vector, 0.0, repeat, 8)

Try the following code:
repeat = tik_instance.Scalar('int32')
repeat.set_as(1000000)
with tik_instance.for_range(0,repeat) as i:
    tik_instance.vec_dup(128, tmp_vector, 0.0, 1, 8)
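When comparing the two variants it can help to check the burst and repeat counts used by the kernel in the question with plain Python, without an Ascend device. The sketch below is only an illustration under stated assumptions: that data_move counts its burst length in 32-byte blocks (16 float16 elements) and that one vec_dup repeat with a mask of 128 covers 128 float16 elements. The helper ceil_div and the two constants are hypothetical names, not part of TIK.

```python
def ceil_div(n, d):
    """Round-up integer division, written the same way as in the kernel."""
    return (n - 1) // d + 1


# Assumptions (not stated in the post): 32-byte blocks, 256 bytes per repeat.
FP16_PER_BLOCK = 16    # 32 bytes / 2 bytes per float16
FP16_PER_REPEAT = 128  # 256 bytes / 2 bytes per float16

M, N = 128, 128        # shape of the test input used in the post

bLength = ceil_div(M * N, FP16_PER_BLOCK)  # blocks moved from GM into UB
bLength_N = ceil_div(N, FP16_PER_BLOCK)    # blocks moved from UB to output
r = ceil_div(M, FP16_PER_REPEAT)           # repeat count passed to vec_dup

print(bLength, bLength_N, r)  # prints: 1024 8 1
```

For the 128 x 128 input, r evaluates to 1, so the suggested code passes the same repeat count to vec_dup as the original. What changes is that the repeat count becomes a Python constant and the loop bound becomes a TIK Scalar instead of a Python integer.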