Loops in TensorRT
2022-07-02 06:29:00 【Little Heshang sweeping the floor】
Loops in TensorRT
NVIDIA TensorRT supports loop-like constructs, which are useful for recurrent networks. TensorRT loops support scanning over input tensors, recurrent definitions of tensors, and both "scan output" and "last value" outputs.
1. Defining A Loop
A loop is defined by loop boundary layers:

- ITripLimitLayer specifies how many times the loop iterates.
- IIteratorLayer enables the loop to iterate over a tensor.
- IRecurrenceLayer specifies a recurrent definition.
- ILoopOutputLayer specifies an output of the loop.

Each boundary layer inherits from class ILoopBoundaryLayer, which has a method getLoop() for getting its associated ILoop. The ILoop object identifies the loop; all loop boundary layers with the same ILoop belong to that loop.
The following diagram illustrates the structure of a loop and the data flow at its boundary. Loop-invariant tensors can be used directly inside the loop, as shown for FooLayer.
[Figure: structure of a loop and data flow at its boundary (loop.png)]
A loop can have multiple IIteratorLayer, IRecurrenceLayer, and ILoopOutputLayer layers, and at most two ITripLimitLayer layers, as explained later. A loop with no ILoopOutputLayer has no output and is optimized away by TensorRT.
The Layers For Flow-Control Constructs section of the NVIDIA TensorRT Support Matrix describes the TensorRT layers that may be used inside a loop. Interior layers are free to use tensors defined inside or outside the loop. The interior can contain other loops (see Nested Loops) and other conditional constructs (see the section on nesting conditionals).
To define a loop, first create an ILoop object with the INetworkDefinition::addLoop method, then add the boundary layers and interior layers. The rest of this section describes the features of the boundary layers, using loop to denote the ILoop* returned by INetworkDefinition::addLoop.
ITripLimitLayer supports both counted loops and while-loops.

- loop->addTripLimit(t, TripLimit::kCOUNT) creates an ITripLimitLayer whose input t is a 0D INT32 tensor specifying the number of loop iterations.
- loop->addTripLimit(t, TripLimit::kWHILE) creates an ITripLimitLayer whose input t is a 0D Bool tensor specifying whether an iteration should occur. Typically t is either the output of an IRecurrenceLayer or a calculation based on that output.

A loop can have at most one of each kind of limit. A minimal sketch of both forms follows.
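The sketch below assumes network is an INetworkDefinition&, count is a 0D INT32 ITensor&, and cond is a 0D Bool ITensor&; these names are illustrative and not from the original text.

```c++
#include <NvInfer.h>
using namespace nvinfer1;

// Minimal sketch: creating one loop with a counted trip limit and another
// with a while-style trip limit. `count` and `cond` are assumed tensors.
void addTripLimits(INetworkDefinition& network, ITensor& count, ITensor& cond)
{
    ILoop* countedLoop = network.addLoop();
    countedLoop->addTripLimit(count, TripLimit::kCOUNT);   // iterate `count` times

    ILoop* whileLoop = network.addLoop();
    whileLoop->addTripLimit(cond, TripLimit::kWHILE);      // iterate while `cond` is true
}
```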
IIteratorLayer supports iterating forwards or backwards over any axis.

- loop->addIterator(t) adds an IIteratorLayer that iterates over axis 0 of tensor t. For example, if the input is the matrix:

  2 3 5
  4 6 8

  the output is the 1D tensor {2, 3, 5} on the first iteration and {4, 6, 8} on the second iteration. It is invalid to iterate beyond the tensor's bounds.
- loop->addIterator(t, axis) is similar, but the layer iterates over the given axis. For example, if axis=1 and the input is a matrix, each iteration delivers a column of the matrix.
- loop->addIterator(t, axis, reverse) is similar, but the layer produces its output in reverse order if reverse=true.

A short usage sketch follows this list.
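The sketch below assumes loop is an ILoop& and t is a 2D ITensor&; the names are illustrative.

```c++
#include <NvInfer.h>
using namespace nvinfer1;

// Minimal sketch: iterating over a 2D tensor `t` in two different ways.
void addIterators(ILoop& loop, ITensor& t)
{
    // Axis 0, forward: each iteration yields one row of `t`.
    ITensor* row = loop.addIterator(t)->getOutput(0);

    // Axis 1, reversed: each iteration yields one column, last column first.
    ITensor* colReversed = loop.addIterator(t, 1, true)->getOutput(0);

    (void)row; (void)colReversed;   // in a real network these feed interior layers
}
```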
ILoopOutputLayer supports three forms of loop output:

- loop->addLoopOutput(t, LoopOutput::kLAST_VALUE) outputs the last value of t, where t must be the output of an IRecurrenceLayer.
- loop->addLoopOutput(t, LoopOutput::kCONCATENATE, axis) outputs the concatenation of each iteration's input to t. For example, if the input is a 1D tensor with value {a,b,c} on the first iteration and {d,e,f} on the second, and axis=0, the output is the matrix:

  a b c
  d e f

  If axis=1, the output is:

  a d
  b e
  c f

- loop->addLoopOutput(t, LoopOutput::kREVERSE, axis) is similar, but reverses the order.

The kCONCATENATE and kREVERSE forms require a second input, a 0D INT32 shape tensor specifying the length of the new output dimension. When the length is greater than the number of iterations, the extra elements contain arbitrary values. The second input, for example u, should be set with ILoopOutputLayer::setInput(1, u), as in the sketch below.
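The sketch below assumes loop is an ILoop&, v is a tensor computed inside the loop, and len is a 0D INT32 shape tensor defined outside the loop; the names are illustrative.

```c++
#include <NvInfer.h>
using namespace nvinfer1;

// Minimal sketch: a "scan output" that concatenates per-iteration values of
// `v` along a new axis 0, padded to the length given by `len`.
ITensor* addScanOutput(ILoop& loop, ITensor& v, ITensor& len)
{
    ILoopOutputLayer* out = loop.addLoopOutput(v, LoopOutput::kCONCATENATE, 0);
    out->setInput(1, len);       // required second input: length of the new dimension
    return out->getOutput(0);
}
```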
Finally, there is IRecurrenceLayer. Its first input specifies the initial output value, and its second input specifies the next output value. The first input must come from outside the loop; the second input usually comes from inside the loop. For example, the TensorRT analog of this C++ fragment:

```c++
for (int32_t i = j; ...; i += k) ...
```

could be created by the following calls, where n denotes the INetworkDefinition and j and k are ITensor*:
```c++
ILoop* loop = n.addLoop();
IRecurrenceLayer* iRec = loop->addRecurrence(*j);   // first input: initial value j
ITensor* i = iRec->getOutput(0);
ITensor* iNext = n.addElementWise(*i, *k,
    ElementWiseOperation::kSUM)->getOutput(0);      // i + k
iRec->setInput(1, *iNext);                          // second input: next value
```
The second input to IRecurrenceLayer is the only case where TensorRT allows a back edge. If such inputs are removed, the remaining network must be acyclic.
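Putting the pieces together, here is a sketch (not from the original text) of a counted loop that sums the rows of a 2D input and emits the final running sum; network, input, trip, and zero are assumed to be built elsewhere.

```c++
#include <NvInfer.h>
using namespace nvinfer1;

// Sketch: a counted loop that sums the N rows of `input` (shape [N, C]).
// `trip` is a 0D INT32 tensor equal to N; `zero` is a [C] tensor of zeros
// defined outside the loop. All names are illustrative.
ITensor* addRowSumLoop(INetworkDefinition& network, ITensor& input,
                       ITensor& trip, ITensor& zero)
{
    ILoop* loop = network.addLoop();
    loop->addTripLimit(trip, TripLimit::kCOUNT);

    // One row of `input` per iteration.
    ITensor* row = loop->addIterator(input)->getOutput(0);

    // Running sum, seeded from outside the loop.
    IRecurrenceLayer* rec = loop->addRecurrence(zero);
    ITensor* sum = rec->getOutput(0);
    ITensor* nextSum = network.addElementWise(*sum, *row,
        ElementWiseOperation::kSUM)->getOutput(0);
    rec->setInput(1, *nextSum);   // back edge: next value of the recurrence

    // Emit the value of the recurrence after the final iteration.
    return loop->addLoopOutput(*sum, LoopOutput::kLAST_VALUE)->getOutput(0);
}
```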
2. Formal Semantics
TensorRT has applicative semantics, meaning there are no visible side effects other than engine inputs and outputs. Because there are no side effects, intuitions about loops from imperative languages do not always apply. This section defines formal semantics for TensorRT's loop constructs.

The formal semantics is based on lazy sequences of tensors. Each iteration of a loop corresponds to an element in the sequence. The sequence for a loop tensor X is denoted ⟨X0, X1, X2, ...⟩. Elements of the sequence are evaluated lazily, meaning as needed.
The output of IIteratorLayer(X) is ⟨X[0], X[1], X[2], ...⟩, where X[i] denotes subscripting on the axis specified for the IIteratorLayer.

The output of IRecurrenceLayer(X, Y) is ⟨X, Y0, Y1, Y2, ...⟩.
The input and output of an ILoopOutputLayer depend on the LoopOutput kind:

- kLAST_VALUE: The input is a single tensor X, and the output is Xn for an n-trip loop.
- kCONCATENATE: The first input is a tensor X and the second input is a scalar shape tensor Y. The result is the concatenation of X0, X1, X2, ... Xn-1, with post-padding if necessary to the length specified by Y. It is a runtime error if Y < n. Y must be a build-time constant. Note the inverse relationship with IIteratorLayer: IIteratorLayer maps a tensor to a sequence of subtensors, while ILoopOutputLayer with kCONCATENATE maps a sequence of subtensors to a tensor.
- kREVERSE: Similar to kCONCATENATE, but the output is in the reverse direction.
The value of n in the ILoopOutputLayer definitions above is determined by the loop's ITripLimitLayer:

- For counted loops, it is the iteration count, that is, the input to the ITripLimitLayer.
- For while-loops, it is the least n such that Xn is false, where X is the sequence for the ITripLimitLayer's input tensor.
The output of a non-loop layer is a sequence-wise application of the layer's function. For example, for a two-input non-loop layer, F(X, Y) = ⟨f(X0, Y0), f(X1, Y1), f(X2, Y2), ...⟩. If a tensor comes from outside the loop, that is, it is loop-invariant, then its sequence is created by replicating the tensor. A small worked illustration follows.
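For example, assuming f is element-wise addition, X is an iterated tensor, and c is loop-invariant (so its sequence is the replicated ⟨c, c, c, ...⟩):

```latex
F(X, c) = \langle f(X_0, c),\; f(X_1, c),\; f(X_2, c),\; \dots \rangle
        = \langle X_0 + c,\; X_1 + c,\; X_2 + c,\; \dots \rangle
```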
3. Nested Loops
TensorRT infers the nesting of loops from the data flow. For instance, if loop B uses values defined inside loop A, then B is considered to be nested inside A.

TensorRT rejects networks where the loops are not cleanly nested, such as when loop A uses values defined in the interior of loop B and vice versa.
4. Limitations
A loop that refers to more than one dynamic dimension can consume an unexpected amount of memory. In a loop, memory is allocated as if all dynamic dimensions took the maximum value of any of those dimensions. For example, if a loop refers to two tensors with dimensions [4,x,y] and [6,y], memory for those tensors is allocated as if their dimensions were [4,max(x,y),max(x,y)] and [6,max(x,y)].
The input to an ILoopOutputLayer with kLAST_VALUE must be the output of an IRecurrenceLayer.
The loop API supports only FP32 and FP16 precision.
5. Replacing IRNNv2Layer With Loops
IRNNv2Layer was deprecated in TensorRT 7.2.1 and will be removed in TensorRT 9.0. Use the loop API to synthesize a recurrent sub-network; for an example, see sampleCharRNN, method SampleCharRNNLoop::addLSTMCell. The loop API lets you express general recurrent networks instead of being limited to the prefabricated cells of IRNNLayer and IRNNv2Layer.

Refer to sampleCharRNN for more information.
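As a hypothetical illustration of the idea (this is not the code from sampleCharRNN), the sketch below builds a plain recurrent cell h_t = tanh(x_t·W + h_{t-1}·R) with the loop API; the names x, W, R, h0, and seqLen are assumptions.

```c++
#include <NvInfer.h>
using namespace nvinfer1;

// Hypothetical sketch: a simple recurrent cell expressed with the loop API.
// `x` has shape [T, B, E], `W` is [E, H], `R` is [H, H], `h0` is [B, H],
// and `seqLen` is a 0D INT32 tensor equal to T. All names are illustrative.
ITensor* addSimpleRnn(INetworkDefinition& network, ITensor& x, ITensor& W,
                      ITensor& R, ITensor& h0, ITensor& seqLen)
{
    ILoop* loop = network.addLoop();
    loop->addTripLimit(seqLen, TripLimit::kCOUNT);

    // One timestep x_t of shape [B, E] per iteration.
    ITensor* xT = loop->addIterator(x)->getOutput(0);

    // Hidden state, initialized from outside the loop.
    IRecurrenceLayer* rec = loop->addRecurrence(h0);
    ITensor* hPrev = rec->getOutput(0);

    // h_next = tanh(x_t * W + h_prev * R)
    ITensor* xW = network.addMatrixMultiply(*xT, MatrixOperation::kNONE,
        W, MatrixOperation::kNONE)->getOutput(0);
    ITensor* hR = network.addMatrixMultiply(*hPrev, MatrixOperation::kNONE,
        R, MatrixOperation::kNONE)->getOutput(0);
    ITensor* preAct = network.addElementWise(*xW, *hR,
        ElementWiseOperation::kSUM)->getOutput(0);
    ITensor* hNext = network.addActivation(*preAct,
        ActivationType::kTANH)->getOutput(0);
    rec->setInput(1, *hNext);   // back edge

    // Collect every hidden state along a new leading axis ("scan output").
    ILoopOutputLayer* allH = loop->addLoopOutput(*hNext,
        LoopOutput::kCONCATENATE, 0);
    allH->setInput(1, seqLen);  // length of the new output dimension
    return allH->getOutput(0);  // shape [T, B, H]
}
```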