❓ Question
I need to optimize an already-trained segmentation model with Torch-TensorRT. The idea is to compile the model inside the newest PyTorch NGC Docker image under WSL2, export it, and then load it in a C++ application that uses LibTorch, e.g.:
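#include <torch/script.h>
#include <iostream>
// ...
torch::jit::script::Module module;
try {
  // Deserialize the ScriptModule from a file using torch::jit::load().
  module = torch::jit::load(argv[1]);
} catch (const c10::Error& e) {
  // torch::jit::load() throws c10::Error if the file cannot be loaded.
  std::cerr << "error loading the model: " << e.what() << "\n";
  return -1;
}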
Would this be the right approach?
What you have already tried
So far I have only tried compiling the model with Torch-TensorRT, and the results differ between machines. Below are the results of the Python script (shown further down) on two different devices:
- an Ubuntu desktop with a GTX 1080 Ti (which I use for development)
- a Windows PC with an RTX 3080 (which is my target device)
As the logs show, compilation under WSL emits many "GPU error during getBestTactic ... invalid argument" warnings, while on Ubuntu it finishes with only the usual converter warnings. Why does this happen?
My script:
import torch_tensorrt
import yaml
import torch
import os
import time
import numpy as np
import torch.backends.cudnn as cudnn
import argparse
import segmentation_models_pytorch as smp
import pytorch_lightning as pl

cudnn.benchmark = True

def benchmark(model, input_shape=(1, 3, 512, 512), dtype=torch.float, nwarmup=50, nruns=1000):
    input_data = torch.randn(input_shape)
    input_data = input_data.to("cuda")
    if dtype==torch.half:
        input_data = input_data.half()
    print("Warm up ...")
    with torch.no_grad():
        for _ in range(nwarmup):
            features = model(input_data)
    torch.cuda.synchronize()
    print("Start timing ...")
    timings = []
    with torch.no_grad():
        for i in range(1, nruns+1):
            start_time = time.time()
            features = model(input_data)
            torch.cuda.synchronize()
            end_time = time.time()
            timings.append(end_time - start_time)
            if i%100==0:
                print('Iteration %d/%d, ave batch time %.2f ms'%(i, nruns, np.mean(timings)*1000))
    print("Input shape:", input_data.size())
    print("Output features size:", features.size())
    print('Average batch time: %.2f ms'%(np.mean(timings)*1000))

def load_config(config_path: str):
    with open(config_path) as f:
        config = yaml.load(f, Loader=yaml.FullLoader)
    return config

def main():
    # Load target model
    parser = argparse.ArgumentParser()
    parser.add_argument("weights_path")
    parser.add_argument("config_path")
    args = parser.parse_args()
    config = load_config(args.config_path)
    model_dict = config["model"]
    model_dict["activation"] = "softmax2d"
    model = smp.create_model(**model_dict)
    state_dict = torch.load(args.weights_path)["state_dict"]
    model.load_state_dict(state_dict)
    model.to("cuda")
    model.eval()
    # Create dummy data for tracing and benchmarking purposes.
    dtype = torch.float32
    shape = (1, 3, 512, 512)
    input_data = torch.randn(shape).to("cuda")
    # Convert model to script module
    print("Tracing PyTorch model...")
    traced_script_module = torch.jit.trace(model, input_data)
    # torch_script_module = torch.jit.load(model_path).cuda()
    print("Script Module generated.")
    print("\nBenchmarking Script Module...")
    # First benchmark <===================================
    benchmark(traced_script_module, shape, dtype)
    # Convert to TRT Module...
    output_path = args.config_path.split(os.path.sep)[-1] + "_trt_.pt"
    print("Creating TRT module...")
    trt_ts_module = torch_tensorrt.compile(
        traced_script_module,
        inputs = [
            torch_tensorrt.Input( # Specify input object with shape and dtype
                shape=shape,
                dtype=dtype) # Datatype of input tensor. Allowed options torch.(float|half|int8|int32|bool)
        ],
        enabled_precisions = {dtype},
    )
    print("TRT Module created")
    print("\nBenchmarking TRT Module...")
    benchmark(trt_ts_module, shape, dtype)
    torch.jit.save(trt_ts_module, os.path.join("models",output_path)) # save the TRT embedded Torchscript

if __name__ == "__main__":
    main()
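For reference, the warm-up-then-time pattern used by benchmark() can be sketched without any GPU dependency. This is only a minimal illustration, not part of my script: the name time_callable and its sync parameter are hypothetical, and it uses time.perf_counter(), which is a monotonic clock intended for measuring short intervals. For a CUDA model, sync would be torch.cuda.synchronize.

```python
import time
import statistics

def time_callable(fn, nwarmup=5, nruns=20, sync=lambda: None):
    """Warm up `fn`, then return the mean wall-clock time per call in milliseconds.

    `sync` runs after the warm-up and after every timed call; for a CUDA model
    the caller would pass torch.cuda.synchronize (assumption, not done here).
    """
    for _ in range(nwarmup):
        fn()
    sync()
    timings_ms = []
    for _ in range(nruns):
        start = time.perf_counter()
        fn()
        sync()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(timings_ms)

# Stand-in workload so the sketch runs on any machine.
calls = []
mean_ms = time_callable(lambda: calls.append(sum(range(1000))))
print("mean time per call: %.4f ms" % mean_ms)
```

Synchronizing inside the timed region matters because CUDA kernels launch asynchronously; without it only the launch overhead would be measured.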
Ubuntu desktop
/DockerStuff# python script.py path/to/checkout.tar path/to/config.yaml
No pretrained weights exist for this model. Using random initialization.
Tracing PyTorch model...
/opt/conda/lib/python3.8/site-packages/segmentation_models_pytorch/base/model.py:16: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if h % output_stride != 0 or w % output_stride != 0:
Script Module generated.
Benchmarking Script Module...
Warm up ...
Start timing ...
Iteration 100/1000, ave batch time 7.00 ms
Iteration 200/1000, ave batch time 6.88 ms
Iteration 300/1000, ave batch time 6.76 ms
Iteration 400/1000, ave batch time 6.91 ms
Iteration 500/1000, ave batch time 6.93 ms
Iteration 600/1000, ave batch time 6.98 ms
Iteration 700/1000, ave batch time 6.99 ms
Iteration 800/1000, ave batch time 6.91 ms
Iteration 900/1000, ave batch time 6.89 ms
Iteration 1000/1000, ave batch time 6.87 ms
Input shape: torch.Size([1, 3, 512, 512])
Output features size: torch.Size([1, 3, 512, 512])
Average batch time: 6.87 ms
Creating TRT module...
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - There may be undefined behavior using dynamic shape and aten::size
WARNING: [Torch-TensorRT] - There may be undefined behavior using dynamic shape and aten::size
WARNING: [Torch-TensorRT] - Interpolation layer will be run through ATen, not TensorRT. Performance may be lower than expected
[1, 256, 128, 128]
[1, 256, 128, 128]
WARNING: [Torch-TensorRT] - Interpolation layer will be run through ATen, not TensorRT. Performance may be lower than expected
[1, 3, 512, 512]
[1, 3, 512, 512]
[1, 256, 128, 128]
[1, 3, 512, 512]
[1, 256, 128, 128]
[1, 3, 512, 512]
[1, 256, 128, 128]
[1, 3, 512, 512]
[1, 256, 128, 128]
[1, 3, 512, 512]
TRT Module created
Benchmarking TRT Module...
Warm up ...
Start timing ...
Iteration 100/1000, ave batch time 3.29 ms
Iteration 200/1000, ave batch time 3.30 ms
Iteration 300/1000, ave batch time 3.30 ms
Iteration 400/1000, ave batch time 3.30 ms
Iteration 500/1000, ave batch time 3.31 ms
Iteration 600/1000, ave batch time 3.30 ms
Iteration 700/1000, ave batch time 3.30 ms
Iteration 800/1000, ave batch time 3.30 ms
Iteration 900/1000, ave batch time 3.30 ms
Iteration 1000/1000, ave batch time 3.30 ms
Input shape: torch.Size([1, 3, 512, 512])
Output features size: torch.Size([1, 3, 512, 512])
Average batch time: 3.30 ms
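To put the two Ubuntu averages in perspective, the speedup from the TensorRT module can be computed directly from the numbers printed above. The variable names are mine; the values are copied from the log.

```python
# Average batch times reported in the Ubuntu log above (milliseconds).
torchscript_ms = 6.87
tensorrt_ms = 3.30

speedup = torchscript_ms / tensorrt_ms
print("TensorRT speedup on the GTX 1080 Ti: %.2fx" % speedup)
```

So on the development machine the compiled module is roughly twice as fast as the traced TorchScript module.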
Windows PC
/DockerStuff# python script.py path/to/checkout.tar path/to/config.yaml
No pretrained weights exist for this model. Using random initialization.
Tracing PyTorch model...
/opt/conda/lib/python3.8/site-packages/segmentation_models_pytorch/base/model.py:16: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if h % output_stride != 0 or w % output_stride != 0:
Script Module generated.
Benchmarking Script Module...
Warm up ...
Start timing ...
Iteration 100/1000, ave batch time 3.21 ms
Iteration 200/1000, ave batch time 3.18 ms
Iteration 300/1000, ave batch time 3.17 ms
Iteration 400/1000, ave batch time 3.17 ms
Iteration 500/1000, ave batch time 3.16 ms
Iteration 600/1000, ave batch time 3.16 ms
Iteration 700/1000, ave batch time 3.16 ms
Iteration 800/1000, ave batch time 3.16 ms
Iteration 900/1000, ave batch time 3.16 ms
Iteration 1000/1000, ave batch time 3.15 ms
Input shape: torch.Size([1, 3, 512, 512])
Output features size: torch.Size([1, 3, 512, 512])
Average batch time: 3.15 ms
Creating TRT module...
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - Mean converter disregards dtype
WARNING: [Torch-TensorRT] - There may be undefined behavior using dynamic shape and aten::size
WARNING: [Torch-TensorRT] - There may be undefined behavior using dynamic shape and aten::size
WARNING: [Torch-TensorRT] - Interpolation layer will be run through ATen, not TensorRT. Performance may be lower than expected
[1, 256, 128, 128]
[1, 256, 128, 128]
WARNING: [Torch-TensorRT] - Interpolation layer will be run through ATen, not TensorRT. Performance may be lower than expected
[1, 3, 512, 512]
[1, 3, 512, 512]
[1, 256, 128, 128]
[1, 3, 512, 512]
[1, 256, 128, 128]
[1, 3, 512, 512]
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.17 : Tensor = aten::_convolution(%1217, %self.encoder.model.blocks.1.0.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.1/__module.encoder.model.blocks.1.0/__module.encoder.model.blocks.1.0.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.19 : Tensor = aten::batch_norm(%input.17, %self.encoder.model.blocks.1.0.bn1.weight, %self.encoder.model.blocks.1.0.bn1.bias, %self.encoder.model.blocks.1.0.bn1.running_mean, %self.encoder.model.blocks.1.0.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.1/__module.encoder.model.blocks.1.0/__module.encoder.model.blocks.1.0.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1220 : Tensor = aten::relu(%input.19), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.1/__module.encoder.model.blocks.1.0/__module.encoder.model.blocks.1.0.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.29 : Tensor = aten::_convolution(%1223, %self.encoder.model.blocks.1.0.conv_pwl.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.1/__module.encoder.model.blocks.1.0/__module.encoder.model.blocks.1.0.conv_pwl # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.31 : Tensor = aten::batch_norm(%input.29, %self.encoder.model.blocks.1.0.bn3.weight, %self.encoder.model.blocks.1.0.bn3.bias, %self.encoder.model.blocks.1.0.bn3.running_mean, %self.encoder.model.blocks.1.0.bn3.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.1/__module.encoder.model.blocks.1.0/__module.encoder.model.blocks.1.0.bn3 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.33 : Tensor = aten::_convolution(%input.31, %self.encoder.model.blocks.2.0.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.2/__module.encoder.model.blocks.2.0/__module.encoder.model.blocks.2.0.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.35 : Tensor = aten::batch_norm(%input.33, %self.encoder.model.blocks.2.0.bn1.weight, %self.encoder.model.blocks.2.0.bn1.bias, %self.encoder.model.blocks.2.0.bn1.running_mean, %self.encoder.model.blocks.2.0.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.2/__module.encoder.model.blocks.2.0/__module.encoder.model.blocks.2.0.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1228 : Tensor = aten::relu(%input.35), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.2/__module.encoder.model.blocks.2.0/__module.encoder.model.blocks.2.0.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 || %input.369 : Tensor = aten::_convolution(%input.31, %self.decoder.block1.0.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.decoder/__module.decoder.block1/__module.decoder.block1.0 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.371 : Tensor = aten::batch_norm(%input.369, %self.decoder.block1.1.weight, %self.decoder.block1.1.bias, %self.decoder.block1.1.running_mean, %self.decoder.block1.1.running_var, %870, %878, %879, %873), scope: __module.decoder/__module.decoder.block1/__module.decoder.block1.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %high_res_features : Tensor = aten::relu(%input.371), scope: 
__module.decoder/__module.decoder.block1/__module.decoder.block1.2 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1395:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.45 : Tensor = aten::_convolution(%1231, %self.encoder.model.blocks.2.0.conv_pwl.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.2/__module.encoder.model.blocks.2.0/__module.encoder.model.blocks.2.0.conv_pwl # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.47 : Tensor = aten::batch_norm(%input.45, %self.encoder.model.blocks.2.0.bn3.weight, %self.encoder.model.blocks.2.0.bn3.bias, %self.encoder.model.blocks.2.0.bn3.running_mean, %self.encoder.model.blocks.2.0.bn3.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.2/__module.encoder.model.blocks.2.0/__module.encoder.model.blocks.2.0.bn3 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.49 : Tensor = aten::_convolution(%input.47, %self.encoder.model.blocks.2.1.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.2/__module.encoder.model.blocks.2.1/__module.encoder.model.blocks.2.1.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.51 : Tensor = aten::batch_norm(%input.49, %self.encoder.model.blocks.2.1.bn1.weight, %self.encoder.model.blocks.2.1.bn1.bias, %self.encoder.model.blocks.2.1.bn1.running_mean, %self.encoder.model.blocks.2.1.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.2/__module.encoder.model.blocks.2.1/__module.encoder.model.blocks.2.1.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1236 : Tensor = aten::relu(%input.51), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.2/__module.encoder.model.blocks.2.1/__module.encoder.model.blocks.2.1.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.65 : Tensor = aten::_convolution(%1242, %self.encoder.model.blocks.3.0.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.0/__module.encoder.model.blocks.3.0.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.67 : Tensor = aten::batch_norm(%input.65, %self.encoder.model.blocks.3.0.bn1.weight, %self.encoder.model.blocks.3.0.bn1.bias, %self.encoder.model.blocks.3.0.bn1.running_mean, %self.encoder.model.blocks.3.0.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.0/__module.encoder.model.blocks.3.0.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1245 : Tensor = aten::relu(%input.67), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.0/__module.encoder.model.blocks.3.0.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.85 : Tensor = aten::_convolution(%input.83, %self.encoder.model.blocks.3.0.conv_pwl.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.0/__module.encoder.model.blocks.3.0.conv_pwl # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.87 : Tensor = aten::batch_norm(%input.85, %self.encoder.model.blocks.3.0.bn3.weight, %self.encoder.model.blocks.3.0.bn3.bias, %self.encoder.model.blocks.3.0.bn3.running_mean, %self.encoder.model.blocks.3.0.bn3.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.0/__module.encoder.model.blocks.3.0.bn3 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.89 : Tensor = aten::_convolution(%input.87, %self.encoder.model.blocks.3.1.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.1/__module.encoder.model.blocks.3.1.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.91 : Tensor = aten::batch_norm(%input.89, %self.encoder.model.blocks.3.1.bn1.weight, %self.encoder.model.blocks.3.1.bn1.bias, %self.encoder.model.blocks.3.1.bn1.running_mean, %self.encoder.model.blocks.3.1.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.1/__module.encoder.model.blocks.3.1.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1259 : Tensor = aten::relu(%input.91), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.1/__module.encoder.model.blocks.3.1.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.113 : Tensor = aten::_convolution(%1271, %self.encoder.model.blocks.3.2.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.2/__module.encoder.model.blocks.3.2.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.115 : Tensor = aten::batch_norm(%input.113, %self.encoder.model.blocks.3.2.bn1.weight, %self.encoder.model.blocks.3.2.bn1.bias, %self.encoder.model.blocks.3.2.bn1.running_mean, %self.encoder.model.blocks.3.2.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.2/__module.encoder.model.blocks.3.2.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1274 : Tensor = aten::relu(%input.115), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.2/__module.encoder.model.blocks.3.2.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.137 : Tensor = aten::_convolution(%1286, %self.encoder.model.blocks.3.3.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.3/__module.encoder.model.blocks.3.3.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.139 : Tensor = aten::batch_norm(%input.137, %self.encoder.model.blocks.3.3.bn1.weight, %self.encoder.model.blocks.3.3.bn1.bias, %self.encoder.model.blocks.3.3.bn1.running_mean, %self.encoder.model.blocks.3.3.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.3/__module.encoder.model.blocks.3.3.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1289 : Tensor = aten::relu(%input.139), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.3/__module.encoder.model.blocks.3.3/__module.encoder.model.blocks.3.3.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.161 : Tensor = aten::_convolution(%1301, %self.encoder.model.blocks.4.0.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.0/__module.encoder.model.blocks.4.0.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.163 : Tensor = aten::batch_norm(%input.161, %self.encoder.model.blocks.4.0.bn1.weight, %self.encoder.model.blocks.4.0.bn1.bias, %self.encoder.model.blocks.4.0.bn1.running_mean, %self.encoder.model.blocks.4.0.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.0/__module.encoder.model.blocks.4.0.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1304 : Tensor = aten::relu(%input.163), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.0/__module.encoder.model.blocks.4.0.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.185 : Tensor = aten::_convolution(%1316, %self.encoder.model.blocks.4.1.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.1/__module.encoder.model.blocks.4.1.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.187 : Tensor = aten::batch_norm(%input.185, %self.encoder.model.blocks.4.1.bn1.weight, %self.encoder.model.blocks.4.1.bn1.bias, %self.encoder.model.blocks.4.1.bn1.running_mean, %self.encoder.model.blocks.4.1.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.1/__module.encoder.model.blocks.4.1.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1319 : Tensor = aten::relu(%input.187), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.1/__module.encoder.model.blocks.4.1.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.209 : Tensor = aten::_convolution(%1331, %self.encoder.model.blocks.4.2.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.2/__module.encoder.model.blocks.4.2.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.211 : Tensor = aten::batch_norm(%input.209, %self.encoder.model.blocks.4.2.bn1.weight, %self.encoder.model.blocks.4.2.bn1.bias, %self.encoder.model.blocks.4.2.bn1.running_mean, %self.encoder.model.blocks.4.2.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.2/__module.encoder.model.blocks.4.2.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1334 : Tensor = aten::relu(%input.211), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.4/__module.encoder.model.blocks.4.2/__module.encoder.model.blocks.4.2.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.233 : Tensor = aten::_convolution(%1346, %self.encoder.model.blocks.5.0.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.0/__module.encoder.model.blocks.5.0.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.235 : Tensor = aten::batch_norm(%input.233, %self.encoder.model.blocks.5.0.bn1.weight, %self.encoder.model.blocks.5.0.bn1.bias, %self.encoder.model.blocks.5.0.bn1.running_mean, %self.encoder.model.blocks.5.0.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.0/__module.encoder.model.blocks.5.0.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1349 : Tensor = aten::relu(%input.235), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.0/__module.encoder.model.blocks.5.0.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.253 : Tensor = aten::_convolution(%input.251, %self.encoder.model.blocks.5.0.conv_pwl.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.0/__module.encoder.model.blocks.5.0.conv_pwl # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.255 : Tensor = aten::batch_norm(%input.253, %self.encoder.model.blocks.5.0.bn3.weight, %self.encoder.model.blocks.5.0.bn3.bias, %self.encoder.model.blocks.5.0.bn3.running_mean, %self.encoder.model.blocks.5.0.bn3.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.0/__module.encoder.model.blocks.5.0.bn3 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.257 : Tensor = aten::_convolution(%input.255, %self.encoder.model.blocks.5.1.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.1/__module.encoder.model.blocks.5.1.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.259 : Tensor = aten::batch_norm(%input.257, %self.encoder.model.blocks.5.1.bn1.weight, %self.encoder.model.blocks.5.1.bn1.bias, %self.encoder.model.blocks.5.1.bn1.running_mean, %self.encoder.model.blocks.5.1.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.1/__module.encoder.model.blocks.5.1.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1363 : Tensor = aten::relu(%input.259), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.1/__module.encoder.model.blocks.5.1.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.281 : Tensor = aten::_convolution(%1375, %self.encoder.model.blocks.5.2.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.2/__module.encoder.model.blocks.5.2.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.283 : Tensor = aten::batch_norm(%input.281, %self.encoder.model.blocks.5.2.bn1.weight, %self.encoder.model.blocks.5.2.bn1.bias, %self.encoder.model.blocks.5.2.bn1.running_mean, %self.encoder.model.blocks.5.2.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.2/__module.encoder.model.blocks.5.2.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1378 : Tensor = aten::relu(%input.283), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.5/__module.encoder.model.blocks.5.2/__module.encoder.model.blocks.5.2.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.305 : Tensor = aten::_convolution(%1390, %self.encoder.model.blocks.6.0.conv_pw.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.6/__module.encoder.model.blocks.6.0/__module.encoder.model.blocks.6.0.conv_pw # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.307 : Tensor = aten::batch_norm(%input.305, %self.encoder.model.blocks.6.0.bn1.weight, %self.encoder.model.blocks.6.0.bn1.bias, %self.encoder.model.blocks.6.0.bn1.running_mean, %self.encoder.model.blocks.6.0.bn1.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.6/__module.encoder.model.blocks.6.0/__module.encoder.model.blocks.6.0.bn1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1393 : Tensor = aten::relu(%input.307), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.6/__module.encoder.model.blocks.6.0/__module.encoder.model.blocks.6.0.act1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1393:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.317 : Tensor = aten::_convolution(%1396, %self.encoder.model.blocks.6.0.conv_pwl.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.6/__module.encoder.model.blocks.6.0/__module.encoder.model.blocks.6.0.conv_pwl # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.319 : Tensor = aten::batch_norm(%input.317, %self.encoder.model.blocks.6.0.bn3.weight, %self.encoder.model.blocks.6.0.bn3.bias, %self.encoder.model.blocks.6.0.bn3.running_mean, %self.encoder.model.blocks.6.0.bn3.running_var, %870, %878, %879, %873), scope: __module.encoder/__module.encoder.model/__module.encoder.model.blocks.6/__module.encoder.model.blocks.6.0/__module.encoder.model.blocks.6.0.bn3 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.321 : Tensor = aten::_convolution(%input.319, %self.decoder.aspp.0.convs.0.0.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.0/__module.decoder.aspp.0.convs.0.0 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.323 : Tensor = aten::batch_norm(%input.321, %self.decoder.aspp.0.convs.0.1.weight, %self.decoder.aspp.0.convs.0.1.bias, %self.decoder.aspp.0.convs.0.1.running_mean, %self.decoder.aspp.0.convs.0.1.running_var, %870, %878, %879, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.0/__module.decoder.aspp.0.convs.0.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1401 : Tensor = aten::relu(%input.323), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.0/__module.decoder.aspp.0.convs.0.2 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1395:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.327 : Tensor = aten::_convolution(%input.325, %self.decoder.aspp.0.convs.1.0.1.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.1/__module.decoder.aspp.0.convs.1.0/__module.decoder.aspp.0.convs.1.0.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.329 : Tensor = aten::batch_norm(%input.327, %self.decoder.aspp.0.convs.1.1.weight, %self.decoder.aspp.0.convs.1.1.bias, %self.decoder.aspp.0.convs.1.1.running_mean, %self.decoder.aspp.0.convs.1.1.running_var, %870, %878, %879, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.1/__module.decoder.aspp.0.convs.1.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1405 : Tensor = aten::relu(%input.329), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.1/__module.decoder.aspp.0.convs.1.2 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1395:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.333 : Tensor = aten::_convolution(%input.331, %self.decoder.aspp.0.convs.2.0.1.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.2/__module.decoder.aspp.0.convs.2.0/__module.decoder.aspp.0.convs.2.0.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.335 : Tensor = aten::batch_norm(%input.333, %self.decoder.aspp.0.convs.2.1.weight, %self.decoder.aspp.0.convs.2.1.bias, %self.decoder.aspp.0.convs.2.1.running_mean, %self.decoder.aspp.0.convs.2.1.running_var, %870, %878, %879, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.2/__module.decoder.aspp.0.convs.2.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1409 : Tensor = aten::relu(%input.335), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.2/__module.decoder.aspp.0.convs.2.2 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1395:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.339 : Tensor = aten::_convolution(%input.337, %self.decoder.aspp.0.convs.3.0.1.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.3/__module.decoder.aspp.0.convs.3.0/__module.decoder.aspp.0.convs.3.0.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.341 : Tensor = aten::batch_norm(%input.339, %self.decoder.aspp.0.convs.3.1.weight, %self.decoder.aspp.0.convs.3.1.bias, %self.decoder.aspp.0.convs.3.1.running_mean, %self.decoder.aspp.0.convs.3.1.running_var, %870, %878, %879, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.3/__module.decoder.aspp.0.convs.3.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %1413 : Tensor = aten::relu(%input.341), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.convs.3/__module.decoder.aspp.0.convs.3.2 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1395:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.353 : Tensor = aten::_convolution(%input.351, %self.decoder.aspp.0.project.0.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.project/__module.decoder.aspp.0.project.0 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.355 : Tensor = aten::batch_norm(%input.353, %self.decoder.aspp.0.project.1.weight, %self.decoder.aspp.0.project.1.bias, %self.decoder.aspp.0.project.1.running_mean, %self.decoder.aspp.0.project.1.running_var, %870, %878, %879, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.project/__module.decoder.aspp.0.project.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %input.357 : Tensor = aten::relu(%input.355), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.0/__module.decoder.aspp.0.project/__module.decoder.aspp.0.project.2 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1395:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.363 : Tensor = aten::_convolution(%input.361, %self.decoder.aspp.1.1.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.1/__module.decoder.aspp.1.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.365 : Tensor = aten::batch_norm(%input.363, %self.decoder.aspp.2.weight, %self.decoder.aspp.2.bias, %self.decoder.aspp.2.running_mean, %self.decoder.aspp.2.running_var, %870, %878, %879, %873), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.2 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %input.367 : Tensor = aten::relu(%input.365), scope: __module.decoder/__module.decoder.aspp/__module.decoder.aspp.3 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1395:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.377 : Tensor = aten::_convolution(%input.375, %self.decoder.block2.0.1.weight, %23, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.decoder/__module.decoder.block2/__module.decoder.block2.0/__module.decoder.block2.0.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 + %input.379 : Tensor = aten::batch_norm(%input.377, %self.decoder.block2.1.weight, %self.decoder.block2.1.bias, %self.decoder.block2.1.running_mean, %self.decoder.block2.1.running_var, %870, %878, %879, %873), scope: __module.decoder/__module.decoder.block2/__module.decoder.block2.1 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2381:0 + %input.381 : Tensor = aten::relu(%input.379), scope: __module.decoder/__module.decoder.block2/__module.decoder.block2.2 # /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:1395:0 : invalid argument
WARNING: [Torch-TensorRT TorchScript Conversion Context] - GPU error during getBestTactic: %input.383 : Tensor = aten::_convolution(%input.381, %self.segmentation_head.0.weight, %self.segmentation_head.0.bias, %869, %871, %869, %870, %871, %25, %873, %870, %873, %873), scope: __module.segmentation_head/__module.segmentation_head.0 # /opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py:442:0 : invalid argument
[1, 256, 128, 128]
[1, 3, 512, 512]
[1, 256, 128, 128]
[1, 3, 512, 512]
TRT Module created
Benchmarking TRT Module...
Warm up ...
Start timing ...
Iteration 100/1000, ave batch time 2.74 ms
Iteration 200/1000, ave batch time 2.75 ms
Iteration 300/1000, ave batch time 2.74 ms
Iteration 400/1000, ave batch time 2.75 ms
Iteration 500/1000, ave batch time 2.74 ms
Iteration 600/1000, ave batch time 2.74 ms
Iteration 700/1000, ave batch time 2.75 ms
Iteration 800/1000, ave batch time 2.75 ms
Iteration 900/1000, ave batch time 2.75 ms
Iteration 1000/1000, ave batch time 2.75 ms
Input shape: torch.Size([1, 3, 512, 512])
Output features size: torch.Size([1, 3, 512, 512])
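The warnings above are long, so it can help to reduce them to the list of layers for which the tactic search reported an error. The following is only a sketch: `failed_layers` is a hypothetical helper, and it assumes every warning line names its layer through the first `%self.<layer>.weight` tensor, as the lines in this log do.

```python
import re
from collections import Counter

# Matches the first weight tensor named on a warning line,
# e.g. "%self.decoder.aspp.0.project.0.weight" -> "decoder.aspp.0.project.0".
_WEIGHT = re.compile(r"%self\.([\w.]+)\.weight")


def failed_layers(log_text):
    """Count 'GPU error during getBestTactic' warnings per layer name."""
    counts = Counter()
    for line in log_text.splitlines():
        if "GPU error during getBestTactic" not in line:
            continue  # skip benchmark output and unrelated messages
        match = _WEIGHT.search(line)
        counts[match.group(1) if match else "<unknown>"] += 1
    return counts
```

Calling `failed_layers(open("build.log").read())` on a saved copy of the console output (the file name is an assumption) returns a `Counter` keyed by layer, which makes it easier to compare the WSL2 run against the Ubuntu run.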
Environment
Newest PyTorch NGC Docker image.
My Windows PC (the target device, running WSL2) has an RTX 3080.
My Ubuntu desktop (used for development) has a GTX 1080 Ti.
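Because "newest" does not pin an image tag, a short script along these lines could record the exact versions on each machine. It is a sketch under stated assumptions: `collect_environment` is a hypothetical helper, and the packages `torch`, `torch_tensorrt` and `tensorrt` are simply reported as unavailable if they cannot be imported.

```python
import importlib
import platform


def collect_environment():
    """Gather version details for the Environment section (hypothetical helper)."""
    info = {
        "python": platform.python_version(),
        # Under WSL2 the kernel string usually contains "microsoft".
        "platform": platform.platform(),
    }
    for name in ("torch", "torch_tensorrt", "tensorrt"):
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # missing package or a native library that fails to load
            info[name] = f"unavailable ({type(exc).__name__})"
            continue
        info[name] = getattr(module, "__version__", "unknown")

    try:
        import torch
    except Exception:
        return info  # no GPU details without PyTorch
    info["cuda (torch build)"] = torch.version.cuda
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
        major, minor = torch.cuda.get_device_capability(0)
        info["compute capability"] = f"{major}.{minor}"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Running it inside the container on both the Ubuntu desktop and the WSL2 machine gives two listings that can be compared line by line. The two GPUs belong to different architectures (compute capability 6.1 for the GTX 1080 Ti, 8.6 for the RTX 3080), so the reported values are expected to differ there.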
Additional context