【FastDepth】《FastDepth:Fast Monocular Depth Estimation on Embedded Systems》
2022-07-02 06:26:00 【bryant_meng】



ICRA-2019
1 Background and Motivation
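The goal is to speed up existing monocular depth estimation models so that they run at low latency without losing much accuracy. That makes them deployable on a micro aerial vehicle, where they can support robotic tasks such as mapping, localization, and obstacle avoidance.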
2 Related Work
- Monocular Depth Estimation
- Efficient Neural Networks
- Network Pruning
3 Advantages / Contributions
The monocular depth estimation model is accelerated in three ways:
- a low-complexity and low-latency decoder design
- a state-of-the-art pruning algorithm (NetAdapt)
- hardware-specific compilation (deployment with TVM, with optimized depthwise convolution)
4 Method
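1)Overall architecture
A plain U-Net-style encoder-decoder. The skip connections use addition rather than concatenation, to avoid increasing the number of feature map channels.
Each upsample layer consists of:

a 5×5 convolution (depthwise separable) followed by nearest-neighbor interpolation, whose low-level implementation is simpler and more widely supported than bilinear interpolation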
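To see why the decoder uses depthwise separable convolution, it helps to count multiply-accumulate operations (MACs). The sketch below compares a dense 5×5 convolution with its depthwise separable counterpart. The layer shape (14×14 feature map, 512 input and 256 output channels) is a hypothetical example chosen for illustration, not a value taken from the paper.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a dense k x k convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = h * w * c_in * k * k      # one k x k filter per input channel
    pointwise = h * w * c_in * c_out      # 1 x 1 conv mixes the channels
    return depthwise + pointwise


# Hypothetical decoder-layer shape, for illustration only.
h, w, c_in, c_out, k = 14, 14, 512, 256, 5

dense = standard_conv_macs(h, w, c_in, c_out, k)
separable = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = separable / dense                 # algebraically 1/c_out + 1/k**2

print(f"dense:     {dense:,} MACs")
print(f"separable: {separable:,} MACs")
print(f"ratio:     {ratio:.4f}")
```

The ratio simplifies to 1/c_out + 1/k², so with a 5×5 kernel the separable layer needs only a few percent of the dense layer's MACs. This is the theoretical reduction; the measured speedup depends on how well the depthwise operator is implemented on the target hardware.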
2)Network Pruning
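Pruning is done with NetAdapt:
《NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications》

The method is fairly brute-force and direct: it shrinks the network step by step under a resource budget and keeps whichever candidate loses the least accuracy.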
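The general shape of a NetAdapt-style loop can be sketched as follows. This is a toy illustration, not the authors' implementation: the network is reduced to a list of channel counts, `latency` is a hypothetical cost proxy (NetAdapt itself relies on measurements from the target device), and `accuracy_proxy` is a hypothetical stand-in for the "fine-tune briefly, then evaluate" step.

```python
def latency(channels):
    """Hypothetical latency proxy: each layer costs in_channels * out_channels."""
    return sum(a * b for a, b in zip(channels[:-1], channels[1:]))


def accuracy_proxy(channels, importance):
    """Hypothetical stand-in for short-term fine-tuning plus evaluation."""
    return sum(w * c for w, c in zip(importance, channels))


def netadapt_like(channels, importance, target, step):
    """Tighten the latency budget step by step; at each step keep the
    single-layer pruning proposal with the best accuracy proxy."""
    channels = list(channels)
    while latency(channels) > target:
        budget = max(target, latency(channels) - step)
        proposals = []
        # Input and output widths stay fixed; only hidden layers are pruned.
        for i in range(1, len(channels) - 1):
            cand = list(channels)
            while cand[i] > 1 and latency(cand) > budget:
                cand[i] -= 1              # remove one filter from layer i
            if latency(cand) <= budget:
                proposals.append(cand)
        if not proposals:
            break                         # no single layer can meet the budget
        channels = max(proposals, key=lambda c: accuracy_proxy(c, importance))
    return channels


# Hypothetical network and per-layer importance weights.
original = [3, 32, 64, 32, 1]
importance = [0.0, 1.0, 0.5, 2.0, 0.0]
pruned = netadapt_like(original, importance, target=3000, step=400)
print(original, "->", pruned, "latency:", latency(pruned))
```

Each iteration generates one proposal per layer, so the search is simple but expensive, which matches the "brute-force" impression. In the real algorithm the selected network is also fine-tuned for longer once the target budget is reached.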

3)Network Compilation
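TVM is used to speed up depthwise convolution (DWConv).
Background on TVM:
TVM is an open-source compiler framework that generates instructions for GPUs, CPUs, and FPGAs.
Its main characteristic is that it optimizes instruction generation at both the graph level and the operator level to maximize hardware execution efficiency. On the front end it connects to deep learning frameworks such as TensorFlow and PyTorch; on the back end it targets hardware such as GPUs, CPUs, ARM devices, and TPUs.
TVM is an end-to-end instruction generator. It takes a model from a deep learning framework as input, performs graph transformations and basic optimizations, and finally generates the instructions needed to deploy the model on hardware.
TVM has two main features:
- It compiles deep learning models from Keras, MXNet, PyTorch, TensorFlow, CoreML, and DarkNet into minimal deployable modules for a variety of hardware back ends.
- It automatically generates and optimizes tensor operators for multiple back ends to achieve better performance.
[Figures: overview diagrams of the TVM framework]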
5 Experiments
5.1 Datasets

Evaluation metrics
$\delta_1$ (the percentage of predicted pixels where the relative error is within 25%): higher is better
RMSE (root mean squared error): lower is better
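Both metrics are easy to compute by hand. The sketch below assumes the common convention for δ1, where a pixel counts as correct if max(pred/gt, gt/pred) < 1.25, and assumes that all depths are positive and already restricted to valid pixels. The sample values are made up for illustration.

```python
import math


def delta1(pred, gt, threshold=1.25):
    """Fraction of pixels whose ratio max(pred/gt, gt/pred) is below the threshold."""
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(gt)


def rmse(pred, gt):
    """Root mean squared error between predicted and ground-truth depths."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))


# Made-up depths in meters for four pixels.
pred = [1.0, 2.0, 4.0, 3.0]
gt = [1.1, 2.0, 3.0, 3.0]

print("delta1:", delta1(pred, gt))   # three of four pixels are within 25%
print("rmse:  ", rmse(pred, gt))
```

Only the third pixel (4.0 predicted against 3.0 true, a ratio of about 1.33) falls outside the 25% band, so δ1 is 0.75 for this sample.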
5.2 Final Results and Comparison With Prior Work
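Experimental platform

The NVIDIA Jetson TX2 series modules target embedded AI computing with an emphasis on speed and power efficiency. Each module has an NVIDIA Pascal GPU, up to 8 GB of memory, 59.7 GB/s of memory bandwidth, and a range of standard hardware interfaces.
Compared with the encoder, the decoder accounts for more of the runtime, so it is the main target for optimization.
Comparison with other methods on the Jetson TX2 in high performance (max-N) mode
Results on the Jetson TX2 in high energy-efficiency (max-Q) mode
In the visualized results, the error is highest at boundaries and at distant objects.
The difference between (c) and (d) is the skip connections; the output of (d) is much more refined.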
5.3 Ablation Study
1)Encoder Design Space
MobileNet is chosen because it gives the best trade-off between speed and accuracy.
2)Decoder Design Space
Upsample operation, i.e. the upsample layer in Figure 2

The upsampling in (a) and (b) is zero-insertion, while (d) uses nearest-neighbor interpolation.
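The difference between the two upsampling schemes is easiest to see on a tiny example. The sketch below implements 2× zero-insertion and 2× nearest-neighbor interpolation on plain nested lists. The function names are illustrative, and placing each value in the top-left corner of its 2×2 cell is one common convention for zero-insertion.

```python
def upsample_zero_insertion(x):
    """2x upsampling: each value goes to the top-left of a 2x2 cell, the rest are zeros."""
    out = []
    for row in x:
        top = []
        for v in row:
            top.extend([v, 0])
        out.append(top)
        out.append([0] * len(top))
    return out


def upsample_nearest(x):
    """2x upsampling: each value is copied into all four positions of its 2x2 cell."""
    out = []
    for row in x:
        wide = []
        for v in row:
            wide.extend([v, v])
        out.append(wide)
        out.append(list(wide))
    return out


x = [[1, 2],
     [3, 4]]

for name, fn in [("zero-insertion", upsample_zero_insertion),
                 ("nearest", upsample_nearest)]:
    print(name)
    for row in fn(x):
        print("  ", row)
```

With zero-insertion, three quarters of the upsampled map is zeros, so a convolution applied afterwards spends most of its work on empty positions. Nearest-neighbor interpolation has no learnable parameters and can be applied after the convolution, which keeps the convolution on the smaller feature map.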

Depthwise Separable Convolution and Skip Connections

3)Hardware-Specific Optimization
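The hardware-specific optimization brings the measured speedup of depthwise convolution (DWConv) closer to its theoretical compression ratio.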
4)Network Pruning
6 Conclusion(own) / Future work
The paper reads more like a technical report for a competition.
Code: https://github.com/dwofk/fast-depth