【MobileNet V3】《Searching for MobileNetV3》
2022-07-02 06:26:00 【bryant_meng】
1 Background and Motivation
【MobileNet】《MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications》(CVPR-2017)
【MobileNet V2】《MobileNetV2:Inverted Residuals and Linear Bottlenecks》(CVPR-2018)
deliver the next generation of high accuracy efficient neural network models to power on-device computer vision
2 Related Work
reducing the number of parameters -> reducing the number of operations (MAdds) -> reducing the actual measured latency
NAS
cell level -> block level
Quantization
knowledge distillation
3 Advantages / Contributions
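NAS plus manual design are combined to build the MobileNetV3 backbone. The paper proposes the hard-swish activation function (an improved version of swish) and the Lite R-ASPP segmentation head (an improved version of R-ASPP). Both speed and accuracy improve on classification, object detection, and segmentation datasets.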
4 Method
1)Network Search
Platform-Aware NAS for Blockwise Search (taken from MnasNet, with the weight in the reward design slightly modified)
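To make the "reward design weight" concrete, here is a minimal sketch of the MnasNet-style multi-objective reward, ACC(m) × (LAT(m) / TAR)^w. As far as I recall, the MobileNetV3 paper changes w from -0.07 to -0.15 when searching the small model; the candidate names, accuracies, latencies, and target below are made-up numbers for illustration only.

```python
def mnas_reward(acc, latency_ms, target_ms, w=-0.07):
    """MnasNet-style reward: accuracy scaled by (latency / target) ** w."""
    return acc * (latency_ms / target_ms) ** w


def best_candidate(candidates, target_ms, w):
    """Return the name of the candidate with the highest reward."""
    return max(
        candidates,
        key=lambda name: mnas_reward(*candidates[name], target_ms, w),
    )


# Hypothetical candidates: name -> (top-1 accuracy, measured latency in ms).
candidates = {"a": (0.70, 60.0), "b": (0.73, 90.0)}
target_ms = 75.0  # assumed latency target

for w in (-0.07, -0.15):
    print(w, best_candidate(candidates, target_ms, w))
```

With these made-up numbers, the more negative w penalizes latency more strongly, so the faster candidate wins under w = -0.15 while the more accurate one wins under w = -0.07.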
NetAdapt for Layerwise Search
search per layer for the number of filters
maximizes $\frac{\Delta \text{Acc}}{\Delta \text{latency}}$
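The selection step can be sketched as follows. This is only an illustration of picking the proposal with the best accuracy change per unit of latency saved, not the NetAdapt implementation; the proposal names and numbers are hypothetical, and the latency change is taken as a positive amount saved.

```python
def pick_proposal(base_acc, base_latency, proposals):
    """Pick the proposal that maximizes delta_acc / delta_latency."""

    def ratio(p):
        delta_acc = p["acc"] - base_acc              # usually negative
        delta_latency = base_latency - p["latency"]  # latency saved, > 0
        return delta_acc / delta_latency

    # Only proposals that actually reduce latency are considered.
    valid = [p for p in proposals if p["latency"] < base_latency]
    return max(valid, key=ratio)


# Hypothetical proposals, e.g. each shrinking the filter count of one layer.
proposals = [
    {"name": "p1", "acc": 0.745, "latency": 58.0},
    {"name": "p2", "acc": 0.748, "latency": 59.0},
    {"name": "p3", "acc": 0.730, "latency": 55.0},
]

chosen = pick_proposal(base_acc=0.750, base_latency=60.0, proposals=proposals)
print(chosen["name"])
```

Because the accuracy change is normally negative, maximizing the ratio amounts to choosing the proposal that loses the least accuracy per millisecond saved.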
2)Network Improvements
Redesigning Expensive Layers
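After the search, the first and last layers of the network are relatively heavy, so they were optimized
the first layer goes from 32 channels + ReLU or swish down to 16 channels + hard swish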
$\text{swish}(x) = x \cdot \sigma(x)$
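The swish activation function improves accuracy, but it is not hardware friendly at deployment and adds computation time, so the authors adopt the following piece-wise linear replacement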
$\text{h-swish}(x) = x \frac{\text{ReLU6}(x+3)}{6}$
It is still slower than ReLU
only use h-swish at the second half of the model(we find that most of the benefits swish are realized by using them only in the deeper layers)
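A minimal pure-Python sketch of the two activations, written only to show how closely the piece-wise linear h-swish tracks swish; it works on scalars and is not meant as a deployable kernel.

```python
import math


def relu6(x):
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x):
    """Piece-wise linear stand-in for the sigmoid: ReLU6(x + 3) / 6."""
    return relu6(x + 3.0) / 6.0


def swish(x):
    """swish(x) = x * sigmoid(x)."""
    return x * (1.0 / (1.0 + math.exp(-x)))


def hard_swish(x):
    """h-swish(x) = x * ReLU6(x + 3) / 6."""
    return x * hard_sigmoid(x)


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, round(swish(x), 4), round(hard_swish(x), 4))
```

h-swish is exactly 0 for x <= -3 and exactly x for x >= 3, so it needs no exponential, only a clip, an add, and a multiply.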
Large squeeze-and-excite
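Compared with v2, v3 adds the SE module, and the sigmoid inside SE also uses the hard form, namely $\frac{\text{ReLU6}(x+3)}{6}$
The authors fix the squeeze FC layer in the SE module to 1/4 of the number of expansion channels in the block (marked with the red check in Figure 4)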
no discernible latency cost
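The SE variant described above can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions: biases are omitted, the hidden width is simply expand_channels // 4 with no rounding to a hardware-friendly multiple, and the weights and feature map are made-up values.

```python
def relu6(x):
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x):
    return relu6(x + 3.0) / 6.0


def matvec(weights, vec):
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def squeeze_excite(feature_map, w_reduce, w_expand):
    """feature_map: list of C channels, each a flat list of H*W activations."""
    # Squeeze: global average pooling per channel.
    pooled = [sum(ch) / len(ch) for ch in feature_map]
    # Excite: FC (C -> C/4) + ReLU, then FC (C/4 -> C) + hard sigmoid.
    hidden = [max(h, 0.0) for h in matvec(w_reduce, pooled)]
    gates = [hard_sigmoid(g) for g in matvec(w_expand, hidden)]
    # Rescale every channel by its gate.
    return [[g * v for v in ch] for g, ch in zip(gates, feature_map)]


expand_channels = 8                     # hypothetical expansion width
hidden_channels = expand_channels // 4  # squeeze FC fixed to 1/4
feature_map = [[float(c), float(c) + 1.0] for c in range(expand_channels)]
w_reduce = [[0.1] * expand_channels for _ in range(hidden_channels)]
w_expand = [[0.5] * hidden_channels for _ in range(expand_channels)]

out = squeeze_excite(feature_map, w_reduce, w_expand)
print(len(out), len(out[0]))
```

The gates always fall in [0, 1] because of the hard sigmoid, so the module can only attenuate or pass a channel, never amplify it.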
MobileNetV3 Definitions
5 Experiments
use single-threaded large core in all our measurements
5.1 Datasets
- ImageNet
- Cityscapes
5.2 Classification
1)Impact of non-linearities
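I do not fully understand the 112 entry here: a larger N should in principle mean h-swish is used in more layers, which ought to be slower, so why is it faster?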
2)Impact of other components
5.3 Detection
The mAP numbers are absurd, haha
5.4 Segmentation
An improvement built on top of R-ASPP
6 Conclusion(own) / Future work
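Pareto optimality, also called Pareto efficiency, is an ideal state of resource allocation. Given a fixed group of people and a fixed pool of allocatable resources, a change from one allocation to another that makes at least one person better off without making anyone worse off is a Pareto improvement, also called Pareto optimization.
A Pareto-optimal state is one in which no further Pareto improvement is possible; in other words, Pareto improvement is the path and method for reaching Pareto optimality. Pareto optimality is the "ideal kingdom" of fairness and efficiency. The concept was proposed by Pareto.
MobileNet V3 = MobileNet v2 + SE + hard-swish activation + half initial layers channel & last block do global average pooling first (from 盖肉特别慌)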