Apple MobileOne: a high-performance backbone that needs only 1 ms on mobile
2022-06-11 11:42:00 【Zhiyuan community】

Paper: https://arxiv.org/abs/2206.04040
Reading guide
Efficient neural network backbones for mobile devices are usually optimized for metrics such as FLOPs or parameter count. When deployed on a mobile device, however, these metrics may correlate poorly with the backbone's inference latency. The paper therefore analyzes these metrics extensively by deploying several efficient networks on a mobile device. By identifying and analyzing the architectural and optimization bottlenecks in efficient neural networks, it proposes ways to mitigate them. To this end, the authors design an efficient backbone, MobileOne, whose variants run in under 1 ms on an iPhone 12 while reaching 75.9% top-1 accuracy on ImageNet. The paper shows that MobileOne achieves state-of-the-art performance among efficient architectures while being many times faster on mobile devices: the best model reaches accuracy on ImageNet similar to MobileFormer while running 38 times faster, and MobileOne's ImageNet top-1 accuracy is 2.3% higher than EfficientNet's at similar latency. MobileOne also generalizes to multiple tasks (image classification, object detection, and semantic segmentation), with significant improvements in latency and accuracy over existing efficient architectures deployed on mobile devices.

Contribution
Efficient networks have the greater practical value, but academic research tends to focus on reducing FLOPs or parameter count, and neither metric is strictly consistent with inference efficiency. For example, FLOPs account for neither memory access cost nor the degree of parallelism. Parameter-free operations (such as the Add and Concat of skip connections) incur significant memory access cost and therefore lengthen inference time.
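The gap between FLOPs and memory traffic can be made concrete with a back-of-the-envelope estimate. The sketch below is an illustration only, not the paper's measurement method: the function names, the tensor shape (56 x 56 x 64), and the float32 assumption are all hypothetical, and the model ignores caching and operator fusion.

```python
def conv2d_cost(h, w, c_in, c_out, k, bytes_per_elem=4):
    """Rough cost of a stride-1, same-padded k x k convolution (assumed float32)."""
    params = k * k * c_in * c_out
    # One multiply and one add per weight for every output pixel.
    flops = 2 * h * w * params
    # Read the input, write the output, read the weights once.
    mem = bytes_per_elem * (h * w * c_in + h * w * c_out + params)
    return {"params": params, "flops": flops, "bytes": mem}


def add_cost(h, w, c, bytes_per_elem=4):
    """Rough cost of an element-wise skip-connection Add of two h x w x c tensors."""
    n = h * w * c
    # No parameters, one addition per element, but two reads and one write.
    return {"params": 0, "flops": n, "bytes": bytes_per_elem * 3 * n}


# Hypothetical layer shape, chosen only for illustration.
conv = conv2d_cost(56, 56, 64, 64, 3)
add = add_cost(56, 56, 64)

# FLOPs performed per byte moved: low values mean the op is memory-bound.
conv_intensity = conv["flops"] / conv["bytes"]
add_intensity = add["flops"] / add["bytes"]
```

Under these assumptions the Add has no parameters and contributes less than 0.1% of the convolution's FLOPs, yet it moves more bytes than the convolution does, which is why a FLOP count alone can understate its cost.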

To better analyze the bottlenecks of efficient networks, the authors use the iPhone 12 as the benchmark platform and examine the bottlenecks along several dimensions (see the figure above). The analysis shows:
- A model with a high parameter count can still have low latency, for example ShuffleNetV2;
- A model with high FLOPs can still have low latency, for example MobileNetV1 and ShuffleNetV2.

The table above analyzes the same question in terms of the Spearman rank correlation coefficient (SRCC):
- On mobile, latency is only weakly correlated with FLOPs and parameter count;
- On a PC CPU, the correlation is weaker still.
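For readers unfamiliar with SRCC, it is the Pearson correlation computed on the ranks of the two variables rather than on their raw values. The following is a minimal pure-Python sketch; the FLOPs and latency numbers are made up for illustration and are not taken from the paper.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical measurements for five models, NOT values from the paper.
flops_m = [300, 600, 150, 900, 450]
latency_ms = [1.2, 0.9, 1.0, 2.0, 1.5]
rho = spearman(flops_m, latency_ms)
```

A value of `rho` near 1 would mean that ranking models by FLOPs reproduces their ranking by latency; values well below 1 are what the weak correlation described above looks like in practice.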
Method
Building on these observations, the authors first compare the two main efficiency bottlenecks and then propose remedies for the performance bottlenecks.

- Activation functions: the table above compares the effect of different activation functions on latency. Even with an identical architecture, latency varies widely from one activation function to another. The paper uses ReLU by default.

- Architectural blocks: the table above analyzes the two main factors that affect latency, namely memory access cost and degree of parallelism. Models built from single-branch structures are faster. In addition, to improve efficiency, the authors restrict the use of SE modules to the large model configurations.

Based on this analysis, the core MobileOne block follows the MobileNetV1 design and adopts the idea of re-parameterization, which yields the structure shown in the figure above. Note: the re-parameterization mechanism has a hyperparameter k that controls the number of re-parameterizable branches (experiments show that this brings larger gains for small models).
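The idea behind re-parameterization is that parallel linear branches used at training time can be merged into a single convolution for inference. The sketch below is a simplified illustration under stated assumptions, not the authors' implementation: it models only 1x1 (pointwise) convolutions applied to a single pixel, so each convolution is a matrix-vector product, and it omits the depthwise convolutions that a MobileNetV1-style block also contains. All function names and numeric values are hypothetical.

```python
import math


def fold_bn(weight, gamma, beta, mean, var, eps=1e-5):
    """Fold an inference-mode BatchNorm into the preceding 1x1 convolution.

    weight is a C_out x C_in matrix; the BN parameters are per-output-channel lists.
    """
    w_f, b_f = [], []
    for row, g, b, m, v in zip(weight, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        w_f.append([scale * w for w in row])
        b_f.append(b - m * scale)
    return w_f, b_f


def reparameterize(branches, skip_bn=None):
    """Merge k conv+BN branches (and an optional BN-only skip) into one conv."""
    c_out, c_in = len(branches[0][0]), len(branches[0][0][0])
    folded = [fold_bn(*br) for br in branches]
    if skip_bn is not None:
        # A BN-only skip is a conv with identity weights (requires c_in == c_out).
        identity = [[1.0 if i == j else 0.0 for j in range(c_in)] for i in range(c_out)]
        folded.append(fold_bn(identity, *skip_bn))
    W = [[sum(w[i][j] for w, _ in folded) for j in range(c_in)] for i in range(c_out)]
    B = [sum(b[i] for _, b in folded) for i in range(c_out)]
    return W, B


def conv1x1(x, W, B):
    """Apply a 1x1 convolution to the channel vector of a single pixel."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W, B)]


def bn(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm on a channel vector."""
    return [g * (yi - m) / math.sqrt(v + eps) + b
            for yi, g, b, m, v in zip(y, gamma, beta, mean, var)]


def multi_branch_forward(x, branches, skip_bn=None):
    """Training-time structure: sum of the outputs of all branches."""
    out = [0.0] * len(branches[0][0])
    for weight, gamma, beta, mean, var in branches:
        y = bn(conv1x1(x, weight, [0.0] * len(weight)), gamma, beta, mean, var)
        out = [o + yi for o, yi in zip(out, y)]
    if skip_bn is not None:
        out = [o + yi for o, yi in zip(out, bn(x, *skip_bn))]
    return out


# Hypothetical k = 2 branches: (weight, gamma, beta, running_mean, running_var).
branches = [
    ([[1.0, 2.0], [0.5, -1.0]], [1.0, 0.5], [0.1, -0.2], [0.0, 1.0], [1.0, 4.0]),
    ([[0.0, 1.0], [2.0, 0.0]], [2.0, 1.0], [0.0, 0.3], [0.5, -0.5], [0.25, 1.0]),
]
skip_bn = ([1.0, 1.0], [0.0, 0.0], [0.2, 0.1], [1.0, 1.0])
x = [0.7, -1.3]

y_train = multi_branch_forward(x, branches, skip_bn)   # three branches
W, B = reparameterize(branches, skip_bn)
y_fused = conv1x1(x, W, B)                             # one convolution
```

Because every branch is linear at inference time, `y_fused` matches `y_train` up to floating-point error, so the deployed model keeps the accuracy of the multi-branch model while running as a single-branch structure.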

For model scaling, MobileOne follows an approach similar to MobileNetV2; the table above lists the parameters of the different MobileOne configurations.

For training, smaller models need less regularization, so the authors propose annealing the regularization over the course of training (a 0.5% improvement). They also introduce a progressive learning scheme (a 0.4% improvement). Finally, they use an exponential moving average (EMA) of the weights, and the final MobileOne-S2 model reaches 77.4% accuracy.
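Two of these training techniques are simple enough to sketch. The code below is a minimal illustration under assumptions: the article only says that regularization is annealed, so the cosine shape, the start and end coefficients, and the EMA decay used here are hypothetical choices, and both function names are made up for this example.

```python
import math


def annealed_weight_decay(step, total_steps, wd_start=1e-4, wd_end=1e-5):
    """Cosine-anneal a weight-decay coefficient from wd_start down to wd_end.

    The cosine schedule and the default values are assumptions for illustration.
    """
    t = min(max(step / total_steps, 0.0), 1.0)  # training progress in [0, 1]
    return wd_end + 0.5 * (wd_start - wd_end) * (1.0 + math.cos(math.pi * t))


def ema_update(ema_weights, weights, decay=0.999):
    """One exponential-moving-average step over a flat list of weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


# Regularization strength at the start, middle, and end of a 100-step run.
schedule = [annealed_weight_decay(s, 100) for s in (0, 50, 100)]

# The EMA copy trails the raw weights and smooths out step-to-step noise.
ema = [1.0, -2.0]
for raw in ([1.2, -2.2], [0.8, -1.8], [1.1, -2.1]):
    ema = ema_update(ema, raw, decay=0.9)
```

In a real training loop the annealed coefficient would be written into the optimizer before each step, and the EMA copy of the weights, rather than the raw weights, would be used for evaluation.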
Experiments

The table above compares the accuracy and efficiency of different lightweight models on ImageNet:
- Even the lightest Transformer needs at least 4 ms, whereas MobileOne-S4 reaches 79.4% accuracy in only 1.86 ms;
- Compared with EfficientNet-B0, MobileOne-S3 is 1% more accurate and also faster at inference;
- On a PC CPU, MobileOne still holds a clear advantage over the other models.

The table above compares results on MS-COCO detection, VOC segmentation, and ADE20K segmentation:
- On MS-COCO, MobileOne-S4 scores 27.8% higher than MNASNet and 6.1% higher than MobileViT;
- On VOC segmentation, the proposed model scores 1.3% higher than MobileViT and 5.8% higher than MobileNetV2;
- On ADE20K, the best proposed model scores 12% higher than MobileNetV2, and even MobileOne-S1 is still 2.9% higher than MobileNetV2.

At the end of the paper, the authors add a wry remark: "Although, our models are state-of-the art within the regime of efficient architectures, the accuracy lags large models ConvNeXt and Swin Transformer".