Apple MobileOne: a high-performance mobile backbone that needs only 1 ms
2022-06-11 11:42:00 [Zhiyuan community]

Paper: https://arxiv.org/abs/2206.04040
Reading guide
Efficient neural network backbones for mobile devices are usually optimized for FLOPs or parameter count. When deployed on mobile devices, however, these metrics may correlate poorly with the backbone's actual inference latency. This paper therefore analyzes different metrics extensively by deploying several efficient networks on a mobile device. By identifying and analyzing architectural and optimization bottlenecks in efficient neural networks, the authors propose ways to alleviate them. On this basis they design an efficient backbone, MobileOne, whose variants run in under 1 millisecond on an iPhone 12 while reaching 75.9% top-1 accuracy on ImageNet. MobileOne achieves state-of-the-art performance among efficient architectures while being many times faster on mobile devices: one of the best models matches MobileFormer on ImageNet while being 38x faster, and MobileOne is 2.3% more accurate on ImageNet top-1 than EfficientNet at similar latency. Moreover, MobileOne extends to multiple tasks (image classification, object detection, and semantic segmentation) with significant improvements in both latency and accuracy over existing efficient architectures deployed on mobile devices.

Contributions
Efficient networks have great practical value, but academic research tends to focus on reducing FLOPs or parameter count, and neither metric correlates strictly with inference efficiency. For example, FLOPs ignore memory-access cost and degree of parallelism: a parameter-free operation such as a skip connection (Add, Concat, etc.) incurs significant memory-access cost and lengthens inference time.
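As a back-of-the-envelope illustration of this point (the layer sizes below are hypothetical, not taken from the paper), the usual FLOP accounting assigns zero cost to a residual Add even though it still moves a full tensor's worth of data through memory:

```python
def conv_flops_params(c_in, c_out, k, h_out, w_out, groups=1):
    """Multiply-accumulate count and parameter count of a conv layer
    (bias terms ignored for simplicity)."""
    params = c_out * (c_in // groups) * k * k
    flops = params * h_out * w_out
    return flops, params

# A 3x3 conv, 64 -> 64 channels, on a 56x56 output feature map.
flops, params = conv_flops_params(64, 64, 3, 56, 56)

# A residual Add on the same 64x56x56 tensor: zero FLOPs under this
# accounting, yet it reads two tensors and writes one, so its memory
# traffic is 3 * 64 * 56 * 56 elements -- cost that FLOPs never see.
add_memory_traffic = 3 * 64 * 56 * 56
```

This is why two models with identical FLOPs can have very different measured latencies on a phone.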

To better analyze the bottlenecks of efficient networks, the authors benchmark on the iPhone 12 platform and analyze the "bottlenecks" along several dimensions, as shown in the table above. It shows that:
- a model with many parameters can still have low latency, e.g. ShuffleNetV2;
- a model with high FLOPs can still have low latency, e.g. MobileNetV1 and ShuffleNetV2.

The table above analyzes the same question via the Spearman rank correlation coefficient (SRCC), showing that:
- on mobile, latency correlates only weakly with FLOPs and parameter count;
- on desktop CPU, the correlation is weaker still.
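SRCC measures how well a monotone relationship holds between two variables: it is simply the Pearson correlation of their ranks. A minimal self-contained sketch (the FLOPs/latency numbers are made up for illustration, not from the paper's tables):

```python
def spearman_rank_correlation(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y,
    with average ranks assigned to ties."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0.0] * len(values)
        i = 0
        while i < len(order):
            j = i
            # Group tied values together and give them the average rank.
            while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
                j += 1
            avg_rank = (i + j) / 2 + 1  # ranks are 1-based
            for k in range(i, j + 1):
                r[order[k]] = avg_rank
            i = j + 1
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Illustrative numbers: model FLOPs (millions) vs. measured latency (ms).
flops = [300, 150, 600, 80, 510]
latency_ms = [1.2, 0.9, 1.1, 1.0, 2.0]
rho = spearman_rank_correlation(flops, latency_ms)
```

A value of rho near 1 would mean "lower FLOPs reliably means lower latency"; the paper's measurements show rho is far from 1 on mobile, and lower still on desktop CPU.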
Method
Based on these insights, the authors compare models along the two main efficiency "bottleneck" dimensions and then propose a remedy for each bottleneck.

- Activation functions: the table above compares the effect of different activation functions on latency. Even with an identical architecture, latency varies greatly across activation functions, so this paper defaults to the ReLU activation function.

- Architectural blocks: the table above analyzes the two main factors that affect latency, memory-access cost and degree of parallelism. It shows that a single-branch structure makes the model faster. In addition, to improve efficiency, the authors restrict the use of SE modules to the large model configurations.

Based on the above analysis, the core block of MobileOne is designed on top of MobileNetV1 and absorbs the idea of structural reparameterization, yielding the structure shown in the figure above. Note: the reparameterization mechanism adds a hyperparameter k that controls the number of over-parameterized branches (experiments show that this variant benefits small models the most).
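The key property that structural reparameterization relies on is that parallel linear branches plus an identity shortcut can be folded into a single linear operator at inference time. A minimal sketch, treating a 1x1 conv on C channels as a CxC matrix (the weights below are hypothetical; real reparameterization also folds BatchNorm into the weights, which is omitted here):

```python
def matvec(W, x):
    """Apply a square weight matrix W (nested lists) to vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def merge_branches(branches, with_identity=True):
    """Fold k parallel linear branches and an identity shortcut into one
    weight matrix: sum_i(W_i x) + x == (sum_i W_i + I) x."""
    n = len(branches[0])
    merged = [[sum(W[r][c] for W in branches) for c in range(n)]
              for r in range(n)]
    if with_identity:
        for i in range(n):
            merged[i][i] += 1.0
    return merged

# Two hypothetical 1x1-conv branches on a 3-channel input, plus a skip.
W1 = [[0.5, 0.0, 0.1], [0.2, 0.3, 0.0], [0.0, 0.1, 0.4]]
W2 = [[0.1, 0.2, 0.0], [0.0, 0.1, 0.3], [0.2, 0.0, 0.1]]
x = [1.0, 2.0, 3.0]

# Multi-branch form (train time): evaluate every branch, then add.
multi_branch = [a + b + c
                for a, b, c in zip(matvec(W1, x), matvec(W2, x), x)]
# Single-branch form (inference time): one merged matrix, one matvec.
single_branch = matvec(merge_branches([W1, W2]), x)
```

The two outputs are identical, which is why the deployed model can be single-branch (fast, low memory traffic) even though training used k extra branches.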

In terms of model scaling, MobileOne is similar to MobileNetV2; the table above lists the parameter configurations of the different MobileOne variants.

In terms of training optimization, smaller models need less regularization, so the authors propose an annealed regularization schedule (bringing a 0.5% accuracy gain); they also introduce a progressive learning mechanism (a 0.4% gain); finally, using an EMA of the model weights, the final MobileOne-S2 model reaches 77.4%.
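Weight EMA keeps a smoothed shadow copy of the parameters that is used for evaluation. A minimal sketch of the general technique, not the paper's exact recipe (the decay value and the toy weight trajectory are illustrative):

```python
class WeightEMA:
    """Exponential moving average of model weights: after each training
    step, shadow = decay * shadow + (1 - decay) * current."""
    def __init__(self, weights, decay=0.9995):
        self.decay = decay
        self.shadow = dict(weights)  # copy of the initial weights

    def update(self, weights):
        d = self.decay
        for name, w in weights.items():
            self.shadow[name] = d * self.shadow[name] + (1.0 - d) * w

# Toy run: one scalar "weight" drifting upward over 5 training steps.
ema = WeightEMA({"w": 0.0}, decay=0.9)
for step in range(1, 6):
    ema.update({"w": float(step)})
# The shadow weight lags the raw weight, smoothing out step-to-step noise;
# evaluation uses ema.shadow instead of the raw weights.
```

In practice the decay is close to 1 (e.g. 0.999+), so the shadow averages over many recent checkpoints and typically evaluates slightly better than the final raw weights.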
Experiments

The table above compares the performance and efficiency of different lightweight schemes on the ImageNet dataset:
- even the lightest Transformer takes at least 4 ms, while MobileOne-S4 reaches 79.4% accuracy in only 1.86 ms;
- compared with EfficientNet-B0, MobileOne-S3 is 1% more accurate while also having faster inference;
- compared with other schemes, MobileOne retains a very clear advantage on desktop CPU as well.

The table above compares performance on MS-COCO detection, VOC segmentation, and ADE20K segmentation:
- on MS-COCO, MobileOne-S4 outperforms MNASNet by 27.8% and MobileViT by 6.1%;
- on VOC segmentation, the proposed scheme is 1.3% higher than MobileViT and 5.8% higher than MobileNetV2;
- on ADE20K, the best proposed model is 12% higher than MobileNetV2, and even MobileOne-S1 is still 2.9% higher than MobileNetV2.

At the end of the paper, the authors candidly note: "Although, our models are state-of-the art within the regime of efficient architectures, the accuracy lags large models ConvNeXt and Swin Transformer".