MobileOne: a high-performance backbone that needs only 1 ms on mobile devices
2022-06-29 03:46:00 【AI vision netqi】
Open-source implementation:
GitHub - shoutOutYangJie/MobileOne: An Improved One millisecond Mobile Backbone
Measured on a machine with a 1060 graphics card at 224×224 input: about 10 ms on CPU and 3 ms on GPU.
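Test code:

    import time
    import torch
    # make_mobileone_s0 comes from the repository above and is assumed to be in scope.

    if __name__ == '__main__':
        model = make_mobileone_s0().cuda(0)
        model.eval()
        data = torch.rand(1, 3, 224, 224).cuda(0)
        with torch.no_grad():  # inference only, no autograd bookkeeping
            for i in range(10):
                torch.cuda.synchronize()  # finish pending GPU work before timing
                start = time.time()
                out = model(data)
                torch.cuda.synchronize()  # CUDA kernels run asynchronously
                print('time', time.time() - start, out.size())

MobileOne (≈ MobileNetV1 + RepVGG + training tricks) is an ultra-lightweight architecture proposed by Apple and optimized for the iPhone 12. On ImageNet it reaches 75.9% top-1 accuracy with a latency below 1 ms.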

To better understand the bottlenecks of efficient networks, the authors use the iPhone 12 as the benchmark platform and analyze the bottlenecks along several dimensions (see the figure above). Two observations stand out:
A model with a large number of parameters can still have low latency, for example ShuffleNetV2;
A model with high FLOPs can also have low latency, for example MobileNetV1 and ShuffleNetV2.

The table above looks at the same question through the SRCC (Spearman rank correlation coefficient):
On mobile devices, latency correlates only weakly with FLOPs and parameter count;
On a PC CPU, the correlation is weaker still.
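As a rough illustration of what the SRCC measures, the sketch below computes the Spearman rank correlation between FLOPs and latency in plain Python. The function names and the five data points are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical numbers for illustration only (NOT measurements from the paper).
flops_m = [300, 150, 600, 400, 220]
latency_ms = [1.2, 0.9, 1.0, 2.1, 0.8]
print("SRCC(FLOPs, latency) =", round(spearman(flops_m, latency_ms), 3))
```

A value near 1 would mean that ranking models by FLOPs almost reproduces their ranking by latency; the weaker the correlation, the less FLOPs can be trusted as a proxy for speed.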
The proposed approach
Based on these insights, the authors first compare the two main efficiency bottlenecks one dimension at a time, and then propose a remedy for each performance bottleneck.

Activation Functions: The table above compares the effect of different activation functions on latency. Even with an identical architecture, latency varies greatly from one activation function to another. MobileOne uses ReLU by default.

Architectural Block: The table above analyzes the two main factors that affect latency: memory access cost and degree of parallelism. The takeaway is that a single-branch structure is faster. In addition, to keep the model efficient, the authors use SE modules only to a limited extent, in the large model configuration.

Based on the above analysis, the core block of MobileOne is built on the MobileNetV1 design and borrows the idea of re-parameterization, giving the structure shown in the figure above. Note: the re-parameterization mechanism also has a hyperparameter k that controls the number of re-parameterizable branches (experiments show that this variant helps small models the most).
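The idea behind re-parameterization can be sketched in a few lines: at inference time, several parallel conv+BN branches collapse into a single convolution with a bias. The following is a minimal pure-Python sketch under simplifying assumptions, not the authors' implementation: each branch is a bias-free 1×1 convolution evaluated at a single pixel (so it reduces to a matrix-vector product), BatchNorm is in inference mode, and a BN-only skip branch is modeled as a convolution with an identity weight. All names, sizes, and the epsilon value are hypothetical.

```python
import math
import random

EPS = 1e-5  # BatchNorm epsilon (assumed value)


def matvec(weight, x):
    """Apply a 1x1 convolution at one pixel: a plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def branch_forward(weight, bn, x):
    """One train-time branch: bias-free 1x1 conv followed by inference-mode BN."""
    gamma, beta, mean, var = bn
    y = matvec(weight, x)
    return [g * (v - m) / math.sqrt(s + EPS) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]


def fold_bn(weight, bn):
    """Fold BN statistics into the conv weight and return (weight, bias)."""
    gamma, beta, mean, var = bn
    scales = [g / math.sqrt(s + EPS) for g, s in zip(gamma, var)]
    fused_w = [[w * sc for w in row] for row, sc in zip(weight, scales)]
    fused_b = [b - m * sc for b, m, sc in zip(beta, mean, scales)]
    return fused_w, fused_b


def reparameterize(branches):
    """Merge parallel conv+BN branches into a single weight matrix and bias."""
    folded = [fold_bn(w, bn) for w, bn in branches]
    n_out, n_in = len(folded[0][0]), len(folded[0][0][0])
    weight = [[sum(fw[o][i] for fw, _ in folded) for i in range(n_in)]
              for o in range(n_out)]
    bias = [sum(fb[o] for _, fb in folded) for o in range(n_out)]
    return weight, bias


def random_bn(rng, channels):
    """Random but valid BN parameters (variance kept positive)."""
    return ([rng.uniform(0.5, 1.5) for _ in range(channels)],   # gamma
            [rng.uniform(-0.5, 0.5) for _ in range(channels)],  # beta
            [rng.uniform(-0.5, 0.5) for _ in range(channels)],  # running mean
            [rng.uniform(0.5, 1.5) for _ in range(channels)])   # running var


rng = random.Random(0)
channels, k = 4, 3  # k conv branches plus one BN-only skip branch
branches = [([[rng.uniform(-1, 1) for _ in range(channels)] for _ in range(channels)],
             random_bn(rng, channels)) for _ in range(k)]
# The BN-only skip branch is a conv whose weight is the identity matrix.
identity = [[1.0 if i == o else 0.0 for i in range(channels)] for o in range(channels)]
branches.append((identity, random_bn(rng, channels)))

x = [rng.uniform(-1, 1) for _ in range(channels)]
multi_branch = [sum(vals) for vals in zip(*(branch_forward(w, bn, x) for w, bn in branches))]
fused_w, fused_b = reparameterize(branches)
single_branch = [v + b for v, b in zip(matvec(fused_w, x), fused_b)]
max_diff = max(abs(a - b) for a, b in zip(multi_branch, single_branch))
print("max abs difference:", max_diff)
```

The multi-branch and fused outputs agree up to floating-point error, which is why extra branches (controlled here by k) cost nothing at inference time: they exist only during training.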

For model scaling, MobileOne follows an approach similar to MobileNetV2. The table above lists the parameters of the different MobileOne configurations.

In terms of training optimization, smaller models need less regularization, so the authors propose an annealed regularization schedule (worth a 0.5% improvement). They also introduce progressive learning (worth a 0.4% improvement). Finally, they use EMA (exponential moving average of the weights), and the final MobileOne-S2 model reaches 77.4%.
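Two of these training tricks are easy to sketch. The snippet below shows one plausible form of annealed regularization, a weight-decay coefficient that decays along a cosine curve, together with a plain EMA update of the weights. The cosine shape, the start and end values, the decay constants, and the function names are all assumptions made for illustration; the article does not specify them.

```python
import math


def annealed_weight_decay(step, total_steps, wd_start=1e-4, wd_end=1e-5):
    """Cosine-anneal the weight-decay coefficient from wd_start to wd_end.

    wd_start and wd_end are hypothetical values, not taken from the article.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 to 0
    return wd_end + (wd_start - wd_end) * cosine


def ema_update(ema_weights, weights, decay=0.999):
    """One EMA step: keep `decay` of the old average, blend in the new weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


total = 100
for step in (0, 50, 100):
    print(step, annealed_weight_decay(step, total))

ema = [0.0, 0.0]
for _ in range(3):
    ema = ema_update(ema, [1.0, 2.0], decay=0.9)
print(ema)
```

In a training loop, the annealed coefficient would be written into the optimizer's weight-decay setting at each step, and the EMA copy of the weights would be the one used for evaluation.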
Experimental results

The table above compares the accuracy and efficiency of different lightweight models on ImageNet:
Even the lightest Transformer needs at least 4 ms, whereas MobileOne-S4 reaches 79.4% accuracy in only 1.86 ms;
Compared with EfficientNet-B0, MobileOne-S3 is not only 1% more accurate but also faster at inference;
Compared with the other models, MobileOne keeps a clear advantage on a PC CPU as well.

The table above compares performance on MS-COCO detection, VOC segmentation, and ADE20K segmentation:
On MS-COCO, MobileOne-S4 scores 27.8% higher than MNASNet and 6.1% higher than MobileViT;
On VOC segmentation, the proposed model scores 1.3% higher than MobileViT and 5.8% higher than MobileNetV2;
On ADE20K, the best proposed model scores 12% higher than MobileNetV2, and even MobileOne-S1 is still 2.9% higher than MobileNetV2.

At the end of the paper, the authors add a wry remark: "Although, our models are state-of-the art within the regime of efficient architectures, the accuracy lags large models ConvNeXt and Swin Transformer". My only comment: look at the figure above.