NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict-Aware Supernet Training

2022-07-03 03:00:00 【Zhiyuan community】

Paper title: NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training

Paper link: https://openreview.net/forum?id=Qaw16njk6L

Designing accurate and efficient vision Transformers (ViTs) is an important but challenging task. Supernet-based one-shot neural architecture search (NAS) enables fast architecture optimization and has achieved state-of-the-art (SOTA) results on convolutional neural networks (CNNs). However, directly applying supernet-based NAS to optimize ViTs leads to poor performance, even worse than training a single ViT on its own. In this work, we observe that the poor performance is caused by a gradient conflict problem: the gradients of different sub-networks conflict with the gradient of the supernet more severely in ViTs than in CNNs, which leads to early saturation in training and poor convergence. To alleviate the problem, we propose a series of techniques, including a gradient projection algorithm, a switchable layer scaling design, and a simplified data augmentation and regularization training recipe. The proposed techniques significantly improve the convergence and performance of all sub-networks. The hybrid ViT model family we discover, called NASViT, achieves top-1 accuracy from 78.2% to 81.8% on ImageNet at 200M to 800M FLOPs, and outperforms all prior CNNs and ViTs, including AlphaNet and LeViT. When transferred to semantic segmentation tasks, NASViT also outperforms previous backbone networks on the Cityscapes and ADE20K datasets, achieving 73.2% and 37.9% mIoU respectively with only 5G FLOPs.
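
The abstract names a gradient projection algorithm but does not spell out its form. As a rough illustration only, the sketch below assumes a PCGrad-style rule: when a sub-network gradient has a negative dot product with the supernet gradient, the component pointing against the supernet gradient is removed. The function names `dot` and `project_if_conflicting`, the `eps` threshold, and the flat-list gradient representation are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally long flat gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_if_conflicting(
    g_sub: Sequence[float],
    g_super: Sequence[float],
    eps: float = 1e-12,  # hypothetical guard against a near-zero supernet gradient
) -> List[float]:
    """Return g_sub with its conflicting component removed (assumed rule).

    A conflict is taken to mean dot(g_sub, g_super) < 0. In that case
    g_sub is projected onto the plane orthogonal to g_super; otherwise
    it is returned unchanged.
    """
    d = dot(g_sub, g_super)
    if d >= 0.0:
        return list(g_sub)  # no conflict: keep the gradient as is
    norm_sq = dot(g_super, g_super)
    if norm_sq < eps:
        return list(g_sub)  # supernet gradient too small to define a direction
    scale = d / norm_sq
    # Subtract the projection of g_sub onto g_super.
    return [gs - scale * gp for gs, gp in zip(g_sub, g_super)]


if __name__ == "__main__":
    supernet_grad = [1.0, 0.0]
    subnet_grad = [-1.0, 1.0]  # points partly against the supernet gradient
    fixed = project_if_conflicting(subnet_grad, supernet_grad)
    print(fixed, dot(fixed, supernet_grad))
```

After projection the sub-network gradient no longer opposes the supernet gradient, so summing the two cannot cancel progress along the supernet direction. Whether the paper projects in this direction, and at what granularity (per layer or over the whole parameter vector), is not stated in the abstract.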

Copyright notice
This article was written by [Zhiyuan community]. Please include a link to the original when reposting.
https://yzsam.com/2022/02/202202151018356912.html
