NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict-Aware Supernet Training
2022-07-03 03:00:00 【Zhiyuan community】
Paper title: NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training
Paper link: https://openreview.net/forum?id=Qaw16njk6L
Designing accurate and efficient vision Transformers (ViTs) is an important but challenging task. Supernet-based one-shot neural architecture search (NAS) enables fast architecture optimization and has achieved state-of-the-art (SOTA) results on convolutional neural networks (CNNs). However, directly applying supernet-based NAS to optimize ViTs leads to poor performance, even worse than training a single ViT on its own. In this work, the authors observe that the poor performance is due to a gradient conflict problem: the gradients of different sub-networks conflict with the gradient of the supernet more severely in ViTs than in CNNs, which leads to early saturation in training and poor convergence. To alleviate this problem, they propose a series of techniques, including a gradient projection algorithm, a switchable layer scaling design, and a simplified data augmentation and regularization training recipe. The proposed techniques significantly improve the convergence and performance of all sub-networks. The discovered family of hybrid ViT models, called NASViT, achieves top-1 accuracy from 78.2% to 81.8% on ImageNet at 200M to 800M FLOPs, and outperforms all prior CNNs and ViTs, including AlphaNet and LeViT. When transferred to semantic segmentation tasks, NASViT also outperforms previous backbone networks on the Cityscape and ADE20K datasets, achieving 73.2% and 37.9% mIoU respectively with only 5G FLOPs.
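The abstract names a gradient projection algorithm but does not spell out its details, so the following is only a minimal sketch of the general idea, assuming a PCGrad-style rule: when a sub-network gradient points against the supernet gradient (negative inner product), the opposing component is removed before the two are combined. The function names `dot` and `project_conflicting` and the toy 2-D gradients are hypothetical illustrations, not taken from the paper or its code.

```python
def dot(u, v):
    """Inner product of two equal-length gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_conflicting(g_sub, g_super):
    """Return g_sub with its component opposing g_super removed.

    Assumption: a conflict is defined as a negative inner product.
    If there is no conflict, g_sub is returned unchanged.
    """
    inner = dot(g_sub, g_super)
    norm_sq = dot(g_super, g_super)
    if inner >= 0 or norm_sq == 0:
        return list(g_sub)
    scale = inner / norm_sq
    # Subtract the projection of g_sub onto g_super.
    return [a - scale * b for a, b in zip(g_sub, g_super)]


# Toy example: the sub-network gradient opposes the supernet along axis 0.
g_super = [1.0, 0.0]
g_sub = [-1.0, 1.0]

g_proj = project_conflicting(g_sub, g_super)
combined = [a + b for a, b in zip(g_super, g_proj)]
```

In this toy case the projected sub-network gradient becomes orthogonal to the supernet gradient, so adding the two no longer cancels the supernet's update direction. In a real training loop the vectors would be the flattened parameter gradients of the supernet and of each sampled sub-network.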