NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training
2022-07-03 03:00:00 【Zhiyuan community】
Paper title: NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training
Paper link: https://openreview.net/forum?id=Qaw16njk6L
Designing accurate and efficient vision Transformers (ViTs) is an important but challenging task. Supernet-based one-shot neural architecture search (NAS) enables fast architecture optimization and has achieved state-of-the-art (SOTA) results on convolutional neural networks (CNNs). However, directly applying supernet-based NAS to optimize ViTs leads to poor performance, even worse than training individual ViTs. In this work, we observe that the poor performance is due to a gradient conflict issue: the gradients of different sub-networks conflict with that of the supernet more severely in ViTs than in CNNs, which leads to early saturation in training and inferior convergence. To alleviate this issue, we propose a series of techniques, including a gradient projection algorithm, a switchable layer scaling design, and a simplified data augmentation and regularization training recipe. The proposed techniques significantly improve the convergence and performance of all sub-networks. Our discovered hybrid ViT model family, dubbed NASViT, achieves top-1 accuracy from 78.2% to 81.8% on ImageNet from 200M to 800M FLOPs, and outperforms all prior-art CNNs and ViTs, including AlphaNet and LeViT. When transferred to semantic segmentation tasks, NASViT also outperforms previous backbones on both the Cityscapes and ADE20K datasets, achieving 73.2% and 37.9% mIoU with only 5G FLOPs.
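
The abstract names a gradient projection algorithm but does not spell it out. As a rough illustration of the general idea only, the sketch below removes the conflicting component of one gradient with respect to another whenever their dot product is negative. The function names (`dot`, `project_out_conflict`), the flat-list gradient representation, and the choice of which gradient gets projected are all assumptions made for illustration, not the paper's exact procedure.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two flattened gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out_conflict(g: Sequence[float], ref: Sequence[float]) -> List[float]:
    """Hypothetical helper: if g conflicts with ref (negative inner product),
    subtract the component of g that lies along ref; otherwise return g unchanged."""
    d = dot(g, ref)
    if d >= 0.0:
        return list(g)  # no conflict, nothing to remove
    ref_sq = dot(ref, ref)  # strictly positive here, since d < 0 implies ref != 0
    scale = d / ref_sq
    return [gi - scale * ri for gi, ri in zip(g, ref)]


# Toy example: the two gradients disagree along the second coordinate.
supernet_grad = [1.0, -1.0]
subnet_grad = [0.0, 1.0]
adjusted = project_out_conflict(supernet_grad, subnet_grad)
print(adjusted)
```

After the projection, the adjusted gradient is orthogonal to the reference gradient, so a step along it no longer works against the reference direction. In a real training loop the same operation would be applied to flattened parameter gradients rather than two-element lists.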
