[2022 Freshman Study] Key Points of Week 3
2022-07-29 05:03:00 【AI frontier theory group @ouc】
1、Batch Normalization
Points to note when using BN:
- During training, the mean and variance are computed from the current batch, but at test time the running statistics accumulated during training should be used instead of the current batch's statistics, so the mode has to be set explicitly. In PyTorch you switch modes with model.train() and model.eval() (the same applies to Dropout).
- Set the batch size as large as your hardware allows. The larger the batch, the closer its mean and variance are to the true distribution of the whole dataset.
- Place the BN layer between the convolution layer (Conv) and the activation layer (for example, ReLU), and do not use a bias in the convolution layer: BN subtracts the per-channel mean, which cancels any constant bias, and BN has its own learnable shift.
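The first and third points can be checked with a small pure-Python sketch. It is an illustration only, not PyTorch's implementation: the class name `TinyBatchNorm`, the single scalar feature, the momentum of 0.1, and the omission of the learnable scale and shift are all assumptions made to keep the example short.

```python
import math


class TinyBatchNorm:
    """Batch normalization for one scalar feature (hypothetical, illustrative only)."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0
        self.training = True

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, batch):
        if self.training:
            # Training: normalize with the statistics of the current batch...
            n = len(batch)
            mean = sum(batch) / n
            var = sum((x - mean) ** 2 for x in batch) / n
            # ...and fold them into the running (historical) statistics.
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Evaluation: use the running statistics; nothing is updated.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


batch = [1.0, 2.0, 3.0, 4.0]

bn = TinyBatchNorm()
out_train = bn(batch)      # batch statistics: mean 2.5, variance 1.25
bn.eval()
out_eval = bn([10.0])      # a single sample is fine in eval mode

# A constant bias added before BN is removed by the mean subtraction.
without_bias = TinyBatchNorm()(batch)
with_bias = TinyBatchNorm()([x + 5.0 for x in batch])
bias_cancelled = all(abs(p - q) < 1e-9 for p, q in zip(without_bias, with_bias))
```

In evaluation mode the output depends only on the stored running statistics, which is why a forgotten model.eval() call changes test results.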
2、Group Convolution

Group convolution splits the input feature map into groups along the channel dimension and convolves each group separately. With G groups, the number of convolution weights drops to 1/G of the original.
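The 1/G reduction follows from each output channel only seeing C_in/G input channels. A minimal sketch of the arithmetic, assuming a square kernel and ignoring the bias term (the function name `conv_weight_count` and the channel sizes are illustrative, not from the original notes):

```python
def conv_weight_count(c_in, c_out, kernel_size, groups=1):
    """Number of weights in a 2D convolution with a square kernel (bias excluded)."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("both channel counts must be divisible by groups")
    # Each output channel is connected to only c_in / groups input channels.
    return c_out * (c_in // groups) * kernel_size * kernel_size


standard = conv_weight_count(64, 128, 3)            # 128 * 64 * 9
grouped = conv_weight_count(64, 128, 3, groups=4)   # 128 * 16 * 9
print(standard, grouped, standard // grouped)       # prints "73728 18432 4"
```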
Speed: In theory, group convolution makes the network faster, but in practice it may be less efficient than a standard 3×3 convolution. PyTorch is specifically optimized for standard 3×3 convolution, and grouping breaks this optimization.
Accuracy: Grouping maps the features into several subspaces, which gives a more comprehensive view of the image information. This is similar to Multi-Head Self-Attention in the Transformer. The difference is that the Transformer groups the attention computation, hence the name "multi-head", while group convolution groups the convolution.

The Transformer comes from natural language processing. In real language, each word has different relationships with different other words, and separate attention heads can capture these different relationships. The figure above shows three attention heads, that is, three subspaces; these relationships are learned better within subspaces.

AlexNet also made a classic observation. In the figure above, the first three rows are the filters learned on GPU 1 and the last three rows are the filters learned on GPU 2. One GPU mainly learned texture and gradient information, while the other mainly learned color information. These can be understood as different subspaces.
3、Res2Net

Res2Net comes from Ming-Ming Cheng's group at Nankai University and combines feature grouping with multi-scale representation. Two experiments in the paper examine feature grouping. They show that as the scale (the number of feature groups) increases, accuracy improves and speed decreases. Moreover, going beyond 4 groups brings little improvement over 4 groups. So more feature groups are not always better: increasing the number of groups increases the computational cost, and a balance is needed.
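For reference, the way a Res2Net block combines grouping and multiple scales can be written compactly. The notation below follows the Res2Net paper as best recalled rather than these notes: the input features are split into s groups x_1, ..., x_s, each K_i is a 3×3 convolution, and y_i is the output of group i.

```latex
y_i =
\begin{cases}
x_i, & i = 1, \\
K_i(x_i), & i = 2, \\
K_i(x_i + y_{i-1}), & 2 < i \le s.
\end{cases}
```

Each later group receives the output of the previous one, so its receptive field is larger. This gives the multi-scale effect, and it is also why a larger s costs more computation.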

