[2022 freshmen learning] key points of the third week
2022-07-29 05:03:00 【AI frontier theory group @ouc】
1、Batch Normalization
Points to watch when using BN:
- During training, the mean and variance are computed from the current batch, whereas at test time the historical (running) statistics should be used instead of the current ones, so the mode has to be set explicitly. In PyTorch this is controlled with the model's model.train() and model.eval() methods (similar to Dropout).
- Set the batch size as large as possible: the larger it is, the closer the batch mean and variance are to the true distribution of the whole dataset (subject to what your hardware allows).
- Place the BN layer between the convolution layer (Conv) and the activation layer (for example ReLU), and do not use a bias in the convolution layer, since BN's mean subtraction cancels any constant bias.
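The points above can be illustrated with a minimal PyTorch sketch. The layer sizes, batch shape, and variable names are assumptions chosen for illustration only; the sketch shows a Conv, BN, ReLU block with the convolution bias disabled, and checks that running statistics are updated in train mode but frozen in eval mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Conv -> BN -> ReLU; the conv bias is disabled because BN follows it.
# Channel counts and kernel size are illustrative assumptions.
block = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(8),
    nn.ReLU(inplace=True),
)
bn = block[1]
x = torch.randn(16, 3, 32, 32)  # hypothetical batch of 16 RGB images

# Training mode: BN normalizes with the current batch statistics
# and updates its running mean/variance.
block.train()
before = bn.running_mean.clone()
_ = block(x)
updated_in_train = not torch.equal(before, bn.running_mean)

# Evaluation mode: BN uses the stored running statistics and leaves them untouched,
# so the output for a sample does not depend on the rest of the batch.
block.eval()
frozen = bn.running_mean.clone()
with torch.no_grad():
    out_single = block(x[:1])    # the first sample on its own
    out_in_batch = block(x)[:1]  # the same sample as part of a batch
unchanged_in_eval = torch.equal(frozen, bn.running_mean)
batch_independent = torch.allclose(out_single, out_in_batch, atol=1e-4)

print(updated_in_train, unchanged_in_eval, batch_independent)
```

Forgetting to call model.eval() before testing would make the outputs depend on whichever samples happen to share a batch, which is the pitfall the first point warns about.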
2、Group Convolution

Group convolution splits the input feature map into groups and convolves each group separately. With G groups, the number of parameters drops to 1/G of the original.
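The 1/G reduction can be checked with a little arithmetic. The helper below is a hypothetical function written for this note (not a library API), and the channel counts are assumptions; it counts the weights of a 2D convolution without bias, where each output channel only sees in_channels / groups input channels. This mirrors what the groups argument of torch.nn.Conv2d does.

```python
def conv2d_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a bias-free 2D convolution with a square kernel."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel is connected to only in_channels // groups inputs.
    return out_channels * (in_channels // groups) * kernel_size * kernel_size


standard = conv2d_weight_count(64, 128, 3)           # ordinary convolution
grouped = conv2d_weight_count(64, 128, 3, groups=4)  # G = 4

print(standard, grouped, standard // grouped)
```

With these example sizes the ordinary convolution has 128 × 64 × 9 weights and the grouped one has 128 × 16 × 9, a factor of exactly G = 4 fewer.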
Speed: in theory, group convolution makes the network faster, but in practice it may be less efficient than an ordinary 3×3 convolution. This is because PyTorch specifically optimizes the 3×3 convolution, and grouping breaks that optimization.
Accuracy: grouping maps the features into several subspaces, giving a more comprehensive understanding of the image information. This is similar to Multi-Head Self-Attention in the Transformer, except that the Transformer groups the attention computation (hence "multi-head"), while group convolution groups the convolution.

The Transformer comes from natural language processing. In real language, each word relates to other words in different ways, and different attention heads can capture these different relationships. The figure above shows three attention heads, that is, three subspaces; these relationships can be learned better within subspaces.

AlexNet also contains a classic finding. In the figure above, the first three rows are the filters learned on GPU 1 and the last three rows are the filters learned on GPU 2. One GPU mainly learns texture and gradient information while the other mainly learns color information, which can be understood as different subspaces.
3、Res2Net

This work comes from Ming-Ming Cheng's group at Nankai University and is a neat combination of feature grouping and multi-scale processing. Two experiments in the paper examine feature grouping. They show that as the scale increases, accuracy improves and speed decreases. Moreover, with more than 4 groups the improvement over 4 groups is not very noticeable. So more feature groups are not always better: increasing the number of groups increases the computational cost, and a balance is needed.

