Learning PyTorch: Normalization Layers (BatchNorm, LayerNorm, InstanceNorm, GroupNorm) [Easy to Understand]
2022-06-29 10:44:00 【全栈程序员站长】
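How BN, LN, IN, and GN differ, for an input of shape (N, C, H, W):

BatchNorm: normalizes along the batch direction, computing the mean over N, H, and W for each channel. Its main drawback is sensitivity to batch size: the mean and variance are computed on one batch at a time, so if the batch is too small, these statistics do not represent the overall data distribution and the layer performs poorly.

LayerNorm: normalizes along the channel direction, computing the mean over C, H, and W for each sample. It is mainly effective for RNNs.

InstanceNorm: normalizes within a single channel of a single sample, computing the mean over H*W. It is used in style transfer: the generated result depends mainly on one image instance, so normalizing over the whole batch is unsuitable and the statistics are taken over H and W only. This can speed up convergence and keeps the image instances independent of each other.

GroupNorm: splits the channels into groups and normalizes within each group, computing the mean over (C//G), H, and W. The result does not depend on the batch size and is not constrained by it.

SwitchableNorm: combines BN, LN, and IN with learned weights, letting the network learn which normalization method each layer should use.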
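As a rough illustration of the axes listed above, the following sketch computes only the mean that each method would use, on a random tensor. The sizes N, C, H, W and the group count G are arbitrary values chosen for the example, not values from the article.

```python
import torch

torch.manual_seed(0)
N, C, H, W, G = 4, 6, 5, 5, 3  # illustrative sizes (assumption)
x = torch.randn(N, C, H, W)

# BatchNorm: one mean per channel, averaged over (N, H, W).
bn_mean = x.mean(dim=(0, 2, 3))
# LayerNorm: one mean per sample, averaged over (C, H, W).
ln_mean = x.mean(dim=(1, 2, 3))
# InstanceNorm: one mean per sample and channel, averaged over (H, W).
in_mean = x.mean(dim=(2, 3))
# GroupNorm: one mean per sample and group, averaged over (C // G, H, W).
gn_mean = x.view(N, G, C // G, H, W).mean(dim=(2, 3, 4))

print(bn_mean.shape, ln_mean.shape, in_mean.shape, gn_mean.shape)
```

Only BatchNorm mixes values from different samples, which is why it is the one method here that depends on the batch size.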
1 BatchNorm
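torch.nn.BatchNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
torch.nn.BatchNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
Parameters:
num_features: number of features (channels) in the expected input, which has size 'batch_size x num_features [x width]'.
eps: value added to the denominator for numerical stability (so that it cannot approach or equal 0). Default: 1e-5.
momentum: momentum used for the running mean and running variance. Default: 0.1.
affine: boolean; when set to True, the layer has learnable affine parameters.
track_running_stats: boolean; when set to True, the layer tracks the running mean and variance during training.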
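Formula:
$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$
Here E[x] and Var[x] are computed per channel over the batch and spatial dimensions, and γ and β are the learnable affine parameters.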
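A minimal sketch, assuming a random input of shape (8, 3, 4, 4), that compares nn.BatchNorm2d in training mode against a manual normalization (x - mean) / sqrt(var + eps) with per-channel statistics. It also checks how momentum updates running_mean after one step. The variable names are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 3, 4, 4)  # (N, C, H, W), illustrative sizes

bn = nn.BatchNorm2d(num_features=3)
bn.train()
y = bn(x)

# Per-channel statistics over (N, H, W); the variance is the biased one.
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = ((x - mean) ** 2).mean(dim=(0, 2, 3), keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + bn.eps)  # weight=1, bias=0 at init
matches_manual = torch.allclose(y, y_manual, atol=1e-5)

# running_mean starts at 0, so after one step it equals momentum * batch mean.
expected_running_mean = bn.momentum * mean.flatten()
running_mean_ok = torch.allclose(bn.running_mean, expected_running_mean, atol=1e-6)

# In eval mode the layer normalizes with the running statistics instead.
bn.eval()
y_eval = bn(x)
print(matches_manual, running_mean_ok, y_eval.shape)
```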
2 GroupNorm
torch.nn.GroupNorm(num_groups, num_channels, eps=1e-05, affine=True)
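Parameters:
num_groups: number of groups to divide the channels into.
num_channels: number of channels expected in the input, which has size 'batch_size x num_channels [x ...]'; it must be divisible by num_groups.
eps: value added to the denominator for numerical stability (so that it cannot approach or equal 0). Default: 1e-5.
affine: boolean; when set to True, the layer has learnable affine parameters.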
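Formula:
$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$
Here E[x] and Var[x] are computed separately for each sample over each group of num_channels / num_groups channels, and γ and β are per-channel learnable affine parameters.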
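The sketch below, under assumed sizes (N=2, C=6, G=3), reproduces nn.GroupNorm by hand and checks two properties: the output for a sample does not change with the batch size, and the extreme group counts coincide with LayerNorm (one group) and InstanceNorm (one group per channel) when those layers carry no affine parameters.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, C, H, W, G = 2, 6, 4, 4, 3  # illustrative sizes (assumption)
x = torch.randn(N, C, H, W)

gn = nn.GroupNorm(num_groups=G, num_channels=C)
y = gn(x)

# Manual version: flatten each group of C // G channels and normalize it.
xg = x.view(N, G, -1)
mean = xg.mean(dim=2, keepdim=True)
var = ((xg - mean) ** 2).mean(dim=2, keepdim=True)
y_manual = ((xg - mean) / torch.sqrt(var + gn.eps)).view(N, C, H, W)
matches_manual = torch.allclose(y, y_manual, atol=1e-5)

# Feeding a single sample gives the same result as within the full batch.
batch_independent = torch.allclose(gn(x[:1]), y[:1], atol=1e-6)

# One group behaves like LayerNorm over (C, H, W); C groups like InstanceNorm.
same_as_ln = torch.allclose(
    nn.GroupNorm(1, C)(x),
    nn.LayerNorm([C, H, W], elementwise_affine=False)(x), atol=1e-5)
same_as_in = torch.allclose(
    nn.GroupNorm(C, C)(x), nn.InstanceNorm2d(C)(x), atol=1e-5)
print(matches_manual, batch_independent, same_as_ln, same_as_in)
```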
3 InstanceNorm
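torch.nn.InstanceNorm1d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
torch.nn.InstanceNorm2d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
torch.nn.InstanceNorm3d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
Parameters:
num_features: number of features (channels) in the expected input, which has size 'batch_size x num_features [x width]'.
eps: value added to the denominator for numerical stability (so that it cannot approach or equal 0). Default: 1e-5.
momentum: momentum used for the running mean and running variance. Default: 0.1.
affine: boolean; when set to True, the layer has learnable affine parameters. Default: False.
track_running_stats: boolean; when set to True, the layer tracks the running mean and variance during training. Default: False.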
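Formula:
$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$
Here E[x] and Var[x] are computed separately for each channel of each sample, and γ and β are used only when affine is True.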
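A small sketch, assuming a random input of shape (2, 3, 4, 4), that checks nn.InstanceNorm2d against a manual per-sample, per-channel normalization. It also shows what the default affine=False and track_running_stats=False mean in practice: the layer has no learnable parameters and keeps no running statistics.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 3, 4, 4)  # (N, C, H, W), illustrative sizes

inorm = nn.InstanceNorm2d(num_features=3)
y = inorm(x)

# Statistics over (H, W) only, separately for every sample and channel.
mean = x.mean(dim=(2, 3), keepdim=True)
var = ((x - mean) ** 2).mean(dim=(2, 3), keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + inorm.eps)
matches_manual = torch.allclose(y, y_manual, atol=1e-5)

# With the defaults there is nothing to learn and nothing to track.
num_params = sum(p.numel() for p in inorm.parameters())
has_running_stats = inorm.running_mean is not None
print(matches_manual, num_params, has_running_stats)
```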
4 LayerNorm
torch.nn.LayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True)
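Parameters:
normalized_shape: shape of the trailing input dimensions to normalize over; the expected input has size [* x normalized_shape[0] x normalized_shape[1] x ... x normalized_shape[-1]].
eps: value added to the denominator for numerical stability (so that it cannot approach or equal 0). Default: 1e-5.
elementwise_affine: boolean; when set to True, the layer has learnable affine parameters.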
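Formula:
$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$
Here E[x] and Var[x] are computed over the last dimensions given by normalized_shape, and γ and β have the shape normalized_shape when elementwise_affine is True.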
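To make normalized_shape concrete, here is a sketch with two assumed inputs: a sequence tensor of shape (batch, seq_len, embed_dim), normalized over the last dimension, and an image tensor normalized over (C, H, W). The sizes are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Sequence input: normalize each token vector over the embedding dimension.
x = torch.randn(2, 5, 8)  # (batch, seq_len, embed_dim), illustrative sizes
ln = nn.LayerNorm(8)
y = ln(x)

mean = x.mean(dim=-1, keepdim=True)
var = ((x - mean) ** 2).mean(dim=-1, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + ln.eps)  # weight=1, bias=0 at init
matches_manual = torch.allclose(y, y_manual, atol=1e-5)

# Image input: normalized_shape covers the trailing (C, H, W) dimensions,
# and the elementwise affine parameters have that same shape.
img = torch.randn(2, 3, 4, 4)
ln_img = nn.LayerNorm([3, 4, 4])
y_img = ln_img(img)
weight_shape = tuple(ln_img.weight.shape)
print(matches_manual, weight_shape)
```

Unlike BatchNorm, whose affine parameters are one scalar pair per channel, the LayerNorm weight and bias here hold one value per normalized element.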
5 LocalResponseNorm
torch.nn.LocalResponseNorm(size, alpha=0.0001, beta=0.75, k=1.0)
Parameters:
size: number of neighbouring channels used for normalization.
alpha: multiplicative factor. Default: 0.0001.
beta: exponent. Default: 0.75.
k: additive factor. Default: 1.
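Formula:
$$b_c = a_c \left(k + \frac{\alpha}{n} \sum_{c'=\max(0,\, c-n/2)}^{\min(N-1,\, c+n/2)} a_{c'}^2\right)^{-\beta}$$
Here a_c is the input at channel c, b_c is the output, n is size, and N is the number of channels.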
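A sketch of local response normalization written as an explicit loop over channels, compared with nn.LocalResponseNorm. It assumes an odd size (3 here), so the window is centred on each channel and clipped at the edges; the input shape is arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 5, 4, 4)  # (N, C, H, W), illustrative sizes
size, alpha, beta, k = 3, 1e-4, 0.75, 1.0

lrn = nn.LocalResponseNorm(size, alpha=alpha, beta=beta, k=k)
y = lrn(x)

# Manual version: for each channel, sum the squares of its neighbours.
sq = x ** 2
C = x.shape[1]
y_manual = torch.empty_like(x)
for c in range(C):
    lo = max(0, c - size // 2)       # clip the window at the first channel
    hi = min(C - 1, c + size // 2)   # and at the last channel
    window_sum = sq[:, lo:hi + 1].sum(dim=1)
    y_manual[:, c] = x[:, c] / (k + alpha / size * window_sum) ** beta

matches_manual = torch.allclose(y, y_manual, atol=1e-6)
print(matches_manual)
```

Unlike the four layers above, this layer subtracts no mean and has no learnable parameters; it only rescales each value by the energy of nearby channels.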
Reference: Summary of BatchNormalization, LayerNormalization, InstanceNorm, GroupNorm, and SwitchableNorm
Publisher: 全栈程序员栈长. Please credit the source when reposting: https://javaforall.cn/132340.html Original link: https://javaforall.cn