Learning PyTorch's normalization layers (BatchNorm, LayerNorm, InstanceNorm, GroupNorm)
2022-06-29 13:48:00 【Full stack programmer webmaster】
Hello everyone, nice to meet you again. I'm your friend, Quan Jun.
The differences between BN, LN, IN, and GN, stated formally:

BatchNorm: normalizes along the batch dimension, computing statistics over N×H×W. Its main drawback is sensitivity to batch size: the mean and variance are estimated from a single batch, so if the batch is too small, the statistics do not represent the overall data distribution well and results degrade.

LayerNorm: normalizes along the channel dimension, computing statistics over C×H×W. It is particularly effective for RNNs.

InstanceNorm: normalizes within a single channel, computing statistics over H×W. It is used in style transfer: since the result there depends on an individual image instance, normalizing over the whole batch is unsuitable, so the statistics are taken over H×W instead. This accelerates model convergence and keeps each image instance independent.

GroupNorm: splits the channels into groups and normalizes within each group, computing statistics over (C//G)×H×W. This makes it independent of the batch size.

SwitchableNorm combines BN, LN, and IN with learned weights, letting the network itself learn which normalization method each layer should use.
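The relationships above can be checked directly in PyTorch: a minimal sketch showing that GroupNorm with one group normalizes over C×H×W like LayerNorm, while GroupNorm with one group per channel normalizes over H×W like InstanceNorm (affine parameters disabled so only the normalization itself is compared).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, C, H, W = 4, 6, 8, 8
x = torch.randn(N, C, H, W)

# One group -> statistics over (C, H, W) per sample, same as LayerNorm on [C, H, W]
gn1 = nn.GroupNorm(num_groups=1, num_channels=C, affine=False)
ln = nn.LayerNorm([C, H, W], elementwise_affine=False)
assert torch.allclose(gn1(x), ln(x), atol=1e-5)

# C groups -> statistics over (H, W) per sample and channel, same as InstanceNorm
gnC = nn.GroupNorm(num_groups=C, num_channels=C, affine=False)
inorm = nn.InstanceNorm2d(C, affine=False)
assert torch.allclose(gnC(x), inorm(x), atol=1e-5)
```

Note that none of these depend on the batch dimension, which is exactly why GroupNorm, LayerNorm, and InstanceNorm are unaffected by small batch sizes.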
1 BatchNorm
torch.nn.BatchNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
torch.nn.BatchNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
Parameters :
- num_features: number of features in the expected input; the input has shape (batch_size, num_features[, ...]).
- eps: value added to the denominator for numerical stability (so the denominator never approaches or reaches 0). Default: 1e-5.
- momentum: momentum used for the running mean and variance. Default: 0.1.
- affine: boolean; when True, the layer has learnable affine parameters (γ and β).
- track_running_stats: boolean; when True, running estimates of the mean and variance are kept during training.
Implementation formula:

y = (x − E[x]) / sqrt(Var[x] + eps) * γ + β

where the mean E[x] and variance Var[x] are computed per channel, over the (N, H, W) dimensions.
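The formula can be verified by hand against the module. A minimal sketch (affine=False so γ and β drop out; in training mode BatchNorm uses the current batch's statistics with the biased variance):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 3, 4, 4)           # (N, C, H, W)
bn = nn.BatchNorm2d(3, affine=False)  # pure normalization, no learnable gamma/beta
bn.train()
y = bn(x)

# Manual check: statistics per channel over the (N, H, W) axes
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + bn.eps)
assert torch.allclose(y, y_manual, atol=1e-5)
```

In eval mode the module switches to the running estimates accumulated via momentum, which is why train()/eval() matters for BatchNorm.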
2 GroupNorm
torch.nn.GroupNorm(num_groups, num_channels, eps=1e-05, affine=True)
Parameters :
- num_groups: number of groups to divide the channels into.
- num_channels: number of channels expected in the input.
- eps: value added to the denominator for numerical stability (so the denominator never approaches or reaches 0). Default: 1e-5.
- affine: boolean; when True, the layer has learnable per-channel affine parameters (γ and β).
Implementation formula:

y = (x − E[x]) / sqrt(Var[x] + eps) * γ + β

where the mean and variance are computed per sample, over each group of C//G channels together with the spatial dimensions.
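A minimal sketch of the same check for GroupNorm: reshape the channels into groups, normalize each (C//G, H, W) block per sample by hand, and compare with the module (affine=False so only the normalization is compared).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, C, H, W = 2, 6, 4, 4
G = 3                                  # 6 channels split into 3 groups of 2
x = torch.randn(N, C, H, W)
gn = nn.GroupNorm(G, C, affine=False)
y = gn(x)

# Manual check: statistics per sample, over each group of (C//G, H, W)
xg = x.view(N, G, C // G, H, W)
mean = xg.mean(dim=(2, 3, 4), keepdim=True)
var = xg.var(dim=(2, 3, 4), unbiased=False, keepdim=True)
y_manual = ((xg - mean) / torch.sqrt(var + gn.eps)).view(N, C, H, W)
assert torch.allclose(y, y_manual, atol=1e-5)
```

The batch dimension never enters the statistics, so the result for each sample is the same whatever the batch size.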
3 InstanceNorm
torch.nn.InstanceNorm1d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
torch.nn.InstanceNorm2d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
torch.nn.InstanceNorm3d(num_features, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
Parameters :
- num_features: number of features in the expected input; the input has shape (batch_size, num_features[, ...]).
- eps: value added to the denominator for numerical stability (so the denominator never approaches or reaches 0). Default: 1e-5.
- momentum: momentum used for the running mean and variance. Default: 0.1.
- affine: boolean; when True, the layer has learnable affine parameters (γ and β). Note the default is False here, unlike BatchNorm.
- track_running_stats: boolean; when True, running estimates of the mean and variance are kept during training. Default: False.
Implementation formula:

y = (x − E[x]) / sqrt(Var[x] + eps) * γ + β

where the mean and variance are computed per sample and per channel, over the spatial (H, W) dimensions only.
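A minimal sketch verifying InstanceNorm against the formula: statistics are taken per sample and per channel over H×W only, so each image instance is normalized independently (which is the property style transfer relies on).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 3, 4, 4)    # (N, C, H, W)
inorm = nn.InstanceNorm2d(3)   # affine=False, track_running_stats=False by default
y = inorm(x)

# Manual check: statistics per sample and per channel, over (H, W) only
mean = x.mean(dim=(2, 3), keepdim=True)
var = x.var(dim=(2, 3), unbiased=False, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + inorm.eps)
assert torch.allclose(y, y_manual, atol=1e-5)
```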
4 LayerNorm
torch.nn.LayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True)
Parameters :
- normalized_shape: shape of the trailing input dimensions to normalize over; the expected input has size [* × normalized_shape[0] × normalized_shape[1] × … × normalized_shape[−1]].
- eps: value added to the denominator for numerical stability (so the denominator never approaches or reaches 0). Default: 1e-5.
- elementwise_affine: boolean; when True, the layer has learnable per-element affine parameters (γ and β).
Implementation formula:

y = (x − E[x]) / sqrt(Var[x] + eps) * γ + β

where the mean and variance are computed over the trailing dimensions given by normalized_shape.
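A minimal sketch with a typical RNN/Transformer-style input of shape (batch, seq_len, features), where LayerNorm normalizes each position over its feature vector:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 5, 8)       # (batch, seq_len, features)
ln = nn.LayerNorm(8, elementwise_affine=False)
y = ln(x)

# Manual check: statistics over the last (normalized_shape) dimension only
mean = x.mean(dim=-1, keepdim=True)
var = x.var(dim=-1, unbiased=False, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + ln.eps)
assert torch.allclose(y, y_manual, atol=1e-5)
```

Because the statistics are per position, the result is independent of both the batch size and the sequence length, which is why LayerNorm suits variable-length RNN inputs.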
5 LocalResponseNorm
torch.nn.LocalResponseNorm(size, alpha=0.0001, beta=0.75, k=1.0)
Parameters :
- size: number of neighboring channels used for normalization.
- alpha: multiplicative factor. Default: 0.0001.
- beta: exponent. Default: 0.75.
- k: additive factor. Default: 1.
Implementation formula:

b_c = a_c * ( k + (alpha / n) * Σ a_{c'}² )^(−beta)

where n = size and the sum runs over the n channels nearest to channel c.
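A minimal sketch checking the formula in the simplest case, size=1, where the cross-channel sum reduces to the channel itself and the expression becomes b = a / (k + alpha·a²)^beta (alpha and k set to 1 here purely for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1, 4, 3, 3)    # (N, C, H, W)
lrn = nn.LocalResponseNorm(size=1, alpha=1.0, beta=0.75, k=1.0)
y = lrn(x)

# With size=1, the neighborhood sum over channels is just a_c^2 itself
y_manual = x / (1.0 + 1.0 * x * x) ** 0.75
assert torch.allclose(y, y_manual, atol=1e-5)
```

With larger size the denominator aggregates squared activations from neighboring channels, implementing the lateral-inhibition scheme from AlexNet.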
Reference: "BatchNormalization, LayerNormalization, InstanceNorm, GroupNorm, SwitchableNorm summary"
Publisher: Full Stack Programmer webmaster. For reprints, please cite the source: https://javaforall.cn/132340.html (original text: https://javaforall.cn)