Data Augmentation with Mixup: Principle and Code Walkthrough
2022-08-05 02:32:00 【00000cj】
Paper: mixup: Beyond Empirical Risk Minimization
Existing problems
- Empirical Risk Minimization (ERM) allows large neural networks to memorize the training data outright (rather than learning and generalizing from it). This problem persists even with strong regularization, and even in classification problems where the labels are assigned at random.
- Networks trained with the ERM principle can change their predictions drastically when evaluated on data just outside the training distribution; such inputs are known as adversarial examples.
One solution to this problem is Vicinal Risk Minimization (VRM): construct additional samples from the original ones via data augmentation. However, data augmentation requires human knowledge to describe the vicinity of each sample in the training data, e.g. flipping or scaling. VRM therefore has two shortcomings:
- The augmentation process is dataset-dependent and thus requires expert knowledge.
- Data augmentation only models vicinal relations between samples of the same class.
Mixup
To address the problems above, this paper proposes mixup, a data-agnostic data augmentation method:

\(\widetilde{x}=\lambda x_{i}+(1-\lambda)x_{j}\)

\(\widetilde{y}=\lambda y_{i}+(1-\lambda)y_{j}\)

where \(x_{i},x_{j}\) are two images randomly sampled from the training set and \(y_{i},y_{j}\) are the corresponding one-hot labels. The prior knowledge exploited here is that a linear interpolation of feature vectors should correspond to the same linear interpolation of their targets, so \((\widetilde{x},\widetilde{y})\) is taken as a new training sample. \(\lambda\) is drawn from a \(Beta(\alpha,\alpha)\) distribution, where \(\alpha\) is a hyperparameter.
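As a minimal sketch of the formulas above (PyTorch, with illustrative names; not code from the paper):

import torch

def mixup_pair(x_i, x_j, y_i, y_j, alpha=1.0):
    # Sample the mixing ratio lambda from Beta(alpha, alpha)
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_tilde = lam * x_i + (1 - lam) * x_j  # interpolate the inputs
    y_tilde = lam * y_i + (1 - lam) * y_j  # interpolate the one-hot targets
    return x_tilde, y_tilde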
In addition, the authors mention several conclusions obtained through experiments:
- Combining three or more samples brings no further accuracy improvement; on the contrary, it increases the computational cost.
- Their implementation obtains a single batch through one data loader, then applies mixup within that batch after a random shuffle. The authors found this strategy works equally well while reducing I/O overhead.
- Applying mixup only to pairs of samples from the same class brings no accuracy improvement.
Implementation
torchvision version
Here the roll method shifts the images within the batch by one position, and the rolled batch is then mixed with the original batch, so that each image in the batch is mixed with its neighbor; see the PyTorch documentation of torch.Tensor.roll for details.
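A quick illustration of what roll(1, 0) does along the batch dimension (a standalone toy example):

import torch

x = torch.arange(4)   # stands in for a batch of 4 images: tensor([0, 1, 2, 3])
print(x.roll(1, 0))   # tensor([3, 0, 1, 2]): each element is now paired with its predecessor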
from typing import Tuple

import torch
from torch import Tensor


class RandomMixup(torch.nn.Module):
    """Randomly apply Mixup to the provided batch and targets.
    The class implements the data augmentations as described in the paper
    `"mixup: Beyond Empirical Risk Minimization" <https://arxiv.org/abs/1710.09412>`_.

    Args:
        num_classes (int): number of classes used for one-hot encoding.
        p (float): probability of the batch being transformed. Default value is 0.5.
        alpha (float): hyperparameter of the Beta distribution used for mixup.
            Default value is 1.0.
        inplace (bool): boolean to make this transform inplace. Default set to False.
    """

    def __init__(self, num_classes: int, p: float = 0.5, alpha: float = 1.0, inplace: bool = False) -> None:
        super().__init__()
        if num_classes < 1:
            raise ValueError(
                f"Please provide a valid positive value for the num_classes. Got num_classes={num_classes}"
            )
        if alpha <= 0:
            raise ValueError("Alpha param can't be zero.")

        self.num_classes = num_classes
        self.p = p
        self.alpha = alpha
        self.inplace = inplace

    def forward(self, batch: Tensor, target: Tensor) -> Tuple[Tensor, Tensor]:
        """
        Args:
            batch (Tensor): Float tensor of size (B, C, H, W)
            target (Tensor): Integer tensor of size (B, )

        Returns:
            Tensor: Randomly transformed batch.
        """
        if batch.ndim != 4:
            raise ValueError(f"Batch ndim should be 4. Got {batch.ndim}")
        if target.ndim != 1:
            raise ValueError(f"Target ndim should be 1. Got {target.ndim}")
        if not batch.is_floating_point():
            raise TypeError(f"Batch dtype should be a float tensor. Got {batch.dtype}.")
        if target.dtype != torch.int64:
            raise TypeError(f"Target dtype should be torch.int64. Got {target.dtype}")

        if not self.inplace:
            batch = batch.clone()
            target = target.clone()

        # Convert integer labels to one-hot so they can be linearly interpolated
        if target.ndim == 1:
            target = torch.nn.functional.one_hot(target, num_classes=self.num_classes).to(dtype=batch.dtype)

        # With probability 1 - p, skip mixup and return the (one-hot encoded) batch
        if torch.rand(1).item() >= self.p:
            return batch, target

        # It's faster to roll the batch by one instead of shuffling it to create image pairs
        batch_rolled = batch.roll(1, 0)
        target_rolled = target.roll(1, 0)

        # Implemented as on mixup paper, page 3.
        lambda_param = float(torch._sample_dirichlet(torch.tensor([self.alpha, self.alpha]))[0])
        batch_rolled.mul_(1.0 - lambda_param)
        batch.mul_(lambda_param).add_(batch_rolled)

        target_rolled.mul_(1.0 - lambda_param)
        target.mul_(lambda_param).add_(target_rolled)

        return batch, target

    def __repr__(self) -> str:
        s = (
            f"{self.__class__.__name__}("
            f"num_classes={self.num_classes}"
            f", p={self.p}"
            f", alpha={self.alpha}"
            f", inplace={self.inplace}"
            f")"
        )
        return s
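Since RandomMixup operates on a whole batch, a natural place to apply it is the collate_fn of a DataLoader. A usage sketch (the toy dataset and all names below are illustrative, not from the original post):

import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.dataloader import default_collate

# Toy stand-ins for real images and integer labels
dataset = TensorDataset(torch.randn(100, 3, 32, 32), torch.randint(0, 10, (100,)))
mixup = RandomMixup(num_classes=10, p=0.5, alpha=1.0)

def collate_fn(samples):
    # Collate into a batch first, then mix across the batch dimension
    return mixup(*default_collate(samples))

loader = DataLoader(dataset, batch_size=8, collate_fn=collate_fn)
images, targets = next(iter(loader))  # targets are now soft labels of shape (B, num_classes)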
mmclassification version
Here randperm is used to shuffle the images within the batch, which are then mixed with the original batch; the way \(\lambda\) is obtained also differs from torchvision (np.random.beta instead of torch._sample_dirichlet).
import numpy as np
import torch

# BaseMixupLayer and one_hot_encoding are defined elsewhere in mmclassification
# (mmcls.models.utils.augment) and are assumed to be importable here.


class BatchMixupLayer(BaseMixupLayer):
    r"""Mixup layer for a batch of data.

    Mixup is a method to reduce the memorization of corrupt labels and
    increase the robustness to adversarial examples. It's proposed in
    `mixup: Beyond Empirical Risk Minimization
    <https://arxiv.org/abs/1710.09412>`_

    This method simply linearly mixes pairs of data and their labels.

    Args:
        alpha (float): Parameters for Beta distribution to generate the
            mixing ratio. It should be a positive number. More details
            are in the note.
        num_classes (int): The number of classes.
        prob (float): The probability to execute mixup. It should be in
            range [0, 1]. Defaults to 1.0.

    Note:
        The :math:`\alpha` (``alpha``) determines a random distribution
        :math:`Beta(\alpha, \alpha)`. For each batch of data, we sample
        a mixing ratio (marked as :math:`\lambda`, ``lam``) from the random
        distribution.
    """

    def __init__(self, *args, **kwargs):
        super(BatchMixupLayer, self).__init__(*args, **kwargs)

    def mixup(self, img, gt_label):
        one_hot_gt_label = one_hot_encoding(gt_label, self.num_classes)
        # One mixing ratio per batch, sampled from Beta(alpha, alpha) via numpy
        lam = np.random.beta(self.alpha, self.alpha)
        batch_size = img.size(0)
        # A random permutation pairs each image with another random image in the batch
        index = torch.randperm(batch_size)

        mixed_img = lam * img + (1 - lam) * img[index, :]
        mixed_gt_label = lam * one_hot_gt_label + (
            1 - lam) * one_hot_gt_label[index, :]

        return mixed_img, mixed_gt_label

    def __call__(self, img, gt_label):
        return self.mixup(img, gt_label)
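A usage sketch (treat the constructor arguments as an assumption about mmclassification's BaseMixupLayer; check the installed version):

mixup_layer = BatchMixupLayer(alpha=1.0, num_classes=10, prob=1.0)
mixed_img, mixed_gt = mixup_layer(img, gt_label)  # img: (B, C, H, W), gt_label: (B,)

Note the pairing difference: roll always mixes each image with a fixed neighbor, while randperm pairs images uniformly at random; both versions sample a single \(\lambda\) per batch.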
Mixup in object detection
In the paper Bag of Freebies for Training Object Detection Neural Networks, mixing two images simply merges all gt boxes of the two images, and no mixup is applied to the category labels. However, the paper mentions that "weighted loss indicates the overall loss is the summation of multiple objects with ratio 0 to 1 according to image blending ratio they belong to in the original training images", i.e. when computing the loss, each object's loss is weighted by the mixup coefficient of the image it came from and then summed.
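A minimal sketch of detection-style mixup under these rules (geometry-preserving blend onto a shared canvas, concatenated boxes, per-object loss weights; all names here are illustrative assumptions, not code from the referenced paper):

import torch

def detection_mixup(img1, boxes1, img2, boxes2, lam):
    """Blend two images without resizing them and merge their gt boxes.

    img*: (C, H, W) float tensors; boxes*: (N, 4) xyxy tensors; lam in (0, 1).
    """
    c = img1.size(0)
    h = max(img1.size(1), img2.size(1))
    w = max(img1.size(2), img2.size(2))
    # Paste both images at the top-left of a shared canvas so box
    # coordinates stay valid, then blend with the mixup ratio.
    mixed = img1.new_zeros((c, h, w))
    mixed[:, :img1.size(1), :img1.size(2)] += lam * img1
    mixed[:, :img2.size(1), :img2.size(2)] += (1 - lam) * img2
    # Keep all gt boxes from both images (labels are not mixed) ...
    boxes = torch.cat([boxes1, boxes2], dim=0)
    # ... but record each object's blending ratio for loss weighting.
    weights = torch.cat([torch.full((boxes1.size(0),), lam),
                         torch.full((boxes2.size(0),), 1 - lam)])
    return mixed, boxes, weights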