
Lightweight Backbone: VGNetG, a "Why Choose? Take It All" Lightweight Backbone Network

2022-08-04 07:34:00 【AI Vision Network】

Measured latency at 128×128 input: 1060 GPU 8 ms, CPU 11 ms

SkipNet: GPU 5 ms, GPU 18 ms

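Modern efficient convolutional neural networks (CNNs) routinely use depthwise separable convolutions (DSCs) and neural architecture search (NAS) to reduce the parameter count and computational complexity, but they overlook some inherent characteristics of the networks. Inspired by visualizations of feature maps and of N×N (N>1) convolution kernels, the paper introduces several guidelines to further improve parameter efficiency and inference speed.

The parameter-efficient CNN architecture designed from these guidelines is called VGNetG. It achieves better accuracy and lower latency than previous networks while using roughly 30%~50% fewer parameters. VGNetG-1.0MP achieves 67.7% top-1 accuracy on the ImageNet classification dataset with 0.99M parameters, and 69.2% top-1 accuracy with 1.14M parameters.
In addition, the paper demonstrates that edge detectors can replace learnable depthwise convolution layers for mixing features, by substituting fixed edge-detection kernels for the N×N kernels. VGNetF-1.5MP achieves 64.4% (-3.2%) top-1 accuracy, and 66.2% (-1.4%) top-1 accuracy with additional Gaussian kernels.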

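1 Method

The authors mainly study three typical families of networks, grouped by the kind of convolution they are built from:

Standard convolution ==> ResNet-RS,
Group convolution ==> RegNet,
Depthwise separable convolution ==> MobileNet, ShuffleNetV2, and EfficientNets.

These visualizations show that the M×N×N kernels have clearly different patterns and distributions at different stages of a network.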

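1.1 CNNs can learn to satisfy the sampling theorem

Previous work has held that CNNs ignore the classical sampling theorem. The authors find, however, that CNNs can satisfy the sampling theorem to some extent by learning low-pass filters, especially DSC-based networks such as MobileNetV1 and EfficientNets, as shown in Figure 2.

1. Standard convolution / group convolution

As shown in Figures 2a and 2b, among the full set of M×N×N kernels there are one or more salient N×N kernels, such as blur kernels. This also implies that the parameters of these layers are redundant. Note that a salient kernel does not necessarily look like a Gaussian kernel.

2. Depthwise separable convolution

The kernels of strided DSCs usually resemble Gaussian kernels, in networks including but not limited to MobileNetV1, MobileNetV2, MobileNetV3, ShuffleNetV2, ReXNet, and EfficientNets. Moreover, the distribution of strided-DSC kernels is not a single Gaussian but a Gaussian mixture.

3. Kernels of the last convolution layer

Modern CNNs always use a global pooling layer before the classifier to reduce dimensionality. A similar phenomenon therefore also appears in the last depthwise convolution layer, as shown in Figure 4.

These visualizations suggest choosing depthwise convolutions, rather than standard or group convolutions, in the downsampling layers and in the last layer. In addition, fixed Gaussian kernels can be used in the downsampling layers.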
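To make the "fixed Gaussian kernel" idea concrete, here is a minimal pure-Python sketch of how such a kernel can be built. It follows the same recipe as `get_gaussian_kernel1d` and `get_gaussian_kernel2d` in the listing later in this article, but the function names here are illustrative and it uses plain lists instead of tensors. The value sigma=1.1 is the one the listing passes to `GaussianBlur` inside `DownsamplingBlock`.

```python
import math


def gaussian_kernel1d(kernel_size: int, sigma: float) -> list:
    """Sample a Gaussian at evenly spaced offsets centered on 0, then normalize to sum 1."""
    half = (kernel_size - 1) * 0.5
    xs = [-half + i for i in range(kernel_size)]
    pdf = [math.exp(-0.5 * (x / sigma) ** 2) for x in xs]
    total = sum(pdf)
    return [p / total for p in pdf]


def gaussian_kernel2d(kernel_size: int, sigma: float) -> list:
    """Outer product of the 1D kernel with itself (a 2D Gaussian is separable)."""
    k1 = gaussian_kernel1d(kernel_size, sigma)
    return [[a * b for b in k1] for a in k1]


kernel = gaussian_kernel2d(3, sigma=1.1)
for row in kernel:
    print(["%.4f" % v for v in row])
```

Because the weights are non-negative and sum to 1, the kernel acts as a local weighted average, which is the low-pass behavior the section describes.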

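1.2 Reusing feature maps between adjacent layers

Identity kernels and similar feature maps

As shown in the figure above, many depthwise convolution kernels in the middle part of the network have large values only at the center, like identity kernels. Because the input is simply passed on to the next layer, convolution with an identity kernel leads to duplicated feature maps and redundant computation. On the other hand, the figure below shows that many feature maps are similar (duplicated) between adjacent layers.

Therefore, some of the convolutions can be replaced with identity mappings. Moreover, depthwise convolutions are slow in early layers because, as reported in ShuffleNet V2, they often cannot fully utilize modern accelerators. This approach can therefore improve both parameter efficiency and inference time.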
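The redundancy argument can be checked on a toy example. The sketch below is a hand-written single-channel 3×3 convolution with zero padding (the helper `conv2d_same` is hypothetical, not part of the repository). It shows that a kernel whose only non-zero weight is a 1 at the center returns its input unchanged, so the layer spends computation without producing a new feature map.

```python
def conv2d_same(image, kernel):
    """3x3 cross-correlation, single channel, stride 1, zero padding."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    # Positions outside the image count as zeros.
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


identity_kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
feature_map = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
output = conv2d_same(feature_map, identity_kernel)
print(output == feature_map)
```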

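1.3 Edge detectors as learnable depthwise convolutions

Edge features carry important information about an image. As shown in the figure below, most kernels approximate edge-detection kernels, such as Sobel filter kernels and Laplacian filter kernels. The proportion of such kernels decreases in later layers, while the proportion of kernels that resemble blur kernels increases.

Therefore, edge detectors may be able to replace the depthwise convolutions in DSC-based networks for mixing features across spatial positions. The authors demonstrate this by replacing the learnable kernels with edge-detection kernels.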
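For readers unfamiliar with these filters, the following sketch applies the textbook 3×3 Sobel and Laplacian kernels to two small patches. The article does not say exactly which fixed kernels VGNetF uses, so the kernel values and the helper `response` below are assumptions chosen for illustration only.

```python
# Textbook 3x3 edge-detection kernels (illustrative; not taken from the repository).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def response(patch, kernel):
    """Cross-correlate one 3x3 patch with one 3x3 kernel."""
    return sum(patch[i][j] * kernel[i][j] for i in range(3) for j in range(3))


flat = [[5, 5, 5]] * 3            # constant region, no edge
vertical_edge = [[0, 0, 1]] * 3   # dark-to-bright step from left to right

for name, k in [("sobel_x", SOBEL_X), ("sobel_y", SOBEL_Y), ("laplacian", LAPLACIAN)]:
    print(name, response(flat, k), response(vertical_edge, k))
```

Each kernel sums to zero, so a constant region produces no response, while the horizontal-gradient Sobel kernel responds to the vertical step. This is the opposite of a blur kernel, whose weights sum to 1.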

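2 Network architecture

2.1 DownsamplingBlock

The DownsamplingBlock halves the resolution and expands the number of channels. As shown in Figure a, only the newly added channels are generated by the pointwise convolution, so that existing features are reused. The kernels of the depthwise convolution can be randomly initialized or be fixed Gaussian kernels.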
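The channel bookkeeping can be sketched in a few lines. The helper below (`downsampling_channels`, a hypothetical name) mirrors the `split_chs` arithmetic of `DownsamplingBlock` in the listing later in this article and counts only the 1×1 convolution weights (no bias, no normalization parameters), under the assumption that the listing reflects the intended design.

```python
def downsampling_channels(inp: int, oup: int) -> dict:
    """Mirror the channel split used by DownsamplingBlock in the listing."""
    # Channels passed through unchanged (reused) after the depthwise/blur step.
    reused = 0 if inp > oup else min(oup // 2, inp)
    # Channels the pointwise convolution has to generate.
    generated = oup - reused
    return {
        "reused": reused,
        "generated": generated,
        "pointwise_weights": inp * generated,  # 1x1 conv producing only the new channels
        "full_pointwise_weights": inp * oup,   # 1x1 conv if every output channel were generated
    }


for inp, oup in [(28, 56), (56, 112), (112, 224), (224, 368)]:
    print(inp, oup, downsampling_channels(inp, oup))
```

With these stage widths, the pointwise convolution produces only half of the output channels, so it needs half the weights of a pointwise layer that generates all of them.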

2.2 HalfIdentityBlock

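As shown in Figure b, half of the depthwise convolutions are replaced with identity mappings, and half of the pointwise convolutions are removed while the block width is kept.

Note that the right half of the input channels becomes the left half of the output channels, for better feature reuse.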
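A small symbolic sketch may help here. It traces the data flow of `HalfIdentityBlock.forward` from the listing later in this article using strings instead of tensors, and compares convolution weight counts against a plain depthwise-separable block of the same width. Both helper names are hypothetical, and normalization and SE parameters are deliberately left out of the count.

```python
def half_identity_route(left: str, right: str):
    """Symbolic data flow of one HalfIdentityBlock on a (left, right) pair of channel halves."""
    # The left half skips the depthwise conv (identity); only the right half is filtered.
    mixed = f"pw(cat({left}, dw({right})))"
    # The untouched right half becomes the new left half.
    return right, mixed


def conv_weights(channels: int) -> dict:
    """Convolution weights only (no bias, no normalization)."""
    half = channels // 2
    return {
        # 3x3 depthwise on one half + 1x1 from full width down to half width
        "half_identity": 9 * half + channels * half,
        # 3x3 depthwise + 1x1 on all channels
        "full_dsc": 9 * channels + channels * channels,
    }


state = half_identity_route("A", "B")
print(state)
print(conv_weights(56))
```

Under this count the half-identity variant uses exactly half the convolution weights of the full block at any even width.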

2.3 VGNet Architecture

VGNets are built from DownsamplingBlocks and HalfIdentityBlocks under a constraint on the number of parameters. The overall VGNetG-1.0MP architecture is shown in Table 1.
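Since Table 1 is not reproduced in this article, the stage layout can be read off the code instead. The sketch below follows `VGNet.make_layers` in the listing: each stage is one DownsamplingBlock followed by n - 1 HalfIdentityBlocks. The `channels` and `layers` values are the ones used in the listing's `__main__` block (the same values appear in the commented-out `vgnetg_1_0mp_se` builder); the helper `stage_layout` itself is hypothetical.

```python
def stage_layout(channels, layers):
    """One DownsamplingBlock followed by (n - 1) HalfIdentityBlocks per stage."""
    plan = []
    for i, n in enumerate(layers):
        plan.append({
            "stage": i + 1,
            "in_channels": channels[i],
            "out_channels": channels[i + 1],
            "downsampling_blocks": 1,
            "half_identity_blocks": n - 1,
        })
    return plan


# Configuration taken from the listing's __main__ block.
channels = [28, 56, 112, 224, 368]
layers = [4, 7, 13, 2]
plan = stage_layout(channels, layers)
for stage in plan:
    print(stage)
```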

2.4 Variants of VGNet

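To further study the influence of the N×N kernels, several variants of VGNet are introduced: VGNetC, VGNetG, and VGNetF.

VGNetC: all parameters are randomly initialized and learnable.

VGNetG: all parameters are randomly initialized and learnable, except the kernels of the DownsamplingBlocks.

VGNetF: all parameters of the depthwise convolutions are fixed.

3 Experiments

4 References

[1] Efficient CNN Architecture Design Guided by Visualization.

Original source: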

cv-models/vgnet.py at 8817ebcdc4a06e9843c88a730126b223a7869441 · ffiirree/cv-models · GitHub

I combined all the code into one file:

import os
import time
from collections import OrderedDict
from functools import partial
from typing import Any, List, Union

import torch
import torch.nn as nn
import torch.nn.functional as F

_NORM_POSIITON: str = 'before'
_NORMALIZER: nn.Module = nn.BatchNorm2d
_NONLINEAR: nn.Module = partial(nn.ReLU, inplace=True)
_SE_INNER_NONLINEAR: nn.Module = partial(nn.ReLU, inplace=True)
_SE_GATING_FN: nn.Module = nn.Sigmoid
_SE_DIVISOR: int = 8
_SE_USE_NORM: bool = False


def get_gaussian_kernel1d(kernel_size, sigma: torch.Tensor):
    ksize_half = (kernel_size - 1) * 0.5

    x = torch.linspace(-ksize_half, ksize_half, steps=kernel_size).to(sigma.device)
    pdf = torch.exp(-0.5 * (x / sigma).pow(2))
    return pdf / pdf.sum()


def get_gaussian_kernel2d(kernel_size, sigma: torch.Tensor):
    kernel1d = get_gaussian_kernel1d(kernel_size, sigma)
    return torch.mm(kernel1d[:, None], kernel1d[None, :])


class ChannelChunk(nn.Module):
    def __init__(self, groups: int):
        super().__init__()

        self.groups = groups

    def forward(self, x):
        return torch.chunk(x, self.groups, dim=1)

    def extra_repr(self):
        return f'groups={self.groups}'


class ChannelSplit(nn.Module):
    def __init__(self, sections):
        super().__init__()

        self.sections = sections

    def forward(self, x):
        return torch.split(x, self.sections, dim=1)

    def extra_repr(self):
        return f'sections={self.sections}'


class Combine(nn.Module):
    def __init__(self, method: str = 'ADD', *args, **kwargs):
        super().__init__()
        assert method in ['ADD', 'CONCAT'], ''

        self.method = method
        self._combine = self._add if self.method == 'ADD' else self._cat

    @staticmethod
    def _add(x):
        return x[0] + x[1]

    @staticmethod
    def _cat(x):
        return torch.cat(x, dim=1)

    def forward(self, x):
        return self._combine(x)

    def extra_repr(self):
        return f'method=\'{self.method}\''


class PointwiseConv2d(nn.Conv2d):
    def __init__(self, inp, oup, stride: int = 1, bias: bool = False, groups: int = 1):
        super().__init__(inp, oup, 1, stride=stride, padding=0, bias=bias, groups=groups)


def normalizer_fn(channels):
    return _NORMALIZER(channels)


def activation_fn():
    return _NONLINEAR()


def channel_shuffle(x, groups):
    batchsize, num_channels, height, width = x.size()
    channels_per_group = num_channels // groups

    # reshape
    x = x.view(batchsize, groups, channels_per_group, height, width)
    x = torch.transpose(x, 1, 2).contiguous()

    # flatten
    x = x.view(batchsize, -1, height, width)
    return x


def norm_activation(channels, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None, norm_position: str = None) -> List[nn.Module]:
    norm_position = norm_position or _NORM_POSIITON
    assert norm_position in ['before', 'after', 'none'], ''

    normalizer_fn = normalizer_fn or _NORMALIZER
    activation_fn = activation_fn or _NONLINEAR

    if normalizer_fn is None and activation_fn is None:
        return []

    if normalizer_fn is None:
        return [activation_fn()]

    if activation_fn is None:
        return [normalizer_fn(channels)]

    if norm_position == 'after':
        return [activation_fn(), normalizer_fn(channels)]

    return [normalizer_fn(channels), activation_fn()]


def make_divisible(value, divisor, min_value=None):
    if min_value is None:
        min_value = divisor

    new_value = max(min_value, int(value + divisor / 2) // divisor * divisor)

    # Make sure that round down does not go down by more than 10%.
    if new_value < 0.9 * value:
        new_value += divisor

    return new_value


class Stage(nn.Sequential):
    def __init__(self, *args):
        if len(args) == 1 and isinstance(args[0], list):
            args = args[0]
        super().__init__(*args)

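    def append(self, m: Union[nn.Module, List[nn.Module]]):
        if isinstance(m, nn.Module):
            self.add_module(str(len(self)), m)
        elif isinstance(m, list):
            # Append each module in order; nested lists are handled recursively.
            for i in m:
                self.append(i)
        else:
            raise ValueError(f'Expected nn.Module or list of nn.Module, got {type(m).__name__}.')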


class Affine(nn.Module):
    def __init__(self, dim):
        super().__init__()

        self.dim = dim

        self.alpha = nn.Parameter(torch.ones(dim, 1, 1))
        self.beta = nn.Parameter(torch.zeros(dim, 1, 1))

    def forward(self, x):
        return self.alpha * x + self.beta

    def extra_repr(self):
        return f'{self.dim}'


class Conv2d3x3(nn.Conv2d):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = None, dilation: int = 1, bias: bool = False, groups: int = 1):
        padding = padding if padding is not None else dilation
        super().__init__(in_channels, out_channels, 3, stride=stride, padding=padding, dilation=dilation, bias=bias, groups=groups)


class Conv2d1x1(nn.Conv2d):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0, bias: bool = False, groups: int = 1):
        super().__init__(in_channels, out_channels, 1, stride=stride, padding=padding, bias=bias, groups=groups)


class Conv2d3x3BN(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = None, dilation: int = 1, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER
        padding = padding if padding is not None else dilation

        super().__init__(Conv2d3x3(in_channels, out_channels, stride=stride, padding=padding, dilation=dilation, bias=bias, groups=groups))
        if normalizer_fn:
            self.add_module(str(self.__len__()), normalizer_fn(out_channels))


class Conv2d1x1BN(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER

        super().__init__(Conv2d1x1(in_channels, out_channels, stride=stride, padding=padding, bias=bias, groups=groups))
        if normalizer_fn:
            self.add_module(str(self.__len__()), normalizer_fn(out_channels))


class Conv2d1x1Block(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None,
            norm_position: str = None):
        super().__init__(Conv2d1x1(in_channels, out_channels, stride=stride, padding=padding, bias=bias, groups=groups), *norm_activation(out_channels, normalizer_fn, activation_fn, norm_position))


class Conv2dBlock(nn.Sequential):
    def __init__(self, in_channels, out_channels, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None,
            activation_fn: nn.Module = None, norm_position: str = None, ):
        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2

        super().__init__(nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, bias=bias, stride=stride, padding=padding, dilation=dilation, groups=groups),
            *norm_activation(out_channels, normalizer_fn, activation_fn, norm_position))


class DropPath(nn.Module):
    """Stochastic Depth: Drop paths per sample (when applied in main path of residual blocks)"""

    def __init__(self, survival_prob: float):
        super().__init__()

        self.p = survival_prob

    def forward(self, x):
        if self.p == 1. or not self.training:
            return x

        # work with diff dim tensors, not just 2D ConvNets
        shape = (x.shape[0],) + (1,) * (x.ndim - 1)

        probs = self.p + torch.rand(shape, dtype=x.dtype, device=x.device)
        # We therefore need to re-calibrate the outputs of any given function f
        # by the expected number of times it participates in training, p.
        return (x / self.p) * probs.floor_()

    def extra_repr(self):
        return f'survival_prob={self.p}'


class PointwiseBlock(nn.Sequential):
    def __init__(self, inp, oup, stride: int = 1, groups: int = 1, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None, norm_position: str = None, ):
        super().__init__(PointwiseConv2d(inp, oup, stride=stride, groups=groups), *norm_activation(oup, normalizer_fn, activation_fn, norm_position))


class DepthwiseConv2dBN(nn.Sequential):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER

        super().__init__(DepthwiseConv2d(inp, oup, kernel_size, stride=stride, padding=padding, dilation=dilation), normalizer_fn(oup))


class DepthwiseBlock(nn.Sequential):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None,
            norm_position: str = None):
        super().__init__(DepthwiseConv2d(inp, oup, kernel_size, stride, padding=padding, dilation=dilation), *norm_activation(oup, normalizer_fn, activation_fn, norm_position))


class ChannelShuffle(nn.Module):
    def __init__(self, groups: int):
        super().__init__()

        self.groups = groups

    def forward(self, x):
        return channel_shuffle(x, self.groups)

    def extra_repr(self):
        return 'groups={}'.format(self.groups)


class DepthwiseConv2d(nn.Conv2d):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, bias: bool = False, ):
        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2

        super().__init__(inp, oup, kernel_size, stride=stride, padding=padding, dilation=dilation, bias=bias, groups=inp)


class SEBlock(nn.Sequential):
    """Squeeze excite block
    """

    def __init__(self, channels, ratio, inner_activation_fn: nn.Module = None, gating_fn: nn.Module = None):
        squeezed_channels = make_divisible(int(channels * ratio), _SE_DIVISOR)
        inner_activation_fn = inner_activation_fn or _SE_INNER_NONLINEAR
        gating_fn = gating_fn or _SE_GATING_FN

        layers = OrderedDict([])

        layers['pool'] = nn.AdaptiveAvgPool2d((1, 1))
        layers['reduce'] = Conv2d1x1(channels, squeezed_channels, bias=True)
        if _SE_USE_NORM:
            layers['norm'] = _NORMALIZER(squeezed_channels)
        layers['act'] = inner_activation_fn()
        layers['expand'] = Conv2d1x1(squeezed_channels, channels, bias=True)
        layers['gate'] = gating_fn()

        super().__init__(layers)

    def _forward(self, input):
        for module in self:
            input = module(input)
        return input

    def forward(self, x):
        return x * self._forward(x)


class InvertedResidualBlock(nn.Module):
    def __init__(self, inp, oup, t, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, se_ratio: float = None, se_ind: bool = False, survival_prob: float = None,
            normalizer_fn: nn.Module = None, activation_fn: nn.Module = None, dw_se_act: nn.Module = None):
        super().__init__()

        self.inp = inp
        self.planes = int(self.inp * t)
        self.oup = oup
        self.stride = stride
        self.apply_residual = (self.stride == 1) and (self.inp == self.oup)
        self.se_ratio = se_ratio if se_ind or se_ratio is None else (se_ratio / t)
        self.has_se = (self.se_ratio is not None) and (self.se_ratio > 0) and (self.se_ratio <= 1)

        normalizer_fn = normalizer_fn or _NORMALIZER
        activation_fn = activation_fn or _NONLINEAR

        layers = []
        if t != 1:
            layers.append(Conv2d1x1Block(inp, self.planes, normalizer_fn=normalizer_fn, activation_fn=activation_fn))

        if dw_se_act is None:
            layers.append(DepthwiseBlock(self.planes, self.planes, kernel_size, stride=self.stride, padding=padding, dilation=dilation, normalizer_fn=normalizer_fn, activation_fn=activation_fn))
        else:
            layers.append(DepthwiseConv2dBN(self.planes, self.planes, kernel_size, stride=self.stride, padding=padding, dilation=dilation, normalizer_fn=normalizer_fn))

        if self.has_se:
            layers.append(SEBlock(self.planes, self.se_ratio))

        if dw_se_act:
            layers.append(dw_se_act())

        layers.append(Conv2d1x1BN(self.planes, oup, normalizer_fn=normalizer_fn))

        if self.apply_residual and survival_prob:
            layers.append(DropPath(survival_prob))

        self.branch1 = nn.Sequential(*layers)
        self.branch2 = nn.Identity() if self.apply_residual else None
        self.combine = Combine('ADD') if self.apply_residual else None

    def forward(self, x):
        if self.apply_residual:
            return self.combine([self.branch2(x), self.branch1(x)])
        else:
            return self.branch1(x)


class FusedInvertedResidualBlock(nn.Module):
    def __init__(self, inp, oup, t, kernel_size: int = 3, stride: int = 1, padding: int = None, se_ratio: float = None, se_ind: bool = False, survival_prob: float = None,
            normalizer_fn: nn.Module = None, activation_fn: nn.Module = None):
        super().__init__()

        self.inp = inp
        self.planes = int(self.inp * t)
        self.oup = oup
        self.stride = stride
        self.padding = padding if padding is not None else (kernel_size // 2)
        self.apply_residual = (self.stride == 1) and (self.inp == self.oup)
        self.se_ratio = se_ratio if se_ind or se_ratio is None else (se_ratio / t)
        self.has_se = (self.se_ratio is not None) and (self.se_ratio > 0) and (self.se_ratio <= 1)

        normalizer_fn = normalizer_fn or _NORMALIZER
        activation_fn = activation_fn or _NONLINEAR

        layers = [Conv2dBlock(inp, self.planes, kernel_size, stride=self.stride, padding=self.padding, normalizer_fn=normalizer_fn, activation_fn=activation_fn)]

        if self.has_se:
            layers.append(SEBlock(self.planes, self.se_ratio))

        layers.append(Conv2d1x1BN(self.planes, oup, normalizer_fn=normalizer_fn))

        if self.apply_residual and survival_prob:
            layers.append(DropPath(survival_prob))

        self.branch1 = nn.Sequential(*layers)
        self.branch2 = nn.Identity() if self.apply_residual else None
        self.combine = Combine('ADD') if self.apply_residual else None

    def forward(self, x):
        if self.apply_residual:
            return self.combine([self.branch2(x), self.branch1(x)])
        else:
            return self.branch1(x)


class SharedDepthwiseConv2d(nn.Module):
    def __init__(self, channels, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, t: int = 2, bias: bool = False):
        super().__init__()

        self.channels = channels // t
        self.t = t

        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2

        self.mux = DepthwiseConv2d(self.channels, self.channels, kernel_size, stride, padding, dilation, bias=bias)

    def forward(self, x):
        x = torch.chunk(x, self.t, dim=1)
        x = [self.mux(xi) for xi in x]
        return torch.cat(x, dim=1)


class HalfIdentityBlock(nn.Module):
    def __init__(self, inp: int, se_ratio: float = 0.0):
        super().__init__()

        self.half3x3 = Conv2d3x3(inp // 2, inp // 2, groups=(inp // 2))
        self.combine = Combine('CONCAT')
        self.conv1x1 = PointwiseBlock(inp, inp // 2)

        if se_ratio > 0.0:
            self.conv1x1 = nn.Sequential(PointwiseBlock(inp, inp // 2), SEBlock(inp // 2, se_ratio))

    def forward(self, x):
        out = self.combine([x[0], self.half3x3(x[1])])
        return [x[1], self.conv1x1(out)]


class GaussianBlur(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, sigma: float = 1.0, learnable: bool = True):
        super().__init__()

        # Use an explicit None check so that padding=0 is respected.
        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2

        self.channels = channels
        self.kernel_size = (kernel_size, kernel_size)
        self.padding = (padding, padding)
        self.stride = (stride, stride)
        self.dilation = (dilation, dilation)
        self.padding_mode = 'zeros'
        self.learnable = learnable

        self.sigma = nn.Parameter(torch.tensor(sigma), learnable)

    def forward(self, x):
        return F.conv2d(x, self.weight, None, self.stride, self.padding, self.dilation, self.channels)

    @property
    def weight(self):
        kernel = get_gaussian_kernel2d(self.kernel_size[0], self.sigma)
        return kernel.repeat(self.channels, 1, 1, 1)

    @property
    def out_channels(self):
        return self.channels

    def extra_repr(self):
        s = ('{channels}, kernel_size={kernel_size}'
             ', learnable={learnable}, stride={stride}')
        if self.padding != (0,) * len(self.padding):
            s += ', padding={padding}'
        if self.dilation != (1,) * len(self.dilation):
            s += ', dilation={dilation}'
        if self.padding_mode != 'zeros':
            s += ', padding_mode={padding_mode}'
        return s.format(**self.__dict__)


class DownsamplingBlock(nn.Module):
    def __init__(self, inp, oup, stride: int = 2, method: str = 'blur', se_ratio: float = 0.0):
        assert method in ['blur', 'dwconv', 'maxpool'], f'{method}'

        super().__init__()

        if method == 'dwconv' or stride == 1:
            self.downsample = DepthwiseConv2d(inp, inp, 3, stride)
        elif method == 'maxpool':
            self.downsample = nn.MaxPool2d(kernel_size=3, stride=stride)
        elif method == 'blur':
            self.downsample = GaussianBlur(inp, stride=stride, sigma=1.1, learnable=False)
        else:
            raise ValueError(f'Unknown downsampling method: {method}.')

        split_chs = 0 if inp > oup else min(oup // 2, inp)

        self.split = ChannelSplit([inp - split_chs, split_chs])
        self.conv1x1 = PointwiseBlock(inp, oup - split_chs)

        if se_ratio > 0.0:
            self.conv1x1 = nn.Sequential(PointwiseBlock(inp, oup - split_chs), SEBlock(oup - split_chs, se_ratio))

        self.halve = nn.Identity()
        if oup > 2 * inp or inp > oup:
            self.halve = nn.Sequential(Combine('CONCAT'), ChannelChunk(2))

    def forward(self, x):
        x = self.downsample(x)
        _, x2 = self.split(x)
        return self.halve([x2, self.conv1x1(x)])


class VGNet(nn.Module):
    def __init__(self, in_channels: int = 3, num_classes: int = 1000, channels: List[int] = None, downsamplings: List[str] = None, layers: List[int] = None, se_ratio: float = 0.0,
            thumbnail: bool = False, **kwargs: Any):
        super().__init__()

        position = 'after'
        FRONT_S = 1 if thumbnail else 2
        strides = [FRONT_S, 2, 2, 2]

        self.features = nn.Sequential(OrderedDict([('stem', Conv2dBlock(in_channels, channels[0], stride=FRONT_S))]))

        for i in range(len(strides)):
            self.features.add_module(f'stage{i + 1}', self.make_layers(channels[i], channels[i + 1], strides[i], downsamplings[i], layers[i], se_ratio))

        self.features.stage4.append(nn.Sequential(
            # DepthwiseConv2d(channels[-1], channels[-1]),
            SharedDepthwiseConv2d(channels[-1], t=8),
            PointwiseBlock(channels[-1], channels[-1]),
        ))

        self.avg = nn.AdaptiveAvgPool2d((1, 1))
        self.classifier = nn.Linear(channels[-1], num_classes)

    def make_layers(self, inp, oup, s, m, n, se_ratio):
        layers = [DownsamplingBlock(inp, oup, stride=s, method=m, se_ratio=se_ratio)]
        for _ in range(n - 1):
            layers.append(HalfIdentityBlock(oup, se_ratio))

        layers.append(Combine('CONCAT'))
        return Stage(layers)

    def forward(self, x):
        x = self.features(x)
        x = self.avg(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x


def _vgnet(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
    model = VGNet(**kwargs)

    if pretrained:
        if pth is not None:
            state_dict = torch.load(os.path.expanduser(pth))
        else:
            assert 'url' in kwargs and kwargs['url'] != '', 'Invalid URL.'
            state_dict = torch.hub.load_state_dict_from_url(kwargs['url'], progress=progress)
        model.load_state_dict(state_dict)
    return model


# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_1_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [28, 56, 112, 224, 368]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [4, 7, 13, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_1_5mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 64, 128, 256, 512]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 7, 14, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_2_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 72, 168, 376, 512]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 6, 13, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_2_5mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 80, 192, 400, 544]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 6, 16, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_5_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 88, 216, 456, 856]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [4, 7, 15, 5]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)


if __name__ == '__main__':
    # Same channel/layer settings as the commented-out vgnetg_1_0mp_se above,
    # with a 4-class classification head.
    kwargs = {
        'channels': [28, 56, 112, 224, 368],
        'downsamplings': ['blur', 'blur', 'blur', 'blur'],
        'layers': [4, 7, 13, 2],
        'se_ratio': 0.25,
        'num_classes': 4,
    }

    model = _vgnet(False, "", True, **kwargs)

    model.eval()  # .cuda()

    data = torch.randn(1, 3, 128, 128)  # .cuda()

    # Inference only: disable autograd so the timings exclude graph bookkeeping.
    with torch.no_grad():
        for i in range(20):
            start = time.perf_counter()
            out = model(data)
            print('time', time.perf_counter() - start, out.size())
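
The timing loop in the script prints every iteration, including the first ones, which are usually slower because of one-time setup. As a hedged alternative, the sketch below shows a generic helper (`benchmark` is a hypothetical name, not part of the repository) that discards warm-up calls and reports summary statistics. It uses a stand-in workload so that it runs without PyTorch.

```python
import statistics
import time


def benchmark(fn, warmup: int = 5, iters: int = 20) -> dict:
    """Time fn() after discarding warm-up calls; all values are in seconds."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
        "runs": len(samples),
    }


# Stand-in workload; with the script above this would be `lambda: model(data)`.
stats = benchmark(lambda: sum(i * i for i in range(10_000)))
print(stats)
```

When timing on a GPU, the callable should also wait for queued kernels to finish (for example by calling `torch.cuda.synchronize()` before it returns), because CUDA launches are asynchronous and the wall-clock time would otherwise be understated.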

Original site

Copyright notice
This article was created by [AI Vision Network]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/216/202208040628198528.html