Model Building in PyTorch
2022-07-29 06:12:00 【Quinn-ntmy】
I. Two elements of building a model
- Build the submodules: in the __init__() method of your model class (which inherits from nn.Module);
- Splice the submodules together: in the model's forward() method.
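To make the two elements concrete, here is a minimal sketch (a hypothetical two-layer network, not part of the article's project):

import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super(TinyNet, self).__init__()
        # Element 1: build the submodules in __init__()
        self.conv = nn.Conv2d(1, 6, kernel_size=5)
        self.fc = nn.Linear(6 * 24 * 24, 10)

    def forward(self, x):
        # Element 2: splice the submodules together in forward()
        x = torch.relu(self.conv(x))  # (N, 1, 28, 28) -> (N, 6, 24, 24)
        x = x.flatten(start_dim=1)    # flatten everything but the batch dimension
        return self.fc(x)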
II. The nn.Module class
All of our models, and all network layers, inherit from this class. torch.nn includes (1) nn.Parameter, (2) nn.functional, (3) nn.Module, and (4) nn.init; these submodules work together.
1. nn.Parameter
A tensor subclass that represents a learnable parameter, such as a weight or bias.
Model parameters need to be trained by the optimizer, so a parameter is usually a tensor with requires_grad=True. At the same time, a model can contain a great many parameters, and managing them by hand is impractical. Parameters are therefore usually expressed as nn.Parameter, and nn.Module is used to manage all the parameters under its structure.
Code example
For instance, the learnable parameters in an Attention submodule:
if score_function == 'mlp':
    self.weight = nn.Parameter(torch.Tensor(hidden_dim * 2))
elif score_function == 'bi_linear':
    self.weight = nn.Parameter(torch.Tensor(hidden_dim, hidden_dim))
else:  # dot_product / scaled_dot_product need no extra weight
    self.register_parameter('weight', None)
self.reset_parameters()
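Because the assignment happens inside an nn.Module, the nn.Parameter is registered automatically and shows up when iterating the module's parameters. A quick check (the Toy class below is a hypothetical illustration, not from the project):

import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self, hidden_dim=4):
        super(Toy, self).__init__()
        self.weight = nn.Parameter(torch.Tensor(hidden_dim * 2))

toy = Toy()
for name, p in toy.named_parameters():
    print(name, tuple(p.shape), p.requires_grad)  # weight (8,) True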
In practice, module classes are usually built by inheriting from nn.Module, and every component that contains learnable parameters is created in the constructor.
class AEN_BERT(nn.Module):
    def __init__(self, bert, opt):
        super(AEN_BERT, self).__init__()
        self.opt = opt
        self.bert = bert
        self.squeeze_embedding = SqueezeEmbedding()
        self.dropout = nn.Dropout(opt.dropout)
        self.attn_k = Attention(opt.bert_dim, out_dim=opt.hidden_dim, n_head=8, score_function='mlp', dropout=opt.dropout)
        self.attn_q = Attention(opt.bert_dim, out_dim=opt.hidden_dim, n_head=8, score_function='mlp', dropout=opt.dropout)
        self.ffn_c = PositionwiseFeedForward(opt.hidden_dim, dropout=opt.dropout)
        self.ffn_t = PositionwiseFeedForward(opt.hidden_dim, dropout=opt.dropout)
        self.attn_s1 = Attention(opt.hidden_dim, n_head=8, score_function='mlp', dropout=opt.dropout)
        self.dense = nn.Linear(opt.hidden_dim * 3, opt.polarities_dim)

    def forward(self, inputs):
        context, target = inputs[0], inputs[1]
        context_len = torch.sum(context != 0, dim=-1)
        target_len = torch.sum(target != 0, dim=-1)
        context = self.squeeze_embedding(context, context_len)
        context, _ = self.bert(context, return_dict=False)
        context = self.dropout(context)
        target = self.squeeze_embedding(target, target_len)
        target, _ = self.bert(target, return_dict=False)
        target = self.dropout(target)
        hc, _ = self.attn_k(context, context)  # introspective context word modeling
        hc = self.ffn_c(hc)                    # point-wise convolution transform
        ht, _ = self.attn_q(context, target)   # context-aware target word modeling
        ht = self.ffn_t(ht)                    # point-wise convolution transform
        s1, _ = self.attn_s1(hc, ht)           # target-specific context representation
        # average pooling (as in the paper) yields the final representations
        hc_mean = torch.div(torch.sum(hc, dim=1), context_len.unsqueeze(1).float())
        ht_mean = torch.div(torch.sum(ht, dim=1), target_len.unsqueeze(1).float())
        s1_mean = torch.div(torch.sum(s1, dim=1), context_len.unsqueeze(1).float())
        # torch.div(a, b): element-wise division of tensor a by scalar b,
        # or element-wise division between two broadcastable tensors a and b
        x = torch.cat((hc_mean, s1_mean, ht_mean), dim=-1)  # concatenate the three representations
        out = self.dense(x)  # final fully connected layer (nn.Linear)
        return out
You can see that the module class AEN_BERT includes the submodule Attention, and that every part of the model containing learnable parameters is placed in the constructor (as a submodule).
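A hypothetical usage sketch follows; it assumes the Attention, SqueezeEmbedding and PositionwiseFeedForward classes from the same project are importable, and the opt fields shown are placeholders rather than the project's actual training configuration:

from types import SimpleNamespace
from transformers import BertModel

# placeholder hyperparameters; the real values come from the training script
opt = SimpleNamespace(dropout=0.1, bert_dim=768, hidden_dim=300, polarities_dim=3)
bert = BertModel.from_pretrained('bert-base-uncased')
model = AEN_BERT(bert, opt)
# forward() expects inputs = [context, target], two tensors of token ids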
2. nn.functional
nn.functional provides the concrete functional implementations, for example:
(1) activation functions (F.relu, F.sigmoid, F.tanh, F.softmax)
(2) model layers (F.linear, F.conv2d, F.max_pool2d, F.dropout2d, F.embedding)
(3) loss functions (F.binary_cross_entropy, F.mse_loss, F.cross_entropy)
To make parameter management easier, these are usually converted into class form by inheriting from nn.Module and packaged directly under the nn module:
(1) the activation functions become (nn.ReLU, nn.Sigmoid, nn.Tanh, nn.Softmax)
(2) the model layers become (nn.Linear, nn.Conv2d, nn.MaxPool2d, nn.Dropout2d, nn.Embedding)
(3) the loss functions become (nn.BCELoss, nn.MSELoss, nn.CrossEntropyLoss)
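The two forms compute the same thing; the class form simply wraps the functional call in an nn.Module so that any state is managed automatically. A small sketch of the correspondence:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 3)

# Functional form: a plain stateless function call
y1 = F.relu(x)

# Class form: an nn.Module wrapping the same computation
y2 = nn.ReLU()(x)
print(torch.equal(y1, y2))  # True

# For layers with learnable parameters the class form is preferred:
# nn.Linear registers weight and bias for you, while the functional
# form requires passing them in by hand.
linear = nn.Linear(3, 5)
y3 = linear(x)  # same as F.linear(x, linear.weight, linear.bias)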
3. nn.Module
The base class of all network layers; it manages the network's attributes.
nn.Module has 8 important attributes used to manage the whole model, all of which are ordered dictionaries:
self._parameters: Dict[str, Optional[Parameter]] = OrderedDict()
self._buffers: Dict[str, Optional[Tensor]] = OrderedDict()
self._backward_hooks: Dict[int, Callable] = OrderedDict()
self._forward_hooks: Dict[int, Callable] = OrderedDict()
self._forward_pre_hooks: Dict[int, Callable] = OrderedDict()
self._state_dict_hooks: Dict[int, Callable] = OrderedDict()
self._load_state_dict_pre_hooks: Dict[int, Callable] = OrderedDict()
self._modules: Dict[str, Optional['Module']] = OrderedDict()
(1) _parameters: stores and manages attributes of the nn.Parameter class, e.g. the weight and bias parameters;
(2) _modules: stores and manages attributes of the nn.Module class (submodules);
(3) _buffers: stores and manages buffer attributes; for example, a BN layer's running_mean and running_var live here;
(4) *_hooks: stores and manages hook functions (the five hook-related dictionaries).

How nn.Module builds these attributes: first there is a top-level Module that inherits from the nn.Module base class, such as AEN_BERT above. This top-level Module can contain many submodules, which also inherit from nn.Module. In the __init__ method of each of these Modules, the parent class's initializer is called first, which initializes the 8 attributes above.
Then, as each submodule is built, there are two steps: the submodule is initialized first, and then the __setattr__ method inspects the type of the assigned value, saves it into the corresponding attribute dictionary, and binds it to the corresponding member. The submodules are built one by one until the whole top-level Module is complete.
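This routing by __setattr__ can be observed directly. In the hypothetical Demo module below, the nn.Parameter assignment lands in _parameters while the nn.Linear assignment lands in _modules:

import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super(Demo, self).__init__()              # initializes the 8 ordered dicts
        self.scale = nn.Parameter(torch.ones(1))  # __setattr__ routes this into _parameters
        self.fc = nn.Linear(2, 2)                 # __setattr__ routes this into _modules

demo = Demo()
print(list(demo._parameters.keys()))  # ['scale']
print(list(demo._modules.keys()))     # ['fc']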
Summary:
- A Module can contain multiple child modules;
- A Module is equivalent to an operation and must implement the forward() function;
- Every Module has 8 dictionaries that manage its attributes (the most commonly used are _parameters and _modules).
In general, we rarely use nn.Parameter directly to define the parameters of a model; instead we build models by assembling common model layers. These layers also inherit from nn.Module, themselves contain parameters, and become submodules of the module we define.
nn.Module provides methods for managing these submodules:
- children(): returns a generator over the module's direct submodules;
- named_children(): returns a generator over the module's direct submodules together with their names;
- modules(): returns a generator over all modules at every level under the module, including the module itself;
- named_modules(): returns a generator over all modules at every level under the module together with their names, including the module itself.

children() and named_children() are the ones used most often; modules() and named_modules() are used less, since their functionality can be reproduced by nesting multiple named_children() calls.
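A small sketch of the difference between the two families of methods, using a hypothetical nested Sequential:

import torch.nn as nn

net = nn.Sequential(nn.Linear(4, 8),
                    nn.Sequential(nn.ReLU(), nn.Linear(8, 2)))

# children(): direct submodules only
for m in net.children():
    print(type(m).__name__)  # Linear, Sequential

# named_modules(): every module at every level, including net itself
for name, m in net.named_modules():
    print(name or '(self)', type(m).__name__)
    # (self) Sequential / 0 Linear / 1 Sequential / 1.0 ReLU / 1.1 Linear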
4. nn.init
nn.init provides the parameter initialization methods.
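For example, the Attention excerpt above ends its constructor with self.reset_parameters(); a simplified, hypothetical implementation (not the project's actual code) uses one of the nn.init routines to fill the parameter in place:

import math
import torch
import torch.nn as nn

class Attention(nn.Module):
    def __init__(self, hidden_dim):
        super(Attention, self).__init__()
        self.hidden_dim = hidden_dim
        self.weight = nn.Parameter(torch.Tensor(hidden_dim * 2))
        self.reset_parameters()

    def reset_parameters(self):
        # Uniform initialization scaled by the hidden size;
        # nn.init.uniform_ writes into the tensor in place.
        stdv = 1.0 / math.sqrt(self.hidden_dim)
        nn.init.uniform_(self.weight, -stdv, stdv)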