Model summary in PyTorch similar to `model.summary()` in Keras

Last update: Dec 29, 2022

Overview

Keras style `model.summary()` in PyTorch

Keras has a neat API for viewing a summary of a model, which is very helpful when debugging a network. This is bare-bones code that tries to mimic the same in PyTorch. The aim is to provide information complementary to what `print(your_model)` shows in PyTorch.

Usage

Install from PyPI with `pip install torchsummary`, or clone the repository with `git clone https://github.com/sksq96/pytorch-summary`. Then import and call `summary`:

from torchsummary import summary
summary(your_model, input_size=(channels, H, W))

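Note that `input_size` is required because the summary is built by making a forward pass through the network. It gives the shape of a single input and excludes the batch dimension.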
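To illustrate why a forward pass is needed, here is a minimal sketch of one way to collect per-layer output shapes with forward hooks. It is not the torchsummary source: the helper `collect_output_shapes` and the toy model are hypothetical names introduced for this example, and the sketch assumes the model runs on the CPU and takes a single tensor input.

```python
import torch
import torch.nn as nn

def collect_output_shapes(model, input_size):
    """Run one forward pass and record (layer type, output shape) per leaf module."""
    records = []
    handles = []

    def hook(module, inputs, output):
        # Only record layers whose output is a single tensor.
        if isinstance(output, torch.Tensor):
            records.append((type(module).__name__, tuple(output.shape)))

    for module in model.modules():
        # Leaf modules have no children; containers such as nn.Sequential are skipped.
        if len(list(module.children())) == 0:
            handles.append(module.register_forward_hook(hook))

    try:
        with torch.no_grad():
            # Dummy batch of size 1 with the requested per-sample shape.
            model(torch.zeros(1, *input_size))
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return records

toy = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 6 * 6, 10),
)
shapes = collect_output_shapes(toy, (1, 8, 8))
for name, shape in shapes:
    print(name, shape)
```

Shapes are only known once data flows through the layers, which is why the per-sample input shape has to be supplied up front.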

Examples

CNN for MNIST

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchsummary import summary

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # requires PyTorch >= 0.4.0
model = Net().to(device)

summary(model, (1, 28, 28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 10, 24, 24]             260
            Conv2d-2             [-1, 20, 8, 8]           5,020
         Dropout2d-3             [-1, 20, 8, 8]               0
            Linear-4                   [-1, 50]          16,050
            Linear-5                   [-1, 10]             510
================================================================
Total params: 21,840
Trainable params: 21,840
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.06
Params size (MB): 0.08
Estimated Total Size (MB): 0.15
----------------------------------------------------------------
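The parameter counts and shapes in this table can be checked by hand. The following is a minimal sketch in plain Python, assuming stride 1 and no padding for the convolutions (the `nn.Conv2d` defaults) and a bias term on every layer. The helper names `conv2d_params`, `linear_params`, and `conv_out` are hypothetical and not part of torchsummary.

```python
def conv2d_params(in_ch, out_ch, k):
    # One k x k kernel per (input, output) channel pair, plus one bias per output channel.
    return in_ch * out_ch * k * k + out_ch

def linear_params(in_features, out_features):
    # Weight matrix plus one bias per output feature.
    return in_features * out_features + out_features

def conv_out(size, k):
    # Spatial size after a k x k convolution with stride 1 and no padding.
    return size - k + 1

params = {
    "Conv2d-1": conv2d_params(1, 10, 5),
    "Conv2d-2": conv2d_params(10, 20, 5),
    "Linear-4": linear_params(320, 50),
    "Linear-5": linear_params(50, 10),
}
total_params = sum(params.values())

conv1_side = conv_out(28, 5)               # 24, as in the Conv2d-1 row
conv2_side = conv_out(conv1_side // 2, 5)  # 8, after 2x2 max pooling then conv2
flat_features = 20 * (conv2_side // 2) ** 2  # 20 channels of 4 x 4 after the second pooling

print(params)         # {'Conv2d-1': 260, 'Conv2d-2': 5020, 'Linear-4': 16050, 'Linear-5': 510}
print(total_params)   # 21840
print(flat_features)  # 320
```

The last value explains the `320` used in `x.view(-1, 320)` and in `nn.Linear(320, 50)`.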

VGG16

import torch
from torchvision import models
from torchsummary import summary

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
vgg = models.vgg16().to(device)

summary(vgg, (3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256, 56, 56]               0
           Conv2d-15          [-1, 256, 56, 56]         590,080
             ReLU-16          [-1, 256, 56, 56]               0
        MaxPool2d-17          [-1, 256, 28, 28]               0
           Conv2d-18          [-1, 512, 28, 28]       1,180,160
             ReLU-19          [-1, 512, 28, 28]               0
           Conv2d-20          [-1, 512, 28, 28]       2,359,808
             ReLU-21          [-1, 512, 28, 28]               0
           Conv2d-22          [-1, 512, 28, 28]       2,359,808
             ReLU-23          [-1, 512, 28, 28]               0
        MaxPool2d-24          [-1, 512, 14, 14]               0
           Conv2d-25          [-1, 512, 14, 14]       2,359,808
             ReLU-26          [-1, 512, 14, 14]               0
           Conv2d-27          [-1, 512, 14, 14]       2,359,808
             ReLU-28          [-1, 512, 14, 14]               0
           Conv2d-29          [-1, 512, 14, 14]       2,359,808
             ReLU-30          [-1, 512, 14, 14]               0
        MaxPool2d-31            [-1, 512, 7, 7]               0
           Linear-32                 [-1, 4096]     102,764,544
             ReLU-33                 [-1, 4096]               0
          Dropout-34                 [-1, 4096]               0
           Linear-35                 [-1, 4096]      16,781,312
             ReLU-36                 [-1, 4096]               0
          Dropout-37                 [-1, 4096]               0
           Linear-38                 [-1, 1000]       4,097,000
================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 218.59
Params size (MB): 527.79
Estimated Total Size (MB): 746.96
----------------------------------------------------------------
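The size figures at the bottom of this table can be reproduced with simple arithmetic. The sketch below is a check against the printed numbers, not the library's code. It assumes 4 bytes per value (float32), 1 MB = 1024² bytes, and a factor of 2 on the layer outputs to cover the forward and backward passes; those assumptions are inferred from the figures above.

```python
BYTES_PER_VALUE = 4   # assumption: float32
MB = 1024 ** 2

# (number of table rows, channels, spatial side) for the convolutional part of VGG16
feature_rows = [
    (4, 64, 224), (1, 64, 112),
    (4, 128, 112), (1, 128, 56),
    (6, 256, 56), (1, 256, 28),
    (6, 512, 28), (1, 512, 14),
    (6, 512, 14), (1, 512, 7),
]
# Rows 32-37 each output 4096 values, row 38 outputs 1000.
classifier_values = 6 * 4096 + 1000

output_values = sum(n * c * s * s for n, c, s in feature_rows) + classifier_values

input_mb = 3 * 224 * 224 * BYTES_PER_VALUE / MB
forward_backward_mb = 2 * output_values * BYTES_PER_VALUE / MB
params_mb = 138_357_544 * BYTES_PER_VALUE / MB
total_mb = input_mb + forward_backward_mb + params_mb

print(f"{input_mb:.2f}")             # 0.57
print(f"{forward_backward_mb:.2f}")  # 218.59
print(f"{params_mb:.2f}")            # 527.79
print(f"{total_mb:.2f}")             # 746.96
```

The total is 746.96 rather than the sum of the three rounded figures (746.95) because the unrounded values are added before rounding.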

Multiple Inputs

import torch
import torch.nn as nn
from torchsummary import summary

class SimpleConv(nn.Module):
    def __init__(self):
        super(SimpleConv, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
        )

    def forward(self, x, y):
        x1 = self.features(x)
        x2 = self.features(y)
        return x1, x2
    
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleConv().to(device)

summary(model, [(1, 16, 16), (1, 28, 28)])

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1            [-1, 1, 16, 16]              10
              ReLU-2            [-1, 1, 16, 16]               0
            Conv2d-3            [-1, 1, 28, 28]              10
              ReLU-4            [-1, 1, 28, 28]               0
================================================================
Total params: 20
Trainable params: 20
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.77
Forward/backward pass size (MB): 0.02
Params size (MB): 0.00
Estimated Total Size (MB): 0.78
----------------------------------------------------------------
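Two figures in this table are worth checking by hand. The sketch below is plain arithmetic, assuming 4 bytes per value and 1 MB = 1024² bytes. The interpretation of the input size is inferred from the printed numbers, not from the library's documentation.

```python
MB = 1024 ** 2
BYTES_PER_VALUE = 4  # assumption: float32

# Conv2d(1, 1, kernel_size=3) holds 9 weights and 1 bias.
conv_params = 1 * 1 * 3 * 3 + 1
# self.features is called twice in forward(), so the same layer is listed twice.
listed_total = 2 * conv_params

def numel(shape):
    # Number of values in a tensor of the given shape.
    count = 1
    for dim in shape:
        count *= dim
    return count

inputs = [(1, 16, 16), (1, 28, 28)]
summed_mb = sum(numel(s) for s in inputs) * BYTES_PER_VALUE / MB
product_mb = numel(inputs[0]) * numel(inputs[1]) * BYTES_PER_VALUE / MB

print(conv_params)          # 10
print(listed_total)         # 20
print(f"{summed_mb:.2f}")   # 0.00
print(f"{product_mb:.2f}")  # 0.77
```

The module owns only 10 distinct parameters, and the table reports 20 because the shared layer appears once per call. The 0.77 MB input size matches the product of the two input sizes rather than their sum, so with multiple inputs the reported input size and estimated total may overstate the real memory use.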

References

The idea for this package was sparked by this PyTorch issue.
Thanks to @ncullen93 and @HTLife.
Thanks to @jacobkimmel for the model size estimation (details here).

License

pytorch-summary is MIT-licensed.

Owner

Shubham Chandel
