PyTorch extensions for fast R&D prototyping and Kaggle farming

Overview

Pytorch-toolbelt

pytorch-toolbelt is a Python library with a set of bells and whistles for PyTorch, aimed at fast R&D prototyping and Kaggle farming:

What's inside

  • Easy model building using flexible encoder-decoder architecture.
  • Modules: CoordConv, SCSE, Hypercolumn, Depthwise separable convolution and more.
  • GPU-friendly test-time augmentation (TTA) for segmentation and classification
  • GPU-friendly inference on huge (5000x5000) images
  • Every-day common routines (fix/restore random seed, filesystem utils, metrics)
  • Losses: BinaryFocalLoss, Focal, ReducedFocal, Lovasz, Jaccard and Dice losses, Wing Loss and more.
  • Extras for Catalyst library (Visualization of batch predictions, additional metrics)

Showcase: Catalyst, Albumentations, Pytorch Toolbelt example: Semantic Segmentation @ CamVid

Why

The honest answer is "I needed a convenient way to re-use code for my Kaggle career". During 2018 I achieved a Kaggle Master badge, and it has been a long path. Very often I found myself re-using most of the old pipelines over and over again. At some point it crystallized into this repository.

This lib is not meant to replace catalyst / ignite / fast.ai high-level frameworks. Instead it's designed to complement them.

Installation

pip install pytorch_toolbelt

How do I ...

Model creation

Create Encoder-Decoder U-Net model

Below is a code snippet that creates a vanilla U-Net model for binary segmentation. By design, both the encoder and the decoder produce a list of tensors, ordered from fine (high-resolution, index 0) to coarse (low-resolution) feature maps. Access to all intermediate feature maps is useful if you want to apply deep supervision losses to them, or if you are building an encoder-decoder for an object detection task, where access to intermediate feature maps is necessary.

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class UNet(nn.Module):
    def __init__(self, input_channels, num_classes):
        super().__init__()
        self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
        self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])

Create Encoder-Decoder FPN model with pretrained encoder

Similarly to the previous example, you can change the decoder to an FPN with concatenation.

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class SEResNeXt50FPN(nn.Module):
   def __init__(self, num_classes, fpn_channels):
       super().__init__()
       self.encoder = E.SEResNeXt50Encoder()
       self.decoder = D.FPNCatDecoder(self.encoder.channels, fpn_channels)
       self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

   def forward(self, x):
       x = self.encoder(x)
       x = self.decoder(x)
       return self.logits(x[0])

Change number of input channels for the Encoder

All encoders from pytorch_toolbelt support changing the number of input channels. Simply call encoder.change_input_channels(num_channels) and the first convolution layer will be changed. Whenever possible, existing weights of the convolutional layer will be re-used (if the new number of channels is greater than the default, the new weight tensor will be padded with randomly initialized weights). The method returns self, so this call can be chained.

from pytorch_toolbelt.modules import encoders as E

encoder = E.SEResnet101Encoder()
encoder = encoder.change_input_channels(6)
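
The weight re-use described above can be illustrated with plain NumPy. This is only a sketch of the idea (copy the pretrained kernels, randomly initialize the extra input channels), not the library's actual implementation. The function name expand_input_channels, the initialization scale and the truncation behaviour for fewer channels are assumptions made for illustration.

```python
import numpy as np

def expand_input_channels(weight, new_in_channels, rng):
    """Return a conv weight of shape (out, new_in, kh, kw) that re-uses `weight`.

    Hypothetical helper: the init scheme below is an assumption, not the library's.
    """
    out_ch, old_in, kh, kw = weight.shape
    # Start from small random values for every input channel.
    new_weight = rng.normal(0.0, 0.01, size=(out_ch, new_in_channels, kh, kw))
    # Copy as many pretrained input channels as fit into the new tensor.
    n = min(old_in, new_in_channels)
    new_weight[:, :n] = weight[:, :n]
    return new_weight

rng = np.random.default_rng(0)
pretrained = rng.normal(size=(64, 3, 7, 7))  # stand-in for a pretrained RGB stem
expanded = expand_input_channels(pretrained, 6, rng)
print(expanded.shape)
```

The first three input channels keep their pretrained kernels, so an RGB image placed in those channels initially produces much the same activations as before; only the three extra channels start from random values.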

Misc

Count number of parameters in encoder/decoder and other modules

When designing a model and tuning the number of features in a neural network, I find it quite useful to print the number of parameters in high-level blocks (like the encoder and decoder). Here is how to do it with pytorch_toolbelt:

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D
from pytorch_toolbelt.utils import count_parameters

class SEResNeXt50FPN(nn.Module):
    def __init__(self, num_classes, fpn_channels):
        super().__init__()
        self.encoder = E.SEResNeXt50Encoder()
        self.decoder = D.FPNCatDecoder(self.encoder.channels, fpn_channels)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])

net = SEResNeXt50FPN(1, 128)
print(count_parameters(net))
# Prints {'total': 34232561, 'trainable': 34232561, 'encoder': 25510896, 'decoder': 8721536, 'logits': 129}
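
For comparison, a similar per-block breakdown can be computed in plain PyTorch. The sketch below is not how count_parameters is implemented; count_params and TinyNet are hypothetical names used only to show where such numbers come from.

```python
from torch import nn

def count_params(module, trainable_only=False):
    """Sum the number of elements over a module's parameters."""
    return sum(
        p.numel()
        for p in module.parameters()
        if p.requires_grad or not trainable_only
    )

class TinyNet(nn.Module):
    """Hypothetical two-block model, used only to show the per-block breakdown."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Conv2d(3, 8, kernel_size=3)  # 3*8*3*3 weights + 8 biases
        self.logits = nn.Conv2d(8, 1, kernel_size=1)   # 8 weights + 1 bias

net = TinyNet()
summary = {
    "total": count_params(net),
    "trainable": count_params(net, trainable_only=True),
}
# One entry per top-level child module, keyed by attribute name.
summary.update({name: count_params(child) for name, child in net.named_children()})
print(summary)
```

The same arithmetic explains the 'logits': 129 entry in the output above: a 1x1 convolution from 128 channels to 1 class has 128 weights plus 1 bias.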

Compose multiple losses

There are multiple ways to combine several losses, and high-level DL frameworks like Catalyst offer far more flexible ways to achieve this, but here's my 100%-pure PyTorch implementation:

from pytorch_toolbelt import losses as L

# Creates a loss function that is a weighted sum of focal loss
# and Lovasz loss with weights 1.0 and 0.5, respectively.
loss = L.JointLoss(L.FocalLoss(), L.LovaszLoss(), 1.0, 0.5)
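
The idea of a weighted sum of two losses can be sketched in plain PyTorch as follows. This illustrates the concept and is not the library's code: the class name WeightedSumLoss is hypothetical, and BCE plus L1 are picked only because they ship with PyTorch and make the result easy to check by hand.

```python
import math

import torch
from torch import nn

class WeightedSumLoss(nn.Module):
    """Hypothetical stand-in: weighted sum of two loss modules."""

    def __init__(self, first, second, first_weight=1.0, second_weight=1.0):
        super().__init__()
        self.first = first
        self.second = second
        self.first_weight = first_weight
        self.second_weight = second_weight

    def forward(self, *inputs):
        # Both losses receive the same (prediction, target) arguments.
        return (
            self.first(*inputs) * self.first_weight
            + self.second(*inputs) * self.second_weight
        )

criterion = WeightedSumLoss(nn.BCEWithLogitsLoss(), nn.L1Loss(), 1.0, 0.5)
logits = torch.zeros(2, 1, 4, 4)
target = torch.ones(2, 1, 4, 4)
loss = criterion(logits, target)
# BCE at logit 0 with target 1 is log(2); L1 between 0 and 1 is 1.
print(loss.item(), math.log(2) + 0.5)
```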

TTA / Inferencing

Apply Test-time augmentation (TTA) for the model

Test-time augmentation (TTA) can be used in both training and testing phases.

from pytorch_toolbelt.inference import tta

# UNet is the class defined in the model creation example above
model = UNet(input_channels=3, num_classes=1)

# Truly functional TTA for image classification using horizontal flips:
logits = tta.fliplr_image2label(model, input)

# Truly functional TTA for image segmentation using D4 augmentation:
logits = tta.d4_image2mask(model, input)
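
To make the idea concrete, here is a minimal NumPy sketch of horizontal-flip TTA for segmentation: predict on the input and on its mirrored copy, flip the second prediction back so the pixels line up, and average. It is not the library's implementation; fliplr_tta_mask and toy_model are hypothetical names, and a callable that maps NCHW arrays to N1HW arrays stands in for a real network.

```python
import numpy as np

def fliplr_tta_mask(model, batch):
    """Average predictions over the identity and a horizontal flip (NCHW layout)."""
    pred = model(batch)
    # Flip along width, predict, then flip the prediction back.
    pred_flipped = model(batch[..., ::-1])[..., ::-1]
    return 0.5 * (pred + pred_flipped)

def toy_model(x):
    # Hypothetical "model": adds a left-to-right ramp, so it is not flip-invariant.
    ramp = np.linspace(0.0, 1.0, x.shape[-1])
    return x.mean(axis=1, keepdims=True) + ramp

batch = np.random.default_rng(0).random((2, 3, 8, 8))
averaged = fliplr_tta_mask(toy_model, batch)
print(averaged.shape)
```

In this toy case the ramp and its mirror image sum to one, so averaging cancels the model's left-to-right bias and leaves a constant offset of 0.5.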

Inference on huge images:

Quite often, there is a need to perform image segmentation on an enormously big image (5000px and more). There are a few problems with such big pixel arrays:

  1. There are limits on the maximum size of CUDA tensors (concrete numbers depend on the driver and GPU version).
  2. Heavy CNN architectures may easily eat up all available GPU memory when running inference even on relatively small 1024x1024 images, leaving no room for bigger image resolutions.

One solution is to slice the input image into tiles (optionally overlapping), feed each tile through the model, and merge the results back together. This way you can guarantee an upper limit on GPU RAM usage while keeping the ability to process arbitrary-sized images on the GPU.

import numpy as np
from torch.utils.data import DataLoader
import cv2

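# Note: per the 0.4.1 release notes, CudaTileMerger was renamed to TileMerger and
# tensor_from_rgb_image was removed in favor of image_to_tensor.
from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger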
from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy


image = cv2.imread('really_huge_image.jpg')
model = get_model(...)

# Cut large image into overlapping tiles
tiler = ImageSlicer(image.shape, tile_size=(512, 512), tile_step=(256, 256))

# HWC -> CHW. Optionally, do normalization here
tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]

# Allocate a CUDA buffer for holding entire mask
merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)

# Run predictions for tiles and accumulate them
for tiles_batch, coords_batch in DataLoader(list(zip(tiles, tiler.crops)), batch_size=8, pin_memory=True):
    tiles_batch = tiles_batch.float().cuda()
    pred_batch = model(tiles_batch)

    merger.integrate_batch(pred_batch, coords_batch)

# Normalize accumulated mask and convert back to numpy
merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
merged_mask = tiler.crop_to_orignal_size(merged_mask)
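
The accumulate-then-normalize step can be sketched in NumPy as below. This is a simplified illustration under stated assumptions, not the ImageSlicer / CudaTileMerger code: tile_starts and predict_tiled are hypothetical names, the weights are uniform rather than pyramid-shaped, the image is assumed to be at least one tile in size, and the last tile is shifted back to the border rather than padding the image.

```python
import numpy as np

def tile_starts(size, tile, step):
    """Start offsets so that tiles of length `tile` cover [0, size)."""
    starts = list(range(0, max(size - tile, 0) + 1, step))
    if starts[-1] + tile < size:
        starts.append(size - tile)  # shift the final tile back to touch the border
    return starts

def predict_tiled(image, model, tile=64, step=32):
    """Run `model` on overlapping tiles and blend them by weighted averaging."""
    h, w = image.shape[:2]
    acc = np.zeros((h, w), dtype=np.float64)   # weighted sum of predictions
    norm = np.zeros((h, w), dtype=np.float64)  # sum of weights per pixel
    weight = np.ones((tile, tile))             # uniform; a pyramid favors tile centers
    for y in tile_starts(h, tile, step):
        for x in tile_starts(w, tile, step):
            pred = model(image[y:y + tile, x:x + tile])
            acc[y:y + tile, x:x + tile] += pred * weight
            norm[y:y + tile, x:x + tile] += weight
    # Every pixel is covered by at least one tile, so norm is never zero.
    return acc / norm

image = np.random.default_rng(0).random((100, 150))
merged = predict_tiled(image, lambda t: t * 2.0)  # pixel-wise stand-in "model"
print(merged.shape)
```

Only one tile of predictions is held at a time besides the two full-size accumulators, which is what bounds peak memory regardless of the image size.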

Advanced examples

  1. Inria Satellite Segmentation
  2. CamVid Semantic Segmentation

Citation

@misc{Khvedchenya_Eugene_2019_PyTorch_Toolbelt,
  author = {Khvedchenya, Eugene},
  title = {PyTorch Toolbelt},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/BloodAxe/pytorch-toolbelt}},
  commit = {cc5e9973cdb0dcbf1c6b6e1401bf44b9c69e13f3}
}
Comments
  • Is compute_pyramid_patch_weight_loss correctly implemented?

    The line https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L33 can be deleted.

    Are https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L28 and https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L29 never updated, so that they stay zero?

    P.S. Numpy is very slow. Replacing sqrt and square speeds things up a lot.

    opened by ternaus 7
  • Dice Loss/Score question

    Hey Eugene,

    First of all, thank you for this very useful package. I'm transferring my environment from TF to PyTorch now, and having your advanced losses is very helpful. However, when I trained the same model on the same data using the same loss functions in both frameworks, I noticed that I get very different loss numbers (I'm using a multilabel approach). Digging a little deeper into your code, I noticed that when you calculate the Dice Loss you always calculate the per-sample AND per-channel loss and then average it. I don't understand why you are doing the per-channel calculation and averaging, rather than computing the Dice loss for all classes together. I can show what I mean with a dummy example below:

    Let's prepare 2 dummy multilabel matrices - ground truth (d_gt) and prediction (d_pr) with 3 classes each, 0 Red, 1 Green and 2 Blue:

    d_gt = np.zeros(shape=(20,20,3))
    d_gt[5:10,5:10,0] = 1
    d_gt[10:15,10:15,1] = 1
    d_gt[:,:,2] = (1 - d_gt.sum(axis=-1, keepdims=True)).squeeze()
    plt.imshow(d_gt)

    d_pr = np.zeros(shape=(20,20,3))
    d_pr[4:9,4:9,0] = 1
    d_pr[11:14,11:14,1] = 1
    d_pr[:,:,2] = (1 - d_pr.sum(axis=-1, keepdims=True)).squeeze()
    plt.imshow(d_pr)

    One can see that (using Dice Loss = 1- Dice Score):

    • Dice Loss for Red is 1 - ((16+16) / (25+25)) = 0.36
    • Dice Loss for Green is 1 - ((9+9) / (9+25)) = 0.4706
    • Dice Loss for Blue is 1 - ((341+341) / (350+366)) = 0.0475

    However, total Dice Loss for the whole picture is 1 - (2*(16+9+341)) / (2*400) = 0.085

    After wrapping them into tensors:

    d_gt_tensor = torch.from_numpy(np.transpose(d_gt,(2,0,1))).unsqueeze(0)
    d_pr_tensor = torch.from_numpy(np.transpose(d_pr,(2,0,1))).unsqueeze(0)

    what your Dice Loss (with from_logits=False) returns is 0.2927, which is the average of the individual channel losses instead of the total loss. The culprit seems to be passing dims=(0,2) to the soft_dice_score function; I think that dims=(1,2) should be passed instead to get individual scores for each item in the batch? Unless this behaviour is intended, but then I'd need some more explanation why.
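
The two ways of aggregating can be reproduced with a few lines of NumPy. This is a sketch of the arithmetic in this example only, not of the library's soft_dice_score; dice_loss and make_masks are hypothetical helper names.

```python
import numpy as np

def dice_loss(gt, pr):
    """1 - Dice score for binary masks of any shape."""
    intersection = (gt * pr).sum()
    return 1.0 - 2.0 * intersection / (gt.sum() + pr.sum())

def make_masks(red, green):
    """Build a 20x20x3 one-hot mask: red square, green square, blue background."""
    m = np.zeros((20, 20, 3))
    m[red, red, 0] = 1
    m[green, green, 1] = 1
    m[:, :, 2] = 1 - m[:, :, :2].sum(axis=-1)
    return m

d_gt = make_masks(slice(5, 10), slice(10, 15))
d_pr = make_masks(slice(4, 9), slice(11, 14))

# Per-channel losses, their mean, and the loss over all channels at once.
per_channel = [dice_loss(d_gt[..., c], d_pr[..., c]) for c in range(3)]
mean_of_channels = float(np.mean(per_channel))
total = dice_loss(d_gt, d_pr)
print(per_channel, mean_of_channels, total)
```

Averaging per-channel losses gives small classes (red, green) the same weight as the large background class, which is why the mean (about 0.2927) is much higher than the pooled value (0.085).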

    A second, smaller question regarding your Dice Loss: why do you use from_logits=True by default?

    Thanks in advance!

    opened by JanSobus 5
  • Is dependency on `opencv-python` necessary?

    Depending on opencv-python makes it difficult to use the library in the docker environment since there is typically no gui. Would it be possible to depend on the opencv-python-headless instead?

    Thanks.

    opened by MikiGrit 4
  • integrate_batch throws error: RuntimeError: The size of tensor a (6) must match the size of tensor b (928) ...

    Hi, I'm trying to use your tiling tools with my yolov5 model but in the following line I get following error:

    https://github.com/BloodAxe/pytorch-toolbelt/blob/cab4fc4e209d9c9e5db18cf1e01bb979c65cf08b/pytorch_toolbelt/inference/tiles.py#L341

    RuntimeError: The size of tensor a (6) must match the size of tensor b (928) at non-singleton dimension 2

    The debugger shows a tile tensor size of (52983,6) and a weight tensor size of (1, 928,928). What could be the reason for the difference in the tensor size?

    Some more info: the model input size is 928x928 and the image size is 3840x2160. I am loading the model using DetectMultiBackend from yolov5.

    opened by jokober 4
  • TypeError: object of type 'int' has no len()

    I am unable to create a basic UNet model from the library as given on the readme. Here's the code for the same:

    from torch import nn
    from pytorch_toolbelt.modules import encoders as E
    from pytorch_toolbelt.modules import decoders as D
    
    class UNet(nn.Module):
        def __init__(self, input_channels, num_classes):
            super().__init__()
            self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
            self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
            self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
    
        def forward(self, x):
            x = self.encoder(x)
            x = self.decoder(x)
            return self.logits(x[0])
        
    model= UNet(input_channels= 3, num_classes= 1)
    

    Error:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-1-4e8064bebb83> in <module>
         15         return self.logits(x[0])
         16 
    ---> 17 model= UNet(input_channels= 3, num_classes= 1)
    
    <ipython-input-1-4e8064bebb83> in __init__(self, input_channels, num_classes)
          7         super().__init__()
          8         self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
    ----> 9         self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
         10         self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
         11 
    
    ~/anaconda3/envs/dl_gpu/lib/python3.7/site-packages/pytorch_toolbelt/modules/decoders/unet.py in __init__(self, feature_maps, decoder_features, unet_block, upsample_block)
         38             decoder_features = [None] * num_blocks
         39         else:
    ---> 40             if len(decoder_features) != num_blocks:
         41                 raise ValueError(f"decoder_features must have length of {num_blocks}")
         42         in_channels_for_upsample_block = feature_maps[-1]
    
    TypeError: object of type 'int' has no len()
    
    opened by sainatarajan 4
  • Getting out of memory by using inference on huge images

    I have tried pretty small slices but still get CUDA out of memory at ---> 23 pred_batch = best_model(tiles_batch)[:, 0:1, :,:]. As far as I can see, it proceeded for a few steps and then failed. I have a GPU with 8 GB; the model is a U-Net with a heavy encoder. The image shape is (6300, 6304, 3).

    import numpy as np
    import torch
    from torch.utils.data import DataLoader
    import cv2
    from tqdm import tqdm_notebook
    from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
    from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy
    
    
    image = img_to_predict
    
    # Cut large image into overlapping tiles
    tiler = ImageSlicer(image.shape, tile_size=(64, 64), tile_step=(64, 64), weight='pyramid')
    
    # HWC -> CHW. Optionally, do normalization here
    tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]
    
    # Allocate a CUDA buffer for holding entire mask
    merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)
    
    # Run predictions for tiles and accumulate them
    for tiles_batch, coords_batch in tqdm_notebook(DataLoader(list(zip(tiles, tiler.crops)), batch_size=1, pin_memory=True)):
        tiles_batch = tiles_batch.float().cuda()
        pred_batch = best_model(tiles_batch)[:, 0:1, :,:] # taking only first channel
    
        merger.integrate_batch(pred_batch, coords_batch)
    
    # Normalize accumulated mask and convert back to numpy
    merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
    merged_mask = tiler.crop_to_orignal_size(merged_mask)
    
    opened by Diyago 3
  • UnetSegmentationModel dimension won't match

    I want to try hrnet34_unet64 for image segmentation using:

    encoder = E.HRNetV2Encoder34(pretrained=pretrained, layers=[0, 1, 2, 3, 4])
    UnetSegmentationModel(encoder, num_classes=num_classes, unet_channels=[64, 128, 256, 512], dropout=dropout)
    

    And got an error: RuntimeError: Sizes of tensors must match except in dimension 2. Got 128 and 256 (The offending index is 0)

    Could you please let me know what is wrong? Thanks!

    opened by xdtl 2
  • SoftCrossEntropyLoss error

    When I use the SoftCrossEntropyLoss, I got the error:

    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

    Could anyone help me? BTW, what paper proposed the SoftCrossEntropyLoss?

    opened by somebodyus 2
  • performance of ImageSlicer weight=pyramid

    ImageSlicer with weight=pyramid is/was super slow to initialize. It is the weight used in README.md example "Inference on huge images". (in https://github.com/BloodAxe/pytorch-toolbelt/issues/23 performance was mentioned and I guess it was the reason people look at this code)

    opened by ksenobojca 2
  •  FocalLoss

    🐛 Bug

    There are two types of focal loss here (BinaryFocalLoss and FocalLoss): https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/losses/focal.py

    Both of these functions are calling the focal_loss_with_logits function, while the second one should use softmax_focal_loss_with_logits.

    opened by mehran66 1
  • Focal loss error

    Multiclass Focal loss returns error.

        loss = criterion(preds, target)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 32, in forward
        return self.first(*input) + self.second(*input)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 18, in forward
        return self.loss(*input) * self.weight
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/focal.py", line 89, in forward
        loss += self.focal_loss_fn(cls_label_input, cls_label_target)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/functional.py", line 45, in focal_loss_with_logits
        logpt = F.binary_cross_entropy_with_logits(output, target, reduction="none")
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2580, in binary_cross_entropy_with_logits
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
    ValueError: Target size (torch.Size([5, 1, 256, 256])) must be the same as input size (torch.Size([5, 256, 256]))
    Exception ignored in: <function tqdm.__del__ at 0x7fd03260d400>
    Traceback (most recent call last):
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1128, in __del__
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1341, in close
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1520, in display
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1131, in __repr__
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1481, in format_dict
    TypeError: cannot unpack non-iterable NoneType object
    

    I think that line 83 in pytorch_toolbelt/losses/focal.py should be changed from cls_label_input = label_input[:, cls, ...] to cls_label_input = label_input[:, cls, ...].unsqueeze(1)

    opened by vbakhteev 1
  • Detailed documentation is recommended

    Thank you very much for making such a good library. It would be nice to have a more detailed document, for example, https://smp.readthedocs.io/en/latest/

    enhancement Looking for contributors 
    opened by Hengwei-Zhao96 1
Releases(0.6.2)
  • 0.6.1(Oct 25, 2022)

  • 0.6.0(Oct 20, 2022)

  • 0.5.3(Oct 20, 2022)

    Bugfixes

    • Fix https://github.com/BloodAxe/pytorch-toolbelt/issues/78 thanks https://github.com/mehran66 for pointing this out

    New Stuff

    • InriaAerialImageDataset for working with Inria Aerial Dataset
    • get_collate_for_dataset function to get collate fn if a dataset instance (argument) exposes get_collate_fn method. Works also for ConcatDataset.

    Improvements

    • DatasetMeanStdCalculator supports dtype to specify accumulator type (float64 by default)
  • 0.5.2(Aug 26, 2022)

    BugFixes

    • Fixed bug in ApplySoftmaxTo and ApplySigmoidTo modules that could lead to activations not applied to input when it was a string

    New API

    • Added fs.find_images_in_dir_recursive
    • Added utils.describe_outputs to return a human-friendly representation of complex (dict, nested list, etc) outputs to see shape, mean/std of each tensor.

    Other

    More MyPy fixes & type annotations

  • 0.5.1(Jun 27, 2022)

    New API

    • Added fs.find_subdirectories_in_dir to retrieve list of subdirectories (non-recursive) in the given directory.
    • Added logodd averaging of TTA predictions and counterpart logodd_mean function.

    Improvements

    • In plot_confusion_matrix one can disable plotting scores in each cell using show_scores argument (True by default).
    • freeze_model method now returns input module argument.
  • 0.5.0(Mar 10, 2022)

    Version 0.5.0

    This is the major release update of Pytorch Toolbelt. It's been a long time since the last update and there are many improvements & updates since 0.4.4:

    New features

    • Added class pytorch_toolbelt.datasets.DatasetMeanStdCalculator to compute mean & std of the dataset that does not fit entirely in memory.
    • New decoder module: BiFPNDecoder
    • New encoders: SwinTransformer, SwinB, SwinL, SwinT, SwinS
    • Added broadcast_from_master function to distributed utils. This method allows scattering a tensor from the master node to all nodes.
    • Added reduce_dict_sum to gather & concatenate dictionary of lists from all nodes in DDP.
    • Added master_print as a drop-in replacement to print that prints to stdout only on the zero-rank node.

    Bug Fixes

    • Fix bug in lovasz loss by @seefun in https://github.com/BloodAxe/pytorch-toolbelt/pull/62

    Breaking changes

    • Bounding boxes matching method has been divided into two: match_bboxes and match_bboxes_hungarian. The first method uses scores of predicted bboxes and matches most confident predictions first, while the match_bboxes_hungarian matches bboxes to maximize overall IoU.
    • set_manual_seed now sets random seed for Numpy.
    • to_numpy now correctly works for None and all iterables (Not only tuple & list)

    Fixes & Improvements (NO BC)

    • Added dim argument to ApplySoftmaxTo to specify channel for softmax operator (default value is 1, which was hardcoded previously)
    • ApplySigmoidTo now applies in-place sigmoid (Purely performance optimization)
    • TileMerger now supports specifying a device (Torch semantics) for storing intermediate tensors of accumulated tiles.
    • All TTA functions support PyTorch Tracing
    • MultiscaleTTA now supports a model that returns a single Tensor (Key-Value outputs still work as before)
    • balanced_binary_cross_entropy_with_logits and BalancedBCEWithLogitsLoss now support ignore_index argument.
    • BiTemperedLogisticLoss & BinaryBiTemperedLogisticLoss also got support of ignore_index argument.
    • focal_loss_with_logits now also supports ignore_index. Computation of ignored values has been moved from BinaryFocalLoss to this function.
    • Reduced number of boilerplates & hardcoded code for encoders from timm. Now GenericTimmEncoder queries output strides & feature maps directly from the timm's encoder instance.
    • HRNet-based encoders now have a use_incre_features argument to specify whether output feature maps should have an increased number of features.
    • change_extension, read_rgb_image, read_image_as_is functions now support Path as input argument. Return type (str) remains unchanged.
    • count_parameters now accepts human_friendly argument to print parameters count in human-friendly form 21.1M instead of 21123123.
    • plot_confusion_matrix now has format_string argument (None by default) to specify custom format string for values in confusion matrix.
    • RocAucMetricCallback for Catalyst got fix_nans argument to fix NaN outputs, which caused roc_auc to raise an exception and break the training.
    • BestWorstMinerCallback now additionally logs batch with NaN value in monitored metric
  • 0.4.4(Aug 12, 2021)

    New features

    • New tiled processing classes for 3D data - VolumeSlicer and VolumeMerger. Designed similarly to ImageSlicer. Now you can run 3D segmentation on huge volumes without risk of OOM.
    • Support of labels (scalar or 1D vector) augmentation/deaugmentation in D2, D4 and flip-style TTA.
    • Balanced BCE loss (BalancedBCEWithLogitsLoss)
    • Bi-Tempered loss 'BiTemperedLogisticLoss'
    • SelectByIndex helper module to pick named output of the model (For use in nn.Sequential)
    • New encoders MobileNetV3Large, MobileNetV3Small from torchvision.
    • New encoders from timm package (HRNets, ResNetD, EfficientNetV2 and others).
    • DeepLabV3 & DeepLabV3+ Decoders
    • Pure PyTorch-based implementation for bbox matching (match_bboxes) that supports both CPU/GPU matching using hungarian algorithm.

    Bugfixes

    • Fix bug in Lovasz Loss (#62), thanks @seefun

    Breaking Changes

    • Parameter ignore renamed to ignore_index in BinaryLovaszLoss class.
    • Renamed fpn_channels argument in constructor of FPNSumDecoder and FPNCatDecoder to channels.
    • Renamed output_channels argument in constructor of HRNetSegmentationDecoder to channels.
    • conv1x1 now sets bias to zero by default
    • Bumped up minimal pytorch version to 1.8.1

    Other Improvements

    • Ensembler class now correctly works with torch.jit.tracing
    • Numerous docstrings & type annotations enhancements
  • 0.4.3(Apr 2, 2021)

    PyTorch Toolbelt 0.4.3

    Modules

    • Added missing sigmoid activation support to get_activation_block
    • Make Encoders support JIT & Tracing
    • Better support for encoders from timm (They are named with the prefix Timm)

    Utils

    • rgb_image_from_tensor now clips values

    TTA & Ensembling

    • Ensembler now supports arithmetic, geometric & harmonic averaging via reduction parameter.
    • Bring geometric & harmonic averaging to all TTA functions as well

    Datasets

    • read_binary_mask
    • Refactor SegmentationDataset to support strided masks for deep supervision
    • Added RandomSubsetDataset and RandomSubsetWithMaskDataset to sample dataset based on some condition (E.g. sample only samples of particular class)

    Other

    As usual, more tests, better type annotations & comments

  • 0.4.2(Mar 3, 2021)

    Breaking Changes

    • Bump up minimal PyTorch version to 1.7.1

    New features

    • New dataset classes ClassificationDataset, SegmentationDataset for easy every-day use in Kaggle
    • New losses: FocalCosineLoss, BiTemperedLogisticLoss, SoftF1Loss
    • Support of new activations for get_activation_block (Silu, Softplus, Gelu)
    • More encoders from timm package: NFNets, NFRegNet, HRNet, DPN
    • RocAucMetricCallback for Catalyst
    • MultilabelAccuracyCallback and AccuracyCallback with DDP support

    Bugfixes

    • Fix invalid prefix in catalyst registry to from tbt to tbt.
  • 0.4.1(Jan 14, 2021)

    New features

    • Added Soft-F1 loss for direct optimization of F1 score (Binary case only)
    • Fully rework TTA (Kept backward compatibility where it's possible) module for inference.
    • Added support of ignore_index to Dice & Jaccard losses.
    • Improved Lovasz loss to work in fp16 mode.
    • Added option to override selected params in make_n_channel_input.
    • More Encoders, from timm package.
    • FPNFuse module now works on 2D, 3D and N-D inputs.
    • Added Global K-Max 2D pooling block.
    • Added Generalized mean pooling 2D block.
    • Added softmax_over_dim_X, argmax_over_dim_X shorthand functions for use in metrics to get soft/hard labels without using lambda functions.
    • Added helper visualization functions to add fancy header to image, stack images of different sizes.
    • Improved rendering of confusion matrix.

    Catalyst goodies

    • Encoders & Losses are available in Catalyst registry
    • StopIfNanCallback
    • Added OutputDistributionCallback to log distribution of predictions to TensorBoard.
    • Added UMAPCallback to visualize embedding space using UMAP in TensorBoard.

    Breaking Changes

    • Renamed CudaTileMerger to TileMerger. TileMerger allows to specify target device explicitly.
    • tensor_from_rgb_image removed in favor of image_to_tensor.

    Bug fixes & Improvements

    • Improve numeric stability of focal_loss_with_logits when reduction="sum"
    • Prevent NaN in FocalLoss when all elements are equal to ignore_index value.
    • A LOT of type hints.
  • 0.4.0(Aug 19, 2020)

    New features

    • Memory-efficient Swish and Mish activation functions (Credit goes to http://github.com/rwightman/pytorch-image-models)
    • Refactor EfficientNet encoders (no pretrained weights yet)

    Fixes

    • Fixed incorrect default value for ignore_index in SoftCrossEntropyLoss

    Breaking changes

    • All catalyst-related utils updated to be compatible with Catalyst 20.8.2
    • Remove PIL package dependency

    Improvements

    • More comments, more type hints
  • 0.3.2(Apr 28, 2020)

    New features

    • Many helpful callbacks for Catalyst library: HyperParameterCallback, LossAdapter to name a few.
    • New losses for deep model supervision (Helpful, when size of target and output mask are different)
    • Stacked Hourglass encoder
    • Context Aggregation Network decoder

    Breaking Changes

    • ABN module will now resolve as nn.Sequential(BatchNorm2d, Activation) instead of a hand-crafted module. This enables easier conversion of batch normalization modules to the nn.SyncBatchNorm.

    • Almost every Encoder/Decoder implementation has been refactored for better clarity and flexibility. Please double-check your pipelines.

    Important bugfixes

    • Improved numerical stability of Dice / Jaccard losses (Using log_sigmoid() + exp() instead of plain sigmoid() )

    Other

    • Lots of comments for functions and modules
    • Code cleanup, thanks to DeepSource
    • Type annotations for modules and functions
    • Update of README
  • 0.3.1(Feb 25, 2020)

    Fixes

    • Fixed a bug in the computation of the IoU metric in the binary_dice_iou_score function
    • Fixed incorrect default value in SoftCrossEntropyLoss #38

    Improvements

    • Function draw_binary_segmentation_predictions now has an image_format parameter (rgb|bgr|gray) that specifies the image format, so that images are rendered correctly in TensorBoard
    • More type annotations across the codebase

    New features

    • New visualization function draw_multilabel_segmentation_predictions
  • 0.3.0(Jan 17, 2020)

    PyTorch Toolbelt 0.3.0

    This release has a huge set of new features, bugfixes and breaking changes, so be careful when upgrading.

    pip install pytorch-toolbelt==0.3.0

    New features

    Encoders

    • HRNetV2
    • DenseNets
    • EfficientNet
    • Encoder class has change_input_channels method to change number of channels in input image

    New losses

    • BCELoss with support of ignore_index
    • SoftBCELoss (Label smoothing loss for binary case with support of ignore_index)
    • SoftCrossEntropyLoss (Label smoothing loss for multiclass case with support of ignore_index)
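As a rough illustration of what "label smoothing for the binary case with support of ignore_index" means, here is a minimal sketch in plain Python. It assumes one common formulation of label smoothing (hard targets 0/1 are pulled towards each other by smooth_factor); the function name soft_bce_with_logits, the parameter names and the default ignore_index=-100 are assumptions made for this example and are not taken from the library's API.

```python
import math


def soft_bce_with_logits(logits, targets, smooth_factor=0.1, ignore_index=-100):
    """Mean binary cross-entropy on smoothed targets, skipping ignored elements.

    All names and defaults here are hypothetical; this is a sketch, not the
    library's implementation.
    """
    total, count = 0.0, 0
    for z, t in zip(logits, targets):
        if t == ignore_index:
            continue  # ignored elements contribute nothing to the loss
        # Label smoothing: 1 -> 1 - smooth_factor, 0 -> smooth_factor
        soft_t = t * (1.0 - smooth_factor) + (1.0 - t) * smooth_factor
        # Stable BCE with logits: max(z, 0) - z * t + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * soft_t + math.log1p(math.exp(-abs(z)))
        count += 1
    # Return 0.0 instead of dividing by zero when every element is ignored
    return total / count if count else 0.0
```

Smoothing penalizes over-confident predictions: for a logit of 10.0 and target 1, the loss is close to zero without smoothing and close to 1.0 with smooth_factor=0.1.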

    Catalyst goodies

    • Online pseudolabeling callback
    • Training signal annealing callback

    Other

    • New activation functions support in ABN block: Swish, Mish, HardSigmoid
    • New decoders (Unet, FPN, DeeplabV3, PPM) to simplify creation of segmentation models
    • CREDITS.md to include all the references to code/articles. Existing list is definitely not complete, so feel free to make PRs
    • Object context block from OCNet

    API changes

    • Focal loss now supports normalized focal loss and reduced focal loss extensions.
    • Optimize computation of pyramid weight matrix #34
    • Default value align_corners=False in F.interpolate when doing bilinear upsampling.

    Bugfixes

    • Fix missing call to batch normalization block in FPNBottleneckBN
    • Fix numerical stability for DiceLoss and JaccardLoss when log_loss=True
    • Fix numerical stability when computing normalized focal loss
  • 0.2.1(Oct 7, 2019)

  • 0.2.0(Oct 4, 2019)

    PyTorch Toolbelt 0.2.0

    This release is dedicated to housekeeping work. Dice/IoU metrics and losses have been redesigned to reduce the amount of duplicated code and bring more clarity. Code is now auto-formatted using Black.

    pip install pytorch_toolbelt==0.2.0

    Catalyst contrib

    • Refactor Dice/IoU loss into a single metric, IoUMetricsCallback, with a few cool features: metric="dice|jaccard" chooses which metric to use; mode=binary|multiclass|multilabel specifies the problem type (binary, multiclass or multi-label segmentation); classes_of_interest=[1,2,4] selects the set of classes for which the metric is computed; and nan_score_on_empty=False computes Dice Accuracy (counts as 1.0 if both y_true and y_pred are empty; 0.0 if y_pred is not empty).
    • Added L-p regularization callback to apply L1 and L2 regularization to model with support of regularization strength scheduling.
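The empty-mask convention described for nan_score_on_empty can be sketched as follows. This is a hypothetical stand-alone function over flat 0/1 lists, not the callback's implementation. The behaviour for nan_score_on_empty=False follows the description above; returning NaN when nan_score_on_empty=True is an assumption inferred from the parameter name.

```python
import math


def dice_score(y_true, y_pred, nan_score_on_empty=False):
    """Dice coefficient for flat binary masks (lists of 0/1).

    Hypothetical helper for illustration only.
    """
    true_sum = sum(y_true)
    pred_sum = sum(y_pred)
    if true_sum == 0:
        if nan_score_on_empty:
            # Assumption: an empty ground truth yields NaN so the caller can skip it
            return math.nan
        # "Dice Accuracy": 1.0 if the prediction is empty too, otherwise 0.0
        return 1.0 if pred_sum == 0 else 0.0
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    return 2.0 * intersection / (true_sum + pred_sum)
```

For example, dice_score([0, 0], [0, 0]) gives 1.0 and dice_score([0, 0], [1, 0]) gives 0.0, so images without any positive pixels still contribute a meaningful score.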

    Losses

    • Refactor DiceLoss/JaccardLoss losses in the same fashion as metrics.

    Models

    • Add Densenet encoders
    • Bugfix: Fix missing BN+Relu in UNetDecoder
    • Global pooling modules can squeeze spatial channel dimensions if flatten=True.

    Misc

    • Add more unit tests
    • Code-style is now managed with Black
    • to_numpy now supports int, float scalar types
  • 0.1.4(Sep 12, 2019)

  • 0.1.3(Jul 24, 2019)

    PyTorch Toolbelt 0.1.3

    1. Added ignore_index for focal loss
    2. Added ignore_index to some metrics for Catalyst
    3. Added tif extension for find_images_in_dir
  • 0.1.1(Jun 29, 2019)

    New functionality / breaking changes

    • Added visualization functions to render best/worst batches for binary and semantic segmentation.
    • JaccardScoreCallback now is a single callback for computing IoU for binary/multiclass/multilabel segmentation.
    • Added HFF module (Hierarchical feature fusion).
    • Added set_trainable function to enable/disable training and batch norm on a module and its children.
    • RLE encoding/decoding (Hi, Kaggle)
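For readers unfamiliar with the run-length encoding (RLE) used in many Kaggle segmentation competitions, the sketch below shows the general idea on a flat binary mask. It assumes the common "start length" string format with 1-based start positions; the function names rle_encode and rle_decode and the exact format are assumptions for this illustration and may differ from the library's helpers. Competitions also differ in how a 2D mask is flattened (often column-major), which is left out here.

```python
def rle_encode(mask):
    """Encode a flat 0/1 sequence as 'start length start length ...' (1-based starts)."""
    runs = []
    start = None  # index where the current run of ones began, if any
    for i, v in enumerate(mask):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs += [start + 1, i - start]
            start = None
    if start is not None:  # a run that reaches the end of the mask
        runs += [start + 1, len(mask) - start]
    return " ".join(map(str, runs))


def rle_decode(rle, length):
    """Decode a 'start length ...' string back into a flat 0/1 list of given length."""
    mask = [0] * length
    nums = list(map(int, rle.split()))
    for start, run in zip(nums[0::2], nums[1::2]):
        for i in range(start - 1, start - 1 + run):
            mask[i] = 1
    return mask
```

Encoding and then decoding with the original length returns the original mask, and an all-zero mask encodes to an empty string.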

    API changes

    • rgb_image_from_tensor now accepts a dtype parameter for the returned image

    Bugfixes

    • Fixed wrong implementation of UpsampleAddConv (there was an extra residual connection)
  • 0.1.0(Jun 12, 2019)

    New stuff:

    1. EfficientNet
    2. Multiscale TTA module
    3. New activations: Swish, HardSwish, HardSigmoid
    4. AGN module (Activated Group Norm), mimics ABN

    Changes:

    1. SpatialGate2d now accepts squeeze_channels for explicit number of squeeze channels.

    Misc

    1. Code formatting
  • 0.0.9(Jun 3, 2019)

  • 0.0.8(May 19, 2019)

    • Global pooling, SCSE module and MobileNetV3 encoders are not ONNX and CoreML friendly.
    • Refactored FPN module for more flexible interpolate_add tuning (can use any module with two inputs)
  • 0.0.7(May 8, 2019)

  • 0.0.6(May 6, 2019)

    New features

    1. Added WiderResNet & WiderResNetA2 encoders (https://github.com/mapillary/inplace_abn)
    2. Added implementation of reduced focal loss (https://arxiv.org/abs/1903.01347)
  • 0.0.5(Apr 26, 2019)

    Changes

    • Added 10-Crop TTA (https://github.com/BloodAxe/pytorch-toolbelt/issues/4)
    • Added unit tests for TTA functions
    • Added freeze_bn function to freeze all BN layers in a model
    • Rename unpad_tensor to unpad_image_tensor to mimic pad_image_tensor

    Bugfixes

    • Fixed bug in d4_image2mask
  • 0.0.4(May 6, 2019)

  • 0.0.3(May 6, 2019)

Owner
Eugene Khvedchenya
AI/ML Advisor, Entrepreneur, Kaggle Master. Author of pytorch-toolbelt. Core maintainer of albumentations. Catalyst contributor.
Kaldi-compatible feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd

Fangjun Kuang 119 Jan 03, 2023
Distiller is an open-source Python package for neural network compression research.

Wiki and tutorials | Documentation | Getting Started | Algorithms | Design | FAQ Distiller is an open-source Python package for neural network compres

Intel Labs 4.1k Dec 28, 2022
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)

News SRU++, a new SRU variant, is released. [tech report] [blog] The experimental code and SRU++ implementation are available on the dev branch which

ASAPP Research 2.1k Jan 01, 2023
PyTorch extensions for fast R&D prototyping and Kaggle farming

Pytorch-toolbelt A pytorch-toolbelt is a Python library with a set of bells and whistles for PyTorch for fast R&D prototyping and Kaggle farming: What

Eugene Khvedchenya 1.3k Jan 05, 2023
A pure Python implementation of Compact Bilinear Pooling and Count Sketch for PyTorch.

Compact Bilinear Pooling for PyTorch. This repository has a pure Python implementation of Compact Bilinear Pooling and Count Sketch for PyTorch. This

Grégoire Payen de La Garanderie 234 Dec 07, 2022
3D-RETR: End-to-End Single and Multi-View3D Reconstruction with Transformers

3D-RETR: End-to-End Single and Multi-View 3D Reconstruction with Transformers (BMVC 2021) Zai Shi*, Zhao Meng*, Yiran Xing, Yunpu Ma, Roger Wattenhofe

Zai Shi 36 Dec 21, 2022
This is an differentiable pytorch implementation of SIFT patch descriptor.

This is an differentiable pytorch implementation of SIFT patch descriptor. It is very slow for describing one patch, but quite fast for batch. It can

Dmytro Mishkin 150 Dec 24, 2022
Pytorch implementation of Distributed Proximal Policy Optimization

Pytorch-DPPO Pytorch implementation of Distributed Proximal Policy Optimization: https://arxiv.org/abs/1707.02286 Using PPO with clip loss (from https

Alexis David Jacq 164 Jan 05, 2023
Fast Discounted Cumulative Sums in PyTorch

TODO: update this README! Fast Discounted Cumulative Sums in PyTorch This repository implements an efficient parallel algorithm for the computation of

Daniel Povey 7 Feb 17, 2022
A Pytorch Implementation for Compact Bilinear Pooling.

CompactBilinearPooling-Pytorch A Pytorch Implementation for Compact Bilinear Pooling. Adapted from tensorflow_compact_bilinear_pooling Prerequisites I

169 Dec 23, 2022
Pretrained EfficientNet, EfficientNet-Lite, MixNet, MobileNetV3 / V2, MNASNet A1 and B1, FBNet, Single-Path NAS

(Generic) EfficientNets for PyTorch A 'generic' implementation of EfficientNet, MixNet, MobileNetV3, etc. that covers most of the compute/parameter ef

Ross Wightman 1.5k Jan 01, 2023
A few Windows specific scripts for PyTorch

It is a repo that contains scripts that makes using PyTorch on Windows easier. Easy Installation Update: Starting from 0.4.0, you can go to the offici

408 Dec 15, 2022
Differentiable SDE solvers with GPU support and efficient sensitivity analysis.

PyTorch Implementation of Differentiable SDE Solvers This library provides stochastic differential equation (SDE) solvers with GPU support and efficie

Google Research 1.2k Jan 04, 2023
Fast, general, and tested differentiable structured prediction in PyTorch

Torch-Struct: Structured Prediction Library A library of tested, GPU implementations of core structured prediction algorithms for deep learning applic

HNLP 1.1k Jan 07, 2023
A lightweight wrapper for PyTorch that provides a simple declarative API for context switching between devices, distributed modes, mixed-precision, and PyTorch extensions.

Fidelity Investments 56 Sep 13, 2022
PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations

PyTorch Sparse This package consists of a small extension library of optimized sparse matrix operations with autograd support. This package currently

Matthias Fey 757 Jan 04, 2023
High-fidelity performance metrics for generative models in PyTorch

Vikram Voleti 5 Oct 24, 2021
Pretrained ConvNets for pytorch: NASNet, ResNeXt, ResNet, InceptionV4, InceptionResnetV2, Xception, DPN, etc.

Pretrained models for Pytorch (Work in progress) The goal of this repo is: to help to reproduce research papers results (transfer learning setups for

Remi 8.7k Dec 31, 2022
A tutorial on "Bayesian Compression for Deep Learning" published at NIPS (2017).

Code release for "Bayesian Compression for Deep Learning" In "Bayesian Compression for Deep Learning" we adopt a Bayesian view for the compression of

Karen Ullrich 190 Dec 30, 2022
PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference

PyTorch implementation of [1611.06440 Pruning Convolutional Neural Networks for Resource Efficient Inference] This demonstrates pruning a VGG16 based

Jacob Gildenblat 836 Dec 26, 2022