A criticism of a recent paper on buggy image downsampling methods in popular image processing and deep learning libraries.

Last update: Jul 12, 2022

Related tags

Deep Learning buggy-resizing-critique

Overview

A Criticism of the Paper On Buggy Resizing Libraries

This repository contains:

a Jupyter notebook for reproducing the aliased image downsampling fenomenon, as demonstrated in the On Buggy Resizing Libraries paper, which argues that the image downsampling methods of the OpenCV, Tensorflow and PyTorch libraries are "buggy", with only PIL being correct.
simple solutions for antialiasing in every framework, which solves the issue in all cases using the same functions, simply by setting parameters appropriately:
- OpenCV: change the interpolation from bilinear to area (from cv2.INTER_LINEAR to cv2.INTER_AREA)
- Tensorflow: set the antialias flag to True
- PyTorch: change the interpolation mode from bilinear to area, or simply use torchvision.transforms.Resize() instead of torch.nn.functional.interpolate()

Try it out in a Colab Notebook:

My opinion:

neither of the used image downsampling methods is "buggy", not applying antialiasing by default is an understandable design decision for both image and tensor operations.
the main figure of the paper is misleading, and it only illustrates the issues of aliasing for image resizing.
the aliasing issue with downsampling can be solved in all frameworks by simply setting a few parameters correctly. My criticism is that this is not mentioned in the paper.
torchvision.transforms.Resize() is claimed to only be a "a wrapper around the PIL library" in a note in Section 3.2 of the paper. This is true for PIL image inputs, but is incorrect for torch.Tensors, which are resized using torchvision interpolation operations.
the remaining parts of the paper provide valuable insights into the effects of interpolation methods, quantization and compression on the FID score of generative models.

Update: Just found out that there is another, very thorough investigation of the same issue. Highly recommend checking the blogpost out. They also implement an OpenCV-compatible Pillow-equivalent resizing that provides proper antialiasing for all interpolations.

Bilinear downsampling results with and without aliasing:

The main figure (Figure 1) of the paper:

A criticism of a recent paper on buggy image downsampling methods in popular image processing and deep learning libraries.

Related tags

Overview

A Criticism of the Paper On Buggy Resizing Libraries

Owner

Rlmm blender toolkit - A set of tools to streamline level generation in UDK straight from Blender

Code for MSc Quantitative Finance Dissertation

This repository contains source code for the Situated Interactive Language Grounding (SILG) benchmark

Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL2021

Code for CVPR 2021 paper TransNAS-Bench-101: Improving Transferrability and Generalizability of Cross-Task Neural Architecture Search.

Rule Extraction Methods for Interactive eXplainability

A repository for storing njxzc final exam review material

Code repository for EMNLP 2021 paper 'Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods'

Official code for "Maximum Likelihood Training of Score-Based Diffusion Models", NeurIPS 2021 (spotlight)

Miscellaneous and lightweight network tools

Fast, general, and tested differentiable structured prediction in PyTorch

A fuzzing framework for SMT solvers

A lightweight library to compare different PyTorch implementations of the same network architecture.

Employee-Managment - Company employee registration software in the face recognition system

pytorch implementation of ABC : Auxiliary Balanced Classifier for Class-imbalanced Semi-supervised Learning

The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Fight Recognition from Still Images in the Wild @ WACVW2022, Real-world Surveillance Workshop

🎃 Core identification module of AI powerful point reading system platform.

Code for the ICML 2021 paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

This is the code for Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning