CAST: Character labeling in Animation using Self-supervision by Tracking

Related tags

Deep LearningCAST
Overview

CAST: Character labeling in Animation using Self-supervision by Tracking

(Published as a conference paper at EuroGraphics 2022)

Note: The CAST paper code, evaluation dataset, and models are to be stored here soon. Authors: Oron Nir, Gal Rapoport, and Ariel Shamir All technical details are available on the paper.

For more details reach out to oronnir11 at gmail dot com

Abstract

Cartoons and animation domain videos have very different characteristics compared to real-life images and videos. In addition, this domain carries a large variability in styles. Current computer vision and deep-learning solutions often fail on animated content because they were trained on natural images. In this paper we present a method to refine a semantic representation suitable for specific animated content. We first train a neural network on a large-scale set of animation videos and use the mapping to deep features as an embedding space. Next, we use self-supervision to refine the representation for any specific animation style by gathering many examples of animated characters in this style, using a multi-object tracking. These examples are used to define triplets for contrastive loss training. The refined semantic space allows better clustering of animated characters even when they have diverse manifestations. Using this space we can build dictionaries of characters in an animation videos, and define specialized classifiers for specific stylistic content (e.g.,\ characters in a specific animation series) with very little user effort. These classifiers are the basis for automatically labeling characters in animation videos. We present results on a collection of characters in a variety of animation styles.

The pipeline below illustrates the major components. These steps are available in the E2E scripts. CAST Pipeline

#Supported E2E flows: Here are the corresponding endpoint scripts (Python 3.7.9).

1. main_1.py Input: mp4, output: triplets (Windows).
   1.1. Split the video to shots.
   1.2. Sample frames.
   1.3. Run detector and vanilla embeddings.
   1.4. Run a Multi-Object Tracker per shot.
   1.5. Sample triplets. 
2. tuner_data_preper_2.py and anchor_triplets_json_2.py (Windows).
   2.1. Generate triplets with corresponding images for finetune.
   2.2. Prepare the JSON file with the bounding boxes to later embed using the finetuned model. 
3. fine_tune_series_3.py Triplets contrastive finetune (Linux).
   3.1. Run the finetune flow.
   3.2. Use the tuned model to better embed the bounding boxes.
4. cluster_4.py Grouping (Windows).
   4.1. Cluster the embedded boxes and find the cluster center/exemplar.
   4.2. Visualize with collages.

Other evaluation and visualization scripts that were developed towards EuroGraphics submission are also provided here.

#Citation Please cite our paper with the following bibtex:

@misc{nir2022cast,
      title={CAST: Character labeling in Animation using Self-supervision by Tracking}, 
      author={Oron Nir and Gal Rapoport and Ariel Shamir},
      year={2022},
      eprint={2201.07619},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
BT-Unet: A-Self-supervised-learning-framework-for-biomedical-image-segmentation-using-Barlow-Twins

BT-Unet: A-Self-supervised-learning-framework-for-biomedical-image-segmentation-using-Barlow-Twins Deep learning has brought most profound contributio

Narinder Singh Punn 12 Dec 04, 2022
OptaPlanner wrappers for Python. Currently significantly slower than OptaPlanner in Java or Kotlin.

OptaPy is an AI constraint solver for Python to optimize the Vehicle Routing Problem, Employee Rostering, Maintenance Scheduling, Task Assignment, School Timetabling, Cloud Optimization, Conference S

OptaPy 211 Jan 02, 2023
Breaking the Dilemma of Medical Image-to-image Translation

Breaking the Dilemma of Medical Image-to-image Translation Supervised Pix2Pix and unsupervised Cycle-consistency are two modes that dominate the field

Kid Liet 86 Dec 21, 2022
A smart Chat bot that can help to know about corona virus and Make prediction of corona using X-ray.

TRINIT_Hum_kuchh_nahi_karenge_ML01 Document Link https://github.com/Jatin-Goyal-552/TRINIT_Hum_kuchh_nahi_karenge_ML01/blob/main/hum_kuchh_nahi_kareng

JatinGoyal 1 Feb 03, 2022
This project is for a Twitter bot that monitors a bird feeder in my backyard. Any detected birds are identified and posted to Twitter.

Backyard Birdbot Introduction This is a silly hobby project to use existing ML models to: Detect any birds sighted by a webcam Identify whic

Chi Young Moon 71 Dec 25, 2022
Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers.

Contra-OOD Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers. Requirements PyTorch Transformers datasets

Wenxuan Zhou 27 Oct 28, 2022
NeuroMorph: Unsupervised Shape Interpolation and Correspondence in One Go

NeuroMorph: Unsupervised Shape Interpolation and Correspondence in One Go This repository provides our implementation of the CVPR 2021 paper NeuroMorp

Meta Research 35 Dec 08, 2022
This is the winning solution of the Endocv-2021 grand challange.

Endocv2021-winner [Paper] This is the winning solution of the Endocv-2021 grand challange. Dependencies pytorch # tested with 1.7 and 1.8 torchvision

Vajira Thambawita 14 Dec 03, 2022
Code for EMNLP2020 long paper: BERT-Attack: Adversarial Attack Against BERT Using BERT

BERT-ATTACK Code for our EMNLP2020 long paper: BERT-ATTACK: Adversarial Attack Against BERT Using BERT Dependencies Python 3.7 PyTorch 1.4.0 transform

Linyang Li 142 Jan 04, 2023
A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Convolutional Neural Network (CNN). This repository contains a source code of a deep learning network built with TensorFlow and Keras to classify gend

Pawel Dziemiach 1 Dec 19, 2021
Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks

LMMNN Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks This is the working dire

Giora Simchoni 10 Nov 02, 2022
Unofficial implementation of Perceiver IO: A General Architecture for Structured Inputs & Outputs

Perceiver IO Unofficial implementation of Perceiver IO: A General Architecture for Structured Inputs & Outputs Usage import torch from src.perceiver.

Timur Ganiev 111 Nov 15, 2022
Repository of our paper 'Refer-it-in-RGBD' in CVPR 2021

Refer-it-in-RGBD This is the repository of our paper 'Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images' in CVPR 2021 Pape

Haolin Liu 34 Nov 07, 2022
Semi-Supervised Semantic Segmentation with Cross-Consistency Training (CCT)

Semi-Supervised Semantic Segmentation with Cross-Consistency Training (CCT) Paper, Project Page This repo contains the official implementation of CVPR

Yassine 344 Dec 29, 2022
Source code of D-HAN: Dynamic News Recommendation with Hierarchical Attention Network

D-HAN The source code of D-HAN This is the source code of D-HAN: Dynamic News Recommendation with Hierarchical Attention Network. However, only the co

30 Sep 22, 2022
Official code repository for the publication "Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons"

Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons This repository contains the code to repr

Computational Neuroscience, University of Bern 3 Aug 04, 2022
A Flexible Generative Framework for Graph-based Semi-supervised Learning (NeurIPS 2019)

G3NN This repo provides a pytorch implementation for the 4 instantiations of the flexible generative framework as described in the following paper: A

Jiaqi Ma 14 Oct 11, 2022
Introduction to AI assignment 1 HCM University of Technology, term 211

Sokoban Bot Introduction to AI assignment 1 HCM University of Technology, term 211 Abstract This is basically a solver for Sokoban game using Breadth-

Quang Minh 4 Dec 12, 2022
Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

TargetCLIP- official pytorch implementation of the paper Image-Based CLIP-Guided Essence Transfer This repository finds a global direction in StyleGAN

Hila Chefer 221 Dec 13, 2022
Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving

GSAN Introduction Code for paper GSAN: Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving, wh

YE Luyao 6 Oct 27, 2022