Official implementation for "Style Transformer for Image Inversion and Editing" (CVPR 2022)

Overview

Style Transformer for Image Inversion and Editing (CVPR2022)

https://arxiv.org/abs/2203.07932

Existing GAN inversion methods fail to provide latent codes for reliable reconstruction and flexible editing simultaneously. This paper presents a transformer-based image inversion and editing model for pretrained StyleGAN which is not only with less distortions, but also of high quality and flexibility for editing. The proposed model employs a CNN encoder to provide multi-scale image features as keys and values. Meanwhile it regards the style code to be determined for different layers of the generator as queries. It first initializes query tokens as learnable parameters and maps them into $W^+$ space. Then the multi-stage alternate self- and cross-attention are utilized, updating queries with the purpose of inverting the input by the generator. Moreover, based on the inverted code, we investigate the reference- and label-based attribute editing through a pretrained latent classifier, and achieve flexible image-to-image translation with high quality results. Extensive experiments are carried out, showing better performances on both inversion and editing tasks within StyleGAN.


Our style transformer proposes novel multi-stage style transformer in w+ space to invert image accurately, and we characterize the image editing in StyleGAN into label-based and reference-based, and use a non-linear classifier to generate the editing vector.

Getting Started

Prerequisites

  • Ubuntu 16.04
  • NVIDIA GPU + CUDA CuDNN
  • Python 3

Pretrained Models

We provide the pre-trained models of inversion for face and car domains.

Training

Preparing Datasets

Update configs/paths_config.py with the necessary data paths and model paths for training and inference.

dataset_paths = {
    'train_data': '/path/to/train/data'
    'test_data': '/path/to/test/data',
}

Preparing Generator

We use rosinality's StyleGAN2 implementation. You can download the 256px pretrained model in the project and put it in the directory /pretrained_models.

Training Inversion Model

python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=results/train_style_transformer \
--batch_size=8 \
--test_batch_size=8 \
--val_interval=5000 \
--save_interval=10000 \
--stylegan_weights=pretrained_models/stylegan2-ffhq-config-f.pt

Inference

python scripts/inference.py \
--exp_dir=results/infer_style_transformer \
--checkpoint_path=results/train_style_transformer/checkpoints/best_model.pt \
--data_path=/test_data \
--test_batch_size=8 \

Citation

If you use this code for your research, please cite

@article{hu2022style,
  title={Style Transformer for Image Inversion and Editing},
  author={Hu, Xueqi and Huang, Qiusheng and Shi, Zhengyi and Li, Siyuan and Gao, Changxin and Sun, Li and Li, Qingli},
  journal={arXiv preprint arXiv:2203.07932},
  year={2022}
}
Owner
Xueqi Hu
Xueqi Hu
Codes accompanying the paper "Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning" (NeurIPS 2021 Spotlight

Implicit Constraint Q-Learning This is a pytorch implementation of ICQ on Datasets for Deep Data-Driven Reinforcement Learning (D4RL) and ICQ-MA on SM

42 Dec 23, 2022
Data-driven reduced order modeling for nonlinear dynamical systems

SSMLearn Data-driven Reduced Order Models for Nonlinear Dynamical Systems This package perform data-driven identification of reduced order model based

Haller Group, Nonlinear Dynamics 27 Dec 13, 2022
Modular Gaussian Processes

Modular Gaussian Processes for Transfer Learning 🧩 Introduction This repository contains the implementation of our paper Modular Gaussian Processes f

Pablo Moreno-Muñoz 10 Mar 15, 2022
BERT model training impelmentation using 1024 A100 GPUs for MLPerf Training v1.1

Pre-trained checkpoint and bert config json file Location of checkpoint and bert config json file This MLCommons members Google Drive location contain

SAIT (Samsung Advanced Institute of Technology) 12 Apr 27, 2022
NAACL2021 - COIL Contextualized Lexical Retriever

COIL Repo for our NAACL paper, COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List. The code covers learning

Luyu Gao 108 Dec 31, 2022
L-Verse: Bidirectional Generation Between Image and Text

Far beyond learning long-range interactions of natural language, transformers are becoming the de-facto standard for many vision tasks with their power and scalabilty

Kim, Taehoon 102 Dec 21, 2022
DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors

DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors By Anargyros Chatzitofis, Dimitris Zarpalas, Stefanos Kollias

tofis 24 Oct 08, 2022
Weakly-supervised semantic image segmentation with CNNs using point supervision

Code for our ECCV paper What's the Point: Semantic Segmentation with Point Supervision. Summary This library is a custom build of Caffe for semantic i

27 Sep 14, 2022
Voice assistant - Voice assistant with python

🌐 Python Voice Assistant 🌵 - User's greeting 🌵 - Writing tasks to todo-list ?

PythonToday 10 Dec 26, 2022
Python based framework for Automatic AI for Regression and Classification over numerical data.

Python based framework for Automatic AI for Regression and Classification over numerical data. Performs model search, hyper-parameter tuning, and high-quality Jupyter Notebook code generation.

BlobCity, Inc 141 Dec 21, 2022
Satellite labelling tool for manual labelling of storm top features such as overshooting tops, above-anvil plumes, cold U/Vs, rings etc.

Satellite labelling tool About this app A tool for manual labelling of storm top features such as overshooting tops, above-anvil plumes, cold U/Vs, ri

Czech Hydrometeorological Institute - Satellite Department 10 Sep 14, 2022
Code accompanying our NeurIPS 2021 traffic4cast challenge

Traffic forecasting on traffic movie snippets This repo contains all code to reproduce our approach to the IARAI Traffic4cast 2021 challenge. In the c

Nina Wiedemann 2 Aug 09, 2022
TCube generates rich and fluent narratives that describes the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of PLMs.

TCube: Domain-Agnostic Neural Time series Narration This repository contains the code for the paper: "TCube: Domain-Agnostic Neural Time series Narrat

Mandar Sharma 7 Oct 31, 2021
Cave Generation using metaballs in Blender. Originally created by sdfgeoff, Edited by Myself (Archie Jaskowicz).

Blender-Cave-Generation Cave Generation using metaballs in Blender. Originally created by sdfgeoff, Edited by Myself (Archie Jaskowicz). Installation

2 Dec 28, 2022
FSL-Mate: A collection of resources for few-shot learning (FSL).

FSL-Mate is a collection of resources for few-shot learning (FSL). In particular, FSL-Mate currently contains FewShotPapers: a paper list which tracks

Yaqing Wang 1.5k Jan 08, 2023
Colar: Effective and Efficient Online Action Detection by Consulting Exemplars, CVPR 2022.

Colar: Effective and Efficient Online Action Detection by Consulting Exemplars This repository is the official implementation of Colar. In this work,

LeYang 246 Dec 13, 2022
Object-aware Contrastive Learning for Debiased Scene Representation

Object-aware Contrastive Learning Official PyTorch implementation of "Object-aware Contrastive Learning for Debiased Scene Representation" by Sangwoo

43 Dec 14, 2022
Code for ACL 21: Generating Query Focused Summaries from Query-Free Resources

marge This repository releases the code for Generating Query Focused Summaries from Query-Free Resources. Please cite the following paper [bib] if you

Yumo Xu 28 Nov 10, 2022
Hitters Linear Regression - Hitters Linear Regression With Python

Hitters_Linear_Regression Kullanacağımız veri seti Carnegie Mellon Üniversitesi'

AyseBuyukcelik 2 Jan 26, 2022
converts nominal survey data into a numerical value based on a dictionary lookup.

SWAP RATE Converts nominal survey data into a numerical values based on a dictionary lookup. It allows the user to switch nominal scale data from text

Jake Rhodes 1 Jan 18, 2022