Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

Last update: Nov 28, 2022


Overview

PPE ✨

Repository for our CVPR'2022 paper:

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model. Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc Van Gool, Errui Ding. To appear in CVPR 2022.

The PyTorch implementation is available at zipengxuc/PPE-Pytorch.

Updates

24 Mar 2022: Updated the arXiv version of the paper.

30 Mar 2022: Changed how the code is released. The PyTorch implementation is now at zipengxuc/PPE-Pytorch.

14 Apr 2022: Added the PaddlePaddle inference code to this repository.

To reproduce our results:

Setup:

Install CLIP:

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git
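After running the commands above, it can help to confirm that the packages are visible to the Python interpreter before moving on. The following is a minimal sketch, not part of this repository: the helper name `missing_modules` is hypothetical, and the list of import names is an assumption derived from the install commands (`clip` being the module name installed by the openai/CLIP repository).

```python
import importlib.util

# Import names assumed from the install commands above.
# "clip" is the module installed from the openai/CLIP repository.
REQUIRED_MODULES = ["torch", "torchvision", "ftfy", "regex", "tqdm", "gdown", "clip"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All CLIP dependencies are importable.")
```

The check only looks the modules up and does not import them, so it stays fast and does not require a GPU.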

Download pre-trained models:

The code relies on PaddleGAN's PaddlePaddle implementation of StyleGAN2. Download the pre-trained StyleGAN2 generator from here.

Several pre-trained PPE models are provided here.

Invert real images:

The mapper is trained on latent vectors, so real images must first be inverted into the latent space. For editing human faces, StyleCLIP provides CelebA-HQ images inverted by e4e: test set.

Usage:

First, put the downloaded pre-trained models and data in the ckpt folder.
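Since the inference script expects its inputs in the ckpt folder, a quick listing of that folder can catch a misplaced download early. This is a minimal sketch under stated assumptions: the helper `list_checkpoint_files` is hypothetical and not part of this repository, and it makes no assumption about the actual file names of the models or data.

```python
from pathlib import Path


def list_checkpoint_files(ckpt_dir="ckpt"):
    """Return the sorted names of the files found directly inside `ckpt_dir`.

    Raises FileNotFoundError if the folder does not exist.
    """
    folder = Path(ckpt_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Checkpoint folder not found: {folder}")
    # Only regular files are reported; sub-folders are skipped.
    return sorted(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    try:
        files = list_checkpoint_files()
    except FileNotFoundError as err:
        print(err)
    else:
        print(f"{len(files)} file(s) in ckpt:", files)
```

Run it from the repository root so that the relative path `ckpt` resolves to the intended folder.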

Inference

The PaddlePaddle version provides only inference code, which generates the editing results:

python mapper/evaluate.py

Reference

@article{xu2022ppe,
author = {Zipeng Xu and Tianwei Lin and Hao Tang and Fu Li and Dongliang He and Nicu Sebe and Radu Timofte and Luc Van Gool and Errui Ding},
title = {Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model},
journal = {arXiv preprint arXiv:2111.13333},
year = {2021}
}

If you have any questions, please contact [email protected]. :)


Owner

Zipeng Xu
