Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

Overview

CLIP-GLaSS

Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

An in-browser demo is available here

Installation

Clone this repository

git clone https://github.com/galatolofederico/clip-glass && cd clip-glass

Create a virtual environment and install the requirements

virtualenv --python=python3.6 env && . ./env/bin/activate
pip install -r requirements.txt

Run CLIP-GLaSS

You can run CLIP-GLaSS with:

python run.py --config <config> --target <target>

Specify <config> and <target> according to the following table:

| Config | Meaning | Target Type |
| --- | --- | --- |
| GPT2 | Use GPT2 to solve the Image-to-Text task | Image |
| DeepMindBigGAN512 | Use DeepMind's BigGAN 512x512 to solve the Text-to-Image task | Text |
| DeepMindBigGAN256 | Use DeepMind's BigGAN 256x256 to solve the Text-to-Image task | Text |
| StyleGAN2_ffhq_d | Use StyleGAN2-ffhq to solve the Text-to-Image task | Text |
| StyleGAN2_ffhq_nod | Use StyleGAN2-ffhq without Discriminator to solve the Text-to-Image task | Text |
| StyleGAN2_church_d | Use StyleGAN2-church to solve the Text-to-Image task | Text |
| StyleGAN2_church_nod | Use StyleGAN2-church without Discriminator to solve the Text-to-Image task | Text |
| StyleGAN2_car_d | Use StyleGAN2-car to solve the Text-to-Image task | Text |
| StyleGAN2_car_nod | Use StyleGAN2-car without Discriminator to solve the Text-to-Image task | Text |

If you have not downloaded the model weights, you will be prompted to run ./download-weights.sh

Results are written to the folder ./tmp; a different output folder can be specified with --tmp-folder

Examples

python run.py --config StyleGAN2_ffhq_d --target "the face of a man with brown eyes and stubble beard"
python run.py --config GPT2 --target gpt2_images/dog.jpeg
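
The same invocations can also be assembled from a script, which helps when queuing several runs. The following is only a minimal sketch, not part of the repository: build_command is a hypothetical helper, and the only things it assumes about run.py are the --config, --target and --tmp-folder flags and the config names listed in the table above.

```python
import shlex

# Config names and target types copied from the table in this README.
CONFIG_TARGET_TYPE = {
    "GPT2": "Image",
    "DeepMindBigGAN512": "Text",
    "DeepMindBigGAN256": "Text",
    "StyleGAN2_ffhq_d": "Text",
    "StyleGAN2_ffhq_nod": "Text",
    "StyleGAN2_church_d": "Text",
    "StyleGAN2_church_nod": "Text",
    "StyleGAN2_car_d": "Text",
    "StyleGAN2_car_nod": "Text",
}


def build_command(config, target, tmp_folder=None):
    """Hypothetical helper: return the argument list for one run.py call."""
    if config not in CONFIG_TARGET_TYPE:
        raise ValueError(f"unknown config {config!r}")
    command = ["python", "run.py", "--config", config, "--target", target]
    if tmp_folder is not None:
        # Optional output folder, as described for --tmp-folder above.
        command += ["--tmp-folder", tmp_folder]
    return command


if __name__ == "__main__":
    runs = [
        ("StyleGAN2_ffhq_d", "the face of a man with brown eyes and stubble beard"),
        ("GPT2", "gpt2_images/dog.jpeg"),
    ]
    for config, target in runs:
        argv = build_command(config, target)
        # Print a shell-safe version of the command instead of running it.
        print(" ".join(shlex.quote(part) for part in argv))
```

To actually launch a run, the returned list can be passed to subprocess.run(argv, check=True) from the repository root, with the virtual environment activated.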

Acknowledgments and licensing

This work relies heavily on the following amazing repositories and would not have been possible without them:

All their work can be shared under the terms of the respective original licenses.

All my original work (everything except the content of the folders clip, stylegan2 and gpt2) is released under the terms of the GNU/GPLv3 license. Copying, adapting, and republishing it is not only permitted but also encouraged.

Citing

If you want to cite us, you can use this BibTeX entry:

@article{galatolo_glass
,	author	= {Galatolo, Federico A and Cimino, Mario GCA and Vaglini, Gigliola}
,	title	= {Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search}
,	year	= {2021}
}

Contacts

For any further questions, feel free to reach me at [email protected] or on Telegram @galatolo

Owner
Federico Galatolo
PhD Student @ University of Pisa