A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.

Overview

AnimeGAN

A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.

Randomly Generated Images

The images are generated from a DCGAN model trained on 143,000 anime character faces for 100 epochs.

fake_sample_1

Image Interpolation

Manipulating latent codes, enables the transition from images in the first row to the last row.

transition

Original Images

The images are not clean, some outliers can be observed, which degrades the quality of the generated images.

real_sample

Usage

To run the experiment,

$ python main.py --dataRoot path_to_dataset/ 

The pretrained model for DCGAN are also in this repo, play it inside the jupyter notebook.

anime-faces Dataset

Anime-style images of 126 tags are collected from danbooru.donmai.us using the crawler tool gallery-dl. The images are then processed by a anime face detector python-animeface. The resulting dataset contains ~143,000 anime faces. Note that some of the tags may no longer meaningful after cropping, i.e. the cropped face images under 'uniform' tag may not contain visible parts of uniforms.

How to construct the dataset from scratch ?

Prequisites: gallery-dl, python-animeface

  1. Download anime-style images

    # download 1000 images under the tag "misaka_mikoto"
    gallery-dl --images 1000 "https://danbooru.donmai.us/posts?tags=misaka_mikoto"
    
    # in a multi-processing manner
    cat tags.txt | \
    xargs -n 1 -P 12 -I 'tag' \ 
    bash -c ' gallery-dl --images 1000 "https://danbooru.donmai.us/posts?tags=$tag" '
  2. Extract faces from the downloaded images

    import animeface
    from PIL import Image
    
    im = Image.open('images/anime_image_misaka_mikoto.png')
    faces = animeface.detect(im)
    x,y,w,h = faces[0].face.pos
    im = im.crop((x,y,x+w,y+h))
    im.show() # display

I've cleaned the original dataset, the new version of the dataset has 115085 images in 126 tags. You can access the images from:

Non-commercial use please.

Things I've learned

  1. GANs are really hard to train.
  2. DCGAN generally works well, simply add fully-connected layers causes problems.
  3. In my cases, more layers for G yields better images, in the sense that G should be more powerful than D.
  4. Add noise to D's inputs and labels helps stablize training.
  5. Use differnet input and generate resolution (64x64 vs 96x96), there seems no obvious difference during training, the generated images are also very similar.
  6. Binray Noise as G's input amazingly works, but the images are not as good as those with Gussian Noise, idea credit to @cwhy ['Binary Noise' here I mean a sequence of {-1,1} generated by bernoulli distribution at p=0.5 ]

I did not carefully verify them, if you are looking for some general GAN tips, see @soumith's ganhacks

Others

  1. This project is heavily influenced by chainer-DCGAN and IllustrationGAN, the codes are mostly borrowed from PyTorch DCGAN example, thanks the authors for the clean codes.
  2. Dependencies: pytorch, torchvision
  3. This is a toy project for me to learn PyTorch and GANs, most importantly, for fun! :) Any feedback is welcome.

@jayleicn

Comments
  • KeyError: 'module name can\'t contain

    KeyError: 'module name can\'t contain "."'

    The classes in module.py contains some nn.Module layers whose names contains some'.' in it, so I got error messages like the title, so how could I play it??

    opened by bolin12 2
  • no image under some tags

    no image under some tags

    Hi, The dataset from google drive contains 126 tags. However, some folders are emtpy:

    1girl apron blush collarbone hairclip honma_meiko japanese_clothes monochrome necktie nishizumi_miho purple_eyes scarf school_uniform sunglasses

    Is this normal? Thanks

    opened by samrere 1
  • set_sizes_contiguous is not allowed on a Tensor created from .data or .detach().

    set_sizes_contiguous is not allowed on a Tensor created from .data or .detach().

    input.data.resize_(real_cpu.size()).copy_(real_cpu) RuntimeError: set_sizes_contiguous is not allowed on a Tensor created from .data or .detach(). If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset) without autograd tracking the change, remove the .data / .detach() call and wrap the change in a with torch.no_grad(): block. For example, change: x.data.set_(y) to: with torch.no_grad(): x.set_(y)

    opened by athulvingt 0
  • can not run

    can not run

    Traceback (most recent call last): File "main.py", line 6, in import torch File "/Library/Python/2.7/site-packages/torch/init.py", line 81, in from torch._C import * RuntimeError: module compiled against API version 0xa but this version of numpy is 0x9

    opened by xtzero 0
  • How do I start with my own model

    How do I start with my own model

    I would like to know how I can use this image generation to generate my own images from a self made model?

    Where can I read upon on this. I find no concrete info on making the models.

    opened by quintendewilde 0
Owner
Jie Lei 雷杰
UNC CS PhD student, vision+language.
Jie Lei 雷杰
POPPY (Physical Optics Propagation in Python) is a Python package that simulates physical optical propagation including diffraction

POPPY: Physical Optics Propagation in Python POPPY (Physical Optics Propagation in Python) is a Python package that simulates physical optical propaga

Space Telescope Science Institute 132 Dec 15, 2022
Repository providing a wide range of self-supervised pretrained models for computer vision tasks.

Hierarchical Pretraining: Research Repository This is a research repository for reproducing the results from the project "Self-supervised pretraining

Colorado Reed 53 Nov 09, 2022
Repository for MDPGT

MD-PGT Repository for implementing and reproducing the results for the paper MDPGT: Momentum-based Decentralized Policy Gradient Tracking. Available E

Xian Yeow Lee 2 Dec 30, 2021
Dynamical movement primitives (DMPs), probabilistic movement primitives (ProMPs), spatially coupled bimanual DMPs.

Movement Primitives Movement primitives are a common group of policy representations in robotics. There are many different types and variations. This

DFKI Robotics Innovation Center 63 Jan 06, 2023
Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On

Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On [Project website] [Dataset] [Video] Abstract We propose a new g

71 Dec 24, 2022
Official Implement of CVPR 2021 paper “Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting”

RGBT Crowd Counting Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin. "Cross-Modal Collaborative Representation Learning and a L

37 Dec 08, 2022
Sample and Computation Redistribution for Efficient Face Detection

Introduction SCRFD is an efficient high accuracy face detection approach which initially described in Arxiv. Performance Precision, flops and infer ti

Sajjad Aemmi 13 Mar 05, 2022
[ICCV2021] Official Pytorch implementation for SDGZSL (Semantics Disentangling for Generalized Zero-Shot Learning)

Semantics Disentangling for Generalized Zero-shot Learning This is the official implementation for paper Zhi Chen, Yadan Luo, Ruihong Qiu, Zi Huang, J

25 Dec 06, 2022
Numenta published papers code and data

Numenta research papers code and data This repository contains reproducible code for selected Numenta papers. It is currently under construction and w

Numenta 293 Jan 06, 2023
The codes I made while I practiced various TensorFlow examples

TensorFlow_Exercises The codes I made while I practiced various TensorFlow examples About the codes I didn't create these codes by myself, but re-crea

Terry Taewoong Um 614 Dec 08, 2022
RoBERTa Marathi Language model trained from scratch during huggingface 🤗 x flax community week

RoBERTa base model for Marathi Language (मराठी भाषा) Pretrained model on Marathi language using a masked language modeling (MLM) objective. RoBERTa wa

Nipun Sadvilkar 23 Oct 19, 2022
Build Graph Nets in Tensorflow

Graph Nets library Graph Nets is DeepMind's library for building graph networks in Tensorflow and Sonnet. Contact DeepMind 5.2k Jan 05, 2023

ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers

ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers Official implementation of ViewFormer. ViewFormer is a NeRF-free neural rend

Jonáš Kulhánek 169 Dec 30, 2022
Code for the CVPR2021 paper "Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition"

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition This repository contains code for the CVPR2021 paper "Patch-NetV

QVPR 368 Jan 06, 2023
Implementation of Bidirectional Recurrent Independent Mechanisms (Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules)

BRIMs Bidirectional Recurrent Independent Mechanisms Implementation of the paper Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neura

Sarthak Mittal 26 May 26, 2022
DFM: A Performance Baseline for Deep Feature Matching

DFM: A Performance Baseline for Deep Feature Matching Python (Pytorch) and Matlab (MatConvNet) implementations of our paper DFM: A Performance Baselin

143 Jan 02, 2023
Gin provides a lightweight configuration framework for Python

Gin Config Authors: Dan Holtmann-Rice, Sergio Guadarrama, Nathan Silberman Contributors: Oscar Ramirez, Marek Fiser Gin provides a lightweight configu

Google 1.7k Jan 03, 2023
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model About This repository contains the code to replicate the syn

Haruka Kiyohara 12 Dec 07, 2022
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation [arxiv] This is the official repository for CDTrans: Cross-domain Transformer for

238 Dec 22, 2022
Weighing Counts: Sequential Crowd Counting by Reinforcement Learning

LibraNet This repository includes the official implementation of LibraNet for crowd counting, presented in our paper: Weighing Counts: Sequential Crow

Hao Lu 18 Nov 05, 2022