Photo2cartoon - 人像卡通化探索项目 (photo-to-cartoon translation project)

Overview

人像卡通化 (Photo to Cartoon)

中文版 | English Version

该项目为小视科技卡通肖像探索项目。您可使用微信扫描下方二维码或搜索“AI卡通秀”小程序体验卡通化效果。

也可以前往我们的ai开放平台进行在线体验:https://ai.minivision.cn/#/coreability/cartoon

技术交流QQ群:937627932

Updates

简介

人像卡通风格渲染的目标是,在保持原图像ID信息和纹理细节的同时,将真实照片转换为卡通风格的非真实感图像。我们的思路是,从大量照片/卡通数据中习得照片到卡通画的映射。一般而言,基于成对数据的pix2pix方法能达到较好的图像转换效果,但本任务的输入输出轮廓并非一一对应,例如卡通风格的眼睛更大、下巴更瘦;且成对的数据绘制难度大、成本较高,因此我们采用unpaired image translation方法来实现。

Unpaired image translation流派最经典方法是CycleGAN,但原始CycleGAN的生成结果往往存在较为明显的伪影且不稳定。近期的论文U-GAT-IT提出了一种归一化方法——AdaLIN,能够自动调节Instance Norm和Layer Norm的比重,再结合attention机制能够实现精美的人像日漫风格转换。

与夸张的日漫风不同,我们的卡通风格更偏写实,要求既有卡通画的简洁Q萌,又有明确的身份信息。为此我们增加了Face ID Loss,使用预训练的人脸识别模型提取照片和卡通画的ID特征,通过余弦距离来约束生成的卡通画。

此外,我们提出了一种Soft-AdaLIN(Soft Adaptive Layer-Instance Normalization)归一化方法,在反规范化时将编码器的均值方差(照片特征)与解码器的均值方差(卡通特征)相融合。

模型结构方面,在U-GAT-IT的基础上,我们在编码器之前和解码器之后各增加了2个hourglass模块,渐进地提升模型特征抽象和重建能力。

由于实验数据较为匮乏,为了降低训练难度,我们将数据处理成固定的模式。首先检测图像中的人脸及关键点,根据人脸关键点旋转校正图像,并按统一标准裁剪,再将裁剪后的头像输入人像分割模型去除背景。

Start

安装依赖库

项目所需的主要依赖库如下:

  • python 3.6
  • pytorch 1.4
  • tensorflow-gpu 1.14
  • face-alignment
  • dlib
  • onnxruntime

Clone:

git clone https://github.com/minivision-ai/photo2cartoon.git
cd ./photo2cartoon

下载资源

谷歌网盘 | 百度网盘 提取码:y2ch

  1. 人像卡通化预训练模型:photo2cartoon_weights.pt(20200504更新),存放在models路径下。
  2. 头像分割模型:seg_model_384.pb,存放在utils路径下。
  3. 人脸识别预训练模型:model_mobilefacenet.pth,存放在models路径下。(From: InsightFace_Pytorch
  4. 卡通画开源数据:cartoon_data,包含trainBtestB
  5. 人像卡通化onnx模型:photo2cartoon_weights.onnx 谷歌网盘,存放在models路径下。

测试

将一张测试照片(亚洲年轻女性)转换为卡通风格:

python test.py --photo_path ./images/photo_test.jpg --save_path ./images/cartoon_result.png

测试onnx模型

python test_onnx.py --photo_path ./images/photo_test.jpg --save_path ./images/cartoon_result.png

训练

1.数据准备

训练数据包括真实照片和卡通画像,为降低训练复杂度,我们对两类数据进行了如下预处理:

  • 检测人脸及关键点。
  • 根据关键点旋转校正人脸。
  • 将关键点边界框按固定的比例扩张并裁剪出人脸区域。
  • 使用人像分割模型将背景置白。

我们开源了204张处理后的卡通画数据,您还需准备约1000张人像照片(为匹配卡通数据,尽量使用亚洲年轻女性照片,人脸大小最好超过200x200像素),使用以下命令进行预处理:

python data_process.py --data_path YourPhotoFolderPath --save_path YourSaveFolderPath

将处理后的数据按照以下层级存放,trainAtestA中存放照片头像数据,trainBtestB中存放卡通头像数据。

├── dataset
    └── photo2cartoon
        ├── trainA
            ├── xxx.jpg
            ├── yyy.png
            └── ...
        ├── trainB
            ├── zzz.jpg
            ├── www.png
            └── ...
        ├── testA
            ├── aaa.jpg 
            ├── bbb.png
            └── ...
        └── testB
            ├── ccc.jpg 
            ├── ddd.png
            └── ...

2.训练

重新训练:

python train.py --dataset photo2cartoon

加载预训练参数:

python train.py --dataset photo2cartoon --pretrained_weights models/photo2cartoon_weights.pt

多GPU训练(仍建议使用batch_size=1,单卡训练):

python train.py --dataset photo2cartoon --batch_size 4 --gpu_ids 0 1 2 3

Q&A

Q:为什么开源的卡通化模型与小程序中的效果有差异?

A:开源模型的训练数据收集自互联网,为了得到更加精美的效果,我们在训练小程序中卡通化模型时,采用了定制的卡通画数据(200多张),且增大了输入分辨率。此外,小程序中的人脸特征提取器采用自研的识别模型,效果优于本项目使用的开源识别模型。

Q:如何选取效果最好的模型?

A:首先训练模型200k iterations,然后使用FID指标挑选出最优模型,最终挑选出的模型为迭代90k iterations时的模型。

Q:关于人脸特征提取模型。

A:实验中我们发现,使用自研的识别模型计算Face ID Loss训练效果远好于使用开源识别模型,若训练效果出现鲁棒性问题,可尝试将Face ID Loss权重置零。

Q:人像分割模型是否能用与分割半身像?

A:不能。该模型是针对本项目训练的专用模型,需先裁剪出人脸区域再输入。

Tips

我们开源的模型是基于亚洲年轻女性训练的,对于其他人群覆盖不足,您可根据使用场景自行收集相应人群的数据进行训练。我们的开放平台提供了能够覆盖各类人群的卡通化服务,您可前往体验。如有定制卡通风格需求请联系商务:18852075216。

参考

U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation [Paper][Code]

InsightFace_Pytorch

Owner
Minivision_AI
Minivision_AI
Pytorch Implementation of PointNet and PointNet++++

Pytorch Implementation of PointNet and PointNet++ This repo is implementation for PointNet and PointNet++ in pytorch. Update 2021/03/27: (1) Release p

Luigi Ariano 1 Nov 11, 2021
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 47 Sep 06, 2022
An efficient toolkit for Face Stylization based on the paper "AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning"

MMGEN-FaceStylor English | 简体中文 Introduction This repo is an efficient toolkit for Face Stylization based on the paper "AgileGAN: Stylizing Portraits

OpenMMLab 182 Dec 27, 2022
[EMNLP 2020] Keep CALM and Explore: Language Models for Action Generation in Text-based Games

Contextual Action Language Model (CALM) and the ClubFloyd Dataset Code and data for paper Keep CALM and Explore: Language Models for Action Generation

Princeton Natural Language Processing 43 Dec 16, 2022
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context Code in both PyTorch and TensorFlow

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context This repository contains the code in both PyTorch and TensorFlow for our paper

Zhilin Yang 3.3k Jan 06, 2023
Realistic lighting in ursina!

Ursina Lighting Realistic lighting in ursina! If you want to have realistic lighting in ursina, import the UrsinaLighting.py in your project and use t

17 Jul 07, 2022
Detectron2-FC a fast construction platform of neural network algorithm based on detectron2

What is Detectron2-FC Detectron2-FC a fast construction platform of neural network algorithm based on detectron2. We have been working hard in two dir

董晋宗 9 Jun 06, 2022
PyTorch implementation of PNASNet-5 on ImageNet

PNASNet.pytorch PyTorch implementation of PNASNet-5. Specifically, PyTorch code from this repository is adapted to completely match both my implemetat

Chenxi Liu 314 Nov 25, 2022
Multi-Content GAN for Few-Shot Font Style Transfer at CVPR 2018

MC-GAN in PyTorch This is the implementation of the Multi-Content GAN for Few-Shot Font Style Transfer. The code was written by Samaneh Azadi. If you

Samaneh Azadi 422 Dec 04, 2022
HEAM: High-Efficiency Approximate Multiplier Optimization for Deep Neural Networks

Approximate Multiplier by HEAM What's HEAM? HEAM is a general optimization method to generate high-efficiency approximate multipliers for specific app

4 Sep 11, 2022
(CVPR 2021) Lifting 2D StyleGAN for 3D-Aware Face Generation

Lifting 2D StyleGAN for 3D-Aware Face Generation Official implementation of paper "Lifting 2D StyleGAN for 3D-Aware Face Generation". Requirements You

Yichun Shi 66 Nov 29, 2022
MohammadReza Sharifi 27 Dec 13, 2022
Contrastive Language-Image Pretraining

CLIP [Blog] [Paper] [Model Card] [Colab] CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pair

OpenAI 11.5k Jan 08, 2023
Framework for evaluating ANNS algorithms on billion scale datasets.

Billion-Scale ANN http://big-ann-benchmarks.com/ Install The only prerequisite is Python (tested with 3.6) and Docker. Works with newer versions of Py

Harsha Vardhan Simhadri 132 Dec 24, 2022
PCGNN - Procedural Content Generation with NEAT and Novelty

PCGNN - Procedural Content Generation with NEAT and Novelty Generation Approach — Metrics — Paper — Poster — Examples PCGNN - Procedural Content Gener

Michael Beukman 8 Dec 10, 2022
UltraGCN: An Ultra Simplification of Graph Convolutional Networks for Recommendation

UltraGCN This is our Pytorch implementation for our CIKM 2021 paper: Kelong Mao, Jieming Zhu, Xi Xiao, Biao Lu, Zhaowei Wang, Xiuqiang He. UltraGCN: A

XUEPAI 93 Jan 03, 2023
Meta-learning for NLP

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks Code for training the meta-learning models and fine-tuning on downstr

IESL 43 Nov 08, 2022
Deployment of PyTorch chatbot with Flask

Chatbot Deployment with Flask and JavaScript In this tutorial we deploy the chatbot I created in this tutorial with Flask and JavaScript. This gives 2

Patrick Loeber (Python Engineer) 107 Dec 29, 2022
Virtual hand gesture mouse using a webcam

NonMouse 日本語のREADMEはこちら This is an application that allows you to use your hand itself as a mouse. The program uses a web camera to recognize your han

Yuki Takeyama 55 Jan 01, 2023
Pytorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021

Supporting Clustering with Contrastive Learning SCCL (NAACL 2021) Dejiao Zhang, Feng Nan, Xiaokai Wei, Shangwen Li, Henghui Zhu, Kathleen McKeown, Ram

231 Jan 05, 2023