Automatically erase objects in a video, such as logos and text.

Overview

Video-Auto-Wipe

Read English Introduction:Here

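  From time to time I build fun algorithm models based on generative techniques. This release is an application model for "video erasing": it automatically detects the parts of a video we do not want to see (for example ads, watermarks, subtitles, and logos) and erases them. Because the logo removal model could be misused for copyright infringement, I am only sharing the subtitle removal model for now. I hope it is useful to you.
  I will keep exploring and building new work in the generative direction. There is still a lot to play with in generative models, and this project shows just one example of a practical application. The models in this project are copyrighted by www.seeprettyface.com ; please do not use them directly for commercial purposes without authorization. For details of the algorithm, see my research notes.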



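Results Preview

1. Logo Removal

  The logo removal model automatically detects where logos appear in a video and erases them. A logo is defined as a small block of pixels that stays static over time.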
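
  The static-pixel rule described above can be illustrated with a small heuristic. This is only a sketch of the idea, not the author's model: the function name, the `tolerance` threshold, and the toy frames are all hypothetical, and plain Python lists are used so the example has no dependencies.

```python
def static_pixel_mask(frames, tolerance=2):
    """Return a 0/1 mask marking pixels whose value barely changes over time.

    frames: list of grayscale frames, each a list of rows of ints (0-255).
    tolerance: hypothetical threshold on the max-min range across frames.
    """
    height, width = len(frames[0]), len(frames[0][0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [frame[y][x] for frame in frames]
            # A pixel counts as "static" if its value range over time is tiny.
            if max(values) - min(values) <= tolerance:
                mask[y][x] = 1
    return mask


# Three 2x3 frames: the top-left pixel stays near 200 (a "logo"),
# while every other pixel changes a lot between frames.
frames = [
    [[200, 10, 20], [30, 40, 50]],
    [[200, 90, 70], [80, 10, 5]],
    [[201, 50, 120], [5, 200, 90]],
]
mask = static_pixel_mask(frames)
```

  A heuristic like this would also flag a static background, which is one reason the project restricts the definition to small blocks and uses a learned model rather than a fixed threshold.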

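1.1 Test 1: Removing the channel logo, show title, and corner badge from a TV drama

Watch video



1.2 Test 2: Removing the channel logo and status bar from a football match

Watch video



1.3 Test 3: Removing the channel logo and status bar from a variety show

Watch video



1.4 Test 4: Removing an occluding logo from a short music video

Watch video



1.5 Test 5: Removing an occluding watermark from a short music video

Watch video



1.6 Test 6: Removing the channel logo from a news broadcast

Watch video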





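2. Dynamic Logo Removal

  The dynamic logo removal model automatically detects where dynamic logos appear in a video and erases them. A dynamic logo is defined as a fixed block of pixels that flickers in and out or moves over time. This model is fairly hard to build, so it has not been released yet.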

2.1 Test 1: Removing flickering special-effect text

Watch video





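3. Subtitle Removal

  The subtitle removal model automatically detects where subtitles appear in a video and erases them. A subtitle is defined as a text region with a uniform style.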

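3.1 Test 1: Removing movie subtitles

Watch video



3.2 Test 2: Removing TV drama subtitles

Watch video



3.3 Test 3: Removing variety show subtitles

Watch video



3.4 Test 4: Removing special stylized subtitles from a variety show

Watch video



3.5 Test 5: Removing subtitles from online videos

Watch video



3.6 Test 6: Removing subtitles in less common languages

Watch video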





Usage

1. Environment setup

  torch>1.0
  Install any other missing dependency with pip install xxx as it comes up; not much is needed.
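
  To check the torch>1.0 requirement before running anything, a small helper along these lines may help. It is a sketch under assumptions: the function names are hypothetical, and only the leading numeric part of the version string is compared, so local suffixes such as `+cu117` are ignored.

```python
import re


def version_tuple(version):
    """Extract the leading numeric components, e.g. '1.13.1+cu117' -> (1, 13, 1)."""
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def satisfies_torch_requirement(version, minimum="1.0"):
    """Return True if `version` is strictly greater than `minimum` (torch>1.0)."""
    have, need = version_tuple(version), version_tuple(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '1.0.0' and '1.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have > need
```

  With PyTorch installed, calling `satisfies_torch_requirement(torch.__version__)` after `import torch` reports whether the installed version meets the requirement.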

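2. Running

  1. Download the pretrained models and place them in the pretrained-weight folder;
    Pretrained model download: https://pan.baidu.com/s/1ubZHkgkcskS7Bpg8ZbtoRQ (extraction code: ricn)

  2. Put the video file and the mask file in the input folder, then edit demo.py (or use command-line arguments) to point to those files;
    Sample input download: https://pan.baidu.com/s/1rfdAwxomCVjTJ1zwl7hu3g (extraction code: qk64)

  3. Run the logo removal task: python demo.py delogo
   Run the subtitle removal task: python demo.py detext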
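
  The steps above can be wrapped in a small launcher that checks the folder layout before calling demo.py. This is a sketch, not part of the project: the helper names are hypothetical, and only the folder names (pretrained-weight, input) and task names (delogo, detext) come from the steps above.

```python
import subprocess
import sys
from pathlib import Path

TASKS = ("delogo", "detext")  # task names taken from the run commands above
REQUIRED_FOLDERS = ("pretrained-weight", "input")


def missing_folders(project_root="."):
    """Return the required folders that do not exist under project_root."""
    root = Path(project_root)
    return [name for name in REQUIRED_FOLDERS if not (root / name).is_dir()]


def build_command(task):
    """Build the argument list equivalent to `python demo.py <task>`."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}, expected one of {TASKS}")
    return [sys.executable, "demo.py", task]


def run_task(task, project_root="."):
    """Run demo.py for the given task after checking the folder layout."""
    missing = missing_folders(project_root)
    if missing:
        raise FileNotFoundError(f"missing folders: {', '.join(missing)}")
    return subprocess.run(build_command(task), cwd=project_root, check=True)
```

  For example, `run_task("detext")` called from the repository root would fail early with a clear message if the pretrained weights have not been downloaded yet.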



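Training

Training data

  1. The YoutubeVOS2018 dataset;

  2. A dataset of 2,709 movie clips, made from more than 300 collected HD movies;
    Download: https://pan.baidu.com/s/1CIgJmFmx5iR2JfgAyjVaeg (extraction code: xb7o)

  3. A dataset of 864 variety show clips, made from more than 40 collected variety shows;
    Download: https://pan.baidu.com/s/1lJk6IIWlwxknAie0LlGYOg (extraction code: 9rd4)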

Training process

  Step 1. Task-specific temporal perception training;
  Step 2. Fine-tuning of the fused erasing model.

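Training configuration

I recently found a very simple way to build and train these models:
  The 'logo removal' model trains for 3 days on a single 3090 GPU;
  The 'subtitle removal' model trains for 2 days on a single 3090 GPU.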





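More Ideas

  This project is only a short-term experiment so far. There is much more to explore in video erasing, such as removing sensitive content (pornographic or violent material, for example), ads, a specified person or object, or people in the background. Any scenario that involves pixel prediction and has a need for it is a setting where "video erasing" can be put to creative use.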





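Learn More

  My research focuses on applied generative models. Generative techniques solve the problem of pixel prediction: filling in or predicting pixels on an image canvas that is partly or entirely missing, so that the completed image follows the regularities of real images. Many applications can be built on this pattern. Beyond my earlier work on digital human generation and video content generation, many parallel ideas can be developed.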
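
  One common way to apply pixel prediction to erasing is to keep the original pixels outside a mask and take predicted pixels inside it. The sketch below shows only that compositing step, as an illustration of the idea; it is an assumption, not a description of this project's model, and the function name and toy values are hypothetical.

```python
def composite(original, predicted, mask):
    """Keep original pixels where mask is 0 and use predicted pixels where mask is 1."""
    return [
        [pred if m else orig for orig, pred, m in zip(orig_row, pred_row, mask_row)]
        for orig_row, pred_row, mask_row in zip(original, predicted, mask)
    ]


# A 2x3 grayscale image whose middle column is marked for erasing.
original = [[10, 20, 30], [40, 50, 60]]
predicted = [[11, 99, 31], [41, 98, 61]]  # stand-in for a model's output
mask = [[0, 1, 0], [0, 1, 0]]
result = composite(original, predicted, mask)
```

  Only the masked column changes in `result`; the pixels outside the mask are copied from the original image unchanged.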
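  Most deployed CV projects today focus on perception and recognition tasks, with comparatively little R&D on reconstruction and generation. That should not affect our judgment of the value of generative techniques: this is a relatively new research direction with few participants but broad application prospects. I will keep working on practical algorithms in the generative direction. Visit my website for the latest progress: www.seeprettyface.com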
