TensorFlow 2 implementation of our high-quality frame interpolation neural network

Overview

FILM: Frame Interpolation for Large Scene Motion

Project | Paper | YouTube | Benchmark Scores

TensorFlow 2 implementation of our high-quality frame interpolation neural network. We present a unified single-network approach that doesn't use additional pre-trained networks, like optical flow or depth, and yet achieves state-of-the-art results. We use a multi-scale feature extractor that shares the same convolution weights across the scales. Our model is trainable from frame triplets alone.

FILM: Frame Interpolation for Large Motion
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, Brian Curless
Google Research
Technical Report 2022.

A sample 2-second moment. FILM transforms near-duplicate photos into slow-motion footage that looks like it was shot with a video camera.

Installation

  • Get the Frame Interpolation source code
> git clone https://github.com/google-research/frame-interpolation frame_interpolation
  • Optionally, pull the recommended Docker base image
> docker pull gcr.io/deeplearning-platform-release/tf2-gpu.2-6:latest
  • Install dependencies
> pip install -r frame_interpolation/requirements.txt
> apt-get install ffmpeg

Pre-trained Models

  • Create a directory where you can keep large files. Ideally, not in this directory.
> mkdir <pretrained_models>
  • Download the pre-trained TF2 Saved Models from Google Drive and put them into <pretrained_models>.

The downloaded folder should have the following structure:

pretrained_models/
├── film_net/
│   ├── L1/
│   ├── VGG/
│   ├── Style/
├── vgg/
│   ├── imagenet-vgg-verydeep-19.mat

Running the Code

The following instructions run the interpolator on the photos provided in frame_interpolation/photos.

One mid-frame interpolation

To generate an intermediate photo from the input near-duplicate photos, simply run:

> python3 -m frame_interpolation.eval.interpolator_test \
     --frame1 frame_interpolation/photos/one.png \
     --frame2 frame_interpolation/photos/two.png \
     --model_path <pretrained_models>/film_net/Style/saved_model \
     --output_frame frame_interpolation/photos/middle.png

This will produce the sub-frame at t=0.5 and save it as 'frame_interpolation/photos/middle.png'.

Many in-between frames interpolation

The interpolator CLI takes in a set of directories identified by a glob (--pattern). Each directory is expected to contain at least two input frames, and each contiguous frame pair is treated as an input for generating in-between frames.

> python3 -m frame_interpolation.eval.interpolator_cli \
     --pattern "frame_interpolation/photos" \
     --model_path <pretrained_models>/film_net/Style/saved_model \
     --times_to_interpolate 6 \
     --output_video

You will find the interpolated frames (including the input frames) in 'frame_interpolation/photos/interpolated_frames/', and the interpolated video at 'frame_interpolation/photos/interpolated.mp4'.

The number of frames is determined by --times_to_interpolate, which controls the number of times the frame interpolator is invoked. When the number of frames in a directory is 2, the number of output frames will be 2^times_to_interpolate+1.
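The frame-count rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the repository: the helper names `output_frame_count` and `timestamps` are hypothetical, and the extension to directories with more than two frames assumes that every contiguous pair is subdivided the same way.

```python
from fractions import Fraction
from typing import List


def output_frame_count(num_input_frames: int, times_to_interpolate: int) -> int:
    """Total number of frames (inputs included) after recursive midpoint interpolation.

    Hypothetical helper. For two input frames this matches the rule stated
    above: 2^times_to_interpolate + 1. For more than two frames it ASSUMES
    each contiguous pair is subdivided identically.
    """
    if num_input_frames < 2:
        raise ValueError("at least two input frames are required")
    # Every pass inserts one frame into each gap, so the gap count doubles.
    gaps = (num_input_frames - 1) * 2 ** times_to_interpolate
    return gaps + 1


def timestamps(times_to_interpolate: int) -> List[Fraction]:
    """Normalized times in [0, 1] of the frames produced for one input pair."""
    steps = 2 ** times_to_interpolate
    return [Fraction(i, steps) for i in range(steps + 1)]


print(output_frame_count(2, 6))         # 65, i.e. 2^6 + 1
print([str(t) for t in timestamps(2)])  # ['0', '1/4', '1/2', '3/4', '1']
```

With --times_to_interpolate 1 and two input frames, the only new frame is the one at t=1/2, which corresponds to the single mid-frame case described earlier.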

Datasets

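We use Vimeo-90K as our main training dataset. For quantitative evaluations, we rely on commonly used benchmark datasets, specifically Middlebury, UCF101, Vimeo-90K, and Xiph (2K and 4K), matching the evaluation configurations listed under Evaluation on Benchmarks.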

Creating a TFRecord

The training and benchmark evaluation scripts expect the frame triplets in the TFRecord storage format.

We have included scripts that encode the relevant frame triplets into a tf.train.Example data format, and export to a TFRecord file.

You can use the command python3 -m frame_interpolation.datasets.create_<dataset_name>_tfrecord --help for more information.

For example, run the command below to create a TFRecord for the Middlebury-other dataset. Download the images and point --input_dir to the unzipped folder path.

> python3 -m frame_interpolation.datasets.create_middlebury_tfrecord \
    --input_dir=<input_dir> \
    --output_tfrecord_filepath=<output_tfrecord_filepath>

Training

Below are our training gin configuration files for the different loss functions:

frame_interpolation/training/
├── config/
│   ├── film_net-L1.gin
│   ├── film_net-VGG.gin
│   ├── film_net-Style.gin

To launch a training run, simply pass the configuration filepath of the desired experiment.
By default, it uses all visible GPUs for training. To debug or train on a CPU, append --mode cpu.

> python3 -m frame_interpolation.training.train \
     --gin_config frame_interpolation/training/config/<config_filename>.gin \
     --base_folder <base_folder> \
     --label <label>
  • When training finishes, the folder structure will look like this:

<base_folder>/
├── <label>/

Build a SavedModel

Optionally, to build a SavedModel from a folder of trained checkpoints, you can use this command:

> python3 -m frame_interpolation.training.build_saved_model_cli \
     --base_folder <base_folder> \
     --label <label>
  • By default, a SavedModel is created when the training loop ends, and it will be saved under <base_folder>/<label>/.

Evaluation on Benchmarks

Below are the evaluation gin configuration files for the benchmarks we considered:

frame_interpolation/eval/
├── config/
│   ├── middlebury.gin
│   ├── ucf101.gin
│   ├── vimeo_90K.gin
│   ├── xiph_2K.gin
│   ├── xiph_4K.gin

To run an evaluation, simply pass the configuration file of the desired evaluation dataset.
If a GPU is visible, the evaluation runs on it.

> python3 -m frame_interpolation.eval.eval_cli -- \
     --gin_config frame_interpolation/eval/config/<eval_dataset>.gin \
     --model_path <pretrained_models>/film_net/L1/saved_model

The above command will produce the PSNR and SSIM scores presented in the paper.
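For readers unfamiliar with PSNR, the snippet below sketches its standard definition in plain Python. It is not the repository's evaluation code: the function name `psnr`, the flat-list image representation, and the [0, 1] pixel range are assumptions made for illustration only.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float],
         prediction: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Hypothetical helper; pixel values are ASSUMED to lie in [0, max_value].
    """
    if len(reference) != len(prediction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives roughly 20 dB.
print(round(psnr([0.0, 0.0], [0.1, 0.1]), 2))
```

Higher values indicate a prediction closer to the reference frame; SSIM is a separate, structure-aware metric and is not covered by this sketch.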

Citation

If you find this implementation useful in your work, please acknowledge it appropriately by citing:

@inproceedings{reda2022film,
 title = {Frame Interpolation for Large Motion},
 author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
 booktitle = {arXiv},
 year = {2022}
}
@misc{film-tf,
  title = {Tensorflow 2 Implementation of "FILM: Frame Interpolation for Large Scene Motion"},
  author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/google-research/frame-interpolation}}
}

Contact: Fitsum Reda

Acknowledgments

We would like to thank Richard Tucker, Jason Lai and David Minnen. We would also like to thank Jamie Aspinall for the imagery included in this repository.

Coding style

  • 2 spaces for indentation
  • 80 character line length
  • PEP8 formatting

Disclaimer

This is not an officially supported Google product.
