Automatic voice-synthetised summaries of latest research papers on arXiv

Overview

PaperWhisperer

PaperWhisperer is a Python application that keeps you up-to-date with research papers. How? It retrieves the latest articles from arXiv on a topic, by performing a keyword-based search. Then, it creates vocal summaries of the articles using Text-To-Speech and stores them to disk.

Installation

To install the package, move to the root of the repo and type in the console:

$ pip install .

If you plan to develop the package further, install the package in editable mode also installing the packages necessary to run unittests:

$ pip install -e .[test]

Testing

To run unittests, issue the following command from the root of the repo:

$ pytest

Package structure

The package is divided into 2 sub-packages:

  • retrieval
  • tts

retrieval contains data structures and facilities necessary to retrieve articles from arXiv. Under the hood, the app uses arxiv, a Python package that is a wrapper around the arXiv free API.

tts has facilities to generate speech renditions of text-based article summaries. The summary of an article consists of its title, authors, and abstract. Speech synthesis is performed using Google Cloud Text-To-Speech.

Setting up Google Cloud Text-To-Speech

PaperWhisperer uses Google Cloud Text-To-Speech to synthesise speech.

In order to be able to use this service, you should:

  1. create an account on Google Cloud,
  2. create a Cloud Platform project,
  3. enable the Text-To-Speech API in the project
  4. setup authentication
  5. download a Json private key

More info on how to set up Google Cloud Text-To-Speech

Environment variables

The app uses an environment variable called GOOGLE_APPLICATION_CREDENTIALS to connect to Google Cloud Text-To-Speech safely.

In config.yml, set GOOGLE_APPLICATION_CREDENTIALS to the path of the Json private key you previously downloaded while setting up the Google service.

Without this step, you won't be able to connect to Google Cloud Text-To-Speech, and the app will throw an error.

How to create summaries

To create summaries for a keyword search, use the create_summaries entry point. This is the only console script of the package and the main entry point of the application.

Below is an example of how you can run the script:

$ create_summaries "generate chord progressions" 100 /save/dir 40

The script takes 4 positional arguments:

  • keywords used for searching articles (more than one keyword is possible)
  • maximum number of articles to retrieve
  • directory where to store vocal summaries
  • retrieve articles no older than this integer value in days

Dependencies

PaperWhisperer depends on the following packages:

  • arxiv==1.2.0
  • google-cloud-texttospeech
  • python-dotenv

YouTube video

Learn more about PaperWhisperer in this project presentation video on The Sound of AI YouTube channel.

Owner
Valerio Velardo
AI audio/music researcher. Love Python.
Valerio Velardo
PuppetGAN - Cross-Domain Feature Disentanglement and Manipulation just got way better! 🚀

Better Cross-Domain Feature Disentanglement and Manipulation with Improved PuppetGAN Quite cool... Right? Introduction This repo contains a TensorFlow

Giorgos Karantonis 5 Aug 25, 2022
Algorithmic encoding of protected characteristics and its implications on disparities across subgroups

Algorithmic encoding of protected characteristics and its implications on disparities across subgroups This repository contains the code for the paper

Team MIRA - BioMedIA 15 Oct 24, 2022
Code for "The Intrinsic Dimension of Images and Its Impact on Learning" - ICLR 2021 Spotlight

dimensions Estimating the instrinsic dimensionality of image datasets Code for: The Intrinsic Dimensionaity of Images and Its Impact On Learning - Phi

Phil Pope 41 Dec 10, 2022
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in Pytorch. Some of the code here will be included in upstream Pytorch eventually. The intenti

NVIDIA Corporation 6.9k Jan 03, 2023
This program will stylize your photos with fast neural style transfer.

Neural Style Transfer (NST) Using TensorFlow Demo TensorFlow TensorFlow is an end-to-end open source platform for machine learning. It has a comprehen

Ismail Boularbah 1 Aug 08, 2022
Open source annotation tool for machine learning practitioners.

doccano doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequ

7.1k Jan 01, 2023
MARE - Multi-Attribute Relation Extraction

MARE - Multi-Attribute Relation Extraction Repository for the paper submission: #TODO: insert link, when available Environment Tested with Ubuntu 18.0

0 May 11, 2021
Using some basic methods to show linkages and transformations of robotic arms

roboticArmVisualizer Python GUI application to create custom linkages and adjust joint angles. In the future, I plan to add 2d inverse kinematics solv

Sandesh Banskota 1 Nov 19, 2021
This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers".

A Memory-saving Training Framework for Transformers This is the official PyTorch implementation for Mesa: A Memory-saving Training Framework for Trans

Zhuang AI Group 105 Dec 06, 2022
Official implementation of Few-Shot and Continual Learning with Attentive Independent Mechanisms

Few-Shot and Continual Learning with Attentive Independent Mechanisms This repository is the official implementation of Few-Shot and Continual Learnin

Chikan_Huang 25 Dec 08, 2022
Machine Learning toolbox for Humans

Reproducible Experiment Platform (REP) REP is ipython-based environment for conducting data-driven research in a consistent and reproducible way. Main

Yandex 662 Nov 20, 2022
The implementation of 'Image synthesis via semantic composition'.

Image synthesis via semantic synthesis [Project Page] by Yi Wang, Lu Qi, Ying-Cong Chen, Xiangyu Zhang, Jiaya Jia. Introduction This repository gives

DV Lab 71 Jan 06, 2023
Public repo for the ICCV2021-CVAMD paper "Is it Time to Replace CNNs with Transformers for Medical Images?"

Is it Time to Replace CNNs with Transformers for Medical Images? Accepted at ICCV-2021: Workshop on Computer Vision for Automated Medical Diagnosis (C

Christos Matsoukas 80 Dec 27, 2022
[NeurIPS 2021] “Improving Contrastive Learning on Imbalanced Data via Open-World Sampling”,

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling Introduction Contrastive learning approaches have achieved great success in

VITA 24 Dec 17, 2022
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image [Project Page] [Paper] [Supp. Mat.] Table of Contents License Description Fittin

Vassilis Choutas 1.3k Jan 07, 2023
This solves the autonomous driving issue which is supported by deep learning technology. Given a video, it splits into images and predicts the angle of turning for each frame.

Self Driving Car An autonomous car (also known as a driverless car, self-driving car, and robotic car) is a vehicle that is capable of sensing its env

Sagor Saha 4 Sep 04, 2021
GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process.

The GT4SD (Generative Toolkit for Scientific Discovery) is an open-source platform to accelerate hypothesis generation in the scientific discovery process. It provides a library for making state-of-t

Generative Toolkit 4 Scientific Discovery 142 Dec 24, 2022
A toolkit for making real world machine learning and data analysis applications in C++

dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl

Davis E. King 11.6k Jan 01, 2023
Identify the emotion of multiple speakers in an Audio Segment

MevonAI - Speech Emotion Recognition Identify the emotion of multiple speakers in a Audio Segment Report Bug · Request Feature Try the Demo Here Table

Suyash More 110 Dec 03, 2022
Official Pytorch Implementation of GraphiT

GraphiT: Encoding Graph Structure in Transformers This repository implements GraphiT, described in the following paper: Grégoire Mialon*, Dexiong Chen

Inria Thoth 80 Nov 27, 2022