This repository contains the code for TACL2021 paper: SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization

Related tags

Deep Learningsummac
Overview

SummaC: Summary Consistency Detection

This repository contains the code for TACL2021 paper: SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization

We release: (1) the trained SummaC models, (2) the SummaC Benchmark and data loaders, (3) training and evaluation scripts.

Trained SummaC Models

The two trained models SummaC-ZS and SummaC-Conv are implemented in model_summac.py (link):

  • SummaC-ZS does not require a model file (as the model is zero-shot and not trained): it can be used as seen at the bottom of the model_summac.py.
  • SummaC-Conv requires a start_file which contains the trained weight for the convolution layer. The default start_file used to compute results is available in this repository ( summac_conv_vitc_sent_perc_e.bin download link).

Example use

from model_summac import SummaCZS

model = SummaCZS(granularity="sentence", model_name="vitc")

document = """Scientists are studying Mars to learn about the Red Planet and find landing sites for future missions.
One possible site, known as Arcadia Planitia, is covered instrange sinuous features.
The shapes could be signs that the area is actually made of glaciers, which are large masses of slow-moving ice.
Arcadia Planitia is in Mars' northern lowlands."""

summary1 = "There are strange shape patterns on Arcadia Planitia. The shapes could indicate the area might be made of glaciers. This makes Arcadia Planitia ideal for future missions."
summary2 = "There are strange shape patterns on Arcadia Planitia. The shapes could indicate the area might be made of glaciers."

score1 = model.score([document], [summary1])
print("Summary Score 1 consistency: %.3f" % (score1["scores"][0])) # Prints: 0.587

score2 = model.score([document], [summary2])
print("Summary Score 2 consistency: %.3f" % (score2["scores"][0])) # Prints: 0.877

To load all the necessary files: (1) clone this repository, (2) add the reposity to Python path: export PYTHONPATH="${PYTHONPATH}:/path/to/summac/"

SummaC Benchmark

The SummaC Benchmark consists of 6 summary consistency datasets that have been standardized to a binary classification task. The datasets included are:


% Positive is the percentage of positive (consistent) summaries. IAA is the inter-annotator agreement (Fleiss Kappa). Source is the dataset used for the source documents (CNN/DM or XSum). # Summarizers is the number of summarizers (extractive and abstractive) included in the dataset. # Sublabel is the number of labels in the typology used to label summary errors.

The data-loaders for the benchmark are included in utils_summac_benchmark.py (link). Because the dataset relies on previously published work, the dataset requires the manual download of several datasets. For each of the 6 tasks, the link and instruction to download are present as a comment in the file. Once all the files have been compiled, the benchmark can be loaded and standardized by running:

from utils_summac_benchmark import SummaCBenchmark
benchmark_validation = SummaCBenchmark(benchmark_folder="/path/to/summac_benchmark/", cut="val")

Note: we have a plan to streamline the process by further improving to automatically download necessary files if not present, if you would like to participate please let us know. If encoutering an issue in the manual download process, please contact us.

Cite the work

If you make use of the code, models, or algorithm, please cite our paper. Bibtex to come.

Contributing

If you'd like to contribute, or have questions or suggestions, you can contact us at [email protected]. All contributions welcome, for example helping make the benchmark more easily downloadable, or improving model performance on the benchmark.

Owner
Philippe Laban
Philippe Laban
Fiddle is a Python-first configuration library particularly well suited to ML applications.

Fiddle Fiddle is a Python-first configuration library particularly well suited to ML applications. Fiddle enables deep configurability of parameters i

Google 227 Dec 26, 2022
Publication describing 3 ML examples at NSLS-II and interfacing into Bluesky

Machine learning enabling high-throughput and remote operations at large-scale user facilities. Overview This repository contains the source code and

BNL 4 Sep 24, 2022
Automated Evidence Collection for Fake News Detection

Automated Evidence Collection for Fake News Detection This is the code repo for the Automated Evidence Collection for Fake News Detection paper accept

Mrinal Rawat 2 Apr 12, 2022
The final project for "Applying AI to Wearable Device Data" course from "AI for Healthcare" - Udacity.

Motion Compensated Pulse Rate Estimation Overview This project has 2 main parts. Develop a Pulse Rate Algorithm on the given training data. Then Test

Omar Laham 2 Oct 25, 2022
Runtime type annotations for the shape, dtype etc. of PyTorch Tensors.

torchtyping Type annotations for a tensor's shape, dtype, names, ... Turn this: def batch_outer_product(x: torch.Tensor, y: torch.Tensor) - torch.Ten

Patrick Kidger 1.2k Jan 03, 2023
NasirKhusraw - The TSP solved using genetic algorithm and show TSP path overlaid on a map of the Iran provinces & their capitals.

Nasir Khusraw : Travelling Salesman Problem The TSP solved using genetic algorithm. This project show TSP path overlaid on a map of the Iran provinces

J Brave 2 Sep 01, 2022
Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding (CVPR2022)

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding by Qiaole Dong*, Chenjie Cao*, Yanwei Fu Paper and Supple

Qiaole Dong 190 Dec 27, 2022
Code for PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Relighting and Material Editing

PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Relighting and Material Editing CVPR 2021. Project page: https://kai-46.github.io/

Kai Zhang 141 Dec 14, 2022
Learning Off-Policy with Online Planning, CoRL 2021

LOOP: Learning Off-Policy with Online Planning Accepted in Conference of Robot Learning (CoRL) 2021. Harshit Sikchi, Wenxuan Zhou, David Held Paper In

Harshit Sikchi 24 Nov 22, 2022
ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers

ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers Official implementation of ViewFormer. ViewFormer is a NeRF-free neural rend

Jonáš Kulhánek 169 Dec 30, 2022
Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows

Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows This is the official implementation of the ICCV 2021 Paper "Probabilistic Mono

62 Nov 23, 2022
code associated with ACL 2021 DExperts paper

DExperts Hi! This repository contains code for the paper DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts to appear at

Alisa Liu 68 Dec 15, 2022
ICCV2021 Oral SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks

Sign-Agnostic Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page This repository contains the implementation

63 Nov 18, 2022
ONNX Command-Line Toolbox

ONNX Command Line Toolbox Aims to improve your experience of investigating ONNX models. Use it like onnx infershape /path/to/model.onnx. (See the usag

黎明灰烬 (王振华 Zhenhua WANG) 23 Nov 13, 2022
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

880 Jan 07, 2023
Densely Connected Convolutional Networks, In CVPR 2017 (Best Paper Award).

Densely Connected Convolutional Networks (DenseNets) This repository contains the code for DenseNet introduced in the following paper Densely Connecte

Zhuang Liu 4.5k Jan 03, 2023
Geometric Vector Perceptron --- a rotation-equivariant GNN for learning from biomolecular structure

Geometric Vector Perceptron Code to accompany Learning from Protein Structure with Geometric Vector Perceptrons by B Jing, S Eismann, P Suriana, RJL T

Dror Lab 85 Dec 29, 2022
RADIal is available now! Check the download section

Latest news: RADIal is available now! Check the download section. However, because we are currently working on the data anonymization, we provide for

valeo.ai 55 Jan 03, 2023
Cupytorch - A small framework mimics PyTorch using CuPy or NumPy

CuPyTorch CuPyTorch是一个小型PyTorch,名字来源于: 不同于已有的几个使用NumPy实现PyTorch的开源项目,本项目通过CuPy支持

Xingkai Yu 23 Aug 17, 2022
Image-to-image translation with conditional adversarial nets

pix2pix Project | Arxiv | PyTorch Torch implementation for learning a mapping from input images to output images, for example: Image-to-Image Translat

Phillip Isola 9.3k Jan 08, 2023