Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?"

Last update: Oct 21, 2022

Related tags

Overview

Can NLI Models Verify QA Systems' Predictions?

This repository contains the data and code for the following paper:

**Can NLI Models Verify QA Systems' Predictions? **
Jifan Chen, Eunsol Choi, Greg Durrett
EMNLP 2021 Findings

@article{chen2021can,
  title={Can NLI Models Verify QA Systems' Predictions?},
  author={Chen, Jifan and Choi, Eunsol and Durrett, Greg},
  journal={EMNLP Findings},
  year={2021}
}

Datasets

The NLI data converted from QA datasets through our pipeline described in the paper can be found here

Data Format

The data files are formatted as jsonlines; each example is described as the following:

Field	Description
`example_id`	Example ID
`title_text`	Title of the Wikipedia page of the example, could be NONE
`paragraph_text`	Paragraph containing the answer
`question_text`	Question
`answer_text`	Answer of the question
`answer_sent_text`	Sentence containing the answer
`decontext_answer_sent_text`	Decontextualized sentence containing the answer
`question_statement_text`	Declarative version of the question by combining the answer
`answer_scores`	Top 5 Answer score computed by the QA(BERT-joint) model
`is_correct`	Whether the answer is correct
`answer_sent_text`	Sentence containing the answer

Models

Getting started

git clone https://github.com/jifan-chen/QA-Verification-Via-NLI.git

Install the dependencies by running pip install -r requirements.txt

Question Converter & Decontextualizer

See README in seq2seq_converter.

NQ-NLI

coming soon

Contact

Please contact at [email protected] if you have any questions.

Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?"

Related tags

Overview

Can NLI Models Verify QA Systems' Predictions?

Datasets

Data Format

Models

Getting started

Question Converter & Decontextualizer

NQ-NLI

Contact

Owner

Jifan Chen

Tools for curating biomedical training data for large-scale language modeling

Material for GW4SHM workshop, 16/03/2022.

Input english text, then translate it between languages n times using the Deep Translator Python Library.

Estimation of the CEFR complexity score of a given word, sentence or text.

Amazon Multilingual Counterfactual Dataset (AMCD)

A Python package implementing a new model for text classification with visualization tools for Explainable AI :octocat:

Transformer Based Korean Sentence Spacing Corrector

pysentimiento: A Python toolkit for Sentiment Analysis and Social NLP tasks

Pytorch implementation of winner from VQA Chllange Workshop in CVPR'17

Code examples for my Write Better Python Code series on YouTube.

Nested Named Entity Recognition

An extension for asreview implements a version of the tf-idf feature extractor that saves the matrix and the vocabulary.

My Implementation for the paper EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks using Tensorflow

Ukrainian TTS (text-to-speech) using Coqui TTS

An official repository for tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a University of Edinburgh master's course.

An implementation of model parallel GPT-3-like models on GPUs, based on the DeepSpeed library. Designed to be able to train models in the hundreds of billions of parameters or larger.

NLP - Machine learning

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

Semi-automated vocabulary generation from semantic vector models

Simple Annotated implementation of GPT-NeoX in PyTorch