SCROLLS
This repository contains the official code for the paper "SCROLLS: Standardized CompaRison Over Long Language Sequences".
Links
Citation
@misc{shaham2022scrolls,
  title={SCROLLS: Standardized CompaRison Over Long Language Sequences},
  author={Uri Shaham and Elad Segal and Maor Ivgi and Avia Efrat and Ori Yoran and Adi Haviv and Ankit Gupta and Wenhan Xiong and Mor Geva and Jonathan Berant and Omer Levy},
  year={2022},
  eprint={2201.03533},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
Loading the SCROLLS Benchmark Datasets
- via the 🤗 Datasets (huggingface/datasets) library (recommended):
Usage:
    from datasets import load_dataset

    qasper_dataset = load_dataset("tau/scrolls", "qasper")
    """
    Options are: ["gov_report", "summ_screen_fd", "qmsum", "narrative_qa", "qasper", "quality", "contract_nli"]
    """
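A mistyped configuration name only fails once `load_dataset` tries to resolve it, so a small wrapper that checks the name first can be convenient. The sketch below is one possible way to do this; `SCROLLS_CONFIGS` and `load_scrolls` are hypothetical names introduced here for illustration and are not part of this repository or of the 🤗 Datasets library. The list of names is copied from the options in the usage snippet above.

```python
# Configuration names copied from the "Options are" list in the usage snippet.
SCROLLS_CONFIGS = [
    "gov_report",
    "summ_screen_fd",
    "qmsum",
    "narrative_qa",
    "qasper",
    "quality",
    "contract_nli",
]


def load_scrolls(config_name):
    """Hypothetical helper: check the name, then defer to load_dataset."""
    if config_name not in SCROLLS_CONFIGS:
        raise ValueError(
            f"Unknown SCROLLS config {config_name!r}; "
            f"expected one of {SCROLLS_CONFIGS}"
        )
    # Imported lazily so the name check itself works without the library.
    from datasets import load_dataset

    # Same call as in the usage snippet; data is downloaded on first use.
    return load_dataset("tau/scrolls", config_name)
```

With this in place, `load_scrolls("qasper")` makes the same call as the usage snippet, while a misspelled name such as `load_scrolls("qaspar")` raises a `ValueError` before any download is attempted.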
- via ZIP files, where each split is in a JSONL file: