Code for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators"

Overview

Query Variation Generators

This repository contains the code and annotation data for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators".

Setup

Install the requirements using

pip install -r requirements.txt

Steps to reproduce the results

First we need to generate_weak supervsion for the desired test sets. We can do that with the scripts/generate_weak_supervision.py. In the paper we test for TREC-DL ('msmarco-passage/trec-dl-2019/judged') and ANTIQUE ('antique/train/split200-valid'), but any IR-datasets (https://ir-datasets.com/index.html) can be used here (as TASK).

python ${REPO_DIR}/examples/generate_weak_supervision.py 
    --task $TASK \
    --output_dir $OUT_DIR 

This will generate one query variation for each method for the original queries. After this, we manually annotated the query variations generated, in order to keep only valid ones for analysis. For that we use analyze_weak_supervision.py (prepares data for manual anotation) and analyze_auto_query_generation_labeling.py (combines auto labels and anotations.).

However, for reproducing the results we can directly use the annotated query set to test neural ranking models robustness (RQ1):

python ${REPO_DIR}/disentangled_information_needs/evaluation/query_rewriting.py \
        --task 'irds:msmarco-passage/trec-dl-2019/judged' \
        --output_dir $OUT_DIR/ \
        --variations_file $OUT_DIR/$VARIATIONS_FILE_TREC_DL \
        --retrieval_model_name "BM25+KNRM" \
        --train_dataset "irds:msmarco-passage/train" \
        --max_iter $MAX_ITER

by using the annotated variations file directly here "$OUT_DIR/$VARIATIONS_FILE_TREC_DL". The same can be done to run rank fusion (RQ2) by replacing query_rewriting.py with rank_fusion.py.

The scripts evaluate_weak_supervision.sh and evaluate_rank_fusion.sh run all models and datasets for both research questions . The first generates the main table of results, Table 4 in the paper, and the second generates the tables for the rank fusion experiments (only available in the Arxiv version of the paper).

Modules and Folders

  • scripts: Contain most of the analysis scripts and also commands to run entire experiments.
  • examples: Contain an example on how to generate query variations.
  • disentangled_information_needs/evaluation: Scripts to evaluate robustness of models for query variations and also to evaluate rank fusion of query variations.
  • disentangled_information_needs/transformations: Methods to generate query variations.
Owner
Gustavo Penha
Researcher - IR - RecSys - ML - NLP. https://linktr.ee/guzpenha
Gustavo Penha
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"

This is an official pytorch implementation of ActionCLIP: A New Paradigm for Video Action Recognition [arXiv] Overview Content Prerequisites Data Prep

268 Jan 09, 2023
Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Deep 3D Mask Volume for View Synthesis of Dynamic Scenes Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic S

Ken Lin 17 Oct 12, 2022
Official repository of "DeepMIH: Deep Invertible Network for Multiple Image Hiding", TPAMI 2022.

DeepMIH: Deep Invertible Network for Multiple Image Hiding (TPAMI 2022) This repo is the official code for DeepMIH: Deep Invertible Network for Multip

Junpeng Jing 67 Nov 22, 2022
This is a TensorFlow implementation for C2-Rec

This is a TensorFlow implementation for C2-Rec We refer to the repo SASRec. Requirements requirement.txt Datasets This repo includes Amazon Beauty dat

7 Nov 14, 2022
Transfer Learning library for Deep Neural Networks.

Transfer and meta-learning in Python Each folder in this repository corresponds to a method or tool for transfer/meta-learning. xfer-ml is a standalon

Amazon 245 Dec 08, 2022
Graph Convolutional Networks in PyTorch

Graph Convolutional Networks in PyTorch PyTorch implementation of Graph Convolutional Networks (GCNs) for semi-supervised classification [1]. For a hi

Thomas Kipf 4.5k Dec 31, 2022
Project page of the paper 'Analyzing Perception-Distortion Tradeoff using Enhanced Perceptual Super-resolution Network' (ECCVW 2018)

EPSR (Enhanced Perceptual Super-resolution Network) paper This repo provides the test code, pretrained models, and results on benchmark datasets of ou

Subeesh Vasu 78 Nov 19, 2022
Code accompanying "Adaptive Methods for Aggregated Domain Generalization"

Adaptive Methods for Aggregated Domain Generalization (AdaClust) Official Pytorch Implementation of Adaptive Methods for Aggregated Domain Generalizat

Xavier Thomas 15 Sep 20, 2022
A pytorch-based real-time segmentation model for autonomous driving

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation This project contains the Pytorch implementation for the proposed CFPNet: pap

342 Dec 22, 2022
Source code for The Power of Many: A Physarum Swarm Steiner Tree Algorithm

Physarum-Swarm-Steiner-Algo Source code for The Power of Many: A Physarum Steiner Tree Algorithm Code implements ideas from the following papers: Sher

Sheryl Hsu 2 Mar 28, 2022
Official implementation of the paper "Topographic VAEs learn Equivariant Capsules"

Topographic Variational Autoencoder Paper: https://arxiv.org/abs/2109.01394 Getting Started Install requirements with Anaconda: conda env create -f en

T. Andy Keller 69 Dec 12, 2022
Pull sensitive data from users on windows including discord tokens and chrome data.

⭐ For a 🍪 Pegasus Pull sensitive data from users on windows including discord tokens and chrome data. Features 🟩 Discord tokens 🟩 Geolocation data

Addi 44 Dec 31, 2022
《A-CNN: Annularly Convolutional Neural Networks on Point Clouds》(2019)

A-CNN: Annularly Convolutional Neural Networks on Point Clouds Created by Artem Komarichev, Zichun Zhong, Jing Hua from Department of Computer Science

Artёm Komarichev 44 Feb 24, 2022
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Advanced Image Manipulation Lab @ Samsung AI Center Moscow 4.7k Dec 31, 2022
A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022) https://arxiv.org/abs/2203.09388 Jianqi Ma, Zheto

MA Jianqi, shiki 104 Jan 05, 2023
Plaything for Autistic Children (demo for PaddlePaddle/Wechaty/Mixlab project)

星星的孩子 - 一款为孤独症孩子设计的聊天机器人游戏 孤独症儿童是目前常常被忽视的一类群体。他们有着类似性格内向的特征,实际却受着广泛性发育障碍的折磨。 项目背景 这类儿童在与人交往时存在着沟通障碍,其特点表现在: 社交交流差,互动障碍明显 认知能力有限,被动认知 兴趣狭窄,重复刻板,缺乏变化和想象

Tianyi Pan 35 Nov 24, 2022
PyTorch implementation of Spiking Neural Networks trained on surrogate gradient & BPTT using snntorch.

snn-localization repo PyTorch implementation of Spiking Neural Networks trained on surrogate gradient & BPTT using snntorch. Install Dependencies Orig

Sami BARCHID 1 Jan 06, 2022
SOTR: Segmenting Objects with Transformers [ICCV 2021]

SOTR: Segmenting Objects with Transformers [ICCV 2021] By Ruohao Guo, Dantong Niu, Liao Qu, Zhenbo Li Introduction This is the official implementation

186 Dec 20, 2022
Simple and Distributed Machine Learning

Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy

Microsoft 3.9k Dec 30, 2022
Author: Wenhao Yu ([email protected]). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation

Diversifying Commonsense Reasoning Generation on Knowledge Graph Introduction -- This is the pytorch implementation of our ACL 2022 paper "Diversifyin

DM2 Lab @ ND 61 Dec 30, 2022