PiRank: Learning to Rank via Differentiable Sorting

Last update: Dec 17, 2022

This repository provides a reference implementation for learning PiRank-based models as described in the paper:

PiRank: Learning to Rank via Differentiable Sorting
Robin Swezey, Aditya Grover, Bruno Charron and Stefano Ermon.
Paper: https://arxiv.org/abs/2012.06731

Requirements

The codebase is implemented in Python 3.7. To install the necessary base requirements, run the following command:

pip install -r requirements.txt

If you intend to use a GPU, modify requirements.txt to install tensorflow-gpu instead of tensorflow.

You will also need the NeuralSort implementation available here. Make sure it is added to your PYTHONPATH.

Datasets

PiRank was tested on the following two datasets:

The MSLR WEB30K data can be found at this address.
The Yahoo! C14 dataset can be found at this address.

Additionally, the code is expected to work with any dataset stored in the standard LibSVM format used for LTR experiments.
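
As an illustration of the LibSVM layout commonly used for LTR data, the sketch below parses lines of the form `<label> qid:<id> <feature>:<value> ...` and groups documents by query. The helper names (`parse_line`, `group_by_query`) and the sample rows are hypothetical and are not part of this repository, which relies on TensorFlow Ranking for data loading.

```python
from collections import defaultdict


def parse_line(content):
    """Parse one LibSVM LTR row into (label, qid, {feature_id: value})."""
    tokens = content.split()
    label = int(float(tokens[0]))
    qid = tokens[1].split(":", 1)[1]
    features = {}
    for token in tokens[2:]:
        fid, value = token.split(":", 1)
        features[int(fid)] = float(value)
    return label, qid, features


def group_by_query(lines):
    """Group parsed documents by query id, skipping blank and comment-only lines."""
    groups = defaultdict(list)
    for line in lines:
        content = line.split("#", 1)[0].strip()  # drop trailing comments
        if not content:
            continue
        label, qid, features = parse_line(content)
        groups[qid].append((label, features))
    return dict(groups)


# Hypothetical sample rows; real datasets have many more features per document.
sample = [
    "2 qid:1 1:0.5 2:0.1 3:0.0",
    "0 qid:1 1:0.1 2:0.9 3:1.0",
    "1 qid:2 1:0.3 3:0.7  # sparse row",
]
queries = group_by_query(sample)
```

Each query maps to a list of (label, features) pairs, which is the per-query list structure that listwise ranking losses operate on.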

Scripts

The code consists of two scripts:

pirank_simple.py implements a simple depth-1 PiRank loss (d=1). It is used in the experiments of sections 4.1 (benchmark evaluation on MSLR-WEB30K and Yahoo! C14 datasets), 4.2.1 (effect of temperature parameter), and 4.2.2 (effect of training list size).
pirank_deep.py implements the deeper PiRank losses (d>=1). It is used for the experiments of section 4.2.3 and comes with a convenient synthetic data generator as well as more tuning options.

Options

Options are handled by Sacred (see Examples section below).

pirank_simple.py and pirank_deep.py

PiRank-related:

Parameter	Default Value	Description
loss_fn	pirank_simple_loss	The loss function to use (either a TFR RankingLossKey, or loss function from the script)
ste	False	Whether to use the Straight-Through Estimator
ndcg_k	15	NDCG@k cutoff when using NS-NDCG loss
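
For reference, the metric that the ndcg_k cutoff applies to can be sketched in plain Python as follows. This is a minimal, non-differentiable version that assumes the common exponential gain (2^rel - 1) and log2 discount; the function names are illustrative and it is not the relaxed loss implemented in the scripts.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(scores, labels, k):
    """NDCG@k of the ranking induced by sorting documents by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked_labels = [labels[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked_labels, k) / ideal_dcg


perfect = ndcg_at_k([0.9, 0.5, 0.1], [3, 2, 0], k=2)
reversed_order = ndcg_at_k([0.1, 0.2, 0.3], [3, 2, 0], k=2)
```

Here `perfect` evaluates a ranking that agrees with the labels, while `reversed_order` places the most relevant document last, so it falls outside the top-2 cutoff.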

NeuralSort-related:

Parameter	Default Value	Description
tau	5	Temperature
taustar	1e-10	Temperature for the true labels and for straight-through estimation
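
To give an intuition for the two temperatures, here is a minimal pure-Python sketch of the NeuralSort relaxation (Grover et al., 2019), in which row i of the relaxed permutation matrix is softmax(((n + 1 - 2i) s - A_s 1) / tau) with A_s[j][k] = |s_j - s_k|. The function names are illustrative; the scripts use the TensorFlow implementation from the NeuralSort repository, not this code.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def neuralsort(scores, tau):
    """Relaxed permutation matrix; row i is a soft one-hot for the i-th largest score."""
    n = len(scores)
    # abs_sums[j] = sum_k |s_j - s_k|, i.e. the j-th entry of A_s 1
    abs_sums = [sum(abs(sj - sk) for sk in scores) for sj in scores]
    rows = []
    for i in range(1, n + 1):
        logits = [((n + 1 - 2 * i) * s - a) / tau for s, a in zip(scores, abs_sums)]
        rows.append(softmax(logits))
    return rows


scores = [1.0, 3.0, 2.0]
p_soft = neuralsort(scores, tau=5.0)    # smooth rows, useful gradients
p_hard = neuralsort(scores, tau=1e-10)  # rows collapse to (near) one-hot vectors
```

With a large tau each row spreads probability mass over several documents, whereas a very small temperature such as the taustar default yields an essentially hard permutation matrix.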

TensorFlow-Ranking and architecture-related:

Parameter	Default Value	Description
hidden_layers	"256,tanh,128,tanh,64,tanh"	Hidden layers for an example-wise feedforward network in the format size,activation,...,size,activation
num_features	136	Number of features per document. The default value is for MSLR and depends on the dataset (e.g. for Yahoo!, please change to 700).
list_size	100	List size used for training
group_size	1	Group size used in score function
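
The hidden_layers format (size,activation,...,size,activation) can be read as alternating pairs. The sketch below shows one way such a string could be split into (size, activation) tuples; `parse_hidden_layers` is a hypothetical helper for illustration, and the actual parsing in the scripts may differ.

```python
def parse_hidden_layers(spec):
    """Split 'size,activation,...' into a list of (size, activation) pairs."""
    tokens = [token.strip() for token in spec.split(",") if token.strip()]
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating size,activation entries: %r" % spec)
    return [(int(size), activation) for size, activation in zip(tokens[0::2], tokens[1::2])]


layers = parse_hidden_layers("256,tanh,128,tanh,64,tanh")
```

For the default value this yields three layers of sizes 256, 128 and 64, each followed by a tanh activation.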

Training-related:

Parameter	Default Value	Description
train_path	"/data/MSLR-WEB30K/Fold*/train.txt"	Input file path used for training
vali_path	"/data/MSLR-WEB30K/Fold*/vali.txt"	Input file path used for validation
test_path	"/data/MSLR-WEB30K/Fold*/test.txt"	Input file path used for testing
model_dir	None	Output directory for models
num_epochs	200	Number of epochs to train, set 0 to just test
lr	1e-4	initial learning rate
batch_size	32	The batch size for training
num_train_steps	None	Number of steps for training
num_vali_steps	None	Number of steps for validation
num_test_steps	None	Number of steps for testing
learning_rate	0.01	Learning rate for optimizer
dropout_rate	0.5	The dropout rate before output layer
optimizer	Adagrad	The optimizer for gradient descent

Sacred:

In addition, you can use regular parameters from Sacred (such as -m for logging the experiment to MongoDB).

pirank_deep.py only

Parameter	Default Value	Description
merge_block_size	None	Block size used if merging, None if not merging
top_k	None	Use a different Top-k for merging than final NDCG@k for loss
straight_backprop	False	Backpropagate on scores only through NS operator
full_loss	False	Use the complete loss at the end of merge
tau_scheme	None	Which scheme to use for temperature going deeper (default: constant)
data_generator	None	Data generator (default: TFR's libsvm); use this for synthetic generation
num_queries	30000	Number of queries for synthetic data generator
num_query_features	10	Number of columns used as factors for each query by synthetic data generator
actual_list_size	None	Size of actual list per query in synthetic data generation
train_path	"/data/MSLR-WEB30K/Fold*/train.txt"	Input file path used for training; alternatively value of seed if using data generator
vali_path	"/data/MSLR-WEB30K/Fold*/vali.txt"	Input file path used for validation; alternatively value of seed if using data generator
test_path	"/data/MSLR-WEB30K/Fold*/test.txt"	Input file path used for testing; alternatively value of seed if using data generator
with_opa	True	Include pairwise metric OPA
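
As a rough illustration of the pairwise metric enabled by with_opa, ordered pair accuracy (OPA) can be sketched as the fraction of document pairs with different labels that the scores order correctly. This is a simplified, unweighted version with an illustrative function name, not the TensorFlow Ranking metric used by the script.

```python
from itertools import combinations


def ordered_pair_accuracy(scores, labels):
    """Fraction of pairs with distinct labels whose score order matches the label order."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            continue  # ties in the labels carry no ordering information
        total += 1
        better, worse = (i, j) if labels[i] > labels[j] else (j, i)
        if scores[better] > scores[worse]:
            correct += 1
    return correct / total if total else 0.0


opa = ordered_pair_accuracy([0.9, 0.1, 0.5], [2, 1, 0])
```

In this example two of the three informative pairs are ordered correctly, since the document with label 1 is scored below the one with label 0.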

Examples

Run the benchmark experiment of section 4.1 with PiRank simple loss on MSLR-WEB30K

cd pirank
python3 pirank_simple.py with loss_fn=pirank_simple_loss \
    ndcg_k=10 \
    tau=5 \
    list_size=80 \
    hidden_layers=256,relu,256,relu,128,relu,64,relu \
    train_path=/data/MSLR-WEB30K/Fold1/train.txt \
    vali_path=/data/MSLR-WEB30K/Fold1/vali.txt \
    test_path=/data/MSLR-WEB30K/Fold1/test.txt \
    num_features=136 \
    optimizer=Adam \
    learning_rate=0.00001 \
    num_epochs=100 \
    batch_size=16 \
    model_dir=/tmp/model

Run the benchmark experiment of section 4.1 with PiRank simple loss on Yahoo! C14

cd pirank
python3 pirank_simple.py with loss_fn=pirank_simple_loss \
    ndcg_k=10 \
    tau=5 \
    list_size=80 \
    hidden_layers=256,relu,256,relu,128,relu,64,relu \
    train_path=/data/YAHOO/set1.train.txt \
    vali_path=/data/YAHOO/set1.valid.txt \
    test_path=/data/YAHOO/set1.test.txt \
    num_features=700 \
    optimizer=Adam \
    learning_rate=0.00001 \
    num_epochs=100 \
    batch_size=16 \
    model_dir=/tmp/model

Run the benchmark experiment of section 4.1 with classic LambdaRank on MSLR-WEB30K

cd pirank
python3 pirank_simple.py with loss_fn=lambda_rank_loss \
    ndcg_k=10 \
    tau=5 \
    list_size=80 \
    hidden_layers=256,relu,256,relu,128,relu,64,relu \
    train_path=/data/MSLR-WEB30K/Fold1/train.txt \
    vali_path=/data/MSLR-WEB30K/Fold1/vali.txt \
    test_path=/data/MSLR-WEB30K/Fold1/test.txt \
    num_features=136 \
    optimizer=Adam \
    learning_rate=0.00001 \
    num_epochs=100 \
    batch_size=16 \
    model_dir=/tmp/model

Run the scaling ablation experiment of section 4.2.3 using synthetic data generation (d=2)

cd pirank
python3 pirank_deep.py with loss_fn=pirank_deep_loss \
    ndcg_k=10 \
    ste=True \
    merge_block_size=100 \
    tau=5 \
    taustar=1e-10 \
    tau_scheme=square \
    data_generator=synthetic_data_generator \
    actual_list_size=1000 \
    list_size=1000 \
    vali_list_size=1000 \
    test_list_size=1000 \
    full_loss=False \
    train_path=0 \
    vali_path=1 \
    test_path=2 \
    num_queries=1000 \
    num_features=25 \
    num_query_features=5 \
    hidden_layers=256,relu,256,relu,128,relu,128,relu,64,relu,64,relu \
    optimizer=Adam \
    learning_rate=0.00001 \
    num_epochs=100 \
    batch_size=16

Help

If you need help, reach out to Robin Swezey or raise an issue.

Citing

If you find PiRank useful in your research, please consider citing the following paper:

@inproceedings{
swezey2020pirank,
title={PiRank: Learning to Rank via Differentiable Sorting},
author={Robin Swezey and Aditya Grover and Bruno Charron and Stefano Ermon},
year={2020},
url={},
}
