Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion



This directory provides an implementation of the algorithms used to compute the hypergeometric tail pseudo-inverse, as well as the code used to produce all figures of the paper "Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion" by Leboeuf, LeBlanc and Marchand.


To run the scripts, one must first install the package and its requirements. To do so, run the following command from the root directory:

pip install .

Doing so will also provide you with the package hypergeo, which implements an algorithm to compute the hypergeometric tail pseudo-inverses.


The code was written to run on Python 3.8 or a more recent version. The requirements are listed in the file requirements.txt and can be installed using the command:

pip install -r requirements.txt

The code

The code is split into 2 parts: the 'hypergeo' package and the 'scripts' directory.

The hypergeo package implements utilities for the hypergeometric distribution (to compute the tail and its inverse), the binomial distribution (reimplementing the inverse, as the scipy version suffered from numerical instabilities) and some generalization bounds.
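As a rough illustration of what these utilities compute, the sketch below evaluates the hypergeometric tail and inverts it by brute force, using only the standard library. It is not the hypergeo API: the names hyp_tail and hyp_tail_inverse, the argument order, and the convention of returning the smallest K whose tail is at most delta are assumptions made for this example.

```python
from math import comb


def hyp_tail(k, m, K, M):
    """P[X <= k] when m items are drawn without replacement from a
    population of M items, K of which are marked, and X counts the
    marked items in the draw. (Illustrative helper, not the hypergeo API.)"""
    total = comb(M, m)
    # comb(n, r) returns 0 when r > n, so impossible terms vanish on their own.
    favorable = sum(comb(K, j) * comb(M - K, m - j)
                    for j in range(min(k, K, m) + 1))
    return favorable / total


def hyp_tail_inverse(k, m, delta, M):
    """Smallest K in [0, M] such that hyp_tail(k, m, K, M) <= delta.
    Assumed convention: return M when no K satisfies the inequality."""
    for K in range(M + 1):
        if hyp_tail(k, m, K, M) <= delta:
            return K
    return M


# Population of 10, sample of 5, no marked item observed in the sample.
print(hyp_tail(0, 5, 1, 10))             # 0.5
print(hyp_tail_inverse(0, 5, 0.05, 10))  # 4
```

The linear scan needs up to M + 1 tail evaluations, which is one reason dedicated inversion algorithms are worth having for large populations.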

The script files produce the figures found in the paper using the hypergeo package. All figures are generated directly in LaTeX using the package python2latex. To run a script, navigate from the command line to the root directory of the project and run the command

python "./scripts/<directory>/<script>.py"


The scripts do not expose their parameters on the command line. However, each script is fairly simple, and its parameters can be changed directly in the __main__ part of the script.

Scripts used in the body of the paper

  • Section 3.3: The ghost sample trade-off. In this section, we claim that optimizing m' gives a relative gain between 8% and 10%. To obtain these numbers, first run the file mprime_tradeoff/ to generate the data, and then run mprime_tradeoff/

  • Section 5: Numerical comparison. Figures 1a and 1b are obtained by executing the scripts bounds_comparison/ and bounds_comparison/ respectively. Figures 2a and 2b are obtained by executing the scripts bounds_comparison/, the first with the variable risk set to 0, the second with it set to 0.1.

Scripts used in the appendices of the paper

  • Appendix B: Overview of the hypergeometric distribution. Figure 3 is generated from hypergeometric_tail/ and Figure 4 is generated from hypergeometric_tail/; Algorithm 1 is implemented in the hypergeo file hypergeo/ as the function hypergeometric_tail_inverse, and Algorithm 2 is implemented in the hypergeo file hypergeo/ as the function berkopec_hypergeometric_tail_inverse.

  • Appendix D: In-depth analysis of the ghost sample trade-off. Figure 5 is generated from mprime_tradeoff/ and Figure 6 is generated from mprime_tradeoff/

  • Appendix E: The hypergeometric tail inversion relative deviation bound. To generate Figures 7 and 8, you must first run the file relative_deviation_mprime_tradeoff/ to generate the data, then run the script relative_deviation_mprime_tradeoff/ to produce Figure 7 and relative_deviation_comparison/ to produce Figure 8.

  • Appendix G: The hypergeometric tail lower bound. Figure 9 is generated from lower_bound/

  • Appendix F: Further numerical comparisons. Figures 10 and 12a are generated from bounds_comparison/ by changing the parameters of the script. Figures 11 and 12b are generated from bounds_comparison/ by changing the parameters of the script. Figures 13a and 13b are generated from bounds_comparison/ and bounds_comparison/ respectively.
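Because the hypergeometric tail is non-increasing in the number K of marked items in the population, the inversion discussed under Appendix B can be carried out by bisection rather than by scanning every K. The sketch below is one minimal way to do this under stated assumptions; it is not claimed to match Algorithm 1 or Algorithm 2, and the function names, the argument order, and the "smallest K with tail at most delta" convention are hypothetical choices for the example.

```python
from math import comb


def hyp_tail(k, m, K, M):
    """P[X <= k] for m draws without replacement from M items, K marked.
    (Illustrative helper, not the hypergeo API.)"""
    favorable = sum(comb(K, j) * comb(M - K, m - j)
                    for j in range(min(k, K, m) + 1))
    return favorable / comb(M, m)


def hyp_tail_inverse_bisect(k, m, delta, M):
    """Smallest K in [0, M] with hyp_tail(k, m, K, M) <= delta, found by
    bisection on K. Assumed convention: return M if no such K exists."""
    if hyp_tail(k, m, M, M) > delta:
        return M  # the tail never drops to delta or below
    lo, hi = 0, M  # invariant: the answer lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if hyp_tail(k, m, mid, M) <= delta:
            hi = mid        # mid satisfies the condition, look for a smaller K
        else:
            lo = mid + 1    # mid fails, so the answer is strictly larger
    return lo


print(hyp_tail_inverse_bisect(0, 5, 0.05, 10))  # 4
```

Bisection needs on the order of log2(M) tail evaluations instead of up to M + 1, at the cost of relying on the monotonicity of the tail in K.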


The script pseudo-inverse_benchmarking/ benchmarks the various algorithms used to invert the hypergeometric tail. The 'tests' directory contains unit tests using the package pytest.

Jean-Samuel Leboeuf
PhD candidate in Computer Sciences (Machine Learning). MSc in Theoretical Physics.