Algorithmic encoding of protected characteristics and its implications on disparities across subgroups

Overview

Figure: Components of a deep neural network

This repository contains the code for the paper

B. Glocker, S. Winzeck. Algorithmic encoding of protected characteristics and its implications on disparities across subgroups. 2021. Under review. arXiv:2110.14755

Dataset

The CheXpert imaging dataset together with the patient demographic information used in this work can be downloaded from https://stanfordmlgroup.github.io/competitions/chexpert/.

Code

For running the code, we recommend setting up a dedicated Python environment.

Set up Python environment using conda

Create and activate a Python 3 conda environment:

conda create -n chexploration python=3
conda activate chexploration

Install PyTorch using conda:

conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

Set up Python environment using virtualenv

Create and activate a Python 3 virtual environment:

virtualenv -p python3 <path_to_envs>/chexploration
source <path_to_envs>/chexploration/bin/activate

Install PyTorch using pip:

pip install torch torchvision

Install additional Python packages:

pip install matplotlib jupyter pandas seaborn pytorch-lightning scikit-learn scikit-image tensorboard tqdm openpyxl
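After installation, a quick check that the packages are visible to the interpreter can save a failed notebook run later. The following is a minimal sketch, not part of the repository: the helper name missing_modules and the list REQUIRED_MODULES are hypothetical, and the list uses import names, which differ from some pip names (scikit-learn is imported as sklearn, scikit-image as skimage, pytorch-lightning as pytorch_lightning).

```python
from importlib.util import find_spec

# Import names for the packages installed above (assumed list, adjust as needed).
# jupyter is left out because it is used as a command-line tool, not imported.
REQUIRED_MODULES = [
    "torch", "torchvision", "matplotlib", "pandas", "seaborn",
    "pytorch_lightning", "sklearn", "skimage", "tensorboard", "tqdm", "openpyxl",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec locates a module without importing it and returns None if absent.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it inside the activated chexploration environment; any module it reports can be installed with the pip command above.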

How to use

In order to replicate the results presented in the paper, please follow these steps:

  1. Download the CheXpert dataset and copy the file train.csv to the datafiles folder
  2. Download the CheXpert demographics data and copy the file CHEXPERT DEMO.xlsx to the datafiles folder
  3. Run the notebook chexpert.sample.ipynb to generate the study data
  4. Adjust the variable img_data_dir to point to the imaging data and run the training scripts for the three prediction models
  5. Run the notebook chexpert.predictions.ipynb to evaluate all three prediction models
  6. Run the notebook chexpert.explorer.ipynb for the unsupervised exploration of feature representations
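Before running the sampling notebook in step 3, it may help to confirm that the two input files from steps 1 and 2 are in place. This is a small sketch under assumptions: the file names come from the steps above, while the helper missing_datafiles and the relative folder path "datafiles" are hypothetical and depend on where the repository was cloned.

```python
from pathlib import Path

# File names taken from steps 1 and 2 above.
EXPECTED_FILES = ["train.csv", "CHEXPERT DEMO.xlsx"]


def missing_datafiles(datafiles_dir):
    """Return the expected input files that are not present in datafiles_dir."""
    folder = Path(datafiles_dir)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # "datafiles" is assumed to be relative to the current working directory.
    missing = missing_datafiles("datafiles")
    print("Missing input files:", missing if missing else "none")
```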

Additionally, there are scripts chexpert.sex.split.py and chexpert.race.split.py to run SPLIT on the disease detection model. The default setting in all scripts is to train a DenseNet-121 using the training data from all patients. The results for models trained on subgroups only can be produced by changing the path to the datafiles (e.g., using full_sample_train_white.csv and full_sample_val_white.csv instead of full_sample_train.csv and full_sample_val.csv).
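The file naming pattern for subgroup runs can be captured in a small helper. The sketch below is illustrative only: the function sample_csv_paths is hypothetical and not part of the repository, and how each script reads these paths is not shown here. Only the file names for the full sample and the white subgroup are taken from the text above; other subgroup suffixes would need to be checked against the files produced by the sampling notebook.

```python
from pathlib import Path


def sample_csv_paths(datafiles_dir, subgroup=None):
    """Build the train/val CSV paths for all patients or for one subgroup.

    subgroup=None    -> full_sample_train.csv, full_sample_val.csv
    subgroup="white" -> full_sample_train_white.csv, full_sample_val_white.csv
    """
    suffix = f"_{subgroup}" if subgroup else ""
    folder = Path(datafiles_dir)
    return (
        folder / f"full_sample_train{suffix}.csv",
        folder / f"full_sample_val{suffix}.csv",
    )


if __name__ == "__main__":
    # "datafiles" is an assumed relative location of the generated study data.
    train_csv, val_csv = sample_csv_paths("datafiles", subgroup="white")
    print(train_csv, val_csv)
```

The resulting paths would then replace the default ones in whichever script is being run.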

Note that the Python scripts also contain code for running the experiments with a ResNet-34 backbone, which requires less GPU memory.

Trained models

All trained models, feature embeddings and output predictions can be found here.

Funding sources

This work is supported through funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 757173, Project MIRA, ERC-2017-STG) and by the UKRI London Medical Imaging & Artificial Intelligence Centre for Value Based Healthcare.

License

This project is licensed under the Apache License 2.0.

Owner
Team MIRA - BioMedIA