CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer

Related tags

Deep LearningCSAW-M
Overview

CSAW-M

This repository contains code for CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer. Source code for training models to estimate the mammographic masking level along with the checkpoints are made available here.
The repo containing the annotation tool developed to annotate CSAW-M could be found here. The dataset could be found here.


Training and evaluation

  • In order to train a model, please refer to scripts/train.sh where we have prepared commands and arguments to train a model. In order to encourage reproducibility, we also provide the cross-validation splits that we used in the project (please refer to the dataset website to access them). scripts/cross_val.sh provides example commands to run cross-validation.
  • In order to evaluate a trained model, please refer to scripts/eval.sh with example commands and arguments to evaluate a model.
  • Checkpoints could be downloaded from here.

Important arguments defined in in the main module

  • --train and --evaluate which should be used in training and evaluating models respectively.
  • --model_name: specifies the model name, which will then be used for saving/loading checkpoints
  • --loss_type: defines which loss type to train the model with. It could be either one_hot which means training the model in a multi-class setup under usual cross entropy loss, or multi_hot which means training the model in a multi-label setup using multi-hot encoding (defined for ordinal labels). Please refer to paper for more details.
  • --img_size: specifies the image size to train the model with.
  • Almost all the params in params.yml could be overridden using the corresponding arguments. Please refer to main.py to see the corresponding args.

Other notes

  • It is assumed that main.py is called from inside the src directory.
  • It is important to note that in the beginning of the main script, after reading/checking arguments, params defined in params.ymlis read and updated according to args, after which a call to the set_globals (defined in main.py) is made. This sets global params needed to run the program (GPU device, loggers etc.) For every new high-level module (like main.py) that accepts running arguments and calls other modules, this function shoud be called, as other modules assume that these global params are set.
  • By default, there is no suggested validation csv files, but in cross-validation (using --cv) the train/validation splits in each fold are extracted from the cv_files paths specified in params.yml.
  • In src/experiments.py you can find the call to the function that preprocesses the raw images. For some images we have defined a special set of parameters to be used to ensure text is successfully removed from the images during preprocessing. We have documented every step of the preprocessing function to make it more udnerstandable - feel free to modify it if you want to have your own preprocessed images!
  • The Dockerfile and packages used in this project could be found in the docker folder.

Citation

If you use this work, please cite our paper:

@article{sorkhei2021csaw,
  title={CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer},
  author={Sorkhei, Moein and Liu, Yue and Azizpour, Hossein and Azavedo, Edward and Dembrower, Karin and Ntoula, Dimitra and Zouzos, Athanasios and Strand, Fredrik and Smith, Kevin},
  year={2021}
}

Questions or suggestions?

Please feel free to contact us in case you have any questions or suggestions!

Owner
Yue Liu
PhD student in deep learning at KTH.
Yue Liu
Keras documentation, hosted live at keras.io

Keras.io documentation generator This repository hosts the code used to generate the keras.io website. Generating a local copy of the website pip inst

Keras 2k Jan 08, 2023
Repository for MDPGT

MD-PGT Repository for implementing and reproducing the results for the paper MDPGT: Momentum-based Decentralized Policy Gradient Tracking. Available E

Xian Yeow Lee 2 Dec 30, 2021
Tracking Pipeline helps you to solve the tracking problem more easily

Tracking_Pipeline Tracking_Pipeline helps you to solve the tracking problem more easily I integrate detection algorithms like: Yolov5, Yolov4, YoloX,

VNOpenAI 32 Dec 21, 2022
Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch

Rotary Embeddings - Pytorch A standalone library for adding rotary embeddings to transformers in Pytorch, following its success as relative positional

Phil Wang 110 Dec 30, 2022
Weakly-supervised object detection.

Wetectron Wetectron is a software system that implements state-of-the-art weakly-supervised object detection algorithms. Project CVPR'20, ECCV'20 | Pa

NVIDIA Research Projects 342 Jan 05, 2023
PED: DETR for Crowd Pedestrian Detection

PED: DETR for Crowd Pedestrian Detection Code for PED: DETR For (Crowd) Pedestrian Detection Paper PED: DETR for Crowd Pedestrian Detection Installati

36 Sep 13, 2022
Scikit-learn compatible estimation of general graphical models

skggm : Gaussian graphical models using the scikit-learn API In the last decade, learning networks that encode conditional independence relationships

213 Jan 02, 2023
OMLT: Optimization and Machine Learning Toolkit

OMLT is a Python package for representing machine learning models (neural networks and gradient-boosted trees) within the Pyomo optimization environment.

C⚙G - Imperial College London 179 Jan 02, 2023
Constructing Neural Network-Based Models for Simulating Dynamical Systems

Constructing Neural Network-Based Models for Simulating Dynamical Systems Note this repo is work in progress prior to reviewing This is a companion re

Christian Møldrup Legaard 21 Nov 25, 2022
A library for performing coverage guided fuzzing of neural networks

TensorFuzz: Coverage Guided Fuzzing for Neural Networks This repository contains a library for performing coverage guided fuzzing of neural networks,

Brain Research 195 Dec 28, 2022
Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.

CycleGAN PyTorch | project page | paper Torch implementation for learning an image-to-image translation (i.e. pix2pix) without input-output pairs, for

Jun-Yan Zhu 11.5k Dec 30, 2022
Safe Control for Black-box Dynamical Systems via Neural Barrier Certificates

Safe Control for Black-box Dynamical Systems via Neural Barrier Certificates Installation Clone the repository: git clone https://github.com/Zengyi-Qi

Zengyi Qin 3 Oct 18, 2022
AI drive app that can help user become beautiful.

爱美丽 Beauty 简体中文 Features Beauty is an AI drive app that can help user become beautiful. it contain those functions: face score cheek face beauty repor

Starved Midnight 1 Jan 30, 2022
[NeurIPS 2021] Galerkin Transformer: a linear attention without softmax

[NeurIPS 2021] Galerkin Transformer: linear attention without softmax Summary A non-numerical analyst oriented explanation on Toward Data Science abou

Shuhao Cao 159 Dec 20, 2022
Mask-invariant Face Recognition through Template-level Knowledge Distillation

Mask-invariant Face Recognition through Template-level Knowledge Distillation This is the official repository of "Mask-invariant Face Recognition thro

Fadi Boutros 35 Dec 06, 2022
Lua-parser-lark - An out-of-box Lua parser written in Lark

An out-of-box Lua parser written in Lark Such parser handles a relaxed version o

Taine Zhao 2 Jul 19, 2022
Riemannian Convex Potential Maps

Modeling distributions on Riemannian manifolds is a crucial component in understanding non-Euclidean data that arises, e.g., in physics and geology. The budding approaches in this space are limited b

Facebook Research 61 Nov 28, 2022
Training DALL-E with volunteers from all over the Internet using hivemind and dalle-pytorch (NeurIPS 2021 demo)

Training DALL-E with volunteers from all over the Internet This repository is a part of the NeurIPS 2021 demonstration "Training Transformers Together

<a href=[email protected]"> 19 Dec 13, 2022
Object Database for Super Mario Galaxy 1/2.

Super Mario Galaxy Object Database Welcome to the public object database for Super Mario Galaxy and Super Mario Galaxy 2. Here, we document all object

Aurum 9 Dec 04, 2022
Portfolio Optimization and Quantitative Strategic Asset Allocation in Python

Riskfolio-Lib Quantitative Strategic Asset Allocation, Easy for Everyone. Description Riskfolio-Lib is a library for making quantitative strategic ass

Riskfolio 1.7k Jan 07, 2023