Pre-training of Graph Augmented Transformers for Medication Recommendation

Last update: Dec 27, 2022

G-Bert

Intro

G-Bert combines the power of Graph Neural Networks (GNNs) and BERT (Bidirectional Encoder Representations from Transformers) for medical code representation and medication recommendation. We use GNNs to represent the structural information of medical codes from a medical ontology. We then integrate the GNN representation into a transformer-based visit encoder and pre-train it on single-visit EHR data. The pre-trained visit encoder and representation can be fine-tuned for downstream medical prediction tasks. Our model is the first to bring the language model pre-training schema into the healthcare domain, and it achieved state-of-the-art performance on the medication recommendation task.
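
To make the ontology step more concrete, here is a minimal sketch of how a medical code can be linked to its ancestors in a tree, which is the kind of structure a GNN would pass messages over. It is not the code in build_tree.py or graph_models.py: the parent map `TOY_PARENT`, the code names in it, and the helpers `ancestors` and `ontology_edges` are all hypothetical and exist only for illustration.

```python
# Toy ontology: each node maps to its parent; the root maps to None.
# These codes are made up for illustration and are not a real ICD or ATC excerpt.
TOY_PARENT = {
    "root": None,
    "A": "root",
    "A1": "A",
    "A1.0": "A1",
    "A1.1": "A1",
    "B": "root",
    "B2": "B",
}


def ancestors(code, parent=TOY_PARENT):
    """Return the chain from `code` up to the root, starting with `code` itself."""
    if code not in parent:
        raise KeyError("unknown code: {}".format(code))
    chain = []
    node = code
    while node is not None:
        chain.append(node)
        node = parent[node]
    return chain


def ontology_edges(leaves, parent=TOY_PARENT):
    """Collect the (child, parent) edges that connect the given leaves to the root."""
    edges = set()
    for leaf in leaves:
        chain = ancestors(leaf, parent)
        # Pair each node with the next node up the chain.
        edges.update(zip(chain, chain[1:]))
    return sorted(edges)
```

In this toy setting, two leaf codes that share a parent end up connected through that parent, so a graph layer could let rare codes borrow information from their siblings and ancestors.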

Requirements

pytorch>=0.4
python>=3.5
torch_geometric==1.0.3

Guide

We list the structure of this repo as follows:

.
├── [4.0K]  code/
│   ├── [ 13K]  bert_models.py % transformer models
│   ├── [5.9K]  build_tree.py % build ontology
│   ├── [4.3K]  config.py % hyperparameters for G-Bert
│   ├── [ 11K]  graph_models.py % GAT models
│   ├── [   0]  __init__.py
│   ├── [9.8K]  predictive_models.py % G-Bert models
│   ├── [ 721]  run_alternative.sh % script to train G-Bert
│   ├── [ 19K]  run_gbert.py % fine tune G-Bert
│   ├── [ 19K]  run_gbert_side.py
│   ├── [ 18K]  run_pretraining.py % pre-train G-Bert
│   ├── [4.4K]  run_tsne.py % save embeddings for t-SNE visualization
│   └── [4.7K]  utils.py
├── [4.0K]  data/
│   ├── [4.9M]  data-multi-side.pkl 
│   ├── [3.6M]  data-multi-visit.pkl % patient data with multiple visits
│   ├── [4.3M]  data-single-visit.pkl % patient data with a single visit
│   ├── [ 11K]  dx-vocab-multi.txt % diagnosis codes vocabulary in multi-visit data
│   ├── [ 11K]  dx-vocab.txt % diagnosis codes vocabulary in all data
│   ├── [ 29K]  EDA.ipynb % jupyter version to preprocess data
│   ├── [ 18K]  EDA.py % python version to preprocess data
│   ├── [6.2K]  eval-id.txt % validation data ids
│   ├── [6.9K]  px-vocab-multi.txt % procedure codes vocabulary in multi-visit data
│   ├── [ 725]  rx-vocab-multi.txt % medication codes vocabulary in multi-visit data
│   ├── [2.6K]  rx-vocab.txt % medication codes vocabulary in all data
│   ├── [6.2K]  test-id.txt % test data ids
│   └── [ 23K]  train-id.txt % train data ids
└── [4.0K]  saved/
    └── [4.0K]  GBert-predict/ % model files to reproduce our result
        ├── [ 371]  bert_config.json 
        └── [ 12M]  pytorch_model.bin

Preprocessing Data

We have released the preprocessing code in data/EDA.ipynb, which processes the raw files from the MIMIC-III dataset. You can download the data files from MIMIC and get the necessary mapping files from GAMENet.

Quick Test

To validate the performance of G-Bert, run the following script; we provide the trained model binary file and the preprocessed data.

cd code/
python run_gbert.py --model_name GBert-predict --use_pretrain --pretrain_dir ../saved/GBert-predict --graph
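
Before running the quick test, it can help to confirm that the released files are where the command expects them. The sketch below only checks paths that appear in the repository tree above. The helper name `missing_files` and the list `EXPECTED_FILES` are hypothetical, and which of these files run_gbert.py actually reads is an assumption, not something verified against the script.

```python
from pathlib import Path

# Relative paths copied from the repository tree in the Guide section.
# Assumption: these are the files the quick test depends on.
EXPECTED_FILES = [
    "saved/GBert-predict/bert_config.json",
    "saved/GBert-predict/pytorch_model.bin",
    "data/data-multi-visit.pkl",
    "data/dx-vocab.txt",
    "data/rx-vocab.txt",
    "data/train-id.txt",
    "data/eval-id.txt",
    "data/test-id.txt",
]


def missing_files(repo_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling `missing_files("..")` from inside code/ returns an empty list when every listed file is present; any path it returns points at a file to download or regenerate first.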

Cite

Please cite our paper if you find this code helpful:

@article{shang2019pre,
  title={Pre-training of Graph Augmented Transformers for Medication Recommendation},
  author={Shang, Junyuan and Ma, Tengfei and Xiao, Cao and Sun, Jimeng},
  journal={arXiv preprint arXiv:1906.00346},
  year={2019}
}

Acknowledgement

Many thanks to the open-source repositories and libraries that sped up our coding progress.
