DANet for Tabular data classification/ regression.

Related tags

Deep LearningDANet
Overview

Deep Abstract Networks

A pyTorch implementation for AAAI-2022 paper DANets: Deep Abstract Networks for Tabular Data Classification and Regression.

Brief Introduction

Tabular data are ubiquitous in real world applications. Although many commonly-used neural components (e.g., convolution) and extensible neural networks (e.g., ResNet) have been developed by the machine learning community, few of them were effective for tabular data and few designs were adequately tailored for tabular data structures. In this paper, we propose a novel and flexible neural component for tabular data, called Abstract Layer (AbstLay), which learns to explicitly group correlative input features and generate higher-level features for semantics abstraction. Also, we design a structure re-parameterization method to compress AbstLay, thus reducing the computational complexity by a clear margin in the reference phase. A special basic block is built using AbstLays, and we construct a family of Deep Abstract Networks (DANets) for tabular data classification and regression by stacking such blocks. In DANets, a special shortcut path is introduced to fetch information from raw tabular features, assisting feature interactions across different levels. Comprehensive experiments on real-world tabular datasets show that our AbstLay and DANets are effective for tabular data classification and regression, and the computational complexity is superior to competitive methods.

DANets illustration

DANets

Downloads

Dataset

Download the datasets from the following links:

(Optional) Before starting the program, you may change the file format to .pkl by using svm2pkl() or csv2pkl() functions in ./data/data_util.py.

Weights for inference models

The demo weights for Forest Cover Type dataset is available in the folder "./Weights/".

How to use

Setting

  1. Clone or download this repository, and cd the path.
  2. Build a working python environment. Python 3.7 is fine for this repository.
  3. Install packages following the requirements.txt, e.g., by using pip install -r requirements.txt.

Training

  1. Set the hyperparameters in config files (./config/default.py or ./config/*.yaml).
    Notably, the hyperparameters in .yaml file will cover those in default.py.

  2. Run by python main.py --c [config_path] --g [gpu_id].

    • -c: The config file path
    • -g: GPU device ID
  3. The checkpoint models and best models will be saved at the ./logs file.

Inference

  1. Replace the resume_dir path with the file path containing your trained model/weight.
  2. Run codes by using python predict.py -d [dataset_name] -m [model_file_path] -g [gpu_id].
    • -d: Dataset name
    • -m: Model path for loading
    • -g: GPU device ID

Config Hyperparameters

Normal parameters

  • dataset: str
    The dataset name given must match those in ./data/dataset.py.

  • task: str
    Choose one of the pre-given tasks 'classification' and 'regression'.

  • resume_dir: str
    The log path containing the checkpoint models.

  • logname: str
    The directory names of the models save at ./logs.

  • seed: int
    The random seed.

Model parameters

  • layer: int (default=20)
    Number of abstract layers to stack

  • k: int (default=5)
    Number of masks

  • base_outdim: int (default=64)
    The output feature dimension in abstract layer.

  • drop_rate: float (default=0.1)
    Dropout rate in shortcut module

Fit parameters

  • lr: float (default=0.008)
    Learning rate

  • max_epochs: int (default=5000)
    Maximum number of epochs in training.

  • patience: int (default=1500)
    Number of consecutive epochs without improvement before performing early stopping. If patience is set to 0, then no early stopping will be performed.

  • batch_size: int (default=8192)
    Number of examples per batch.

  • virtual_batch_size: int (default=256)
    Size of the mini batches used for "Ghost Batch Normalization". virtual_batch_size must divide batch_size.

Citations

@inproceedings{danets, 
   title={DANets: Deep Abstract Networks for Tabular Data Classification and Regression}, 
   author={Chen, Jintai and Liao, Kuanlun and Wan, Yao and Chen, Danny Z and Wu, Jian}, 
   booktitle={AAAI}, 
   year={2022}
 }
Owner
Ronnie Rocket
Ronnie Rocket
A curated list of awesome resources combining Transformers with Neural Architecture Search

A curated list of awesome resources combining Transformers with Neural Architecture Search

Yash Mehta 173 Jan 03, 2023
Code-free deep segmentation for computational pathology

NoCodeSeg: Deep segmentation made easy! This is the official repository for the manuscript "Code-free development and deployment of deep segmentation

André Pedersen 26 Nov 23, 2022
This initial strategy was developed specifically for larger pools and is based on taking a moving average and deriving Bollinger Bands to create a projected active liquidity range.

Gamma's Strategy One This initial strategy was developed specifically for larger pools and is based on taking a moving average and deriving Bollinger

Gamma Strategies 46 Dec 02, 2022
Code for KDD'20 "Generative Pre-Training of Graph Neural Networks"

GPT-GNN: Generative Pre-Training of Graph Neural Networks GPT-GNN is a pre-training framework to initialize GNNs by generative pre-training. It can be

Ziniu Hu 346 Dec 19, 2022
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations

NL-Augmenter 🦎 → 🐍 The NL-Augmenter is a collaborative effort intended to add transformations of datasets dealing with natural language. Transformat

684 Jan 09, 2023
Implementation of Nyström Self-attention, from the paper Nyströmformer

Nyström Attention Implementation of Nyström Self-attention, from the paper Nyströmformer. Yannic Kilcher video Install $ pip install nystrom-attention

Phil Wang 95 Jan 02, 2023
Deep Learning Models for Causal Inference

Extensive tutorials for learning how to build deep learning models for causal inference using selection on observables in Tensorflow 2.

Bernard J Koch 151 Dec 31, 2022
Official pytorch implementation of the paper: "SinGAN: Learning a Generative Model from a Single Natural Image"

SinGAN Project | Arxiv | CVF | Supplementary materials | Talk (ICCV`19) Official pytorch implementation of the paper: "SinGAN: Learning a Generative M

Tamar Rott Shaham 3.2k Dec 25, 2022
This repository contains the code used for Predicting Patient Outcomes with Graph Representation Learning (https://arxiv.org/abs/2101.03940).

Predicting Patient Outcomes with Graph Representation Learning This repository contains the code used for Predicting Patient Outcomes with Graph Repre

Emma Rocheteau 76 Dec 22, 2022
This repository contains the code for the ICCV 2019 paper "Occupancy Flow - 4D Reconstruction by Learning Particle Dynamics"

Occupancy Flow This repository contains the code for the project Occupancy Flow - 4D Reconstruction by Learning Particle Dynamics. You can find detail

189 Dec 29, 2022
Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides

Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides Project | This repo is the officia

CVSM Group - email: <a href=[email protected]"> 33 Dec 28, 2022
🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval 🏆 The 1st Place Submission to AICity Challenge 2021 Natural

82 Dec 29, 2022
An open framework for Federated Learning.

Welcome to Intel® Open Federated Learning Federated learning is a distributed machine learning approach that enables organizations to collaborate on m

Intel Corporation 397 Dec 27, 2022
IhoneyBakFileScan Modify - 批量网站备份文件扫描器,增加文件规则,优化内存占用

ihoneyBakFileScan_Modify 批量网站备份文件泄露扫描工具 2022.2.8 添加、修改内容 增加备份文件fuzz规则 修改备份文件大小判断

VMsec 220 Jan 05, 2023
A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising (CVPR 2020 Oral & TPAMI 2021)

ELD The implementation of CVPR 2020 (Oral) paper "A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising" and its journal (TPAMI) v

Kaixuan Wei 359 Jan 01, 2023
Pixel-Perfect Structure-from-Motion with Featuremetric Refinement (ICCV 2021, Oral)

Pixel-Perfect Structure-from-Motion (ICCV 2021 Oral) We introduce a framework that improves the accuracy of Structure-from-Motion by refining keypoint

Computer Vision and Geometry Lab 831 Dec 29, 2022
这是一个unet-pytorch的源码,可以训练自己的模型

Unet:U-Net: Convolutional Networks for Biomedical Image Segmentation目标检测模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Downl

Bubbliiiing 567 Jan 05, 2023
Improving Machine Translation Systems via Isotopic Replacement

CAT (Improving Machine Translation Systems via Isotopic Replacement) Machine translation plays an essential role in people’s daily international commu

Zeyu Sun 10 Nov 30, 2022
Activity tragle - Google is tracking everything, we just look at it

activity_tragle Google is tracking everything, we just look at it here. You need

BERNARD Guillaume 1 Feb 15, 2022
Fully Convolutional Refined Auto Encoding Generative Adversarial Networks for 3D Multi Object Scenes

Fully Convolutional Refined Auto-Encoding Generative Adversarial Networks for 3D Multi Object Scenes This repository contains the source code for Full

Yu Nishimura 106 Nov 21, 2022