Sequence-tagging using deep learning

Overview

Classification using Deep Learning

Requirements

  • PyTorch version >= 1.9.1+cu111
  • Python version >= 3.8.10
  • PyTorch-Lightning version >= 1.4.9
  • Huggingface Transformers version >= 4.11.3
  • Tensorboard version >= 2.6.0
  • Pandas >= 1.3.4
  • Scikit-learn, with its dependencies numpy >= 1.14.6, scipy >= 1.1.0, threadpoolctl >= 2.0.0, joblib >= 0.11

Installation

pip3 install transformers
pip3 install pytorch-lightning
pip3 install tensorboard
pip3 install pandas
pip3 install scikit-learn
git clone https://github.com/vineetk1/clss.git
cd clss

Note that the default directory is clss. Unless otherwise stated, run all command-line commands from this default directory.

Download the dataset

  1. Create a data directory.
mkdir data
  2. Download a dataset into the data directory.

Saving all information and results of an experiment

All information about the experiment is stored in a unique directory whose path starts with tensorboard_logs and ends with a unique version-number. Its contents consist of hparams.yaml, hyperperameters_used.yaml, test-results.txt, events.* files, and a checkpoints directory that has one or more checkpoint-files.
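
As a rough illustration of how such a directory could be located programmatically, the sketch below searches under tensorboard_logs for the run with the highest version number. The directory-name pattern version_<N> is an assumption (it is the usual PyTorch-Lightning logger convention, but this README does not state it), and the helper name latest_version_dir is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional


def latest_version_dir(root: Path) -> Optional[Path]:
    """Return the run directory with the highest version number, or None.

    Assumes run directories are named ``version_<N>`` somewhere below
    ``root``; this naming is an assumption, not confirmed by the README.
    """
    if not root.is_dir():
        return None
    pattern = re.compile(r"^version_(\d+)$")
    candidates = []
    for path in root.rglob("version_*"):
        match = pattern.match(path.name)
        if path.is_dir() and match:
            # Compare numerically so that version_10 ranks above version_2.
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

Calling latest_version_dir(Path("tensorboard_logs")) from the default directory would then return the most recent run directory, whose contents can be listed with sorted(p.name for p in run_dir.iterdir()).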

Train, validate, and test a model

The following command trains a model, saves the last checkpoint plus the checkpoints that have the lowest validation loss, runs the test dataset on the checkpointed model with the lowest validation loss, and outputs the results of the test:

python3 Main.py input_param_files/bert_seq_class

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class. An explanation of the contents of this file is at input_param_files/README.md. A list of all the hyper-parameters is in the PyTorch-Lightning documentation, and any hyper-parameter can be used.

To assist in training, the two parameters auto_lr_find and auto_scale_batch_size in the file input_param_files/bert_seq_class enable the software to automatically find an initial learning rate and a batch size, respectively.

As training progresses, graphs of "training-loss vs. epoch #", "validation-loss vs. epoch #", and "learning-rate vs. batch #" are plotted in real time on TensorBoard. To stop training, press ctrl-c at the command line; the current training state is checkpointed and training stops. Training can be resumed later from the checkpoint file.

During testing, the results are sent to standard output and also saved in the test-results.txt file, which includes the following: general information about the dataset and the classes, confusion matrix, precision, recall, f1, average f1, and weighted f1.
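
For readers unfamiliar with the metrics listed above, the following is a minimal sketch of how they can be computed with scikit-learn. It is not the code in this repository; the labels and predictions are made-up illustration data, and "average f1" is assumed here to mean the macro (unweighted) average over classes.

```python
from sklearn.metrics import (
    confusion_matrix,
    f1_score,
    precision_recall_fscore_support,
)

# Hypothetical ground-truth labels and model predictions (illustration only).
labels = ["a", "b", "c"]
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]

# Rows are true classes, columns are predicted classes, ordered as in `labels`.
cm = confusion_matrix(y_true, y_pred, labels=labels)

# Per-class precision, recall, f1, and the number of true examples per class.
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, zero_division=0
)

# Macro f1: plain mean of the per-class f1 scores (assumed "average f1").
macro_f1 = f1_score(y_true, y_pred, labels=labels, average="macro", zero_division=0)

# Weighted f1: mean of the per-class f1 scores weighted by class support.
weighted_f1 = f1_score(y_true, y_pred, labels=labels, average="weighted", zero_division=0)

print(cm)
for name, p, r, f in zip(labels, precision, recall, f1):
    print(f"{name}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
print(f"macro f1={macro_f1:.3f} weighted f1={weighted_f1:.3f}")
```

Because every class in this toy example has the same number of true examples, the macro and weighted f1 values coincide; on an imbalanced dataset they generally differ.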

Resume training, validation, and testing a model with same hyper-parameters

Resume training a checkpoint model with the same model- and training-states by using the following command:

python3 Main.py input_param_files/bert_seq_class-res_from_chkpt

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class-res_from_chkpt. An explanation of the contents of this file is at input_param_files/README.md.

Change hyper-parameters and continue training, validation, and testing a model

Continue training a checkpoint model with the same model-state but different hyperparameters for the training-state by using the following command:

python3 Main.py input_param_files/bert_seq_class-ld_chkpt

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class-ld_chkpt. An explanation of the contents of this file is at input_param_files/README.md.

Further test a checkpoint model with a new dataset

Test a checkpoint model by using the following command:

python3 Main.py input_param_files/bert_seq_class-ld_chkpt_and_test

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class-ld_chkpt_and_test. An explanation of the contents of this file is at input_param_files/README.md.

Owner
Vineet Kumar