RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.

Last update: Dec 22, 2022

Overview

RITA: a Study on Scaling Up Generative Protein Sequence Models

The family comprises four model sizes, developed jointly by LightOn, the OATML group at Oxford, and the Debora Marks Lab at Harvard:

Model	#Params	d_model	Layers	LM loss (UniRef-100)
Small	85M	768	12	2.31
Medium	300M	1024	24	2.01
Large	680M	1536	24	1.82
XLarge	1.2B	2048	24	1.70
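
To make the loss column easier to compare, the values can be converted to perplexity. This is a minimal sketch that assumes the reported LM loss is a mean per-token cross-entropy measured in nats (the table does not state the unit); the `perplexity` helper and the `LM_LOSS` dictionary are illustrative names, not part of the RITA code.

```python
import math

# LM loss values copied from the table above.
LM_LOSS = {"Small": 2.31, "Medium": 2.01, "Large": 1.82, "XLarge": 1.70}


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


# Perplexity per model size, under the nats assumption stated above.
PERPLEXITY = {name: perplexity(loss) for name, loss in LM_LOSS.items()}

if __name__ == "__main__":
    for name, ppl in PERPLEXITY.items():
        print(f"{name}: loss={LM_LOSS[name]:.2f} perplexity={ppl:.2f}")
```

If the loss were reported in bits instead, the conversion would use a base of 2 rather than e, so the absolute numbers depend on that assumption while the ranking of the models does not.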

Results

For full results see our preprint: https://arxiv.org/abs/2205.05789

Usage

Instantiate a model like so:

from transformers import AutoModelForCausalLM, AutoTokenizer
# trust_remote_code=True lets transformers run the model code shipped with the checkpoint
model = AutoModelForCausalLM.from_pretrained("lightonai/RITA_s", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("lightonai/RITA_s")

For generation, we support pipelines:

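from transformers import pipeline
# Wrap the model and tokenizer in a text-generation pipeline
rita_gen = pipeline('text-generation', model=model, tokenizer=tokenizer)
# Sample two sequences that continue the prompt "MAB" (max_length counts the prompt tokens too)
sequences = rita_gen("MAB", max_length=20, do_sample=True, top_k=950, repetition_penalty=1.2,
                     num_return_sequences=2, eos_token_id=2)
for seq in sequences:
    # Remove the spaces from the generated text before printing
    print(f"seq: {seq['generated_text'].replace(' ', '')}")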

Or see example.py
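
The generation snippet above strips spaces from each `generated_text` before printing. A small post-processing step along the same lines might also flag residues outside the 20 standard amino acids, such as the ambiguity code `B` in the example prompt. The sketch below is illustrative only: `clean_sequence`, `nonstandard_residues`, and `summarize` are hypothetical helpers, not part of the RITA repository, and the sample input is hand-written rather than real model output.

```python
import re

# One-letter codes of the 20 standard amino acids.
STANDARD_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(generated_text: str) -> str:
    """Remove all whitespace from a generated string."""
    return re.sub(r"\s+", "", generated_text)


def nonstandard_residues(sequence: str) -> list:
    """Return the sorted characters of `sequence` that are not standard amino acids."""
    return sorted(set(sequence) - STANDARD_AA)


def summarize(outputs: list) -> list:
    """Clean each pipeline-style output dict and report any nonstandard residues."""
    results = []
    for item in outputs:
        seq = clean_sequence(item["generated_text"])
        results.append({"sequence": seq, "nonstandard": nonstandard_residues(seq)})
    return results


if __name__ == "__main__":
    # Hand-written stand-in for pipeline output (assumption: a list of dicts
    # with a 'generated_text' key, as used in the snippet above).
    fake_outputs = [{"generated_text": "M A B K L V"}, {"generated_text": "MKTAYIAK"}]
    for row in summarize(fake_outputs):
        print(row)
```

With real pipeline output, `summarize(sequences)` could replace the printing loop; whether nonstandard residues should be filtered out or kept depends on the downstream use.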

How to cite

@article{hesslow2022rita,
  title={RITA: a Study on Scaling Up Generative Protein Sequence Models},
  author={Hesslow, Daniel and Zanichelli, Niccol{\'o} and Notin, Pascal and Poli, Iacopo and Marks, Debora},
  journal={arXiv preprint arXiv:2205.05789},
  year={2022}
}

