OpenChat: Opensource chatting framework for generative models

Overview

OpenChat: Opensource chatting framework for generative models

    ____   ____   ______ _   __   ______ __  __ ___   ______
   / __ \ / __ \ / ____// | / /  / ____// / / //   | /_  __/
  / / / // /_/ // __/  /  |/ /  / /    / /_/ // /| |  / /   
 / /_/ // ____// /___ / /|  /  / /___ / __  // ___ | / /    
 \____//_/    /_____//_/ |_/   \____//_/ /_//_/  |_|/_/     
  • OpenChat is opensource chatting framework for generative models.
  • You can talk with AI with only one line of code.



Installation

pip install openchat



Model Supports

  • DialoGPT : [small, medium, large]
  • BlenderBot : [small, medium, large, xlarge]
  • Coming Soon...



Usage

1. Terminal environment

  • Chatting on terminal.
  • Type .exit if you want to exit dialogue.
  • Type .clear if you want to clear all histories.
  • You can add additional user keywords. (Details are described below.)
>>> from openchat import OpenChat
>>> OpenChat(model="blenderbot", size="large")
Chat with facebook/blenderbot-1B-distill !
user : Hi. It is nice day.
bot : It is a nice day out. I am going to go for a walk. What are you up to?
user : I'll play computer game. Do you like playing game?
bot : I love playing video games. What kind of games do you like to play on your computer?
user : I love racing game. It is so exited. How about you?
bot : I love racing games as well. I like to play on the Playstation 4. What about you?
user : Usually I like to play on PC. I have a question. Who is CEO of Apple?
bot : The CEO of Apple is Steve Jobs. He founded the company in 1976.
user : That's right. I have an IPhone. It is my favorite phone. What is your phone?
bot : I have a Samsung Galaxy S8. I love it. It's the best phone I've ever had.
user : .exit
bot : good bye.

  • If you want to use GPU, use argument device="cuda"
>>> from openchat import OpenChat
>>> OpenChat(model="blenderbot", size="large", device="cuda")

2. Your own environment (not terminal)

  • You can use user id to manage user-specific history.
  • This can be useful when deployed on Facebook messenger or WhatsApp.
  • There is a web demo implementation in the /demo folder.


2.1. Write your own environment class

  • Make your own environment class inherited from BaseEnv
  • And implement your own run(model: BaseModel) method like below.
from typing import Dict
from flask import Flask, render_template
from flask_cors import CORS
from openchat.envs import BaseEnv
from openchat.models import BaseModel


class WebDemoEnv(BaseEnv):

    def __init__(self):
        super().__init__()
        self.app = Flask(__name__)
        CORS(self.app)

    def run(self, model: BaseModel):

        @self.app.route("/")
        def index():
            return render_template("index.html", title=model.name)

        @self.app.route('/send//', methods=['GET'])
        def send(user_id, text: str) -> Dict[str, str]:

            if text in self.keywords:
                # Format of self.keywords dictionary
                # self.keywords['/exit'] = (exit_function, 'good bye.')

                _out = self.keywords[text][1]
                # text to print when keyword triggered

                self.keywords[text][0](user_id, text)
                # function to operate when keyword triggered

            else:
                _out = model.predict(user_id, text)

            return {"output": _out}

        self.app.run(host="0.0.0.0", port=8080)

2.2. Start to run application.

from openchat import OpenChat
from demo.web_demo_env import WebDemoEnv

OpenChat(model="blenderbot", size="large", env=WebDemoEnv())



3. Additional Options

3.1. Add custom Keywords

  • You can add new manual keyword such as .exit, .clear,
  • call the self.add_keyword('.new_keyword', 'message to print', triggered_function)' method.
  • triggered_function should be form of function(user_id:str, text:str)
from openchat.envs import BaseEnv


class YourOwnEnv(BaseEnv):
    
    def __init__(self):
        super().__init__()
        self.add_keyword(".new_keyword", "message to print", self.function)

    def function(self, user_id: str, text: str):
        """do something !"""
        



3.2. Modify generation options

  • You can modify max_context_length (number of input history tokens, default is 128).
>>> OpenChat(size="large", device="cuda", max_context_length=256)

  • You can modify generation options ['num_beams', 'top_k', 'top_p'].
>>> model.predict(
...     user_id="USER_ID",
...     text="Hello.",
...     num_beams=5,
...     top_k=20,
...     top_p=0.8,
... )



3.3. Check histories

  • You can check all dialogue history using self.histories
from openchat.envs import BaseEnv


class YourOwnEnv(BaseEnv):
    
    def __init__(self):
        super().__init__()
        print(self.histories)
{
    user_1 : {'user': [] , 'bot': []},
    user_2 : {'user': [] , 'bot': []},
    ...more...
    user_n : {'user': [] , 'bot': []},
}

3.4. Clear histories

  • You can clear all dialogue histories
from flask import Flask
from openchat.envs import BaseEnv
from openchat.models import BaseModel

class YourOwnEnv(BaseEnv):
    
    def __init__(self):
        super().__init__()
        self.app = Flask(__name__)

    def run(self, model: BaseModel):
        
        @self.app.route('/send//', methods=['GET'])
        def send(user_id, text: str) -> Dict[str, str]:
            
            self.clear(user_id, text)
            # clear all histories ! 



License

Copyright 2021 Hyunwoong Ko.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Owner
Hyunwoong Ko
Co-Founder and Research Engineer at @tunib-ai. previously @kakaobrain.
Hyunwoong Ko
Deep Learning Topics with Computer Vision & NLP

Deep learning Udacity Course Deep Learning Topics with Computer Vision & NLP for the AWS Machine Learning Engineer Nanodegree Program Tasks are mostly

Simona Mircheva 1 Jan 20, 2022
Poetry PEP 517 Build Backend & Core Utilities

Poetry Core A PEP 517 build backend implementation developed for Poetry. This project is intended to be a light weight, fully compliant, self-containe

Poetry 293 Jan 02, 2023
Simple Python script to scrape youtube channles of "Parity Technologies and Web3 Foundation" and translate them to well-known braille language or any language

Simple Python script to scrape youtube channles of "Parity Technologies and Web3 Foundation" and translate them to well-known braille language or any

Little Endian 1 Apr 28, 2022
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

Google Research Datasets 740 Dec 24, 2022
Python library to make development of portfolio analysis faster and easier

Trafalgar Python library to make development of portfolio analysis faster and easier Installation 🔥 For the moment, Trafalgar is still in beta develo

Santosh Passoubady 641 Jan 01, 2023
AI-powered literature discovery and review engine for medical/scientific papers

AI-powered literature discovery and review engine for medical/scientific papers paperai is an AI-powered literature discovery and review engine for me

NeuML 819 Dec 30, 2022
Deduplication is the task to combine different representations of the same real world entity.

Deduplication is the task to combine different representations of the same real world entity. This package implements deduplication using active learning. Active learning allows for rapid training wi

63 Nov 17, 2022
Graphical user interface for Argos Translate

Argos Translate GUI Website | GitHub | PyPI Graphical user interface for Argos Translate. Install pip3 install argostranslategui

Argos Open Tech 16 Dec 07, 2022
NLP and Text Generation Experiments in TensorFlow 2.x / 1.x

Code has been run on Google Colab, thanks Google for providing computational resources Contents Natural Language Processing(自然语言处理) Text Classificati

1.5k Nov 14, 2022
Silero Models: pre-trained speech-to-text, text-to-speech models and benchmarks made embarrassingly simple

Silero Models: pre-trained speech-to-text, text-to-speech models and benchmarks made embarrassingly simple

Alexander Veysov 3.2k Dec 31, 2022
A script that automatically creates a branch name using google translation api and jira api

About google translation api와 jira api을 사용하여 자동으로 브랜치 이름을 만들어주는 스크립트 Setup 환경변수에 다음 3가지를 등록해야 한다. JIRA_USER : JIRA email (ex: hyunwook.kim 2 Dec 20, 2021

Translate U is capable of translating the text present in an image from one language to the other.

Translate U is capable of translating the text present in an image from one language to the other. The app uses OCR and Google translate to identify and translate across 80+ languages.

Neelanjan Manna 1 Dec 22, 2021
Switch spaces for knowledge graph embeddings

SwisE Switch spaces for knowledge graph embeddings. Requirements: python3 pytorch numpy tqdm Reproduce the results To reproduce the reported results,

Shuai Zhang 4 Dec 01, 2021
Maix Speech AI lib, including ASR, chat, TTS etc.

Maix-Speech 中文 | English Brief Now only support Chinese, See 中文 Build Clone code by: git clone https://github.com/sipeed/Maix-Speech Compile x86x64 c

Sipeed 267 Dec 25, 2022
BERT, LDA, and TFIDF based keyword extraction in Python

BERT, LDA, and TFIDF based keyword extraction in Python kwx is a toolkit for multilingual keyword extraction based on Google's BERT and Latent Dirichl

Andrew Tavis McAllister 41 Dec 27, 2022
A program that uses real statistics to choose the best times to bet on BloxFlip's crash gamemode

Bloxflip Smart Bet A program that uses real statistics to choose the best times to bet on BloxFlip's crash gamemode. https://bloxflip.com/crash. THIS

43 Jan 05, 2023
Implementation of TTS with combination of Tacotron2 and HiFi-GAN

Tacotron2-HiFiGAN-master Implementation of TTS with combination of Tacotron2 and HiFi-GAN for Mandarin TTS. Inference In order to inference, we need t

SunLu Z 7 Nov 11, 2022
Syntax-aware Multi-spans Generation for Reading Comprehension (TASLP 2022)

SyntaxGen Syntax-aware Multi-spans Generation for Reading Comprehension (TASLP 2022) In this repo, we upload all the scripts for this work. Due to siz

Zhuosheng Zhang 3 Jun 13, 2022
Train and use generative text models in a few lines of code.

blather Train and use generative text models in a few lines of code. To see blather in action check out the colab notebook! Installation Use the packa

Dan Carroll 16 Nov 07, 2022
Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

Pytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classifi

186 Dec 24, 2022