
FlagAI (Feizhi): an open-source AI foundation model project with one-click access to OPT and other models

2022-06-23 19:21:00 Zhiyuan community

1. Background

Pre-trained models such as GPT-3, the OPT series, and WuDao (Enlightenment) have achieved remarkable results in NLP. However, different code repositories follow different implementation styles, and pre-training large models relies on varied techniques, which has created a technical gap. To make it easy to load, train, and run inference with different large models, to use the latest and fastest model-parallel technology, and to make training and using models more convenient, the Beijing Academy of Artificial Intelligence (Zhiyuan) launched FlagAI (Feizhi), an open-source foundation model project that provides features such as one-click use of large models.

2. FlagAI Features

FlagAI (Feizhi) is a fast, easy-to-use, and extensible toolkit for AI foundation models. It supports one-click calls to a variety of mainstream foundation models and adapts them to many downstream tasks in both Chinese and English.

  • FlagAI supports the WuDao GLM model with up to 10 billion parameters (see the GLM introduction), as well as BERT, RoBERTa, GPT-2, T5, Meta's OPT models, and models from Huggingface Transformers.
  • FlagAI provides APIs to quickly download these pre-trained models and use them on a given (Chinese/English) text; you can fine-tune them on your own datasets (fine-tuning) or apply prompt learning (prompt-tuning).
  • FlagAI offers rich downstream-task support for foundation models, such as text classification, information extraction, question answering, summarization, and text generation, with good support for both Chinese and English.
  • FlagAI is backed by the three most popular data/model parallel libraries (PyTorch / DeepSpeed / Megatron-LM), which it integrates seamlessly. With FlagAI, you can parallelize your training and evaluation with fewer than ten lines of code, and conveniently apply various model-acceleration techniques.

Open-source project address: https://github.com/BAAI-Open/FlagAI

3. Application Examples

One-click Pipeline calls

For example, you can call a GLM pre-trained model directly for Chinese question answering and phrase-completion tasks.

Only three lines of code are needed to load the GLM-large-ch model and its corresponding tokenizer.
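Based on the project README, such a call might look like the sketch below. The class and method names (`AutoLoader`, `Predictor`, the task string) are assumptions drawn from that README and may differ across FlagAI versions; model weights are downloaded on first use.

```python
# Hypothetical sketch of a one-click GLM call via FlagAI; names are
# taken from the project README and may differ between versions.
try:
    from flagai.auto_model.auto_loader import AutoLoader
    from flagai.model.predictor.predictor import Predictor

    # Three lines: loader, model, tokenizer.
    loader = AutoLoader(task_name="lm", model_name="GLM-large-ch")
    model = loader.get_model()
    tokenizer = loader.get_tokenizer()

    predictor = Predictor(model, tokenizer)
    answer = predictor.predict_generate_randomsample("问题:北京有哪些特色小吃?回答:[gMASK]")
except ImportError:
    answer = None  # flagai is not installed; try `pip install flagai` first

print(answer)
```

The same `Predictor` interface is what makes the one-click tasks below possible: the task-specific pre- and post-processing is handled inside the toolkit.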

In addition, one-click calls are supported for:

  1. Title generation
  2. General NER (named entity recognition)
  3. Text continuation
  4. Semantic similarity matching, and more

Training examples

Besides convenient one-click calls for different tasks, FlagAI also provides a wealth of training examples, each accompanied by sample data, which makes the training process easier to understand.

For example, to train a title-generation task with a BERT model, the training directory is organized as follows (see the examples directory in the open-source repository for details: https://github.com/BAAI-Open/FlagAI/tree/master/examples):
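The layout described here can be sketched as follows (the top-level directory name is illustrative; the file names follow the description of the title-generation example):

```
title_generation/
├── data/
│   └── news.tsv      # sample data in the expected format
├── train.py          # training script
└── generate.py       # inference script
```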

Here, the data directory holds the sample data, and news.tsv shows the sample data format, making the training process easier to follow; train.py is the training script, and generate.py is the inference script.

FlagAI provides a rich set of Chinese/English training examples. You can download the relevant data directly for training, or flexibly switch to your own datasets, covering the whole pipeline from training to inference.

4. Support for Multiple Data-Parallel and Model-Parallel Training Modes

FlagAI supports multiple parallel training strategies, including:

  1. pytorch: conventional single-GPU PyTorch training
  2. pytorchDDP: conventional PyTorch data parallelism (DistributedDataParallel) for multi-GPU training
  3. deepspeed: Microsoft's open-source library for efficient deep-learning training optimization, which improves GPU-memory efficiency in multi-GPU training; see https://github.com/microsoft/DeepSpeed
  4. deepspeed+mpu: mpu is the model-parallel approach from NVIDIA's open-source Megatron-LM; see https://github.com/NVIDIA/Megatron-LM
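Based on the project README, switching between these strategies is a matter of configuring the trainer rather than rewriting the training loop. The sketch below is hypothetical: the `Trainer` import path and argument names are assumptions from that README and may differ across versions.

```python
# Hypothetical sketch of selecting a parallel back-end in FlagAI;
# argument names are taken from the project README and may differ
# between versions.
try:
    from flagai.trainer import Trainer

    trainer = Trainer(
        env_type="pytorchDDP",  # one of: pytorch / pytorchDDP / deepspeed / deepspeed+mpu
        epochs=1,
        batch_size=8,
        lr=1e-5,
        fp16=True,              # half precision to save GPU memory
    )
except Exception:
    trainer = None  # flagai (or the chosen parallel back-end) is not available
```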

In FlagAI, you can choose a training mode according to model size. FlagAI also integrates many practical techniques, such as fp16 half precision, gradient accumulation, gradient recomputation (checkpointing), CPU offload, and various parallel-computing strategies. For a BERT-base-scale model, combining these techniques can reduce memory usage by more than 50% and greatly speed up training; even a single V100 GPU can comfortably fine-tune a ten-billion-parameter-scale model. Give it a try.
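One of the memory-saving tricks mentioned above, gradient accumulation, is easy to illustrate without any framework: summing the (appropriately weighted) gradients of several micro-batches before a single update is numerically equivalent to one large-batch step, so a small-memory device can emulate a large batch size. This is a minimal, self-contained illustration of the idea, not FlagAI code:

```python
# Minimal illustration of gradient accumulation on the scalar model
# y_hat = w * x with squared-error loss (not FlagAI code).

def grad(w, batch):
    """Mean gradient of (w*x - y)^2 with respect to w over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w, lr = 0.0, 0.1

# One full-batch gradient step.
w_full = w - lr * grad(w, data)

# The same step computed as two accumulated micro-batch gradients,
# each weighted by its share of the full batch.
acc = 0.0
for micro in (data[:2], data[2:]):
    acc += grad(w, micro) * (len(micro) / len(data))
w_accum = w - lr * acc

print(w_full, w_accum)  # the two updates are identical
```

Frameworks implement the same idea by calling `backward()` on each micro-batch (which adds into the stored gradients) and stepping the optimizer only every N micro-batches.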

5. Future Plans

At present, FlagAI mostly supports common NLP models. In later versions, FlagAI will continue to integrate pre-trained models for computer vision and multimodal tasks, such as ViT, Swin Transformer, and PVT, making FlagAI more general. You are welcome to try it out, offer feedback, and join us in various ways to discuss large-model technology together.

Project address :https://github.com/BAAI-Open/FlagAI



Copyright notice: this article was created by the Zhiyuan community. Please include a link to the original when reposting.
https://yzsam.com/2022/174/202206231832471036.html