当前位置:网站首页>Started a natural language model bloom
Started a natural language model bloom
2022-06-27 23:20:00 【Xu zhougeng】
stay 2 Years ago ,OpenAI Made a 1750 A neural network model with 100 million parameters , namely GPT-3( Is its predecessor GPT-2, about 15 One hundred million parameters , Of 100 Many times ). It is a pity , GPT-3 There is no open source data set for its pre trained parameters , Users can only invoke what they provide API, Because of this API You must apply for , So I haven't had a chance to use . however , There are already many companies based on GPT-3 Developed some applications , for instance GitHub Just open one GitHub Copilot For intelligent completion code , If you are interested, you can try .
generally ,GPT-3 The neural network model at this level has only OpenAI, Giant technology companies like Google can develop and train , After all 1750 The number of arguments is not a small number . however , One by about 1000 International volunteers composed of many academic volunteers do not believe in evil , They are using value 700 Million dollars of public funds to sponsor the calculation of time to train a person with 1760 A natural language model with 100 million parameters ,BLOOM.
Out of interest , I'm in my own Macbook Pro Tested this on BLOOM-1b3 Model , As the name suggests, the model has 13 Million parameters . Why not test the complete 1750 Billion parameters ? Because light reading 13 100 million parameters to Python in , You need to occupy 6G Left and right memory , complete 1750 One hundred million parameters , At least one 1T Memory server . And the complete parameter version has not been completed yet , Still training .
BLOOM Is based on Transformers, and Transformers The installation requirements of meet the following three points
Python 3.6+ Flax 0.3.2+/ PyTorch 1.3.1+ / TensorFlow 2.3+ Any frame Rust( Used to compile tokenizers)
stay Mac On , We can go through homebrew To install rust
brew install rust
In the framework of deep learning , I chose PyTorch, Because it supports the use of M1 Chip GPU Accelerate
# Installing a virtual environment
python3 -m pip install --user --upgrade pip
python3 -m pip install --user virtualenv
# Creating a virtual environment
python3 -m venv pytorch
source pytorch/bin/activate
# install pytorch
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
# install transformers
pip3 install transformers
After open the python, Load our model ( First run , transformers Will automatically help download the model )
from transformers import AutoTokenizer, AutoModelForCausalLM
# participle
tokenizer = AutoTokenizer.from_pretrained('bigscience/bloom-1b3')
# Model
model = AutoModelForCausalLM.from_pretrained('bigscience/bloom-1b3')
Because I know nothing about natural semantic processing , So only according to the sample code , Use transformres Of pipeline, Build a text generation tool .
from transformers import pipeline
generator = pipeline(task="text-generation", model=model, tokenizer=tokenizer)
# English input
generator("bioinformatics is ", max_length=30, num_return_sequence=5)
# Output results
[{'generated_text': 'bioinformatics is a branch of computer science that deals with the analysis of data and the design of algorithms that can be used to solve problems in'}]
# Chinese input
generator(" Bioinformatics is a subject ")
# Output results
[{'generated_text': ' Bioinformatics is a new interdisciplinary subject , It involves bioinformatics 、 Computer science 、 mathematics '}]
From the results ,13 Billion parameter BLOOM The performance of the model has been very much in line with my expectations , Far more than before Transformers Default when testing GPT-2 Model .
Unfortunately, my ability is limited , Yes BLOOM Our exploration stops here , Wait until I am right Transformers With more mastery , Update the relevant content .
Reference material
https://www.nature.com/articles/d41586-022-01705-z
边栏推荐
- 在线JSON转PlainText工具
- fiddler 监听不到接口怎么办
- Azure Kinect DK realizes 3D reconstruction (Jetson real-time version)
- Advertising is too "wild", Yoshino "surrenders"
- 打造南沙“强芯”,南沙首届IC Nansha大会召开
- Ice cream or snow "high"?
- Realization of kaggle cat dog recognition by pytorch
- 【经典干货书】数据科学中的信息理论方法,561页pdf
- 【剑指Offer】48. 最长不含重复字符的子字符串
- 第一性原理(最优解理论)
猜你喜欢

月薪3万的狗德培训,是不是一门好生意?

Spark BUG實踐(包含的BUG:ClassCastException;ConnectException;NoClassDefFoundError;RuntimeExceptio等。。。。)

Feign implements path escape through custom annotations

2022年第一季度“广州好人”刘磊峰:具有强烈的诚信意识和食品安全意识

Azure Kinect DK 实现三维重建 (PC非实时版)

Discuz taobaoke website template / Dean taobaoke shopping style commercial version template

PE buys a underwear company

网易云“情怀”底牌失守

小芯片chiplet技术杂谈

Livox lidar+ Hikvision camera real-time 3D reconstruction based on loam to generate RGB color point cloud
随机推荐
clickonce 部署ClickOnce应用程序时出错-清单中的引用与下载的程序集的标识不匹配
云辅助隐私集合求交(Server-Aided PSI)协议介绍:学习
The choice and trade-off between vector recall and literal recall
ABAP随笔-关于ECC后台server读取Excel方案的想法
Discuz small fish game wind shadow legend business gbk+utf8 version template /dz game website template
ABAP essay - material master data interface enhancement - tab enhancement
How to start ID from 1 after MySQL deletes a table
小芯片chiplet技术杂谈
Using xgboost with tidymodels
【经典干货书】数据科学中的信息理论方法,561页pdf
【蓝桥杯集训100题】scratch数字计算 蓝桥杯scratch比赛专项预测编程题 集训模拟练习题第16题
Spark bug Practice (including bug: classcastexception; connectexception; noclassdeffounderror; runtimeexceptio, etc.)
UESTC (shenhengtao team) & JD AI (Mei Tao team) proposed a structured dual stream attention network for video Q & A, with performance SOTA! Better than the method based on dual video representation!
Senior headhunting team manager: interviewed 3000 consultants, summarized and organized 8 commonalities (Mao Sheng)
netERR_ CONNECTION_ Refused solution
C# Winform 读取Resources图片
移动端避免使用100vh[通俗易懂]
Technical implementation process of easycvr platform routing log function [code attached]
Spug - 轻量级自动化运维平台
Typora 1.2.5等版本下载