RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`: problem solving
2022-07-07 07:05:00 【Go crazy first.】
One 、 Problem description
I was training a Bert model with the transformers package on the PyTorch framework. The ordinary bert-base-cased setup ran normally on other datasets, but with Roberta it kept failing with: RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
I struggled for several days without finding the mistake. Most advice online says the batch_size is too large, but reducing it 16 -> 8 -> 4 -> 2 made no difference.
Comparing against the other datasets, I noticed that for this one I had added a new special_token to the tokenizer, and that looked like the probable cause of the error!
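One way to check this suspicion is to look for token IDs that the model's embedding table cannot index. The following is a minimal sketch and not part of the original setup: the helper name find_out_of_range_ids and the sample IDs are hypothetical, and the table size is only an illustrative number.

```python
from typing import Iterable, List


def find_out_of_range_ids(batch: Iterable[Iterable[int]], num_embeddings: int) -> List[int]:
    """Return the sorted, de-duplicated token IDs that a table with
    `num_embeddings` rows cannot index (valid IDs are 0..num_embeddings-1)."""
    bad = {tok for seq in batch for tok in seq if tok < 0 or tok >= num_embeddings}
    return sorted(bad)


# Illustrative numbers only: a table with 28996 rows accepts IDs 0..28995,
# so a newly added token that received ID 28996 is out of range.
encoded_batch = [[101, 1188, 28996, 102], [101, 1142, 102]]
offending = find_out_of_range_ids(encoded_batch, num_embeddings=28996)
print(offending)  # [28996]
```

With real objects, the table size can be read from model.get_input_embeddings().num_embeddings and compared against len(tokenizer). If the tokenizer is longer than the table, any added token will index past the end.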

Two 、 Problem solving
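When adding special_tokens to the original tokenizer, I forgot to update the model's vocabulary size to match the tokenizer! The new token receives an ID one past the end of the model's embedding table, and on the GPU such an out-of-range lookup can surface as a misleading CUDA error like the one above.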
The complete update method is :
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-cased')
# Add the new special token to the tokenizer
tokenizer.add_special_tokens({'additional_special_tokens': ["<S>"]})
model = BertModel.from_pretrained("bert-base-cased")
# Important! Resize the model's embedding table to the new vocabulary size
model.resize_token_embeddings(len(tokenizer))

Three 、 Problem solved
The error is gone and training starts!
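To see why the resize matters, here is a toy stand-in for an embedding table written in plain Python. It is only a sketch under stated assumptions: the ToyEmbedding class is hypothetical and is not how transformers implements resize_token_embeddings (the real library initialises new rows with its own scheme, while this toy uses zeros).

```python
from typing import List


class ToyEmbedding:
    """Hypothetical toy embedding table: one row of floats per token ID."""

    def __init__(self, num_rows: int, dim: int) -> None:
        self.dim = dim
        self.rows: List[List[float]] = [[0.0] * dim for _ in range(num_rows)]

    def lookup(self, token_id: int) -> List[float]:
        # A lookup is only valid for IDs 0..len(rows)-1.
        if not 0 <= token_id < len(self.rows):
            raise IndexError(f"token id {token_id} outside table of {len(self.rows)} rows")
        return self.rows[token_id]

    def resize(self, new_num_rows: int) -> None:
        # Keep existing rows; append zero rows when growing, truncate when shrinking.
        extra = new_num_rows - len(self.rows)
        if extra > 0:
            self.rows.extend([0.0] * self.dim for _ in range(extra))
        else:
            del self.rows[new_num_rows:]


table = ToyEmbedding(num_rows=5, dim=2)  # valid IDs: 0..4
new_token_id = 5                         # ID handed to a newly added token

try:
    table.lookup(new_token_id)
    failed_before_resize = False
except IndexError:
    failed_before_resize = True          # the table has no row for ID 5 yet

table.resize(6)                          # analogous to resizing to len(tokenizer)
vector = table.lookup(new_token_id)      # succeeds once the table has 6 rows
```

The existing rows are untouched by the resize, which mirrors why pretrained weights for the old vocabulary remain usable after new tokens are added.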
