Fixing RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
2022-07-07 02:18:00 【不撸先疯。】
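1. Problem description
I was using the transformers package to run pretrained BERT models on PyTorch. The standard bert-base-cased ran fine on other datasets, but when I switched to RoBERTa it kept failing with: RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
I spent several days without finding the cause. Most answers online suggest that batch_size is too large, but reducing it from 16 to 8, 4, and then 2 made no difference.
Comparing against the other datasets, I noticed that I had added a new special_token to the tokenizer, and that looked like the likely cause of the error!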
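Why a new token can break the model is easier to see on the CPU. The following is a minimal sketch, not the code from this post: it uses a bare `torch.nn.Embedding` with made-up sizes (`old_vocab_size = 10`, embedding dimension 4) in place of a real model. A token id one past the last embedding row raises a plain `IndexError` on the CPU, whereas on the GPU the same out-of-range lookup can surface as a much less readable CUDA or cuBLAS error like the one above.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration: the model knows token ids 0..9.
old_vocab_size = 10
embedding = nn.Embedding(old_vocab_size, 4)

# A token added to the tokenizer gets the next free id, here 10,
# which is one past the last row of the embedding matrix.
input_ids = torch.tensor([[1, 5, old_vocab_size]])


def lookup_fails(emb, ids):
    """Return True if the embedding lookup raises IndexError on the CPU."""
    try:
        emb(ids)
    except IndexError:
        return True
    return False


out_of_range = lookup_fails(embedding, input_ids)
max_id = int(input_ids.max())
print(f"max token id = {max_id}, "
      f"embedding rows = {embedding.num_embeddings}, "
      f"lookup fails = {out_of_range}")
```

Running a failing batch on the CPU like this is a cheap way to check whether a confusing CUDA error is really an index problem.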

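2. Solution
After adding special_tokens to the original tokenizer, I had forgotten to resize the model's token embeddings to match the tokenizer's enlarged vocabulary!
The complete update looks like this: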
from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-cased')
# Add the special token
tokenizer.add_special_tokens({'additional_special_tokens':["<S>"]})
model = BertModel.from_pretrained("bert-base-cased")
# Resize the model's token embeddings to the new vocabulary size!
# Important!
model.resize_token_embeddings(len(tokenizer))

3. Problem solved
The code now runs without the error, and training can start!
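To confirm that the resize took effect, the number of embedding rows can be compared before and after the call. This is a rough sketch that assumes transformers and torch are installed. To avoid downloading weights it builds a tiny, randomly initialized `BertModel` from a hypothetical `BertConfig` (all sizes are arbitrary), and `new_vocab_size` stands in for `len(tokenizer)` after one token has been added.

```python
import torch
from transformers import BertConfig, BertModel

# Tiny random model for illustration only; the sizes are arbitrary assumptions.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertModel(config)
model.eval()

# Stand-in for len(tokenizer) after adding one special token.
new_vocab_size = 101

rows_before = model.get_input_embeddings().num_embeddings
model.resize_token_embeddings(new_vocab_size)
rows_after = model.get_input_embeddings().num_embeddings

# Token id 100 only exists after the resize; the forward pass now accepts it.
input_ids = torch.tensor([[1, 5, 100]])
with torch.no_grad():
    hidden = model(input_ids=input_ids).last_hidden_state

print(f"embedding rows: {rows_before} -> {rows_after}")
print(f"output shape: {tuple(hidden.shape)}")
```

With a real checkpoint the same comparison applies: `model.get_input_embeddings().num_embeddings` should be at least `len(tokenizer)` before the first batch goes to the GPU.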
