How to Use a Pre-trained Language Model
2022-07-29 06:11:00 【Quinn-ntmy】
How to Use a Pre-trained Model
I. Approach
First, consider the size of the target dataset and the similarity between the target data and the source (pre-training) data.
In general, different strategies should be adopted depending on how similar the target dataset is to the dataset the model was pre-trained on, and how much target data is available.
[Figure: the four scenarios below, organized by dataset size and data similarity]
1. Small dataset, high data similarity
This is the ideal situation. The pre-trained model can be used as a fixed feature extractor, which is why this approach is often simply called feature extraction.
In practice: remove the output layer, treat the rest of the network as a fixed feature extractor, and apply it to the new dataset, training only a new classifier on top.
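A minimal PyTorch sketch of this feature-extraction setup (the ResNet-18 backbone and the 10-class target task are assumptions for illustration, not from the original post):

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)   # load pre-trained weights

# Freeze the whole network so it acts as a fixed feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Replace the output layer with a new classifier for the target task.
num_classes = 10                           # assumed number of target classes
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new classifier's parameters are passed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```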
2. Large dataset, high data similarity
Freeze a small number of the lower layers of the pre-trained model, replace the classifier, and then retrain (fine-tune) the remaining layers on the new dataset.
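A minimal sketch of this partial-freezing setup, again assuming a ResNet-18 backbone and a 10-class target task:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)

# Freeze only the lower (early) layers, which capture generic features.
for module in [model.conv1, model.bn1, model.layer1]:
    for param in module.parameters():
        param.requires_grad = False

# Replace the classifier head for the target task.
model.fc = nn.Linear(model.fc.in_features, 10)

# Fine-tune everything that is still trainable, typically with a small learning rate.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)
```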
3. Small dataset, low data similarity
Freeze the lower layers of the pre-trained model, retrain the higher layers, and replace the classifier (the freezing mechanics are the same as in the sketch above, only with more of the higher layers left trainable). Because the data similarity is low, this retraining step is critical.
Freezing some of the lower layers of the pre-trained model compensates for the small size of the dataset.
4. Large dataset, low data similarity
With a large dataset, the neural network can be trained effectively from scratch. But when the similarity is low, the pre-trained weights bring little benefit, so: re-initialize all of the weights of the pre-trained model (keeping only the architecture) and train again from scratch on the new dataset.
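A short sketch of this case: build the same architecture without loading the pre-trained weights (the model name and class count are assumptions):

```python
from torchvision import models
import torch.nn as nn

# pretrained=False (or weights=None in newer torchvision) keeps only the
# architecture; all weights are randomly initialized.
model = models.resnet18(pretrained=False)
model.fc = nn.Linear(model.fc.in_features, 10)  # assumed 10 target classes
# ...then train the full network on the (large) new dataset as usual.
```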
[Note] In practice, several of these strategies are often tried in parallel, and the best-performing one is chosen.
II. Obtaining a pre-trained model
1. The models module of PyTorch's torchvision package (torchvision.models); set pretrained=True to load the pre-trained weights.
2. tensorflow.keras.applications, or models can be uploaded to and downloaded from the TensorFlow Hub website (https://tfhub.dev/google/).
3. Hugging Face transformers (a library of pre-trained NLP models).
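Example loaders for each of the three sources above (the specific model names are illustrative assumptions, not prescribed by the original post):

```python
# 1) torchvision (newer versions use the weights= argument instead of pretrained=)
from torchvision import models
cv_model = models.resnet50(pretrained=True)

# 2) tf.keras.applications (or a SavedModel downloaded from TensorFlow Hub)
import tensorflow as tf
tf_model = tf.keras.applications.ResNet50(weights="imagenet")

# 3) Hugging Face transformers
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
nlp_model = AutoModel.from_pretrained("bert-base-chinese")
```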