How to use a pre-trained language model
2022-07-29 06:11:00 【Quinn-ntmy】
How to use a pre-trained model
I. Approach
First, consider two things: the amount of data available for the target task, and the similarity between the target data and the data the model was pre-trained on.
In general, different strategies should be adopted depending on how similar the target dataset is to the pre-training dataset and how large it is.
[Figure: the four cases below, organized by dataset size and data similarity]
1. Small dataset, high data similarity
This is the ideal case: the pre-trained model can be used directly as a feature extractor, which is why this strategy is often called feature extraction.
In practice: remove the output layer, treat the rest of the network as a fixed feature extractor, and train only a new classifier on the target dataset (a minimal sketch follows).
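A minimal PyTorch sketch of this feature-extraction setup, assuming a ResNet-18 backbone from torchvision and a hypothetical 10-class target task:

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pre-trained backbone (ResNet-18 as an example).
model = models.resnet18(pretrained=True)

# Freeze every parameter: the network acts as a fixed feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Replace the output layer with a new classifier for the target task;
# this is the only part that will be trained.
num_classes = 10  # hypothetical number of target classes
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Pass only the new classifier's parameters to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```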
2. Large dataset, high data similarity
Freeze a few of the lower layers of the pre-trained model, replace the classifier, and then continue training on the new dataset (see the partial-freezing sketch after case 3).
3. Small dataset, low data similarity
Freeze some of the lower layers of the pre-trained model, retrain the remaining higher layers, and replace the classifier. Because the similarity is low, this retraining step is critical.
Freezing part of the lower layers of the pre-trained model compensates for the insufficient size of the dataset. A partial-freezing sketch covering cases 2 and 3 follows.
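A sketch of partial freezing for cases 2 and 3, again assuming a torchvision ResNet-18; how many of the lower stages to freeze is the knob that distinguishes the two cases:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)

# Freeze the lower stages of the network; freeze more or fewer stages
# depending on dataset size and similarity (case 2 vs. case 3).
frozen_stages = [model.conv1, model.bn1, model.layer1]  # example choice
for stage in frozen_stages:
    for param in stage.parameters():
        param.requires_grad = False

# Replace the classifier for the target task (10 classes is hypothetical).
model.fc = nn.Linear(model.fc.in_features, 10)

# Train only the parameters that still require gradients.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=1e-3, momentum=0.9)
```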
4. Large dataset, low data similarity
With a large dataset, training a neural network from scratch is efficient. When the similarity is low, however, the pre-trained weights add little value, so the usual approach is to reinitialize all the weights of the model and train again from scratch on the new dataset (a minimal sketch follows).
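A minimal sketch of this case, assuming the torchvision architecture is reused but its pre-trained weights are not:

```python
import torch
from torchvision import models

# Keep the architecture but discard the pre-trained weights:
# pretrained=False gives randomly initialized parameters.
model = models.resnet18(pretrained=False)

# Every parameter is trainable and is learned from scratch on the new dataset.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
```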
[Note] In practice, several of these strategies are often tried in parallel, and the best-performing one is kept.
II. Obtaining a pre-trained model
1. The models module of the PyTorch toolkit torchvision (torchvision.models); set pretrained=True when constructing the model.
2. tensorflow.keras.applications, or models downloaded from the TensorFlow Hub website (https://tfhub.dev/google/).
3. Hugging Face transformers, a library of pre-trained NLP models. (A short loading sketch for all three sources follows this list.)
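A short sketch of each loading route; the specific model names used here (resnet50, ResNet50, bert-base-uncased) are only illustrative choices:

```python
# 1. PyTorch / torchvision
from torchvision import models
cnn = models.resnet50(pretrained=True)

# 2. TensorFlow / Keras applications (models hosted on TensorFlow Hub
#    are instead loaded via the tensorflow_hub package).
from tensorflow.keras.applications import ResNet50
keras_cnn = ResNet50(weights="imagenet")

# 3. Hugging Face transformers (pre-trained NLP models)
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
```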