How to Use a Pretrained Language Model
2022-07-29 06:11:00 【Quinn-ntmy】
How to use a pretrained model
1. General approach
First, consider two things: the amount of target data available, and how closely the target data resembles the data the model was pretrained on.
In general, the strategy should be chosen according to how similar the target dataset is to the pretraining dataset.
1. Small dataset, high similarity
This is the ideal case: the pretrained model can be used as a fixed feature extractor, which is why this setup is often called feature extraction.
In practice: remove the output layer, treat the remaining network as a fixed feature extractor, and apply it to the new dataset.
2. Large dataset, high similarity
Freeze a few of the lower layers of the pretrained model, replace the classifier, and then continue training on the new dataset.
3. Small dataset, low similarity
Freeze the lower layers of the pretrained model, retrain the higher layers, and replace the classifier. Because the similarity is low, the retraining step is critical. Freezing some of the pretrained model's lower layers compensates for the small dataset.
4. Large dataset, low similarity
With a large dataset, training a neural network from scratch is effective, and when the similarity is low the pretrained weights offer little benefit. So: reinitialize all of the model's weights, then train from scratch on the new dataset.
[Note] In practice, several of these strategies are often tried at the same time, and the best-performing one is kept.
2. Obtaining pretrained models
1. The models module of PyTorch's torchvision toolkit (torchvision.models); set pretrained=True to load the pretrained weights.
2. tensorflow.keras.applications, or models downloadable from the TensorFlow Hub website (https://tfhub.dev/google/).
3. HuggingFace transformers (a library of pretrained NLP models).
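For the Hugging Face route, a minimal example, assuming the widely available bert-base-uncased checkpoint (any model name from the Hub works the same way):

```python
from transformers import AutoModel, AutoTokenizer

# Downloads (and caches) the checkpoint on first use
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("How to use a pretrained language model", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # one hidden vector per input token
```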