Meta Learning Brief
2022-07-02 07:57:00 【MezereonXP】
Let's first review the traditional machine learning / deep learning workflow:
- Identify the training and test datasets
- Determine the model structure
- Initialize the model parameters (usually from a common random distribution)
- Initialize the optimizer type and its parameters
- Train until convergence
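The steps above can be sketched end to end on a toy problem. This is a minimal illustration with a noiseless linear-regression setup; every name here is hypothetical, not part of the original post:

```python
import numpy as np

# 1. Identify training and test data: a toy regression problem (assumed here)
rng = np.random.default_rng(0)
X_train, X_test = rng.normal(size=(80, 3)), rng.normal(size=(20, 3))
true_w = np.array([1.0, -2.0, 0.5])
y_train, y_test = X_train @ true_w, X_test @ true_w

# 2. Determine the model structure: a linear model y_hat = X @ w
# 3. Initialize model parameters from a common random distribution
w = rng.normal(scale=0.1, size=3)

# 4. Initialize the optimizer type and its parameters: plain gradient descent
lr, n_steps = 0.1, 200

# 5. Train until (approximate) convergence
for _ in range(n_steps):
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(X_train)
    w -= lr * grad

test_mse = np.mean((X_test @ w - y_test) ** 2)
```

Meta learning targets exactly the choices made in steps 2-4 of this loop, which ordinary training simply fixes by hand.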
The goal of meta learning is to learn the choices made in steps 2, 3, and 4 above; we call these the meta-knowledge.
Let us formalize this.
Suppose the dataset is $D = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, where $x_i$ is an input and $y_i$ is its output label.
Our goal is to obtain a prediction model $\hat{y} = f(x; \theta)$, where $\theta$ denotes the model parameters, $x$ is the input, and $\hat{y}$ is the predicted output.
The optimization takes the form:

$$\theta^* = \arg\min_{\theta} \mathcal{L}(D; \theta, \omega)$$
where $\omega$ is the meta-knowledge, which includes:
- the optimizer type
- the model structure
- the initial distribution of the model parameters
- …
We partition the existing dataset $D$ by task into multiple task sets, each containing a training set and a validation set, of the form:

$$D_{source} = \{(D^{train}_{source}, D^{val}_{source})^{(i)}\}_{i=1}^{M}$$
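The split into source task sets can be sketched as follows. The helper name and the representation of $D$ as arrays `X, y` are assumptions for illustration only:

```python
import numpy as np

def split_into_tasks(X, y, n_tasks, train_frac=0.8, seed=0):
    """Split dataset D into M task sets, each with a train/val split.

    Hypothetical helper: each task set receives a disjoint shard of D,
    further split into D^train and D^val as in the formula above."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    shards = np.array_split(idx, n_tasks)
    tasks = []
    for shard in shards:
        cut = int(len(shard) * train_frac)
        train, val = shard[:cut], shard[cut:]
        tasks.append(((X[train], y[train]), (X[val], y[val])))
    return tasks

# Toy data standing in for D; 50 samples split into M = 5 task sets
X = np.arange(100).reshape(50, 2).astype(float)
y = X.sum(axis=1)
D_source = split_into_tasks(X, y, n_tasks=5)
```

Real benchmarks usually split by *class or domain* rather than by random shard, so that each task set genuinely differs; the random split here only illustrates the shape of $D_{source}$.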
The optimization objective is:

$$\omega^* = \arg\max_{\omega} \log p(\omega \mid D_{source})$$
That is, across the task sets we have carved out, we look for one configuration (i.e., one set of meta-knowledge) that works best over all of these tasks.
This step is generally called meta-training.
Having found $\omega^*$, we can apply it to a target task dataset $D_{target} = \{(D^{train}_{target}, D^{val}_{target})\}$
and run conventional training on it, i.e., find the optimal model parameters $\theta^*$:

$$\theta^* = \arg\max_{\theta} \log p(\theta \mid \omega^*, D^{train}_{target})$$

This step is called meta-testing.
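Taking $\omega$ to be the initial parameter values (one of the meta-knowledge choices listed earlier), the meta-training / meta-testing loop can be sketched with a Reptile-style first-order update. This is a simplified stand-in, not the only way to learn $\omega$; the toy task family and all names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(rng, n=20):
    """Hypothetical task family: linear regressions sharing a base
    weight vector, each perturbed slightly per task."""
    w_task = np.array([1.0, -1.0]) + rng.normal(scale=0.1, size=2)
    X = rng.normal(size=(n, 2))
    return X, X @ w_task

def inner_train(w0, X, y, lr=0.05, steps=20):
    """Conventional training starting from the initialization w0."""
    w = w0.copy()
    for _ in range(steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(X)
    return w

# Meta-training: learn omega (a good initialization) across source tasks
omega = np.zeros(2)
meta_lr = 0.5
for _ in range(100):
    X, y = make_task(rng)                 # draw one source task set
    w_adapted = inner_train(omega, X, y)  # adapt theta on its train split
    omega += meta_lr * (w_adapted - omega)  # Reptile-style outer update

# Meta-testing: conventional training on a new target task, starting
# from omega*, yields theta* for that task
X_t, y_t = make_task(rng)
theta_star = inner_train(omega, X_t, y_t)
```

Because every task shares the same base weights, the learned $\omega$ drifts toward that shared solution, so meta-testing on a fresh task starts close to its optimum and adapts in a few steps.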