A Brief Introduction to Meta-Learning
2022-07-02 07:57:00 【MezereonXP】
Let's first review the conventional machine learning or deep learning workflow:
- Choose the training and test datasets
- Decide on the model structure
- Initialize the model parameters (usually from a commonly used random distribution)
- Choose the optimizer type and set its parameters
- Train until convergence
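The five steps above can be sketched on a toy problem. This is only an illustration and not code from the original post: the linear model, the function names `train_linear_model` and `mse`, and every hyperparameter value are assumptions chosen to keep the example small.

```python
import random

def train_linear_model(train, lr=0.1, init_scale=0.1, steps=500, seed=0):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    rng = random.Random(seed)
    # Step 3: initialize the parameters from a common random distribution.
    w = rng.gauss(0.0, init_scale)
    b = rng.gauss(0.0, init_scale)
    n = len(train)
    # Step 5: train (a fixed number of steps stands in for "until convergence").
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / n
        # Step 4: the "optimizer" is plain gradient descent with learning rate lr.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(data, w, b):
    """Mean squared error of the linear model on a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

# Step 1: build a dataset (y = 3x + 1 plus small noise) and split it.
data_rng = random.Random(42)
points = [(k / 10, 3.0 * (k / 10) + 1.0 + data_rng.gauss(0, 0.01))
          for k in range(-10, 11)]
data_rng.shuffle(points)
train_set, test_set = points[:16], points[16:]

# Step 2 is the choice of a linear model inside train_linear_model.
w, b = train_linear_model(train_set)
test_error = mse(test_set, w, b)
```

Everything passed as a keyword argument here (`lr`, `init_scale`, `steps`) is fixed by hand, which is exactly the kind of choice meta-learning tries to learn instead.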
The goal of meta-learning is to learn some of the choices made in steps 2, 3 and 4 (the model structure, the parameter initialization, and the optimizer settings). We call these choices meta-knowledge.
Let us formalize this.
Suppose the dataset is $D = \{(x_1,y_1),...,(x_N,y_N)\}$, where $x_i$ is an input and $y_i$ is its output label.
Our goal is to obtain a prediction model $\hat{y} = f(x;\theta)$, where $\theta$ denotes the model parameters, $x$ is the input, and $\hat{y}$ is the predicted output.
The optimization takes the form:
$$\theta^*=\arg \min_{\theta} \mathcal{L}(D;\theta,\omega)$$
where $\omega$ is the meta-knowledge, which includes:
- Optimizer type
- Model structure
- Initial distribution of model parameters
- …
We split the existing dataset $D$ into multiple tasks. Each task has its own training set and test (validation) set, which gives:
$$D_{source} = \{(D^{train}_{source},D^{val}_{source})^{(i)}\}_{i=1}^{M}$$
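One simple way to build such a collection of $M$ tasks is sketched below. The helper `split_into_tasks`, the 80/20 split, and the toy dataset are assumptions made for illustration. In practice tasks are often defined by something more meaningful than a random partition (for example, different classes or domains).

```python
import random

def split_into_tasks(dataset, num_tasks, train_fraction=0.8, seed=0):
    """Partition a dataset into num_tasks disjoint (train, val) pairs."""
    rng = random.Random(seed)
    shuffled = list(dataset)
    rng.shuffle(shuffled)
    tasks = []
    for i in range(num_tasks):
        # Every num_tasks-th example goes to task i, so tasks do not overlap.
        chunk = shuffled[i::num_tasks]
        cut = int(len(chunk) * train_fraction)
        tasks.append((chunk[:cut], chunk[cut:]))
    return tasks

# Toy dataset of (x, y) pairs; d_source has the shape of D_source with M = 5.
dataset = [(x, 2 * x) for x in range(100)]
d_source = split_into_tasks(dataset, num_tasks=5)
```

Each element of `d_source` is one pair $(D^{train}_{source}, D^{val}_{source})^{(i)}$.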
The optimization objective is:
$$\omega^* = \arg \max_{\omega} \log p(\omega|D_{source})$$
That is, across the multiple tasks we have split off, we look for one configuration (the meta-knowledge) that is optimal for these tasks.
This step is generally called meta-training.
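As a rough sketch of meta-training, the example below replaces the probabilistic objective with a simpler stand-in: it picks, from a small grid of candidates, the $\omega$ whose trained models have the lowest average validation loss over the source tasks. This grid search is an assumption made for illustration and is not the method of any particular paper. Here $\omega$ is taken to be an initial weight `w0` and a learning rate `lr`, and all function names are hypothetical.

```python
from itertools import product

def make_task(slope):
    """Build one (train, val) pair for the toy relation y = slope * x."""
    xs = [k / 10 for k in range(-10, 11)]
    points = [(x, slope * x) for x in xs]
    return points[0::2], points[1::2]

def inner_train(train, omega, steps=5):
    """Conventional training of theta (a single weight w) given omega."""
    w = omega["w0"]
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        w -= omega["lr"] * grad
    return w

def val_loss(val, w):
    """Mean squared error of y = w * x on a validation set."""
    return sum((w * x - y) ** 2 for x, y in val) / len(val)

def meta_train(d_source, candidates):
    """Return the candidate omega with the lowest mean validation loss."""
    def meta_objective(omega):
        losses = [val_loss(val, inner_train(train, omega))
                  for train, val in d_source]
        return sum(losses) / len(losses)
    return min(candidates, key=meta_objective)

# Source tasks share a similar slope, so a good initialization exists.
d_source = [make_task(s) for s in (2.5, 3.0, 3.5)]
candidates = [{"w0": w0, "lr": lr}
              for w0, lr in product((0.0, 3.0), (0.01, 0.1))]
omega_star = meta_train(d_source, candidates)
```

Because training is limited to a few steps, a starting point near the slopes of the source tasks combined with the larger learning rate wins, so `omega_star` ends up as `{"w0": 3.0, "lr": 0.1}`.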
Once $\omega^*$ has been found, it can be applied to a target task dataset $D_{target} = \{(D_{target}^{train}, D_{target}^{val})\}$.
We then carry out conventional training on it, that is, we look for the optimal model parameters $\theta^*$:
$$\theta^* = \arg\max_{\theta}\log p(\theta|\omega^*, D_{target}^{train})$$
This step is called meta-testing.
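Meta-testing can be sketched in the same toy setting. The learned meta-knowledge is hard-coded below as a hypothetical `omega_star` (an initial weight and a learning rate), and the search for $\theta^*$ is approximated by a few gradient steps on the target training set. The names and values are illustrative assumptions, not results from the original post.

```python
def inner_train(train, omega, steps=5):
    """Conventional training of theta (a single weight w) given omega."""
    w = omega["w0"]
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        w -= omega["lr"] * grad
    return w

def mse(data, w):
    """Mean squared error of y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

# Target task: a new slope that was not among the source tasks.
xs = [k / 10 for k in range(-10, 11)]
points = [(x, 3.2 * x) for x in xs]
target_train, target_val = points[0::2], points[1::2]

# Hypothetical meta-knowledge from meta-training, and a hand-picked baseline.
omega_star = {"w0": 3.0, "lr": 0.1}
omega_default = {"w0": 0.0, "lr": 0.01}

theta_star = inner_train(target_train, omega_star)
theta_default = inner_train(target_train, omega_default)

loss_with_meta = mse(target_val, theta_star)
loss_baseline = mse(target_val, theta_default)
```

With the same small training budget, the model trained under `omega_star` reaches a much lower loss on the held-out part of the target task than the baseline, which is the benefit meta-knowledge is meant to provide.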