DDTL: Domain Transfer Learning at a Distance

2022-08-04 01:40:00 【A moment of loss】

DDTL problem: the target domain can be completely different from the source domain.

1、A selective learning algorithm (SLA) is proposed, with a supervised autoencoder or a supervised convolutional autoencoder as the base model for handling different types of input.

2、SLA gradually selects useful unlabeled data from the intermediate domains as a bridge, in order to break down the large distribution gap when transferring knowledge between two distant domains.

Transfer learning: learning methods that borrow knowledge from a source domain to improve learning in a target domain.

Task 1: transfer knowledge between cat and tiger images. Here the transfer learning algorithm achieves better performance than some supervised learning algorithms.

Task 2: transfer knowledge between face and airplane images. Here the transfer learning algorithm fails, because its performance is worse than that of the supervised learning algorithms. When SLA is applied, however, better performance is obtained.

1、DDTL problem definition
Labeled source-domain data of size $n_{S}$ : $S=\left \{ \left ( x_{S}^{1} ,y_{S}^{1}\right ),..., \left ( x_{S}^{n_{S}} ,y_{S}^{n_{S}}\right )\right \}$ ;
Labeled target-domain data of size $n_{T}$ : $T=\left \{ \left (x _{T}^{1},y_{T}^{1} \right ) ,...,\left (x _{T}^{n_{T}},y_{T}^{n_{T}} \right )\right \}$ ;
A mixture of unlabeled data from multiple intermediate domains: $I=\left \{x _{I}^{1} ,...,x_{I}^{n_{I}}\right \}$ .
A domain corresponds to a concept or category in a particular classification problem, such as recognizing faces or airplanes in images.
Problem statement: the classification problems in both the source and target domains are assumed to be binary, and all data points are assumed to lie in the same feature space. Let $p_{S}\left ( x \right )$ , $p_{S}\left ( y|x \right )$ and $p_{S}\left ( x,y \right )$ be the marginal, conditional and joint distributions of the source-domain data; let $p_{T}\left ( x \right )$ , $p_{T}\left ( y|x \right )$ and $p_{T}\left ( x,y \right )$ be the corresponding three distributions for the target domain; and let $p_{I}\left ( x \right )$ be the marginal distribution of the intermediate domains. In the DDTL problem:
$p_{T}\left ( x \right )\neq p_{S}\left ( x \right )$ ;
$p_{T}\left ( x \right )\neq p_{I}\left ( x \right )$ ;
$p_{T}\left ( y|x \right )\neq p_{S}\left ( y|x \right )$ .
Goal: use the unlabeled data in the intermediate domains to build a bridge between the source and target domains, which are originally far apart, and transfer supervised knowledge from the source domain across that bridge to train an accurate classifier for the target domain.
Note: not all data in the intermediate domains are necessarily similar to the source-domain data, and some of them can be very different. Simply building the bridge from all intermediate data may therefore fail.

2、SLA: Selective Learning Algorithm
2.1 Autoencoders and their variants
An autoencoder is an unsupervised feedforward neural network with an input layer, one or more hidden layers, and an output layer. It usually consists of two processes: encoding and decoding.
Input: $x\in R^{q}$ ;
Encoding function: $f_{e}\left ( \cdot \right )$ encodes the input, mapping it to a hidden representation;
Decoding function: $f_{d}\left ( \cdot \right )$ decodes the hidden representation to reconstruct $x$.
The autoencoder process can be summarized as:
Encoding: $h=f_{e}\left ( x\right )$ ;
Decoding: $\widehat{x}=f_{d}\left ( h \right )$ .
where $\widehat{x}$ is the reconstructed input, an approximation of the original input $x$. The pair of encoding and decoding functions is learned by minimizing the reconstruction error over all training data, i.e.
$\min_{f_{e},f_{d}}\sum_{i=1}^{n}\left \| \widehat{x_{i}} -x_{i}\right \|_{2}^{2}$
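The encode/decode steps above can be sketched in a few lines of NumPy. This is only an illustration, not the model from the paper: the single tanh hidden layer, the linear decoder, the random weights and the names `encode`, `decode` and `reconstruction_errors` are all assumptions made for the example, and no training is performed.

```python
import numpy as np

def encode(x, W_e, b_e):
    """f_e: map inputs of shape (n, q) to hidden codes of shape (n, k)."""
    return np.tanh(x @ W_e + b_e)

def decode(h, W_d, b_d):
    """f_d: map hidden codes of shape (n, k) back to the input space (n, q)."""
    return h @ W_d + b_d

def reconstruction_errors(x, x_hat):
    """Per-instance squared L2 error ||x_hat_i - x_i||_2^2."""
    return np.sum((x_hat - x) ** 2, axis=1)

# Hypothetical toy data and untrained weights, only to show the shapes.
rng = np.random.default_rng(0)
n, q, k = 5, 8, 3
x = rng.normal(size=(n, q))
W_e, b_e = rng.normal(scale=0.1, size=(q, k)), np.zeros(k)
W_d, b_d = rng.normal(scale=0.1, size=(k, q)), np.zeros(q)

h = encode(x, W_e, b_e)                    # h = f_e(x)
x_hat = decode(h, W_d, b_d)                # x_hat = f_d(h)
errors = reconstruction_errors(x, x_hat)
total_error = errors.sum()                 # quantity minimized over f_e, f_d
```

Training would adjust the weights to reduce `total_error`. The per-instance `errors` vector is kept separate because the selection step in 2.2 works on individual instances.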
2.2 Instance selection via reconstruction error
In practice, because the source and target domains are far apart, only part of the source-domain data may be useful for the target domain, and the same holds for the intermediate domains. Therefore, to select useful instances from the intermediate domains and to remove source-domain instances that are irrelevant to the target domain, a pair of encoding and decoding functions is learned by minimizing the reconstruction error on the selected instances in the source and intermediate domains and on all instances in the target domain. The objective function to be minimized is:
$\tau _{1}\left (f _{e} ,f_{d},v_{S},v_{I}\right )=\frac{1}{n_{S}}\sum_{i=1}^{n_{S}}v_{S}^{i}\left \| \widehat{x_{S}^{i}}-x_{S}^{i} \right \|_{2}^{2}+\frac{1}{n_{I}}\sum_{i=1}^{n_{I}}v_{I}^{i}\left \| \widehat{x_{I}^{i}}-x_{I}^{i} \right \|_{2}^{2}+\frac{1}{n_{T}}\sum_{i=1}^{n_{T}}\left \| \widehat{x_{T}^{i}}-x_{T}^{i} \right \|_{2}^{2}+R\left ( v_{S},v_{I} \right )$
$v_{S}^{i}$ , $v_{I}^{j}$ ∈ {0,1}: selection indicators for the $i$-th instance in the source domain and the $j$-th instance in the intermediate domains. When the value is 1 the corresponding instance is selected; otherwise it is not selected.
$R\left ( v_{S},v_{I} \right )$ : a regularization term on $v_{S}$ and $v_{I}$ that avoids the trivial solution in which all values of $v_{S}$ and $v_{I}$ are set to zero. It is defined as:
$R\left ( v_{S},v_{I} \right )=-\frac{\lambda _{S}}{n_{S}}\sum_{i=1}^{n_{S}}v_{S}^{i}-\frac{\lambda _{I}}{n_{I}}\sum_{i=1}^{n_{I}}v_{I}^{i}$
Minimizing this term encourages selecting as many instances as possible from the source and intermediate domains. The two regularization parameters $\lambda _{S}$ and $\lambda _{I}$ control the importance of this regularization term.
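To make the role of the indicators and the regularizer concrete, here is a small sketch that evaluates the selective reconstruction objective from precomputed per-instance reconstruction errors. It assumes the intermediate-domain indicator is written $v_{I}$ and that target instances are never deselected. The function name `tau_1` and all numbers are made up for illustration.

```python
import numpy as np

def tau_1(err_S, err_I, err_T, v_S, v_I, lam_S, lam_I):
    """Selective reconstruction objective.

    err_S, err_I, err_T: per-instance reconstruction errors per domain.
    v_S, v_I: 0/1 selection indicators for source / intermediate instances.
    """
    # np.mean(v * err) equals (1/n) * sum_i v_i * err_i.
    recon = np.mean(v_S * err_S) + np.mean(v_I * err_I) + np.mean(err_T)
    # Regularizer R: rewards selecting more instances.
    reg = -lam_S * np.mean(v_S) - lam_I * np.mean(v_I)
    return recon + reg

# Hypothetical reconstruction errors.
err_S = np.array([0.2, 1.5, 0.4])
err_I = np.array([0.1, 2.0])
err_T = np.array([0.3, 0.5])

# Keep the well-reconstructed instances, drop the poorly reconstructed ones.
v_S = np.array([1, 0, 1])
v_I = np.array([1, 0])
value = tau_1(err_S, err_I, err_T, v_S, v_I, lam_S=1.0, lam_I=1.0)

# Deselecting everything: only the target term remains.
nothing = tau_1(err_S, err_I, err_T, np.zeros(3), np.zeros(2), 1.0, 1.0)

# Without the regularizer, deselecting everything is never worse.
no_reg_all_off = tau_1(err_S, err_I, err_T, np.zeros(3), np.zeros(2), 0.0, 0.0)
no_reg_selected = tau_1(err_S, err_I, err_T, v_S, v_I, 0.0, 0.0)
```

With $\lambda _{S}=\lambda _{I}=0$ the all-zero selection gives the smallest value, which is the trivial solution the regularizer is meant to rule out. With positive $\lambda$ values, selecting low-error instances lowers the objective.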
2.3 Incorporating auxiliary information
Auxiliary information is incorporated when learning the hidden representations for the different domains.
Source and target domains: the data labels can be used as auxiliary information;
Intermediate domains: no label information is available.
For the intermediate domains, the predictions are treated as auxiliary information, and the prediction confidence is used to guide the learning of the hidden representations. Specifically, the auxiliary information is incorporated by minimizing the following function:
$\tau _{2}\left (f _{c},f _{e} ,f _{d}\right)=\frac{1}{n_{S}}\sum_{i=1}^{n_{S}}v_{S}^{i}\ell \left ( y_{S}^{i} ,f_{c}\left ( h_{S}^{i} \right )\right )+\frac{1}{n_{T}}\sum_{i=1}^{n_{T}}\ell \left ( y_{T}^{i} ,f_{c}\left ( h_{T}^{i} \right )\right )+\frac{1}{n_{I}}\sum_{i=1}^{n_{I}}v_{I}^{i}g \left (f_{c}\left ( h_{I}^{i} \right ) \right )$
$\ell \left ( \cdot ,\cdot \right )$ : the classification loss between a label and the prediction made from the hidden representation $h=f_{e}\left ( x \right )$ ;
$f_{c}\left ( \cdot \right )$ : the classification function, which outputs classification probabilities;
$g\left ( \cdot \right )$ : defined as $g\left ( z \right )=-z\ln z-\left ( 1-z \right )\ln\left ( 1-z \right )$ , where 0 ≤ z ≤ 1;
$g\left ( \cdot \right )$ is used to select instances with high prediction confidence in the intermediate domains.
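The function $g$ is the binary entropy of the predicted probability, so a low value means a confident prediction. A minimal sketch follows. The clipping constant `eps` is an assumption added here only to avoid evaluating log(0) at the endpoints, and the probabilities are made up.

```python
import numpy as np

def g(z, eps=1e-12):
    """Binary entropy -z ln z - (1 - z) ln(1 - z) for z in [0, 1]."""
    z = np.clip(z, eps, 1.0 - eps)   # keep log() finite at z = 0 and z = 1
    return -z * np.log(z) - (1.0 - z) * np.log(1.0 - z)

# Hypothetical classifier outputs f_c(h) for three intermediate instances.
probs = np.array([0.5, 0.9, 0.99])
entropies = g(probs)   # largest at 0.5 (ln 2), shrinking toward 0 near 0 or 1
```

An instance predicted with probability 0.99 therefore contributes far less to the objective than one predicted with 0.5, which is why minimizing the $g$ term favours confidently classified intermediate instances.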
2.4 Overall objective function
The final objective function of DDTL is:
$\min_{\Theta ,v}\tau =\tau _{1}+\tau _{2}, s.t. v_{S}^{i},v_{I}^{i}\in \left \{ 0,1 \right \}$
where $v=\left \{ v_{S} ,v_{I}\right \}$ and $\Theta$ denotes all parameters of the functions $f_{c}\left ( \cdot \right )$ , $f_{e}\left ( \cdot \right )$ and $f_{d}\left ( \cdot \right )$ .
The block coordinate descent (BCD) method is used: in each iteration, the variables in each block are optimized in turn while the other variables are kept fixed.
For $v_{S}^{i}$ , source instances with low reconstruction error and low training loss are selected; for $v_{I}^{i}$ , intermediate instances with low reconstruction error and high prediction confidence are selected.
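The selection step of BCD can be sketched as follows. The sketch assumes the objective is $\tau _{1}+\tau _{2}$ with the regularizer $-\frac{\lambda _{S}}{n_{S}}\sum v_{S}^{i}-\frac{\lambda _{I}}{n_{I}}\sum v_{I}^{i}$. With $\Theta$ fixed, that objective is linear in each indicator, so an indicator is set to 1 exactly when its coefficient is negative: reconstruction error plus training loss below $\lambda _{S}$ for a source instance, and reconstruction error plus entropy $g$ below $\lambda _{I}$ for an intermediate instance. The function name `update_indicators` and all numbers are hypothetical.

```python
import numpy as np

def update_indicators(err_S, loss_S, err_I, ent_I, lam_S, lam_I):
    """v-step of block coordinate descent with the network parameters fixed.

    err_S, err_I: per-instance reconstruction errors.
    loss_S: per-instance classification loss on labeled source data.
    ent_I: per-instance prediction entropy g(f_c(h)) on intermediate data.
    """
    v_S = (err_S + loss_S < lam_S).astype(int)
    v_I = (err_I + ent_I < lam_I).astype(int)
    return v_S, v_I

# Hypothetical per-instance quantities computed with the current parameters.
err_S = np.array([0.2, 0.9, 0.1])
loss_S = np.array([0.3, 0.5, 0.2])
err_I = np.array([0.1, 0.1, 0.8])
ent_I = np.array([0.05, 0.69, 0.05])

v_S, v_I = update_indicators(err_S, loss_S, err_I, ent_I, lam_S=0.6, lam_I=0.5)
```

In this toy run the second intermediate instance is rejected despite its low reconstruction error because its prediction is uncertain, and the third is rejected because it reconstructs poorly. The other block, updating $\Theta$ with the indicators fixed, would be an ordinary gradient-based training step on the selected instances, and the two steps alternate until the selection stabilizes.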
The deep learning architecture is shown in the figure below.

3、Summary
This paper studies a new problem, DDTL, in which the source and target domains are far apart but can be connected through some intermediate domains. To solve the DDTL problem, the SLA algorithm is proposed, which gradually selects unlabeled data from the intermediate domains to connect the two distant domains.

Copyright notice
This article was written by [A moment of loss]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/216/202208040110337435.html
