Paper notes: multi-label learning with MSWL
2022-06-25 21:57:00 【闵帆】
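Abstract: These notes share my understanding of the paper. For the original, see Zhang, J., Li, S., Jiang, M., & Tan, K. C. (2020). Learning from weakly labeled data based on manifold regularized sparse model. IEEE Transactions on Cybernetics, (pp. 1–14).
1. Contributions of the paper
- It addresses semi-supervised multi-label learning with missing labels. Strictly speaking, once some labels are missing the setting is already semi-supervised; what this paper emphasizes is that some samples are given no labels at all.
- Global and local label correlations.
- A sparse learning model that accounts for the discriminability of the conditional attributes (features).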
2. Basic notation
| Symbol | Meaning | Notes |
|---|---|---|
| $\mathbf{X} \in \mathbb{R}^{n \times d}$ | Attribute (feature) matrix | |
| $\mathbf{Y} \in \{-1, 1\}^{n \times q}$ | Label matrix | |
| $\mathbf{C} \in \{0, 1\}^{n \times q}$ | Observed label matrix | $c_{ij} = 0$ may correspond to either $y_{ij} = -1$ or $y_{ij} = 1$ |
| $\mathbf{\Omega} = \{1, \dots, n\} \times \{1, \dots, c\}$ | Set of observed label positions | |
| $\mathbf{W} \in \mathbb{R}^{m \times l}$ | Coefficient matrix | Still a linear model |
| $\mathbf{w}_i \in \mathbb{R}^m$ | Coefficient vector of a single label | |
| $\mathbf{C} \in \mathbb{R}^{l \times l}$ | Label correlation matrix | Pairwise correlations, not symmetric; written as $\mathbf{B}$ in Section 3 |
3. Algorithm

The basic optimization objective is
$$\min_{\mathbf{W}} V(\mathbf{X}, \mathbf{C}, \mathbf{W}) + \gamma \Omega(\mathbf{W}) + \mu Z(\mathbf{X}, \mathbf{C}, \mathbf{W}), \tag{1}$$
where $V$ is the loss function, $\Omega$ is the regularizer described in Section 3.2, and $Z$ uses label-correlation information to strengthen learning from weak labels.
3.1 Loss function
$$V(\mathbf{X}, \mathbf{C}, \mathbf{W}) = \|\mathbf{XW} - \tilde{\mathbf{Y}}\|_2^2, \tag{2}$$
where $\tilde{\mathbf{Y}}$ is computed from $\mathbf{C}$ and is intended to approximate $\mathbf{Y}$. The procedure is as follows.
If $c_{ij} = 0$, meaning the label is missing or negative, then
$$\tilde{c}_{ij} = \sum_{p \in \mathcal{N}_j} c_{ip} b_{pj}, \tag{3}$$
where $\mathcal{N}_j$ denotes all neighbor labels of label $j$, and $b_{pj}$ is the correlation between label $p$ and label $j$. This can be written in matrix form as follows (with one minor problem: the neighbor information is lost):
$$\tilde{\mathbf{C}} = \mathbf{C}(\mathbf{B} + \mathbf{I}).$$
The entries are then truncated to $[0, 1]$:
$$\tilde{y}_{ij} = \left\{\begin{array}{ll} 1, & \tilde{c}_{ij} \geq 1;\\ \tilde{c}_{ij}, & 0 < \tilde{c}_{ij} < 1;\\ 0, & \tilde{c}_{ij} \leq 0. \end{array}\right.\tag{4}$$
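As a rough illustration of Eqs. (3) and (4), the NumPy sketch below uses the matrix form $\tilde{\mathbf{C}} = \mathbf{C}(\mathbf{B} + \mathbf{I})$ followed by clipping. The function name `complete_labels` and the toy values are made up for illustration. The sketch assumes that $\mathbf{B}$ has a zero diagonal and nonnegative entries, so that observed positives stay at 1, and it ignores the neighbor sets $\mathcal{N}_j$, as the matrix form does.

```python
import numpy as np

def complete_labels(C, B):
    """Sketch of Eq. (3)-(4): propagate observed labels through B, then clip to [0, 1]."""
    C = np.asarray(C, dtype=float)
    B = np.asarray(B, dtype=float)
    # Matrix form of Eq. (3); adding I keeps each label's own observed value.
    C_tilde = C @ (B + np.eye(B.shape[0]))
    # Eq. (4) is exactly a clip of C_tilde to the interval [0, 1].
    return np.clip(C_tilde, 0.0, 1.0)

# Toy example (made-up values): 2 samples, 3 labels.
C = np.array([[1, 0, 0],
              [0, 1, 1]])
B = np.array([[0.0, 0.6, 0.0],
              [0.3, 0.0, 0.5],
              [0.0, 0.4, 0.0]])
Y_tilde = complete_labels(C, B)
print(Y_tilde)
```

In this toy case the unobserved second label of the first sample receives a soft value of 0.6, because it is correlated with the observed first label.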
3.2 Regularizer
The $\ell_{2,1}$ norm is used to control sparsity:
$$\Omega(\mathbf{W}) = \|\mathbf{W}\|_{2, 1} = \sum_{i = 1}^n \sqrt{\sum_{j = 1}^t w_{ij}^2},$$
that is, take the 2-norm of each row and sum the results.
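A minimal NumPy sketch of this row-wise computation is given below; the function name `l21_norm` and the example matrix are illustrative only.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    W = np.asarray(W, dtype=float)
    row_norms = np.sqrt((W ** 2).sum(axis=1))
    return float(row_norms.sum())

W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
print(l21_norm(W))  # row norms are 5, 0 and 1, so this prints 6.0
```

Because the penalty acts on whole rows, it tends to drive entire rows of $\mathbf{W}$ to zero, which amounts to discarding the corresponding feature for all labels at once.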
3.3 Label correlation learning (global and local manifold regularization)
- Global correlation
$$\min_{\mathbf{b}_i} \|\mathbf{C}_{-i} \mathbf{b}_i - \mathbf{c}_i\|_2^2 + \lambda \|\mathbf{b}_i\|, \tag{6}$$
where $\mathbf{C}_{-i}$ is the incomplete matrix obtained by setting every entry of the $i$-th label column to 0. The details of how this problem is optimized are omitted here; I did not understand them anyway.
- Local correlation
$$\min_{\mathbf{S}} \sum_{i = 1}^n \|\mathbf{x}_i - \sum_{j \in \mathcal{N}_i} s_{ji} \mathbf{x}_j\|^2, \tag{11}$$
where $K$ is the number of neighbors and $s_{ij}$ is the similarity between $\mathbf{x}_i$ and its neighbor $\mathbf{x}_j$.
Note that the original paper has a few minor issues here:
- The notation $s_{ij} \in \mathbf{S}$ is incorrect, since $\mathbf{S}$ is a matrix rather than a set. Omitting it causes no ambiguity anyway;
- There is one level of indirection between the $j$-th neighbor and the $j$-th label (index), which is why $j \in \mathcal{N}_i$ is used here;
- I suspect the subscript 2 was left out, so the norm as written is not the 2-norm.
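To make the local term more concrete, here is a hedged sketch of one way to obtain reconstruction weights in the spirit of Eq. (11). It is not the paper's solver: it assumes Euclidean nearest neighbors, solves an unconstrained least-squares problem per sample, and stores the weights so that row $i$ of $\mathbf{S}\mathbf{X}$ reconstructs $\mathbf{x}_i$ (matching the $\mathbf{SXW}$ term used later). The name `local_weights` is hypothetical.

```python
import numpy as np

def local_weights(X, K):
    """Least-squares reconstruction weights over the K nearest neighbors (sketch)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    S = np.zeros((n, n))
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                 # a sample is not its own neighbor
        nbrs = np.argsort(dist)[:K]      # assumed: Euclidean K nearest neighbors
        # Solve min_s || x_i - sum_j s_j x_j ||^2 over the neighbors only.
        s, *_ = np.linalg.lstsq(X[nbrs].T, X[i], rcond=None)
        S[i, nbrs] = s
    return S

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))              # made-up data: 6 samples, 2 features
S = local_weights(X, K=3)
residual = np.linalg.norm(X - S @ X) ** 2
print(S.shape, residual)
```

When $K$ exceeds the feature dimension, as in this toy setup, the neighbors can typically reconstruct each sample exactly, so the residual is close to zero; with fewer neighbors or added constraints it generally is not.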
Finally, the two correlation terms are combined as
$$Z(\mathbf{X}, \mathbf{C}, \mathbf{W}) = \alpha \|\mathbf{W} - \mathbf{WB}\|_F^2 + \beta \|\mathbf{XW} - \mathbf{SXW}\|_F^2.$$
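The following sketch simply evaluates this expression for given matrices, which can help when checking shapes. The function name `manifold_regularizer` is an assumption, and the shapes are taken to be $\mathbf{X}$: $n \times d$, $\mathbf{W}$: $d \times l$, $\mathbf{B}$: $l \times l$, $\mathbf{S}$: $n \times n$.

```python
import numpy as np

def manifold_regularizer(X, W, B, S, alpha, beta):
    """Evaluate alpha * ||W - WB||_F^2 + beta * ||XW - SXW||_F^2."""
    XW = X @ W
    global_term = np.linalg.norm(W - W @ B, "fro") ** 2    # label-correlation term
    local_term = np.linalg.norm(XW - S @ XW, "fro") ** 2   # sample-neighborhood term
    return alpha * global_term + beta * local_term

# Sanity checks on made-up 2x2 inputs.
I2 = np.eye(2)
Z0 = np.zeros((2, 2))
z_identity = manifold_regularizer(I2, I2, I2, I2, alpha=1.0, beta=0.5)
z_zero = manifold_regularizer(I2, I2, Z0, Z0, alpha=1.0, beta=0.5)
print(z_identity, z_zero)
```

With $\mathbf{B} = \mathbf{I}$ and $\mathbf{S} = \mathbf{I}$ both differences vanish, so the value is 0; with $\mathbf{B} = \mathbf{S} = \mathbf{0}$ the terms reduce to $\alpha\|\mathbf{W}\|_F^2 + \beta\|\mathbf{XW}\|_F^2$.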
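3.4 Extension to semi-supervised learning
Left for the reader to work out.
4. Summary
- The three parts of the objective (loss, sparsity regularizer, and correlation term) each play their own role.
- Manifold learning.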