Paper notes: extreme multi-label learning with GalaXC (draft, not finished)
2022-07-06 02:14:00 【Min fan】
Abstract: My understanding of the paper, shared here. For the original, see D. Saini, A. K. Jain, K. Dave, J. Jiao, A. Singh, R. Zhang and M. Varma, GalaXC: Graph neural networks with labelwise attention for extreme classification, in WWW 2021. Six of the seven authors are from Microsoft Research, so trying to compete with them feels rather foolhardy.
1. Contributions of the paper
- Handles the setting where labels and documents live in the same space ("labels and documents cohabit the same space").
- Uses label metadata, namely label text and label correlations.
- Introduces a label-wise attention mechanism.
- Performs well in warm-start scenarios, where some labels are already known ("predictions need to be made on data points with partially revealed label sets").
- Can handle millions of labels.
- Fast and accurate.
2. Motivation
- Prior work has shown that learning dense, application-specific document representations can lead to better predictions than using application-agnostic features such as traditional bag-of-words features.
- Short texts with typically only 5-10 tokens. Examples include predicting related webpages or related products using only the title of a given webpage/product, and predicting relevant ads/keywords/searches for user queries.
- Label metadata in various forms, such as label text, label correlations, or label hierarchies, can be used to better serve tail labels. XC applications often make such metadata available.
- Label features. Contemporary XC algorithms have explored utilizing label features.
- Warm-start and auxiliary sources of data.
- Existing works mostly use document-document graphs rather than joint document-label graphs at extreme scales (see Table 1 of the paper).
2. Basic symbols
| Symbol | Meaning | Remarks |
|---|---|---|
| $\mathbb{G}$ | Bipartite graph | $\mathbb{G} = (\mathbb{D} \cup \mathbb{L}, \mathbb{E})$ |
| $\mathbb{D}$ | Set of document nodes | Elements are denoted $d$; cardinality is $N$ |
| $\mathbb{L}$ | Set of label nodes | Elements are denoted $l$; cardinality is $L$ |
| $\mathbf{y}_i$ | Ground-truth label vector of the $i$-th document | Takes values in $\{-1, +1\}^L$ |
| $\hat{\mathbf{x}}_i^0$ | Feature vector of the $i$-th document | $D$-dimensional |
| $\hat{\mathbf{z}}_l^0$ | Feature vector of the $l$-th label | $D$-dimensional |
| $\hat{\mathbf{v}}_n^0$ | Unified notation for $\hat{\mathbf{x}}_i^0$ and $\hat{\mathbf{z}}_l^0$ | $D$-dimensional |
| $\mathcal{N}$ | Neighborhood operator | $\mathbb{V} \to 2^{\mathbb{V}}$ |
| $\mathcal{C}$ | Convolution operator | |
| $\mathcal{T}$ | Transformation operator | |
| $\hat{\mathbf{a}}_n^k$ | $\mathcal{C}_k(\{\hat{\mathbf{v}}_m^{k-1}, \hat{\mathbf{a}}_m^{k-1}: m \in \mathcal{N}(n)\})$ | GNN operation |
| $\hat{\mathbf{v}}_n^k$ | $\mathcal{T}_k(\{\hat{\mathbf{v}}_n^{k-1}, \hat{\mathbf{a}}_n^{k-1}\})$ | GNN operation |
| $\mathbf{W}$ | Coefficient matrix | $D \times L$ |
| $K$ | Number of hops | |
| $e_{lk}$ | Scalar for label $l$ at the $k$-th hop | |
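To make the graph notation concrete, here is a minimal sketch of the bipartite graph $\mathbb{G}$ and the neighborhood operator $\mathcal{N}$ in plain Python. The node names (`d0`, `l0`), the toy edge list, and the helper `build_neighbors` are hypothetical and chosen only for illustration; they do not come from the paper or its code.

```python
from collections import defaultdict

def build_neighbors(edges):
    """Return a dict mapping each node to the set of its neighbors.

    `edges` is a list of (document, label) pairs, i.e. the edge set E
    of the bipartite graph G = (D ∪ L, E).
    """
    neighbors = defaultdict(set)
    for doc, label in edges:
        # Edges are undirected: a document sees its labels and vice versa.
        neighbors[doc].add(label)
        neighbors[label].add(doc)
    return dict(neighbors)

# Toy graph (assumed data): 3 documents d0..d2 and 2 labels l0, l1.
edges = [("d0", "l0"), ("d1", "l0"), ("d1", "l1"), ("d2", "l1")]
N = build_neighbors(edges)
print(sorted(N["d1"]))  # ['l0', 'l1']
print(sorted(N["l0"]))  # ['d0', 'd1']
```

In this toy graph every neighbor of a document is a label and every neighbor of a label is a document, which is what makes the graph bipartite.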
3. Scheme
Graph convolution block. The specific operation is

$$\hat{\mathbf{a}}_n^k = \mathcal{C}_k(\hat{\mathbf{a}}_n^{k-1}) = (1 + \epsilon_k) \cdot \hat{\mathbf{a}}_n^{k-1} + \sum_{m \in \mathcal{N}(n)} \hat{\mathbf{a}}_m^{k-1}$$
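The convolution step can be sketched in plain Python as follows. This is only an illustration of the formula as written in these notes: the dictionary-based graph, the function name `graph_conv`, and the fixed value of `eps` are assumptions (in a trained model $\epsilon_k$ would presumably be learned).

```python
def graph_conv(a_prev, neighbors, eps):
    """One hop: a_n^k = (1 + eps) * a_n^{k-1} + sum over neighbors m of a_m^{k-1}."""
    a_next = {}
    for n, vec in a_prev.items():
        # Self term, scaled by (1 + eps).
        out = [(1.0 + eps) * x for x in vec]
        # Add the previous-hop vectors of all neighbors of n.
        for m in neighbors.get(n, ()):
            out = [o + x for o, x in zip(out, a_prev[m])]
        a_next[n] = out
    return a_next

# Toy data (assumed): two documents attached to one label, 2-dim features.
neighbors = {"d0": {"l0"}, "d1": {"l0"}, "l0": {"d0", "d1"}}
a0 = {"d0": [1.0, 0.0], "d1": [0.0, 1.0], "l0": [2.0, 2.0]}
a1 = graph_conv(a0, neighbors, eps=0.5)
print(a1["l0"])  # [4.0, 4.0]
print(a1["d0"])  # [3.5, 2.0]
```

For the label node `l0`, the self term is $1.5 \cdot (2, 2) = (3, 3)$ and the two document neighbors add $(1, 0) + (0, 1)$, giving $(4, 4)$.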
Embedding. The specific operation is

$$\hat{\mathbf{v}}_n^k = \mathcal{T}_k(\hat{\mathbf{a}}_n^k)$$
Let

$$\alpha_{lk} = \frac{\exp(e_{lk})}{\sum_{k' \in [K]} \exp(e_{lk'})}$$

This is the weight (proportion) that label $l$ assigns to the $k$-th hop.
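The attention weights are a softmax over the $K$ per-hop scalars of one label. A minimal sketch, where the function name `hop_attention` and the example scores are assumptions made for illustration:

```python
import math

def hop_attention(e_l):
    """Softmax over the per-hop scalars e_{l1}, ..., e_{lK} of a single label."""
    m = max(e_l)  # subtracting the max does not change the result but avoids overflow
    exps = [math.exp(e - m) for e in e_l]
    total = sum(exps)
    return [x / total for x in exps]

# Equal scores give equal weights; a larger score gives a larger weight.
uniform = hop_attention([0.0, 0.0, 0.0])
skewed = hop_attention([1.0, 2.0, 3.0])
print(uniform)
print(skewed)
```

The weights are non-negative and sum to 1 for each label, so they can be read as how much that label relies on each hop.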
The label-wise embedding is computed as

$$\hat{\mathbf{x}}^{(l)} = \sum_{k \in [K]} \alpha_{lk} \cdot \hat{\mathbf{x}}^{k}$$

Note: I have not yet understood the superscript $k$ here.
The label score is

$$s_l = \langle \mathbf{w}_l, \hat{\mathbf{x}}^{(l)} \rangle$$
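The last two formulas can be combined into a short sketch. It assumes that $\hat{\mathbf{x}}^{k}$ denotes the document's embedding after the $k$-th hop (the notes leave the meaning of this superscript open, so this reading is an assumption), and the function name `labelwise_score` and all numbers are made up for illustration.

```python
def labelwise_score(alpha_l, x_hops, w_l):
    """Return (x_l, s_l), where x_l = sum_k alpha_lk * x^k and s_l = <w_l, x_l>.

    alpha_l: attention weights of label l over the K hops
    x_hops:  list of K vectors, assumed to be the per-hop document embeddings
    w_l:     the column of W belonging to label l
    """
    dim = len(w_l)
    x_l = [0.0] * dim
    for a, x_k in zip(alpha_l, x_hops):
        for j in range(dim):
            x_l[j] += a * x_k[j]
    # Inner product between the label's weight vector and the label-wise embedding.
    s_l = sum(w * x for w, x in zip(w_l, x_l))
    return x_l, s_l

alpha_l = [0.5, 0.5]               # K = 2 hops, equal attention
x_hops = [[1.0, 0.0], [0.0, 1.0]]  # toy per-hop embeddings, D = 2
w_l = [2.0, 4.0]
x_l, s_l = labelwise_score(alpha_l, x_hops, w_l)
print(x_l, s_l)  # [0.5, 0.5] 3.0
```

Because the attention weights differ per label, the same document gets a different embedding for each label before the inner product is taken.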
4. Summary
Until I have read the code, I cannot really understand this paper.