Vector similarity evaluation method
2022-07-29 02:17:00 【Algorithm with temperature】
Link to the original text: Vector similarity evaluation method
Similarity measures come up constantly in day-to-day work. This post introduces four commonly used approaches to evaluating vector similarity, each implemented in PyTorch:
- CosineSimilarity
- DotProductSimilarity
- BiLinearSimilarity
- MultiHeadedSimilarity
1 Cosine similarity
Cosine similarity is the most familiar of the four. It measures how alike two vectors are by the cosine of the angle between them: the closer the cosine is to 1, the closer the angle is to 0 degrees and the more similar the two vectors are.

import math

import torch
import torch.nn as nn


class CosineSimilarity(nn.Module):
    def forward(self, tensor_1, tensor_2):
        # Scale each vector to unit length along the last dimension.
        normalized_tensor_1 = tensor_1 / tensor_1.norm(dim=-1, keepdim=True)
        normalized_tensor_2 = tensor_2 / tensor_2.norm(dim=-1, keepdim=True)
        # The dot product of two unit vectors is the cosine of their angle.
        return (normalized_tensor_1 * normalized_tensor_2).sum(dim=-1)
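As a quick sanity check, the sketch below compares the normalize-then-dot computation above with PyTorch's built-in `torch.nn.functional.cosine_similarity`. The helper name `cosine_by_hand` and the example tensors are illustrative assumptions, not part of the original post. It also shows one practical difference: dividing by a zero norm produces NaN, whereas the built-in guards the denominator with its `eps` argument.

```python
import torch
import torch.nn.functional as F


def cosine_by_hand(tensor_1, tensor_2):
    # Same computation as the module above: normalize, then dot product.
    n1 = tensor_1 / tensor_1.norm(dim=-1, keepdim=True)
    n2 = tensor_2 / tensor_2.norm(dim=-1, keepdim=True)
    return (n1 * n2).sum(dim=-1)


# Three pairs: parallel vectors, orthogonal vectors, and a zero vector.
a = torch.tensor([[1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
b = torch.tensor([[2.0, 0.0], [-1.0, 1.0], [1.0, 2.0]])

by_hand = cosine_by_hand(a, b)                 # last entry is NaN (0 / 0)
built_in = F.cosine_similarity(a, b, dim=-1)   # last entry is 0 thanks to eps

print(by_hand)
print(built_in)
```

If zero vectors can occur in the input, clamping the norm (or using the built-in) is probably the safer choice.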
2 DotProductSimilarity
This similarity function computes the dot product between each pair of vectors. An optional scaling step divides the result by the square root of the vector dimension to reduce the variance of the output.

class DotProductSimilarity(nn.Module):
    def __init__(self, scale_output=False):
        super(DotProductSimilarity, self).__init__()
        self.scale_output = scale_output

    def forward(self, tensor_1, tensor_2):
        result = (tensor_1 * tensor_2).sum(dim=-1)
        if self.scale_output:
            # TODO: why does AllenNLP multiply here instead of dividing?
            result /= math.sqrt(tensor_1.size(-1))
        return result

Cosine similarity and the dot product are the most commonly used purely mathematical measures. In more complex scenarios, we can bring neural-network ideas into the similarity computation.
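To see why the scaling helps, note that if the components of both vectors are independent with zero mean and unit variance, the dot product of two d-dimensional vectors has variance d, and dividing by sqrt(d) brings it back to about 1. The following sketch checks this empirically; the random inputs, the dimension 256, and the sample count are assumptions chosen only for illustration.

```python
import math

import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

dim = 256
x = torch.randn(10000, dim)  # 10,000 random vectors with unit-variance entries
y = torch.randn(10000, dim)

raw = (x * y).sum(dim=-1)      # plain dot product, as with scale_output=False
scaled = raw / math.sqrt(dim)  # same scaling as scale_output=True above

print(raw.var().item())     # close to dim
print(scaled.var().item())  # close to 1
```

Keeping the scores near unit variance matters mostly when they are fed into a softmax, where large magnitudes push the output toward a one-hot distribution.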
3 BiLinearSimilarity
This similarity function applies a bilinear transformation to the two input vectors, which amounts to adding a linear neural-network layer. It has a weight matrix W and a bias b, and the similarity between two vectors x and y is:
$$x^T W y + b$$
An activation function can be applied to the result; by default, none is applied.
class BiLinearSimilarity(nn.Module):
    def __init__(self, tensor_1_dim, tensor_2_dim, activation=None):
        super(BiLinearSimilarity, self).__init__()
        self.weight_matrix = nn.Parameter(torch.Tensor(tensor_1_dim, tensor_2_dim))
        self.bias = nn.Parameter(torch.Tensor(1))
        self.activation = activation
        self.reset_parameters()

    def reset_parameters(self):
        nn.init.xavier_uniform_(self.weight_matrix)
        self.bias.data.fill_(0)

    def forward(self, tensor_1, tensor_2):
        intermediate = torch.matmul(tensor_1, self.weight_matrix)
        result = (intermediate * tensor_2).sum(dim=-1) + self.bias
        if self.activation is not None:
            result = self.activation(result)
        return result
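One way to build intuition for the bilinear form: with W set to the identity matrix and b set to 0 it reduces to the plain dot product, and with a non-square W the two inputs may have different sizes. The sketch below illustrates both points with a standalone function; `bilinear_score` and all tensors here are hypothetical stand-ins rather than the module's learned parameters.

```python
import torch


def bilinear_score(x, weight, y, bias):
    # x: (..., d1), weight: (d1, d2), y: (..., d2) -> score: (...)
    return (torch.matmul(x, weight) * y).sum(dim=-1) + bias


torch.manual_seed(0)
x = torch.randn(4, 3)
y = torch.randn(4, 3)

# With W = I and b = 0 the bilinear form is exactly the dot product.
identity_score = bilinear_score(x, torch.eye(3), y, torch.zeros(1))
dot_score = (x * y).sum(dim=-1)

# With a non-square W the two inputs can have different dimensions.
weight = torch.randn(3, 5)
z = torch.randn(4, 5)
mixed_score = bilinear_score(x, weight, z, torch.zeros(1))

# The same quantity written as an explicit sum over i and j.
einsum_score = torch.einsum("bi,ij,bj->b", x, weight, z)
```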
Following the same idea, we can derive a trilinear similarity, whose formula is:
$$W^T [x, y, x * y] + b$$
The only change is that the two vectors and their element-wise product x * y are concatenated and used together as the input. Interested readers can implement this themselves.
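For readers who want a starting point, here is a minimal sketch of the trilinear formula. The class name `TriLinearSimilarity`, the requirement that both inputs share the same `input_dim`, and the initialization scheme are all assumptions made for this example; the original post does not specify them.

```python
import math

import torch
import torch.nn as nn


class TriLinearSimilarity(nn.Module):
    """Sketch of w^T [x, y, x * y] + b; assumes both inputs share input_dim."""

    def __init__(self, input_dim, activation=None):
        super().__init__()
        # One weight per entry of the concatenated vector [x, y, x * y].
        self.weight_vector = nn.Parameter(torch.empty(3 * input_dim))
        self.bias = nn.Parameter(torch.zeros(1))
        self.activation = activation
        self.reset_parameters()

    def reset_parameters(self):
        # Uniform init scaled by the input size; this choice is an assumption.
        bound = math.sqrt(6 / (self.weight_vector.size(0) + 1))
        nn.init.uniform_(self.weight_vector, -bound, bound)
        nn.init.zeros_(self.bias)

    def forward(self, tensor_1, tensor_2):
        # x * y is the element-wise product, so all three parts have input_dim.
        combined = torch.cat([tensor_1, tensor_2, tensor_1 * tensor_2], dim=-1)
        result = torch.matmul(combined, self.weight_vector) + self.bias
        if self.activation is not None:
            result = self.activation(result)
        return result


similarity = TriLinearSimilarity(input_dim=8)
scores = similarity(torch.randn(2, 5, 8), torch.randn(2, 5, 8))
print(scores.shape)  # torch.Size([2, 5])
```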
4 MultiHeadedSimilarity
This similarity function borrows the multi-head idea from the Transformer. We project each input tensor, split the projection into several heads, and compute the similarity for each head separately.
class MultiHeadedSimilarity(nn.Module):
    def __init__(self,
                 num_heads,
                 tensor_1_dim,
                 tensor_1_projected_dim=None,
                 tensor_2_dim=None,
                 tensor_2_projected_dim=None,
                 internal_similarity=DotProductSimilarity()):
        super(MultiHeadedSimilarity, self).__init__()
        self.num_heads = num_heads
        self.internal_similarity = internal_similarity
        tensor_1_projected_dim = tensor_1_projected_dim or tensor_1_dim
        tensor_2_dim = tensor_2_dim or tensor_1_dim
        tensor_2_projected_dim = tensor_2_projected_dim or tensor_2_dim
        if tensor_1_projected_dim % num_heads != 0:
            raise ValueError("Projected dimension not divisible by number of heads: %d, %d"
                             % (tensor_1_projected_dim, num_heads))
        if tensor_2_projected_dim % num_heads != 0:
            raise ValueError("Projected dimension not divisible by number of heads: %d, %d"
                             % (tensor_2_projected_dim, num_heads))
        self.tensor_1_projection = nn.Parameter(torch.Tensor(tensor_1_dim, tensor_1_projected_dim))
        self.tensor_2_projection = nn.Parameter(torch.Tensor(tensor_2_dim, tensor_2_projected_dim))
        self.reset_parameters()

    def reset_parameters(self):
        torch.nn.init.xavier_uniform_(self.tensor_1_projection)
        torch.nn.init.xavier_uniform_(self.tensor_2_projection)

    def forward(self, tensor_1, tensor_2):
        projected_tensor_1 = torch.matmul(tensor_1, self.tensor_1_projection)
        projected_tensor_2 = torch.matmul(tensor_2, self.tensor_2_projection)

        # Split the last dimension into (num_heads, dim_per_head).
        last_dim_size = projected_tensor_1.size(-1) // self.num_heads
        new_shape = list(projected_tensor_1.size())[:-1] + [self.num_heads, last_dim_size]
        split_tensor_1 = projected_tensor_1.view(*new_shape)
        last_dim_size = projected_tensor_2.size(-1) // self.num_heads
        new_shape = list(projected_tensor_2.size())[:-1] + [self.num_heads, last_dim_size]
        split_tensor_2 = projected_tensor_2.view(*new_shape)

        # One similarity score per head.
        return self.internal_similarity(split_tensor_1, split_tensor_2)
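The reshaping in `forward` is easier to follow with concrete shapes. The sketch below walks through the same project, split, and compare steps on random data; the sizes and the random matrices `projection_1` and `projection_2` are stand-ins assumed for this example, not the module's trained parameters.

```python
import torch

torch.manual_seed(0)
batch, seq_len, dim, num_heads = 2, 5, 12, 3

tensor_1 = torch.randn(batch, seq_len, dim)
tensor_2 = torch.randn(batch, seq_len, dim)
projection_1 = torch.randn(dim, dim)  # stand-ins for the learned projections
projection_2 = torch.randn(dim, dim)

projected_1 = torch.matmul(tensor_1, projection_1)
projected_2 = torch.matmul(tensor_2, projection_2)

# Split the last dimension into (num_heads, dim // num_heads).
split_1 = projected_1.view(batch, seq_len, num_heads, dim // num_heads)
split_2 = projected_2.view(batch, seq_len, num_heads, dim // num_heads)

# Dot product per head: one score for every (batch, position, head).
per_head = (split_1 * split_2).sum(dim=-1)
print(per_head.shape)  # torch.Size([2, 5, 3])

# Summing over heads recovers the dot product of the full projections.
full_dot = (projected_1 * projected_2).sum(dim=-1)
```

With the dot product as the internal similarity, the heads simply partition the projected dot product into `num_heads` partial sums, so the output gains one trailing dimension of size `num_heads`.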
Summary
The more complex approaches apply additional linear transformations, and combinations of them, on top of the input vectors. In practice we can design our own similarity function to suit the business scenario, since a key advantage of neural networks is that they can be assembled freely.