Bert whitening vector dimension reduction and its application
2022-06-24 14:25:00 【loong_ XL】
References:
https://kexue.fm/archives/8069
https://kexue.fm/archives/9079
https://zhuanlan.zhihu.com/p/531476789
Input: vv, a three-dimensional array of vectors; vv[0] is a two-dimensional matrix of shape [num_samples, embedding_size].
Result: v_data1, the same vectors whitened and reduced to 256 dimensions.
import numpy as np

def compute_kernel_bias(vecs, n_components=256):
    """Compute the whitening kernel and bias.
    vecs.shape = [num_samples, embedding_size]
    Final transformation: y = (x + bias).dot(kernel)
    """
    mu = vecs.mean(axis=0, keepdims=True)
    cov = np.cov(vecs.T)
    u, s, vh = np.linalg.svd(cov)
    W = np.dot(u, np.diag(1 / np.sqrt(s)))
    # Keep only the first n_components columns to reduce the dimension
    return W[:, :n_components], -mu

def transform_and_normalize(vecs, kernel=None, bias=None):
    """Apply the whitening transformation, then L2-normalize each row."""
    if not (kernel is None or bias is None):
        vecs = (vecs + bias).dot(kernel)
    return vecs / (vecs**2).sum(axis=1, keepdims=True)**0.5

# vv[0] is a 2-D matrix made up of many vectors.
# Passing a single 1-D vector here raises an error in the matrix computation.
v_data = np.array(vv[0])
kernel, bias = compute_kernel_bias(v_data)
# print(kernel, bias)
v_data1 = transform_and_normalize(v_data, kernel=kernel, bias=bias)
*** For a single vector at serving time (online), reuse the kernel and bias that were computed on the whole dataset and call transform_and_normalize(v_data, kernel=kernel, bias=bias) directly.
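The note above can be sketched as follows. This is only an illustration under stated assumptions: the corpus is random data, the sizes (32 dimensions reduced to 8) are chosen to keep the example small, and the names corpus, query and query_2d are hypothetical. It shows a 1-D vector being promoted to a one-row matrix before the stored kernel and bias are applied.

```python
import numpy as np

def compute_kernel_bias(vecs, n_components):
    """Whitening kernel and bias, as defined in the post."""
    mu = vecs.mean(axis=0, keepdims=True)
    cov = np.cov(vecs.T)
    u, s, vh = np.linalg.svd(cov)
    W = np.dot(u, np.diag(1 / np.sqrt(s)))
    return W[:, :n_components], -mu

def transform_and_normalize(vecs, kernel=None, bias=None):
    """Apply the transformation, then L2-normalize each row."""
    if not (kernel is None or bias is None):
        vecs = (vecs + bias).dot(kernel)
    return vecs / (vecs**2).sum(axis=1, keepdims=True)**0.5

rng = np.random.default_rng(0)

# Offline: fit kernel and bias once on the whole (here: random, hypothetical) corpus.
corpus = rng.random((1000, 32))
kernel, bias = compute_kernel_bias(corpus, n_components=8)

# Online: a single embedding arrives as a 1-D array.
query = rng.random(32)
query_2d = np.atleast_2d(query)  # shape (1, 32), so axis=1 exists

reduced = transform_and_normalize(query_2d, kernel=kernel, bias=bias)
print(reduced.shape)
```

The kernel and bias are never recomputed for the query; only the stored values from the offline step are used.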
# Complete example: whiten random 768-dimensional vectors and keep 64 dimensions
import numpy as np

data = np.random.rand(5, 768)
print('data.shape = ')
print(data.shape, data)

def compute_kernel_bias(vecs):
    """Compute the whitening kernel and bias.
    vecs.shape = [num_samples, embedding_size]
    Final transformation: y = (x + bias).dot(kernel)
    """
    mu = vecs.mean(axis=0, keepdims=True)
    cov = np.cov(vecs.T)
    u, s, vh = np.linalg.svd(cov)
    W = np.dot(u, np.diag(1 / np.sqrt(s)))
    return W, -mu

def transform_and_normalize(vecs, kernel=None, bias=None):
    """Apply the transformation, then L2-normalize each row."""
    if not (kernel is None or bias is None):
        vecs = (vecs + bias).dot(kernel)
    return vecs / (vecs**2).sum(axis=1, keepdims=True)**0.5

kernel, bias = compute_kernel_bias(data)
kernel = kernel[:, :64]  # keep the first 64 components
print('kernel.shape = ')
print(kernel.shape)
print('bias.shape = ')
print(bias.shape)

data = transform_and_normalize(data, kernel, bias)
print('data.shape = ')
print(data.shape, data)
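One caveat about the toy data: the covariance matrix of n samples has rank at most n - 1, so 5 random vectors cannot support 64 meaningful whitened directions. The example is best read as a shape demonstration. The short check below is a sketch on random data with the same shape (the names few, cov, s and rank are illustrative, not from the original post). It prints the numerical rank and the leading singular values so the drop after the fourth one is visible.

```python
import numpy as np

rng = np.random.default_rng(0)
few = rng.random((5, 768))   # same shape as the toy data: 5 samples, 768 dims

cov = np.cov(few.T)          # 768 x 768 covariance matrix
s = np.linalg.svd(cov, compute_uv=False)
rank = np.linalg.matrix_rank(cov)

print('numerical rank of cov:', rank)
print('first six singular values:', s[:6])
```

In practice the kernel and bias would be estimated from many more sentences than the number of dimensions kept, since 1 / np.sqrt(s) becomes very large for near-zero singular values.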

Dimensionality reduction for a single vector online
# Reuse the kernel and bias computed above; the input must be 2-D, shape (1, 768)
data1 = np.random.rand(1, 768)
data1_1 = transform_and_normalize(data1, kernel, bias)
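One way to sanity-check the whitening step is to confirm that, with more samples than dimensions, the transformed vectors have an identity covariance matrix before the final normalization. The sketch below assumes synthetic correlated data (2000 samples, 16 dimensions) and uses illustrative names such as mixing, whitened and reduced. It does not reproduce any result from the original post.

```python
import numpy as np

def compute_kernel_bias(vecs):
    """Whitening kernel and bias, as defined in the post."""
    mu = vecs.mean(axis=0, keepdims=True)
    cov = np.cov(vecs.T)
    u, s, vh = np.linalg.svd(cov)
    W = np.dot(u, np.diag(1 / np.sqrt(s)))
    return W, -mu

rng = np.random.default_rng(0)

# Synthetic correlated data: independent noise passed through a mixing matrix.
mixing = np.eye(16) + 0.3 * rng.random((16, 16))
x = rng.normal(size=(2000, 16)).dot(mixing)

kernel, bias = compute_kernel_bias(x)

# Whitening only (no L2 normalization): the covariance should be close to I.
whitened = (x + bias).dot(kernel)
cov_whitened = np.cov(whitened.T)
print(np.allclose(cov_whitened, np.eye(16), atol=1e-6))

# Keeping the first 4 columns gives 4 uncorrelated, unit-variance features.
reduced = (x + bias).dot(kernel[:, :4])
cov_reduced = np.cov(reduced.T)
print(np.allclose(cov_reduced, np.eye(4), atol=1e-6))
```

The row-wise normalization in transform_and_normalize is applied after this step, so the identity-covariance property holds for the whitened vectors, not for the normalized output.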
