Singular Value Decomposition (SVD)
Matrix decomposition, also known as matrix factorization, involves describing a given matrix using its constituent elements. Perhaps the best-known and most widely used matrix decomposition method is the Singular-Value Decomposition (SVD). Every matrix has an SVD, which makes it more stable than other methods, such as the eigendecomposition. In this tutorial, you will discover the Singular-Value Decomposition method for decomposing a matrix into its constituent elements.
After completing this tutorial , you will know:
- What Singular-value decomposition is and what is involved.
- How to calculate an SVD and reconstruct a rectangular and square matrix from SVD elements.
- How to calculate the pseudoinverse and perform dimensionality reduction using the SVD.
1.1 Tutorial Overview
This tutorial is divided into 5 parts; they are:
- What is the Singular-Value Decomposition
- Calculate Singular-Value Decomposition
- Reconstruct Matrix
- Pseudoinverse
- Dimensionality Reduction
1.2 What is the Singular-Value Decomposition
The SVD is a matrix decomposition method for reducing a matrix to its constituent parts in order to make certain subsequent matrix calculations simpler. For simplicity, we will focus on the SVD for real-valued matrices and ignore the case for complex numbers.

A = U · Σ · V^T

Where:

- A is the real m × n matrix that we wish to decompose
- U is an m × m matrix
- Σ (represented by the uppercase Greek letter sigma) is an m × n diagonal matrix
- V^T is the transpose of an n × n matrix, where T is a superscript
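Because U and V are orthogonal matrices, multiplying each by its own transpose gives the identity matrix. The short check below is a minimal sketch (assuming NumPy and SciPy are installed, and using an arbitrary 3 × 2 matrix), not part of the original tutorial:

# Sketch: check that the SVD factors are orthonormal (assumes NumPy/SciPy)
from numpy import array, eye, allclose
from scipy.linalg import svd
A = array([
[1, 2],
[3, 4],
[5, 6]])
U, s, VT = svd(A)
# U is m x m and V^T is n x n; both should be orthonormal
print(allclose(U.T.dot(U), eye(3)))   # expect True
print(allclose(VT.dot(VT.T), eye(2))) # expect True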
The SVD is used widely both in the calculation of other matrix operations, such as the matrix inverse, and as a data reduction method in machine learning. The SVD can also be used in least squares linear regression, image compression, and denoising data.
1.3 Calculate Singular-Value Decomposition
The SVD can be calculated by calling the svd() function. The function takes a matrix and returns the U, Σ and V^T elements. The Σ diagonal matrix is returned as a vector of singular values. The V matrix is returned in a transposed form, e.g. V^T. The example below defines a 3 × 2 matrix and calculates the singular-value decomposition.
# Example of calculating a singular-value decomposition
# singular-value decomposition
from numpy import array
from scipy.linalg import svd
# define a matrix
A = array([
[1, 2],
[3, 4],
[5, 6]
])
print(A)
# factorize
U,s,V = svd(A)
print(U)
print(s)
print(V)

Running the example first prints the defined 3 × 2 matrix, then the 3 × 3 U matrix, the 2-element Σ vector, and the 2 × 2 V^T matrix elements calculated from the decomposition.
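Rather than reproducing the numeric output here, a quick way to confirm those sizes is to print the shape of each returned element (a minimal sketch using the same 3 × 2 matrix as above):

# Sketch: confirm the sizes of the returned SVD elements (assumes NumPy/SciPy)
from numpy import array
from scipy.linalg import svd
A = array([
[1, 2],
[3, 4],
[5, 6]])
U, s, V = svd(A)
print(U.shape) # (3, 3), i.e. m x m
print(s.shape) # (2,), the singular values as a vector
print(V.shape) # (2, 2), i.e. n x n, already transposed (V^T)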

1.4 Reconstruct Matrix
The original matrix can be reconstructed from the U, Σ, and V^T elements. The U, s, and V elements returned from svd() cannot be multiplied directly. The s vector must be converted into a diagonal matrix using the diag() function. By default, this function will create a square matrix that is n × n, relative to our original matrix. This causes a problem as the sizes of the matrices do not fit the rules of matrix multiplication, where the number of columns in a matrix must match the number of rows in the subsequent matrix. After creating the square Σ diagonal matrix, the sizes of the matrices relative to the original m × n matrix that we are decomposing are as follows:

U (m × m) · Σ (n × n) · V^T (n × n)

Where, in fact, we require:

U (m × m) · Σ (m × n) · V^T (n × n)

We can achieve this by creating a new Σ matrix of all zero values that is m × n (e.g. more rows) and populating the first n × n part of the matrix with the square diagonal matrix calculated via diag().
# reconstruct rectangular matrix from svd
from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd
# define matrix
A = array([
[1, 2],
[3, 4],
[5, 6]
])
print(A)
# factorize
U,s,V = svd(A)
#create m x n Sigma matrix
Sigma = zeros((A.shape[0], A.shape[1]))
# populate Sigma with n x n diagonal matrix
Sigma[:A.shape[1],:A.shape[1]] = diag(s)
# reconstruct matrix
B = U.dot(Sigma.dot(V))
print(B)

Running the example first prints the original matrix, then the matrix reconstructed from the SVD elements.
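To confirm the reconstruction numerically rather than by eye, the elements of B can be compared to A; the check below is a minimal sketch (not part of the original listing) and should print True up to floating-point tolerance:

# Sketch: verify the rectangular reconstruction numerically (assumes NumPy/SciPy)
from numpy import array, diag, zeros, allclose
from scipy.linalg import svd
A = array([
[1, 2],
[3, 4],
[5, 6]])
U, s, V = svd(A)
Sigma = zeros((A.shape[0], A.shape[1]))
Sigma[:A.shape[1], :A.shape[1]] = diag(s)
B = U.dot(Sigma.dot(V))
# True if every element of B matches A within a small tolerance
print(allclose(A, B))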

The above complication with the Σ diagonal only exists when m and n are not equal. The diagonal matrix can be used directly when reconstructing a square matrix, as follows.
# reconstruct square matrix from svd
from numpy import array
from numpy import diag
from scipy.linalg import svd
# define matrix
A = array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
print(A)
# factorize
U,s,V = svd(A)
# create n x n Sigma matrix
Sigma = diag(s)
# reconstruct matrix
B = U.dot(Sigma.dot(V))
print(B)

Running the example prints the original 3 × 3 matrix and the version reconstructed directly from the SVD elements.
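Note that NumPy's own numpy.linalg.svd() function returns the same U, s and V^T elements, so the square reconstruction can also be written as follows (an optional alternative sketch, assuming only NumPy):

# Sketch: the same square reconstruction using numpy.linalg.svd (assumes NumPy)
from numpy import array, diag
from numpy.linalg import svd
A = array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
U, s, V = svd(A)  # V is returned as V^T, as with scipy.linalg.svd
B = U.dot(diag(s).dot(V))
print(B)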

1.5 Pseudoinverse
The pseudoinverse is the generalization of the matrix inverse for square matrices to rectangular matrices, where the numbers of rows and columns are not equal. It is also called the Moore-Penrose Inverse, after two independent discoverers of the method, or the Generalized Inverse.

The pseudoinverse is denoted as A^+ and can be calculated via the singular-value decomposition of A:

A^+ = V · D^+ · U^T

Where D^+ is the pseudoinverse of the diagonal matrix Σ, obtained by taking the reciprocal of each non-zero singular value on the diagonal and transposing the result so that it is n × m. NumPy provides the pinv() function for calculating the pseudoinverse of a rectangular matrix directly. The example below defines a 4 × 2 matrix and calculates its pseudoinverse.
# Example of calculating the pseudoinverse
# pseudoinverse
from numpy import array
from numpy.linalg import pinv
# define matrix
A = array([
[0.1, 0.2],
[0.3, 0.4],
[0.5, 0.6],
[0.7, 0.8]])
print(A)
# calculate pseudoinverse
B = pinv(A)
print(B)
Running the example first prints the defined 4 × 2 matrix, followed by its pseudoinverse calculated with pinv(). We can also calculate the pseudoinverse manually via the SVD and compare the result.

# pseudoinverse via svd
from numpy import array
from numpy.linalg import svd
from numpy import zeros
from numpy import diag
# define matrix
A = array([
[0.1, 0.2],
[0.3, 0.4],
[0.5, 0.6],
[0.7, 0.8]
])
print(A)
# factorize
U,s,V = svd(A)
# reciprocals of s
d = 1.0 / s
# create m x n D matrix
D = zeros(A.shape)
# populate D with n x n diagonal matrix
D[:A.shape[1],: A.shape[1]] = diag(d)
# calculate pseudoinverse
B = V.T.dot(D.T).dot(U.T)
print(B)

Running the example first prints the defined rectangular matrix and the pseudoinverse that matches the above results from the pinv() function.
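One use of the pseudoinverse mentioned earlier is least squares linear regression. The sketch below is illustrative only (the target vector b is made up for the example) and solves the overdetermined system A · x ≈ b via x = A^+ · b:

# Sketch: least squares via the pseudoinverse (the vector b is a made-up example)
from numpy import array
from numpy.linalg import pinv
A = array([
[0.1, 0.2],
[0.3, 0.4],
[0.5, 0.6],
[0.7, 0.8]])
b = array([1.0, 2.0, 3.0, 4.0])
# x minimises the squared error ||A.dot(x) - b||^2
x = pinv(A).dot(b)
print(x)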

1.6 Dimensionality Reduction
A popular application of SVD is for dimensionality reduction. Data with a large number of features, such as more features (columns) than observations (rows), may be reduced to a smaller subset of features that are most relevant to the prediction problem. The result is a matrix with a lower rank that is said to approximate the original matrix. To do this we can perform an SVD operation on the original data and select the top k largest singular values in Σ. These columns can be selected from Σ and the rows selected from V^T. An approximation B of the original matrix A can then be reconstructed.

B = U · Σ_k · V^T_k
In natural language processing, this approach can be used on matrices of word occurrences or word frequencies in documents and is called Latent Semantic Analysis or Latent Semantic Indexing. In practice, we can retain and work with a descriptive subset of the data called T. This is a dense summary of the matrix or a projection.

T = U · Σ_k

Further, this transform can be calculated and applied to the original matrix A, as well as to other similar matrices.

T = A · V_k

Where V_k contains the first k columns of V (equivalently, the transpose of the first k rows of V^T).
The example below demonstrates data reduction with the SVD. First a 3 × 10 matrix is defined, with more columns than rows. The SVD is calculated and only the first two features are selected. The elements are recombined to give an accurate reproduction of the original matrix. Finally, the transform is calculated in two different ways.
# data reduction with svd
from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd
# define matrix
A = array([
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
])
print(A)
# factorize
U,s,V = svd(A)
# create m x n Sigma matrix
Sigma = zeros((A.shape[0],A.shape[1]))
# populate Sigma with m x m diagonal matrix
Sigma[:A.shape[0],:A.shape[0]] = diag(s)
# select
n_elements = 2
Sigma = Sigma[:,:n_elements]
V = V[:n_elements,:]
# reconstruct
B = U.dot(Sigma.dot(V))
print(B)
# transform
T = U.dot(Sigma)
print(T)
T = A.dot(V.T)
print(T)
Running the example first prints the defined matrix, then the reconstructed approximation, followed by two equivalent transforms of the original matrix.

The scikit-learn library provides a TruncatedSVD class that implements this capability directly. When creating the TruncatedSVD class, you must specify the desired number of features or components to select, e.g. 2. Once created, you can fit the transform (e.g. calculate V^T_k) by calling the fit() function, then apply it to the original matrix by calling the transform() function. The result is the transform of A called T above. The example below demonstrates the TruncatedSVD class.
# svd data reduction in scikit-learn
from numpy import array
from sklearn.decomposition import TruncatedSVD
# define matrix
A = array([
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
])
print(A)
# create transform
svd = TruncatedSVD(n_components=2)
# fit transform
svd.fit(A)
# apply transform
result = svd.transform(A)
print(result)

Running the example first prints the defined matrix, followed by the transformed version of the matrix. We can see that the values match those calculated manually above, except for the sign on some values. We can expect there to be some instability when it comes to the sign given the nature of the calculations involved and the differences in the underlying libraries and methods used. This instability of sign should not be a problem in practice as long as the transform is trained for reuse.
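If it is useful to inspect the fitted transform itself, scikit-learn exposes the selected components and the variance they explain. The sketch below (an optional addition, not part of the original tutorial) prints these attributes, which correspond to V^T_k and to the contribution of each retained component:

# Sketch: inspect a fitted TruncatedSVD transform (assumes scikit-learn is installed)
from numpy import array
from sklearn.decomposition import TruncatedSVD
A = array([
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])
svd = TruncatedSVD(n_components=2)
svd.fit(A)
print(svd.components_)               # the retained rows of V^T (2 x 10)
print(svd.singular_values_)          # the top 2 singular values
print(svd.explained_variance_ratio_) # share of variance captured by each component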
