Mathematical Essays: Notes on the angle between vectors in high dimensional space
2022-06-12 07:54:00 【Espresso Macchiato】
1. Problem description
This story goes back a long time, to when I came across a conclusion mentioned on Su Jianlin's blog:
- Two random vectors in a high-dimensional space are, with high probability, orthogonal to each other.
The conclusion stuck with me at the time. Today it suddenly came to mind again, so I decided to use the holiday to verify it.
Clearly, the statement as given on Su Jianlin's blog is rather informal. To define the problem properly, let's refine it:
- For two random vectors on the unit sphere in $n$-dimensional space, the angle $\theta$ between them tends to 90 degrees as $n$ grows large.
Proving this rigorously seems hard, but demonstrating it numerically is relatively easy. The natural first idea is a Monte Carlo simulation: generate $N$ groups of uniformly distributed unit vectors in $n$-dimensional space, then look at the distribution of the angles between them.
To do that, however, we first need a way to generate uniformly distributed unit vectors in $n$-dimensional space.
2. Uniform vectors in n-dimensional space
1. Special cases: 2- and 3-dimensional space
First, let's look at the simple cases, namely 2- and 3-dimensional space.
1. Uniformly distributed vectors in 2-dimensional space
A uniformly distributed vector in two-dimensional space is just a uniformly distributed point on the unit circle. So we only need to draw an angle $\theta$ uniformly from $0$ to $2\pi$ to obtain a uniformly distributed unit vector $\vec{v} = (\sin\theta, \cos\theta)$.
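The two-dimensional case can be sketched in a few lines of NumPy. The function name `sample_circle` and its parameters are made up for this illustration; the angle is drawn from the full range $[0, 2\pi)$ so that the whole circle is covered.

```python
import numpy as np

def sample_circle(size, rng=None):
    """Draw `size` points uniformly from the unit circle (illustrative helper)."""
    rng = np.random.default_rng() if rng is None else rng
    # A uniform angle on [0, 2*pi) gives a uniform point on the circle.
    theta = rng.uniform(0.0, 2.0 * np.pi, size=size)
    return np.stack([np.sin(theta), np.cos(theta)], axis=1)

points = sample_circle(5, rng=np.random.default_rng(0))
print(points.shape)  # (5, 2)
```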
2. Uniformly distributed vectors in 3-dimensional space
For the three-dimensional case, most readers who are familiar with coordinate transformations can easily write down the solution.
The spherical coordinates are as follows:
$$
\left\{
\begin{aligned}
x & = r \cdot \sin\theta \cdot \sin\phi \\
y & = r \cdot \sin\theta \cdot \cos\phi \\
z & = r \cdot \cos\theta
\end{aligned}
\right.
$$
From this we obtain the expression for the volume element:
$$
\begin{aligned}
\rho & = dx\,dy\,dz \\
& = r^2 \sin\theta \, dr \, d\theta \, d\phi
\end{aligned}
$$
So for an area element on the unit sphere, the expression is $\rho = C \cdot \sin\theta \, d\theta \, d\phi = C' \cdot d\cos\theta \cdot d\phi$. Therefore, to generate a uniform distribution, we only need to draw $\cos\theta$ uniformly from $[-1, 1]$ to obtain $\theta$, and then draw $\phi$ uniformly from $0$ to $2\pi$.
A concrete Python implementation is as follows:
    import numpy as np

    def dummy():
        # cos(theta) is uniform on [-1, 1], so theta = arccos(u)
        theta = np.arccos(np.random.uniform(-1, 1))
        # phi is uniform on [0, 2*pi)
        phi = np.random.uniform() * 2 * np.pi
        x = np.sin(theta) * np.sin(phi)
        y = np.sin(theta) * np.cos(phi)
        z = np.cos(theta)
        return (x, y, z)
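For larger experiments it is convenient to draw many points at once. The following is a vectorized sketch of the same recipe; the name `sample_sphere_3d` and the use of `np.random.default_rng` are choices made for this example, not part of the original snippet.

```python
import numpy as np

def sample_sphere_3d(size, rng=None):
    """Draw `size` points uniformly from the unit sphere in 3D (illustrative helper)."""
    rng = np.random.default_rng() if rng is None else rng
    # cos(theta) uniform on [-1, 1], phi uniform on [0, 2*pi)
    theta = np.arccos(rng.uniform(-1.0, 1.0, size=size))
    phi = rng.uniform(0.0, 2.0 * np.pi, size=size)
    x = np.sin(theta) * np.sin(phi)
    y = np.sin(theta) * np.cos(phi)
    z = np.cos(theta)
    return np.stack([x, y, z], axis=1)

points = sample_sphere_3d(100_000, rng=np.random.default_rng(0))
# Every row should have unit length, and by symmetry the sample mean
# of each coordinate should be close to zero.
print(np.abs(np.linalg.norm(points, axis=1) - 1.0).max())
print(points.mean(axis=0))
```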
2. Uniform vectors in an n-dimensional coordinate system
Now let's look at the case of $n$-dimensional space.
Following the 3-dimensional case, we first write the volume element in polar coordinates and then examine how it depends on the angles.
The polar coordinate transformation in $n$-dimensional space is as follows:
$$
\left\{
\begin{aligned}
& x_1 = r \cdot \cos\theta_1 \\
& x_2 = r \cdot \sin\theta_1 \cdot \cos\theta_2 \\
& x_3 = r \cdot \sin\theta_1 \cdot \sin\theta_2 \cdot \cos\theta_3 \\
& \dots \\
& x_{n-1} = r \cdot \sin\theta_1 \cdot \sin\theta_2 \cdot \dots \cdot \sin\theta_{n-2} \cdot \cos\theta_{n-1} \\
& x_n = r \cdot \sin\theta_1 \cdot \sin\theta_2 \cdot \dots \cdot \sin\theta_{n-2} \cdot \sin\theta_{n-1}
\end{aligned}
\right.
$$
From this we obtain the volume element in $n$-dimensional space:
$$
\begin{aligned}
dx_1 dx_2 \dots dx_n & = \frac{\partial(x_1, x_2, \dots, x_n)}{\partial(r, \theta_1, \theta_2, \dots, \theta_{n-1})} \cdot dr \, d\theta_1 d\theta_2 \dots d\theta_{n-1} \\
\\
& = \begin{vmatrix}
\frac{\partial x_1}{\partial r} & \frac{\partial x_1}{\partial \theta_1} & \dots & \frac{\partial x_1}{\partial \theta_{n-1}} \\
\frac{\partial x_2}{\partial r} & \frac{\partial x_2}{\partial \theta_1} & \dots & \frac{\partial x_2}{\partial \theta_{n-1}} \\
\dots \\
\frac{\partial x_n}{\partial r} & \frac{\partial x_n}{\partial \theta_1} & \dots & \frac{\partial x_n}{\partial \theta_{n-1}}
\end{vmatrix} \cdot dr \, d\theta_1 d\theta_2 \dots d\theta_{n-1} \\
\\
& = r^{n-1} \sin^{n-2}\theta_1 \sin^{n-3}\theta_2 \dots \sin^2\theta_{n-3} \sin\theta_{n-2} \cdot dr \, d\theta_1 d\theta_2 \dots d\theta_{n-1}
\end{aligned}
$$
Thus, we only need each $\theta_i$ to be distributed with density proportional to $\sin^{n-1-i}\theta_i$, so that $\sin^{n-1-i}\theta_i \, d\theta_i$ is uniform. We then obtain a random vector whose direction is uniformly distributed over the whole space.
Of course, this is not an easy thing to do.
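To give a sense of what sampling the angles directly would involve, here is a minimal sketch based on rejection sampling. This is a generic technique swapped in for illustration, not something the post itself uses. It relies on $\sin^{k}\theta \le 1$ on $[0, \pi]$. The names `sample_angle` and `sample_sphere_by_angles` are hypothetical. The acceptance rate falls roughly like $1/\sqrt{k}$ as the exponent grows, which is one reason this route is unattractive in high dimensions.

```python
import numpy as np

def sample_angle(k, rng):
    """Draw theta in [0, pi] with density proportional to sin(theta)**k."""
    while True:
        theta = rng.uniform(0.0, np.pi)
        # sin(theta)**k lies in [0, 1], so it serves as the acceptance probability.
        if rng.uniform() < np.sin(theta) ** k:
            return theta

def sample_sphere_by_angles(n, rng=None):
    """Build one unit vector in n dimensions from its n-1 polar angles."""
    rng = np.random.default_rng() if rng is None else rng
    # theta_i for i = 1..n-2 uses the exponent n-1-i; theta_{n-1} is uniform on [0, 2*pi).
    thetas = [sample_angle(n - 1 - i, rng) for i in range(1, n - 1)]
    thetas.append(rng.uniform(0.0, 2.0 * np.pi))
    x = np.empty(n)
    sin_prod = 1.0
    for i, t in enumerate(thetas):
        x[i] = sin_prod * np.cos(t)   # product of earlier sines times cos(theta_{i+1})
        sin_prod *= np.sin(t)
    x[n - 1] = sin_prod               # the last coordinate is the product of all sines
    return x

vec = sample_sphere_by_angles(5, rng=np.random.default_rng(0))
print(np.linalg.norm(vec))  # 1 up to rounding
```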
Alternatively, if each $x_i$ could be drawn uniformly from $(-\infty, \infty)$, the resulting random vector would in principle also be uniform in direction, but such a distribution cannot be realized in practice.
3. A clever use of the normal distribution
Here we use a trick: although we cannot generate a uniform distribution over $(-\infty, \infty)$, we can settle for the next best thing. If each component of an $n$-dimensional vector independently follows the standard normal distribution $N(0, 1)$, then the direction of the resulting random vector is also uniformly distributed in $n$-dimensional space.
Its probability density in a volume element of $n$-dimensional space is:
$$
\begin{aligned}
\rho & = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} e^{-\frac{x_i^2}{2}} \\
& = \left(\frac{1}{\sqrt{2\pi}}\right)^n \cdot e^{-\sum_{i=1}^{n} x_i^2 / 2} \\
& = \left(\frac{1}{\sqrt{2\pi}}\right)^n \cdot e^{-r^2 / 2}
\end{aligned}
$$
As we can see, the probability density in a volume element depends only on the radial distance $r$ and not on the angles. Therefore, an $n$-dimensional vector constructed this way has a uniformly distributed direction in $n$-dimensional space.
Normalizing it then gives a uniformly distributed unit vector in $n$-dimensional space.
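The construction above (draw independent $N(0, 1)$ components, then normalize) can be sketched as follows. The function name `random_unit_vectors` and its parameters are illustrative choices, not taken from the original post.

```python
import numpy as np

def random_unit_vectors(count, n, rng=None):
    """Return `count` unit vectors in n dimensions, uniform in direction."""
    rng = np.random.default_rng() if rng is None else rng
    # Independent standard normal components give a rotation-invariant density.
    v = rng.standard_normal(size=(count, n))
    # Dividing by the Euclidean norm projects each row onto the unit sphere.
    return v / np.linalg.norm(v, axis=1, keepdims=True)

vectors = random_unit_vectors(4, 10, rng=np.random.default_rng(0))
print(np.linalg.norm(vectors, axis=1))  # all entries are 1 up to rounding
```

A zero vector would make the normalization fail, but it occurs with probability zero for continuous normal samples, so this sketch does not guard against it.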
We verify the effectiveness of the above method in two-dimensional and three-dimensional space as follows :
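A numerical version of this check might look like the sketch below. It assumes the normalized-Gaussian sampler described above, here re-implemented as a hypothetical helper `random_unit_vectors`. It uses two facts from the earlier sections: in 2D the polar angle should be uniform, and in 3D $\cos\theta$ (the $z$ coordinate) should be uniform on $[-1, 1]$.

```python
import numpy as np

def random_unit_vectors(count, n, rng):
    """Normalized Gaussian samples: unit vectors with uniform direction."""
    v = rng.standard_normal(size=(count, n))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

rng = np.random.default_rng(0)

# 2D: the polar angle should be uniform on (-pi, pi].
v2 = random_unit_vectors(100_000, 2, rng)
angles = np.arctan2(v2[:, 1], v2[:, 0])
hist2, _ = np.histogram(angles, bins=10, range=(-np.pi, np.pi))
frac2 = hist2 / len(angles)
print(frac2)  # each of the 10 bins should hold roughly 10% of the samples

# 3D: the z coordinate (cos theta) should be uniform on [-1, 1].
v3 = random_unit_vectors(100_000, 3, rng)
hist3, _ = np.histogram(v3[:, 2], bins=10, range=(-1.0, 1.0))
frac3 = hist3 / len(v3)
print(frac3)  # again roughly 10% per bin
```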

3. The angle between two vectors in n-dimensional space
To sum up, we can now randomly generate unit vectors in $n$-dimensional space.
We can therefore use Monte Carlo sampling to examine how the angle between two random vectors changes with the dimension $n$.
The results are as follows:

We can see that:
- For any dimension $n$, the mean of the angle $\theta$ between two random vectors is 90 degrees;
- As the dimension $n$ increases, the standard deviation of the angle $\theta$ gradually decreases and eventually converges to 0. That is, in high-dimensional space, the angle between two random vectors will be very close to 90 degrees.
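The experiment can be reproduced with a short Monte Carlo sketch along the following lines. The function name `angle_stats`, the sample size, and the list of dimensions are assumptions made for this example, not the settings used for the original figure.

```python
import numpy as np

def angle_stats(n, num_pairs=2000, rng=None):
    """Mean and standard deviation (in degrees) of the angle between random pairs."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(size=(num_pairs, n))
    v = rng.standard_normal(size=(num_pairs, n))
    # The angle depends only on direction, so explicit normalization
    # is folded into the cosine formula.
    cos = np.sum(u * v, axis=1) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
    )
    # Clip to guard against rounding slightly outside [-1, 1].
    theta = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return theta.mean(), theta.std()

rng = np.random.default_rng(0)
for n in (2, 3, 10, 100, 1000):
    mean, std = angle_stats(n, rng=rng)
    print(f"n={n:5d}  mean={mean:7.2f} deg  std={std:6.2f} deg")
```

As a rough guide to what to expect: by symmetry, $\cos\theta$ between two independent uniform unit vectors has mean $0$ and variance $1/n$, so for large $n$ the standard deviation of $\theta$ is approximately $1/\sqrt{n}$ radians.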
4. Summary and reflections
In summary, the conclusion mentioned on Su Jianlin's blog has been verified numerically, though not rigorously proved. It is indeed a very interesting result.