Johnson–Lindenstrauss Lemma
2022-07-04 02:14:00 【FakeOccupational】
Johnson–Lindenstrauss lemma

Lemma: Given $\varepsilon > 0$, the squared norm of a random vector converges to 1 exponentially fast in the dimension $n$. Let $x \in \mathbb{R}^n$ be a random vector whose coordinates are each sampled independently from $N(0, \frac{1}{n})$. Then
$$P(|\Vert x\Vert^2 - 1| \geq \varepsilon) \leq 2\exp\left(-\frac{\varepsilon^2 n}{8}\right)$$
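The concentration claim can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names (`squared_norm_sample`, `empirical_tail`, `lemma_bound`) and the chosen values of $n$, $\varepsilon$ and the trial count are illustrative assumptions, not part of the original post.

```python
import math
import random

def squared_norm_sample(n, rng):
    """Draw x in R^n with i.i.d. N(0, 1/n) coordinates and return ||x||^2."""
    sigma = 1.0 / math.sqrt(n)  # standard deviation, so the variance is 1/n
    return sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n))

def empirical_tail(n, eps, trials, seed=0):
    """Fraction of trials in which | ||x||^2 - 1 | >= eps."""
    rng = random.Random(seed)
    hits = sum(abs(squared_norm_sample(n, rng) - 1.0) >= eps for _ in range(trials))
    return hits / trials

def lemma_bound(n, eps):
    """Right-hand side of the lemma: 2 * exp(-eps^2 * n / 8)."""
    return 2.0 * math.exp(-eps * eps * n / 8.0)

if __name__ == "__main__":
    # Compare the observed tail frequency with the bound as n grows.
    for n in (50, 200, 800):
        print(n, empirical_tail(n, 0.3, 500), lemma_bound(n, 0.3))
```

As $n$ grows, both the observed frequency and the bound should shrink, with the bound typically far from tight.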
Lemma: Two random vectors $x_1, x_2$ sampled in the same way are approximately orthogonal.
$$P(|\langle x_1, x_2\rangle| \geq \varepsilon) \leq 4\exp\left(-\frac{\varepsilon^2 n}{8}\right)$$
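Near-orthogonality is also easy to observe empirically. This sketch, with hypothetical helper names and arbitrary sample sizes, draws pairs of vectors with i.i.d. $N(0, \frac{1}{n})$ coordinates and reports the largest absolute inner product seen.

```python
import math
import random

def random_vector(n, rng):
    """Vector in R^n with i.i.d. N(0, 1/n) coordinates."""
    sigma = 1.0 / math.sqrt(n)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def inner(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def max_abs_inner(n, pairs, seed=0):
    """Largest |<x1, x2>| over several independently sampled pairs."""
    rng = random.Random(seed)
    return max(
        abs(inner(random_vector(n, rng), random_vector(n, rng)))
        for _ in range(pairs)
    )

if __name__ == "__main__":
    # The inner product has standard deviation 1/sqrt(n), so it shrinks with n.
    for n in (10, 100, 1000):
        print(n, max_abs_inner(n, 50))
```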
Johnson–Lindenstrauss Lemma
Given $\varepsilon > 0$ and points $v_i \in \mathbb{R}^m$ $(i = 1, \dots, N)$, sample a random matrix $A \in \mathbb{R}^{n \times m}$ as above (each entry drawn from $N(0, \frac{1}{n})$) with $n > \frac{24\log N}{\varepsilon^2}$. Then, with high probability, for all pairs $i, j$:
$$(1-\varepsilon)\Vert v_i - v_j\Vert^2 \leq \Vert Av_i - A v_j\Vert^2 \leq (1+\varepsilon)\Vert v_i - v_j\Vert^2$$
The "with high probability" part follows from applying the norm lemma to each difference $v_i - v_j$ and taking a union bound over all pairs.
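To make the statement concrete, here is a small sketch of a Gaussian random projection in plain Python. The helper names, the test points (uniform in $[-1, 1]^m$) and the sizes are assumptions chosen for illustration; only the sampling distribution $N(0, \frac{1}{n})$ and the bound on $n$ come from the lemma above.

```python
import math
import random

def min_target_dim(num_points, eps):
    """Smallest integer n with n > 24 * log(N) / eps^2."""
    return math.floor(24.0 * math.log(num_points) / eps ** 2) + 1

def sample_matrix(n, m, rng):
    """n x m matrix with i.i.d. N(0, 1/n) entries."""
    sigma = 1.0 / math.sqrt(n)
    return [[rng.gauss(0.0, sigma) for _ in range(m)] for _ in range(n)]

def project(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def distortion_range(points, A):
    """Min and max of ||A vi - A vj||^2 / ||vi - vj||^2 over all pairs."""
    projected = [project(A, p) for p in points]
    ratios = [
        sq_dist(projected[i], projected[j]) / sq_dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]
    return min(ratios), max(ratios)

if __name__ == "__main__":
    rng = random.Random(0)
    m, num_points, eps = 300, 8, 0.5
    points = [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(num_points)]
    n = min_target_dim(num_points, eps)
    A = sample_matrix(n, m, rng)
    # Both ratios are expected to fall inside [1 - eps, 1 + eps].
    print(n, distortion_range(points, A))
```

Note that the required $n$ depends only on $N$ and $\varepsilon$, not on the original dimension $m$, which is what makes the projection useful when $m$ is very large.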
Applications
Cosine similarity
To compute the similarity of two sentences, first use the TF-IDF algorithm to turn each sentence into a weighted term-frequency vector, then compute the angle between the two vectors from its cosine. The smaller the angle, the more similar the sentences.
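The procedure can be sketched in a few lines of Python. This is an illustrative version under stated assumptions: tokens are split on whitespace, and the IDF weight is taken as `log(N / df) + 1`, which is one common variant and not necessarily the one the original author had in mind.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "stock prices fell sharply today",
    ]
    vecs = tfidf_vectors(docs)
    print(cosine_similarity(vecs[0], vecs[1]))  # close sentences
    print(cosine_similarity(vecs[0], vecs[2]))  # no shared terms
```

A cosine near 1 corresponds to a small angle, that is, to similar sentences.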
Hash functions
A hash function (MD5 and the like) turns an article into a fixed-length string, for example 32 characters. In the front-end encryption post, I implemented this for the string "123456".
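As a quick illustration, the standard `hashlib` module produces the 32-character hexadecimal MD5 digest of "123456". The wrapper name `md5_hex` is a hypothetical helper. Strictly speaking this is a one-way digest, not reversible encryption.

```python
import hashlib

def md5_hex(text):
    """Hex MD5 digest of a UTF-8 encoded string (always 32 characters)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

if __name__ == "__main__":
    print(md5_hex("123456"))
    # A one-character change yields an unrelated-looking digest,
    # which is why such hashes cannot be used to measure similarity.
    print(md5_hex("123457"))
```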
simhash
A traditional hash function cannot be used to compare the similarity of two articles. simhash is an algorithm introduced by Google to deduplicate web pages at large scale. The final fingerprint is a string of 0s and 1s, and two fingerprints are compared with an XOR operation.
Johnson–Lindenstrauss lemma + discretization: N points in Euclidean space approximately keep their relative positions after all of them are mapped by the same random projection. The result of the random projection is then discretized (an angle below 90° becomes 1 and an angle above 90° becomes 0, i.e. similar is 1, otherwise 0), which is convenient for computation and storage.
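The discretization step can be sketched as sign random projection: each bit records whether the vector makes an angle below 90° with a random Gaussian direction. For two vectors at angle $\theta$, each bit differs with probability $\theta/\pi$, so the Hamming distance gives an estimate of the angle. The function names and bit counts below are illustrative assumptions.

```python
import math
import random

def random_hyperplanes(bits, dim, rng):
    """One random Gaussian direction per output bit."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

def sign_bits(planes, v):
    """Bit is 1 if the angle between v and the direction is below 90 degrees."""
    return [1 if sum(a * x for a, x in zip(p, v)) > 0 else 0 for p in planes]

def hamming(a, b):
    """Number of positions where two bit lists differ (XOR, then count)."""
    return sum(x != y for x, y in zip(a, b))

def estimated_angle(a, b):
    """Angle estimate in radians: pi times the fraction of differing bits."""
    return math.pi * hamming(a, b) / len(a)

if __name__ == "__main__":
    rng = random.Random(0)
    planes = random_hyperplanes(2000, 2, rng)
    u, v = [1.0, 0.0], [0.0, 1.0]  # true angle is pi / 2
    print(estimated_angle(sign_bits(planes, u), sign_bits(planes, v)))
```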
Building on this, simhash splits the article into words, hashes every word, weights the hash results, merges the word vectors (adding them position by position), and obtains the final result through dimensionality reduction and other processing. Other explanations of the simhash algorithm exist in which the computation appears to be carried out differently.
Note: the value 0 may also be replaced by $-1$. Using $-1$ avoids concentrating all the vectors in a single quadrant.
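The simhash steps described above (split into words, hash, weight, add position by position, reduce to bits) can be sketched as follows. This is one common formulation, not necessarily the exact variant the post refers to. The choices of whitespace tokenization, term frequency as the weight, truncated MD5 as the word hash, and a 64-bit fingerprint are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

def word_hash(word, bits=64):
    """Deterministic integer hash of a word, taken from the MD5 digest (bits <= 128)."""
    digest = hashlib.md5(word.encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")

def simhash(text, bits=64):
    """Fingerprint whose i-th bit is the sign of the weighted +/-1 vote at position i."""
    weights = Counter(text.lower().split())  # weight = term frequency (assumption)
    totals = [0] * bits
    for word, weight in weights.items():
        h = word_hash(word, bits)
        for i in range(bits):
            # A hash bit of 1 votes +weight, a hash bit of 0 votes -weight.
            totals[i] += weight if (h >> i) & 1 else -weight
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint

def hamming_distance(a, b):
    """Number of differing bits, computed with XOR."""
    return bin(a ^ b).count("1")

if __name__ == "__main__":
    a = "the quick brown fox jumps over the lazy dog near the river bank"
    b = "the quick brown fox jumps over the lazy cat near the river bank"
    c = "quarterly revenue grew while operating costs stayed flat"
    print(hamming_distance(simhash(a), simhash(b)))  # near-duplicate texts
    print(hamming_distance(simhash(a), simhash(c)))  # unrelated texts
```

Near-duplicate texts are expected to give a small Hamming distance, while unrelated texts tend to differ in roughly half of the bits.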
References
High dimensional random
To what extent has the theory of machine learning progressed ?
Du, S. S., Kakade, S. M., Wang, R., & Yang, L. F. (2019). Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?
Database-friendly random projections: Johnson-Lindenstrauss with binary coins
Applications in attention
DGBR algorithm
IJCAI’21 Secure Deep Graph Generation with Link Differential Privacy
Markov inequality (for a nonnegative random variable $x$ and $a > 0$):
$$P(x\geq a)\leq \frac{\mathbb{E}[x]}{a}$$
Chebyshev inequality:
$$P((x - \mathbb{E}[x])^2\geq a^2) \leq \frac{\mathbb{E}[(x - \mathbb{E}[x])^2]}{a^2}=\frac{\mathrm{Var}[x]}{a^2}$$
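Both inequalities can be checked on sampled data. The sketch below uses an exponential distribution purely as an example of a nonnegative random variable; the function name and parameters are hypothetical. Because the mean and variance are computed from the same samples, the inequalities hold for the empirical distribution itself.

```python
import random

def empirical_checks(a, samples=100000, seed=0):
    """Return (Markov lhs, Markov rhs, Chebyshev lhs, Chebyshev rhs)."""
    rng = random.Random(seed)
    xs = [rng.expovariate(1.0) for _ in range(samples)]  # nonnegative samples
    mean = sum(xs) / samples
    var = sum((x - mean) ** 2 for x in xs) / samples
    markov_lhs = sum(x >= a for x in xs) / samples
    chebyshev_lhs = sum((x - mean) ** 2 >= a * a for x in xs) / samples
    return markov_lhs, mean / a, chebyshev_lhs, var / (a * a)

if __name__ == "__main__":
    for a in (1.0, 2.0, 4.0):
        print(a, empirical_checks(a))
```

Chebyshev's inequality is Markov's inequality applied to the nonnegative variable $(x - \mathbb{E}[x])^2$ with threshold $a^2$.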
Bernstein inequality