Understanding of expectation, variance, covariance and correlation coefficient
2022-07-08 01:23:00 【You roll, I don't roll】
Contents
1、 Mathematical expectation (mean)
2、 Variance D(X) or Var(X)
3、 Covariance Cov(X,Y)
4、 The correlation coefficient ρ
5、 Covariance matrix
One-sentence summary: expectation reflects the average level; variance reflects how much the data fluctuate; covariance reflects the correlation between two random variables and carries units (it is a dimensional quantity); the correlation coefficient reflects the same correlation in dimensionless form.
1、 Mathematical expectation (mean)
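The expectation is the probability-weighted mean of the values a random variable can take. For a discrete random variable X that takes the value $x_i$ with probability $p_i$:
$$E(X) = \sum_{i} x_i p_i$$
The expectation is the mean. In statistics a sample usually stands in for the whole population, so the sample mean is computed instead:
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$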
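As a rough illustration of the two formulas, the sketch below computes the expectation of a fair six-sided die exactly and compares it with the mean of simulated rolls. The die, the function names `expectation` and `sample_mean`, and the fixed seed are assumptions chosen for the example; they do not come from the original article.

```python
from fractions import Fraction
import random


def expectation(values, probs):
    """Probability-weighted mean: E(X) = sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))


def sample_mean(samples):
    """Arithmetic mean of the observed samples."""
    return sum(samples) / len(samples)


# Fair six-sided die: every face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
die_expectation = expectation(faces, probs)  # exact value 7/2

# Simulated rolls; the seed is fixed only to make the run repeatable.
rng = random.Random(0)
rolls = [rng.choice(faces) for _ in range(10_000)]
rolls_mean = sample_mean(rolls)

print("E(X)        =", float(die_expectation))
print("sample mean =", rolls_mean)
```

The sample mean should land close to 3.5 and move closer as the number of rolls grows, which is the sense in which the sample replaces the population.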
2、 Variance D(X) or Var(X)
Variance measures how far the actual values deviate from the mean; that is, it reflects the dispersion of the data:
$$D(X) = E\{[X - E(X)]^2\} = E(X^2) - [E(X)]^2$$
If the values of X are concentrated, the variance is small; conversely, the more dispersed the values of X, the larger the variance.
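D(X) satisfies the following properties, where C is a constant:
- $D(C) = 0$
- $D(CX) = C^2 D(X)$ and $D(X + C) = D(X)$
- $D(X \pm Y) = D(X) + D(Y) \pm 2\,\mathrm{Cov}(X, Y)$
When X and Y are independent (in particular, when they are independent and identically distributed, iid), $\mathrm{Cov}(X, Y) = 0$, and then:
$$D(X \pm Y) = D(X) + D(Y)$$
Here $\mathrm{Cov}(X, Y)$ is the covariance, which is discussed later.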
In addition, the standard deviation (also called the root-mean-square deviation) is $\sigma(X) = \sqrt{D(X)}$, which has the same dimension as X.
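The scaling and shift rules $D(CX) = C^2 D(X)$ and $D(X + C) = D(X)$, the identity $D(X + Y) = D(X) + D(Y) + 2\,\mathrm{Cov}(X, Y)$, and the standard deviation can all be checked numerically. The following is a minimal sketch on made-up data; the helper names `pvar` and `pcov` and the numbers are assumptions for illustration, and both helpers divide by n.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def pvar(xs):
    """Variance with divisor n: mean squared deviation from the mean."""
    m = mean(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)


def pcov(xs, ys):
    """Covariance with divisor n."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)


# Made-up data, chosen so the numbers come out round.
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
c = 3.0

var_x = pvar(x)            # 4.0
std_x = math.sqrt(var_x)   # 2.0, in the same units as x
var_y = pvar(y)
cov_xy = pcov(x, y)
var_sum = pvar([a + b for a, b in zip(x, y)])

print("D(X) =", var_x, " sigma(X) =", std_x)
print("D(cX) =", pvar([c * v for v in x]), " c^2 D(X) =", c ** 2 * var_x)
print("D(X+c) =", pvar([v + c for v in x]))
print("D(X+Y) =", var_sum, " D(X)+D(Y)+2Cov =", var_x + var_y + 2 * cov_xy)
```

Because x and y here are not independent, the covariance term is non-zero and cannot be dropped from the sum rule.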
In sample analysis, the variance is computed as:
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
Note: the sum is divided by n-1 (that is, multiplied by 1/(n-1)), not by n.
Why does the variance calculation come in two forms, one dividing by n and one dividing by n-1?
Dividing by n computes the population variance; dividing by n-1 computes the sample variance, which is an unbiased estimate of the population variance. In practice it is usually unrealistic to compute the population variance directly. One of the tasks of statistics is to infer the population from samples, so the sample variance is commonly used in place of the population variance.
Why is the sample variance divided by n-1? Because the sample mean must be computed before the sample variance (in other words, the samples are summed first). Once the mean is fixed, the n deviations $X_i - \bar{X}$ must sum to zero, so when n-1 of them are known the n-th is determined; the degrees of freedom are n-1. In terms of linear algebra, the n deviations are not independent: they satisfy one linear constraint, so the vector of deviations lies in a subspace of dimension n-1. This lost degree of freedom is why the divisor is n-1. The deviations are measured from $\bar{X}$, which is itself fitted to the sample, so on average they are slightly smaller than deviations from the true mean, and dividing by n-1 instead of n compensates for this.
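The n-1 divisor can also be justified with a short derivation. This is a sketch under the assumption that $X_1, \ldots, X_n$ are independent and identically distributed with mean $\mu$ and variance $\sigma^2$ (symbols introduced here for the derivation). First expand the deviations around $\mu$:

$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2$$

Then take expectations term by term:

$$E\left[\sum_{i=1}^{n} (X_i - \mu)^2\right] = n\sigma^2, \qquad E\left[(\bar{X} - \mu)^2\right] = D(\bar{X}) = \frac{\sigma^2}{n}$$

$$E\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = n\sigma^2 - \sigma^2 = (n-1)\sigma^2$$

So dividing the sum of squared deviations by n-1 gives an estimate whose expectation is exactly $\sigma^2$, while dividing by n gives an expectation of $\frac{n-1}{n}\sigma^2$, which is too small.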
Dividing by n is appropriate when the population mean μ is known in advance (μ is given, not computed from the sample, because in practice it is usually unrealistic to compute the population mean). The deviations $X_i - \mu$ are then not tied together by any constraint, all n degrees of freedom remain, and the variance $\sigma^2$ is estimated by dividing the sum of squared deviations from μ by n. This is an idealized situation that almost never occurs in practice. In most real cases the population is estimated from a sample, so the variance formula in common use divides by n-1.
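The difference between the two divisors shows up clearly in a small simulation. The sketch below assumes a normal population with mean 0 and standard deviation 2 (so the true variance is 4) and draws many samples of size 5; these parameters, the seed and the variable names are assumptions made for the example.

```python
import random

MU, SIGMA = 0.0, 2.0           # assumed population: normal, true variance 4
TRUE_VARIANCE = SIGMA ** 2
n = 5                          # small samples make the bias easy to see
trials = 20_000

rng = random.Random(42)        # fixed seed only for repeatability
sum_div_n = 0.0
sum_div_n_minus_1 = 0.0
sum_known_mu = 0.0

for _ in range(trials):
    sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
    x_bar = sum(sample) / n
    # Sum of squared deviations from the sample mean.
    ss = sum((v - x_bar) ** 2 for v in sample)
    sum_div_n += ss / n
    sum_div_n_minus_1 += ss / (n - 1)
    # Idealized case: the population mean is known, so divide by n.
    sum_known_mu += sum((v - MU) ** 2 for v in sample) / n

avg_biased = sum_div_n / trials
avg_unbiased = sum_div_n_minus_1 / trials
avg_known_mu = sum_known_mu / trials

print("true variance              :", TRUE_VARIANCE)
print("sample mean, divide by n   :", round(avg_biased, 3))
print("sample mean, divide by n-1 :", round(avg_unbiased, 3))
print("known mu, divide by n      :", round(avg_known_mu, 3))
```

On average the divide-by-n estimate based on the sample mean comes out near (n-1)/n times the true variance, while the other two come out near the true variance.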
3、 Covariance Cov(X,Y)
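Covariance describes the correlation between two variables. It is a dimensional quantity: its unit is the product of the units of X and Y.
$$\mathrm{Cov}(X, Y) = E\{[X - E(X)][Y - E(Y)]\} = E(XY) - E(X)E(Y)$$
If X and Y are independent of each other, then $\mathrm{Cov}(X, Y) = 0$.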
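To make the independence statement concrete, the sketch below computes the covariance of two small joint distributions exactly, once from the definition $E\{[X - E(X)][Y - E(Y)]\}$ and once from the equivalent form $E(XY) - E(X)E(Y)$. The two distributions and the helper names are assumptions invented for the example.

```python
from fractions import Fraction
from itertools import product


def expect(pmf, g):
    """E[g(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


def covariance(pmf):
    """Definition: E[(X - E(X)) * (Y - E(Y))]."""
    ex = expect(pmf, lambda x, y: x)
    ey = expect(pmf, lambda x, y: y)
    return expect(pmf, lambda x, y: (x - ex) * (y - ey))


def covariance_shortcut(pmf):
    """Equivalent form: E(XY) - E(X) * E(Y)."""
    ex = expect(pmf, lambda x, y: x)
    ey = expect(pmf, lambda x, y: y)
    return expect(pmf, lambda x, y: x * y) - ex * ey


# Independent case: the joint probability is the product of the marginals.
px = {0: Fraction(1, 2), 1: Fraction(1, 2)}
py = {1: Fraction(1, 3), 2: Fraction(2, 3)}
independent = {(x, y): px[x] * py[y] for x, y in product(px, py)}

# Dependent case: Y always equals X.
dependent = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

cov_independent = covariance(independent)
cov_dependent = covariance(dependent)

print("independent:", cov_independent)
print("dependent  :", cov_dependent)
```

Exact fractions are used so that the zero in the independent case is a true zero rather than a floating-point approximation.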
4、 The correlation coefficient ρ
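The correlation coefficient also describes the correlation between two variables, but unlike covariance it is a dimensionless quantity. The formula is:
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{D(X)}\,\sqrt{D(Y)}}$$
In addition,
$$X^* = \frac{X - E(X)}{\sqrt{D(X)}}, \qquad Y^* = \frac{Y - E(Y)}{\sqrt{D(Y)}}$$
are called the standardized forms of X and Y. Then:
$$\rho_{XY} = \mathrm{Cov}(X^*, Y^*)$$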
Properties of the correlation coefficient:
- $|\rho_{XY}| \le 1$. The larger $|\rho_{XY}|$ is, the stronger the linear correlation; when $|\rho_{XY}|$ is close to 1, X and Y are said to be well linearly correlated. $\rho_{XY} = 0$ means that X and Y have no linear relationship, but they may be related in other ways. For example, if X is uniformly distributed over a full period such as $[0, 2\pi]$ and $X_1 = \sin X$, $X_2 = \cos X$, then $\rho_{X_1 X_2} = 0$ even though $X_1^2 + X_2^2 = 1$.
- $|\rho_{XY}| = 1$ if and only if there exist constants a and b, with $b \neq 0$, such that $P\{Y = a + bX\} = 1$.
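These properties can be checked on small data sets. The sketch below is illustrative only: the `correlation` helper and the data are assumptions, and evenly spaced angles over one period stand in for a variable uniformly distributed on $[0, 2\pi)$.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def correlation(xs, ys):
    """rho = Cov(X, Y) / (sqrt(D(X)) * sqrt(D(Y))), all with divisor n."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in ys) / n)
    return cov / (sx * sy)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]

# An exact linear relation Y = a + bX gives rho = +1 or -1 (sign of b).
rho_up = correlation(xs, [1.0 + 2.0 * v for v in xs])
rho_down = correlation(xs, [1.0 - 2.0 * v for v in xs])

# Dimensionless: changing the units of X (here a factor of 100) leaves rho unchanged.
ys_noisy = [1.0, 2.5, 2.0, 4.5, 4.0]
rho_noisy = correlation(xs, ys_noisy)
rho_rescaled = correlation([100.0 * v for v in xs], ys_noisy)

# sin and cos over one full period: uncorrelated, yet tied by x1^2 + x2^2 = 1.
n = 1000
angles = [2.0 * math.pi * k / n for k in range(n)]
x1 = [math.sin(t) for t in angles]
x2 = [math.cos(t) for t in angles]
rho_sin_cos = correlation(x1, x2)

print("rho_up, rho_down        :", rho_up, rho_down)
print("rho_noisy, rho_rescaled :", rho_noisy, rho_rescaled)
print("rho(sin X, cos X)       :", rho_sin_cos)
```

The last value is zero up to floating-point rounding, which illustrates that a zero correlation coefficient rules out only a linear relationship.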
5、 Covariance matrix
The covariance matrix describes the covariances between the different components of a multidimensional random variable.
Suppose the pairwise covariances of the n-dimensional random variable $(X_1, X_2, \ldots, X_n)$ are
$$c_{ij} = \mathrm{Cov}(X_i, X_j) = E\{[X_i - E(X_i)][X_j - E(X_j)]\}, \quad i, j = 1, 2, \ldots, n$$
Then the matrix
$$C = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{pmatrix}$$
is called the covariance matrix of the n-dimensional random variable $(X_1, X_2, \ldots, X_n)$. Because $c_{ij} = c_{ji}$, the covariance matrix is symmetric; the variances form its diagonal elements and the covariances form its off-diagonal elements. In general, the distribution of an n-dimensional random variable is unknown or too complicated to handle mathematically, so the covariance matrix is very important in practice. It is widely used in statistics, machine learning and other fields.
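A sample covariance matrix can be built directly from the definition. The following is a minimal pure-Python sketch, assuming three variables with five observations each and the n-1 divisor from the variance section; the data and the function names are made up for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)


def sample_cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)


def covariance_matrix(variables):
    """Entry [i][j] is the covariance between variable i and variable j."""
    return [[sample_cov(vi, vj) for vj in variables] for vi in variables]


# Each list holds the observations of one component X_i.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 6.0, 8.0, 10.0]   # exactly 2 * x1
x3 = [5.0, 3.0, 4.0, 1.0, 2.0]

C = covariance_matrix([x1, x2, x3])
for row in C:
    print(row)
# [2.5, 5.0, -2.0]
# [5.0, 10.0, -4.0]
# [-2.0, -4.0, 2.5]
```

The diagonal holds the sample variances of x1, x2 and x3, and the matrix equals its own transpose. With NumPy available, `numpy.cov` applied to the same three rows should give the same matrix, since it also divides by n-1 by default.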