MATLAB / ENVI principal component analysis: implementation and result analysis
2022-07-07 06:24:00 【PeanutbutterBoh】
Background: I recently used principal component analysis to screen variables. The goal was to compute the loading of each environmental variable on the different principal components, but my results did not look consistent with those in other papers, so I went through some literature to try to understand why.
1 Principal component loadings
Baidu Encyclopedia says: the principal component loading (load of principal component) is the correlation coefficient between an original variable and a principal component in principal component analysis.
For further detail, see this particularly thorough write-up: Principal component analysis
Put simply, the principal component loadings are the coefficients applied to the original variables: multiplying the original data by the loadings gives the principal components. Note that, for standardized variables, these coefficients equal the correlations in the Baidu definition only after each component is scaled by the square root of its eigenvalue; in this post, "loading" means the coefficients.
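The relationship between coefficients, scores and correlations can be checked with a minimal sketch. The synthetic matrix X below is an assumption for illustration only, not the environmental data used in this post; pca, zscore and corr come from the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                                    % reproducible synthetic data (assumption)
X = randn(200,4) * [1 0.5 0 0; 0 1 0.5 0; 0 0 1 0.5; 0 0 0 1];
Z = zscore(X);                             % standardize each column
[coeff,score,latent] = pca(Z);             % each column of coeff is one component

% Scores are the standardized data multiplied by the coefficients
D1 = Z*coeff - score;
err_score = max(abs(D1(:)));

% Correlation between every original variable and every component
R = corr(Z, score);
% For standardized data it equals coeff scaled by sqrt(eigenvalue)
D2 = R - coeff .* sqrt(latent');
err_corr = max(abs(D2(:)));

fprintf('score error %.2e, correlation error %.2e\n', err_score, err_corr);
```

Both errors should be at the level of floating-point round-off, which shows that the two definitions differ only by a per-component scale factor.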
2 MATLAB principal component analysis experiment
Following the ideas above, I ran an experiment in both ENVI and MATLAB.
I use 26 environmental variables here, all at the same resolution. I want to use principal component analysis to compute the loadings of the variables, decide which ones are important, and keep those to run the model. Here is the MATLAB code:
clear; clc
% Select the environmental GeoTIFF files (MultiSelect allows several at once)
[tifname,tifpath] = uigetfile('*.tif','Select environment tif data','MultiSelect','on');
if ischar(tifname), tifname = {tifname}; end   % a single selection is returned as char
for i = 1:numel(tifname)
    [A,~] = geotiffread(fullfile(tifpath,tifname{i}));
    Environ_Var(:,:,i) = double(A);   % stack the environmental variables into one 3-D array
end
% Reshape the 3-D stack into 2-D: rows are pixels, columns are variables
E = reshape(Environ_Var,size(Environ_Var,1)*size(Environ_Var,2),...
    size(Environ_Var,3));
E(E == -9999) = nan;             % -9999 is the no-data value
E = E(~any(isnan(E),2),:);       % drop pixels where any variable is NaN
E_Norm = normalize(E);           % standardize each variable (z-score)
% The data are already standardized, so pca does not need to center them again
[coeff,score,latent,~,explained] = pca(E_Norm,'Centered',false);
Notes:
1. My variables have different units, so they need to be standardized first; pca is then called without centering.
2. The no-data value in my environmental variables is -9999, so I removed those pixels before standardizing.
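The two notes can be illustrated on a toy table. This is a sketch only: the values are made up, and it assumes a pixel should be dropped when any variable is no-data. After normalize, every column has mean 0 and standard deviation 1, which is why pca does not need to center the data again.

```matlab
% Toy table: rows are pixels, columns are variables, -9999 marks no data (synthetic)
E = [1 -9999 3; 2 5 4; -9999 6 7; 4 8 9; 5 9 2];
E(E == -9999) = NaN;
E = E(~any(isnan(E),2),:);     % keep only rows where every variable is valid
E_Norm = normalize(E);         % default method is the z-score

disp(size(E,1));               % number of pixels kept
disp(mean(E_Norm));            % approximately 0 for each column
disp(std(E_Norm));             % 1 for each column
```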
From the results:
coeff holds the principal component loadings; each column corresponds to one principal component (the MATLAB help calls them principal component coefficients).
Why? According to the MATLAB help (link: Principal component analysis of raw data), score*coeff' restores the original data. (As for why coeff is transposed: score holds the principal component scores, with rows for observations and columns for components, while each column of coeff holds the coefficients of one principal component. Multiplying a row of score by coeff' therefore gives one observation.)
I tried this as well and it holds: the recovered data are identical to my standardized data.
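This check can be sketched on synthetic data (the matrix E here is a made-up stand-in for the environmental table). Because the data are standardized and pca is called with 'Centered' set to false, score*coeff' should return the standardized data directly when all components are kept; with the default centering, the column means would have to be added back.

```matlab
rng(1);
% Synthetic variables with very different scales and offsets (assumption)
E = randn(500,5) .* [1 10 100 0.1 5] + [0 3 -20 7 1];
E_Norm = normalize(E);                         % z-score each column
[coeff,score] = pca(E_Norm,'Centered',false);  % no additional centering

E_back = score*coeff';                         % reconstruction with all components
D = E_back - E_Norm;
fprintf('max reconstruction error: %.2e\n', max(abs(D(:))));
```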
Rearranging the relation A^T X = PC from the start also shows that the loading is the principal component coefficient matrix A.
There is one more issue, though: the coeff computed by MATLAB has to be multiplied by -1 (i.e., have its sign flipped) to give the real variable loading. For the reason, see the ENVI experiment below.
3 ENVI principal component analysis experiment
For the specific steps, see my earlier blog post: ENVI5.3.1 Use Landsat 8 Image PCA example operation
That post only lists the steps, without analyzing the results. One thing to watch is the choice between the correlation matrix and the covariance matrix: if the variables differ greatly in units or scale, use the correlation matrix; if the differences are small or you have already standardized the data, use the covariance matrix. Also, the ENVI input must not contain NaN values; apply a mask if necessary.
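The effect of this choice can be sketched in MATLAB on synthetic data (the matrix X and its scale factors are assumptions for illustration). PCA of the covariance matrix is dominated by the variable with the largest variance, while PCA of the correlation matrix gives the same components, up to sign, as covariance-based PCA on standardized data.

```matlab
rng(2);
% Correlated synthetic variables with very different scales (assumption)
X = randn(300,3) * [1 0.8 0.3; 0 1 0.5; 0 0 1] .* [1 50 1000];

coeff_cov  = pcacov(cov(X));        % PCA of the covariance matrix
coeff_corr = pcacov(corrcoef(X));   % PCA of the correlation matrix
coeff_std  = pca(zscore(X));        % covariance-based PCA of standardized data

disp(abs(coeff_cov(:,1))');         % first component is almost entirely variable 3
diffAbs = max(max(abs(abs(coeff_corr) - abs(coeff_std))));
fprintf('correlation PCA vs standardized PCA: %.2e\n', diffAbs);
```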
With that in mind, look directly at the ENVI results. (I have no screenshot of the detailed results; I saved them in Excel.)
ENVI reports the eigenvalues, eigenvectors and so on. In the eigenvector table, each row represents a component and each column a variable.
Now take the loadings of the first principal component (the ENVI eigenvector and the first column of MATLAB's coeff) and compare:
Remarkably, the two sets of values are almost identical, but with opposite signs. Mathematically, an eigenvector is defined only up to sign, so both results describe the same component. How do we decide which sign is correct? That depends on the meaning of the loading. Take annual mean SST as an example (see also https://wenku.baidu.com/view/5dc0b7c1514de518964bcf84b9d528ea81c72f6f.html, which is very detailed): I know that rising SST is bad for the survival of my organism, so the loading of SST should be negative.
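A minimal sketch of the sign question, on synthetic data: flipping the sign of one column of coeff together with the matching column of score leaves the reconstructed data unchanged, so the sign can be fixed by interpretation. The reference variable index ref and the rule "it should load negatively" are hypothetical stand-ins for the SST reasoning above.

```matlab
rng(3);
Z = zscore(randn(100,3) * [1 0.6 0.2; 0 1 0.4; 0 0 1]);   % synthetic standardized data
[coeff,score] = pca(Z);

% Flip the first component in both coeff and score
coeff_flip = coeff;   coeff_flip(:,1) = -coeff(:,1);
score_flip = score;   score_flip(:,1) = -score(:,1);
D = score*coeff' - score_flip*coeff_flip';
fprintf('difference after sign flip: %.2e\n', max(abs(D(:))));

% Fix the sign by interpretation: variable "ref" (hypothetical) should load negatively on PC1
ref = 1;
if coeff(ref,1) > 0
    coeff(:,1) = -coeff(:,1);
    score(:,1) = -score(:,1);
end
```

After the last step the first component carries the chosen sign, and the scores remain consistent with it.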
So, plot the loadings in MATLAB: the variables that lie closer to the origin are the ones to eliminate.
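One possible way to draw such a plot is sketched below. The data and the variable names are placeholders, and using the first two components is an assumption, since the post does not say how many were plotted. The distance of each variable from the origin is also computed so that the variables can be ranked.

```matlab
rng(4);
names = {'V1','V2','V3','V4'};             % hypothetical variable names
Z = zscore(randn(200,4) * [1 0.7 0 0; 0 1 0 0; 0 0 1 0.1; 0 0 0 0.2]);
[coeff,score] = pca(Z);

% Distance of each variable from the origin in the PC1-PC2 plane
dist2d = sqrt(sum(coeff(:,1:2).^2, 2));
[~,order] = sort(dist2d);                  % ascending: closest to the origin first
disp(names(order));

% Loadings as vectors, scores as points
biplot(coeff(:,1:2),'Scores',score(:,1:2),'VarLabels',names);
```

Note that biplot applies its own sign convention when drawing, so the plotted directions may be mirrored relative to coeff; the distances from the origin are not affected.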
4 Summary
To compute the loading of a variable on a principal component: in ENVI it is the eigenvector; in MATLAB it is the principal component coefficient coeff multiplied by -1 (sign flipped).