MATLAB / ENVI principal component analysis: implementation and result analysis
2022-07-07 06:24:00 【PeanutbutterBoh】
Background: I have recently been using principal component analysis to screen variables. The goal is to calculate the loading of each environmental variable on the different principal components, but my results did not seem to agree with those in other papers, so I went through some references to try to understand why.
1 Principal component loadings
Baidu Encyclopedia says: the principal component loading (load of principal component) is the correlation coefficient between the original variables and the principal components in principal component analysis.
For a deeper understanding, see this particularly detailed reference: Principal component analysis

So it is easy to understand: the principal component loadings are the coefficients applied to the original data, and multiplying the loadings by the original data gives the principal components.
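As a minimal sketch of that statement, using made-up data rather than my environmental variables: each principal component score is a weighted sum of the standardized variables, with one column of `coeff` as the weights. The matrix `X`, its size, and the name `pc1_manual` are assumptions for illustration only.

```matlab
% Synthetic example (hypothetical data): 200 observations, 4 variables
rng(1);
X = randn(200,4) * [2 0.5 0 0; 0 1 0.3 0; 0 0 1 0.2; 0 0 0 0.5];
Z = normalize(X);                          % z-score each column
[coeff,score] = pca(Z,'Centered',false);   % each column of coeff is one component

% First principal component computed by hand as a weighted sum of the variables
pc1_manual = Z * coeff(:,1);

% Difference from the scores returned by pca (rounding error only)
max_diff = max(abs(pc1_manual - score(:,1)));
```

If `max_diff` is at rounding level, the first column of `score` is exactly the data multiplied by the first column of `coeff`.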
2 MATLAB principal component analysis experiment
Following the ideas above, I ran an experiment in both ENVI and MATLAB.
I use 26 environmental variables here, all at the same resolution. I want to use principal component analysis to calculate the loadings of the variables, determine which ones are important, and keep those for running the model. Here is the MATLAB code:
clear; clc
% Pick the environmental GeoTIFF layers (same extent and resolution)
[tifname,tifpath] = uigetfile('*.tif','Select environment tif data','MultiSelect','on');
tifname = cellstr(tifname);   % a single selection is returned as char, not cell
for i = 1:numel(tifname)
    [A,~] = geotiffread(fullfile(tifpath,tifname{i}));
    Environ_Var(:,:,i) = double(A);   % stack the environment variables into one 3-D array
end
% Reshape the 3-D stack to 2-D: rows are pixels, columns are variables
E = reshape(Environ_Var,size(Environ_Var,1)*size(Environ_Var,2),...
    size(Environ_Var,3));
E(E == -9999) = nan;          % -9999 is the no-data value
E = E(all(~isnan(E),2),:);    % drop pixels that are NaN in any variable
E_Norm = normalize(E);        % standardize (z-score) each variable
[coeff,score,latent,~,explained,~] = pca(E_Norm,'Centered',false);
Notes:
1. My variables have different units, so they need to be standardized first; after that, the pca function does not need to center the data again ('Centered',false).
2. The no-data value in my environment variables is -9999, so I removed those pixels before standardizing.
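The two notes can be checked on a tiny matrix. This is only a sketch with made-up numbers: the 5-by-2 matrix `E` and the helper name `keep` are hypothetical. It drops every pixel that has a no-data value in any layer and then confirms that the standardized columns have zero mean and unit standard deviation, which is why centering again inside `pca` is unnecessary.

```matlab
% Tiny hypothetical matrix: rows are pixels, columns are variables
E = [ 1.0    10;
      2.0 -9999;      % no-data in the second variable only
  -9999      30;      % no-data in the first variable only
      4.0    40;
      5.0    70];
E(E == -9999) = nan;

keep = all(~isnan(E),2);     % rows that are complete in every column
E = E(keep,:);               % 3 rows remain
E_Norm = normalize(E);       % default method: z-score per column

col_mean = mean(E_Norm,1);   % about 0 for each column
col_std  = std(E_Norm,0,1);  % 1 for each column
```

Checking all columns, not just the first, matters whenever the layers do not share exactly the same no-data mask.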
From the results:
coeff holds the principal component loadings, with each column corresponding to one principal component (the MATLAB help calls them principal component coefficients).
Why? According to the MATLAB help (link: Principal component analysis of raw data), score*coeff' restores the original data. (As for why coeff is transposed: score holds the principal component scores, with rows corresponding to observations and columns to components, and each column of coeff holds the coefficients of one principal component. So a row of score multiplied by coeff' gives back one observation.)
I tried this myself and it holds: the recovered data are the same as my standardized data.
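The reconstruction check can be sketched as follows. The random matrix stands in for my standardized data, and the names `E_back`, `recon_err` and `orth_err` are hypothetical.

```matlab
rng(2);
E_Norm = normalize(randn(100,5));     % hypothetical standardized data
[coeff,score] = pca(E_Norm,'Centered',false);

% Rows of E_back are observations, columns are variables
E_back = score * coeff';
recon_err = max(abs(E_back(:) - E_Norm(:)));   % rounding error only

% The columns of coeff are orthonormal, so coeff' * coeff is the identity
orth_err = max(max(abs(coeff' * coeff - eye(5))));
```

The reconstruction is exact here only because all components are kept; keeping fewer columns of `score` and `coeff` would give an approximation.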
Rearranging the earlier relation A^T X = PC shows that the loading is the principal component coefficient A.
There is one more issue, though: in my results, the coeff that MATLAB computes has to be multiplied by -1 (that is, the signs are reversed) to give the real variable loadings. The reason becomes clear in the ENVI experiment below.
3 ENVI principal component analysis experiment
For the specific steps, see the earlier blog post "ENVI5.3.1 Use Landsat 8 Image PCA example operation".
That post gives only the bare steps, without analyzing the results. Also note the choice between the correlation matrix and the covariance matrix: if the variables differ greatly in units or magnitude, use the correlation matrix; if the differences are small, or you have already standardized the data, use the covariance matrix. The input to ENVI must not contain NaN values, so apply a mask if necessary.
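The correlation-versus-covariance choice can be illustrated in MATLAB. This is a sketch on made-up data and is not ENVI's internal code: it compares PCA on standardized data with an eigen-decomposition of the correlation matrix of the raw data. All variable names and the scale factors are assumptions.

```matlab
rng(3);
X = randn(300,3) .* [1 50 1000];      % hypothetical variables on very different scales
X(:,2) = X(:,2) + 20*X(:,1);          % add some correlation between variables 1 and 2

% Route 1: standardize, then PCA (covariance matrix of the z-scores)
[coeff_z,~,latent_z] = pca(normalize(X));

% Route 2: eigen-decomposition of the correlation matrix of the raw data
R = corrcoef(X);
R = (R + R') / 2;                     % enforce exact symmetry
[V,D] = eig(R);
[latent_c,order] = sort(diag(D),'descend');
V = V(:,order);

eig_diff = max(abs(latent_z - latent_c));        % eigenvalues agree
vec_diff = max(max(abs(abs(coeff_z) - abs(V)))); % eigenvectors agree up to sign
```

The two routes give the same eigenvalues, and the same eigenvectors apart from the sign of each column, which is why already standardized data can go through the covariance option.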
With that in mind, let's look directly at the ENVI results. (There is no screenshot of the specific results; I saved them in Excel.)
ENVI reports the eigenvalues, the eigenvectors, and so on. In the eigenvector table each row represents a component and each column represents a variable.
Now take the loadings of the first principal component (the ENVI eigenvector and the MATLAB coeff) and compare them:
Strikingly, the two sets of values are almost identical, but the signs are opposite. (Mathematically, an eigenvector multiplied by -1 is still an eigenvector, so both results are valid decompositions.) How do we decide which sign is correct? That depends on the meaning of the loading. Take the annual mean SST as an example (this document also gives a very detailed example: https://wenku.baidu.com/view/5dc0b7c1514de518964bcf84b9d528ea81c72f6f.html). I know that rising SST is bad for the survival of my organism, so the loading of SST should be negative.
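A minimal sketch of aligning the signs, assuming a reference set of eigenvectors laid out as described for ENVI (rows are components, columns are variables). Here the reference `ref_rows` is fabricated by negating the MATLAB result, purely to mimic the situation; it is not real ENVI output, and the names `flip`, `coeff_aligned` and `score_aligned` are hypothetical.

```matlab
rng(4);
E_Norm = normalize(randn(150,4));              % hypothetical standardized data
[coeff,score] = pca(E_Norm,'Centered',false);

% Hypothetical reference: same eigenvectors stored as rows, opposite signs
ref_rows = -coeff';

% Per-component sign: -1 if the component points against the reference, else +1
flip = sign(sum(coeff .* ref_rows', 1));
flip(flip == 0) = 1;

coeff_aligned = coeff .* flip;                 % flip whole columns of coeff
score_aligned = score .* flip;                 % flip the matching score columns

align_err = max(max(abs(coeff_aligned - ref_rows')));
recon_err = max(max(abs(score_aligned * coeff_aligned' - E_Norm)));
```

Flipping a column of `coeff` together with the matching column of `score` leaves `score*coeff'` unchanged, so the sign can be chosen to suit the interpretation without altering the decomposition.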
So plot the loadings in MATLAB: the closer a variable lies to the origin, the more it is a candidate for elimination.
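One way to draw such a plot is sketched below on made-up data. The variable names in `names`, the choice of the first two components, and ranking by distance from the origin are assumptions for illustration, not a fixed rule.

```matlab
rng(5);
names = {'var1','var2','var3','var4','var5'};  % hypothetical variable names
X = randn(200,5);
X(:,2) = X(:,1) + 0.1*randn(200,1);            % var2 closely follows var1
E_Norm = normalize(X);
[coeff,~,~,~,explained] = pca(E_Norm,'Centered',false);

% Distance of each variable from the origin in the PC1-PC2 loading plane
dist12 = hypot(coeff(:,1), coeff(:,2));
[~,order] = sort(dist12,'descend');
ranked = names(order);                         % farthest from the origin first

% Loading plot: one labelled point per variable, origin marked with a cross
plot(coeff(:,1), coeff(:,2), 'o'); hold on
text(coeff(:,1), coeff(:,2), names, 'VerticalAlignment','bottom');
plot(0, 0, 'k+');
xlabel(sprintf('PC1 (%.1f%%)', explained(1)));
ylabel(sprintf('PC2 (%.1f%%)', explained(2)));
title('Loadings on the first two principal components');
```

Variables at the end of `ranked` sit closest to the origin and contribute least to these two components; how many to drop remains a judgment call.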
4 Summary
To calculate the loadings of the variables on the principal components: in ENVI they are the eigenvectors, and in MATLAB they are the principal component coefficients coeff multiplied by -1 to flip the sign.