Data dimensionality reduction: factor analysis
2022-07-02 19:19:00 【Lu 727】
1. Purpose
Factor analysis is a dimension-reduction technique. It condenses a large set of correlated variables into a few mutually independent common factors while losing as little of the original information as possible. These common factors capture the main information in the original variables, reduce the number of variables, and expose the internal relationships between them. Factor analysis generally serves three purposes: reducing the dimension of the data, computing the factor weights, and computing a weighted composite score from the factors.
2. Input and output
Input: two or more quantitative variables (N variables, say).
Output: at minimum the data are reduced to 1 dimension (a single variable, typically used for comprehensive evaluation); at maximum all N dimensions are kept (typically used for data desensitization). The weight of each original variable in each reduced dimension is also obtained, which indicates how much of that variable's information is retained.
3. Example case
Given per capita GDP, per capita disposable income, and other indicators for each region in 2021, quantitatively rank the economic development level of several provinces, cities, and regions, or determine the weight of each indicator.
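As a rough illustration of how such a ranking could be produced once factor scores are available, the sketch below combines the scores into one composite score per region. The region names, eigenvalues, and scores are hypothetical, and weighting each factor by its share of the retained variance is an assumption (a common convention), not something the text prescribes.

```python
import numpy as np

# Hypothetical inputs: eigenvalues of the m = 2 retained factors and
# an m x n matrix of factor scores (one column per region).
regions = ["Region A", "Region B", "Region C", "Region D"]
eigvals = np.array([2.4, 1.2])
F = np.array([
    [1.2, -0.5, 0.3, -1.0],
    [-0.3, 0.9, 0.6, -1.2],
])

# Assumed weighting scheme: each factor's share of the retained variance.
weights = eigvals / eigvals.sum()

# Composite score per region, then rank from highest to lowest.
composite = weights @ F
order = np.argsort(-composite)
ranking = [regions[i] for i in order]
print(ranking)
```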

4. Modeling steps
Factor analysis is a multivariate statistical method that reduces many variables to a few common factors based on the correlations between the variables, and then carries out the analysis on those factors. The basic idea is to split each original variable into two parts. One part is a linear combination of common factors, which condenses most of the information in the original variables. The other part is a special factor that is unrelated to the common factors and reflects the gap between the linear combination of common factors and the original variable. For a p-dimensional variable $x = [x_1, x_2, \dots, x_p]^T$, the factor analysis model is:

$$x_i = a_{i1} f_1 + a_{i2} f_2 + \dots + a_{im} f_m + \varepsilon_i, \quad i = 1, 2, \dots, p$$

or, in matrix form,

$$x = A f + \varepsilon$$

where $f = [f_1, f_2, \dots, f_m]^T$ is the common factor vector, representing m (m < p) mutually independent common influences that cannot be observed directly in the original variables but objectively exist; $A = (a_{ik})_{p \times m}$ is the factor loading matrix, whose element $a_{ik}$ is the loading of variable $x_i$ on common factor $f_k$. The loading reflects the correlation between the two: the larger its absolute value, the stronger the correlation. $\varepsilon = [\varepsilon_1, \varepsilon_2, \dots, \varepsilon_p]^T$ is the special factor vector.
The key to building the factor analysis model for a multidimensional variable x is to solve for the factor loading matrix A and the common factor vector f. The steps are as follows:
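1. To remove the influence of the variables' different units, standardize the sample X = [x1, x2, …, xn], which contains n observations of the p-dimensional variable. After standardization, each variable has mean 0 and variance 1. For convenience, the standardized sample is still denoted X, and its elements are:

$$x_{ij} \leftarrow \frac{x_{ij} - \bar{x}_i}{s_i}, \quad i = 1, \dots, p, \quad j = 1, \dots, n$$

where $x_{ij}$ is the value of variable i in sample j, and $\bar{x}_i$ and $s_i$ are the sample mean and sample standard deviation of variable i.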
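A minimal sketch of the standardization step, assuming the sample is stored as a p × n NumPy array with one variable per row and one observation per column. The function name `standardize` and the data values are hypothetical.

```python
import numpy as np

def standardize(X):
    """Scale each variable (row) of a p x n sample matrix to mean 0, variance 1."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)  # sample standard deviation
    return (X - mean) / std

# Hypothetical data: p = 3 variables (rows), n = 5 samples (columns).
X_raw = np.array([
    [52.0, 61.0, 48.0, 75.0, 66.0],
    [3.1, 3.9, 2.8, 4.6, 4.0],
    [120.0, 90.0, 150.0, 80.0, 100.0],
])
X = standardize(X_raw)
print(X.mean(axis=1))         # approximately 0 for every variable
print(X.var(axis=1, ddof=1))  # approximately 1 for every variable
```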
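2. Compute the sample covariance matrix S, whose elements are:

$$s_{ik} = \frac{1}{n-1} \sum_{j=1}^{n} x_{ij} x_{kj}, \quad i, k = 1, \dots, p$$

where $x_{ij}$ is the standardized value of variable i in sample j. Because every variable has mean 0 after standardization, no mean needs to be subtracted.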
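The covariance step can be sketched as follows, again assuming a p × n layout with variables in rows. The random data are purely illustrative. Because the variables are standardized, S coincides with the correlation matrix, which the last line checks against `np.corrcoef`.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 4, 200

# Hypothetical sample, standardized row by row (one variable per row).
X = rng.normal(size=(p, n))
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)

# Sample covariance of the standardized variables.
S = X @ X.T / (n - 1)

print(S.shape)                          # (4, 4)
print(np.allclose(S, np.corrcoef(X)))   # True
```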
3. Perform an eigenvalue decomposition of the sample covariance matrix S to obtain p eigenvalues λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 with corresponding eigenvectors γ1, γ2, …, γp. The eigenvectors of the m largest eigenvalues are used to estimate the factor loading matrix. To ensure that each component of the common factor vector has variance 1, each component is divided by its standard deviation $\sqrt{\lambda_j}$, and the corresponding eigenvector γj in the factor loading matrix is multiplied by $\sqrt{\lambda_j}$ in turn. The factor loading matrix is therefore:

$$A = \left[ \sqrt{\lambda_1}\,\gamma_1, \; \sqrt{\lambda_2}\,\gamma_2, \; \dots, \; \sqrt{\lambda_m}\,\gamma_m \right]$$
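A short sketch of this loading-matrix estimate. The helper name `factor_loadings` and the 3 × 3 matrix are hypothetical; `np.linalg.eigh` is used because S is symmetric, and it returns eigenvalues in ascending order, so they are reordered first.

```python
import numpy as np

def factor_loadings(S, m):
    """Estimate a p x m loading matrix from the m largest eigenpairs of S."""
    eigvals, eigvecs = np.linalg.eigh(S)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)    # guard against tiny negative round-off
    eigvecs = eigvecs[:, order]
    A = eigvecs[:, :m] * np.sqrt(eigvals[:m])       # scale column j by sqrt(lambda_j)
    return A, eigvals

# Hypothetical correlation matrix used only for illustration.
S = np.array([
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.2],
    [0.3, 0.2, 1.0],
])
A, eigvals = factor_loadings(S, m=2)
print(A.shape)                  # (3, 2)
print(np.sum(A**2, axis=0))     # column sums of squares match the two largest eigenvalues
```

With m = p the product A Aᵀ reproduces S exactly; with m < p it is the rank-m approximation that the common factors account for.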
The parameter m is determined by the cumulative variance contribution rate of the first m common factors, that is,

$$\frac{\sum_{j=1}^{m} \lambda_j}{\sum_{j=1}^{p} \lambda_j}$$
It is generally accepted that when the cumulative variance contribution rate of the first m common factors exceeds 90%, a linear combination of those m common factors can essentially recover the information in the original variables.
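The selection rule can be sketched like this. The function name `choose_num_factors` and the eigenvalues are hypothetical; the 0.90 default mirrors the threshold mentioned above, and the sketch picks the smallest m whose cumulative rate reaches it.

```python
import numpy as np

def choose_num_factors(eigvals, threshold=0.90):
    """Return the smallest m whose cumulative variance contribution reaches threshold."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending order
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    m = int(np.argmax(cumulative >= threshold)) + 1             # first index that qualifies
    return m, cumulative

# Hypothetical eigenvalues of a 4 x 4 covariance matrix.
m, cumulative = choose_num_factors([2.5, 1.2, 0.2, 0.1])
print(m)            # 2
print(cumulative)   # cumulative contribution after 1, 2, 3, 4 factors
```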
The common factor vector f, that is, the score of the original variables on each common factor, can be estimated by the regression method:

$$f = A^T S^{-1} x$$
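A sketch of the regression estimate applied to all samples at once, assuming standardized data in a p × n array and loadings obtained from the largest eigenpairs of S. The simulated data are hypothetical. `np.linalg.solve` is used in place of an explicit inverse of S.

```python
import numpy as np

rng = np.random.default_rng(42)
p, n, m = 5, 300, 2

# Hypothetical data simulated from a 2-factor model, then standardized per variable (row).
X = rng.normal(size=(p, m)) @ rng.normal(size=(m, n)) + 0.5 * rng.normal(size=(p, n))
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)

# Covariance matrix and loadings from the m largest eigenpairs.
S = X @ X.T / (n - 1)
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
A = eigvecs[:, :m] * np.sqrt(eigvals[:m])

# Regression estimate F = A^T S^{-1} X: one column of factor scores per sample.
F = A.T @ np.linalg.solve(S, X)

print(F.shape)                                      # (2, 300)
print(np.allclose(F @ F.T / (n - 1), np.eye(m)))    # True: unit-variance, uncorrelated scores
```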
After the factor loading matrix and the common factor vector have been obtained through the steps above, the special factor vector of the original variables is:

$$\varepsilon = x - A f$$
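The final step can be sketched as a residual computation. The data are again hypothetical and the loadings and scores are recomputed so the block runs on its own. The check at the end shows that, for each variable, the share of variance not left in the special factor equals the sum of its squared loadings.

```python
import numpy as np

rng = np.random.default_rng(7)
p, n, m = 4, 250, 2

# Hypothetical standardized sample (variables in rows, observations in columns).
X = rng.normal(size=(p, m)) @ rng.normal(size=(m, n)) + 0.4 * rng.normal(size=(p, n))
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)

# Loadings and regression scores, as in the preceding steps.
S = X @ X.T / (n - 1)
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
A = eigvecs[:, :m] * np.sqrt(eigvals[:m])
F = A.T @ np.linalg.solve(S, X)

# Special factors: what the common factors leave unexplained.
E = X - A @ F

explained = 1.0 - E.var(axis=1, ddof=1)               # per-variable variance explained
print(np.allclose(explained, np.sum(A**2, axis=1)))   # True
```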