Data standardization processing
2022-06-28 20:15:00 【Burger jingle】
1. Why should data standardization be carried out
In a multi-indicator evaluation system, the indicators differ in nature, so they usually have different units (dimensions) and orders of magnitude. When the levels of the indicators differ greatly, analyzing the raw values directly exaggerates the influence of indicators with large values and relatively weakens the influence of indicators with small values. To keep the results reliable, the raw indicator data therefore need to be standardized.
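The effect can be seen in a small sketch. The data below are invented for illustration (three hypothetical candidates, each with an income figure and a 1-5 satisfaction score), and the equal-weight sum is just one assumed way of building a composite score. On the raw values the income column decides the ranking almost alone; after both columns are linearly rescaled to [0, 1], the satisfaction score counts as well.

```python
# Hypothetical data: (annual income in currency units, satisfaction on a 1-5 scale)
candidates = {
    "A": (52000, 1.0),
    "B": (50000, 5.0),
    "C": (40000, 3.0),
}


def rescale(column):
    """Map a list of numbers linearly onto [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


names = list(candidates)
income = [candidates[n][0] for n in names]
satisfaction = [candidates[n][1] for n in names]

# Equal-weight composite score on the raw values: income dominates.
raw_score = {n: i + s for n, i, s in zip(names, income, satisfaction)}

# The same composite after both indicators share a common [0, 1] scale.
scaled_score = {
    n: i + s for n, i, s in zip(names, rescale(income), rescale(satisfaction))
}

print(max(raw_score, key=raw_score.get))        # A
print(max(scaled_score, key=scaled_score.get))  # B
```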
2. What is data standardization
Data standardization scales the data so that it falls into a small, specific interval. It is often used when processing comparison and evaluation indicators: it removes the unit of the data and converts it into a dimensionless pure number, so that indicators with different units or magnitudes can be compared and weighted.
3. Data standardization methods
Common methods include min-max standardization (min-max normalization), log function conversion, atan function conversion, z-score standardization (zero-mean normalization, the most commonly used method), and fuzzy quantification. This article covers only the min-max method, the z-score method, and the normalization method.
Method 1: Min-max method
Min-max standardization (min-max normalization), also called deviation standardization, is a linear transformation of the original data that maps the result into the interval [0, 1]. The conversion function is:

$$x' = \frac{x - \min}{\max - \min}$$

where max is the maximum value of the sample data and min is the minimum value of the sample data. One drawback of this method is that adding new data may change max and min, in which case the transformation has to be recomputed.

Characteristics: a linear transformation of the original data; the result falls in the interval [0, 1].
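A minimal sketch of the min-max transformation described above, in plain Python. The function name `min_max_scale` and the sample values are illustrative, and returning zeros when all values are equal is an assumption made here to avoid dividing by zero. The second call shows the drawback mentioned above: a new, larger value changes max, so every earlier result shifts.

```python
def min_max_scale(values):
    """Linearly rescale values into [0, 1] using the sample min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: max - min is zero, so the formula is undefined.
        # Returning zeros is an assumption made for this sketch.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


scores = [50, 60, 80, 100]
scaled = min_max_scale(scores)
print(scaled)  # [0.0, 0.2, 0.6, 1.0]

# Adding a new, larger value changes max, so every earlier result shifts.
rescaled = min_max_scale(scores + [150])
print(rescaled)  # [0.0, 0.1, 0.3, 0.5, 1.0]
```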
Method 2: Z-score method
The most common way to standardize is Z standardization, which is also the most commonly used standardization method in SPSS. Z-score standardization (zero-mean normalization), also called standard deviation standardization, produces data with mean 0 and standard deviation 1. The result follows the standard normal distribution only if the original data are themselves normally distributed. The transformation function is:

$$x' = \frac{x - \mu}{\sigma}$$

where μ is the mean of all sample data and σ is the standard deviation of all sample data.

- This method standardizes the data using the mean and standard deviation of the original data. An original value x of attribute A is standardized to x' by the z-score.
- Z-score standardization is suitable when the maximum and minimum values of attribute A are unknown, or when there are outliers beyond the expected value range.
- The default standardization method in SPSS is z-score standardization.
- Z-score standardization in Excel: Excel has no function that standardizes a whole column in one step (its STANDARDIZE function still needs the mean and standard deviation to be supplied), so the calculation is done step by step. The formula itself is very simple.
Steps are as follows:
1. Compute the arithmetic mean (mathematical expectation) $\bar{x}_i$ and the standard deviation $s_i$ of each variable (indicator).
2. Standardize:
$z_{ij} = (x_{ij} - \bar{x}_i) / s_i$
where $z_{ij}$ is the standardized variable value and $x_{ij}$ is the actual variable value.
3. Reverse the sign of inverse indicators, that is, indicators for which a smaller value is better.
The standardized values fluctuate around 0: a value greater than 0 is above average, and a value less than 0 is below average.
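The three steps above can be sketched with Python's standard `statistics` module. The function name `z_score`, its `inverse` and `sample` parameters, and the sample data are illustrative assumptions. The text does not say whether the population or the sample standard deviation is meant, so the sketch defaults to the population form and offers the sample form as an option.

```python
import statistics


def z_score(values, inverse=False, sample=False):
    """Standardize values to mean 0 and standard deviation 1.

    inverse: flip the sign, for indicators where smaller is better (step 3).
    sample:  use the sample (n - 1) standard deviation instead of the
             population one; which one applies is an assumption here.
    """
    # Step 1: mean and standard deviation of the indicator.
    mean = statistics.fmean(values)
    std = statistics.stdev(values) if sample else statistics.pstdev(values)
    if std == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    # Step 2: subtract the mean and divide by the standard deviation.
    z = [(v - mean) / std for v in values]
    # Step 3: reverse the sign of an inverse indicator.
    return [-v for v in z] if inverse else z


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, population standard deviation 2
z = z_score(data)
print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Values above the mean come out positive and values below it negative, matching the interpretation given above. Calling `z_score(data, inverse=True)` flips every sign, so that a low raw value on an inverse indicator scores above average.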
Method 3 : Normalization method
