
Mathematical modeling and detailed explanation of basic knowledge (common knowledge points of Chemistry)

2022-07-29 01:02:00 【Full stack programmer webmaster】

Hello everyone, good to see you again. I'm your friend Quan Jun.

Summary of common knowledge points and methods in mathematical modeling

One 、 Comprehensive evaluation method

Based on their theoretical foundations, modern comprehensive evaluation methods can be roughly divided into the following four categories, plus hybrid methods that combine them:

1、 Expert evaluation method

2、 Operations research and other mathematical methods

2.1、 Analytic hierarchy process （AHP）

2.2 、 Fuzzy comprehensive evaluation method （FCE）

2.3 、 Data envelopment analysis （DEA）

3、 Methods based on statistics and economics

3.1、 TOPSIS evaluation method, which can be refined with the entropy weight method

3.2、 Principal component analysis and factor analysis

Principal component analysis overcomes correlation and overlap among variables by replacing many variables with fewer variables that still reflect most of the information in the original ones. This is essentially a "dimension reduction" idea. Factor analysis uses a few hypothetical variables to represent the basic structure of the data. These hypothetical variables reflect the main information in the many original variables. The original variables are observable manifest variables, while the hypothetical variables are unobservable latent variables, called factors.
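The dimension-reduction idea can be sketched for the simplest case of two correlated variables. This is only an illustration under stated assumptions: the function name `pca_2d` and the sample data are hypothetical, and the closed-form eigen-decomposition used here works only for a 2x2 covariance matrix. A real analysis would normally use a linear algebra library.

```python
import math

def pca_2d(points):
    """PCA for exactly two variables via the closed-form
    eigen-decomposition of the 2x2 sample covariance matrix."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix entries (denominator n - 1).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Eigenvalues of [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = half_trace + spread, half_trace - spread
    # Unit vector along the first principal component.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # Scores: each centred point projected onto that direction.
    scores = [(p[0] - mean_x) * direction[0] + (p[1] - mean_y) * direction[1]
              for p in points]
    # Share of total variance kept by the first component.
    explained = lam1 / (lam1 + lam2)
    return direction, explained, scores

# Hypothetical data: two strongly correlated indicators.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
direction, explained, scores = pca_2d(data)
print(direction, explained)
```

When `explained` is close to 1, the single column `scores` can stand in for the two original variables with little loss of information.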

3.3、 Cost-benefit method

4、 Newer evaluation methods

4.1、 Artificial neural network evaluation method (ANN)

The comprehensive evaluation method based on the BP (backpropagation) artificial neural network offers fast computation, efficient problem solving, strong self-learning ability, and strong fault tolerance. It simulates well the process by which expert evaluators carry out a comprehensive evaluation, so it has broad application prospects.

4.2 、 Grey comprehensive evaluation method

Grey system theory mainly uses known information to determine the unknown information of a system, turning the system from "grey" to "white". Its biggest feature is that it places no strict requirement on sample size and does not require the data to follow any particular distribution. Grey relational degree (grey correlation degree) is one of the main applications of grey system theory.

5、 Hybrid method ： Combined evaluation method

Two 、 Interpolation and fitting (numerical computation methods)

1、 Interpolation

1.1、 Newton interpolation

1.2、 Lagrange interpolation

1.3、 Hermite interpolation

1.4、 Spline interpolation

2、 Fitting

2.1、 Least squares fitting

2.2、 Best approximation (best square approximation, best uniform approximation, etc.)
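To make the difference between interpolation and fitting concrete, here is a minimal sketch of Lagrange interpolation and a least squares straight-line fit. The function names and data are hypothetical, chosen only for illustration. The interpolating polynomial passes through every data point, while the fitted line only minimizes the sum of squared residuals.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through
    the points (xs[i], ys[i]) at position x. The xs must be distinct."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                # Basis polynomial factor: 1 at xi, 0 at every other node.
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def least_squares_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Hypothetical measurements.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
print(lagrange_interpolate(xs, ys, 1.5))  # passes through all four points
print(least_squares_line(xs, ys))         # best straight line through the cloud
```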

Three 、 Hypothesis testing (probability theory and mathematical statistics methods)

1、 Correlation coefficients

1.1、 Pearson correlation coefficient

The Pearson correlation coefficient is suited to continuous variables that are normally distributed, and it is sensitive to outliers. A t-test is commonly used to test the significance of the Pearson correlation coefficient. First confirm that the two variables are linearly related. When the data are continuous, normally distributed, and linearly related, the Pearson correlation coefficient is the most appropriate choice. If the data are ordinal, use the Spearman rank correlation coefficient instead.

1.2、 Spearman correlation coefficient

Alternative definition: the Pearson correlation coefficient computed between ranks. The Pearson correlation coefficient applies to linear relationships, whereas the Spearman correlation coefficient applies to monotonic relationships (a linear relationship is the special case with a fixed slope). The Pearson coefficient is calculated from the raw data; the Spearman coefficient is calculated from ranks.

1.3、 Kendall's tau coefficient

Kendall's tau coefficient, also known as the Kendall rank correlation coefficient, is another rank correlation coefficient. Its target, however, is ordinal categorical variables, such as rankings, age groups, or obesity grades (severely obese, moderately obese, mildly obese, not obese). It measures the monotonic relationship between two ordinal variables.

1.4、 Differences and choices

Unlike the Pearson correlation coefficient, the Spearman correlation coefficient and Kendall's tau coefficient are based on the ranks of the data. Because these estimators operate on ranks rather than on data values, they are robust to outliers and can handle certain types of nonlinear relationship. In most cases, rank-based estimators are suited to small data sets and specific hypothesis tests. (References: 1. What is the correlation coefficient; 2. Introduction to the Pearson, Spearman, and Kendall correlation coefficients and their application in feature selection)
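The three coefficients can be compared on a relationship that is monotonic but not linear. The sketch below is a plain-Python illustration with hypothetical function names. `kendall_tau_a` is the simplest tau-a variant and makes no correction for ties, which is an assumption of this sketch rather than something the text specifies.

```python
import math

def pearson(xs, ys):
    """Pearson correlation computed from the raw data values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation between the ranks."""
    return pearson(ranks(xs), ranks(ys))

def kendall_tau_a(xs, ys):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

# y = x**3 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(pearson(xs, ys), spearman(xs, ys), kendall_tau_a(xs, ys))
```

On this data the two rank-based coefficients report a perfect monotonic relationship, while the Pearson coefficient stays below 1 because the relationship is not a straight line.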

2、 Hypothesis tests on the mean of a normal distribution

Common methods: t-test, Z-test, chi-square test, F-test, etc.

3、 Normality test

A normality test uses observed data to judge whether the population follows a normal distribution. It is an important special case of goodness-of-fit hypothesis testing in statistical inference. Common approaches are skewness and kurtosis tests, graphical methods, and nonparametric tests. (Reference: All normality tests are here)

3.1、 Skewness-kurtosis test

3.2、 Graphical methods: make a preliminary judgment from a histogram, P-P plot, or Q-Q plot.

3.3、 Nonparametric tests

Kolmogorov-Smirnov test (K-S test): suited to examining the distribution of continuous random variables, and comparatively better suited to large samples (>50).

Shapiro-Wilk test (S-W test): suited to small samples.
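One common form of the skewness-kurtosis approach is the Jarque-Bera statistic, sketched below. Choosing Jarque-Bera is an assumption of this example, since the text does not name a specific statistic, and the function name is hypothetical. Under normality the statistic is asymptotically chi-square distributed with 2 degrees of freedom, so values above roughly 5.99 reject normality at the 5% level. The approximation is only reliable for fairly large samples.

```python
def jarque_bera(data):
    """Return (skewness, kurtosis, JB statistic) for a non-constant sample.

    Central moments use denominator n, as in the usual JB definition.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5   # 0 for a symmetric distribution
    kurtosis = m4 / m2 ** 2     # 3 for a normal distribution
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    return skewness, kurtosis, jb

# Hypothetical sample, far too small for a real test; it only shows the call.
print(jarque_bera([2.1, 2.5, 1.9, 2.2, 2.8, 2.0, 2.4, 2.3]))
```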

Four 、 Regression

(Reference: Super practical: understanding regression analysis)

1、 Linear regression and locally weighted linear regression

2、 Multiple regression (estimation uses ordinary or generalized least squares; generalized least squares allows heteroscedasticity or autocorrelation in the error term. Pay attention to goodness-of-fit measures.)

3、 Heteroscedasticity, multicollinearity, and stepwise regression

Heteroscedasticity: the variance of the random error term changes with the independent variables. Test whether it is present. If it is, the parameter estimates are no longer efficient, parameter significance tests lose their meaning, and model predictions become unreliable.

Multicollinearity: exact or high correlation between explanatory variables distorts the model estimates or makes them hard to estimate accurately. Collinearity inflates the variance of the least squares estimates of the regression coefficients. It is diagnosed with the variance inflation factor (VIF) and the tolerance, which are reciprocals of each other. Remedies: drop the variables that cause collinearity, use the difference method, or use lasso regression or ridge regression.

Stepwise regression comes in three forms: forward selection, backward elimination, and stepwise (bidirectional) selection. (It screens out variables that cause multicollinearity, removes redundant features, and reduces prediction error. It can introduce a new problem, endogeneity, and removing too many variables can lead to underfitting.)

(Cross-sectional data are prone to heteroscedasticity; time series data are prone to autocorrelation.)

4、 Ridge regression and lasso regression

Ridge regression is linear regression with L2 regularization: a penalty on the L2 norm of the parameter vector w is added to the mean squared error objective of ordinary linear regression, so the penalized residual sum of squares is minimized. Equivalently, a multiple of the identity matrix is added to X^T X in the ordinary least squares solution.

Lasso regression is linear regression with L1 regularization.

(Regularization reduces the risk of overfitting. L1 tends to learn sparse weights; L2 tends to learn smaller, more evenly spread weights.)
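The remark about the identity matrix in ridge regression can be made concrete. The sketch below solves (X^T X + lam * I) w = X^T y for exactly two features with no intercept, using Cramer's rule. The function name `ridge_2d`, the restriction to two features, and the toy data are all assumptions made to keep the example self-contained.

```python
def ridge_2d(X, y, lam):
    """Ridge solution for two features, no intercept.

    Solves (X^T X + lam * I) w = X^T y with Cramer's rule.
    lam = 0 gives ordinary least squares (when X^T X is invertible).
    """
    # Entries of X^T X, with lam added on the diagonal (the identity term).
    a11 = sum(row[0] * row[0] for row in X) + lam
    a12 = sum(row[0] * row[1] for row in X)
    a22 = sum(row[1] * row[1] for row in X) + lam
    # Entries of X^T y.
    b1 = sum(row[0] * target for row, target in zip(X, y))
    b2 = sum(row[1] * target for row, target in zip(X, y))
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - a12 * b2) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2

# Toy data generated exactly by y = 2*x1 + 3*x2.
X = [[1, 0], [0, 1], [1, 1]]
y = [2, 3, 5]
print(ridge_2d(X, y, 0.0))  # ordinary least squares weights
print(ridge_2d(X, y, 1.0))  # weights shrunk towards zero by the L2 penalty
```

Increasing `lam` shrinks both weights. It also keeps the system solvable when the two features are perfectly collinear, which is why ridge regression is listed as a remedy for multicollinearity.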

Five 、 Graph theory

1、 Dijkstra's algorithm (shortest path problem in a weighted graph)

2、 Bellman-Ford algorithm (solves the single-source shortest path problem)

Its advantages over Dijkstra's algorithm are that edge weights can be negative and that it is simple to implement. Its disadvantage is a high time complexity, although the algorithm can be optimized in several ways to improve efficiency.

3、 Floyd algorithm

(An algorithm that uses dynamic programming to find the shortest paths between all pairs of vertices in a given weighted graph; it serves a similar purpose to Dijkstra's algorithm.)
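As an illustration of the last two algorithms, here is a minimal sketch of Bellman-Ford and Floyd (also known as Floyd-Warshall) on a small directed graph with one negative edge. Vertices are assumed to be numbered 0..n-1 and edges are given as (u, v, weight) triples; those conventions and the example graph are choices made for this sketch.

```python
INF = float("inf")

def bellman_ford(n, edges, source):
    """Single-source shortest paths; edge weights may be negative.

    Raises ValueError if a negative cycle is reachable from the source.
    """
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):           # at most n - 1 rounds of relaxation
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:              # early exit, a common optimization
            break
    for u, v, w in edges:            # one extra pass detects negative cycles
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from the source")
    return dist

def floyd_warshall(n, edges):
    """All-pairs shortest paths by dynamic programming, O(n^3) time."""
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
    for k in range(n):               # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
print(bellman_ford(4, edges, 0))
print(floyd_warshall(4, edges)[0])
```

Both calls report the same distances from vertex 0. The path 0 -> 2 -> 1 uses the negative edge and is shorter than the direct edge 0 -> 1, which is the kind of case Dijkstra's algorithm is not designed to handle.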

Six 、 Classification

1、 Logistic regression (LR): often used for binary classification. (Reference: [Machine learning] Logistic regression)

2、 Linear discriminant analysis (LDA, also called Fisher discriminant analysis) and multi-class problems (apply a "decomposition" strategy: split the multi-class problem into several binary classification problems, train one binary classifier for each, and finally combine their results to reach a conclusion.)

Seven 、 Clustering

(Reference: 5 clustering algorithms data scientists need to know)

1、K-means clustering algorithm

K-means clustering algorithm and K-means++ Algorithm

2、 Density-based spatial clustering of applications with noise (DBSCAN)

3、 Mean shift algorithm

4、 EM algorithm

Clustering with a Gaussian mixture model, optimized by the expectation-maximization (EM) algorithm

5、 Hierarchical (system) clustering algorithm
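Of the clustering methods listed above, K-means is the easiest to sketch. The version below takes the initial centers as an argument so that the result is deterministic; K-means++ differs only in how those initial centers are chosen. The function name, the signature, and the sample points are hypothetical.

```python
def kmeans(points, centers, max_iter=100):
    """Plain K-means on tuples of coordinates.

    points:  list of equal-length tuples
    centers: initial cluster centers (one tuple per cluster)
    Returns (final_centers, labels).
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centers.append(centers[k])  # leave an empty cluster in place
        if new_centers == centers:              # converged: nothing moved
            break
        centers = new_centers
    return centers, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, centers=[(0, 0), (10, 10)])
print(centers, labels)
```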

Eight 、 Time series analysis

1、 Exponential smoothing method

Exponential smoothing reveals how a series' historical data change over time. It overcomes two shortcomings of the moving average forecasting method, which does not make full use of all the data in the time series and gives the N most recent observations equal weight. Its procedure is also clear and its computation convenient. Exponential smoothing is mainly divided into single exponential smoothing and multiple (higher-order) exponential smoothing. The order to use differs from one time series to another; for example, GDP data for Jiangxi Province follow a curved trend.
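Single exponential smoothing can be sketched in a few lines. The recursion is s_0 = x_0 and s_t = alpha * x_t + (1 - alpha) * s_(t-1). Initializing with the first observation is one common convention and is an assumption here, as are the function name and the sample series.

```python
def exponential_smoothing(series, alpha):
    """Single exponential smoothing.

    alpha in (0, 1]: larger values weight recent observations more.
    Older observations get exponentially decaying weights, unlike a
    moving average that weights its N points equally.
    """
    smoothed = [series[0]]            # s_0 = x_0 (initialization convention)
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

series = [10, 12, 14, 13, 15]          # hypothetical observations
smoothed = exponential_smoothing(series, alpha=0.5)
print(smoothed)
print("one-step-ahead forecast:", smoothed[-1])
```

Applying the same function to its own output gives the second smoothed series that higher-order (for example double) exponential smoothing builds on.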

2、 Common models ：

2.1、AR、MA、ARMA Model

2.2、ARIMA Models and SARIMA Model

(Reference: AR, MA, and ARMA models)

2.3、 ARCH and GARCH models

ARCH model (autoregressive conditional heteroscedasticity model) and GARCH model (generalized ARCH model, an extension of the ARCH model)

3、 Grey prediction GM(1,1) (commonly used, but best used sparingly)

4、 Neural network related methods

4.1、RNN-LSTM-GRU

RNN: processes sequence data (a stream of interdependent data). At each step, the output of the previous step is carried into the next hidden layer and trained together with the new input. Drawbacks: short-term memory, so it cannot handle training on long input sequences; RNNs are also costly to train. Improvements: LSTM (long short-term memory network) retains important information across long sequences. GRU makes some simplifications and adjustments to the LSTM model.
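The way an RNN carries the previous result into the next step can be shown with a scalar toy cell. This is only a forward pass with hand-picked weights, not a trained network; the function name, the tanh activation, and the weight values are assumptions of this sketch.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Forward pass of a one-unit RNN.

    h_t = tanh(w_x * x_t + w_h * h_(t-1) + b)
    The hidden state h is the only memory passed between steps.
    """
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# Same final input, different histories: the last state differs
# because the earlier inputs are remembered through h.
print(rnn_forward([1.0, 0.0, 0.5], w_x=1.0, w_h=0.8, b=0.0))
print(rnn_forward([-1.0, 0.0, 0.5], w_x=1.0, w_h=0.8, b=0.0))
```

Because each step squashes and rescales the previous state, the influence of early inputs fades over long sequences. That fading is the short-term memory problem that the gates in LSTM and GRU are meant to reduce.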

Nine 、 Forecasting

1、 Statistical forecasting methods

1.1、 Short-term forecasting

Decomposition analysis (a one-off short-term forecast, or removing seasonal factors before applying another forecasting method; needs only the historical data of the series)

Moving average (repeated forecasts without seasonal variation; needs only the historical data of the dependent variable; choosing the weights the first time takes time)

Exponential smoothing (repeated forecasts with or without seasonal variation; needs only the historical data of the dependent variable; building the model takes time)

Adaptive filtering (for repeated forecasts where the nature of the trend pattern changes over time and there is no seasonal variation; needs only the historical data of the dependent variable, but specifying and checking the model is time-consuming)

Stationary time series forecasting (an advanced forecasting method suitable for series with any development pattern, but the computation is complex and tedious)

Intervention analysis model forecasting (needs historical data and the time at which the intervention took effect)

1.2、 Short- and medium-term forecasting

Linear regression forecasting (the most time-consuming)

Nonlinear regression forecasting (requires testing multiple models)

Grey prediction method (suitable for time series that develop with an exponential trend; based on historical data)

State space model and Kalman filter (suitable for forecasting all kinds of time series; a state space model is built from historical data)

1.3、 Medium- and long-term forecasting

Trend extrapolation (used when the variables related to the forecast target are expressed as functions of time; uses nonlinear regression; needs only historical data; time-consuming)

2、 Machine learning methods

Ten 、 Common mathematical programming problems (LINGO)

1、 Goal programming (GP). Solution ideas: weighting factors, priorities, efficient solutions

2、 Nonlinear programming (constrained / unconstrained)

3、 Dynamic programming (DP)

4、 Integer programming problems

Eleven 、 Other supplements

1、 Grey relational analysis (grey correlation analysis): this method is usually used to analyze how strongly various factors influence a result. It can also be used for comprehensive evaluation problems that change over time. Its core is to build, according to certain rules, a parent sequence that changes over time, treat the change of each evaluation object over time as a sub-sequence, compute the degree of correlation between each sub-sequence and the parent sequence, and draw a conclusion from those correlation degrees.

2、 Common machine learning methods
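The grey correlation (relational) analysis procedure in item 1 can be sketched as follows. The sketch uses the widely used form of the grey relational coefficient with distinguishing coefficient rho = 0.5 and normalizes each sequence by its mean. Both choices, along with the function name and the data, are assumptions of this example rather than details given in the text.

```python
def grey_relational_grades(parent, children, rho=0.5):
    """Grey relational grade of each sub-sequence against the parent sequence.

    parent:   reference (parent) sequence
    children: list of comparison (sub) sequences of the same length
    rho:      distinguishing coefficient, conventionally 0.5
    """
    def normalize(seq):
        mean = sum(seq) / len(seq)
        return [value / mean for value in seq]   # mean-value normalization

    reference = normalize(parent)
    # Absolute differences between the parent and each sub-sequence.
    diffs = [[abs(r - c) for r, c in zip(reference, normalize(child))]
             for child in children]
    d_min = min(min(row) for row in diffs)
    d_max = max(max(row) for row in diffs)
    if d_max == 0:                               # every sequence matches the parent
        return [1.0 for _ in children]
    # Relational coefficient at each time point, averaged into a grade.
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in diffs
    ]

parent = [1, 2, 3, 4]                  # hypothetical result over four periods
children = [[2, 4, 6, 8],              # same shape as the parent
            [4, 3, 2, 1]]              # opposite trend
print(grey_relational_grades(parent, children))
```

A grade closer to 1 means the sub-sequence changes over time in a way more similar to the parent sequence, so the factors can be ranked by their grades.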

Publisher: Full stack programmer webmaster. Please credit the source when reprinting: https://javaforall.cn/128977.html Original link: https://javaforall.cn
