Mathematical modeling and detailed explanation of basic knowledge (common knowledge points of Chemistry)
2022-07-29 01:02:00 【Full stack programmer webmaster】
Hello everyone, nice to see you again. I'm your friend, Quan Jun.
Summary of common knowledge points and methods in mathematical modeling
- One、 Comprehensive evaluation methods
- Two、 Interpolation and fitting (numerical methods)
- Three、 Hypothesis testing (probability theory and mathematical statistics)
- Four、 Regression
- Five、 Graph theory
- Six、 Classification
- Seven、 Clustering
- Eight、 Time series analysis
- Nine、 Forecasting
- Ten、 Common mathematical programming problems (LINGO)
- Eleven、 Other supplements
One、 Comprehensive evaluation methods
According to their theoretical basis, modern comprehensive evaluation methods can be roughly divided into the following four categories:
1、 Expert evaluation methods
2、 Operations research and other mathematical methods
2.1、 Analytic hierarchy process (AHP)
2.2、 Fuzzy comprehensive evaluation (FCE)
2.3、 Data envelopment analysis (DEA)
3、 Methods based on statistics and economics
3.1、 TOPSIS evaluation method, which can be refined with the entropy weight method
3.2、 Principal component analysis and factor analysis
Principal component analysis removes the correlation and overlap among indicators by replacing many variables with a few that still reflect most of the information in the originals. This is in essence a dimension-reduction idea. Factor analysis uses a few hypothetical variables to represent the basic structure of the data. These hypothetical variables capture the main information of the many original variables. The original variables are observable manifest variables, whereas the hypothetical variables are unobservable latent variables, called factors.
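As an illustration of the dimension-reduction idea, here is a minimal sketch that finds the first principal component of a small two-variable data set and the share of total variance it explains. The data values, the function name `first_principal_component`, and the use of power iteration are assumptions made for this example, not part of the original notes.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit direction, explained variance ratio) of the first component."""
    n, d = len(rows), len(rows[0])
    # Center each column so that the covariance is taken about the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue, i.e. the variance along v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance

# Hypothetical data: two strongly correlated indicators.
data = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
        [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]
direction, ratio = first_principal_component(data)
print("direction:", direction)
print("explained variance ratio:", ratio)
```

Because the two indicators overlap heavily, a single component keeps most of the variance, which is the sense in which fewer variables can replace more.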
3.3、 Cost-benefit method
4、 Newer evaluation methods
4.1、 Artificial neural network evaluation method (ANN)
A comprehensive evaluation method based on a BP artificial neural network offers fast computation, efficient problem solving, strong self-learning ability, and good fault tolerance. It simulates well the process by which evaluation experts carry out a comprehensive evaluation, so it has broad application prospects.
4.2、 Grey comprehensive evaluation method
Grey system theory mainly uses known information to determine the unknown information of a system, turning the system from "grey" to "white". Its main feature is that it places no strict requirement on sample size and does not require the data to follow any particular distribution. The grey relational degree (grey correlation degree) is one of the main applications of grey system theory.
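A minimal sketch of how a grey relational degree can be computed is given below. It assumes Deng's classical formula with a distinguishing coefficient of 0.5 and mean normalization; the original notes do not specify the formula, the normalization, or the sample sequences, so all of these are assumptions for illustration.

```python
def grey_relational_grades(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate sequence to the reference.

    rho is the distinguishing coefficient (0.5 is a common convention).
    """
    def normalise(seq):
        # Mean normalization removes differences in scale between sequences.
        mean = sum(seq) / len(seq)
        return [x / mean for x in seq]

    ref = normalise(reference)
    cands = [normalise(c) for c in candidates]
    # Absolute differences between the reference and every candidate.
    diffs = [[abs(r - x) for r, x in zip(ref, c)] for c in cands]
    d_min = min(min(row) for row in diffs)
    d_max = max(max(row) for row in diffs)
    if d_max == 0:
        # Every candidate coincides with the reference after normalization.
        return [1.0] * len(candidates)
    grades = []
    for row in diffs:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades

# Hypothetical sequences: the first candidate moves with the reference,
# the second moves against it.
reference = [1, 2, 3, 4]
candidates = [[2, 4, 6, 8], [4, 3, 2, 1]]
grades = grey_relational_grades(reference, candidates)
print(grades)
```

A grade closer to 1 means the candidate's shape follows the reference more closely. The ranking of grades is usually what matters, not their absolute values.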
5、 Hybrid methods: combined evaluation methods
Two、 Interpolation and fitting (numerical methods)
1、 Interpolation
1.1、 Newton interpolation
1.2、 Lagrange interpolation
1.3、 Hermite interpolation
1.4、 Spline interpolation
2、 Fitting
2.1、 Least squares fitting
2.2、 Best approximation (best square approximation, best uniform approximation, etc.)
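To make the difference between interpolation and fitting concrete, the sketch below implements Lagrange interpolation (the curve passes through every data point) and a least squares straight line (the curve only minimizes the squared residuals). The function names and sample points are assumptions chosen for illustration.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the Lagrange polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial: equals 1 at xi and 0 at every other node.
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Three points lying on y = 3x^2 - x + 1; interpolation recovers the curve.
print(lagrange_interpolate([0, 1, 2], [1, 3, 11], 1.5))

# Noisy, roughly linear data; fitting smooths over the noise.
slope, intercept = least_squares_line([0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8])
print(slope, intercept)
```

Interpolation is the natural choice when the data are exact, while fitting is usually preferred when the data contain measurement noise.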
Three、 Hypothesis testing (probability theory and mathematical statistics)
1、 Correlation coefficients
1.1、 Pearson correlation coefficient
The Pearson correlation coefficient is suited to continuous variables that follow a normal distribution, and it is sensitive to outliers. A t-test or a similar method is usually used to test its significance. Before using it, confirm that the two variables are linearly related. When the data are continuous, normally distributed, and linearly related, the Pearson correlation coefficient is the most appropriate choice. If the data are ordinal, use the Spearman rank correlation coefficient instead.
1.2、 Spearman correlation coefficient
An alternative definition: the Pearson correlation coefficient between the ranks. The Pearson correlation coefficient is suited to linear relationships, while the Spearman correlation coefficient is suited to monotonic relationships (a linear relationship has a fixed slope, a monotonic one need not). The Pearson correlation coefficient is computed from the raw data, whereas the Spearman correlation coefficient is computed from the ranks.
1.3、 Kendall's tau coefficient
Kendall's tau coefficient, also known as the Kendall rank correlation coefficient, is another rank correlation coefficient. It targets ordinal categorical variables, such as rankings, age groups, or obesity grades (severe obesity, moderate obesity, mild obesity, not obese). It measures the monotonic relationship between two ordinal variables.
1.4、 Differences and choices
Unlike the Pearson correlation coefficient, the Spearman correlation coefficient and Kendall's tau coefficient are based on the ranks of the data. Because these estimators operate on ranks rather than on the data values, they are robust to outliers and can handle certain types of nonlinear relationships. In most cases, rank-based estimators are suited to small data sets and to specific hypothesis tests. (References: 1、 What is the correlation coefficient; 2、 Introduction to the Pearson, Spearman, and Kendall correlation coefficients and their application in feature selection)
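The relationship between the three coefficients can be checked on a small example. The sketch below computes Pearson on raw values, Spearman as Pearson on average ranks, and Kendall's tau in its simplest form (tau-a, without a tie correction). The helper names and the sample data are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation computed from the raw values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation between the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

# Monotonic but nonlinear relationship: y = x^3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(pearson(x, y), spearman(x, y), kendall_tau(x, y))
```

On this data the two rank-based coefficients report a perfect monotonic relationship, while Pearson falls below 1 because the relationship is not linear.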
2、 Hypothesis tests for the mean of a normal distribution
Common methods: t-test, Z-test, chi-square test, F-test, etc.
3、 Normality tests
Testing whether a population follows a normal distribution using observed data is called a normality test. It is an important special case of goodness-of-fit hypothesis testing in statistical inference. Common approaches are skewness and kurtosis, graphical methods, and nonparametric tests. (Reference: All normality tests are here)
3.1、 Skewness-kurtosis test
3.2、 Graphical methods: make a preliminary judgment from a histogram, P-P plot, or Q-Q plot.
3.3、 Nonparametric tests: the Kolmogorov-Smirnov test (K-S test for short), which is suited to exploring the distribution of continuous random variables and, by comparison, to large samples (>50); and the Shapiro-Wilk test (S-W test for short), which is suited to small samples.
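One common form of a skewness-kurtosis test is the Jarque-Bera statistic, sketched below. The notes do not say which skewness-kurtosis test is meant, so the choice of Jarque-Bera is an assumption, as are the function name and the sample data. The p-value relies on the asymptotic chi-square distribution with 2 degrees of freedom, so it is only approximate for small samples.

```python
import math

def jarque_bera(sample):
    """Return (skewness, kurtosis, JB statistic, asymptotic p-value)."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments of order 2, 3 and 4 (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2          # equals 3 for a normal distribution
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2)
    return skewness, kurtosis, jb, p_value

# Hypothetical samples: one symmetric, one strongly right-skewed.
symmetric = [-2, -1, 0, 1, 2]
skewed = [math.exp(i / 10) for i in range(100)]
print(jarque_bera(symmetric))
print(jarque_bera(skewed))
```

A small p-value is evidence against normality. A large p-value only means the test found no evidence of non-normality in skewness or kurtosis.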
Four、 Regression
(Reference: Super practical: understanding regression analysis)
1、 Linear regression and locally weighted linear regression
2、 Multiple regression (estimated by ordinary or generalized least squares; the generalized form allows heteroscedasticity or autocorrelation in the error term; pay attention to goodness-of-fit measures)
3、 Heteroscedasticity, multicollinearity, and stepwise regression
Heteroscedasticity: the variance of the random disturbance term varies with the independent variables. Test for its presence. If it exists, the parameter estimates are no longer efficient, significance tests on the parameters lose their meaning, and model predictions fail.
Multicollinearity: exact or high correlation among the explanatory variables distorts the model estimates or makes them hard to estimate accurately. Collinearity inflates the error of the least squares estimates of the regression coefficients. It is diagnosed with the variance inflation factor (VIF) and the tolerance, which are reciprocals of each other. Remedies: exclude the variables that cause collinearity, the difference method, lasso regression, and ridge regression.
Stepwise regression is divided into forward selection, backward elimination, and stepwise selection. (It filters out variables that cause multicollinearity, removes redundant features, and reduces prediction error. It may introduce a new problem, endogeneity, and overly aggressive selection can lead to overfitting.)
(Cross-sectional data are prone to heteroscedasticity; time series data are prone to autocorrelation.)
4、 Ridge regression (linear regression with L2 regularization: an L2-norm penalty on the parameter w is added to the mean-squared-error objective of ordinary linear regression, so that the penalized residual sum of squares is minimized; equivalently, a multiple of the identity matrix is added to the normal equations of ordinary linear regression) and lasso regression (linear regression with L1 regularization). Both add a regularization term to standard linear regression. Regularization reduces the risk of overfitting: L1 tends to learn sparse weights, while L2 tends to learn smaller, more evenly spread weights.
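The role of the identity matrix in ridge regression can be seen in its closed-form solution, w = (X^T X + lam * I)^(-1) X^T y. The sketch below implements it with a small Gaussian elimination solver and compares it with ordinary least squares (lam = 0) on two nearly collinear features. The function names, the absence of an intercept term, and the data are assumptions for illustration.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                    # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ridge_fit(X, y, lam):
    """Minimize ||y - Xw||^2 + lam * ||w||^2 (no intercept term)."""
    d = len(X[0])
    # X^T X with lam added on the diagonal: this is the identity-matrix term.
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve_linear_system(xtx, xty)

# Hypothetical data: the second feature is almost a copy of the first.
X = [[1.0, 1.01], [2.0, 1.98], [3.0, 3.02], [4.0, 3.99]]
y = [2.0, 4.0, 6.0, 8.0]
w_ols = ridge_fit(X, y, 0.0)     # ordinary least squares
w_ridge = ridge_fit(X, y, 1.0)   # ridge with lam = 1
print("OLS weights:  ", w_ols)
print("ridge weights:", w_ridge)
```

With collinear features the L2 penalty shrinks the weight vector and spreads the weight across the correlated columns, which is why ridge regression is listed above as a remedy for multicollinearity.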
Five、 Graph theory
1、 Dijkstra's algorithm (shortest paths in a weighted graph)
2、 Bellman-Ford algorithm (solves the single-source shortest path problem)
Its advantages over Dijkstra's algorithm are that edge weights may be negative and that it is simple to implement. Its disadvantage is a higher time complexity, although several optimizations can improve its efficiency.
3、 Floyd's algorithm
(An algorithm that uses dynamic programming to find the shortest paths between all pairs of vertices in a given weighted graph; its purpose is similar to that of Dijkstra's algorithm.)
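The three shortest-path algorithms above can be sketched compactly as follows. The adjacency-list representation (a dict mapping each vertex to a list of `(neighbor, weight)` pairs, with every vertex present as a key) and the sample graphs are assumptions made for this example.

```python
import heapq

INF = float("inf")

def dijkstra(graph, source):
    """Single-source shortest paths; requires non-negative edge weights."""
    dist = {node: INF for node in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def bellman_ford(graph, source):
    """Single-source shortest paths; negative edge weights are allowed."""
    dist = {node: INF for node in graph}
    dist[source] = 0
    edges = [(u, v, w) for u in graph for v, w in graph[u]]
    for _ in range(len(graph) - 1):       # at most |V| - 1 relaxation rounds
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    for u, v, w in edges:                 # one more pass detects negative cycles
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return dist

def floyd_warshall(graph):
    """All-pairs shortest paths by dynamic programming."""
    nodes = list(graph)
    dist = {u: {v: (0 if u == v else INF) for v in nodes} for u in nodes}
    for u in graph:
        for v, w in graph[u]:
            dist[u][v] = min(dist[u][v], w)
    for k in nodes:                       # allow k as an intermediate vertex
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

graph = {"A": [("B", 4), ("C", 1)], "B": [("D", 5)], "C": [("B", 2)], "D": []}
negative = {"A": [("B", 4), ("C", 5)], "B": [("D", 1)], "C": [("B", -3)], "D": []}
print(dijkstra(graph, "A"))
print(bellman_ford(negative, "A"))
print(floyd_warshall(graph)["C"]["D"])
```

Dijkstra's algorithm is the usual choice when all weights are non-negative, Bellman-Ford handles negative edges at a higher cost, and Floyd's algorithm is convenient when distances between every pair of vertices are needed on a small graph.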
Six、 Classification
1、 Logistic regression (LR): often used for binary classification (Reference: [Machine learning] Logistic regression)
2、 Linear discriminant analysis (LDA, also called Fisher discriminant analysis) and multi-class problems (apply a "splitting" strategy and solve the multi-class problem with several binary classifiers: the multi-class problem is split into several binary classification problems, a binary classifier is trained for each, and their results are combined to reach a conclusion.)
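As a minimal sketch of the logistic regression in item 1, the code below fits a binary classifier by batch gradient descent on the log-loss. The learning rate, number of epochs, function names, and the one-feature data set are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.3, epochs=3000):
    """Fit weights w and bias b by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            error = p - yi                 # gradient of the log-loss w.r.t. z
            for j in range(d):
                grad_w[j] += error * xi[j]
            grad_b += error
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    """Label 1 when the predicted probability is at least 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

# Hypothetical data: one feature, class 1 for the larger values.
X = [[0.5], [1.0], [1.5], [2.0], [3.0], [3.5], [4.0], [4.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(X, y)
print(w, b, predict(X, w, b))
```

For the multi-class strategy in item 2, one such binary classifier could be trained per class (one-vs-rest) and the class with the highest predicted probability chosen.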
Seven、 Clustering
(Reference: 5 clustering algorithms data scientists need to know)
1、 K-means clustering algorithm
The K-means clustering algorithm and the K-means++ algorithm
2、 Density-based spatial clustering of applications with noise (DBSCAN)
3、 Mean shift algorithm
4、 EM algorithm
Clustering with a Gaussian mixture model (GMM) fitted by the expectation-maximization (EM) algorithm
5、 Hierarchical clustering algorithm
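K-means with K-means++ seeding, listed as item 1 above, can be sketched as follows. The function names, the fixed random seed, and the sample points are assumptions made for this example.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: later centers are drawn with probability
    proportional to the squared distance to the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center update."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                new_centers.append(centers[c])   # keep an empty cluster's center
        if new_centers == centers:
            break                                 # converged
        centers = new_centers
    return centers, labels

# Hypothetical data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = kmeans(points, k=2)
print(sorted(centers), labels)
```

K-means++ changes only the initialization. Spreading the initial centers apart tends to reduce the chance that Lloyd's iterations settle in a poor local optimum.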
Eight、 Time series analysis
1、 Exponential smoothing
Exponential smoothing reveals the pattern of change in historical data in time order. It overcomes two shortcomings of the moving average forecasting method: not making full use of all the data in the time series, and treating the most recent N data points equally. The procedure is also clear and easy to compute. Exponential smoothing is mainly divided into single exponential smoothing and multiple (higher-order) exponential smoothing, and the order of smoothing to use depends on the time series. For example, the Jiangxi Province GDP data follow a curved trend.
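A minimal sketch of single and double exponential smoothing follows. The double version uses Brown's linear method, which is one common way to apply smoothing twice; the notes do not say which variant is intended, so that choice, the function names, the initialization with the first observation, and the sample series are assumptions.

```python
def simple_exponential_smoothing(series, alpha):
    """Single exponential smoothing, initialized with the first observation."""
    smoothed = [series[0]]
    for value in series[1:]:
        # New estimate: weighted average of the new value and the old estimate.
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

def brown_double_forecast(series, alpha, horizon=1):
    """Brown's double exponential smoothing forecast for a linear trend."""
    s1 = simple_exponential_smoothing(series, alpha)   # first smoothing
    s2 = simple_exponential_smoothing(s1, alpha)       # smoothing of the smoothing
    level = 2 * s1[-1] - s2[-1]
    trend = alpha / (1 - alpha) * (s1[-1] - s2[-1])
    return level + trend * horizon

print(simple_exponential_smoothing([10, 20, 30], 0.5))

# Hypothetical series with a steady upward trend: 1, 2, ..., 20.
trend_series = list(range(1, 21))
print(brown_double_forecast(trend_series, alpha=0.5))
```

Single smoothing lags behind a trending series, which is why a second (or higher) round of smoothing is applied when the data show a trend or curvature. Unlike a moving average, every past observation contributes, with weights that decay geometrically.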
2、 Common models:
2.1、 AR, MA, and ARMA models
2.2、 ARIMA and SARIMA models
(Reference: AR, MA, and ARMA models)
2.3、 ARCH and GARCH models
The ARCH model (autoregressive conditional heteroscedasticity model) and the GARCH model (generalized ARCH model, an extension of the ARCH model)
3、 Grey prediction GM(1,1) (commonly used when data are scarce)
4、 Neural network methods
4.1、 RNN, LSTM, and GRU
RNN: processes sequence data (a stream of interdependent data). At each step, the previous output is carried into the next hidden layer and trained together with the new input. Shortcomings: it has only short-term memory and cannot handle training on long input sequences, and training an RNN is costly. Improvements: LSTM (long short-term memory network), which retains important information over long sequences, and GRU (a simplified and adjusted variant of the LSTM model).
Nine、 Forecasting
1、 Statistical forecasting methods
1.1、 Short-term forecasting
- Decomposition analysis: for a one-off short-term forecast, or for removing seasonal variation before applying other forecasting methods; requires only the historical data of the series.
- Moving average: for repeated forecasts of series without seasonal variation; requires only the historical data of the dependent variable; choosing the weights for the first time takes effort.
- Exponential smoothing: for repeated forecasts with or without seasonal variation; requires only the historical data of the dependent variable; building the model takes time.
- Adaptive filtering: for repeated forecasts where the nature of the trend pattern changes over time and there is no seasonal variation; requires only the historical data of the dependent variable, but formulating and checking the model specification is time-consuming.
- Stationary time series forecasting: an advanced forecasting method applicable to series with any development pattern, but the computation is complex and tedious.
- Intervention analysis model forecasting: requires historical data and the time of the intervention.
1.2、 Short- and medium-term forecasting
- Linear regression forecasting (the most time-consuming)
- Nonlinear regression forecasting (requires testing multiple models)
- Grey forecasting (applicable to time series whose development follows an exponential trend; based on historical data)
- State space models and the Kalman filter (applicable to many kinds of time series; a state space model is built from historical data)
1.3、 Medium- and long-term forecasting
- Trend extrapolation (used when the variables related to the forecast item can be expressed as functions of time; uses nonlinear regression; requires only historical data; time-consuming)
2、 Machine learning methods
Ten、 Common mathematical programming problems (LINGO)
1、 Goal programming (GP); solution ideas: weighting factors, priority levels, efficient solutions
2、 Nonlinear programming (constrained / unconstrained)
3、 Dynamic programming (DP)
4、 Integer programming
Eleven、 Other supplements
1、 Grey relational analysis: this method is usually used to analyze how strongly various factors influence a result. It can also be used for comprehensive evaluation problems that change over time. Its core is to build, according to certain rules, a parent (reference) sequence that changes over time, treat the changes of each evaluation object over time as a sub-sequence, compute the degree of correlation between each sub-sequence and the parent sequence, and draw conclusions from those degrees.
2、 Common machine learning methods
Publisher: Full stack programmer webmaster. Please credit the source when reprinting: https://javaforall.cn/128977.html Original link: https://javaforall.cn