SAS discriminant analysis (Bayes criterion and proc discrim process)

2022-06-11 01:26:00 I have a clear idea

The following table shows financial data for two types of companies. The first type is companies that went bankrupt: the table gives four financial indicators for each of them, measured about two years before bankruptcy. The second type is companies that did not go bankrupt, with the same four indicators measured over the same period. The four indicators are labeled x1 through x4.

  The data for each company are shown in the table below. In the last column, 0 denotes a bankrupt company and 1 denotes a non-bankrupt company.

number  x1     x2     x3    x4    group
1       -0.45  -0.41  1.09  0.45  0
2       -0.56  -0.31  1.51  0.16  0
3       0.06   0.02   1.01  0.4   0
4       -0.07  -0.09  1.45  0.26  0
5       -0.1   -0.09  1.56  0.67  0
6       -0.14  -0.07  0.71  0.28  0
7       0.04   0.01   1.5   0.71  0
8       -0.06  -0.06  1.37  0.4   0
9       0.07   -0.01  1.37  0.34  0
10      -0.13  -0.14  1.42  0.44  0
11      -0.23  -0.3   0.33  0.18  0
12      0.07   0.02   1.31  0.25  0
13      0.01   0      2.15  0.7   0
14      -0.28  -0.23  1.19  0.66  0
15      0.15   0.05   1.88  0.27  0
16      0.37   0.11   1.99  0.38  0
17      -0.08  -0.08  1.51  0.42  0
18      0.05   0.03   1.68  0.95  0
19      0.01   0      1.26  0.6   0
20      0.12   0.11   1.14  0.17  0
21      -0.28  -0.27  1.27  0.51  0
1       0.51   0.1    2.49  0.54  1
2       0.08   0.02   2.01  0.53  1
3       0.38   0.11   3.27  0.35  1
4       0.19   0.05   2.25  0.33  1
5       0.32   0.07   4.24  0.63  1
6       0.31   0.05   4.45  0.69  1
7       0.12   0.05   2.52  0.69  1
8       -0.02  0.02   2.05  0.35  1
9       0.22   0.08   2.35  0.4   1
10      0.17   0.07   1.8   0.52  1
11      0.15   0.05   2.17  0.55  1
12      -0.1   -0.01  2.5   0.58  1
13      0.14   -0.03  0.46  0.26  1
14      0.14   0.07   2.61  0.52  1
15      0.15   0.06   2.23  0.56  1
16      0.16   0.05   2.31  0.2   1
17      0.29   0.06   1.84  0.38  1
18      0.54   0.11   2.33  0.48  1
19      -0.33  -0.09  3.01  0.47  1
20      0.48   0.09   1.24  0.18  1
21      0.56   0.11   4.29  0.45  1
22      0.2    0.08   1.99  0.3   1
23      0.47   0.16   2.92  0.45  1
24      0.17   0.04   2.45  0.14  1
25      0.58   0.04   5.06  0.13  1

  Experimental code:

/* Import the Excel sheet into the work data set temp1 */
proc import out=temp1
    datafile="C:\Users\86166\Desktop\IT\SAS experiment \ experiment 9\1.xls"
    dbms=EXCEL2000 replace;
run;

/* 1, 2, 3: x1 and x2, within-group covariance matrices (pool=no), equal priors */
proc discrim data=temp1 wcov simple pool=no manova method=normal crosslisterr listerr;
    class group;
    var x1-x2;
    priors equal;
run;

/* 4: same variables, priors 0.05 for bankrupt (0) and 0.95 for non-bankrupt (1) */
proc discrim data=temp1 pool=no manova method=normal crosslisterr listerr;
    class group;
    var x1-x2;
    priors '0'=0.05 '1'=0.95;
run;

/* 5: same variables, pooled covariance matrix (pool=yes), equal priors */
proc discrim data=temp1 pool=yes manova method=normal crosslisterr listerr;
    class group;
    var x1-x2;
    priors equal;
run;

/* 6: x1 with x3, under equal and then unequal priors */
proc discrim data=temp1 wcov simple pool=no manova method=normal crosslisterr listerr;
    class group;
    var x1 x3;
    priors equal;
run;
proc discrim data=temp1 pool=no manova method=normal crosslisterr listerr;
    class group;
    var x1 x3;
    priors '0'=0.05 '1'=0.95;
run;

/* 6 (continued): x1 with x4, under equal and then unequal priors */
proc discrim data=temp1 wcov simple pool=no manova method=normal crosslisterr listerr;
    class group;
    var x1 x4;
    priors equal;
run;
proc discrim data=temp1 pool=no manova method=normal crosslisterr listerr;
    class group;
    var x1 x4;
    priors '0'=0.05 '1'=0.95;
run;

/* 7: all four indicators, under equal and then unequal priors */
proc discrim data=temp1 wcov simple pool=no manova method=normal crosslisterr listerr;
    class group;
    var x1-x4;
    priors equal;
run;
proc discrim data=temp1 pool=no manova method=normal crosslisterr listerr;
    class group;
    var x1-x4;
    priors '0'=0.05 '1'=0.95;
run;
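
The runs above send everything to the printed output. If you want the classifications in a data set so that the two sets of priors can be tabulated side by side, one possible approach is sketched below. It is not part of the original experiment: the data set names cv_equal and cv_skewed are made up for illustration, and the sketch assumes temp1 was imported as shown above.

```sas
/* Sketch only: cv_equal and cv_skewed are hypothetical data set names. */

/* Save the cross-validated classifications under equal priors */
proc discrim data=temp1 pool=no method=normal outcross=cv_equal noprint;
    class group;
    var x1-x4;
    priors equal;
run;

/* Same model under the 0.05 / 0.95 priors */
proc discrim data=temp1 pool=no method=normal outcross=cv_skewed noprint;
    class group;
    var x1-x4;
    priors '0'=0.05 '1'=0.95;
run;

/* Actual class (rows) against assigned class _INTO_ (columns) */
proc freq data=cv_equal;
    tables group*_INTO_ / nocol nopercent;
run;
proc freq data=cv_skewed;
    tables group*_INTO_ / nocol nopercent;
run;
```

The row percentages in each table show, class by class, how many companies are assigned to the wrong group, which is easier to compare across priors than a single total.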

Experimental results: see the discriminant analysis code, output screenshots, and data set.

Analysis of the experimental results:

Problems encountered in the experiment and their solutions:

Problem: How can we decide which results are more reliable when the prior probabilities differ?

Solution: For now, the estimated misclassification rates are compared directly.
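
A short note on why this direct comparison needs some care. The formulas below are the standard Bayes rule for two classes and, as far as I understand the PROC DISCRIM output, the way its Total error rate is weighted by the priors. The symbols q_k, f_k, and e_k are notation introduced here, not taken from the original post.

```latex
% q_k   : prior probability of class k (k = 0 bankrupt, k = 1 non-bankrupt)
% f_k(x): density of class k at x (multivariate normal under method=normal)
P(k \mid x) = \frac{q_k \, f_k(x)}{q_0 \, f_0(x) + q_1 \, f_1(x)}, \qquad k = 0, 1

% e_k: estimated misclassification rate for class k
e_{\mathrm{total}} = q_0 \, e_0 + q_1 \, e_1
```

With q_0 = 0.05 and q_1 = 0.95, the total is dominated by e_1, so a small total can coexist with many bankrupt companies being misclassified. Looking at the per-class rates alongside the total gives a fairer comparison between runs with different priors.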

Lessons from the experiment (conclusions, evaluation, thoughts, and suggestions)

  1. simple prints simple descriptive statistics such as the means, and wcov prints the within-group covariance matrices. pool=yes uses the pooled covariance matrix, pool=no uses the within-group covariance matrices, and pool=test runs a homogeneity test on the within-group covariance matrices to choose between the two. manova prints four multivariate statistics for testing whether the group means are equal. One of them, Wilks' lambda, measures the ratio of within-group variation to total variation (in the univariate case, the within-group sum of squares divided by the total sum of squares). A value close to 1 means the group means are essentially the same. Discriminant analysis is only meaningful when the group means differ.
  2. listerr and crosslisterr both list the misclassified observations, with each observation assigned to the class of maximum posterior probability: listerr under resubstitution, and crosslisterr under cross-validation (the jackknife, or leave-one-out, method). method=normal specifies that the populations are normally distributed. priors equal specifies equal prior probabilities; different priors can also be specified for the individual classes.
  3. When the populations are normally distributed and their covariance matrices are not equal, use the within-group covariance matrices: pool=no, method=normal. When the covariance matrices are equal, use the pooled covariance matrix: pool=yes, method=normal. In both cases priors can be equal, proportional to the sample frequencies, or set to specific values. The pooled covariance matrix is preferred for small samples, and the priors are usually set equal. When the populations are not normally distributed, use method=npar, which discriminates with a nonparametric method.
  4. The mean vectors of the overall sample and of each class are obtained with simple.
    wcov prints the within-group covariance matrices, that is, the sample covariance matrix of each class.
    pcov prints the pooled covariance matrix. Which of the two covariance estimates is used depends on pool.
    pool=yes uses the pooled covariance matrix, which assumes the populations are normal with equal covariance matrices.
    pool=no uses the within-group covariance matrices, which assumes the population covariance matrices differ.
    pool=test applies Bartlett's modification of the likelihood ratio test for homogeneity of the within-group covariance matrices; slpool sets the significance level of that test, with a default of 0.1.
    method=normal means the classes follow multivariate normal distributions; method=npar means a nonparametric method is used when they do not (it also requires the k= or r= option).
    crosslisterr lists the observations misclassified under cross-validation, using the jackknife method.
    listerr lists the observations misclassified under resubstitution, based on posterior probabilities; each observation is assigned to the class with the largest posterior probability, which is equivalent to the smallest generalized squared distance.
    priors equal means the prior probabilities are equal; priors proportional means the prior probabilities equal the sample frequencies. You can also give a prior probability for each class level, but the values must sum to 1.
    To compare classification rules, look at the Total entry in the misclassification (error-rate) table. In general, the smaller it is, the better.
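
Several options described in the list above (pool=test, slpool, priors proportional, method=npar) are not exercised by the experiment code. A minimal sketch of how they might be tried on the same temp1 data set follows. The choice k=5 for the nearest-neighbor rule is arbitrary and purely illustrative, and neither run is part of the original experiment.

```sas
/* Sketch only: let the homogeneity test decide between the pooled and    */
/* within-group covariance matrices, with priors taken from the sample.   */
proc discrim data=temp1 pool=test slpool=0.1 method=normal crosslisterr listerr;
    class group;
    var x1-x4;
    priors proportional;
run;

/* Sketch only: nonparametric k-nearest-neighbor rule.                    */
/* k=5 is an arbitrary illustrative value, not a recommendation.          */
proc discrim data=temp1 method=npar k=5 crosslisterr listerr;
    class group;
    var x1-x4;
    priors equal;
run;
```

In the first run, the output reports the test of homogeneity and states which covariance estimate was used as a result, which is a quick way to check whether pool=no or pool=yes is the more defensible choice for this data.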
Copyright notice
This article was written by [I have a clear idea]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/162/202206110018150211.html