"Analysis of 43 cases of MATLAB neural network": Chapter 30 design of combined classifier based on random forest idea - breast cancer diagnosis
2022-07-01 08:35:00 【mozun2020】
1. Preface
"Analysis of 43 Cases of MATLAB Neural Network" was planned by the MATLAB Technology Forum (www.matlabsky.com), led by Wang Xiaochuan, and published in 2013 by Beijing University of Aeronautics and Astronautics Press. It is an example-based teaching book that uses MATLAB as its tool, revised and expanded from "Analysis of 30 Cases of MATLAB Neural Network". It keeps the "theoretical explanation, case analysis, application extension" structure, which helps readers learn neural networks more intuitively and vividly.
The book has 43 chapters covering common neural networks (BP, RBF, SOM, Hopfield, Elman, LVQ, Kohonen, GRNN, NARX, etc.) and related intelligent algorithms (SVM, decision trees, random forests, extreme learning machines, etc.). Some chapters also cover combining common optimization algorithms (genetic algorithm, ant colony algorithm, etc.) with neural networks. In addition, the book introduces new functions and features of the Neural Network Toolbox in MATLAB R2012b, such as parallel computing for neural networks, custom neural networks, and efficient neural network programming.
With the rise of artificial intelligence research in recent years, neural networks have seen another surge of interest. Because of their outstanding performance in signal processing, neural network methods are being applied widely in speech and image applications. This article reproduces the cases in the book through simulation. It is a relearning exercise: I hope to review the old and learn the new, and to strengthen my understanding and practice of applying neural networks in various fields. I happened to pick up this book recently, so these posts walk through the simulation examples, mainly showing how to run the source code of each chapter. The simulations are run on MATLAB 2015b (32-bit). This post covers the example in Chapter 30, the design of a combined classifier based on the random forest idea. Without further ado, let's start!
2. MATLAB Simulation example
Open MATLAB, click "Home", click "Open", and find the sample file.
Select main.m and click "Open".
The source code of main.m is as follows:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Function: design of a combined classifier based on the random forest idea
% Environment: Win7, MATLAB 2015b
% Modi: C.S
% Date: 2022-06-20
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Design of combined classifier based on random forest idea
%% Clear environment variables
clear all
clc
warning off
tic
%% Import data
load data.mat
% Randomly split the data into a training set and a test set
a = randperm(569);
Train = data(a(1:500),:);
Test = data(a(501:end),:);
% Training data
P_train = Train(:,3:end);
T_train = Train(:,2);
% Test data
P_test = Test(:,3:end);
T_test = Test(:,2);
%% Create a random forest classifier
model = classRF_train(P_train,T_train);
%% Simulation test
[T_sim,votes] = classRF_predict(P_test,model);
%% Result analysis
count_B = sum(T_train == 1);
count_M = sum(T_train == 2);
total_B = sum(data(:,2) == 1);
total_M = sum(data(:,2) == 2);
number_B = sum(T_test == 1);
number_M = sum(T_test == 2);
number_B_sim = sum(T_sim == 1 & T_test == 1);
number_M_sim = sum(T_sim == 2 & T_test == 2);
disp(['Total number of cases: ' num2str(569)...
      '  Benign: ' num2str(total_B)...
      '  Malignant: ' num2str(total_M)]);
disp(['Total number of training set cases: ' num2str(500)...
      '  Benign: ' num2str(count_B)...
      '  Malignant: ' num2str(count_M)]);
disp(['Total number of cases in the test set: ' num2str(69)...
      '  Benign: ' num2str(number_B)...
      '  Malignant: ' num2str(number_M)]);
disp(['Benign breast tumors diagnosed: ' num2str(number_B_sim)...
      '  Misdiagnosed: ' num2str(number_B - number_B_sim)...
      '  Diagnostic rate p1=' num2str(number_B_sim/number_B*100) '%']);
disp(['Malignant breast tumors diagnosed: ' num2str(number_M_sim)...
      '  Misdiagnosed: ' num2str(number_M - number_M_sim)...
      '  Diagnostic rate p2=' num2str(number_M_sim/number_M*100) '%']);
%% Plot
figure
index = find(T_sim ~= T_test);
plot(votes(index,1),votes(index,2),'r*')
hold on
index = find(T_sim == T_test);
plot(votes(index,1),votes(index,2),'bo')
hold on
legend('Misclassified samples','Correctly classified samples')
plot(0:500,500:-1:0,'r-.')
hold on
plot(0:500,0:500,'r-.')
hold on
line([100 400 400 100 100],[100 100 400 400 100])
xlabel('Number of decision trees voting for class 1')
ylabel('Number of decision trees voting for class 2')
title('Performance analysis of the random forest classifier')
%% Influence of the number of decision trees on random forest performance
Accuracy = zeros(1,20);
for i = 50:50:1000
    %i
    % For each tree count, run 100 times and take the average
    accuracy = zeros(1,100);
    for k = 1:100
        % Create the random forest
        model = classRF_train(P_train,T_train,i);
        % Simulation test
        T_sim = classRF_predict(P_test,model);
        accuracy(k) = length(find(T_sim == T_test)) / length(T_test);
    end
    Accuracy(i/50) = mean(accuracy);
end
% Plot
figure
plot(50:50:1000,Accuracy)
xlabel('Number of decision trees in the random forest')
ylabel('Classification accuracy')
title('Influence of the number of decision trees on random forest performance')
toc
Once the code has been added, click "Run" to start the simulation. The output is as follows:
Setting to defaults 500 trees and mtry=5
Total number of cases: 569  Benign: 357  Malignant: 212
Total number of training set cases: 500  Benign: 319  Malignant: 181
Total number of cases in the test set: 69  Benign: 38  Malignant: 31
Benign breast tumors diagnosed: 36  Misdiagnosed: 2  Diagnostic rate p1=94.7368%
Malignant breast tumors diagnosed: 30  Misdiagnosed: 1  Diagnostic rate p2=96.7742%
Elapsed time is 622.343231 seconds.
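The diagnostic rates in this output follow directly from the printed counts. As a quick check, the sketch below recomputes them from those numbers alone. It needs neither data.mat nor the random forest toolbox, and the overall accuracy it adds is not printed by main.m (the variable acc is introduced here only for illustration).

```matlab
% Counts copied from the printed output above (test set of 69 cases).
number_B = 38;      % benign cases in the test set
number_M = 31;      % malignant cases in the test set
number_B_sim = 36;  % benign cases classified correctly
number_M_sim = 30;  % malignant cases classified correctly

% Per-class diagnostic rates, using the same formula as main.m.
p1 = number_B_sim / number_B * 100;
p2 = number_M_sim / number_M * 100;

% Overall test-set accuracy (an addition; main.m does not print it).
acc = (number_B_sim + number_M_sim) / (number_B + number_M) * 100;

fprintf('p1 = %.4f%%, p2 = %.4f%%, overall = %.4f%%\n', p1, p2, acc);
```

Because the split is drawn with randperm, each run produces different counts, so these figures describe only this particular run.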




3. Summary
A random forest is a classifier that uses multiple trees to train on and predict samples. In machine learning, a random forest contains multiple decision trees, and its output class is the mode of the classes output by the individual trees. Leo Breiman and Adele Cutler developed the random forest algorithm, and "Random Forests" is their trademark. The term derives from the random decision forests proposed in 1995 by Tin Kam Ho of Bell Laboratories. The method combines Breiman's "bootstrap aggregating" idea with Ho's "random subspace method" to build a collection of decision trees. The visual machine learning column also includes simulation examples for random forests; see the link at the end of the article. If you are interested in this chapter or want to study it thoroughly, it is recommended to read Chapter 30 of the book. Some of these topics will be supplemented later based on my own understanding. You are welcome to study and discuss together.
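The three ingredients named above (bootstrap sampling, random feature subsets, and a majority vote) can be sketched in base MATLAB without any toolbox. This is a minimal illustration and not the implementation behind classRF_train: the sizes, the variable names bootIdx, featIdx, and treePred, and the hard-coded tree predictions are all hypothetical.

```matlab
% Illustrative sizes only; not taken from data.mat or the toolbox.
rng(0);                 % fix the seed so the sketch is repeatable
nSamples  = 10;         % training samples
nFeatures = 30;         % features per sample
mtry      = 5;          % size of the random feature subset

% 1) Bootstrap aggregating: each tree trains on nSamples rows drawn
%    with replacement, so some rows repeat and others are left out.
bootIdx = randi(nSamples, nSamples, 1);

% 2) Random subspace: only a random subset of mtry features is examined.
%    (Breiman's random forest redraws this subset at every split; it is
%    drawn once here for brevity.)
featIdx = randperm(nFeatures, mtry);

% 3) Majority vote: hypothetical predictions (1 = benign, 2 = malignant)
%    of 7 trees (rows) for 3 test samples (columns).
treePred = [1 2 2;
            1 2 1;
            1 1 2;
            2 2 2;
            1 2 2;
            1 2 1;
            1 2 2];
T_sim = mode(treePred, 1);   % most frequent class in each column
votes = [sum(treePred == 1, 1); sum(treePred == 2, 1)]';  % per-class counts

disp(T_sim);   % predicted class of each test sample
disp(votes);   % rows: samples; columns: votes for class 1 and class 2
```

The votes matrix built here has the same layout as the second output of classRF_predict as it is used in main.m: one row per sample, with the two columns summing to the number of trees. This is why the points in the first figure fall on the diagonal where the two vote counts sum to 500.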