"Analysis of 43 Cases of MATLAB Neural Network": Chapter 30, Design of a Combined Classifier Based on the Random Forest Idea (Breast Cancer Diagnosis)
2022-07-01 08:35:00 【mozun2020】
1. Preface
"Analysis of 43 Cases of MATLAB Neural Network" was planned by the MATLAB Technology Forum (www.matlabsky.com) and led by Wang Xiaochuan. Published in 2013 by Beijing University of Aeronautics and Astronautics Press, it is an example-driven teaching book that uses MATLAB as its tool. It revises and supplements "Analysis of 30 Cases of MATLAB Neural Network" and keeps the "theoretical explanation, case analysis, application extension" structure, which helps readers learn neural networks more intuitively and vividly.
The book has 43 chapters. It covers common neural networks (BP, RBF, SOM, Hopfield, Elman, LVQ, Kohonen, GRNN, NARX, etc.) and related intelligent algorithms (SVM, decision trees, random forests, extreme learning machines, etc.). Some chapters also combine common optimization algorithms (genetic algorithms, ant colony algorithms, etc.) with neural networks. In addition, the book introduces the new functions and features of the Neural Network Toolbox in MATLAB R2012b, such as neural network parallel computing, custom neural networks, and efficient neural network programming.
With the rise of artificial intelligence research in recent years, neural networks have seen another surge of interest. Because of their outstanding performance in signal processing, neural network methods are also being applied to many speech and image applications. This article works through the cases in the book by simulation, as a way of relearning the material. I hope that reviewing the old will teach me something new and strengthen my understanding and practice of neural network applications in various fields. I have only recently picked up this book, so let's start with the simulation examples, mainly to introduce how the source code in each chapter is used. The simulations here are run on MATLAB 2015b (32-bit). This installment covers the example from Chapter 30 of the book, the design of a combined classifier based on the random forest idea. Without further ado, let's begin!
2. MATLAB Simulation example
Open MATLAB, click "Home", then click "Open" and locate the sample file.
Select main.m and click "Open".
The source code of main.m is as follows:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Purpose: design of a combined classifier based on the random forest idea
% Environment: Win7, MATLAB 2015b
% Modi: C.S
% Date: 2022-06-20
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Design of combined classifier based on random forest idea
%% Clear environment variables
clear all
clc
warning off
tic
%% Import data
load data.mat
% Randomly split the 569 cases into a training set (500) and a test set (69)
a = randperm(569);
Train = data(a(1:500),:);
Test = data(a(501:end),:);
% Training data
P_train = Train(:,3:end);
T_train = Train(:,2);
% Test data
P_test = Test(:,3:end);
T_test = Test(:,2);
%% Create a random forest classifier
% classRF_train and classRF_predict are not built into MATLAB; they must be on the MATLAB path
model = classRF_train(P_train,T_train);
%% Simulation test
[T_sim,votes] = classRF_predict(P_test,model);
%% Result analysis
count_B = length(find(T_train == 1));
count_M = length(find(T_train == 2));
total_B = length(find(data(:,2) == 1));
total_M = length(find(data(:,2) == 2));
number_B = length(find(T_test == 1));
number_M = length(find(T_test == 2));
number_B_sim = length(find(T_sim == 1 & T_test == 1));
number_M_sim = length(find(T_sim == 2 & T_test == 2));
disp(['Total number of cases: ' num2str(569)...
      ' Benign: ' num2str(total_B)...
      ' Malignant: ' num2str(total_M)]);
disp(['Total number of training set cases: ' num2str(500)...
      ' Benign: ' num2str(count_B)...
      ' Malignant: ' num2str(count_M)]);
disp(['Total number of cases in the test set: ' num2str(69)...
      ' Benign: ' num2str(number_B)...
      ' Malignant: ' num2str(number_M)]);
disp(['Benign breast tumor was diagnosed: ' num2str(number_B_sim)...
      ' Misdiagnosis: ' num2str(number_B - number_B_sim)...
      ' Diagnostic rate p1=' num2str(number_B_sim/number_B*100) '%']);
disp(['Malignant breast tumor was diagnosed: ' num2str(number_M_sim)...
      ' Misdiagnosis: ' num2str(number_M - number_M_sim)...
      ' Diagnostic rate p2=' num2str(number_M_sim/number_M*100) '%']);
%% Plotting
figure
% Misclassified test samples (red stars)
index = find(T_sim ~= T_test);
plot(votes(index,1),votes(index,2),'r*')
hold on
% Correctly classified test samples (blue circles)
index = find(T_sim == T_test);
plot(votes(index,1),votes(index,2),'bo')
hold on
legend('Misclassified samples','Correctly classified samples')
% Line x + y = 500 (total votes of the default 500 trees)
plot(0:500,500:-1:0,'r-.')
hold on
% Line x = y (equal votes for both classes)
plot(0:500,0:500,'r-.')
hold on
% Box from 100 to 400 votes on both axes
line([100 400 400 100 100],[100 100 400 400 100])
xlabel('Number of decision trees voting for class 1')
ylabel('Number of decision trees voting for class 2')
title('Performance analysis of random forest classifier')
%% Influence of the number of decision trees on random forest performance
Accuracy = zeros(1,20);
for i = 50:50:1000
    %i
    % For each forest size, run 100 times and take the average
    accuracy = zeros(1,100);
    for k = 1:100
        % Create a random forest with i trees
        model = classRF_train(P_train,T_train,i);
        % Simulation test
        T_sim = classRF_predict(P_test,model);
        accuracy(k) = length(find(T_sim == T_test)) / length(T_test);
    end
    Accuracy(i/50) = mean(accuracy);
end
% Plotting
figure
plot(50:50:1000,Accuracy)
xlabel('Number of decision trees in the random forest')
ylabel('Classification accuracy')
title('Influence of the number of decision trees on random forest performance')
toc
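To make the vote plot in main.m easier to read, here is a minimal sketch with made-up vote counts (the values in `votes` below are hypothetical, not output from the book's data). It assumes, as the axis labels in main.m suggest, that column 1 of `votes` counts the trees voting for class 1 and column 2 counts the trees voting for class 2. With the default 500 trees, each row should then sum to 500, so every sample sits on the dashed line x + y = 500, and the dashed line x = y separates the two predicted classes.

```matlab
% Hypothetical vote counts for 4 test samples from a 500-tree forest.
% Column 1: trees voting for class 1, column 2: trees voting for class 2.
ntree = 500;
votes = [480  20;
         260 240;
          90 410;
         130 370];

% Every tree casts exactly one vote, so each row sums to ntree.
assert(all(sum(votes, 2) == ntree));

% The predicted class is the column with the larger vote count.
[~, T_from_votes] = max(votes, [], 2);

% Samples inside the 100..400 box have a comparatively narrow vote margin.
in_box = votes(:,1) >= 100 & votes(:,1) <= 400;

% One row per sample: [predicted class, inside box (1) or not (0)]
disp([T_from_votes in_box])
```

Under these assumptions, points inside the box are the samples the forest is less sure about, which is probably why main.m draws it.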
Once everything is in place, click "Run" to start the simulation. The output is as follows:
Setting to defaults 500 trees and mtry=5
Total number of cases: 569 Benign: 357 Malignant: 212
Total number of training set cases: 500 Benign: 319 Malignant: 181
Total number of cases in the test set: 69 Benign: 38 Malignant: 31
Benign breast tumor was diagnosed: 36 Misdiagnosis: 2 Diagnostic rate p1=94.7368%
Malignant breast tumor was diagnosed: 30 Misdiagnosis: 1 Diagnostic rate p2=96.7742%
Elapsed time is 622.343231 seconds.
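The two diagnostic rates in the output can be checked by hand from the printed counts. The short sketch below reuses the variable names from main.m with the numbers copied from this particular run. The `overall` value is an extra figure added here for illustration; main.m does not print it.

```matlab
% Counts copied from the printed output of this run.
number_B = 38;  number_B_sim = 36;   % benign test cases / correctly diagnosed
number_M = 31;  number_M_sim = 30;   % malignant test cases / correctly diagnosed

p1 = number_B_sim / number_B * 100;  % benign diagnostic rate
p2 = number_M_sim / number_M * 100;  % malignant diagnostic rate

% Overall accuracy on the 69 test cases (not printed by main.m).
overall = (number_B_sim + number_M_sim) / (number_B + number_M) * 100;

fprintf('p1 = %.4f%%, p2 = %.4f%%, overall = %.4f%%\n', p1, p2, overall);
% prints: p1 = 94.7368%, p2 = 96.7742%, overall = 95.6522%
```

Because main.m splits the data with an unseeded randperm, these counts will generally differ from run to run.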
3. Summary
A random forest is a classifier that trains multiple decision trees and uses them together to predict samples. The output class is the mode of the classes output by the individual trees. Leo Breiman and Adele Cutler developed the random forest algorithm, and "Random Forests" is their trademark. The term derives from the random decision forests proposed by Tin Kam Ho of Bell Laboratories in 1995. The method combines Breiman's "bootstrap aggregating" idea with Ho's "random subspace method" to build a collection of decision trees. The visual machine learning column also contains a random forest simulation example, which can be reached through the link at the end of the article. Readers who are interested in this chapter, or who want to study it thoroughly, are advised to read Chapter 30 of the book. I will supplement some of these points later based on my own understanding. You are welcome to study and discuss together.
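The three ingredients named above (bootstrap aggregating, random subspaces, and a mode vote) can be sketched in a few lines of base MATLAB. This is only a toy illustration under stated assumptions: the data is synthetic, every name is hypothetical, and each "tree" is a single-threshold stump on one randomly chosen feature. It is not the classRF_train implementation, and a real random forest grows full trees and picks a random feature subset at every split.

```matlab
% Toy illustration of the random forest idea using one-split "stumps".
% All data and names here are hypothetical.
rng(0);                                   % fixed seed for repeatability
n = 200;
X = [randn(n/2, 4); randn(n/2, 4) + 2];   % two Gaussian classes, 4 features
y = [ones(n/2, 1); 2 * ones(n/2, 1)];     % class labels 1 and 2

ntree = 25;                               % odd, so a two-class vote cannot tie
preds = zeros(n, ntree);
for t = 1:ntree
    boot = randi(n, n, 1);                % bootstrap sample (with replacement)
    f = randi(size(X, 2));                % random subspace: one random feature
    Xb = X(boot, f);
    yb = y(boot);
    % Stump threshold: midpoint between the two class means on feature f.
    % This assumes class 2 has the larger mean, as in the synthetic data.
    thr = (mean(Xb(yb == 1)) + mean(Xb(yb == 2))) / 2;
    preds(:, t) = 1 + (X(:, f) > thr);    % predict class 2 above the threshold
end

y_hat = mode(preds, 2);                   % mode (majority vote) across stumps
fprintf('Ensemble training accuracy: %.3f\n', mean(y_hat == y));
```

The accuracy printed here is measured on the training data, so it only shows that the vote works, not how well such an ensemble generalizes.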