"Analysis of 43 Cases of MATLAB Neural Network": Chapter 30, Design of a Combined Classifier Based on the Random Forest Idea - Breast Cancer Diagnosis
2022-07-01 08:35:00 【mozun2020】
1. Preface
"Analysis of 43 Cases of MATLAB Neural Network" is an example-based MATLAB textbook planned by the MATLAB Technology Forum (www.matlabsky.com), led by Wang Xiaochuan, and published in 2013 by Beijing University of Aeronautics and Astronautics Press. It revises and extends "Analysis of 30 Cases of MATLAB Neural Network" and keeps that book's structure of theory explanation, case analysis, and application extension, to help readers learn neural networks in a more intuitive and vivid way.
The book has 43 chapters. It covers common neural networks (BP, RBF, SOM, Hopfield, Elman, LVQ, Kohonen, GRNN, NARX, etc.) and related intelligent algorithms (SVM, decision trees, random forests, extreme learning machines, etc.). Some chapters also combine common optimization algorithms (genetic algorithm, ant colony algorithm, etc.) with neural networks. In addition, the book introduces new functions and features of the Neural Network Toolbox in MATLAB R2012b, such as parallel computing for neural networks, custom neural networks, and efficient neural network programming.
In recent years, with the rise of artificial intelligence research, neural networks have seen another wave of research interest. Because of their strong performance in signal processing, neural network methods are also being applied to a wide range of speech and image tasks. This article works through the cases in the book and reproduces them in simulation. It is a relearning exercise: I hope to review old material, learn something new, and strengthen my understanding and practice of neural network applications in various fields. I picked this book up in my spare time, so the focus here is on the simulation examples, mainly presenting the source code examples from each chapter. The simulations in this article were run on MATLAB 2015b (32-bit). This post covers the Chapter 30 example, the design of a combined classifier based on the random forest idea. Without further ado, let's begin!
2. MATLAB Simulation example
Open MATLAB, click "Home", click "Open", and find the sample file.
Select main.m and click "Open".
The source code of main.m is as follows:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Function: Design of combined classifier based on random forest idea
% Environment: Win7, Matlab2015b
% Modi: C.S
% Date: 2022-06-20
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Design of combined classifier based on random forest idea
%% Clear environment variables
clear all
clc
warning off
tic
%% Import data
load data.mat
% Randomly generate the training set / test set
a = randperm(569);
Train = data(a(1:500),:);
Test = data(a(501:end),:);
% Training data (column 2 is the class label, columns 3 onward are the features)
P_train = Train(:,3:end);
T_train = Train(:,2);
% Test data
P_test = Test(:,3:end);
T_test = Test(:,2);
%% Create a random forest classifier
model = classRF_train(P_train,T_train);
%% Simulation test
[T_sim,votes] = classRF_predict(P_test,model);
%% Result analysis
% Class 1 = benign, class 2 = malignant
count_B = sum(T_train == 1);
count_M = sum(T_train == 2);
total_B = sum(data(:,2) == 1);
total_M = sum(data(:,2) == 2);
number_B = sum(T_test == 1);
number_M = sum(T_test == 2);
number_B_sim = sum(T_sim == 1 & T_test == 1);
number_M_sim = sum(T_sim == 2 & T_test == 2);
disp(['Total number of cases: ' num2str(569)...
      '  Benign: ' num2str(total_B)...
      '  Malignant: ' num2str(total_M)]);
disp(['Total number of training set cases: ' num2str(500)...
      '  Benign: ' num2str(count_B)...
      '  Malignant: ' num2str(count_M)]);
disp(['Total number of test set cases: ' num2str(69)...
      '  Benign: ' num2str(number_B)...
      '  Malignant: ' num2str(number_M)]);
disp(['Benign breast tumors correctly diagnosed: ' num2str(number_B_sim)...
      '  Misdiagnosed: ' num2str(number_B - number_B_sim)...
      '  Diagnostic rate p1=' num2str(number_B_sim/number_B*100) '%']);
disp(['Malignant breast tumors correctly diagnosed: ' num2str(number_M_sim)...
      '  Misdiagnosed: ' num2str(number_M - number_M_sim)...
      '  Diagnostic rate p2=' num2str(number_M_sim/number_M*100) '%']);
%% Plotting
figure
index = find(T_sim ~= T_test);
plot(votes(index,1),votes(index,2),'r*')
hold on
index = find(T_sim == T_test);
plot(votes(index,1),votes(index,2),'bo')
plot(0:500,500:-1:0,'r-.')
plot(0:500,0:500,'r-.')
line([100 400 400 100 100],[100 100 400 400 100])
% Called last with two labels, so only the first two plotted series are labeled
legend('Misclassified samples','Correctly classified samples')
xlabel('Number of decision trees voting for class 1')
ylabel('Number of decision trees voting for class 2')
title('Performance analysis of random forest classifier')
%% Influence of the number of decision trees on random forest performance
Accuracy = zeros(1,20);
for i = 50:50:1000
    % For each number of trees, run 100 times and take the average
    accuracy = zeros(1,100);
    for k = 1:100
        % Create the random forest with i trees
        model = classRF_train(P_train,T_train,i);
        % Simulation test
        T_sim = classRF_predict(P_test,model);
        accuracy(k) = sum(T_sim == T_test) / length(T_test);
    end
    Accuracy(i/50) = mean(accuracy);
end
% Plotting
figure
plot(50:50:1000,Accuracy)
xlabel('Number of decision trees in the random forest')
ylabel('Classification accuracy')
title('Influence of the number of decision trees on random forest performance')
toc
After adding the code, click "Run" to start the simulation. The output is as follows:
Setting to defaults 500 trees and mtry=5
Total number of cases: 569  Benign: 357  Malignant: 212
Total number of training set cases: 500  Benign: 319  Malignant: 181
Total number of test set cases: 69  Benign: 38  Malignant: 31
Benign breast tumors correctly diagnosed: 36  Misdiagnosed: 2  Diagnostic rate p1=94.7368%
Malignant breast tumors correctly diagnosed: 30  Misdiagnosed: 1  Diagnostic rate p2=96.7742%
Elapsed time is 622.343231 seconds.
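The two per-class rates above can be combined into a confusion matrix and a single overall accuracy. The following is a minimal sketch that uses the counts from this particular run (the split is random, so the numbers will differ from run to run). The variables C and overall are illustrative names and are not part of main.m.

```matlab
% Counts taken from the run shown above (random split, so they vary per run).
number_B = 38;  number_B_sim = 36;   % benign test cases / correctly diagnosed
number_M = 31;  number_M_sim = 30;   % malignant test cases / correctly diagnosed

% Confusion matrix: rows = true class, columns = predicted class
% (1 = benign, 2 = malignant).
C = [number_B_sim,            number_B - number_B_sim;
     number_M - number_M_sim, number_M_sim];

% Overall accuracy = correct diagnoses (diagonal) over all test cases.
overall = sum(diag(C)) / sum(C(:));
fprintf('Overall accuracy = %.4f%%\n', overall * 100);   % 66/69, about 95.65%
```

In main.m itself the same matrix could be built from T_test and T_sim instead of typing the counts by hand.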




3. Summary
A random forest is a classifier that uses multiple trees to train on and predict samples. It was first proposed by Leo Breiman and Adele Cutler. In machine learning, a random forest is a classifier containing multiple decision trees, and its output class is the mode of the classes output by the individual trees. Leo Breiman and Adele Cutler developed the algorithm for inducing a random forest, and "Random Forests" is their trademark. The term derives from the random decision forests proposed in 1995 by Tin Kam Ho of Bell Laboratories. The method combines Breiman's "bootstrap aggregating" idea with Ho's "random subspace method" to build a collection of decision trees. There are also random forest simulation examples in the visual machine learning column, which can be reached through the link at the end of the article. If you are interested in this chapter or want to study it thoroughly, I recommend reading Chapter 30 of the book. I will add to some of these points later, based on my own understanding. You are welcome to study and discuss together.
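To make the "mode of the individual trees" idea concrete, here is a minimal sketch in plain MATLAB that does not need the random forest toolbox. The tree_pred matrix is made-up illustrative data, and the sizes n, d and mtry are assumptions chosen to resemble the run above (500 training cases, mtry=5 as reported in the output, and 30 feature columns, which is assumed rather than stated in the article).

```matlab
% Hypothetical predictions: 5 samples (rows) x 7 trees (columns), classes 1 or 2.
tree_pred = [1 1 2 1 1 2 1;
             2 2 2 1 2 2 2;
             1 2 1 1 1 1 2;
             2 2 1 2 2 1 2;
             1 1 1 1 2 1 1];

% Forest output: the most frequent class across trees, computed per row.
forest_pred = mode(tree_pred, 2);

% Vote counts per class, analogous to the votes matrix plotted in main.m.
votes = [sum(tree_pred == 1, 2), sum(tree_pred == 2, 2)];
disp([forest_pred votes])

% Bagging: each tree is trained on rows drawn with replacement.
n = 500;                        % assumed training set size
boot_idx = randi(n, n, 1);      % bootstrap sample of row indices

% Random feature selection: only mtry randomly chosen features are candidates
% (Breiman's random forest redraws this subset at every split).
d = 30;  mtry = 5;              % assumed feature count and subset size
feat_idx = randperm(d, mtry);
```

Each row of votes sums to the number of trees, which is why the points in the vote plot of main.m lie on the line where the two counts add up to 500.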