"MATLAB Neural Networks: 43 Case Studies", Chapter 30: Design of a Combined Classifier Based on the Random Forest Idea - Breast Cancer Diagnosis
2022-07-01 08:35:00 【mozun2020】
1. Preface
"MATLAB Neural Networks: 43 Case Studies" is a MATLAB example-based textbook planned by the MATLAB Technology Forum (www.matlabsky.com), led by Wang Xiaochuan, and published in 2013 by Beijing University of Aeronautics and Astronautics Press. It revises and extends "MATLAB Neural Networks: 30 Case Studies" and keeps that book's "theory, case analysis, application extension" structure to help readers learn neural networks more intuitively and vividly.
The book has 43 chapters covering common neural networks (BP, RBF, SOM, Hopfield, Elman, LVQ, Kohonen, GRNN, NARX, etc.) and related intelligent algorithms (SVM, decision trees, random forests, extreme learning machines, etc.). Some chapters also combine common optimization algorithms (genetic algorithms, ant colony algorithms, etc.) with neural networks. In addition, the book introduces the new functions and features of the Neural Network Toolbox in MATLAB R2012b, such as parallel computing for neural networks, custom neural networks, and efficient neural network programming.
In recent years, with the rise of artificial intelligence research, neural networks have seen another surge of research interest. Because of their strong performance in signal processing, neural network methods are also being applied widely in speech and image applications. In this series I reproduce the cases from the book in simulation. It is a way of relearning: I hope to review the old, learn something new, and strengthen my understanding and practice of neural network applications in various fields. I happened to pick this book up in my spare time, so let's start on the simulation examples. The focus is on running the source code examples of each chapter. The simulations are run on MATLAB 2015b (32-bit). This post covers the example from Chapter 30 of the book, the design of a combined classifier based on the random forest idea. Without further ado, let's begin!
2. MATLAB Simulation example
Open MATLAB, click "Home", click "Open", and locate the sample file.
Select main.m and click "Open".
The source code of main.m is as follows:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Function: Design of a combined classifier based on the random forest idea
% Environment: Win7, Matlab2015b
% Modi: C.S
% Time: 2022-06-20
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Design of combined classifier based on random forest idea
%% Clear environment variables
clear all
clc
warning off
tic
%% Import data
load data.mat
% Randomly split the data into a training set and a test set
a = randperm(569);
Train = data(a(1:500),:);
Test = data(a(501:end),:);
% Training data
P_train = Train(:,3:end);
T_train = Train(:,2);
% Test data
P_test = Test(:,3:end);
T_test = Test(:,2);
%% Create a random forest classifier
model = classRF_train(P_train,T_train);
%% Simulation test
[T_sim,votes] = classRF_predict(P_test,model);
%% Result analysis
count_B = length(find(T_train == 1));
count_M = length(find(T_train == 2));
total_B = length(find(data(:,2) == 1));
total_M = length(find(data(:,2) == 2));
number_B = length(find(T_test == 1));
number_M = length(find(T_test == 2));
number_B_sim = length(find(T_sim == 1 & T_test == 1));
number_M_sim = length(find(T_sim == 2 & T_test == 2));
disp([' Total number of cases :' num2str(569)...
' Benign :' num2str(total_B)...
' Malignant :' num2str(total_M)]);
disp([' Total number of training set cases :' num2str(500)...
' Benign :' num2str(count_B)...
' Malignant :' num2str(count_M)]);
disp([' Total number of cases in the test set :' num2str(69)...
' Benign :' num2str(number_B)...
' Malignant :' num2str(number_M)]);
disp([' Benign breast tumor was diagnosed :' num2str(number_B_sim)...
' Misdiagnosis :' num2str(number_B - number_B_sim)...
' Diagnostic rate p1=' num2str(number_B_sim/number_B*100) '%']);
disp([' Malignant breast tumor was diagnosed :' num2str(number_M_sim)...
' Misdiagnosis :' num2str(number_M - number_M_sim)...
' Diagnostic rate p2=' num2str(number_M_sim/number_M*100) '%']);
%% Plotting
figure
% Misclassified samples (red stars)
index = find(T_sim ~= T_test);
plot(votes(index,1),votes(index,2),'r*')
hold on
% Correctly classified samples (blue circles)
index = find(T_sim == T_test);
plot(votes(index,1),votes(index,2),'bo')
legend('Misclassified samples','Correctly classified samples')
% Reference lines and box in the vote plane
plot(0:500,500:-1:0,'r-.')
plot(0:500,0:500,'r-.')
line([100 400 400 100 100],[100 100 400 400 100])
xlabel('Number of decision trees voting for class 1')
ylabel('Number of decision trees voting for class 2')
title('Performance analysis of the random forest classifier')
%% Influence of the number of decision trees on random forest performance
Accuracy = zeros(1,20);
for i = 50:50:1000
    %i
    % For each setting, run 100 times and take the average
    accuracy = zeros(1,100);
    for k = 1:100
        % Create the random forest
        model = classRF_train(P_train,T_train,i);
        % Simulation test
        T_sim = classRF_predict(P_test,model);
        accuracy(k) = length(find(T_sim == T_test)) / length(T_test);
    end
    Accuracy(i/50) = mean(accuracy);
end
% Plotting
figure
plot(50:50:1000,Accuracy)
xlabel('Number of decision trees in the random forest')
ylabel('Classification accuracy')
title('Influence of the number of decision trees on random forest performance')
toc
Once the code is in place, click "Run" to start the simulation. The output is as follows:
Setting to defaults 500 trees and mtry=5
Total number of cases :569 Benign :357 Malignant :212
Total number of training set cases :500 Benign :319 Malignant :181
Total number of cases in the test set :69 Benign :38 Malignant :31
Benign breast tumor was diagnosed :36 Misdiagnosis :2 Diagnostic rate p1=94.7368%
Malignant breast tumor was diagnosed :30 Misdiagnosis :1 Diagnostic rate p2=96.7742%
Elapsed time is 622.343231 seconds.
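The diagnostic rates in the log follow directly from the printed counts. As a quick check, the sketch below recomputes them from the numbers above and adds the overall test accuracy, which main.m does not print. The variable names are illustrative only. The last lines show one plausible origin of mtry=5, assuming the classifier uses the common default of floor(sqrt(number of features)) and that data.mat has 30 feature columns; neither is stated in the log.

```matlab
% Counts copied from the run log above
n_B = 38;  n_M = 31;       % benign / malignant cases in the test set
hit_B = 36; hit_M = 30;    % correctly diagnosed benign / malignant cases

p1 = hit_B / n_B * 100;                          % benign diagnostic rate
p2 = hit_M / n_M * 100;                          % malignant diagnostic rate
overall = (hit_B + hit_M) / (n_B + n_M) * 100;   % overall test accuracy

fprintf('p1 = %.4f%%, p2 = %.4f%%, overall = %.4f%%\n', p1, p2, overall);

% Assumption: 30 feature columns and a default of floor(sqrt(n_features))
n_features = 30;
mtry = floor(sqrt(n_features));
fprintf('mtry = %d\n', mtry);
```

This reproduces p1 = 94.7368% and p2 = 96.7742% from the log, and gives an overall accuracy of 95.6522% (66 of 69 test cases) for this particular random split. Because the split comes from randperm without a fixed seed, a rerun will generally print different counts.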




3. Summary
A random forest is a classifier that trains multiple decision trees and uses them together to predict samples. In machine learning terms, it is an ensemble of decision trees whose output class is the mode of the classes output by the individual trees. Leo Breiman and Adele Cutler developed the random forest algorithm, and "Random Forests" is their trademark. The term derives from the random decision forests proposed in 1995 by Tin Kam Ho of Bell Laboratories. The method combines Breiman's "bootstrap aggregating" idea with Ho's "random subspace method" to build a collection of decision trees. My visual machine learning column also has a random forest simulation example, which can be reached through the link at the end of the article. If you are interested in this chapter or want to study it thoroughly, I recommend reading Chapter 30 of the book. I will add notes on some of these points later, based on my own understanding. You are welcome to study and discuss together.
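The two ingredients named in the summary, bootstrap aggregating and the random subspace method, together with the majority vote, can be sketched in a few lines of base MATLAB. This is only an illustration of the idea, not the implementation behind classRF_train: the tree training step is left out, the 30 feature columns are an assumption, and the prediction matrix preds is made-up toy data.

```matlab
rng(0);                            % fix the seed so the sketch is repeatable
n_samples  = 500;                  % training set size used in main.m
n_features = 30;                   % assumed number of feature columns
n_sub = floor(sqrt(n_features));   % size of each random feature subset

% Bootstrap aggregating: each tree trains on a sample drawn with replacement
boot_idx = randi(n_samples, n_samples, 1);

% Random subspace: a random subset of the features is considered
% (Breiman's forests redraw such a subset at every split of every tree)
feat_idx = randperm(n_features, n_sub);

% Toy predictions of 5 trees (columns) for 4 test samples (rows), labels 1 or 2
preds = [1 1 2 1 1;
         2 2 2 1 2;
         1 2 1 1 2;
         2 2 1 2 2];

% Count the trees voting for each class, then take the mode as the final label
votes = [sum(preds == 1, 2), sum(preds == 2, 2)];
T_sim = mode(preds, 2);

disp(votes)
disp(T_sim')
```

Here votes is analogous to the second output of classRF_predict that main.m plots: each row holds the number of trees voting for class 1 and class 2, and the predicted label is the class with more votes.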