Using SVM and ANN_MLP from OpenCV's machine learning module

2022-06-10 15:29:00 【HNU_ Liu Yuan】

Introduction

OpenCV is the most widely used computer vision library and bundles many of the functions needed for image processing. I recently used two of its machine learning models: SVM (support vector machine) and ANN_MLP (artificial neural network, multi-layer perceptron). Both are classic models (strictly speaking, only the MLP is a neural network). They need little computing power, are convenient to use, and run fast. Because they ship with OpenCV, no additional libraries have to be installed.

SVM

In machine learning, a support vector machine (SVM, also called a support vector network) is a supervised learning model, with associated learning algorithms, that analyzes data for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, the SVM training algorithm builds a model that assigns new examples to one of the two categories, which makes it a non-probabilistic binary linear classifier.
An SVM model represents examples as points in space, mapped so that the examples of the two categories are separated by a gap that is as wide as possible. New examples are then mapped into the same space and assigned a category according to which side of the gap they fall on.
For low-dimensional, easily separable features a linear kernel is generally used. For complex features, consider a nonlinear kernel, which maps the features into a space where they become linearly separable.
The optimization formula is:
[Figure: SVM optimization formula]
This is a convex optimization problem and is not expanded on here. For details, refer to: SVM
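
The formula image is missing from this copy. As a reference point only, and assuming the original showed the usual soft-margin formulation (the one that the C_SVC type and its C parameter correspond to), the problem can be written as below. Here w and b define the separating hyperplane, the ξ_i are slack variables, y_i ∈ {−1, +1} are the labels, and φ is the feature map induced by the kernel (the identity for a linear kernel):

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad
y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad
\xi_i \ge 0,\qquad i = 1,\dots,n
```

A new sample x is then assigned a category by the sign of w^T φ(x) + b, which is the "side of the gap" described above.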

MLP

A multilayer perceptron (MLP) is a feedforward artificial neural network that maps a set of input vectors to a set of output vectors. An MLP can be viewed as a directed graph made up of several layers of nodes, each layer fully connected to the next. Except for the input nodes, every node is a neuron (or processing unit) with a nonlinear activation function.
In other words, an MLP is a neural network built from linear (fully connected) layers, and the activation function must be nonlinear, for example ReLU or Sigmoid. The first layer of a multilayer perceptron is the input layer, the middle layers are hidden layers, and the last is the output layer.
MLPs have a long history. For simple classification tasks such as handwritten digit recognition they can achieve good results, but they fall short on complex machine vision tasks such as detection and tracking.
[Figure]
For a detailed introduction to MLP, refer to: Multilayer perceptron
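
The layer structure described above can be sketched in a few lines of NumPy. This is only an illustration: the layer sizes, the random weights, and the use of tanh as the nonlinear activation are assumptions made for the example, not OpenCV's implementation.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass through fully connected layers, each followed by tanh."""
    a = x
    for W, b in zip(weights, biases):
        a = np.tanh(a @ W + b)  # linear layer, then nonlinear activation
    return a

rng = np.random.default_rng(0)
layer_sizes = [4, 3, 2]  # input, hidden, output (illustrative sizes)
weights = [rng.normal(size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

x = rng.normal(size=(5, 4))  # 5 samples with 4 features each
out = mlp_forward(x, weights, biases)
print(out.shape)  # (5, 2): one 2-dimensional output per sample
```

Without the nonlinear activation, the stacked layers would collapse into a single linear map, which is why the activation has to be nonlinear.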

Reading the data

SVM and MLP need data in largely the same format. For this kind of supervised learning you provide features and labels. A feature is an observable form of the data: for an image it is the pixel information, for text it can be a word vector. A label is the attribute the feature represents, such as its category.
Taking the commonly used MNIST dataset as an example:
[Figure: an MNIST image of the digit 3]
This is a picture of the digit 3. The digit 3 is the label, and the pixel values of the picture are the features that represent it.
In general, the number of feature vectors equals the number of labels, with each feature vector corresponding to one label, but the feature dimension is usually much higher. Suppose the picture of the digit 3 above is 16x16x1. Using its pixels as the feature vector gives a 256-dimensional feature, while a label dimension of 10 is enough, one for each of the digits 0 to 9.
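
As a small illustration of these dimensions, the sketch below flattens a 16x16 image into a 256-dimensional feature vector and builds a 10-dimensional one-hot label for the digit 3. The all-zero image is a stand-in for real pixel data.

```python
import numpy as np

image = np.zeros((16, 16), dtype=np.uint8)      # stand-in for a 16x16 grayscale digit
feature = image.reshape(-1).astype(np.float32)  # 256-dimensional feature vector

digit = 3
label = np.zeros(10, dtype=np.float32)  # one slot per digit 0..9
label[digit] = 1.0                      # one-hot encoding of the class

print(feature.shape, label.shape)
```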
Training and data reading are convenient in Python. The general data preprocessing code is:

import cv2
import numpy as np
import os
import glob
# Two-class example
positive_dir = ""  # folder containing the class 1 images
negative_dir = ""  # folder containing the class 2 images
# Collect the image paths; os.path.join supplies the path separator if it is missing
positive_dir_imgs = glob.glob(os.path.join(positive_dir, "*.jpg"))
negative_dir_imgs = glob.glob(os.path.join(negative_dir, "*.jpg"))

img_size = 40  # side length every image is resized to
train_mat = []  # feature matrix, one row per image
labels_label = np.zeros((len(positive_dir_imgs) + len(negative_dir_imgs), 2), np.float32)  # label matrix
# Class 1
for positive_dir_img in positive_dir_imgs:
    pso_img = cv2.imread(positive_dir_img, 0)  # read as grayscale
    pso_img = cv2.resize(pso_img, (img_size, img_size))  # resize to a fixed size
    Vect = pso_img.reshape(-1)  # flatten the 2-D image row by row into a 1-D vector
    train_mat.append(Vect)  # add to the feature matrix
print(len(train_mat))  # number of class 1 images
# Class 2
for negative_dir_img in negative_dir_imgs:
    nega_img = cv2.imread(negative_dir_img, 0)
    nega_img = cv2.resize(nega_img, (img_size, img_size))
    Vect = nega_img.reshape(-1)
    train_mat.append(Vect)
print(len(train_mat))  # total number of images
train_mat = np.array(train_mat, dtype=np.float32)
# Labels are one-hot encoded: class 1 is [1, 0], class 2 is [0, 1]
for i in range(len(train_mat)):
    if i < len(positive_dir_imgs):  # the first len(positive_dir_imgs) rows are class 1
        labels_label[i][0] = 1
        labels_label[i][1] = 0
    else:
        labels_label[i][0] = 0
        labels_label[i][1] = 1

This completes the data processing and yields the features (train_mat) and their corresponding labels (labels_label).
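
Because all class 1 rows come before all class 2 rows, the label matrix depends only on the two image counts. The sketch below builds it with slicing, which sidesteps boundary mistakes in an index comparison. The counts n_pos and n_neg are hypothetical, and the 1 / -1 integer convention in the last step is one common choice rather than a requirement.

```python
import numpy as np

n_pos, n_neg = 3, 2  # hypothetical image counts for class 1 and class 2

one_hot = np.zeros((n_pos + n_neg, 2), dtype=np.float32)
one_hot[:n_pos, 0] = 1.0  # class 1 rows -> [1, 0]
one_hot[n_pos:, 1] = 1.0  # class 2 rows -> [0, 1]

# Integer class labels as an int32 column vector: 1 for class 1, -1 for class 2
class_ids = np.where(one_hot[:, 0] == 1.0, 1, -1).astype(np.int32).reshape(-1, 1)

print(one_hot)
print(class_ids.ravel())
```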

SVM training and prediction

For details, refer to the SVM documentation: https://docs.opencv.org/3.4.7/d1/d2d/classcv_1_1ml_1_1SVM.html

Training

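# Create the SVM
svm = cv2.ml.SVM_create()
# SVM type: C-support vector classification
svm.setType(cv2.ml.SVM_C_SVC)
# Linear kernel
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.setC(0.01)
# labels_num is not defined in the preprocessing code above. C_SVC needs integer
# class labels, so one way to obtain it (an assumption) is to convert the one-hot
# labels_label into an int32 column vector: 1 for class 1, -1 for class 2
labels_num = np.where(labels_label[:, 0] == 1, 1, -1).astype(np.int32).reshape(-1, 1)
# Train (samples, layout, labels); ROW_SAMPLE means one sample per row
result = svm.train(train_mat, cv2.ml.ROW_SAMPLE, labels_num)
# Save the trained model
svm.save("svm100.xml")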

Open the saved XML file:
[Figure: contents of the saved SVM XML file]
The XML file contains the SVM parameters. The key information includes the input feature dimension (1600) and the number of output classes (2), followed by the SVM's specific model parameters.

SVM prediction in Python

svm = cv2.ml.SVM_load("svm100.xml")  # load the model saved by the training step
test_mat = train_mat[-10:]  # prediction data is preprocessed exactly like training data; training rows are reused here as an example
print(test_mat)
(P1, P2) = svm.predict(test_mat)  # P1 is the return value, P2 holds one predicted label per row
print(P1, P2)
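
To turn the predicted labels into a single number, they can be compared with the true labels. The sketch below computes the accuracy with NumPy only. The two arrays are hypothetical stand-ins for the second return value of svm.predict (an (N, 1) float32 array) and for the true integer labels.

```python
import numpy as np

def accuracy(predicted, expected):
    """Fraction of samples whose predicted class equals the expected class."""
    predicted = np.asarray(predicted).reshape(-1)
    expected = np.asarray(expected).reshape(-1)
    return float(np.mean(predicted == expected))

# Hypothetical values: predictions shaped like the SVM output, and true labels
predicted = np.array([[1.0], [1.0], [-1.0], [-1.0]], dtype=np.float32)
expected = np.array([[1], [-1], [-1], [-1]], dtype=np.int32)

acc = accuracy(predicted, expected)
print(acc)  # 0.75 (3 of the 4 samples match)
```

Note that measuring accuracy on rows taken from the training set, as the snippet above does for convenience, overestimates how well the model handles unseen images.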

SVM prediction in C++

To be added later...

MLP training and prediction

For details, refer to: MLP documentation

MLP training

ann = cv2.ml.ANN_MLP_create()  # create the MLP
# Layer sizes: the input layer first, the hidden layer in the middle, the output layer last
ann.setLayerSizes(np.array([img_size*img_size, 64, 2]))
ann.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM, 0.6, 1.0)  # symmetric sigmoid activation and its two parameters
ann.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP, 0.1, 0.1)  # train with backpropagation
ann.setTermCriteria((cv2.TERM_CRITERIA_MAX_ITER | cv2.TERM_CRITERIA_EPS, 1000, 0.01))  # stop after 1000 iterations or once the change falls below 0.01
ann.train(train_mat, cv2.ml.ROW_SAMPLE, labels_label)  # ROW_SAMPLE (value 0): one sample per row
ann.save("MLP")

Open the saved MLP file:
[Figure: contents of the saved MLP file]
As with the SVM, it contains the dimensions of all layers and the parameters of each neuron.

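MLP prediction in Python

ann = cv2.ml.ANN_MLP_load("MLP")  # load the saved MLP model parameters
test_mat = train_mat[-10:]  # data to predict on (training rows reused as an example)
(P1, P2) = ann.predict(test_mat)  # P1 is the return value, P2 holds one row of network outputs per sample
print(P1, P2)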
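
The network outputs one value per output neuron, so a class still has to be picked from each row. A minimal sketch, assuming the one-hot convention used in training (class 1 is [1, 0], class 2 is [0, 1]) and using hypothetical output values in place of the second return value of ann.predict:

```python
import numpy as np

# Hypothetical network outputs, one row per sample and one column per class
outputs = np.array([[0.9, -0.1],
                    [0.2, 0.7],
                    [0.6, 0.4]], dtype=np.float32)

class_index = np.argmax(outputs, axis=1)  # 0 -> class 1, 1 -> class 2
class_names = ["class 1" if k == 0 else "class 2" for k in class_index]
print(class_names)  # ['class 1', 'class 2', 'class 1']
```

This is the same comparison the C++ version performs with p[0] > p[1] for a single sample.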

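MLP prediction in C++

#include <iostream>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>

using namespace std;
#define RESIZE 40  // must match img_size used for training
int main()
{
	std::string xml_path = "MLP";
	cv::Ptr<cv::ml::ANN_MLP> ann = cv::ml::ANN_MLP::load(xml_path);
	std::string img_path = "/";  // folder containing the test images
	cv::Mat frame, frame_resize;
	std::vector<cv::String> img_paths;
	// The original called a helper getFiles(img_path, img_paths) that is not shown;
	// cv::glob is used here instead, assuming the test images are .jpg files.
	cv::glob(img_path + "*.jpg", img_paths);
	cv::Mat responseMat;
	for (size_t i = 0; i < img_paths.size(); i++)
	{
		frame = cv::imread(img_paths[i], 0);  // read as grayscale
		if (frame.empty())
			continue;  // skip files that cannot be read
		cv::resize(frame, frame_resize, cv::Size(RESIZE, RESIZE));
		cv::Mat testMat = frame_resize.clone().reshape(1, 1);  // flatten to a single row
		testMat.convertTo(testMat, CV_32F);
		ann->predict(testMat, responseMat);
		float* p = responseMat.ptr<float>(0);
		cout << img_paths[i] << " ";
		if (p[0] > p[1])  // first output is larger: the class trained as [1, 0]
		{
			cout << "predict label 1" << endl;
		}
		else  // second output is larger: the class trained as [0, 1]
		{
			cout << "predict label 0" << endl;
		}
	}
	return 0;
}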

Code summary

SVM

import cv2
import numpy as np
import os
import glob

positive_dir = "./crop_plane/"
negative_dir = "./random_crop/"

positive_dir_imgs = glob.glob(positive_dir + "*.jpg")
negative_dir_imgs = glob.glob(negative_dir + "*.jpg")

img_size = 100
train_mat=[]
labels_label = np.zeros((len(positive_dir_imgs) + len(negative_dir_imgs), 1), np.int32)
for positive_dir_img in positive_dir_imgs:
    pso_img = cv2.imread(positive_dir_img, 0)  # read as grayscale
    pso_img = cv2.resize(pso_img, (img_size, img_size))
    Vect = pso_img.reshape(-1)  # flatten the image row by row
    train_mat.append(Vect)
print(len(train_mat))  # number of class 1 images
for negative_dir_img in negative_dir_imgs:
    nega_img = cv2.imread(negative_dir_img, 0)
    nega_img = cv2.resize(nega_img, (img_size, img_size))
    Vect = nega_img.reshape(-1)
    train_mat.append(Vect)
print(len(train_mat))  # total number of images
# Convert to an array only after both classes have been appended
train_mat = np.array(train_mat, dtype=np.float32)
for i in range(len(train_mat)):
    if i < len(positive_dir_imgs):
        labels_label[i][0] = 1  # class 1
    else:
        labels_label[i][0] = -1  # class 2

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)  # SVM type
svm.setKernel(cv2.ml.SVM_LINEAR)  # linear kernel
svm.setC(0.01)

# Train (samples, layout, labels)
result = svm.train(train_mat, cv2.ml.ROW_SAMPLE, labels_label)
svm.save("svm100.xml")
test_mat = train_mat[-10:]  # reuse the last training rows as test samples
print(test_mat)
(P1, P2) = svm.predict(test_mat)  # P2 holds the predicted labels
print(P1, P2)

MLP

import cv2
import numpy as np
import os
import glob

positive_dir = "./crop_plane/"
negative_dir = "./random_crop/"

positive_dir_imgs = glob.glob(positive_dir + "*.jpg")
negative_dir_imgs = glob.glob(negative_dir + "*.jpg")

img_size = 40
train_mat=[]
labels_label = np.zeros((len(positive_dir_imgs) + len(negative_dir_imgs), 2), np.float32)
for positive_dir_img in positive_dir_imgs:
    pso_img = cv2.imread(positive_dir_img, 0)  # read as grayscale
    pso_img = cv2.resize(pso_img, (img_size, img_size))
    Vect = pso_img.reshape(-1)  # flatten the image row by row
    train_mat.append(Vect)
print(len(train_mat))  # number of class 1 images
for negative_dir_img in negative_dir_imgs:
    nega_img = cv2.imread(negative_dir_img, 0)
    nega_img = cv2.resize(nega_img, (img_size, img_size))
    Vect = nega_img.reshape(-1)
    train_mat.append(Vect)
print(len(train_mat))  # total number of images
train_mat = np.array(train_mat, dtype=np.float32)
for i in range(len(train_mat)):
    if i < len(positive_dir_imgs):
        labels_label[i][0] = 1  # class 1 -> [1, 0]
        labels_label[i][1] = 0
    else:
        labels_label[i][0] = 0  # class 2 -> [0, 1]
        labels_label[i][1] = 1

ann = cv2.ml.ANN_MLP_create()
ann.setLayerSizes(np.array([img_size*img_size, 64, 2]))
ann.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM, 0.6, 1.0)
ann.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP, 0.1, 0.1)
ann.setTermCriteria((cv2.TERM_CRITERIA_MAX_ITER | cv2.TERM_CRITERIA_EPS, 1000, 0.01))

print(train_mat.shape)
print(labels_label.shape)
ann.train(train_mat, cv2.ml.ROW_SAMPLE, labels_label)  # ROW_SAMPLE (value 0): one sample per row
ann.save("MLP")
test_mat = train_mat[-10:]  # reuse the last training rows as test samples
print(test_mat)
(P1, P2) = ann.predict(test_mat)  # P2 holds the network outputs
print(P1, P2)

Original site

Copyright notice
This article was written by [HNU_ Liu Yuan]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/161/202206101508049367.html