Statistical learning method (2/22) perceptron
2022-06-29 01:11:00 【Xiaoshuai acridine】
The perceptron is a linear model for binary classification. Its input is the feature vector of an instance and its output is the instance's class, taking the two values +1 and -1. The perceptron corresponds to a separating hyperplane that divides the instances in the input space (the feature space) into positive and negative classes; it is a discriminative model. Perceptron learning aims to find the separating hyperplane that linearly separates the training data. To this end, a loss function based on misclassification is introduced and minimized by gradient descent, yielding the perceptron model.
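The decision function and the misclassification-driven update described above can be sketched as follows (a minimal illustration, not the author's code; `predict` and `sgd_step` are made-up names):

```python
import numpy as np

def predict(w, b, x):
    """Perceptron decision function: sign(w . x + b), treating 0 as -1."""
    return 1 if np.dot(w, x) + b > 0 else -1

def sgd_step(w, b, x, y, eta=1.0):
    """If (x, y) is misclassified, i.e. y * (w . x + b) <= 0, take one
    stochastic gradient step: w <- w + eta * y * x, b <- b + eta * y."""
    if y * (np.dot(w, x) + b) <= 0:
        w = w + eta * y * np.asarray(x, dtype=float)
        b = b + eta * y
    return w, b

# one update on a point that is misclassified under the zero initialization
w, b = np.zeros(2), 0.0
w, b = sgd_step(w, b, x=[3.0, 3.0], y=1)
print(w, b)
```

After the step the point lies on the positive side of the updated hyperplane.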
DeepShare (深度之眼) course link: https://ai.deepshare.net/detail/p_619b93d0e4b07ededa9fcca0/5
Code link :
https://github.com/zxs-000202/Statistical-Learning-Methods





The perceptron is a linear model; it handles linearly separable data sets.


For linearly separable data, training eventually leaves no misclassified points.
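That convergence claim can be checked with a small self-contained run (not the author's code) on the linearly separable data of example 2.1 in Li Hang's book; training stops once a full pass over the data produces no misclassification:

```python
import numpy as np

# Toy linearly separable data from example 2.1 of the book:
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])

w, b, eta = np.zeros(2), 0.0, 1.0
for _ in range(100):                       # safety cap on the number of passes
    mistakes = 0
    for xi, yi in zip(X, y):
        if yi * (np.dot(w, xi) + b) <= 0:  # misclassified (or on the hyperplane)
            w, b = w + eta * yi * xi, b + eta * yi
            mistakes += 1
    if mistakes == 0:                      # converged: a clean pass, stop
        break

errors = sum(yi * (np.dot(w, xi) + b) <= 0 for xi, yi in zip(X, y))
print('misclassified points after training:', errors)   # -> 0
```

With this visiting order the run ends at w = (1, 1), b = -3, one of the solutions given in the book; a different visiting order can end at a different separating hyperplane.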






The parameters are initialized to 0.

import numpy as np
import time
from tqdm import tqdm


def loadData(fileName):
    '''
    Load the Mnist data set
    :param fileName: path of the data set to load
    :return: data and labels as lists
    '''
    print('start to read data')
    # lists holding the data and the labels
    dataArr = []; labelArr = []
    # open the file
    fr = open(fileName, 'r')
    # read the file line by line
    for line in tqdm(fr.readlines()):
        # split each row on ',' to get the list of fields
        curLine = line.strip().split(',')
        # Mnist has labels 0-9; since this is a binary classification task,
        # labels >= 5 are mapped to 1 and labels < 5 to -1
        if int(curLine[0]) >= 5:
            labelArr.append(1)
        else:
            labelArr.append(-1)
        # store the sample
        # [int(num) for num in curLine[1:]] -> convert every element except the first (the label) to int
        # [int(num)/255 for num in curLine[1:]] -> divide all pixel values by 255 to normalize (optional)
        dataArr.append([int(num)/255 for num in curLine[1:]])
    # return the data and labels
    return dataArr, labelArr
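The binarization and scaling that loadData applies to each CSV row can be checked in isolation; the sample row below is made up for illustration:

```python
# One hypothetical CSV row: label 7 followed by three pixel values.
curLine = "7,0,128,255".strip().split(',')
label = 1 if int(curLine[0]) >= 5 else -1         # digits 5-9 -> +1, digits 0-4 -> -1
pixels = [int(num) / 255 for num in curLine[1:]]  # scale pixels into [0, 1]
print(label, pixels)
```

Digit 7 maps to the positive class, and the pixel values 0, 128, 255 land in [0, 1].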
def perceptron(dataArr, labelArr, iter=50):
    '''
    Perceptron training procedure
    :param dataArr: training data (list)
    :param labelArr: training labels (list)
    :param iter: number of iterations, default 50
    :return: trained w and b
    '''
    print('start to trans')
    # Convert the data to matrix form (machine learning usually operates on
    # vectors; matrix form makes the computation convenient).
    # Each sample becomes a row vector of the matrix.
    dataMat = np.mat(dataArr)
    # Convert the labels to a matrix and transpose it (.T is the transpose).
    # The transpose is needed because the computation reads one label at a
    # time; with a 1xN matrix, label[i] would not index a single element.
    # A plain 1xN label list could be indexed with label[i] directly; the
    # conversion here just keeps the formats uniform.
    labelMat = np.mat(labelArr).T
    # get the size m*n of the data matrix
    m, n = np.shape(dataMat)
    # Create the initial weights w, all zeros.
    # np.shape(dataMat)[1] is n, matching the length of a sample.
    w = np.zeros((1, np.shape(dataMat)[1]))
    # initialize the bias b to 0
    b = 0
    # step size (learning rate), controlling the gradient descent rate
    h = 0.0001
    # run iter iterations
    for k in range(iter):
        # Perform a gradient step per sample.
        # At the start of section 2.3.1 of Li Hang's book, gradient descent
        # sums over all samples before taking a single step. In the second
        # half of 2.3.1 (e.g. formulas 2.6 and 2.7) the summation sign is
        # gone: that is stochastic gradient descent, which takes a step
        # after each sample. The two differ, but stochastic gradient
        # descent is the common choice.
        for i in range(m):
            # vector of the current sample
            xi = dataMat[i]
            # label of the current sample
            yi = labelMat[i]
            # Check whether the sample is misclassified.
            # The condition for a misclassified sample is -yi(w*xi+b) >= 0;
            # see section 2.2.2 for details.
            # The book writes > 0; but if the value equals 0, the point lies
            # on the hyperplane, which is also not correct.
            if -1 * yi * (w * xi.T + b) >= 0:
                # for a misclassified sample, take a gradient step to update w and b
                w = w + h * yi * xi
                b = b + h * yi
        # print training progress
        print('Round %d:%d training' % (k, iter))
    # return the trained w and b
    return w, b
def model_test(dataArr, labelArr, w, b):
    '''
    Test the accuracy
    :param dataArr: test data
    :param labelArr: test labels
    :param w: trained weights w
    :param b: trained bias b
    :return: accuracy
    '''
    print('start to test')
    # convert the data set to matrix form for convenient computation
    dataMat = np.mat(dataArr)
    # convert the labels to a matrix and transpose;
    # see the explanation in perceptron above
    labelMat = np.mat(labelArr).T
    # get the size of the test data matrix
    m, n = np.shape(dataMat)
    # misclassified-sample counter
    errorCnt = 0
    # iterate over all test samples
    for i in range(m):
        # get one sample vector
        xi = dataMat[i]
        # get its label
        yi = labelMat[i]
        # compute the result
        result = -1 * yi * (w * xi.T + b)
        # if -yi(w*xi+b) >= 0, the sample is misclassified; increment the counter
        if result >= 0: errorCnt += 1
    # accuracy = 1 - (number of misclassified samples / total number of samples)
    accruRate = 1 - (errorCnt / m)
    # return the accuracy
    return accruRate
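For reference, the per-sample loop in model_test can be cross-checked against a vectorized NumPy version (a sketch; `accuracy` is a made-up helper, not from the repo):

```python
import numpy as np

def accuracy(dataArr, labelArr, w, b):
    """Vectorized equivalent of model_test: classify every sample at once.
    w is a 1 x n array, b a scalar."""
    X = np.asarray(dataArr, dtype=float)           # m x n data matrix
    y = np.asarray(labelArr, dtype=float)          # m labels in {+1, -1}
    margins = y * (X @ np.asarray(w).ravel() + b)  # yi * (w . xi + b) per sample
    # model_test counts -yi(w*xi+b) >= 0 as an error, i.e. margin <= 0,
    # so the accuracy is the fraction of strictly positive margins.
    return float(np.mean(margins > 0))

# toy check with hand-picked w, b (hypothetical values, not trained ones)
w, b = np.array([[1.0, 1.0]]), -3.0
X = [[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]]
y = [1, 1, -1]
print(accuracy(X, y, w, b))   # -> 1.0
```

The matrix product replaces the Python-level loop, which is noticeably faster on the 10,000-sample Mnist test set.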
if __name__ == '__main__':
    # Record the current time; another timestamp is taken at the end,
    # and the difference is the program's running time.
    start = time.time()
    # load the training data and labels
    trainData, trainLabel = loadData('../Mnist/mnist_train.csv')
    # load the test data and labels
    testData, testLabel = loadData('../Mnist/mnist_test.csv')
    # train to obtain the weights
    w, b = perceptron(trainData, trainLabel, iter=30)
    # test and obtain the accuracy
    accruRate = model_test(testData, testLabel, w, b)
    # record the current time as the end time
    end = time.time()
    # display the accuracy
    print('accuracy rate is:', accruRate)
    # display the elapsed time
    print('time span:', end - start)