Introduction and basic knowledge of machine learning
2022-07-01 02:55:00 【zhang. yao】
1. Introduction to machine learning
Machine learning: a field of study that gives computers the ability to learn without being explicitly programmed for the problem at hand.
For a class of tasks T and a performance measure P, if a computer program's performance on T, as measured by P, improves with experience E, we say that the program learns from experience E.
2. Common algorithms
2.1 Supervised algorithms
The sample data carries result labels
Classification
By underlying principle:
- Statistics-based: Bayesian classification
- Rule-based: decision tree algorithm
- Neural-network-based: neural network algorithm
- Distance-based: KNN (k-nearest neighbors)
Common evaluation metrics:
- Accuracy: the proportion of predictions that agree with the actual results
- Recall: the proportion of the actual samples of a given class that the predictions correctly cover
- F1-Score: a statistic for the overall evaluation of a classification model (the harmonic mean of precision and recall); its value lies between 0 and 1
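The three metrics above can be sketched in a few lines of plain Python. This is only an illustration: the function names, the choice of `1` as the positive class, and the sample labels are assumptions made for the example, not part of the original notes.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Share of predictions that match the actual labels."""
    correct = sum(1 for a, p in zip(y_true, y_pred) if a == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted as positive."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if tp + fn else 0.0


def precision(y_true, y_pred, positive=1):
    """Share of predicted positives that are actually positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if tp + fp else 0.0


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall, always between 0 and 1."""
    p = precision(y_true, y_pred, positive)
    r = recall(y_true, y_pred, positive)
    return 2 * p * r / (p + r) if p + r else 0.0


# Hypothetical labels, made up for the example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

print("accuracy:", accuracy(y_true, y_pred))
print("recall:  ", recall(y_true, y_pred))
print("F1:      ", f1_score(y_true, y_pred))
```

Precision is included only because F1 is defined in terms of it; in practice a library such as scikit-learn would normally be used instead of hand-written counters.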
Regression algorithm
2.1.1 KNN Algorithm
k-Nearest Neighbour is one of the simplest classification algorithms. If most of the k samples nearest to a given sample belong to a certain category, the sample is considered to belong to that category as well and to share the characteristics of the samples in it. KNN can be used not only for classification but also for regression analysis (predicting specific values).
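A minimal sketch of the idea, assuming Euclidean distance, majority voting for classification, and the mean of the neighbours' values for regression. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def k_nearest(train_points, query, k):
    """Return the k (features, target) samples closest to `query`."""
    by_distance = sorted(train_points, key=lambda s: math.dist(s[0], query))
    return by_distance[:k]


def knn_classify(train_points, query, k=3):
    """Predict the majority label among the k nearest neighbours."""
    labels = [label for _, label in k_nearest(train_points, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train_points, query, k=3):
    """Predict the mean target value of the k nearest neighbours."""
    targets = [target for _, target in k_nearest(train_points, query, k)]
    return sum(targets) / len(targets)


# Hypothetical labelled samples: two well-separated groups in the plane.
labelled = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((4.0, 4.2), "B"), ((4.1, 3.9), "B"), ((3.8, 4.0), "B"),
]
print(knn_classify(labelled, (1.1, 1.0)))
print(knn_classify(labelled, (4.0, 4.0)))

# Hypothetical numeric targets for the regression case.
valued = [((1.0,), 2.0), ((2.0,), 4.0), ((3.0,), 6.0), ((10.0,), 20.0)]
print(knn_regress(valued, (2.1,), k=3))
```

Sorting the whole training set for every query keeps the sketch short; real implementations usually rely on spatial indexes such as k-d trees when the dataset is large.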

2.1.2 Decision tree algorithm






2.2 Unsupervised algorithms
The sample data carries no result labels
2.2.1 Clustering
- Hierarchical clustering
- Density clustering
- Partition clustering


2.3 Semi-supervised algorithms
Only part of the sample data carries result labels

3. Machine learning algorithms in detail
3.1 Machine learning basics
3.1.1 Basic concepts of machine learning
- Input space: the set of all possible input values
- Output space: the set of all possible output values
- Feature: an attribute of a sample
- Feature vector: a vector composed of multiple features
- Feature space: the space in which the feature vectors live
- Hypothesis space: the set of mappings from the input space to the output space
3.1.2 The essence of machine learning
3.1.3 Three elements of machine learning methods
Method = Model + Strategy + Algorithm
- Model: a mapping from the input space to the output space; choosing a model means choosing a suitable hypothesis space
- Strategy: the learning criterion or rule used to select the optimal model from the many candidates in the hypothesis space

- Loss function: measures the difference between the predicted result and the real result. The smaller its value, the closer the prediction is to the real result. It is usually a non-negative real-valued function, and the process of reducing it by various means is called optimization. The loss function is written L(Y,f(x))
- 0-1 loss function: if the predicted value equals the actual value the loss is 0; otherwise the loss is 1
- Absolute loss function: the absolute value of the difference between the predicted result and the real result
- Square loss function: the square of the difference between the predicted result and the real result
- Logarithmic loss function: the logarithm is monotonic, so optimizing it gives a result consistent with the original objective, and it converts multiplication into addition
- Exponential loss function: has the useful properties of monotonicity and non-negativity, so the closer the prediction is to the correct result, the smaller the loss
- Hinge loss function
- Empirical risk & Risk function
- Structural risk
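The loss functions listed above can be written out as small Python functions. This is a sketch under stated assumptions: the notes give no formulas, so the standard textbook forms are used; the logarithmic loss is shown for a binary label in {0, 1} with a predicted probability; the exponential and hinge losses assume labels in {-1, +1} and a real-valued score f(x); empirical risk is taken as the average loss over the samples; and the squared-weight penalty in `structural_risk` (with the hypothetical parameter `lam`) is just one common choice of complexity term.

```python
import math


def zero_one_loss(y, y_hat):
    """0 if the prediction equals the actual value, 1 otherwise."""
    return 0.0 if y == y_hat else 1.0


def absolute_loss(y, y_hat):
    """Absolute value of the difference."""
    return abs(y - y_hat)


def squared_loss(y, y_hat):
    """Square of the difference."""
    return (y - y_hat) ** 2


def log_loss(y, p):
    """Negative log-likelihood for y in {0, 1} and predicted P(y = 1) = p."""
    eps = 1e-12                      # keep log() away from 0
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def exponential_loss(y, score):
    """exp(-y * f(x)) for y in {-1, +1}."""
    return math.exp(-y * score)


def hinge_loss(y, score):
    """max(0, 1 - y * f(x)) for y in {-1, +1}."""
    return max(0.0, 1.0 - y * score)


def empirical_risk(loss, ys, predictions):
    """Average loss over the given samples."""
    return sum(loss(y, p) for y, p in zip(ys, predictions)) / len(ys)


def structural_risk(loss, ys, predictions, weights, lam=0.1):
    """Empirical risk plus a complexity penalty (assumed: lam * sum of w^2)."""
    return empirical_risk(loss, ys, predictions) + lam * sum(w * w for w in weights)


print(empirical_risk(squared_loss, [1, 2, 3], [1, 2, 5]))
print(hinge_loss(-1, 0.5), exponential_loss(1, 0.0))
```

Passing the loss as a parameter mirrors the "strategy" element of the method: the same model can be judged under different criteria simply by swapping the loss function.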
3.2 Model evaluation and selection
3.2.1 Principles of model selection
3.2.1.1 Basic concepts
- Error: the difference between the model's predicted output and the real value
- Training: the process of obtaining a model by learning from known sample data
- Training error: the error of the model on the training set
- Generalization: going from the specific to the general; in machine learning, it refers to applying the model to new sample data
- Generalization error: the error of the model on new samples
- Model capacity: the model's ability to fit a variety of functions
- Overfitting: the model performs well on the training samples but poorly on new samples
- Underfitting: the model performs poorly even on the training set
- Model selection: choose the model with the smallest generalization error
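To make training error, generalization error, underfitting and overfitting concrete, the sketch below compares three models of increasing capacity on synthetic data. Everything here is an assumption made for illustration: the data (y = 2x plus Gaussian noise), the model names, and the use of mean squared error as the error measure.

```python
import random


def mse(model, samples):
    """Mean squared error of `model` (a function x -> prediction)."""
    return sum((model(x) - y) ** 2 for x, y in samples) / len(samples)


def fit_constant(train):
    """Low capacity: always predict the mean training target."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Least-squares straight line y = a * x + b."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_memorizer(train):
    """High capacity: return the target of the nearest training point."""
    return lambda x: min(train, key=lambda s: abs(s[0] - x))[1]


def make_samples(n, rng):
    """Hypothetical data: y = 2x plus Gaussian noise."""
    samples = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        samples.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))
    return samples


rng = random.Random(0)
train = make_samples(30, rng)    # known samples used for training
new = make_samples(200, rng)     # unseen samples, standing in for "new data"

errors = {}
for name, fit in [("constant", fit_constant),
                  ("line", fit_line),
                  ("memorizer", fit_memorizer)]:
    model = fit(train)
    errors[name] = (mse(model, train), mse(model, new))
    print(f"{name:10s} training error {errors[name][0]:8.3f}   "
          f"error on new samples {errors[name][1]:8.3f}")
```

The constant model has a large error even on the training set (underfitting). The memorizer reproduces every training target exactly, so its training error is zero, yet it still makes errors on new samples (the overfitting pattern). Model selection in the sense above means preferring whichever model has the smallest error on new samples, not on the training set.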
3.2.2 Performance index of the model
3.2.3 The method of model evaluation
- Hold-out method: use 80% of the known dataset as the training set to train the model and the remaining 20% as the test set to test the trained model. The test error obtained on the test set serves as an approximation of the generalization error; choose the model with the smaller test error
- The test set and the training set should be mutually exclusive as far as possible
- The test set and the training set should be independent and identically distributed
- Cross-validation: divide the dataset into k mutually exclusive subsets, obtained by stratified sampling. Each time, select one subset as the test set and use the rest as the training set; run k rounds of training and testing and take the average. This is called k-fold cross-validation. Repeating it p times with different partitions is called p-times k-fold cross-validation
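Both evaluation methods can be sketched in plain Python. The helper names, the `(features, label)` sample layout, the seeds, and the majority-class "model" used for scoring are all assumptions made for the example; any function that trains on one list and returns a score on another could be plugged in instead.

```python
import random
from collections import defaultdict


def holdout_split(samples, test_ratio=0.2, seed=0):
    """Shuffle, then split into mutually exclusive training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def stratified_k_folds(samples, k, seed=0):
    """Split (features, label) samples into k mutually exclusive folds
    while keeping the class proportions roughly equal in every fold."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    position = 0
    for group in by_label.values():
        rng.shuffle(group)
        for sample in group:          # deal samples out round-robin
            folds[position % k].append(sample)
            position += 1
    return folds


def cross_validate(samples, k, train_and_score, seed=0):
    """k rounds; each fold is the test set once. Returns the average score."""
    folds = stratified_k_folds(samples, k, seed)
    scores = []
    for i, test_fold in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(train_and_score(train, test_fold))
    return sum(scores) / k


def repeated_cross_validate(samples, k, p, train_and_score):
    """p-times k-fold: repeat with p different partitions and average."""
    return sum(cross_validate(samples, k, train_and_score, seed=r)
               for r in range(p)) / p


def majority_class_accuracy(train, test):
    """Hypothetical stand-in model: predict the most common training label."""
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    return sum(1 for _, label in test if label == majority) / len(test)


# Hypothetical dataset: 30 samples, one third labelled "pos".
data = [((i,), "pos" if i % 3 == 0 else "neg") for i in range(30)]

train_set, test_set = holdout_split(data)          # 80% / 20%
print(len(train_set), len(test_set))
print(cross_validate(data, 5, majority_class_accuracy))
print(repeated_cross_validate(data, 5, 3, majority_class_accuracy))
```

The hold-out split gives a single estimate that depends on one particular partition, while cross-validation averages over k partitions at the cost of training k times.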
3.2.4 Comparison of model performance
3.2.4.1 Performance measurement of regression algorithm

3.2.4.2 Performance measurement of classification algorithm