Collection | The Development and Common Methods of Interpretable Machine Learning

2022-06-11 15:53:00 Xiaobai learns vision


In recent years, research on interpretable machine learning (IML) has been booming. Although the field appears young, its roots in regression modeling and rule-based machine learning date back to the 1960s. Recently, an arXiv paper briefly reviewed the history of the IML field, surveyed the state-of-the-art interpretable methods, and discussed the challenges the field faces.

When machine learning models are used in products, decision-making, or research, "interpretability" is often a decisive factor.

Interpretable machine learning (IML) can be used to discover knowledge, to debug and justify models and their predictions, and to control and improve models.

The researchers argue that, in some respects, the development of IML has entered a new stage, but several challenges remain.

A brief history of interpretable machine learning (IML)

Although interpretable machine learning has attracted much research attention in recent years, learning interpretable models from data has a long history.

Linear regression has been in use since the beginning of the 19th century and has since grown into a broad family of regression analysis tools, such as generalized additive models (GAMs) and the elastic net.

The philosophy behind these statistical models is usually to make distributional assumptions or to limit model complexity, thereby imposing interpretability on the model intrinsically.
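As a rough illustration (a minimal scikit-learn sketch, not code from the paper), the elastic net shows how a complexity penalty keeps the model inside the inherently interpretable linear family:

```python
# Minimal sketch: the elastic net constrains complexity through L1/L2
# penalties, so the fitted model stays a sparse, readable linear model.
from sklearn.linear_model import ElasticNet
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       random_state=0)
model = ElasticNet(alpha=1.0, l1_ratio=0.5)  # penalty strength and L1/L2 mix
model.fit(X, y)
# The L1 part drives uninformative coefficients to exactly zero.
print(model.coef_)
```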


Machine learning, however, takes a slightly different modeling approach.

Machine learning algorithms usually follow nonlinear, nonparametric approaches. Instead of restricting model complexity in advance, the complexity is controlled by one or more hyperparameters and selected via cross-validation. This flexibility often yields models with good predictive performance that are hard to interpret.
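A minimal sketch of this workflow, assuming scikit-learn (the SVM and its grid of values are purely illustrative): the hyperparameters C and gamma control the model's flexibility and are chosen by cross-validated grid search rather than fixed up front.

```python
# Minimal sketch: hyperparameters govern complexity and are picked by
# cross-validation instead of being restricted in advance.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
grid = GridSearchCV(SVC(kernel="rbf"),
                    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
                    cv=5)  # 5-fold cross-validation over the grid
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```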

Although machine learning algorithms usually focus on predictive performance, work on the interpretability of machine learning has existed for many years. The built-in feature importance measure of random forests is one of the important milestones of interpretable machine learning.
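For instance, the impurity-based importance mentioned above is exposed directly by scikit-learn's random forest implementation; a minimal sketch:

```python
# Minimal sketch: random forests ship with a built-in feature importance.
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_wine

data = load_wine()
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(data.data, data.target)
# feature_importances_ aggregates each feature's impurity reduction
# over all trees in the forest.
for name, imp in zip(data.feature_names, forest.feature_importances_):
    print(f"{name}: {imp:.3f}")
```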


After a long period of development, deep learning finally prevailed in the ImageNet challenge in 2012.

In the years that followed, judging by the Google search frequency of the two terms "interpretable machine learning" and "explainable AI", one can roughly conclude that the IML field took off around 2015.


Common methods in IML

IML methods are usually distinguished by whether they analyze model components, model sensitivity, or surrogate models.
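To make the surrogate-model idea concrete, here is a minimal sketch (scikit-learn assumed; the gradient-boosted black box is purely illustrative): an interpretable shallow decision tree is fitted to the predictions of the black-box model rather than to the true labels.

```python
# Minimal sketch of a global surrogate: approximate a black box
# with a shallow, human-readable decision tree.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
black_box = GradientBoostingClassifier(random_state=0)
black_box.fit(data.data, data.target)

# Train the surrogate on the black box's outputs, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(data.data, black_box.predict(data.data))
print(export_text(surrogate, feature_names=list(data.feature_names)))
```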


Analyzing the components of interpretable models

Analyzing model components requires decomposing the model into parts that can be interpreted individually; however, the user does not necessarily need to understand the entire model at once.


Interpretable models are generally those whose learned structures and parameters can be assigned specific interpretations. In this sense, linear regression models, decision trees, and decision rules are considered interpretable.

Linear regression models can be interpreted by analyzing their components: the model structure (a weighted sum of the features) allows the weights to be interpreted as the features' effects on the prediction.
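A minimal sketch of this kind of component analysis, assuming scikit-learn: each fitted weight reads as the feature's effect on the prediction.

```python
# Minimal sketch: a linear model's weights are directly interpretable.
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_diabetes

data = load_diabetes()
model = LinearRegression().fit(data.data, data.target)
# Each weight is the change in the prediction per unit increase
# of the feature, holding the other features fixed.
for name, w in zip(data.feature_names, model.coef_):
    print(f"{name}: {w:+.2f}")
```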

Analyzing the components of more complex models

Researchers also analyze the components of more complex black-box models. For example, to visualize the abstract features learned by a convolutional neural network (CNN), one can find or generate images that strongly activate a given CNN feature map.
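A minimal sketch of this "activation maximization" idea in PyTorch; the tiny untrained network below is only a placeholder for a real, trained CNN:

```python
# Minimal sketch: gradient ascent on an input image to maximize the
# activation of one feature map (placeholder network, not a trained model).
import torch
import torch.nn as nn

cnn = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU())

img = torch.randn(1, 3, 64, 64, requires_grad=True)  # start from noise
opt = torch.optim.Adam([img], lr=0.1)
channel = 5  # the feature map we want to visualize

for _ in range(100):
    opt.zero_grad()
    loss = -cnn(img)[0, channel].mean()  # ascend on the mean activation
    loss.backward()
    opt.step()
# After optimization, `img` shows the pattern this feature map responds to.
```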

For random forests, the minimal depth distribution and the Gini importance of the trees can be analyzed to quantify the importance of features.
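The Gini importance is available out of the box (see the earlier sketch); the minimal-depth analysis can be approximated by walking the trees directly. The following is an illustrative sketch, not an official scikit-learn API:

```python
# Minimal sketch: the depth at which a feature first splits in each tree;
# smaller minimal depths suggest more important features.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

X, y = load_wine(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

min_depths = {f: [] for f in range(X.shape[1])}
for est in forest.estimators_:
    tree = est.tree_
    stack = [(0, 0)]  # (node_id, depth), starting at the root
    seen = {}
    while stack:
        node, depth = stack.pop()
        feat = tree.feature[node]
        if feat >= 0:  # internal node (leaves are marked with -2)
            seen[feat] = min(seen.get(feat, depth), depth)
            stack.append((tree.children_left[node], depth + 1))
            stack.append((tree.children_right[node], depth + 1))
    for f, d in seen.items():
        min_depths[f].append(d)

for f, depths in min_depths.items():
    if depths:
        print(f"feature {f}: mean minimal depth = {np.mean(depths):.2f}")
```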


Model component analysis is a useful tool, but its drawback is that it is tied to a specific model class and does not combine well with commonly used model selection routines, which typically search over many different ML models via cross-validation.

Challenges in the development of IML

Statistical uncertainty

Many IML methods, such as permutation feature importance, provide an explanation without quantifying the uncertainty of that explanation.

Both the model itself and its explanations are computed from data and are therefore subject to uncertainty. Current research is attempting to quantify the uncertainty of explanations, for example for feature importance and layer-wise relevance.
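One practical step in this direction, sketched below with scikit-learn's permutation importance: repeating the permutation several times gives at least a rough spread (a standard deviation) around each importance estimate.

```python
# Minimal sketch: permutation importance with repeated permutations,
# reported as mean ± standard deviation across repeats.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

result = permutation_importance(model, X_te, y_te,
                                n_repeats=30, random_state=0)
for f in range(X.shape[1]):
    print(f"feature {f}: {result.importances_mean[f]:.3f} "
          f"± {result.importances_std[f]:.3f}")
```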

Causal interpretation

Ideally, a model should reflect the true causal structure of the phenomenon behind it, so that causal interpretations become possible. When IML is used in science, causal interpretation is usually the goal of modeling.

However, most statistical learning procedures only reflect correlation structures of the data-generating process, not its true inherent causal structure. A causal structure would also make the model more robust against adversarial attacks and more useful as a basis for decision-making.


Unfortunately, predictive performance and causality can be conflicting goals.

For example, today's weather causally determines tomorrow's weather, but we may only have the feature "slippery ground". Using "slippery ground" in a model that predicts tomorrow's weather is useful, because it carries information about today's weather; yet since today's weather itself is missing from the ML model, the explanation cannot be causal.

Feature dependence

Dependence between features introduces problems of attribution and extrapolation. For example, when features are correlated and share information, their importance and effects become difficult to disentangle.

Correlated features can, for example, receive distorted importance scores in random forests. Many methods based on sensitivity analysis permute a feature; when that feature depends on another feature, the permutation breaks the association, and the resulting data points are extrapolated into regions outside the data distribution.


The ML model was never trained on such combined feature values and may never encounter similar data points in the application. Extrapolation of this kind can therefore produce misleading explanations.
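A minimal NumPy sketch of why this happens: permuting one of two correlated features destroys their dependence and produces combinations the model never saw.

```python
# Minimal sketch: permutation breaks the dependence between two
# correlated features, creating out-of-distribution data points.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
x2 = x1 + rng.normal(scale=0.1, size=1000)  # x2 strongly tied to x1

print("correlation before permutation:", np.corrcoef(x1, x2)[0, 1])  # ~1.0

x2_perm = rng.permutation(x2)  # what permutation-based methods do internally
print("correlation after permutation:", np.corrcoef(x1, x2_perm)[0, 1])  # ~0
# Pairs like (x1 = 2.5, x2 = -2.3) now occur, although the training data
# never contained them: the model is evaluated on extrapolated points.
```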

How to explain predictions to individuals with different knowledge and backgrounds, and how to satisfy the demand for interpretability at the institutional or societal level, may be future goals of IML.

This covers a wider range of areas, such as human-computer interaction, psychology, and sociology. To meet the challenges of the future, the authors argue that the field of interpretable machine learning must extend horizontally into other fields, and vertically deeper into statistics and computer science.

Reference link:

https://arxiv.org/abs/2010.09337
