MIT's latest paper, "The Need for Interpretable Features: Motivation and Taxonomy": building interpretability into the constituent elements of machine learning models
2022-07-01 10:53:00 【Zhiyuan community】

Explanation methods that help users understand and trust machine learning models often describe how much certain features used by the model contribute to its predictions. For example, if a model predicts a patient's risk of heart disease, a doctor may want to know how strongly the patient's heart-rate data influences that prediction. But if the features are so complex that users cannot understand them, what good is the explanation?

MIT researchers are working to improve the interpretability of features so that decision makers can act on the outputs of machine learning models with more confidence. Drawing on years of fieldwork, they developed a taxonomy to help developers design features that their target audiences can readily understand.

"We found that in the real world, even when we use state-of-the-art methods to explain machine learning models, there is still a lot of confusion stemming from the features, not from the model itself," says Alexandra Zytek, a doctoral student in electrical engineering and computer science and lead author of the paper introducing the taxonomy.

To build the taxonomy, the researchers defined the properties that make features interpretable to five categories of users, ranging from AI experts to the people affected by a machine learning model's predictions. They also provide guidance for model creators on how to transform features into formats that are easier for laypeople to understand. They hope their work will inspire model builders to use interpretable features from the beginning of the development process, rather than trying to bolt on interpretability after the fact.

The study was published in the June issue of the ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) Explorations Newsletter.
Lessons from the real world
Features are the variables fed into a machine learning model; they are usually extracted from a dataset. Data scientists typically select and hand-craft features for a model, Kalyan Veeramachaneni explains, and their main concern is ensuring the features improve the model's accuracy, not whether a decision maker can understand them. For the past several years, he and his team have worked with decision makers to identify the usability challenges of machine learning. These domain experts, most of whom lack machine learning knowledge, often do not trust models because they do not understand the features that drive the predictions.

In one project, they worked with clinicians in a hospital intensive care unit who used machine learning to predict the risk of complications a patient would face after cardiac surgery. Some features were presented as aggregated values, such as the trend of a patient's heart rate over time. While features encoded this way were "model ready" (the model could process the data), the clinicians did not understand how they were computed. The authors say the clinicians would rather see how these aggregated features relate to the original values, so they could spot anomalies in a patient's heart rate. A minimal sketch of this contrast appears below.

By contrast, a group of learning scientists preferred aggregated features. Rather than having a feature like "the number of posts a student made on discussion forums," they wanted related features grouped together and labeled with terms they understood, such as "participation."

"With interpretability, one size does not fit all. When you go from one area to another, the needs are different," Veeramachaneni says. "Interpretability itself has many levels."

The idea that one size does not fit all is key to the researchers' taxonomy. They define properties that make features more or less interpretable for different decision makers, and outline which properties are likely to matter most to specific users. For example, machine learning developers may focus on features that are model-compatible and predictive, meaning they are expected to improve the model's performance. Decision makers with no machine learning experience, on the other hand, may be better served by features that are human-worded, meaning they are described in a way that is natural for users, and understandable, meaning they refer to real-world metrics users can reason about.

"With the taxonomy, if you want to make interpretable features, to what level are they interpretable? You may not need all the levels, depending on the type of domain experts you are working with," Zytek says.
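To make the clinicians' concern concrete, here is a minimal Python sketch of the kind of aggregation described above. The data, the `heart_rate_trend` helper, and its least-squares definition are illustrative assumptions, not code or formulas from the paper; the point is only that a model-ready aggregate hides the raw series a domain expert would want to see alongside it.

```python
import numpy as np

def heart_rate_trend(hours, bpm):
    """Aggregate raw heart-rate readings into one 'model ready' feature:
    the slope (bpm per hour) of a least-squares line through the readings.
    (Hypothetical aggregation, for illustration only.)"""
    slope, _intercept = np.polyfit(hours, bpm, deg=1)
    return slope

# Hypothetical raw readings a clinician can inspect directly.
hours = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # hours since surgery
bpm = np.array([72, 75, 74, 88, 95])          # beats per minute

print(f"Model-ready feature: hr_trend = {heart_rate_trend(hours, bpm):+.1f} bpm/hour")

# An interpretable presentation keeps the raw series next to the aggregate,
# so the abnormal jump (74 -> 88 bpm) stays visible to the clinician.
for h, b in zip(hours, bpm):
    print(f"  t+{h:.0f}h: {b} bpm")
```

A single slope of +5.9 bpm/hour is easy for a model to consume but says nothing about when or how sharply the heart rate changed; pairing it with the raw readings is the kind of relationship the clinicians asked for.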
Put interpretability first
The researchers also outline feature engineering techniques a developer can use to make features more interpretable for a specific audience. Feature engineering is the process by which data scientists transform data into a format machine learning models can process, using techniques such as aggregating data or normalizing values. Most models also cannot handle categorical data unless it is converted to a numerical code. These transformations are often nearly impossible for a layperson to unpack.

Creating interpretable features may involve undoing some of that encoding, Zytek says. For example, a common feature engineering technique organizes spans of data so they all contain the same number of years. To make these features more interpretable, one could instead group age ranges using human terms, such as infant, toddler, child, and teen (see the sketch below). Rather than a transformed feature like average pulse rate, the authors add, an interpretable feature might simply be the actual pulse-rate data.

"In a lot of domains, the trade-off between interpretable features and model accuracy is actually very small," Zytek says. "For example, when we worked with child welfare screeners, we retrained the model using only features that met our definitions of interpretability, and the performance decrease was almost negligible."

Building on this work, the researchers are developing a system that lets model developers handle complicated feature transformations more efficiently, in order to create human-centered explanations for machine learning models. The new system will also convert algorithms designed to explain model-ready datasets into formats decision makers can understand.
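As a small illustration of this "undoing the encoding" idea, the pandas sketch below contrasts an opaque numeric age binning with the same information expressed in human terms. The ages, bin edges, and labels are made up for the example; the paper itself does not prescribe any particular code.

```python
import pandas as pd

# Hypothetical ages in years (illustrative data only).
ages = pd.Series([0.5, 2, 7, 15, 34], name="age_years")

# Model-ready encoding: four equal-width bins labeled 0..3,
# which is opaque to a non-expert reading an explanation.
model_ready = pd.cut(ages, bins=4, labels=False)

# Interpretable encoding: the same binning idea in human terms.
edges = [0, 1, 4, 13, 20, 120]                        # bin edges in years
labels = ["infant", "toddler", "child", "teen", "adult"]
interpretable = pd.cut(ages, bins=edges, labels=labels, right=False)

print(pd.DataFrame({"age_years": ages,
                    "model_ready_bin": model_ready,
                    "interpretable_bin": interpretable}))
```

An explanation that says "toddler" carries meaning for a screener or clinician in a way that "age_bin = 0" does not, while the model sees a categorical feature either way.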
The paper

https://arxiv.org/pdf/2202.11748
Through extensive experience developing and explaining machine learning (ML) applications for real-world domains, we have learned that ML models are only as interpretable as their features. Even simple, highly interpretable model types, such as regression models, can be difficult or impossible to understand if they use uninterpretable features. Different users, especially those who use ML models for decision making in their domains, may require different levels and types of feature interpretability. Furthermore, based on our experience, we argue that the term "interpretable feature" is neither specific nor detailed enough to capture the full extent to which features affect the usefulness of ML explanations. In this paper, we motivate and discuss three key lessons: 1) more attention should be given to what we call the interpretable feature space, that is, the state of features that are useful to domain experts taking real-world actions; 2) a formal taxonomy is needed of the feature properties these domain experts may require (we propose a partial taxonomy in this paper); and 3) transforms that take data from a model-ready state to an interpretable form are just as important as the traditional ML transforms that prepare features for the model.
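Lesson 3 can be illustrated with a short, hedged sketch: a traditional ML transform (one-hot encoding) paired with an inverse transform that restores the interpretable form. The feature name and values are invented for this example and do not come from the paper.

```python
import pandas as pd

# Hypothetical categorical feature (illustrative data only).
raw = pd.DataFrame({"housing": ["rented", "owned", "rented", "other"]})

# Traditional ML transform: one-hot encode into a model-ready matrix.
model_ready = pd.get_dummies(raw["housing"], prefix="housing")

# Interpretable-form transform (lesson 3): invert the encoding so an
# explanation can say 'housing = rented' instead of 'housing_rented = 1'.
recovered = model_ready.idxmax(axis=1).str.removeprefix("housing_")

print(model_ready)
print(recovered.rename("housing"))
```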