MIT's latest paper, "The Need for Interpretable Features: Motivation and Taxonomy": building interpretability into the building blocks of machine learning models
2022-07-01 10:53:00 [Zhiyuan Community]

Explanation methods that help users understand and trust machine-learning models often describe how much certain features used in the model contribute to its predictions. For example, if a model predicts a patient's risk of heart disease, a physician may want to know how strongly the patient's heart-rate data influences that prediction. But if the features themselves are so complex that users cannot understand them, what good is the explanation?

MIT researchers are working to improve the interpretability of features so that decision makers can use the outputs of machine-learning models with more confidence. Drawing on years of field work, they developed a taxonomy to help developers design features that their target audiences can understand more easily. "We found that in the real world, even when we use state-of-the-art methods to explain machine-learning models, a lot of the confusion comes from the features, not from the model itself," says Alexandra Zytek, a doctoral student in electrical engineering and computer science and lead author of the paper introducing the taxonomy.

To build the taxonomy, the researchers defined the properties that make features interpretable for five categories of users, from AI experts to the people affected by a machine-learning model's predictions. They also offer guidance to model creators on how to transform features into formats that laypeople can understand more easily. They hope their work will inspire model builders to consider interpretable features from the beginning of the development process, rather than trying to retrofit interpretability after the fact. The study was published in the June issue of the ACM SIGKDD Explorations Newsletter.
Lessons from the real world
Features are the variables fed into a machine-learning model; they are usually extracted from the columns of a dataset. Data scientists typically select and hand-craft features for a model, Veeramachaneni explains, and their main concern is ensuring the features improve the model's accuracy, not whether a decision maker can understand them. For several years, he and his team have worked with decision makers to identify the usability challenges of machine learning. These domain experts, most of whom lack machine-learning knowledge, often distrust the models because they do not understand the features that drive the predictions.

In one project, the team worked with clinicians in a hospital intensive care unit who used machine learning to predict the risk of complications patients face after cardiac surgery. Some features were presented as aggregate values, such as the trend of a patient's heart rate over time. Although features encoded this way were "model ready" (the model could process the data), the clinicians did not know how they were computed. The authors say the clinicians would rather see how these aggregated features relate to the original values, so they could identify anomalies in a patient's heart rate.

By contrast, a group of learning scientists preferred aggregated features. Instead of a feature like "the number of posts a student made in the discussion forum," they preferred related features grouped together and labeled with terms they understood, such as "participation." "With interpretability, one size doesn't fit all. When you go from one area to another, there are different needs," Veeramachaneni says. "Interpretability itself has many levels." The idea that one size does not fit all is central to the researchers' taxonomy.
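The gap between a model-ready aggregate and the raw values behind it can be sketched in a few lines. This is a hypothetical illustration, not code from the paper: the model consumes a single heart-rate "trend" number, while a clinician would want to inspect the raw readings that produced it.

```python
# Hypothetical illustration of a "model-ready" aggregate feature:
# the model sees only the slope; the clinician needs the raw series.

def heart_rate_trend(readings):
    """Least-squares slope of heart rate over time (bpm per hour).

    `readings` is a list of (hour, bpm) pairs. The single slope value
    is what the model consumes; the raw pairs are what a clinician can
    actually reason about (e.g., spotting an abnormal spike).
    """
    n = len(readings)
    mean_t = sum(t for t, _ in readings) / n
    mean_hr = sum(hr for _, hr in readings) / n
    num = sum((t - mean_t) * (hr - mean_hr) for t, hr in readings)
    den = sum((t - mean_t) ** 2 for t, _ in readings)
    return num / den

raw = [(0, 72), (1, 75), (2, 81), (3, 90)]  # hourly readings after surgery
print(heart_rate_trend(raw))  # 6.0 bpm/hour -- a rising trend
```

The aggregate answers "is the heart rate rising?" for the model, but discards exactly the detail, which reading jumped and when, that the clinicians in the ICU study said they needed.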
They define the properties that make features easier or harder for different decision makers to interpret, and outline which properties are likely to matter most to particular users. For example, machine-learning developers may prioritize features that are model-compatible and predictive, meaning they are expected to improve the model's performance. Decision makers with no machine-learning experience, on the other hand, may be better served by features that are human-worded, that is, described in language that feels natural to users, and understandable, that is, referring to real-world metrics users can reason about. "The taxonomy asks: if you are making interpretable features, to what level are they interpretable? You may not need all the levels, depending on the type of domain experts you are working with," Zytek says.
Put interpretability first
The researchers also outline feature-engineering techniques a developer can adopt to make features more interpretable for a specific audience. Feature engineering is the process by which data scientists transform data into a format a machine-learning model can process, using techniques such as aggregating data or normalizing values. Most models also cannot handle categorical data unless it is converted to a numeric code, and these transformations are often nearly impossible for nonexperts to unpack. Creating interpretable features may involve undoing some of that encoding, Zytek says. For example, a common feature-engineering technique organizes spans of data so they all contain the same number of years. To make such features easier to explain, one could instead group age ranges using human terms, like infant, toddler, child, and adolescent. And rather than a transformed feature like average pulse rate, the authors add, an interpretable feature might simply be the actual pulse-rate data.

"In a lot of domains, the trade-off between interpretable features and model accuracy is actually very small," Zytek says. "For example, when we worked with child-welfare screeners, we retrained the model using only features that met our definitions of interpretability, and the performance drop was almost negligible." Building on this work, the researchers are developing a system that lets model developers handle complex feature transformations more efficiently, to create human-centered explanations for machine-learning models. The new system will also convert algorithms designed to explain model-ready datasets into formats decision makers can understand.
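The "undoing the encoding" idea above can be made concrete with a minimal sketch. The cut-offs and label names here are assumptions chosen for illustration, not values from the paper: a raw numeric age is mapped back into the kind of human terms the authors describe.

```python
# Minimal sketch (hypothetical cut-offs and labels): replacing a raw
# numeric age feature with human-readable bins a domain expert can read.

def age_group(age_years):
    """Map an age in years to a label a domain expert can reason about."""
    if age_years < 1:
        return "infant"
    if age_years < 4:
        return "toddler"
    if age_years < 13:
        return "child"
    if age_years < 20:
        return "adolescent"
    return "adult"

print([age_group(a) for a in [0.5, 2, 9, 16, 35]])
# → ['infant', 'toddler', 'child', 'adolescent', 'adult']
```

The model can still consume the binned feature (one-hot encoded, if needed), but the explanation shown to a screener or clinician now speaks in categories they already use, rather than in normalized numeric spans.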
The paper

https://arxiv.org/pdf/2202.11748
Through extensive experience developing and explaining machine learning (ML) applications for real-world domains, we have learned that the interpretability of an ML model is only as good as that of its features. Even simple, highly interpretable model types, such as regression models, can be difficult or impossible to understand if they use uninterpretable features. Different users, especially those who use ML models to make decisions in their domains, may require different levels and types of feature interpretability. Furthermore, based on our experience, we argue that the term "interpretable feature" is neither specific nor detailed enough to capture the full extent of how features affect the usefulness of ML explanations. In this paper, we motivate and discuss three key lessons: 1) more attention should be given to what we call the interpretable feature space, or the state of features that are useful to domain experts taking real-world actions; 2) a formal taxonomy is needed of the feature properties these domain experts may require (we propose a partial taxonomy in this paper); and 3) transforms that take data from a model-ready state to an interpretable form are just as important as the traditional ML transforms that prepare features for the model.