MIT's latest paper, "The Need for Interpretable Features: Motivation and Taxonomy": building interpretability into the constituent elements of machine learning models
2022-07-01 10:53:00 [Zhiyuan Community]

Explanation methods that help users understand and trust machine learning models often describe how much certain features used in the model contribute to its predictions. For example, if a model predicts a patient's risk of heart disease, a doctor may want to know how strongly the patient's heart rate data influences that prediction. But if the features are so complex that users cannot understand them, what use is the explanation?

MIT researchers are working to improve the interpretability of features so that decision makers can use the outputs of machine learning models with more confidence. Drawing on years of field work, they developed a taxonomy to help developers design features that their target audience can understand more easily.

"We found that in the real world, even when we use state-of-the-art methods to explain machine learning models, a great deal of the confusion comes from the features, not from the model itself," says Alexandra Zytek, a doctoral student in electrical engineering and computer science and lead author of the paper introducing the taxonomy.

To build the taxonomy, the researchers defined properties that make features interpretable for five categories of users, from AI experts to people affected by a machine learning model's predictions. They also offer model creators guidance on how to transform features into formats that are easier for non-experts to understand.

They hope the work will inspire model builders to consider interpretable features from the beginning of the development process, rather than trying to retrofit interpretability after the fact. The study was published in the June edition of the Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining (ACM SIGKDD) Explorations Newsletter.
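To make the idea of feature contributions concrete, here is a minimal sketch using scikit-learn's permutation importance, one common explanation method (the paper does not prescribe a specific one). The dataset, column names such as `resting_heart_rate`, and the model choice are all invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic, hypothetical patient data (not from the paper).
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "resting_heart_rate": rng.normal(75, 12, n),
    "age": rng.integers(30, 85, n),
    "systolic_bp": rng.normal(130, 18, n),
})
# Synthetic label: risk loosely tied to heart rate and blood pressure.
y = ((X["resting_heart_rate"] > 85) | (X["systolic_bp"] > 150)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Permutation importance: how much does shuffling each feature hurt accuracy?
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
for name, score in zip(X.columns, result.importances_mean):
    print(f"{name}: {score:.3f}")
```

An explanation like this is only as useful as its inputs: if `resting_heart_rate` were replaced by an opaque aggregate, the importance score would be just as opaque, which is exactly the problem the paper addresses.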
Lessons from the real world
Features are the input variables fed to a machine learning model; they are typically extracted from the columns of a dataset. Kalyan Veeramachaneni explains that data scientists usually select and hand-craft features for a model, and their main concern is ensuring that those features improve the model's accuracy, not whether a decision maker can understand them. For several years, he and his team have worked with decision makers to identify the usability challenges of machine learning. These domain experts, most of whom lack machine learning knowledge, often do not trust models because they do not understand the features that drive the predictions.

In one project, they worked with clinicians in a hospital intensive care unit who used machine learning to predict the risk of complications a patient will face after cardiac surgery. Some features were presented as aggregated values, such as the trend of a patient's heart rate over time. Although features encoded this way are "model ready" (the model can process the data), clinicians did not understand how they were computed. The authors say clinicians would rather see how these aggregated features relate to the original values, so they can spot anomalies in a patient's heart rate.

By contrast, a group of learning scientists preferred aggregated features. Instead of a feature like "the number of posts a student made in the discussion forum," they preferred related features to be grouped together and labeled with terms they understand, such as "participation."

"With interpretability, one size does not fit all. As you go from one area to another, the needs differ," Veeramachaneni says. "Interpretability itself has many levels."

The idea that one size does not fit all is the key to the researchers' taxonomy. They define properties that can make features more or less interpretable for different decision makers, and outline which properties are likely to matter most to particular users. For example, machine learning developers may focus on features that are model compatible and predictive, meaning they are expected to improve the model's performance. Decision makers with no machine learning experience, on the other hand, may be better served by features that are human worded, meaning they are described in a way that is natural for users, and human understandable, meaning they refer to real-world quantities users can reason about.

"When building the taxonomy, we asked: if you want interpretable features, to what level can you make them interpretable? You may not need all the levels; it depends on the type of domain expert you are working with," Zytek says.
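A short pandas sketch may help illustrate the two directions described above: clinicians wanting an aggregate traced back to raw values, and learning scientists wanting raw counts rolled up under a label in their own vocabulary. The column names and values are invented, not taken from the paper's case studies.

```python
import pandas as pd

# Clinicians: a "model-ready" aggregate (heart-rate trend) is opaque unless
# it can be traced back to the raw values it was computed from.
hr = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2, 2],
    "heart_rate": [72, 88, 110, 64, 66, 65],
})
trend = hr.groupby("patient_id")["heart_rate"].agg(
    hr_trend=lambda s: s.iloc[-1] - s.iloc[0]  # aggregate fed to the model
)
# Showing the raw series alongside hr_trend lets a clinician spot the
# abnormal climb for patient 1.
print(trend)

# Learning scientists: the opposite direction, rolling fine-grained counts
# into one concept labeled in terms they use themselves.
activity = pd.DataFrame({
    "forum_posts": [3, 0, 7],
    "replies": [1, 0, 4],
    "resources_viewed": [5, 2, 9],
})
activity["participation"] = activity.sum(axis=1)  # one named, understandable feature
print(activity)
```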
Put interpretability first
The researchers also outline feature engineering techniques a developer can employ to make features more interpretable for a specific audience. Feature engineering is the process by which data scientists transform data into a format a machine learning model can process, using techniques such as aggregating data or normalizing values. Most models also cannot handle categorical data unless it is converted to a numerical code. These transformations are often nearly impossible for a layperson to unpack.

Zytek says creating interpretable features may involve undoing some of that encoding. For example, a common feature engineering technique organizes spans of data so they all contain the same number of years. To make these features more interpretable, one could instead group age ranges using human terms, such as infant, toddler, child, and adolescent. Rather than using a transformed feature like average pulse rate, the authors add, an interpretable feature might simply be the actual pulse rate data.

"In many domains, the trade-off between interpretable features and model accuracy is actually very small," Zytek says. "For example, when we worked with child welfare screeners, we retrained the model using only features that met our definitions of interpretability, and the performance drop was almost negligible."

Building on this work, the researchers are developing a system that lets model developers handle complex feature transformations more efficiently and create human-centered explanations for machine learning models. The new system will also convert algorithms designed to explain model-ready datasets into formats that decision makers can understand.
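As a hedged sketch of the age-binning example above, assuming pandas; the cut points and labels are illustrative choices, not values from the paper.

```python
import pandas as pd

ages = pd.Series([0.5, 2, 8, 15, 34, 67])

# Model-oriented: equal-width numeric bins, which come out as opaque codes.
equal_bins = pd.cut(ages, bins=4, labels=False)

# Human-oriented: the same variable grouped in everyday terms.
human_bins = pd.cut(
    ages,
    bins=[0, 1, 4, 12, 19, 120],
    labels=["infant", "toddler", "child", "adolescent", "adult"],
)
print(pd.DataFrame({"age": ages, "equal_bins": equal_bins, "human_bins": human_bins}))
```

Both versions carry the same information to the model; only the human-oriented one means anything to a screener or clinician reading an explanation.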
The paper

https://arxiv.org/pdf/2202.11748
Through extensive experience developing and explaining machine learning (ML) applications for real-world domains, we have learned that ML models are only as interpretable as their features. Even simple, highly interpretable model types, such as regression models, can be difficult or impossible to understand if they use uninterpretable features. Different users, especially those who use ML models to make decisions in their domains, may require different levels and types of feature interpretability. Furthermore, based on our experience, we argue that the term "interpretable features" is neither specific nor detailed enough to capture the full extent to which features influence the usefulness of ML explanations. In this paper, we motivate and discuss three key lessons: 1) more attention should be given to what we call the interpretable feature space, the state of features that is useful to domain experts taking real-world actions; 2) a formal taxonomy is needed of the feature properties these domain experts may require (we propose a partial taxonomy in this paper); and 3) transforms that move data from a model-ready state to an interpretable form are just as important as the traditional ML transforms that prepare features for the model.
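One way to read lesson 3 in code: the inverse of a standard ML transform can recover an interpretable form from model-ready data. Below is a minimal sketch with scikit-learn's `OneHotEncoder`, whose built-in `inverse_transform` undoes the encoding; the categories are invented for illustration, and this is only one simple instance of the broader class of transforms the paper has in mind.

```python
from sklearn.preprocessing import OneHotEncoder

# Categorical data encoded for the model (sparse_output requires sklearn >= 1.2).
encoder = OneHotEncoder(sparse_output=False)
model_ready = encoder.fit_transform([["toddler"], ["adult"], ["infant"]])

print(model_ready)                             # numeric codes the model consumes
print(encoder.inverse_transform(model_ready))  # interpretable form for the expert
```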