MIT's latest paper, "The Need for Interpretable Features: Motivation and Taxonomy": building interpretability into the components of machine learning models
2022-07-01 10:53:00 [Zhiyuan Community]
Explanation methods that help users understand and trust machine learning models typically describe how much certain features used in the model contribute to its predictions. For example, if a model predicts a patient's risk of heart disease, a doctor may want to know how strongly the patient's heart-rate data influences that prediction. But if the features are so complex that users cannot understand them, what good is the explanation?

MIT researchers are working to improve the interpretability of features so that decision makers can use the outputs of machine learning models with more confidence. Drawing on years of field work, they developed a taxonomy to help developers design features that their target audience can understand more easily.

"We found that in the real world, even when we use state-of-the-art methods to explain machine learning models, a great deal of the confusion comes from the features, not from the model itself," says Alexandra Zytek, a doctoral student in electrical engineering and computer science and lead author of the paper introducing the taxonomy.

To build the taxonomy, the researchers defined properties that make features interpretable for five categories of users, from AI experts to the people affected by a machine learning model's predictions. They also offer guidance for model creators on how to transform features into formats that are easier for lay people to understand. They hope the work will inspire model builders to consider interpretable features from the beginning of the development process, rather than trying to bolt on interpretability after the fact.

The study was published in the June edition of the ACM SIGKDD Explorations Newsletter (the Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining).
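For readers unfamiliar with feature-level explanations, the sketch below shows one common way such contributions can be estimated: permutation importance on a toy classifier. It is a minimal illustration under assumed data (the `resting_heart_rate` column and the synthetic labels are hypothetical), not the explanation method used in the work described here.

```python
# Minimal illustration: estimating how much each feature contributes to a
# prediction task via permutation importance. All data here is synthetic and
# the column names (e.g. resting_heart_rate) are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "resting_heart_rate": rng.normal(75, 12, n),
    "age": rng.integers(30, 80, n),
    "cholesterol": rng.normal(200, 35, n),
})
# Toy label: risk grows with heart rate and age, plus noise.
y = ((X["resting_heart_rate"] - 75) / 12
     + (X["age"] - 55) / 15
     + rng.normal(0, 1, n)) > 0.5

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# How much does model accuracy drop when each feature is shuffled?
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for name, score in zip(X.columns, result.importances_mean):
    print(f"{name}: {score:.3f}")
```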
Lessons from the real world
Features are the input variables fed into machine learning models; they are usually extracted from the columns of a dataset. Data scientists typically select and hand-craft features for a model, and their main concern is ensuring that the features improve the model's accuracy, not whether a decision maker can understand them, explains co-author Kalyan Veeramachaneni.

For several years, he and his team have worked with decision makers to identify the usability challenges of machine learning. These domain experts, most of whom lack machine learning knowledge, often do not trust models because they do not understand the features that drive the predictions.

In one project, they worked with clinicians in a hospital intensive care unit who used machine learning to predict the risk of complications a patient would face after cardiac surgery. Some features were presented as aggregated values, such as the trend of a patient's heart rate over time. Although features coded this way were "model ready" (the model could process the data), clinicians did not understand how they were computed. The authors say the clinicians would rather see how these aggregated features relate to the original values, so they could identify anomalies in a patient's heart rate.

By contrast, a group of learning scientists preferred aggregated features. Instead of a feature like "the number of posts a student made on discussion forums," they would rather have related features grouped together and labeled with terms they understood, such as "participation."

"With interpretability, one size doesn't fit all. When you go from one area to another, there are different needs. And interpretability itself has many levels," Veeramachaneni says.

The idea that one size does not fit all is key to the researchers' taxonomy. They define properties that can make features more or less interpretable for different decision makers, and outline which properties are likely to matter most to particular users. For example, machine learning developers might focus on features that are model-compatible and predictive, meaning they are expected to improve the model's performance. Decision makers with no machine learning experience, on the other hand, may be better served by features that are human-worded, that is, described in a way that is natural for users, and understandable, meaning they refer to real-world metrics users can reason about.

"The taxonomy says: if you are making interpretable features, to what level are they interpretable? You may not need all levels, depending on the type of domain experts you are working with," Zytek says.
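To make the contrast above concrete, here is a small sketch comparing a "model ready" aggregate (a heart-rate trend) with the raw values a clinician might prefer, and a grouped, human-labeled feature such as "participation" for the education example. The column names, numbers, and groupings are illustrative assumptions, not the features used in the study.

```python
# Illustrative sketch only; column names and values are assumptions.
import numpy as np
import pandas as pd

# --- ICU example: a "model ready" aggregate vs. the raw values behind it ---
heart_rate = pd.Series([72, 75, 80, 88, 95, 103],
                       index=pd.date_range("2022-01-01", periods=6, freq="h"))
# Model-ready feature: slope of a least-squares fit (beats per minute per hour).
trend_bpm_per_hour = np.polyfit(np.arange(len(heart_rate)), heart_rate.values, 1)[0]
print(f"hr_trend = {trend_bpm_per_hour:.1f} bpm/hour")   # opaque to a clinician
print(heart_rate)                                        # raw values they can reason about

# --- Education example: rolling low-level counts up under a human label ---
student = {"forum_posts": 14, "forum_replies": 9, "resources_viewed": 31}
# Interpretable feature: related counts grouped and named in the expert's terms.
participation = student["forum_posts"] + student["forum_replies"]
print(f"participation = {participation}")
```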
Put interpretability first
The researchers also outline feature engineering techniques a developer can use to make features more interpretable for a specific audience. Feature engineering is the process by which data scientists transform data into a format a machine learning model can process, using techniques such as aggregating data or normalizing values. Most models also cannot handle categorical data unless it is converted into a numerical code. These transformations are often nearly impossible for lay people to unpack.

Zytek says creating interpretable features may involve undoing some of that encoding. For example, a common feature engineering technique organizes spans of data so they all contain the same number of years. To make these features more interpretable, one could group age ranges using human terms, such as infant, toddler, child, and teenager. Rather than using a transformed feature like average pulse rate, an interpretable feature might simply be the actual pulse-rate data, the authors add.

"In a lot of domains, the trade-off between interpretable features and model accuracy is actually very small," Zytek says. "When we worked with child welfare screeners, for example, we retrained the model using only features that met our definitions of interpretability, and the performance decrease was almost negligible."

Building on this work, the researchers are developing a system that lets model developers handle complicated feature transformations more efficiently, to create human-centered explanations for machine learning models. The new system will also convert algorithms designed to explain model-ready datasets into formats that decision makers can understand.
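As a minimal sketch of the "undoing" described above, the example below replaces opaque numeric bins and integer category codes with human-readable labels. The bin edges, the labels, and the hypothetical `ward_code` column are assumptions made for illustration.

```python
# Minimal sketch of "undoing" model-ready encodings (assumed columns and labels).
import pandas as pd

df = pd.DataFrame({"age": [0.4, 2, 7, 15, 34], "ward_code": [0, 2, 1, 1, 0]})

# Model-ready: equal-width numeric bins a model can use but a person cannot read.
df["age_bin"] = pd.cut(df["age"], bins=[0, 5, 10, 15, 20, 120], labels=False)

# Interpretable: the same spans expressed in human terms.
df["age_group"] = pd.cut(df["age"],
                         bins=[0, 1, 4, 12, 19, 120],
                         labels=["infant", "toddler", "child", "teenager", "adult"])

# Interpretable: map an integer code back to the category it stands for.
ward_names = {0: "cardiac", 1: "general", 2: "ICU"}
df["ward"] = df["ward_code"].map(ward_names)
print(df[["age", "age_group", "ward"]])
```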
The paper
https://arxiv.org/pdf/2202.11748
Through extensive experience developing and explaining machine learning (ML) applications for real-world domains, we have learned that ML models are only as interpretable as their features. Even simple, highly interpretable model types, such as regression models, can be difficult or impossible to understand if they use uninterpretable features. Different users, especially those who use ML models to make decisions in their domains, may require different levels and types of feature interpretability. Furthermore, based on our experience, we argue that the term "interpretable feature" is neither specific nor detailed enough to capture the full extent to which features affect the usefulness of ML explanations. In this paper, we motivate and discuss three key lessons: 1) more attention should be given to what we call the interpretable feature space, the state of features that is useful to domain experts taking real-world actions, 2) a formal taxonomy is needed of the feature properties these experts may require (we propose a partial taxonomy in this paper), and 3) transforms that take data from a model-ready state to an interpretable form are just as important as traditional ML transforms that prepare features for the model.
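As a hedged illustration of lesson 3, the sketch below pairs a standard model-ready transform (standardization) with the inverse transform that returns values to interpretable, real-world units. The column names are assumptions, and this is only one simple instance of the broader class of transforms the paper discusses.

```python
# Hedged illustration of lesson 3: a forward, model-ready transform paired with
# the inverse transform that brings values back to interpretable units.
# Column names are assumptions for the example.
import pandas as pd
from sklearn.preprocessing import StandardScaler

raw = pd.DataFrame({"pulse_rate": [62, 71, 88, 104], "age": [34, 51, 60, 72]})

scaler = StandardScaler()
model_ready = scaler.fit_transform(raw)                 # z-scores the model consumes
interpretable = scaler.inverse_transform(model_ready)   # back to bpm and years

print(pd.DataFrame(model_ready, columns=raw.columns))    # hard for an expert to read
print(pd.DataFrame(interpretable, columns=raw.columns))  # real-world units again
```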