Fit interpretable models. Explain blackbox machine learning.

Overview

InterpretML - Alpha Release

In the beginning machines learned in darkness, and data scientists struggled in the void to explain them.

Let there be light.

InterpretML is an open-source package that incorporates state-of-the-art machine learning interpretability techniques under one roof. With this package, you can train interpretable glassbox models and explain blackbox systems. InterpretML helps you understand your model's global behavior, or understand the reasons behind individual predictions.

Interpretability is essential for:

  • Model debugging - Why did my model make this mistake?
  • Feature engineering - How can I improve my model?
  • Detecting fairness issues - Does my model discriminate?
  • Human-AI cooperation - How can I understand and trust the model's decisions?
  • Regulatory compliance - Does my model satisfy legal requirements?
  • High-risk applications - Healthcare, finance, judicial, ...

Installation

Python 3.6+ | Linux, Mac, Windows

pip install interpret

Introducing the Explainable Boosting Machine (EBM)

EBM is an interpretable model developed at Microsoft Research*. It uses modern machine learning techniques like bagging, gradient boosting, and automatic interaction detection to breathe new life into traditional GAMs (Generalized Additive Models). This makes EBMs often as accurate as state-of-the-art techniques like random forests and gradient boosted trees. However, unlike these blackbox models, EBMs produce exact explanations and are editable by domain experts.
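
To make "exact explanations" concrete, here is a minimal sketch of how an additive model scores one row. It is plain Python and not the InterpretML implementation: the feature names, bin edges, per-bin scores, and intercept are all hypothetical. The point is that the prediction is the intercept plus one contribution per feature, so the contributions themselves are the explanation.

```python
import math
from bisect import bisect_right

# Hypothetical learned shape functions: bin edges plus one score per bin.
SHAPE_FUNCTIONS = {
    "age": {"edges": [30.0, 50.0], "scores": [-0.8, 0.1, 0.6]},
    "hours_per_week": {"edges": [40.0], "scores": [-0.3, 0.4]},
}
INTERCEPT = -1.0  # hypothetical baseline log-odds


def term_contributions(row):
    """Look up each feature's additive contribution (in log-odds)."""
    contributions = {}
    for name, term in SHAPE_FUNCTIONS.items():
        bin_index = bisect_right(term["edges"], row[name])
        contributions[name] = term["scores"][bin_index]
    return contributions


def predict_proba(row):
    """Sum the intercept and all contributions, then apply the logistic link."""
    logit = INTERCEPT + sum(term_contributions(row).values())
    return 1.0 / (1.0 + math.exp(-logit))


row = {"age": 55.0, "hours_per_week": 45.0}
explanation = term_contributions(row)  # per-feature contributions for this row
probability = predict_proba(row)
```

Because each feature has its own lookup table, a domain expert could in principle edit a single entry in `scores` without touching the rest of the model.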

Dataset/AUROC | Domain   | Logistic Regression | Random Forest | XGBoost   | Explainable Boosting Machine
Adult Income  | Finance  | .907±.003           | .903±.002     | .927±.001 | .928±.002
Heart Disease | Medical  | .895±.030           | .890±.008     | .851±.018 | .898±.013
Breast Cancer | Medical  | .995±.005           | .992±.009     | .992±.010 | .995±.006
Telecom Churn | Business | .849±.005           | .824±.004     | .828±.010 | .852±.006
Credit Fraud  | Security | .979±.002           | .950±.007     | .981±.003 | .981±.003

Notebook for reproducing table

Supported Techniques

Interpretability Technique  | Type
Explainable Boosting        | glassbox model
Decision Tree               | glassbox model
Decision Rule List          | glassbox model
Linear/Logistic Regression  | glassbox model
SHAP Kernel Explainer       | blackbox explainer
LIME                        | blackbox explainer
Morris Sensitivity Analysis | blackbox explainer
Partial Dependence          | blackbox explainer

Train a glassbox model

Let's fit an Explainable Boosting Machine

from interpret.glassbox import ExplainableBoostingClassifier

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

# or substitute with LogisticRegression, DecisionTreeClassifier, RuleListClassifier, ...
# EBM supports pandas dataframes, numpy arrays, and handles "string" data natively.

Understand the model

from interpret import show

ebm_global = ebm.explain_global()
show(ebm_global)

Global Explanation Image


Understand individual predictions

ebm_local = ebm.explain_local(X_test, y_test)
show(ebm_local)

Local Explanation Image


And if you have multiple model explanations, compare them

show([logistic_regression_global, decision_tree_global])

Dashboard Image

For more information, see the documentation.

Acknowledgements

InterpretML was originally created by (equal contributions): Samuel Jenkins, Harsha Nori, Paul Koch, and Rich Caruana

EBMs are a fast derivative of GA2M, invented by: Yin Lou, Rich Caruana, Johannes Gehrke, and Giles Hooker

Many people have supported us along the way. Check out ACKNOWLEDGEMENTS.md!

We also build on top of many great packages. Please check them out!

plotly | dash | scikit-learn | lime | shap | salib | skope-rules | treeinterpreter | gevent | joblib | pytest | jupyter

Citations

InterpretML
"InterpretML: A Unified Framework for Machine Learning Interpretability" (H. Nori, S. Jenkins, P. Koch, and R. Caruana 2019)
@article{nori2019interpretml,
  title={InterpretML: A Unified Framework for Machine Learning Interpretability},
  author={Nori, Harsha and Jenkins, Samuel and Koch, Paul and Caruana, Rich},
  journal={arXiv preprint arXiv:1909.09223},
  year={2019}
}
Paper link

Explainable Boosting
"Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission" (R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad 2015)
@inproceedings{caruana2015intelligible,
  title={Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission},
  author={Caruana, Rich and Lou, Yin and Gehrke, Johannes and Koch, Paul and Sturm, Marc and Elhadad, Noemie},
  booktitle={Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  pages={1721--1730},
  year={2015},
  organization={ACM}
}
Paper link
"Accurate intelligible models with pairwise interactions" (Y. Lou, R. Caruana, J. Gehrke, and G. Hooker 2013)
@inproceedings{lou2013accurate,
  title={Accurate intelligible models with pairwise interactions},
  author={Lou, Yin and Caruana, Rich and Gehrke, Johannes and Hooker, Giles},
  booktitle={Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining},
  pages={623--631},
  year={2013},
  organization={ACM}
}
Paper link
"Intelligible models for classification and regression" (Y. Lou, R. Caruana, and J. Gehrke 2012)
@inproceedings{lou2012intelligible,
  title={Intelligible models for classification and regression},
  author={Lou, Yin and Caruana, Rich and Gehrke, Johannes},
  booktitle={Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining},
  pages={150--158},
  year={2012},
  organization={ACM}
}
Paper link
"Axiomatic Interpretability for Multiclass Additive Models" (X. Zhang, S. Tan, P. Koch, Y. Lou, U. Chajewska, and R. Caruana 2019)
@inproceedings{zhang2019axiomatic,
  title={Axiomatic Interpretability for Multiclass Additive Models},
  author={Zhang, Xuezhou and Tan, Sarah and Koch, Paul and Lou, Yin and Chajewska, Urszula and Caruana, Rich},
  booktitle={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  pages={226--234},
  year={2019},
  organization={ACM}
}
Paper link
"Distill-and-compare: auditing black-box models using transparent model distillation" (S. Tan, R. Caruana, G. Hooker, and Y. Lou 2018)
@inproceedings{tan2018distill,
  title={Distill-and-compare: auditing black-box models using transparent model distillation},
  author={Tan, Sarah and Caruana, Rich and Hooker, Giles and Lou, Yin},
  booktitle={Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society},
  pages={303--310},
  year={2018},
  organization={ACM}
}
Paper link
"Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models" (B. Lengerich, S. Tan, C. Chang, G. Hooker, R. Caruana 2019)
@article{lengerich2019purifying,
  title={Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models},
  author={Lengerich, Benjamin and Tan, Sarah and Chang, Chun-Hao and Hooker, Giles and Caruana, Rich},
  journal={arXiv preprint arXiv:1911.04974},
  year={2019}
}
Paper link
"Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning" (H. Kaur, H. Nori, S. Jenkins, R. Caruana, H. Wallach, J. Wortman Vaughan 2020)
@inproceedings{kaur2020interpreting,
  title={Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning},
  author={Kaur, Harmanpreet and Nori, Harsha and Jenkins, Samuel and Caruana, Rich and Wallach, Hanna and Wortman Vaughan, Jennifer},
  booktitle={Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems},
  pages={1--14},
  year={2020}
}
Paper link
"How Interpretable and Trustworthy are GAMs?" (C. Chang, S. Tan, B. Lengerich, A. Goldenberg, R. Caruana 2020)
@article{chang2020interpretable,
  title={How Interpretable and Trustworthy are GAMs?},
  author={Chang, Chun-Hao and Tan, Sarah and Lengerich, Ben and Goldenberg, Anna and Caruana, Rich},
  journal={arXiv preprint arXiv:2006.06466},
  year={2020}
}
Paper link

LIME
"Why should i trust you?: Explaining the predictions of any classifier" (M. T. Ribeiro, S. Singh, and C. Guestrin 2016)
@inproceedings{ribeiro2016should,
  title={Why should i trust you?: Explaining the predictions of any classifier},
  author={Ribeiro, Marco Tulio and Singh, Sameer and Guestrin, Carlos},
  booktitle={Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining},
  pages={1135--1144},
  year={2016},
  organization={ACM}
}
Paper link

SHAP
"A Unified Approach to Interpreting Model Predictions" (S. M. Lundberg and S.-I. Lee 2017)
@incollection{NIPS2017_7062,
 title = {A Unified Approach to Interpreting Model Predictions},
 author = {Lundberg, Scott M and Lee, Su-In},
 booktitle = {Advances in Neural Information Processing Systems 30},
 editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
 pages = {4765--4774},
 year = {2017},
 publisher = {Curran Associates, Inc.},
 url = {http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf}
}
Paper link
"Consistent individualized feature attribution for tree ensembles" (S. M. Lundberg, G. G. Erion, and S.-I. Lee 2018)
@article{lundberg2018consistent,
  title={Consistent individualized feature attribution for tree ensembles},
  author={Lundberg, Scott M and Erion, Gabriel G and Lee, Su-In},
  journal={arXiv preprint arXiv:1802.03888},
  year={2018}
}
Paper link
"Explainable machine-learning predictions for the prevention of hypoxaemia during surgery" (S. M. Lundberg et al. 2018)
@article{lundberg2018explainable,
  title={Explainable machine-learning predictions for the prevention of hypoxaemia during surgery},
  author={Lundberg, Scott M and Nair, Bala and Vavilala, Monica S and Horibe, Mayumi and Eisses, Michael J and Adams, Trevor and Liston, David E and Low, Daniel King-Wai and Newman, Shu-Fang and Kim, Jerry and others},
  journal={Nature Biomedical Engineering},
  volume={2},
  number={10},
  pages={749},
  year={2018},
  publisher={Nature Publishing Group}
}
Paper link

Sensitivity Analysis
"SALib: An open-source Python library for Sensitivity Analysis" (J. D. Herman and W. Usher 2017)
@article{herman2017salib,
  title={SALib: An open-source Python library for Sensitivity Analysis.},
  author={Herman, Jonathan D and Usher, Will},
  journal={J. Open Source Software},
  volume={2},
  number={9},
  pages={97},
  year={2017}
}
Paper link
"Factorial sampling plans for preliminary computational experiments" (M. D. Morris 1991)
@article{morris1991factorial,
  title={Factorial sampling plans for preliminary computational experiments},
  author={Morris, Max D},
  journal={Technometrics},
  volume={33},
  number={2},
  pages={161--174},
  year={1991},
  publisher={Taylor \& Francis Group}
}
Paper link

Partial Dependence
"Greedy function approximation: a gradient boosting machine" (J. H. Friedman 2001)
@article{friedman2001greedy,
  title={Greedy function approximation: a gradient boosting machine},
  author={Friedman, Jerome H},
  journal={Annals of statistics},
  pages={1189--1232},
  year={2001},
  publisher={JSTOR}
}
    
Paper link

Open Source Software
"Scikit-learn: Machine learning in Python" (F. Pedregosa et al. 2011)
@article{pedregosa2011scikit,
  title={Scikit-learn: Machine learning in Python},
  author={Pedregosa, Fabian and Varoquaux, Ga{\"e}l and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and others},
  journal={Journal of machine learning research},
  volume={12},
  number={Oct},
  pages={2825--2830},
  year={2011}
}
Paper link
"Collaborative data science" (Plotly Technologies Inc. 2015)
@online{plotly, 
  author = {Plotly Technologies Inc.}, 
  title = {Collaborative data science}, 
  publisher = {Plotly Technologies Inc.}, 
  address = {Montreal, QC}, 
  year = {2015}, 
  url = {https://plot.ly} }
  
Link
"Joblib: running python function as pipeline jobs" (G. Varoquaux and O. Grisel 2009)
@article{varoquaux2009joblib,
  title={Joblib: running python function as pipeline jobs},
  author={Varoquaux, Ga{\"e}l and Grisel, O},
  journal={packages. python. org/joblib},
  year={2009}
}
  
Link

Videos

External links

Papers that use or compare EBMs

External tools

Contact us

There are multiple ways to get in touch:

If a tree fell in your random forest, would anyone notice?

Comments
  • M1 Apple Silicon support

    Hello,

    I was trying to install and use the EBM classifier on my M1 computer but came across the following error:

    dlopen(/Users/antran/miniforge3/envs/sk-env/lib/python3.9/site-packages/interpret/glassbox/ebm/../../lib/lib_ebm_native_mac_x64.dylib, 0x0006): tried:
    '/Users/antran/miniforge3/envs/sk-env/lib/python3.9/site-packages/interpret/glassbox/ebm/../../lib/lib_ebm_native_mac_x64.dylib'
    (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e')),
    '/Users/antran/miniforge3/envs/sk-env/lib/python3.9/site-packages/interpret/lib/lib_ebm_native_mac_x64.dylib'
    (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))
    

    I was wondering whether interpret is supported on the M1 chip yet. Is there a workaround for this error?

    Thank you!

    opened by antranttu 21
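
The error above says the bundled native library was built for x86_64 while the running interpreter needs arm64. A quick way to confirm which architecture a Python interpreter runs as is sketched below, using only the standard library. The helper name `needs_native_arm_build` is hypothetical, and the sketch only diagnoses the mismatch; it cannot tell you whether an arm64 build of the package exists.

```python
import platform


def needs_native_arm_build(system, machine):
    """True when an x86_64-only macOS binary cannot be loaded directly."""
    return system == "Darwin" and machine.startswith("arm")


# Values reported by the interpreter that is running this script.
current_system = platform.system()
current_machine = platform.machine()

if needs_native_arm_build(current_system, current_machine):
    print("arm64 interpreter on macOS: an x86_64-only .dylib will not load")
else:
    print(f"interpreter reports {current_system}/{current_machine}")
```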
  • Use graphs in a Jupyter notebook?

    Thanks for this library.

    I'm following along with the README.md and got to:

    from interpret import show
    
    ebm_global = ebm.explain_global()
    show(ebm_global)
    

    When I run that in my Jupyter notebook I get: RuntimeError: Could not find open port.

    Maybe it's trying to run a web server from a notebook?

    Can I just make the individual graphs in the notebook? How?

    I see functions in interpret.visual.plot, but I'm having a bit of trouble finding the right objects to pass to it.

    opened by dfrankow 19
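
If the "Could not find open port" error comes from a local web server failing to bind, one way to check whether the machine has a usable port at all is to ask the operating system for one. The sketch below uses only the standard library; `find_free_port` is a hypothetical helper, and how (or whether) a port can be handed to `show` depends on the installed InterpretML version, so that step is deliberately left out.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_free_port()
print(f"port {port} is currently free on localhost")
```

The port is released as soon as the socket closes, so another process could claim it before you use it; treat the result as a hint rather than a reservation.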
  • Does EBM support weighted datasets?

    Hi @interpret-ml:

    Does EBM support weighted datasets now? I saw a discussion on this topic last year but am not sure whether this feature has been added.

    Appreciate your help!

    opened by flippercy 13
  • Question on Calibration

    I'm testing the classifier algorithm on a dataset (unfortunately confidential) with a binary target (70,000 rows and about 40 predictors) and seeing that while the rank ordering is competitive with other tree based methods, the predictions seem poorly calibrated - even on the training data itself. The prediction is always lower than the actual. I am wondering if there might be a cause based on the algorithm that could be tuned or if this has been seen in development?

    The model is trained using the default settings (I have tweaked multiple parameters and not found any impact)

    from interpret.glassbox import ExplainableBoostingClassifier

    ebm = ExplainableBoostingClassifier(n_estimators=16, interactions=0, n_jobs=10)
    ebm.fit(X, y)
    

    The prediction is made on the training data.

    import numpy as np

    p = ebm.predict_proba(X)[:, 1]  # predicted probability of the positive class
    print(np.mean(p))  # THIS IS 0.023
    

    I rank the predictions into deciles (10% bins) and plot the actual target rate and the mean predicted probability for each decile. The rank order is good and the AUC is high (this is the training data, of course), but we underpredict systematically. The red horizontal line is the overall mean of the training data, which is significantly higher than the mean prediction noted above (0.1 versus 0.02).

    image

    bug 
    opened by AllardJM 13
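
The decile check described in this issue can be reproduced without any plotting library. The sketch below is one way to build the table, assuming you already have predicted probabilities and 0/1 labels; the function name and the toy data are hypothetical and only illustrate the mechanics.

```python
from statistics import mean


def calibration_by_decile(probabilities, labels, n_bins=10):
    """Return (mean predicted probability, observed positive rate) per bin."""
    ranked = sorted(zip(probabilities, labels), key=lambda pair: pair[0])
    n = len(ranked)
    table = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer rows than bins
        table.append((mean(p for p, _ in chunk), mean(y for _, y in chunk)))
    return table


# Toy data: 20 predictions that rank perfectly but sit slightly too low.
probabilities = [i / 20 for i in range(20)]
labels = [0] * 10 + [1] * 10

table = calibration_by_decile(probabilities, labels)
overall_gap = mean(labels) - mean(probabilities)  # positive means underprediction
```

A model can rank well (high AUC) and still show a nonzero `overall_gap`, which is the pattern reported above.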
  • Install fails due to Microsoft Visual C++ 14.0 dependency

    I created a new conda environment with python 3.7, pandas and jupyter.

    Then I activated the environment and ran pip install interpret per the install instructions.

    Unfortunately, it failed while building shap, seemingly because the Visual Studio build tools are not installed: error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/ Did I skip a step here?

    Are there any plans for packaging this as a conda package so that it can handle dependencies like this so that the user doesn't have to do additional installs and configuration beyond running "conda install interpret" or "pip install interpret"?

    Also, do I need to install Visual Studio to use this library, or just C++14? It's not clear from the error message what's going on here, and when I go to the build tools site it prompts me to install a ton of stuff.

    Thanks!

    Failed to build shap
    Installing collected packages: greenlet, zope.event, zope.interface, pycparser, cffi, gevent, dash-table, click, itsdangerous, Werkzeug, Flask, brotli, flask-compress, retrying, plotly, dash-renderer, dash-core-components, dash-html-components, future, dash, dash-cytoscape, urllib3, chardet, idna, requests, psutil, joblib, cycler, kiwisolver, pyparsing, matplotlib, scipy, tqdm, pillow, threadpoolctl, scikit-learn, imageio, PyWavelets, tifffile, networkx, scikit-image, lime, SALib, dill, shap, treeinterpreter, interpret-core, interpret
        Running setup.py install for shap ... error
        ERROR: Command errored out with exit status 1:
         command: 'C:\Users\charleswm\AppData\Local\Continuum\miniconda3\envs\interpret-demo\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\charleswm\\AppData\\Local\\Temp\\pip-install-osv0fkp_\\shap\\setup.py'"'"'; __file__='"'"'C:\\Users\\charleswm\\AppData\\Local\\Temp\\pip-install-osv0fkp_\\shap\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\charleswm\AppData\Local\Temp\pip-record-er9n9syn\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\charleswm\AppData\Local\Continuum\miniconda3\envs\interpret-demo\Include\shap'
             cwd: C:\Users\charleswm\AppData\Local\Temp\pip-install-osv0fkp_\shap\
        Complete output (67 lines):
        running install
        running build
        running build_py
        creating build
        creating build\lib.win-amd64-3.7
        creating build\lib.win-amd64-3.7\shap
        copying shap\common.py -> build\lib.win-amd64-3.7\shap
        copying shap\datasets.py -> build\lib.win-amd64-3.7\shap
        copying shap\__init__.py -> build\lib.win-amd64-3.7\shap
        creating build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\additive.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\bruteforce.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\explainer.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\gradient.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\kernel.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\linear.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\mimic.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\partition.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\permutation.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\pytree.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\sampling.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\tf_utils.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\tree.py -> build\lib.win-amd64-3.7\shap\explainers
        copying shap\explainers\__init__.py -> build\lib.win-amd64-3.7\shap\explainers
        creating build\lib.win-amd64-3.7\shap\explainers\other
        copying shap\explainers\other\coefficent.py -> build\lib.win-amd64-3.7\shap\explainers\other
        copying shap\explainers\other\lime.py -> build\lib.win-amd64-3.7\shap\explainers\other
        copying shap\explainers\other\maple.py -> build\lib.win-amd64-3.7\shap\explainers\other
        copying shap\explainers\other\random.py -> build\lib.win-amd64-3.7\shap\explainers\other
        copying shap\explainers\other\treegain.py -> build\lib.win-amd64-3.7\shap\explainers\other
        copying shap\explainers\other\__init__.py -> build\lib.win-amd64-3.7\shap\explainers\other
        creating build\lib.win-amd64-3.7\shap\explainers\deep
        copying shap\explainers\deep\deep_pytorch.py -> build\lib.win-amd64-3.7\shap\explainers\deep
        copying shap\explainers\deep\deep_tf.py -> build\lib.win-amd64-3.7\shap\explainers\deep
        copying shap\explainers\deep\__init__.py -> build\lib.win-amd64-3.7\shap\explainers\deep
        creating build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\bar.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\colorconv.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\colors.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\decision.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\dependence.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\embedding.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\force.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\force_matplotlib.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\image.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\monitoring.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\partial_dependence.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\summary.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\text.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\waterfall.py -> build\lib.win-amd64-3.7\shap\plots
        copying shap\plots\__init__.py -> build\lib.win-amd64-3.7\shap\plots
        creating build\lib.win-amd64-3.7\shap\benchmark
        copying shap\benchmark\experiments.py -> build\lib.win-amd64-3.7\shap\benchmark
        copying shap\benchmark\measures.py -> build\lib.win-amd64-3.7\shap\benchmark
        copying shap\benchmark\methods.py -> build\lib.win-amd64-3.7\shap\benchmark
        copying shap\benchmark\metrics.py -> build\lib.win-amd64-3.7\shap\benchmark
        copying shap\benchmark\models.py -> build\lib.win-amd64-3.7\shap\benchmark
        copying shap\benchmark\plots.py -> build\lib.win-amd64-3.7\shap\benchmark
        copying shap\benchmark\__init__.py -> build\lib.win-amd64-3.7\shap\benchmark
        creating build\lib.win-amd64-3.7\shap\plots\resources
        copying shap\plots\resources\bundle.js -> build\lib.win-amd64-3.7\shap\plots\resources
        copying shap\plots\resources\logoSmallGray.png -> build\lib.win-amd64-3.7\shap\plots\resources
        copying shap\tree_shap.h -> build\lib.win-amd64-3.7\shap
        running build_ext
        numpy.get_include() C:\Users\charleswm\AppData\Local\Continuum\miniconda3\envs\interpret-demo\lib\site-packages\numpy\core\include
        building 'shap._cext' extension
        error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/
        ----------------------------------------
    ERROR: Command errored out with exit status 1: 'C:\Users\charleswm\AppData\Local\Continuum\miniconda3\envs\interpret-demo\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\charleswm\\AppData\\Local\\Temp\\pip-install-osv0fkp_\\shap\\setup.py'"'"'; __file__='"'"'C:\\Users\\charleswm\\AppData\\Local\\Temp\\pip-install-osv0fkp_\\shap\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\charleswm\AppData\Local\Temp\pip-record-er9n9syn\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\charleswm\AppData\Local\Continuum\miniconda3\envs\interpret-demo\Include\shap' Check the logs for full command output.
    
    opened by charleswm 11
  • Mean Absolute Score : Overall Importance

    Hi, how is the mean absolute score for each feature (the feature importance) calculated in EBM global explanations? Is there a particular metric or mathematical formula behind those scores? If so, please let me know. Please see the plot below. How are those values calculated, and why does Glucose get a higher score than the other features?

    newplot

    opened by Tejamr 10
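
One reading of "mean absolute score" is the average absolute contribution of each term over the training samples. Treat that as an assumption about the plot rather than a statement of the library's exact formula (sample weights, for instance, are ignored here). Under that assumption, the ranking can be sketched as follows; the contribution values and the BMI and Age features are hypothetical.

```python
from statistics import mean

# Hypothetical per-sample contributions (log-odds) of each term.
contributions = {
    "Glucose": [1.2, -0.9, 0.7, -1.0],
    "BMI": [0.3, -0.2, 0.1, -0.4],
    "Age": [0.05, -0.05, 0.1, -0.1],
}


def mean_absolute_scores(per_term):
    """Average the absolute contribution of each term across samples."""
    return {name: mean(abs(v) for v in values) for name, values in per_term.items()}


importances = mean_absolute_scores(contributions)
ranking = sorted(importances, key=importances.get, reverse=True)
```

A feature ranks high when its contributions are large in magnitude for many samples, regardless of sign. These are not probabilities; for a classifier they would be on the log-odds scale.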
  • show(ebm_local) doesn't show anything

    I am trying to get local explanations. I am getting this user-warning and it is not showing anything on the Jupyter lab cell.

    ebm = ExplainableBoostingClassifier()
    ebm.fit(Xtrain, Ytrain)
    ebm_local = ebm.explain_local(Xtest, Ytest)
    show(ebm_local)

    --/miniconda3/envs/py3/lib/python3.7/site-packages/interpret/visual/udash.py:5: UserWarning: The dash_html_components package is deprecated. Please replace import dash_html_components as html with from dash import html
      import dash_html_components as html
    --/miniconda3/envs/py3/lib/python3.7/site-packages/interpret/visual/udash.py:6: UserWarning: The dash_core_components package is deprecated. Please replace import dash_core_components as dcc with from dash import dcc
      import dash_core_components as dcc
    --/miniconda3/envs/py3/lib/python3.7/site-packages/interpret/visual/udash.py:7: UserWarning: The dash_table package is deprecated. Please replace import dash_table with from dash import dash_table

    Also, if you're using any of the table format helpers (e.g. Group), replace from dash_table.Format import Group with from dash.dash_table.Format import Group
      import dash_table as dt

    opened by yliyanage 9
  • IndexError using ebm.predict() method

    I'm using Ubuntu 18.04.2 LTS and JupyterLab v1.04.

    When I try to use the .predict() method of an ExplainableBoostingRegressor() object, I run into an IndexError:


    IndexError                                Traceback (most recent call last)
    in
    ----> 1 ebm.predict(X_test)

    ~/.local/lib/python3.7/site-packages/interpret/glassbox/ebm/ebm.py in predict(self, X)
       1577
       1578         return EBMUtils.regressor_predict(
    -> 1579             X, self.feature_groups_, self.additive_terms_, self.intercept_
       1580         )

    ~/.local/lib/python3.7/site-packages/interpret/glassbox/ebm/utils.py in regressor_predict(X, feature_groups, model, intercept)
        196     @staticmethod
        197     def regressor_predict(X, feature_groups, model, intercept):
    --> 198         scores = EBMUtils.decision_function(X, feature_groups, model, intercept)
        199         return scores
        200

    ~/.local/lib/python3.7/site-packages/interpret/glassbox/ebm/utils.py in decision_function(X, feature_groups, model, intercept)
        162             X, feature_groups, model
        163         )
    --> 164         for _, _, scores in scores_gen:
        165             score_vector += scores
        166

    ~/.local/lib/python3.7/site-packages/interpret/glassbox/ebm/utils.py in scores_by_feature_group(X, feature_groups, model)
        142             feature_idxs = feature_group
        143             sliced_X = X[feature_idxs, :]
    --> 144             scores = tensor[tuple(sliced_X)]
        145
        146             yield set_idx, feature_group, scores

    IndexError: index -2 is out of bounds for axis 0 with size 1

    What's weird is that I only get this error when I predict over X_test. The columns are identical.

    opened by abhipannala 9
  • how to graph without spinning a local web-server

    Is this API available yet? We can't seem to plot it locally with plotly offline.

    Hi @dfrankow, thanks for the issue! We're just about to introduce a few new API changes that should make this easier in our next release. One, we'll let you specify a port in the show method, so that you can pick your own port that you know is open. Second, we'll introduce a new function that doesn't spin up the local web-server, and directly uses plotly to visualize it. For now, here are a few notes:

    visualize() does return a plotly object, and you can use plotly.offline so that you don't need an api key. And yes, if you pass in a key to visualize() , you can get a specific graph back out!

    If you run this code at the top of your notebook:

    from plotly.offline import init_notebook_mode, iplot
    init_notebook_mode(connected=True)
    

    you can then use "iplot(plotly_figure)" in your notebook to get a direct plotly graph. We'll have a nicer API around this soon!

    Originally posted by @interpret-ml in https://github.com/microsoft/interpret/issues/1#issuecomment-490291898

    opened by jyipks 9
  • "Error loading dependencies" in show() method

    Hi guys, thanks for this great contribution.

    Each time I use the method 'show()' I got the following error: "Error loading dependencies"

    Examples:

    ebm_global = ebm.explain_global(name='EBM')
    show(ebm_global)  # "Error loading dependencies"

    I'm using Python 3.7

    Thanks in advance,

    Nelson

    opened by nfsrules 9
  • Example of an interaction term from GA2M (a.k.a. EBM)?

    GAMs are non-linear terms per feature, combined in a linear way. GA2Ms also include pairwise interactions, chosen in a heuristically efficient way with FAST.

    If I use an explainable boosting classifier/regressor, how can I tell whether it considered interaction terms?

    Can you document an example where interaction terms are used, including graphs?

    Thanks.

    question 
    opened by dfrankow 9
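
As a rough illustration of the question above, a GA2M-style model can be thought of as a list of terms, where each term names the features it uses: one feature for a main effect, two for a pairwise interaction. The sketch below is plain Python with a hypothetical data layout, not the library's internal representation. It shows how to list the interaction terms and how a pairwise term adds to the score through a two-dimensional lookup table.

```python
from bisect import bisect_right

# Hypothetical fitted model: every term lists its feature indices,
# the bin edges for each of those features, and a score table.
TERMS = [
    {"features": (0,), "edges": [[5.0]], "scores": [-0.2, 0.3]},
    {"features": (1,), "edges": [[1.0]], "scores": [0.1, -0.1]},
    {"features": (0, 1), "edges": [[5.0], [1.0]],
     "scores": [[0.0, 0.4], [-0.4, 0.0]]},
]


def interaction_terms(terms):
    """Terms that involve exactly two features are pairwise interactions."""
    return [term["features"] for term in terms if len(term["features"]) == 2]


def score(terms, x, intercept=0.0):
    """Add up every term's contribution for one sample x."""
    total = intercept
    for term in terms:
        cell = term["scores"]
        # Walk one axis of the score table per feature in the term.
        for feature, edges in zip(term["features"], term["edges"]):
            cell = cell[bisect_right(edges, x[feature])]
        total += cell
    return total


pairs = interaction_terms(TERMS)
total = score(TERMS, x=[7.0, 0.5])
```

If the list of pairs is empty, the model is a plain GAM with main effects only.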
  • Conda installation & dash package imports

    Hello, I installed InterpretML through the conda command: conda install -c interpretml interpret

    I then ran the code from https://github.com/deepfindr/xai-series. The show function raises this error:

    ModuleNotFoundError: No module named 'dash_html_components'

    It points to line 5 in udash.py:

    --> import dash_html_components as html

    Clearly, this is a matter of dash 2.0.

    I found this already corrected in the master branch: update dash package imports to resolve dash warnings -- Latest commit [c10b5f7] on Nov 16. (https://github.com/interpretml/interpret/commit/c10b5f76a6f54ed66750fa054956a178536ce2d4)

    But, I do not understand why my installation (through conda command) did not get the corrected version of the package. (I am clearly new to all this ... ) Is this going to be fixed soon ?

    Thank you very much for your help.

    opened by AhmedKhassiba 0
  • Getting ValueError: axes don't match array while using merge_ebms on multi labels ExplainableBoostingClassifier

    I'm testing merging multi-label ExplainableBoostingClassifier models with the merge_ebms method (interpret==0.3.0 from PyPI, on Python 3.7 (my stage env) and 3.8 (a clean conda env created from scratch only for version 0.3.0)).

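    from interpret.glassbox import ExplainableBoostingClassifier
    from interpret.glassbox.ebm.utils import merge_ebms  # module path as it appears in the traceback

    models = []
    for i in range(10):
        _model_clf = ExplainableBoostingClassifier(random_state=i, n_jobs=-1, interactions=[])
        _model_clf.fit(X_train[features], X_train[target])
        models.append(_model_clf)
    model_clf = merge_ebms(models)  # this call raises the ValueError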
    

    and I'm getting the following error:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    /tmp/ipykernel_40170/2566246721.py in <module>
    ----> 1 model_clf = merge_ebms(models)
    
    ~/.conda/envs/image-processing/lib/python3.7/site-packages/interpret/glassbox/ebm/utils.py in merge_ebms(models)
        940                         old_mapping[model_idx],
        941                         model.bagged_scores_[term_idx][bag_idx],
    --> 942                         model.bin_weights_[term_idx] # we use these to weigh distribution of scores for mulple bins
        943                     )
        944                     new_bagged_scores.append(harmonized_bagged_scores)
    
    ~/.conda/envs/image-processing/lib/python3.7/site-packages/interpret/glassbox/ebm/utils.py in _harmonize_tensor(new_feature_idxs, new_bounds, new_bins, old_feature_idxs, old_bounds, old_bins, old_mapping, old_tensor, bin_evidence_weight)
        387     old_tensor = old_tensor.transpose(tuple(axes))
        388     if bin_evidence_weight is not None:
    --> 389         bin_evidence_weight = bin_evidence_weight.transpose(tuple(axes))
        390 
        391     mapping = []
    
    ValueError: axes don't match array
    

    I've tried setting different parameters (binning, inner and outer bags, explicitly specifying the feature names and types) and the result is always the same: ValueError: axes don't match array. I'm getting this error only with the multilabel problem; merging binary classifiers works perfectly fine. Any help here?
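The failing line in the traceback calls `bin_evidence_weight.transpose(tuple(axes))`. NumPy raises exactly this `ValueError` whenever the number of axes passed to `transpose` differs from the array's number of dimensions. The sketch below reproduces the message with made-up shapes. It assumes, without confirmation from the interpret source, that the multiclass score tensor carries an extra class dimension that the bin-weight tensor lacks.

```python
import numpy as np

# Made-up shapes for illustration only: 4 bins, 3 classes.
scores = np.zeros((4, 3))   # per-bin scores with a trailing class axis
bin_weights = np.ones(4)    # per-bin weights, no class axis

axes = (0, 1)                       # axis order valid for the 2-D score tensor
scores_t = scores.transpose(axes)   # fine: two axes for a 2-D array

try:
    bin_weights.transpose(axes)     # two axes for a 1-D array
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

If this reading is right, it would also explain why binary classifiers merge without error: their score tensors and bin-weight tensors would have the same number of dimensions.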

    opened by lbalec 2
  • Applying log transform to skewed outcome variable

    Applying log transform to skewed outcome variable

    My numerical outcome variable is highly skewed, so applying a log transform seems to bring it closer to a normal distribution and improves the EBM training accuracy. It also seems preferable so that the model training isn't biased by the outliers. However, I am having a hard time reconciling the scores and intercept from the model trained on the log-transformed outcome, as they differ greatly (even after exponentiating to un-transform them) from a model trained on the original (i.e. not log-transformed) outcome.

    1. Any intuition on whether EBMs should benefit from transforming highly skewed data?
    2. Any suggestions on how to un-transform the intercept and scores so they can be interpretable in the original outcome space?
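One way to read the second question: a model fit on log(y) is additive in log space, so exponentiating turns the intercept into a baseline and each term score into a multiplicative factor, not an additive amount in the original units. That alone makes the exponentiated numbers look very different from those of a model fit on the raw outcome. The sketch below uses made-up values and plain arithmetic; it is not an interpret API.

```python
import math

# Made-up log-space intercept and per-term scores for one sample.
intercept = 2.0
term_scores = {"age": 0.30, "income": -0.10, "tenure": 0.05}

# Additive prediction in log space, then back to the original scale.
log_prediction = intercept + sum(term_scores.values())
prediction = math.exp(log_prediction)

# Back-transformed view: a baseline multiplied by one factor per term.
baseline = math.exp(intercept)
factors = {name: math.exp(score) for name, score in term_scores.items()}

reconstructed = baseline
for factor in factors.values():
    reconstructed *= factor

print(f"baseline={baseline:.3f}")
for name, factor in factors.items():
    print(f"{name}: x{factor:.3f}")
print(f"prediction={prediction:.3f}")
```

A factor above 1 raises the prediction and a factor below 1 lowers it. Note also that exponentiating a mean of logs gives a geometric mean, which is never larger than the arithmetic mean, so the two models are not estimating the same quantity.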
    opened by bhu-strata 0
  • How to show probabilities instead of logits in local explanations

    How to show probabilities instead of logits in local explanations

    @paulbkoch I'm looking for a way to go about showing probabilities in local explanations.

    At the moment, this is what we have:

    1. feature contributions in terms of logits (positive and negative)
    2. an intercept

    To obtain the predicted score for the positive class (in a binary setting), interpretml does the following:

    1. start a sum using the intercept value as the initial value
    2. iterate over each feature (including interactions):
       2.1) get the corresponding logit by indexing the model.term_scores_ attribute (which has to do with bins and such, I need to dig deeper in this regard)
       2.2) add that logit to the current sum value (which started from the intercept)
    3. append a 0 to the final sum of logits to obtain an array with two columns and N rows, where N is the number of samples being predicted: [0, sum_of_logits]
    4. apply softmax to get predicted "probabilities" for negative and positive class
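The four steps above can be sketched for a single sample as follows. The bin lookup into `term_scores_` is replaced by a plain list of already-looked-up scores, and the function name is hypothetical, so this only illustrates the arithmetic described in the post.

```python
import math


def predict_proba_binary(intercept, looked_up_scores):
    """Return [P(negative), P(positive)] for one sample."""
    # Steps 1 and 2: start from the intercept and add each term's logit.
    logit = intercept
    for score in looked_up_scores:
        logit += score

    # Step 3: pair the summed logit with a 0 for the negative class.
    logits = [0.0, logit]

    # Step 4: softmax, shifted by the maximum for numerical stability.
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


proba = predict_proba_binary(-0.5, [1.2, -0.3, 0.8])
print(proba)
```

The softmax over `[0, s]` is the same as applying the logistic function to `s`, which is why the positive-class probability depends only on the summed logit.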

    What I want to know is the following: is there a way, starting from the predicted probability of the positive class (let's say 0.85), to know the individual contribution of each feature in terms of probability? E.g. feat1: +0.12, feat2: -0.13, feat3: +0.4, ... so that the individual contributions (ideally "partial" probabilities) add up to the predicted probability of the positive class (i.e., 0.85)? Additional question: how should we treat the intercept in such a case?


    I tried passing each contribution in terms of logits to the logistic function but obviously it makes little sense
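Passing each logit contribution through the logistic function separately cannot work, because the logistic function is not additive: sigmoid(a) + sigmoid(b) is generally not sigmoid(a + b). What does decompose exactly is the odds, where the intercept gives baseline odds and each contribution becomes a multiplicative odds factor. The sketch below uses made-up values to show both points. It illustrates the arithmetic only and is neither an interpret feature nor an answer from the maintainers.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


# Made-up intercept and per-feature logit contributions for one sample.
intercept = 0.4
contributions = {"feat1": 0.9, "feat2": -0.6, "feat3": 1.0}

total_logit = intercept + sum(contributions.values())
probability = sigmoid(total_logit)

# Naive attempt: per-feature sigmoids do not sum to the probability.
naive_sum = sum(sigmoid(c) for c in contributions.values())

# Exact decomposition in odds space: baseline odds times one factor each.
odds = math.exp(intercept)
for c in contributions.values():
    odds *= math.exp(c)
probability_from_odds = odds / (1.0 + odds)

print(f"probability={probability:.3f}")
print(f"naive_sum={naive_sum:.3f}")
print(f"probability_from_odds={probability_from_odds:.3f}")
```

Any additive split of the probability itself would have to pick an ordering or an averaging rule for the features, since the change in probability caused by one logit depends on where the sum already stands.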

    Thank you very much. You did an outstanding work with this library!

    opened by francescopisu 0
  • Removed dash-table update conda recipe

    Removed dash-table update conda recipe

    Fixes https://github.com/interpretml/interpret/issues/383. Might fix Outdated Conda (https://github.com/interpretml/interpret/issues/368). Without the ability to test, it should work for Linux.

    opened by rxm7706 0
Releases(v0.3.0)
  • v0.3.0(Nov 19, 2022)

    [v0.3.0] - 2022-11-16

    Added

    • Full Complexity EBMs with higher order interactions supported: GA3M, GA4M, GA5M, etc. 3-way and higher-level interactions lose exact global interpretability, but retain exact local explanations. Higher-level interactions need to be explicitly specified; there is no automatic FAST detection yet.
    • Mac m1 support
    • support for ordinals
    • merge_ebms now supports merging models with interactions, including higher-level interactions
    • added classic composition option during Differentially Private binning
    • support for different kinds of feature importances (avg_weight, min_max)
    • exposed interaction detection API (FAST algorithm)
    • API to calculate and show the importances of groups of features and terms.

    Changed

    • memory efficiency: About 20x less memory is required during fitting
    • prediction speed improvements: about 50x faster for pandas CategoricalDtype, with varying levels of improvement for other data types
    • handling of the differential privacy DPOther bin, and non-DP unknowns has been unified by having a universal unknown bin
    • bin weights have been changed from per-feature to per-term and are now multi-dimensional
    • improved scikit-learn compliance: We now conform to the scikit-learn 1.0 feature names API by using self.feature_names_in_ for the X column names and self.n_features_in_. We use the matching self.feature_types_in_ for feature types, and self.term_names_ for the additive term names.

    Fixed

    • merge_ebms now distributes bin weights proportionally according to volume when splitting bins
    • DP-EBMs now use sample weights instead of bin counts, which preserves privacy budget
    • improved scikit-learn compliance: The following init attributes are no longer overwritten during calls to fit: self.interactions, self.feature_names, self.feature_types
    • better handling of floating point overflows when calculating gain and validation metrics
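As a rough illustration of what distributing bin weights "proportionally according to volume" can mean, the sketch below splits a single one-dimensional bin at new cut points and gives each piece a share of the weight equal to its share of the bin's width. The function name and the restriction to one dimension are assumptions made for illustration. The actual merge_ebms implementation works on multi-dimensional tensors and may differ.

```python
def split_bin_weight(low, high, weight, cuts):
    """Split the bin [low, high) at the given cuts, sharing weight by width.

    Hypothetical 1-D illustration; not the interpret implementation.
    """
    # Keep only the cuts that fall strictly inside the bin.
    inner = sorted(c for c in cuts if low < c < high)
    edges = [low, *inner, high]
    width = high - low
    return [
        (left, right, weight * (right - left) / width)
        for left, right in zip(edges, edges[1:])
    ]


pieces = split_bin_weight(0.0, 10.0, 100.0, cuts=[4.0])
for left, right, share in pieces:
    print(f"[{left}, {right}): {share}")
```

Here a bin covering 0 to 10 with weight 100 is cut at 4, so the two pieces receive weights 40 and 60 and the total is preserved.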

    Breaking Changes

    • EBMUtils.merge_models function has been renamed to merge_ebms
    • renamed binning type 'quantile_humanized' to 'rounded_quantile'
    • feature type 'categorical' has been specialized into separate 'nominal' and 'ordinal' types
    • EBM models have changed public attributes:
      • feature_groups_ -> term_features_
        global_selector -> n_samples_, unique_val_counts_, and zero_val_counts_
        domain_size_ -> min_target_, max_target_
        additive_terms_ -> term_scores_
        bagged_models_ -> BaseCoreEBM has been deprecated and the only useful attribute has been moved 
                          into the main EBM class (bagged_models_.model_ -> bagged_scores_)
        feature_importances_ -> has been changed into the function term_importances(), which can now also 
                                generate different types of importances
        preprocessor_ & pair_preprocessor_ -> attributes have been moved into the main EBM model class (details below)
        
    • EBMPreprocessor attributes have been moved to the main EBM model class
      • col_names_ -> feature_names_in_
        col_types_ -> feature_types_in_
        col_min_ -> feature_bounds_
        col_max_ -> feature_bounds_
        col_bin_edges_ -> bins_
        col_mapping_ -> bins_
        hist_counts_ -> histogram_counts_
        hist_edges_ -> histogram_edges_
        col_bin_counts_ -> bin_weights_ (and is now a per-term tensor)
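Code that has to read models from both before and after 0.3.0 can look attributes up through the rename table above. The sketch below covers only the two one-to-one renames on the main model class. `get_renamed_attr` and the two stand-in classes are hypothetical and are not part of interpret.

```python
# New name -> old name, taken from the one-to-one renames listed above.
RENAMED_ATTRS = {
    "term_features_": "feature_groups_",
    "term_scores_": "additive_terms_",
}


def get_renamed_attr(model, new_name):
    """Return model.<new_name>, falling back to the pre-0.3.0 name."""
    if hasattr(model, new_name):
        return getattr(model, new_name)
    old_name = RENAMED_ATTRS.get(new_name)
    if old_name is not None and hasattr(model, old_name):
        return getattr(model, old_name)
    raise AttributeError(new_name)


class FakeOldModel:
    """Stand-in for a pre-0.3.0 model (illustration only)."""
    feature_groups_ = [[0], [1], [0, 1]]


class FakeNewModel:
    """Stand-in for a 0.3.0 model (illustration only)."""
    term_features_ = [(0,), (1,), (0, 1)]


print(get_renamed_attr(FakeOldModel(), "term_features_"))
print(get_renamed_attr(FakeNewModel(), "term_features_"))
```

Renames that also changed shape or location, such as `col_bin_counts_` becoming the per-term tensor `bin_weights_`, are left out on purpose because a name lookup alone would not make the values compatible.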
        
  • v0.2.7(Sep 23, 2021)

    v0.2.7 - 2021-09-23

    Added

    • Synapse cloud support for visualizations.

    Fixed

    • All category names in bar charts now visible for inline rendering (used in cloud environments).
    • Joblib preference was previously being overridden. This has been reverted to honor the user's preference.
    • Bug in categorical binning for differentially privatized EBMs has been fixed.
  • v0.2.6(Jul 20, 2021)

    v0.2.6 - 2021-07-20

    Added

    • Differential-privacy augmented EBMs now available as interpret.privacy.{DPExplainableBoostingClassifier,DPExplainableBoostingRegressor}.
    • Packages interpret and interpret-core now distributed via docker.

    Changed

    • Sampling code including stratification within EBM now performed in native code.

    Fixed

    • Compute provider with joblib can now support multiple engines with serialization support.
    • Labels are now all shown for inline rendering of horizontal bar charts.
    • JS dependencies updated.
  • v0.2.5(Jun 22, 2021)

    v0.2.5 - 2021-06-21

    Added

    • Sample weight support added for EBM.
    • Joint predict_and_contrib added to EBM where both predictions and feature contributions are generated in one call.
    • EBM predictions are now substantially faster for categorical features.
    • Preliminary documentation for all of interpret now public at https://interpret.ml/docs.
    • Decision trees now work in cloud environments (InlineRenderer support).
    • Packages interpret and interpret-core now distributed via sdist.

    Fixed

    • EBM uniform binning bug fixed where empty bins can raise exceptions.
    • Users can no longer include duplicate interaction terms for EBM.
    • CSS adjusted for inline rendering such that it does not interfere with its hosting environment.
    • JS dependencies updated.

    Experimental

    • Ability to merge multiple EBM models into one. Found in interpret.glassbox.ebm.utils.
  • v0.2.4(Jan 20, 2021)

  • v0.2.3(Jan 14, 2021)

    v0.2.3 - 2021-01-13

    Major upgrades to EBM in this release. Automatic interaction detection is now included by default. This will increase accuracy substantially in most cases. Numerous optimizations to support this, especially around binary classification. Expect similar or slightly slower training times due to interactions.

    Fixed

    • Automated interaction detection uses low-resolution binning for both FAST and pairwise training.

    Changed

    • The EBM default for outer_bags has been reduced from 16 to 8.
    • EBM now includes interactions by default: the default has changed from interactions=0 to interactions=10.
    • Algorithm treeinterpreter is now unstable due to upstream dependencies.
    • Automated interaction detection now operates in one pass instead of two.
    • Numeric approximations used in boosting (i.e. approx log / exp).
    • Some arguments have been re-ordered for EBM initialization.
  • v0.2.2(Oct 20, 2020)

    v0.2.2 - 2020-10-19

    Fixed

    • Fixed bug on predicting unknown categories with EBM.
    • Fixed bug on max value being placed in its own bin for EBM pre-processing.
    • Numerous native fixes and optimizations.

    Added

    • Added max_interaction_bins as argument to EBM learners for different sized bins on interactions, separate to mains.
    • New binning method 'quantile_humanized' for EBM.

    Changed

    • Interactions in EBM now use their own pre-processing, separate to mains.
    • Python 3.5 no longer supported.
    • Switched from Python to native code for binning.
    • Switched from Python to native code for PRNG in EBM.
  • v0.2.1(Aug 7, 2020)

    v0.2.1 - 2020-08-07

    Added

    • Python 3.8 support.

    Changed

    • Dash based visualizations will always default to listen port 7001 on first attempt; if the first attempt fails it will try a random port between 7002-7999.
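The port rule described above can be sketched as follows. The function names, the injectable `is_free` check, and the retry count are assumptions made for illustration. The changelog only says that port 7001 is tried first and that a random port between 7002 and 7999 is tried if that fails.

```python
import random
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_port(is_free=port_is_free, first=7001, low=7002, high=7999, tries=50):
    """Prefer `first`; otherwise try random ports in [low, high]."""
    if is_free(first):
        return first
    for _ in range(tries):  # retry count is an assumption
        candidate = random.randint(low, high)
        if is_free(candidate):
            return candidate
    raise RuntimeError("no free port found")


# Simulate port 7001 being taken without touching real sockets.
fallback = pick_port(is_free=lambda port: port != 7001)
print(fallback)
```

Passing the availability check as a parameter keeps the selection logic testable without binding real sockets.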

    Experimental (WIP)

    • Further cloud environment support.
    • Improvements for multiclass EBM global graphs.
  • v0.2.0(Jul 21, 2020)

    v0.2.0 - 2020-07-21

    Breaking Changes

    • With warning, EBM classifier adapts internal validation size when there are too few instances relative to number of unique classes. This ensures that there is at least one instance of each class in the validation set.
    • Cloud Jupyter environments now use a CDN to fix major rendering bugs and performance.
      • CDN currently used is https://unpkg.com
      • If you want to specify your own CDN, add the following as the top cell
        from interpret import set_visualize_provider
        from interpret.provider import InlineProvider
        from interpret.version import __version__
        
        # Change this to your custom CDN.
        JS_URL = "https://unpkg.com/@interpretml/interpret-inline@{}/dist/interpret-inline.js".format(__version__)
        set_visualize_provider(InlineProvider(js_url=JS_URL))
        
    • EBM has changed initialization parameters:
      • schema -> DROPPED
        n_estimators -> outer_bags
        holdout_size -> validation_size
        scoring -> DROPPED
        holdout_split -> DROPPED
        main_attr -> mains
        data_n_episodes -> max_rounds
        early_stopping_run_length -> early_stopping_rounds
        feature_step_n_inner_bags -> inner_bags
        training_step_epsiodes -> DROPPED
        max_tree_splits -> max_leaves
        min_cases_for_splits -> DROPPED
        min_samples_leaf -> ADDED (Minimum number of samples that are in a leaf)
        binning_strategy -> binning
        max_n_bins -> max_bins
        
    • EBM has changed public attributes:
      • n_estimators -> outer_bags
        holdout_size -> validation_size
        scoring -> DROPPED
        holdout_split -> DROPPED
        main_attr -> mains
        data_n_episodes -> max_rounds
        early_stopping_run_length -> early_stopping_rounds
        feature_step_n_inner_bags -> inner_bags
        training_step_epsiodes -> DROPPED
        max_tree_splits -> max_leaves
        min_cases_for_splits -> DROPPED
        min_samples_leaf -> ADDED (Minimum number of samples that are in a leaf)
        binning_strategy -> binning
        max_n_bins -> max_bins
        
        attribute_sets_ -> feature_groups_
        attribute_set_models_ -> additive_terms_ (Pairs are now transposed)
        model_errors_ -> term_standard_deviations_
        
        main_episode_idxs_ -> breakpoint_iteration_[0]
        inter_episode_idxs_ -> breakpoint_iteration_[1]
        
        mean_abs_scores_ -> feature_importances_
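For scripts written against the pre-0.2.0 constructor, the initialization-parameter renames listed above can be applied mechanically. The sketch below translates a dictionary of old keyword arguments. The helper name is hypothetical, only names are translated, and whether any values also need adjusting is not covered by the changelog.

```python
# Old name -> new name, copied from the initialization-parameter list above.
RENAMED = {
    "n_estimators": "outer_bags",
    "holdout_size": "validation_size",
    "main_attr": "mains",
    "data_n_episodes": "max_rounds",
    "early_stopping_run_length": "early_stopping_rounds",
    "feature_step_n_inner_bags": "inner_bags",
    "max_tree_splits": "max_leaves",
    "binning_strategy": "binning",
    "max_n_bins": "max_bins",
}

# Parameters marked DROPPED, spelled exactly as in the changelog.
DROPPED = {
    "schema",
    "scoring",
    "holdout_split",
    "training_step_epsiodes",
    "min_cases_for_splits",
}


def translate_ebm_kwargs(old_kwargs):
    """Return (new_kwargs, dropped_names) for a dict of pre-0.2.0 kwargs."""
    new_kwargs = {}
    dropped = []
    for key, value in old_kwargs.items():
        if key in DROPPED:
            dropped.append(key)
        else:
            # Unknown keys pass through unchanged.
            new_kwargs[RENAMED.get(key, key)] = value
    return new_kwargs, dropped


new_kwargs, dropped = translate_ebm_kwargs(
    {"n_estimators": 16, "holdout_size": 0.15, "scoring": None, "random_state": 1}
)
print(new_kwargs)
print(dropped)
```

Returning the dropped names separately lets the caller decide whether to warn or fail when an old script relied on a removed option.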
        

    Fixed

    • Internal fixes and refactor for native code.
    • Updated dependencies for JavaScript layer.
    • Fixed rendering bugs and performance issues around cloud Jupyter notebooks.
    • Logging flushing bug fixed.
    • Labels that are shaped as nx1 matrices now automatically transform to vectors for training.

    Experimental (WIP)

    • Added support for AzureML notebook VM.
    • Added local explanation visualizations for multiclass EBM.
  • v0.1.22(Apr 27, 2020)

    v0.1.22 - 2020-04-27

    Upcoming Breaking Changes

    • EBM initialization arguments and public attributes will change in a near-future release.
    • There is a chance Explanation API will change in a near-future release.

    Added

    • Docstrings for top-level API including for glassbox and blackbox.

    Fixed

    • Minor fix for linear models where class wasn't propagating for logistic.

    Experimental

    • For research use, exposed optional_temp_params for EBM's Python / native layer.
  • v0.1.21(Apr 3, 2020)

    v0.1.21 - 2020-04-02

    Added

    • Module "glassbox.ebm.research" now has purification utilities.
    • EBM now exposes "max_n_bins" argument for its preprocessing stage.

    Fixed

    • Fix intercept not showing for local EBM binary classification.
    • Stack trace information exposed for extension system failures.
    • Better handling of sparse to dense conversions for all explainers.
    • Internal fixes for native code.
    • Better NaN / infinity handling within EBM.

    Changed

    • Binning strategy for EBM now defaulted to 'quantile' instead of 'uniform'.
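To see why the choice between 'quantile' and 'uniform' matters, the sketch below bins a small skewed sample both ways and counts the values per bin. It is a generic illustration built on the standard library, not interpret's binning code, and the helper names are hypothetical.

```python
import statistics


def uniform_edges(values, n_bins):
    """Interior cut points that split the value range into equal widths."""
    low, high = min(values), max(values)
    step = (high - low) / n_bins
    return [low + step * i for i in range(1, n_bins)]


def quantile_edges(values, n_bins):
    """Interior cut points that aim for equal counts per bin."""
    return statistics.quantiles(values, n=n_bins)


def bin_counts(values, edges):
    """Count values per bin; a value's bin is the number of edges below it."""
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[sum(value > edge for edge in edges)] += 1
    return counts


# Skewed made-up sample: eleven small values and one large outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1000]

uniform = bin_counts(values, uniform_edges(values, 4))
quantile = bin_counts(values, quantile_edges(values, 4))
print("uniform :", uniform)
print("quantile:", quantile)
```

With this sample, uniform binning puts eleven of the twelve values in the first bin, while quantile binning puts three values in each of the four bins.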
  • v0.1.20(Dec 12, 2019)

    v0.1.20 - 2019-12-11

    Fixed

    • Major bug fix around EBM interactions. If you use interactions, please upgrade immediately. Part of the pairwise selection was not operating as expected and has been corrected.
    • Fix for handling dataframes when no named columns are specified.
    • Various EBM fixes around corner-case datasets.

    Changed

    • All top-level methods relating to show's backing web server now use visualize provider directly. In theory this shouldn't affect top-level API usage, but please raise an issue in the event of failure.
    • Memory footprint for EBM heavily reduced, by around 2-3 times.
  • v0.1.19(Oct 26, 2019)

    v0.1.19 - 2019-10-25

    Changed

    • Changed classification metric exposed between C++/python for EBMs to log loss for future public use.
    • Warnings provided when extensions error on load.

    Fixed

    • Package joblib added to interpret-core as "required" extra.
    • Compiler fixes for Oracle Developer Studio.
    • Removed undefined behavior in EBM for several unlikely scenarios.
  • v0.1.18(Oct 10, 2019)

    v0.1.18 - 2019-10-09

    Added

    • Added "main_attr" argument to EBM models. Can now select a subset of features to train main effects on.
    • Added AzureML notebook VM detection for visualizations (switches to inline).

    Fixed

    • Missing values now correctly throw exceptions on explainers.
    • Major visualization fix for pairwise interaction heatmaps from EBM.
    • Corrected inline visualization height in Notebooks.

    Changed

    • Various internal C++ fixes.
    • New error messages around EBM if the model isn't fitted before calling explain_*.
  • v0.1.17(Sep 24, 2019)

    v0.1.17 - 2019-09-24

    Fixed

    • Morris sensitivity now works for both predict and predict_proba on scikit models.
    • Removal of debug print statements around blackbox explainers.

    Changed

    • Dependencies for numpy/scipy/pandas/scikit-learn relaxed to (1.11.1, 0.18.1, 0.19.2, 0.18.1) respectively.
    • Visualization provider defaults set by environment detection (cloud and local use different providers).

    Experimental (WIP)

    • Inline visualizations for show(explanation). This allows cloud notebooks, and offline notebook support. Dashboard integration still ongoing.
  • v0.1.16(Sep 18, 2019)

    v0.1.16 - 2019-09-17

    Added

    • Visualize and compute platforms are now refactored and use an extension system. Details on use upcoming in later release.
    • Package interpret is now a meta-package using interpret-core. This enables partial installs via interpret-core for production environments.

    Fixed

    • Updated SHAP dependency to require dill.

    Experimental (WIP)

    • Greybox introduced (explainers that only work for specific types of models). Starting with SHAP tree and TreeInterpreter.
    • Extension system now works across all explainer types and providers.
  • v0.1.15(Aug 26, 2019)

  • v0.1.14(Aug 21, 2019)

    v0.1.14 - 2019-08-20

    Fixed

    • Fixed occasional browser crash relating to density graphs.
    • Fixed decision trees not displaying in Jupyter notebooks.

    Changed

    • Dash components no longer pinned. Upgraded to latest.
    • Upgrade from dash-table-experiment to dash-table.
    • Numerous renames within native code.

    Experimental (WIP)

    • Explanation data methods for PDP, EBM enabled for mli interop.
  • v0.1.13(Aug 15, 2019)

    v0.1.13 - 2019-08-14

    Added

    • EBM has new parameter 'binning_strategy'. Can now support quantile based binning.
    • EBM now gracefully handles many edge cases around data.
    • Selenium support added for visual smoke tests.

    Fixed

    • Method debug_mode now works in wider environments including WSL.
    • Linear models in last version returned the same graphs no matter the selection. Fixed.

    Changed

    • Testing requirements now fully separate from default user install.
    • Internal EBM class has many renames associated with native codebase. Attribute has been changed to Feature.
    • Native codebase has many renames. Diff commits from v0.1.12 to v0.1.13 for more details.
    • Dependency gevent lightened to take 1.3.6 or greater. This affects cloud/older Python environments.
    • Installation for interpret package should now be 'pip install -U interpret'.
    • Removal of skope-rules as a required dependency. User now has to install it manually.
    • EBM parameter 'cont_n_bins' renamed to 'max_n_bins'.

    Experimental (WIP)

    • Extensions validation method is hardened to ensure blackbox specs are closely met.
    • Explanation methods data and visual require a key of the form ('mli', key) to access mli interop.
  • v0.1.12(Aug 10, 2019)

  • v0.1.10(Jul 16, 2019)

Owner
InterpretML
If a tree fell in your random forest, would anyone notice?