Machine Learning toolbox for Humans

Overview

Reproducible Experiment Platform (REP)

Join the chat at https://gitter.im/yandex/rep

REP is an IPython-based environment for conducting data-driven research in a consistent and reproducible way.

Main features:

  • unified Python wrappers for different ML libraries (the wrappers follow an extended scikit-learn interface)
    • Sklearn
    • TMVA
    • XGBoost
    • uBoost
    • Theanets
    • Pybrain
    • Neurolab
    • MatrixNet service (available to CERN)
  • parallel training of classifiers on cluster
  • classification/regression reports with plots
  • interactive plots supported
  • smart grid-search algorithms with parallel execution
  • research versioning using git
  • pluggable quality metrics for classification
  • meta-algorithm design (aka 'rep-lego')

REP does not try to replace scikit-learn; it extends it and provides a better user experience.

Howto examples

To get started, look at the notebooks in /howto/

Notebooks can be viewed (not executed) online at nbviewer.
There are basic introductory notebooks (about Python and IPython) and more advanced ones (about REP itself).

The example code is written in Python 2, but the library is compatible with both Python 2 and Python 3.

Installation with Docker

We provide a Docker image with REP and all its dependencies. This is the recommended way to install REP, especially if you are not experienced with Python.

Installation with bare hands

However, if you want to install REP and all of its dependencies on your machine yourself, follow this manual: installing manually and running manually.

Links

License

Apache 2.0; the library is open source.

Minimal examples

REP wrappers are sklearn compatible:

from rep.estimators import XGBoostClassifier, SklearnClassifier, TheanetsClassifier
clf = XGBoostClassifier(n_estimators=300, eta=0.1).fit(trainX, trainY)
probabilities = clf.predict_proba(testX)

A favorite trick of Kagglers is to run bagging over complex algorithms. This is how it is done in REP:

from sklearn.ensemble import BaggingClassifier
clf = BaggingClassifier(base_estimator=XGBoostClassifier(), n_estimators=10)
# wrapping sklearn to REP wrapper
clf = SklearnClassifier(clf)

Another useful trick is to use folding instead of splitting data into train/test. This is especially useful when you're using some kind of complex stacking:

from rep.metaml import FoldingClassifier
clf = FoldingClassifier(TheanetsClassifier(), n_folds=3)
probabilities = clf.fit(X, y).predict_proba(X)

In the example above, all data are split into 3 folds, and each fold is predicted by a classifier that was trained on the other 2 folds.
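
The folding idea can be sketched without REP at all. The snippet below is only an illustration, not REP's implementation: `MeanLabelModel` and `out_of_fold_predictions` are hypothetical names, and the toy model simply predicts the mean training label so that the fold bookkeeping is easy to follow.

```python
import random


class MeanLabelModel(object):
    """Toy stand-in for a classifier: predicts the mean training label."""

    def fit(self, X, y):
        self.p_ = sum(y) / float(len(y))
        return self

    def predict_proba(self, X):
        return [self.p_ for _ in X]


def out_of_fold_predictions(X, y, n_folds=3, seed=0):
    """Predict every sample with a model that never saw it during training."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # fold_of[i] is the fold that sample i belongs to
    fold_of = [0] * len(X)
    for position, index in enumerate(indices):
        fold_of[index] = position % n_folds

    predictions = [None] * len(X)
    for fold in range(n_folds):
        train = [i for i in indices if fold_of[i] != fold]
        test = [i for i in indices if fold_of[i] == fold]
        model = MeanLabelModel().fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, model.predict_proba([X[i] for i in test])):
            predictions[i] = p
    return predictions, fold_of


X = [[float(i)] for i in range(9)]
y = [0, 1, 0, 1, 0, 1, 0, 1, 1]
predictions, fold_of = out_of_fold_predictions(X, y, n_folds=3)
```

Every entry of `predictions` comes from a model fitted on the other two folds, which is what makes the predictions usable as inputs for a later stacking step.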

REP classifiers also provide reports:

report = clf.test_on(testX, testY)
report.roc().plot() # plot ROC curve
from rep.report.metrics import RocAuc
# learning curves are useful when training GBDT!
report.learning_curve(RocAuc(), steps=10)  

You can read about other REP tools (such as smart distributed grid search, folding and factory) in the documentation and howto examples.

Comments
  • Problem with TMVAClassifier

    After installing REP from here, I ran into the following problem when fitting TMVAClassifier. An IOError is raised after these lines:

    baseline = TMVAClassifier(method='kBDT', features=variables, BoostType='Grad', NTrees=40, Shrinkage=0.01, MaxDepth=7, UseNvars=6, nCuts=-1)
    baseline.fit(train, train['signal'])

    The stack trace is:

    IOError                                   Traceback (most recent call last)
    in ()
          3 UseNvars=6, nCuts=-1)
          4 # baseline = TMVAClassifier(method='kBDT', NTrees=50, Shrinkage=0.05, features=variables)
    ----> 5 baseline.fit(train, train['signal'])

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in fit(self, X, y, sample_weight)
        288 self.factory_options = '{}:AnalysisType=Multiclass'.format(self.factory_options)
        289
    --> 290 return self._fit(X, y, sample_weight=sample_weight)
        291
        292 def predict_proba(self, X):

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in _fit(self, X, y, sample_weight, model_type)
        104 add_info = _AdditionalInformation(directory, model_type=model_type)
        105 try:
    --> 106 self._run_tmva_training(add_info, X, y, sample_weight)
        107 finally:
        108 self._remove_tmp_directory(directory)

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in _run_tmva_training(self, info, X, y, sample_weight)
        134 xml_filename = os.path.join(info.directory, 'weights',
        135 '{job}{name}.weights.xml'.format(job=info.tmva_job, name=self._method_name))
    --> 136 with open(xml_filename, 'r') as xml_file:
        137 self.formula_xml = xml_file.read()
        138

    IOError: [Errno 2] No such file or directory: '/home/artem/Documents/IPython Notebooks/CERN + Yandex/Original Baseline/flavours-of-physics-start/tmp0Fhtqe/weights/TMVAEstimation_REP_Estimator.weights.xml'

    As far as I can tell, the weights/ folder was created outside the temporary folder rather than inside it, which causes the error above.

    ROOT 5.34, Python 2.7, GCC 4.8, Ubuntu 14.04 LTS (x64). All requirements for REP were installed successfully (from requirements.txt).

    bug 
    opened by HolyBayes 9
  • FoldingClassifier: KFold vs StratifiedKFold

    Hey,

    first of all a compliment: I really like your repo and I build a lot of code on it, it's so useful! About the FoldingClassifier: there was already a request to implement StratifiedKFold in addition to the "normal" KFold. I would be very glad to see this, but I'd even go a step further: why don't you completely replace the KFold with a StratifiedKFold?

    I think, from an ML point of view, it is always better (or, at worst, equally good) to use a stratified one. Using a normal KFold only introduces different class balances, which (usually) result in "shifted" probabilities among the different classifiers, whereas a stratified one does not and therefore makes each trained classifier's predictions "comparable".

    Or in other words: I cannot think of any case where you want to have a non-stratified KFolding instead of a stratified one.

    What do you think?

    Best, Mayou
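
The class-balance argument above can be illustrated with a small pure-Python sketch. It does not use scikit-learn or REP; `plain_folds` and `stratified_folds` are hypothetical helpers, and the plain variant assumes unshuffled, label-sorted data, which is the worst case for a non-stratified split.

```python
from collections import defaultdict


def plain_folds(labels, n_folds):
    """Contiguous folds that ignore the labels (order-based, no shuffling)."""
    n = len(labels)
    folds = []
    start = 0
    for fold in range(n_folds):
        size = n // n_folds + (1 if fold < n % n_folds else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def stratified_folds(labels, n_folds):
    """Deal the samples of each class round-robin, so folds share the class balance."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        for position, index in enumerate(by_class[label]):
            folds[position % n_folds].append(index)
    return folds


def positive_fraction(labels, fold):
    """Share of positive labels inside one fold."""
    return sum(labels[i] for i in fold) / float(len(fold))


labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # 3 positives, 9 negatives
plain = [positive_fraction(labels, f) for f in plain_folds(labels, 3)]
stratified = [positive_fraction(labels, f) for f in stratified_folds(labels, 3)]
# plain      -> [0.75, 0.0, 0.0]
# stratified -> [0.25, 0.25, 0.25]
```

With the plain split, two of the three folds contain no positives at all, so the classifiers trained on the complementary folds see quite different class balances.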

    enhancement 
    opened by jonas-eschle 5
  • Support for a build hosted on (ana)conda

    I see that some of the continuous integration scripts support conda builds, although not all the dependencies are installed this way. Is there any hope of seeing a build on conda soon for Linux x86_64 systems?

    The reason I ask is that I have accounts on numerous batch systems, on none of which I have root access or any way to use Docker. They're all Linux-based though, as is the norm. As far as I know, this is the case for many researchers.

    It'd be great to see a way to quickly install REP on these systems. This would:

    • Cut down on the time needed to introduce people to REP
    • Hook into the environment management and environment logging provided by conda
    • Easily and quickly deploy REP on supercomputing nodes while requiring little of their filesystem

    This is especially useful for ensuring the ROOT install is sane. I know there has already been a lot of work in the direction of making REP easy to access and install. Perhaps this could be a healthy addition?

    question 
    opened by ewengillies 5
  • Add ability to initialise FoldingBase objects with external parser

    If you would like to run REP with, e.g., a StratifiedKFold instead of a normal KFold, this will be possible after this pull request. If no external folding object is passed, the default KFold algorithm is used.
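
A rough sketch of the "optional external splitter with a default fallback" pattern described above. This is not the code from the pull request: `FoldAssigner`, `default_kfold` and `stratified_splitter` are hypothetical names, and the splitter is assumed to be any callable taking `(labels, n_folds)` and returning a fold id per sample.

```python
def default_kfold(n_samples, n_folds):
    """Fallback splitter: sample i goes to fold i % n_folds."""
    return [i % n_folds for i in range(n_samples)]


class FoldAssigner(object):
    """Hypothetical helper: assigns fold ids, optionally via a user-supplied splitter."""

    def __init__(self, n_folds=3, splitter=None):
        self.n_folds = n_folds
        # splitter is any callable (labels, n_folds) -> list of fold ids
        self.splitter = splitter

    def assign(self, labels):
        if self.splitter is None:
            return default_kfold(len(labels), self.n_folds)
        return self.splitter(labels, self.n_folds)


def stratified_splitter(labels, n_folds):
    """Example external splitter: round-robin within each class."""
    counters = {}
    fold_ids = []
    for label in labels:
        position = counters.get(label, 0)
        fold_ids.append(position % n_folds)
        counters[label] = position + 1
    return fold_ids


labels = [0, 0, 1, 0, 0, 1, 0, 0, 1]
default_ids = FoldAssigner(n_folds=3).assign(labels)
stratified_ids = FoldAssigner(n_folds=3, splitter=stratified_splitter).assign(labels)
```

In this toy data the default assignment happens to put all three positives into the same fold, while the external splitter spreads them over the three folds.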

    opened by mschlupp 5
  • test_xgboost file is not running on windows 10

    File "c:\Sander\my_code\rep-master\tests\test_xgboost.py", line 4, in <module>
      from rep.estimators import XGBoostClassifier, XGBoostRegressor
    ImportError: cannot import name XGBoostClassifier

    This happens when the rep installation succeeds but the xgboost installation fails:

    Microsoft Windows Version 10.0.10586 2015 Microsoft Corporation. All rights reserved.

    c:\Sander>pip install rep --no-dependencies
    Collecting rep
      Downloading rep-0.6.5.tar.gz (72kB)
        100% |################################| 81kB 511kB/s
    Building wheels for collected packages: rep
      Running setup.py bdist_wheel for rep ... done
      Stored in directory: C:\Users\Sander\AppData\Local\pip\Cache\wheels\db\ee\06\ac6e3f3ec208edaee29654f0b55ffaf2719a51de799c396b91
    Successfully built rep
    Installing collected packages: rep
    Successfully installed rep-0.6.5
    You are using pip version 8.1.0, however version 8.1.2 is available. You should consider upgrading via the 'python -m pip install --upgrade pip' command.

    c:\Sander>pip install xgboost==0.4a30
    Collecting xgboost==0.4a30
      Downloading xgboost-0.4a30.tar.gz (753kB)
        100% |################################| 757kB 553kB/s
    No files/directories in c:\users\sander\appdata\local\temp\pip-build-exobfm\xgboost\pip-egg-info (from PKG-INFO)
    You are using pip version 8.1.0, however version 8.1.2 is available. You should consider upgrading via the 'python -m pip install --upgrade pip' command.

    c:\Sander>
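
Assuming the ImportError comes from the failed xgboost installation, as the report suggests, a quick check of whether an optional dependency is importable can make the cause visible before the wrapper import fails. The helper names below are hypothetical and the sketch uses only the standard library.

```python
import importlib


def is_importable(module_name):
    """Return True if the module can be imported in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def describe_optional_dependency(module_name):
    """Produce a readable hint instead of a bare ImportError further down the line."""
    if is_importable(module_name):
        return "%s is available" % module_name
    return "%s is missing: wrappers that depend on it cannot be imported" % module_name


# "json" is in the standard library; the second name is a made-up module
status = describe_optional_dependency("json")
missing = describe_optional_dependency("surely_not_installed_pkg")
```

Running `describe_optional_dependency("xgboost")` in the affected environment would then show whether the library itself is the missing piece.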

    opened by Sandy4321 5
  • Manual Install on Windows

    Hi! Is there a way to install REP manually in a Windows environment? When installing the dependencies, I get an error from gnureadline:

    Error: this module is not meant to work on Windows (try pyreadline instead)

    Is there a way to use pyreadline for Windows users?

    wontfix 
    opened by funkindy 4
  • Mac OS installation with Docker

    It seems the latest Docker release deprecates boot2docker: http://docs.docker.com/installation/mac/ "This release of Docker deprecates the Boot2Docker command line in favor of Docker Machine"

    How do I install REP with the latest Docker release?

    opened by pupadupa 4
  • test failed

    After python setup.py install, I run cd tests ; nosetests . It runs for a long time and ends with errors:

    ..Info in <TCanvas::Print>: png file /tmp/tmpBg1dar.png has been created
    Error in <TFile::TFile>: file toy_datasets/toyMC_bck_mass.root does not exist
    E..E.
    ======================================================================
    ERROR: tests.z_test_notebook.test_notebooks_in_folder('/root/rep/howto/00-intro-ROOT.ipynb',)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File "/root/rep/rep/test/test_notebooks.py", line 43, in check_single_notebook
        raise RuntimeError(description)
    RuntimeError: Cell failed: 'T.Draw("min_DOCA")
    c1'
    
     Traceback:
    ---------------------------------------------------------------------------
    ReferenceError                            Traceback (most recent call last)
    <ipython-input-5-aa6c7320180d> in <module>()
    ----> 1 T.Draw("min_DOCA")
          2 c1
    
    ReferenceError: attempt to access a null-pointer
    

    What am I missing?

    opened by anaderi 3
  • Updating numpy in 0.6.6 docker breaks matplotlib

    % docker run -ti yandex/rep:0.6.6 bash -lc 'pip install -U numpy; python -c "from matplotlib import pyplot as plt; plt.figure()"'
    Activate: ROOT has been sourced. Environment settings are ready.
    ROOTSYS=/root/miniconda/envs/rep_py2
    Deactivate:Unsetting ROOT environment variables..
    Activate: ROOT has been sourced. Environment settings are ready.
    ROOTSYS=/root/miniconda/envs/rep_py2
    Collecting numpy
      Downloading numpy-1.11.2-cp27-cp27mu-manylinux1_x86_64.whl (15.3MB)
        100% |################################| 15.3MB 46kB/s
    Installing collected packages: numpy
      Found existing installation: numpy 1.10.4
        DEPRECATION: Uninstalling a distutils installed project (numpy) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
        Uninstalling numpy-1.10.4:
          Successfully uninstalled numpy-1.10.4
    Successfully installed numpy-1.11.2
    /root/miniconda/envs/rep_py2/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
      warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
    bash: line 1:   222 Illegal instruction     python -c "from matplotlib import pyplot as plt; plt.figure()"
    
    opened by sashabaranov 2
  • do we need to measure fit/predict time without %time?

    It is useful if the Jupyter frontend disconnects during fit/predict execution.

    The following snippet might be handy for such cases:

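    from __future__ import print_function

    import datetime
    import time


    class Stopwatch(object):
        """Context manager that records the wall-clock time spent inside the with-block."""

        def __enter__(self):
            self.t0 = datetime.datetime.now()
            return self

        def __exit__(self, type, value, traceback):
            self.t1 = datetime.datetime.now()

        def __repr__(self):
            return "delta: (%s)" % (self.t1 - self.t0)


    # time.sleep stands in for the real fit/predict calls
    with Stopwatch() as sfit:
        time.sleep(1)
    with Stopwatch() as spredict:
        time.sleep(1)

    print("fit:", sfit, "spredict:", spredict)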
    
    opened by anaderi 2
  • New REP docker version running in /var/lib/docker/volumes/ instead of ~/rep_container

    Hi.

    I had old REP docker version in ~/rep_container which started with run.sh script on 8080 port. I updated REP and it broke: sudo $REPDIR/run.sh worked, but I couldn't connect to localhost:8080 (connection refused). I've decided to update docker and REP according to new instructions: https://github.com/yandex/rep/wiki/Install-REP-with-Docker-(Linux).

    1. I installed Docker, according to instructions.
    2. netstat -anl | grep 8888 gave empty result
    3. git checkout https://github.com/yandex/rep.git didn't work (pathspec did not match any file(s) known to git), so I used git clone instead.
    4. First run of sudo make run was successful and installed container.
    5. I rebooted and second sudo make run gave the following

    docker run -ti --rm -p 8888:8888 --name rep yandex/rep:0.6.4
    Error response from daemon: Conflict. The name "rep" is already in use by container 3af0884aeedb. You have to remove (or rename) that container to be able to reuse that name.
    make: *** [run] Error 1

    6. I ran sudo docker images

    REPOSITORY    TAG      IMAGE ID       CREATED        VIRTUAL SIZE
    yandex/rep    0.6.4    18a48bc5a3b6   8 hours ago    2.635 GB
    anaderi/rep   latest   63c3db2850b6   4 months ago   1.649 GB
                           91c95931e552   7 months ago   910 B

    7. I tried sudo docker start rep. It worked and I opened REP on localhost:8888. But its working folder changed. Now it is /var/lib/docker/volumes/dbcc7ff99538007d9c6b244fb6b8f03bdcfd564f6076b36d79fa3330d2041107/_data/. This is quite inconvenient, because it requires superuser rights to access and is not conveniently located at all.

    Question: Is this the new behaviour, or did I do something wrong? If the latter, how do I fix it and run the REP container in a convenient folder?

    opened by lodurality 2
  • Bump notebook from 4.2.1 to 6.4.12

    Bumps notebook from 4.2.1 to 6.4.12.

    dependencies 
    opened by dependabot[bot] 0
  • Update lib

    Issue:

    ModuleNotFoundError                       Traceback (most recent call last)
    in
          5 from sklearn.ensemble import HistGradientBoostingClassifier
          6 from rep.report.metrics import RocAuc
    ----> 7 from rep.metaml import GridOptimalSearchCV, FoldingScorer, RandomParameterOptimizer
          8 from rep.estimators import SklearnClassifier

    ~/.local/lib/python3.8/site-packages/rep/metaml/__init__.py in
          2
          3 from .factory import ClassifiersFactory, RegressorsFactory
    ----> 4 from .folding import FoldingClassifier, FoldingRegressor
          5 from .gridsearch import GridOptimalSearchCV
          6 from .stacking import FeatureSplitter

    ~/.local/lib/python3.8/site-packages/rep/metaml/folding.py in
         11
         12 from sklearn import clone
    ---> 13 from sklearn.cross_validation import KFold
         14 from sklearn.utils import check_random_state
         15 from . import utils

    ModuleNotFoundError: No module named 'sklearn.cross_validation'

    Correction suggested based on https://stackoverflow.com/questions/30667525/importerror-no-module-named-sklearn-cross-validation
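
One common way to cope with a module that moved between library versions is to try the new location first and fall back to the old one. The sketch below is an assumption about how such a shim could look, not the change proposed in the pull request; `import_first` is a hypothetical helper, and the demonstration uses standard-library names so that it runs without scikit-learn.

```python
import importlib


def import_first(*module_names):
    """Import and return the first module from module_names that is available."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules could be imported: %s" % ", ".join(module_names))


# Intended use (requires scikit-learn, so it is left commented out here):
# cv = import_first("sklearn.model_selection", "sklearn.cross_validation")
# KFold = cv.KFold

# Demonstration with standard-library names; the first one is a made-up missing module.
module = import_first("module_that_does_not_exist_xyz", "json")
```

Note that the import path is not the only difference: in `sklearn.model_selection` the `KFold` constructor takes `n_splits` and the data is passed to `.split()`, whereas the old `sklearn.cross_validation.KFold` took the number of samples and `n_folds`, so the call sites need adapting as well.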

    opened by RobsonRocha 1
  • Bump requests from 2.9.1 to 2.20.0

    Bumps requests from 2.9.1 to 2.20.0.

    Changelog

    Sourced from requests's changelog.

    2.20.0 (2018-10-18)

    Bugfixes

    • Content-Type header parsing is now case-insensitive (e.g. charset=utf8 v Charset=utf8).
    • Fixed exception leak where certain redirect urls would raise uncaught urllib3 exceptions.
    • Requests removes Authorization header from requests redirected from https to http on the same hostname. (CVE-2018-18074)
    • should_bypass_proxies now handles URIs without hostnames (e.g. files).

    Dependencies

    • Requests now supports urllib3 v1.24.

    Deprecations

    • Requests has officially stopped support for Python 2.6.

    2.19.1 (2018-06-14)

    Bugfixes

    • Fixed issue where status_codes.py's init function failed trying to append to a __doc__ value of None.

    2.19.0 (2018-06-12)

    Improvements

    • Warn user about possible slowdown when using cryptography version < 1.3.4
    • Check for invalid host in proxy URL, before forwarding request to adapter.
    • Fragments are now properly maintained across redirects. (RFC7231 7.1.2)
    • Removed use of cgi module to expedite library load time.
    • Added support for SHA-256 and SHA-512 digest auth algorithms.
    • Minor performance improvement to Request.content.
    • Migrate to using collections.abc for 3.7 compatibility.

    Bugfixes

    • Parsing empty Link headers with parse_header_links() no longer return one bogus entry.
    ... (truncated)
    Commits
    • bd84045 v2.20.0
    • 7fd9267 remove final remnants from 2.6
    • 6ae8a21 Add myself to AUTHORS
    • 89ab030 Use comprehensions whenever possible
    • 2c6a842 Merge pull request #4827 from webmaven/patch-1
    • 30be889 CVE URLs update: www sub-subdomain no longer valid
    • a6cd380 Merge pull request #4765 from requests/encapsulate_urllib3_exc
    • bbdbcc8 wrap url parsing exceptions from urllib3's PoolManager
    • ff0c325 Merge pull request #4805 from jdufresne/https
    • b0ad249 Prefer https:// for URLs throughout project
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • Changes to TMVA API in new ROOT versions break TMVAClassifier

    Hi all,

    first of all, I wanted to thank and compliment the developers for this brilliant library. I finally had the chance to start playing with it today, but I was stopped in my tracks when trying to use a TMVAClassifier:

    AssertionError: ERROR: TMVA process is incorrect finished 
     LOG: None 
     Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/ludo/miniconda3/envs/pyroot/lib/python2.7/site-packages/rep/estimators/_tmvaFactory.py", line 86, in main
        tmva_process(classifier, info, data, labels, sample_weight)
      File "/home/ludo/miniconda3/envs/pyroot/lib/python2.7/site-packages/rep/estimators/_tmvaFactory.py", line 40, in tmva_process
        factory.AddVariable(var)
    AttributeError: 'Factory' object has no attribute 'AddVariable'
    

    My ROOT/TMVA versions are:

    You are running ROOT Version: 6.08/00, Nov 4, 2016
    TMVA Version 4.2.1, Feb 5, 2015
    

    Searching the web for this error message led me to this post on the ROOT forum: https://root-forum.cern.ch/t/25090, where the cause of the problem is identified as a breaking change in the TMVA API:

    In recent ROOT versions (6.06 or 6.08, don't remember exactly), the TMVA interface has changed. You need to create a TMVA::DataLoader and call AddVariable on the dataloader object.

    As I understand, this is related to what was mentioned by @gandreassi in a comment to #104. Any idea on how complicated it would be to adapt tmva_process to the new interface?
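
One possible shape for such an adaptation is to register variables on whichever object exposes `AddVariable`. This is only a sketch under stated assumptions, not a patch for `tmva_process`: `add_variables` is a hypothetical helper, the stand-in classes exist solely so the example runs without ROOT, and `pt` and `eta` are made-up variable names.

```python
def add_variables(factory, variables, make_dataloader):
    """Register variables on whichever object exposes AddVariable.

    make_dataloader is a zero-argument callable; with ROOT installed it could be
    something like `lambda: ROOT.TMVA.DataLoader("dataset")` (an assumption here).
    """
    if hasattr(factory, "AddVariable"):
        target = factory            # older interface: variables go on the factory
    else:
        target = make_dataloader()  # newer interface: variables go on a data loader
    for name in variables:
        target.AddVariable(name)
    return target


# Stand-ins so the sketch runs without ROOT; they only record the calls.
class OldStyleFactory(object):
    def __init__(self):
        self.variables = []

    def AddVariable(self, name):
        self.variables.append(name)


class NewStyleFactory(object):
    pass


class FakeDataLoader(object):
    def __init__(self):
        self.variables = []

    def AddVariable(self, name):
        self.variables.append(name)


old_target = add_variables(OldStyleFactory(), ["pt", "eta"], FakeDataLoader)
new_target = add_variables(NewStyleFactory(), ["pt", "eta"], FakeDataLoader)
```

A real port would also have to pass the data loader to the later booking and training calls, which this sketch does not cover.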

    opened by fndari 1
Releases(0.6.6)
  • 0.6.6(Aug 9, 2016)

    • python2 and python3 dockers
    • updated libraries
    • added CacheClassifier
    • minimized size of docker image, simplified building process
    • some fixes for ML libraries
    • some documentation updates
    • deleted plot.ly
    • solved theanets reproducibility
  • 0.6.5(Feb 3, 2016)

    Fixes:

    • TMVA process correct termination
    • TMVA fix for Mac OS X El Capitan (problems with dynamic library paths)
    • fix travis (show not passed tests, create docker on dockerhub)
    • fix wget in notebooks
    • fix errors calculation in efficiencies (for flatness property)
    • added Makefile
    • fix normalization in the multidimensional metric
  • 0.6.4(Nov 21, 2015)

    • Add continuous integration
    • Python 3 support
    • Conda installation in docker and travis
    • Kitematic-friendly docker
    • Update all libraries versions
    • added Folding Regressor, added feature importances for folding
    • added minimization to gridsearch, added random gridsearch from distributions
    • added folding scorer for regressor to gridsearch
    • faster tests
    • updated notebooks
    • Fixes:
      • tmva termination
      • documentation for grid search
      • Gridsearch bugs with metrics (metric fit)
      • learning curve with mask for folding
  • 0.6.3(Jul 30, 2015)

  • 0.6.2(Jul 6, 2015)

    • Support of neural networks in common interface:

      • theanets
      • neurolab
      • pybrain

      Now all the REP stuff is available for classifiers and regressors from these libraries:

      • usage inside sklearn pipeline
      • grid_search for hyper parameter optimization
      • reports, parallel training on cluster
    • New lovely documentation, check it out!

    • Fixes in metaclassifiers connected with usage of expressions-as-features

    • Rewritten FeatureSplitter

    • Switched to sklearn 0.16

    • New method train_test_split_group - splitting into train and test by the value of special column. Samples with same values are either both in train or both in test.

    • Update howto/notebooks with new open physical datasets
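
The group-aware splitting mentioned in the list above can be sketched in plain Python. This is not REP's `train_test_split_group`; `split_by_group` is a hypothetical helper, `event_id` is a made-up grouping column, and here `test_fraction` is applied to the number of groups rather than the number of samples.

```python
import random


def split_by_group(groups, test_fraction=0.25, seed=0):
    """Return (train_indices, test_indices) so that no group is split across both."""
    unique_groups = sorted(set(groups))
    random.Random(seed).shuffle(unique_groups)
    n_test_groups = max(1, int(round(test_fraction * len(unique_groups))))
    test_groups = set(unique_groups[:n_test_groups])
    train = [i for i, g in enumerate(groups) if g not in test_groups]
    test = [i for i, g in enumerate(groups) if g in test_groups]
    return train, test


# Hypothetical grouping column, e.g. an event id shared by several rows.
event_id = [10, 10, 11, 12, 12, 12, 13, 14, 14, 15]
train, test = split_by_group(event_id, test_fraction=0.33)
```

Rows that share an `event_id` always land on the same side of the split, which avoids leaking near-duplicate samples from train into test.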

  • 0.6.1(May 22, 2015)

    • TMVA implementation enhancement with root_numpy https://github.com/yandex/rep/issues/2.
    • Add FPRatTPR (return fpr value at fixed tpr) and TPRatFPR (return tpr value at fixed fpr) metrics, which are required, e.g. for tuning online triggering system. Moreover learning curves are available for these metrics now.
    • Many improvements in documentation.
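
To make the "TPR at fixed FPR" idea above concrete, here is a small pure-Python sketch. It is not REP's `TPRatFPR` implementation (which may, for instance, handle sample weights differently); `tpr_at_fpr` is a hypothetical function that assumes binary labels with both classes present.

```python
def tpr_at_fpr(labels, scores, target_fpr):
    """Largest true positive rate reachable while keeping FPR <= target_fpr."""
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    # Walk thresholds from the highest score down; tied scores are admitted together.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best_tpr = 0.0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / float(n_neg) <= target_fpr:
            best_tpr = tp / float(n_pos)
        else:
            break  # FPR only grows as the threshold is lowered
    return best_tpr


labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
tpr_25 = tpr_at_fpr(labels, scores, target_fpr=0.25)  # 0.75
tpr_0 = tpr_at_fpr(labels, scores, target_fpr=0.0)    # 0.5
```

The mirror-image metric (FPR at fixed TPR) follows the same walk over thresholds, stopping at the first threshold where the TPR reaches the target.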
  • 0.6.0(May 12, 2015)

    • unified classifiers wrapper for variety of implementations: TMVA, Sklearn, XGBoost, uBoost
    • parallel training of classifiers on cluster
    • classification/regression reports with plots
    • support of interactive plots (bokeh, plotly)
    • grid-search with parallelized execution on a cluster
    • git, versioning of research
    • computation of different classification metrics
    • partial support of python 3.
Owner
Yandex
Yandex open source projects and technologies