CrayLabs and user-contributed examples of using SmartSim for various simulation and machine learning applications.

Overview

SmartSim Example Zoo

This repository contains CrayLabs and user-contributed examples of using SmartSim for various simulation and machine learning applications.

The CrayLabs team will attempt to keep examples updated with current releases, but all user-contributed examples should specify the release they were created with.

Contributing Examples

We welcome any and all contributions to this repository. The CrayLabs team will do their best to review them in a timely manner. If you contribute an example, please include a description and references to the code, relevant previous implementations, and any open source code the work is based on, for the benefit of anyone who would like to try out your example.

Examples by Paper

The following examples are implemented based on existing research papers. Each example lists the paper, previous works, and links to the implementation (stored either within this repository or in a separate repository).

1. DeepDriveMD

  • Contributing User: CrayLabs
  • Tags: OpenMM, CVAE, online inference, unsupervised online learning, PyTorch, ensemble

This use case highlights many features of SmartSim and SmartRedis and shows how, together, they can be used to orchestrate complex workflows with coupled applications without using the filesystem to exchange information.

More specifically, this use case is based on the original DeepDriveMD work. DeepDriveMD was furthered with an asynchronous streaming version. SmartSim extends the streaming implementation through the use of the SmartSim architecture. The main difference between the SmartSim implementation and the previous implementations is that neither ML models nor Molecular Dynamics (MD) intermediate results are stored on the file system. Additionally, the inference portion of the workflow takes place inside the database instead of in a separate task launched on the system.

2. TensorFlowFoam

  • Contributing User: CrayLabs
  • Tags: Online Inference, TensorFlow, OpenFOAM, supervised learning

This example shows how to use TensorFlow inside of OpenFOAM simulations using SmartSim.

More specifically, this SmartSim use case adapts the TensorFlowFoam work, which used a deep neural network to predict steady-state turbulent viscosities of the Spalart-Allmaras (SA) model. This use case highlights that a machine learning model can be evaluated using SmartSim from within a simulation with minimal external library code. For the OpenFOAM use case herein, only four SmartRedis client API calls are needed: one each to initialize a client connection, send tensor data for evaluation, execute the TensorFlow model, and retrieve the model inference result.
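The four-call pattern described above can be sketched in Python. This is only an illustration under stated assumptions: `InMemoryClient` is a hypothetical stand-in that keeps tensors in a dictionary so the sketch runs without a database, its method names are chosen to resemble the SmartRedis Python client, and `toy_viscosity_model`, the key names, and the input values are invented for the example. The actual OpenFOAM integration is not reproduced here.

```python
class InMemoryClient:
    """Hypothetical stand-in for a database client; stores everything in dicts."""

    def __init__(self):
        self._tensors = {}
        self._models = {}

    def set_model(self, name, fn):
        # In a real workflow the driver script would upload a trained model.
        self._models[name] = fn

    def put_tensor(self, name, data):
        self._tensors[name] = list(data)

    def run_model(self, name, inputs, outputs):
        # Look up the input tensors, evaluate the model, store each output.
        args = [self._tensors[key] for key in inputs]
        results = self._models[name](*args)
        for key, value in zip(outputs, results):
            self._tensors[key] = value

    def get_tensor(self, name):
        return self._tensors[name]


def toy_viscosity_model(features):
    # Placeholder for a trained network: doubles every input value.
    return ([2.0 * x for x in features],)


client = InMemoryClient()                             # 1. initialize the client
client.set_model("nut_model", toy_viscosity_model)    # setup step, not one of the four calls
client.put_tensor("cell_features", [0.5, 1.0, 1.5])   # 2. send tensor data
client.run_model("nut_model",
                 inputs=["cell_features"],
                 outputs=["nut_pred"])                # 3. execute the model
prediction = client.get_tensor("nut_pred")            # 4. retrieve the result
print(prediction)
```

The model is registered once up front, so the simulation side only needs the four numbered calls on each evaluation.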

In general, this example provides a useful driver script for those looking to run OpenFOAM with SmartSim.

3. ML-EKE

  • Contributing User: CrayLabs
  • Tags: Online inference, MOM6, climate modeling, ensemble, parameterization replacement

This example was a collaboration between CrayLabs (HPE), NCAR, and the University of Victoria. Using SmartSim, this example shows how to run an ensemble of simulations, all using the SmartSim architecture to replace a parameterization (MEKE) within each global ocean simulation (MOM6).

Paper Abstract:

We demonstrate the first climate-scale, numerical ocean simulations improved through distributed, online inference of Deep Neural Networks (DNN) using SmartSim. SmartSim is a library dedicated to enabling online analysis and Machine Learning (ML) for traditional HPC simulations. In this paper, we detail the SmartSim architecture and provide benchmarks including online inference with a shared ML model on heterogeneous HPC systems. We demonstrate the capability of SmartSim by using it to run a 12-member ensemble of global-scale, high-resolution ocean simulations, each spanning 19 compute nodes, all communicating with the same ML architecture at each simulation timestep. In total, 970 billion inferences are collectively served by running the ensemble for a total of 120 simulated years. Finally, we show our solution is stable over the full duration of the model integrations, and that the inclusion of machine learning has minimal impact on the simulation runtimes.

Since this is original research done by CrayLabs, there is no previous implementation.

Examples by Simulation Model

LAMMPS

SmartSim examples with LAMMPS, a Molecular Dynamics simulation model.

1. Online Analysis of Atom Position

  • Contributing User: CrayLabs
  • Tags: Molecular Dynamics, online analysis, visualizations

LAMMPS has dump styles, which are custom I/O methods that can be implemented by users. CrayLabs implemented a SMARTSIM dump style which uses the SmartRedis clients to stream data to an Orchestrator database created by SmartSim.

Once the data is in the database, any application with a SmartRedis client can consume that data. For this example, we have a simple Python script that uses iPyVolume to plot the data every 100 iterations.
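The consume-every-100-iterations pattern can be sketched as follows. This is a minimal illustration, not the example's actual script: a plain dictionary stands in for the database, the key scheme `atoms_<step>` and both function names are hypothetical, and a centroid calculation replaces the iPyVolume plot so the sketch runs with the standard library alone.

```python
def simulate_dump(store, n_steps, n_atoms=3):
    """Hypothetical producer: write atom positions for every step."""
    for step in range(n_steps + 1):
        store[f"atoms_{step}"] = [
            (float(step + i), 0.0, 0.0) for i in range(n_atoms)
        ]


def consume(store, n_steps, every=100):
    """Hypothetical consumer: process only every `every`-th step."""
    frames = {}
    for step in range(0, n_steps + 1, every):
        key = f"atoms_{step}"
        if key not in store:
            continue  # data for this step has not been written
        positions = store[key]
        n = len(positions)
        # Stand-in for plotting: reduce the frame to its centroid.
        frames[step] = tuple(
            sum(p[axis] for p in positions) / n for axis in range(3)
        )
    return frames


store = {}
simulate_dump(store, 300)
frames = consume(store, 300)
print(sorted(frames))
```

Because the consumer reads by key, it does not need to know anything about the simulation beyond the naming scheme the producer uses.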

Examples by System

High Performance Computing systems are a bit like snowflakes: they are all different. Since each one has its own quirks, examples for specific, popular systems can benefit new users.

National Center for Atmospheric Research (NCAR)

1. Cheyenne

  • Contributing User: CrayLabs
  • Implementation (this repo)
  • WLM: PBSPro
  • System: SGI 8600
  • CPU: Intel
  • GPU: None

2. Casper

  • Contributing User: @jedwards4b
  • Implementation (this repo)
  • WLM: PBSPro
  • GPU: Nvidia
  • CPU: Intel
  • SmartSim Version: 0.3.2
  • SmartRedis Version: 0.2.0

Oak Ridge National Lab

1. Summit

  • Contributing User: CrayLabs
  • Implementation (this repo)
  • System:
  • OS: Red Hat Enterprise Linux (RHEL)
  • CPU: Power9
  • GPU: Nvidia V100