πŸ›  All-in-one web-based IDE specialized for machine learning and data science.

Overview


All-in-one web-based development environment for machine learning

Getting Started β€’ Features & Screenshots β€’ Support β€’ Report a Bug β€’ FAQ β€’ Known Issues β€’ Contribution

The ML workspace is an all-in-one web-based IDE specialized for machine learning and data science. It is simple to deploy and gets you started within minutes to productively build ML solutions on your own machines. This workspace is the ultimate tool for developers, preloaded with a variety of popular data science libraries (e.g., Tensorflow, PyTorch, Keras, Sklearn) and dev tools (e.g., Jupyter, VS Code, Tensorboard), perfectly configured, optimized, and integrated.

Highlights

  • πŸ’«   Jupyter, JupyterLab, and Visual Studio Code web-based IDEs.
  • πŸ—ƒ   Pre-installed with many popular data science libraries & tools.
  • πŸ–₯   Full Linux desktop GUI accessible via web browser.
  • πŸ”€   Seamless Git integration optimized for notebooks.
  • πŸ“ˆ   Integrated hardware & training monitoring via Tensorboard & Netdata.
  • πŸšͺ   Access from anywhere via Web, SSH, or VNC under a single port.
  • πŸŽ›   Usable as remote kernel (Jupyter) or remote machine (VS Code) via SSH.
  • 🐳   Easy to deploy on Mac, Linux, and Windows via Docker.

Getting Started

Try in PWD

Prerequisites

The workspace requires Docker to be installed on your machine ( πŸ“– Installation Guide).

Start single instance

Deploying a single workspace instance is as simple as:

docker run -p 8080:8080 mltooling/ml-workspace:0.12.1

VoilΓ , that was easy! Docker will now pull the workspace image to your machine. This may take a few minutes, depending on your internet speed. Once the workspace is started, you can access it via http://localhost:8080.

If started on another machine or with a different port, make sure to use the machine's IP/DNS and/or the exposed port.

To deploy a single instance for productive usage, we recommend applying at least the following options:

docker run -d \
    -p 8080:8080 \
    --name "ml-workspace" \
    -v "${PWD}:/workspace" \
    --env AUTHENTICATE_VIA_JUPYTER="mytoken" \
    --shm-size 512m \
    --restart always \
    mltooling/ml-workspace:0.12.1

This command runs the container in detached mode (-d), mounts your current working directory into the /workspace folder (-v), secures the workspace via a provided token (--env AUTHENTICATE_VIA_JUPYTER), provides 512MB of shared memory (--shm-size) to prevent unexpected crashes (see the known issues section), and keeps the container running even after system restarts (--restart always). You can find additional options for docker run here and workspace configuration options in the section below.

Configuration Options

The workspace provides a variety of configuration options that can be used by setting environment variables (via docker run option: --env).

Configuration options (click to expand...)
| Variable | Description | Default |
| --- | --- | --- |
| WORKSPACE_BASE_URL | The base URL under which Jupyter and all other tools will be reachable. | / |
| WORKSPACE_SSL_ENABLED | Enable or disable SSL. When set to true, a certificate (cert.crt and cert.key) must be mounted to /resources/ssl; otherwise, the container generates a self-signed certificate. | false |
| WORKSPACE_AUTH_USER | Basic auth user name. To enable basic auth, both the user and password need to be set. We recommend using AUTHENTICATE_VIA_JUPYTER for securing the workspace. | |
| WORKSPACE_AUTH_PASSWORD | Basic auth user password. To enable basic auth, both the user and password need to be set. We recommend using AUTHENTICATE_VIA_JUPYTER for securing the workspace. | |
| WORKSPACE_PORT | Configures the main container-internal port of the workspace proxy. For most scenarios, this configuration should not be changed; use the port configuration via Docker instead if the workspace should be accessible from a different port. | 8080 |
| CONFIG_BACKUP_ENABLED | Automatically back up and restore user configuration to the persisted /workspace folder, such as .ssh, .jupyter, or .gitconfig from the user's home directory. | true |
| SHARED_LINKS_ENABLED | Enable or disable the capability to share resources via external links. This is used to enable file sharing, access to workspace-internal ports, and easy command-based SSH setup. All shared links are protected via a token. However, there are certain risks since the token cannot be easily invalidated after sharing and does not expire. | true |
| INCLUDE_TUTORIALS | If true, a selection of tutorial and introduction notebooks is added to the /workspace folder at container startup, but only if the folder is empty. | true |
| MAX_NUM_THREADS | The number of threads used for computations in various common libraries (MKL, OPENBLAS, OMP, NUMBA, ...). You can also use auto to let the workspace dynamically determine the number of threads based on available CPU resources. This configuration can be overwritten by the user from within the workspace. Generally, it is good to set it at or below the number of CPUs available to the workspace. | auto |
| **Jupyter Configuration:** | | |
| SHUTDOWN_INACTIVE_KERNELS | Automatically shut down inactive kernels after a given timeout (to clean up memory or GPU resources). The value can be either a timeout in seconds or true for a default timeout of 48h. | false |
| AUTHENTICATE_VIA_JUPYTER | If true, all HTTP requests will be authenticated against the Jupyter server, meaning that the authentication method configured with Jupyter will be used for all other tools as well. This can be deactivated with false. Any other value will activate this authentication and is applied as the token via the NotebookApp.token configuration of Jupyter. | false |
| NOTEBOOK_ARGS | Add and overwrite Jupyter configuration options via command-line args. Refer to this overview for all options. | |
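
For example, the following sketch combines several of the configuration options documented above (the values are purely illustrative):

docker run -p 8080:8080 \
    --env WORKSPACE_BASE_URL="/my-workspace" \
    --env MAX_NUM_THREADS="4" \
    --env SHARED_LINKS_ENABLED="false" \
    mltooling/ml-workspace:0.12.1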

Persist Data

To persist the data, you need to mount a volume into /workspace (via docker run option: -v).

Details (click to expand...)

The default work directory within the container is /workspace, which is also the root directory of the Jupyter instance. The /workspace directory is intended to be used for all the important work artifacts. Data within other directories of the server (e.g., /root) might get lost at container restarts.
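
Alternatively, a named Docker volume can be used instead of a host path. As a sketch (ml-workspace-data is an arbitrary volume name):

docker volume create ml-workspace-data
docker run -p 8080:8080 -v "ml-workspace-data:/workspace" mltooling/ml-workspace:0.12.1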

Enable Authentication

We strongly recommend enabling authentication via one of the following two options. For both options, the user will be required to authenticate to access any of the pre-installed tools.

The authentication only applies to tools accessed through the main workspace port (default: 8080). This covers all preinstalled tools and the Access Ports feature. If you expose any other port of the container, please make sure to secure it with authentication as well!

Details (click to expand...)

Token-based Authentication via Jupyter (recommended)

Activate the token-based authentication based on the authentication implementation of Jupyter via the AUTHENTICATE_VIA_JUPYTER variable:

docker run -p 8080:8080 --env AUTHENTICATE_VIA_JUPYTER="mytoken" mltooling/ml-workspace:0.12.1

You can also use <generated> to let Jupyter generate a random token that is printed out in the container logs. A value of true will not set any token but will check every request to any tool in the workspace against the Jupyter instance to verify that the user is authenticated. This is used for tools like JupyterHub, which configures its own way of authentication.
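
For example, the following sketch starts the workspace with a generated token and reads it from the container logs (ml-workspace is the assumed container name):

docker run -d -p 8080:8080 --name "ml-workspace" --env AUTHENTICATE_VIA_JUPYTER="<generated>" mltooling/ml-workspace:0.12.1
docker logs "ml-workspace"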

Basic Authentication via Nginx

Activate the basic authentication via the WORKSPACE_AUTH_USER and WORKSPACE_AUTH_PASSWORD variable:

docker run -p 8080:8080 --env WORKSPACE_AUTH_USER="user" --env WORKSPACE_AUTH_PASSWORD="pwd" mltooling/ml-workspace:0.12.1

The basic authentication is configured via the nginx proxy and might be more performant than the other option, since with AUTHENTICATE_VIA_JUPYTER, every request to any tool in the workspace is checked against the Jupyter instance to determine whether the user (based on the request cookies) is authenticated.

Enable SSL/HTTPS

We recommend enabling SSL so that the workspace is accessible via HTTPS (encrypted communication). SSL encryption can be activated via the WORKSPACE_SSL_ENABLED variable.

Details (click to expand...)

When set to true, either the cert.crt and cert.key files must be mounted to /resources/ssl or, if the certificate files do not exist, the container generates a self-signed certificate. For example, if the /path/with/certificate/files on the local system contains a valid certificate for the host domain (cert.crt and cert.key files), it can be used from the workspace as shown below:

docker run \
    -p 8080:8080 \
    --env WORKSPACE_SSL_ENABLED="true" \
    -v /path/with/certificate/files:/resources/ssl:ro \
    mltooling/ml-workspace:0.12.1

If you want to host the workspace on a public domain, we recommend using Let's Encrypt to get a trusted certificate for your domain. When using a certificate generated via the certbot tool, the privkey.pem corresponds to the cert.key file and the fullchain.pem to the cert.crt file.
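
As a sketch, assuming certbot placed the certificate under /etc/letsencrypt/live/example.com/ (the domain and target path are placeholders), the files could be prepared and mounted like this:

mkdir -p /path/with/certificate/files
cp /etc/letsencrypt/live/example.com/fullchain.pem /path/with/certificate/files/cert.crt
cp /etc/letsencrypt/live/example.com/privkey.pem /path/with/certificate/files/cert.key
docker run -p 8080:8080 --env WORKSPACE_SSL_ENABLED="true" -v /path/with/certificate/files:/resources/ssl:ro mltooling/ml-workspace:0.12.1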

When you enable SSL support, you must access the workspace over https://, not over plain http://.

Limit Memory & CPU

By default, the workspace container has no resource constraints and can use as much of a given resource as the host's kernel scheduler allows. Docker provides ways to control how much memory or CPU a container can use by setting runtime configuration flags of the docker run command.

The workspace requires at least 2 CPUs and 500MB of memory to run stably and be usable.

Details (click to expand...)

For example, the following command restricts the workspace to only use a maximum of 8 CPUs, 16 GB of memory, and 1 GB of shared memory (see Known Issues):

docker run -p 8080:8080 --cpus=8 --memory=16g --shm-size=1G mltooling/ml-workspace:0.12.1

πŸ“– For more options and documentation on resource constraints, please refer to the official docker guide.

Enable Proxy

If a proxy is required, you can pass the proxy configuration via the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables.
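
For example (a sketch; replace the proxy URLs with the values of your environment):

docker run -p 8080:8080 \
    --env HTTP_PROXY="http://proxy.example.com:3128" \
    --env HTTPS_PROXY="http://proxy.example.com:3128" \
    --env NO_PROXY="localhost,127.0.0.1" \
    mltooling/ml-workspace:0.12.1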

Workspace Flavors

In addition to the main workspace image (mltooling/ml-workspace), we provide other image flavors that extend the features or minimize the image size to support a variety of use cases.

Minimal Flavor

Details (click to expand...)

The minimal flavor (mltooling/ml-workspace-minimal) is our smallest image. It contains most of the tools and features described in the features section but without most of the Python libraries that are pre-installed in our main image. Any excluded Python library or tool can be installed manually at runtime by the user.

docker run -p 8080:8080 mltooling/ml-workspace-minimal:0.12.1

R Flavor

Details (click to expand...)

The R flavor (mltooling/ml-workspace-r) is based on our default workspace image and extends it with the R-interpreter, R-Jupyter kernel, RStudio server (access via Open Tool -> RStudio), and a variety of popular packages from the R ecosystem.

docker run -p 8080:8080 mltooling/ml-workspace-r:0.12.1

Spark Flavor

Details (click to expand...)

The Spark flavor (mltooling/ml-workspace-spark) is based on our R-flavor workspace image and extends it with the Spark runtime, Spark-Jupyter kernel, Zeppelin Notebook (access via Open Tool -> Zeppelin), PySpark, Hadoop, Java Kernel, and a few additional libraries & Jupyter extensions.

docker run -p 8080:8080 mltooling/ml-workspace-spark:0.12.1

GPU Flavor

Details (click to expand...)

Currently, the GPU-flavor only supports CUDA 11.2. Support for other CUDA versions might be added in the future.

The GPU flavor (mltooling/ml-workspace-gpu) is based on our default workspace image and extends it with the CUDA runtime (see the supported version above) and GPU-ready versions of various machine learning libraries (e.g., tensorflow, pytorch, cntk, jax). In addition, the host system needs compatible NVIDIA drivers and GPU support enabled in Docker. Depending on your Docker version, start the container with one of the following commands:

# Docker 19.03+
docker run -p 8080:8080 --gpus all mltooling/ml-workspace-gpu:0.12.1

# Older Docker versions with the nvidia-container-runtime installed
docker run -p 8080:8080 --runtime nvidia --env NVIDIA_VISIBLE_DEVICES="all" mltooling/ml-workspace-gpu:0.12.1

The GPU flavor also comes with a few additional configuration options, as explained below:

| Variable | Description | Default |
| --- | --- | --- |
| NVIDIA_VISIBLE_DEVICES | Controls which GPUs will be accessible inside the workspace. By default, all GPUs from the host are accessible within the workspace. You can either use all, none, or specify a comma-separated list of device IDs (e.g., 0,1). You can find out the list of available device IDs by running nvidia-smi on the host machine. | all |
| CUDA_VISIBLE_DEVICES | Controls which GPUs CUDA applications running inside the workspace will see. By default, all GPUs that the workspace has access to will be visible. To restrict applications, provide a comma-separated list of internal device IDs (e.g., 0,2) based on the available devices within the workspace (run nvidia-smi). In contrast to NVIDIA_VISIBLE_DEVICES, the workspace user will still be able to access other GPUs by overwriting this configuration from within the workspace. | |
| TF_FORCE_GPU_ALLOW_GROWTH | By default, the majority of GPU memory will be allocated by the first execution of a TensorFlow graph. While this behavior can be desirable for production pipelines, it is less desirable for interactive use. Use true to enable dynamic GPU memory allocation or false to instruct TensorFlow to allocate all memory at execution. | true |
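
For example, to make only the first two host GPUs accessible inside the workspace (a sketch using the variables above):

docker run -p 8080:8080 --runtime nvidia --env NVIDIA_VISIBLE_DEVICES="0,1" mltooling/ml-workspace-gpu:0.12.1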

Multi-user setup

The workspace is designed as a single-user development environment. For a multi-user setup, we recommend deploying 🧰 ML Hub. ML Hub is based on JupyterHub and is responsible for spawning, managing, and proxying workspace instances for multiple users.

Deployment (click to expand...)

ML Hub makes it easy to set up a multi-user environment on a single server (via Docker) or a cluster (via Kubernetes) and supports a variety of usage scenarios & authentication providers. You can try out ML Hub via:

docker run -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock mltooling/ml-hub:0.12.1

For more information and documentation about ML Hub, please take a look at the GitHub site.



Support

This project is maintained by Benjamin RΓ€thlein, Lukas Masuch, and Jan Kalkan. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.

  • 🚨   Bug Reports
  • 🎁   Feature Requests
  • πŸ‘©β€πŸ’»   Usage Questions
  • πŸ“’   Announcements
  • ❓   Other Requests


Features

Jupyter β€’ Desktop GUI β€’ VS Code β€’ JupyterLab β€’ Git Integration β€’ File Sharing β€’ Access Ports β€’ Tensorboard β€’ Extensibility β€’ Hardware Monitoring β€’ SSH Access β€’ Remote Development β€’ Job Execution

The workspace is equipped with a selection of best-in-class open-source development tools to help with the machine learning workflow. Many of these tools can be started from the Open Tool menu from Jupyter (the main application of the workspace):

Within your workspace you have full root & sudo privileges to install any library or tool you need via terminal (e.g., pip, apt-get, conda, or npm). You can find more ways to extend the workspace within the Extensibility section.

Jupyter

Jupyter Notebook is a web-based interactive environment for writing and running code. The main building blocks of Jupyter are the file-browser, the notebook editor, and kernels. The file-browser provides an interactive file manager for all notebooks, files, and folders in the /workspace directory.

A new notebook can be created by clicking on the New drop-down button at the top of the list and selecting the desired language kernel.

You can spawn interactive terminal instances as well by selecting New -> Terminal in the file-browser.

The notebook editor enables users to author documents that include live code, markdown text, shell commands, LaTeX equations, interactive widgets, plots, and images. These notebook documents provide a complete and self-contained record of a computation that can be converted to various formats and shared with others.

This workspace has a variety of third-party Jupyter extensions activated. You can configure these extensions via the nbextensions configurator, available in the nbextensions tab of the file browser.

The Notebook allows code to be run in a range of different programming languages. For each notebook document that a user opens, the web application starts a kernel that runs the code for that notebook and returns output. This workspace has a Python 3 kernel pre-installed. Additional kernels can be installed to get access to other languages (e.g., R, Scala, Go) or additional computing resources (e.g., GPUs, CPUs, memory).
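
For example, as a sketch, you could add a Bash kernel from a workspace terminal via the bash_kernel package (this particular kernel is just an illustration and not pre-installed):

pip install bash_kernel
python -m bash_kernel.install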

Python 2 is deprecated and we do not recommend using it. However, you can still install a Python 2.7 kernel via this command: /bin/bash /resources/tools/python-27.sh

Desktop GUI

This workspace provides HTTP-based VNC access via noVNC, which lets you access and work within the workspace through a fully-featured desktop GUI. To access this desktop GUI, go to Open Tool, select VNC, and click the Connect button. In case you are asked for a password, use vncpassword.

Once you are connected, you will see a desktop GUI that allows you to install and use full-fledged web browsers or any other tool that is available for Ubuntu. Within the Tools folder on the desktop, you will find a collection of install scripts that make it straightforward to install some of the most commonly used development tools, such as Atom, PyCharm, R-Runtime, R-Studio, or Postman (just double-click on the script).

Clipboard: If you want to share the clipboard between your machine and the workspace, you can use the copy-paste functionality as described below:

πŸ’‘ Long-running tasks: Use the desktop GUI for long-running Jupyter executions. By running notebooks from the browser of your workspace desktop GUI, all output will be synchronized to the notebook even if you have disconnected your browser from the notebook.

Visual Studio Code

Visual Studio Code (Open Tool -> VS Code) is an open-source, lightweight, but powerful code editor with built-in support for a variety of languages and a rich ecosystem of extensions. It combines the simplicity of a source code editor with powerful developer tooling, like IntelliSense code completion and debugging. The workspace integrates VS Code as a web-based application accessible through the browser, based on the awesome code-server project. It allows you to customize every feature to your liking and install any number of third-party extensions.

The workspace also provides a VS Code integration into Jupyter allowing you to open a VS Code instance for any selected folder, as shown below:

JupyterLab

JupyterLab (Open Tool -> JupyterLab) is the next-generation user interface for Project Jupyter. It offers all the familiar building blocks of the classic Jupyter Notebook (notebook, terminal, text editor, file browser, rich outputs, etc.) in a flexible and powerful user interface. This JupyterLab instance comes pre-installed with a few helpful extensions, such as jupyterlab-toc, jupyterlab-git, and jupyterlab-tensorboard.

Git Integration

Version control is a crucial aspect of productive collaboration. To make this process as smooth as possible, we have integrated a custom-made Jupyter extension specialized in pushing single notebooks, a full-fledged web-based Git client (ungit), a tool to open and edit plain text documents (e.g., .py, .md) as notebooks (jupytext), as well as a notebook merging tool (nbdime). Additionally, JupyterLab and VS Code also provide GUI-based Git clients.

Clone Repository

For cloning repositories via HTTPS, we recommend navigating to the desired root folder and clicking on the git button, as shown below:

This might ask for some required settings and subsequently opens ungit, a web-based Git client with a clean and intuitive UI that makes it convenient to sync your code artifacts. Within ungit, you can clone any repository. If authentication is required, you will be asked for your credentials.

Push, Pull, Merge, and Other Git Actions

To commit and push a single notebook to a remote Git repository, we recommend using the Git plugin integrated into Jupyter, as shown below:

For more advanced Git operations, we recommend using ungit. With ungit, you can perform the most common Git actions such as push, pull, merge, branch, tag, checkout, and many more.

Diffing and Merging Notebooks

Jupyter notebooks are great, but they are often huge files with a very specific JSON format. To enable seamless diffing and merging via Git, this workspace comes pre-installed with nbdime. Nbdime understands the structure of notebook documents and, therefore, automatically makes intelligent decisions when diffing and merging notebooks. In case you have merge conflicts, nbdime will make sure that the notebook is still readable by Jupyter, as shown below:

Furthermore, the workspace comes pre-installed with jupytext, a Jupyter plugin that reads and writes notebooks as plain text files. This allows you to open, edit, and run scripts or markdown files (e.g., .py, .md) as notebooks within Jupyter. In the following screenshot, we have opened a markdown file via Jupyter:

In combination with Git, jupytext enables a clear diff history and easy merging of version conflicts. With both of those tools, collaborating on Jupyter notebooks with Git becomes straightforward.

File Sharing

The workspace has a feature to share any file or folder with anyone via a token-protected link. To share data via a link, select any file or folder from the Jupyter directory tree and click on the share button as shown in the following screenshot:

This will generate a unique link protected via a token that gives anyone with the link access to view and download the selected data via the Filebrowser UI:

To deactivate or manage (e.g., provide edit permissions) shared links, open the Filebrowser via Open Tool -> Filebrowser and select Settings->User Management.

Access Ports

It is possible to securely access any workspace-internal port by selecting Open Tool -> Access Port. With this feature, you can reach a REST API or web application running inside the workspace directly with your browser. This enables developers to build, run, test, and debug REST APIs or web applications directly from the workspace.

If you want to use an HTTP client or share access to a given port, you can select the Get shareable link option. This generates a token-secured link that anyone with access to the link can use to access the specified port.

The accessed application needs to be able to run from a relative URL path or be configured with a base path (/tools/PORT/). Tools made accessible this way are secured by the workspace's authentication system! If you decide to publish any other port of the container yourself instead of using this feature, please make sure to secure it via an authentication mechanism!

Example (click to expand...)
  1. Start an HTTP server on port 1234 by running this command in a terminal within the workspace: python -m http.server 1234
  2. Select Open Tool -> Access Port, input port 1234, and select the Get shareable link option.
  3. Click Access, and you will see the content provided by Python's http.server.
  4. The opened link can also be shared to other people or called from external applications (e.g., try with Incognito Mode in Chrome).

SSH Access

SSH provides a powerful set of features that enables you to be more productive with your development tasks. You can easily set up a secure and passwordless SSH connection to a workspace by selecting Open Tool -> SSH. This will generate a secure setup command that can be run on any Linux or Mac machine to configure a passwordless & secure SSH connection to the workspace. Alternatively, you can also download the setup script and run it (instead of using the command).

The setup script only runs on Mac and Linux. Windows is currently not supported.

Just run the setup command or script on the machine from which you want to set up a connection to the workspace and input a name for the connection (e.g., my-workspace). You might also be asked for some additional input during the process, e.g., to install a remote kernel if remote_ikernel is installed. Once the passwordless SSH connection is successfully set up and tested, you can securely connect to the workspace by simply executing ssh my-workspace.

Besides the ability to execute commands on a remote machine, SSH also provides a variety of other features that can improve your development workflow as described in the following sections.

Tunnel Ports (click to expand...)

An SSH connection can be used for tunneling application ports from the remote machine to the local machine, or vice versa. For example, you can expose the workspace internal port 5901 (VNC Server) to the local machine on port 5000 by executing:

ssh -nNT -L 5000:localhost:5901 my-workspace

To expose an application port from your local machine to a workspace, use the -R option (instead of -L).
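
For example, to make an application running on port 8888 of your local machine reachable inside the workspace on the same port (a sketch):

ssh -nNT -R 8888:localhost:8888 my-workspace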

Once the VNC tunnel from the first example is established, you can use your favorite VNC viewer on your local machine and connect to vnc://localhost:5000 (default password: vncpassword). To make the tunnel connection more resilient and reliable, we recommend using autossh to automatically restart SSH tunnels in case the connection dies:

autossh -M 0 -f -nNT -L 5000:localhost:5901 my-workspace

Port tunneling is quite useful when you have started a server-based tool within the workspace that you would like to make accessible to another machine. In its default setting, the workspace already has a variety of tools running on different ports, such as:

  • 8080: Main workspace port with access to all integrated tools.
  • 8090: Jupyter server.
  • 8054: VS Code server.
  • 5901: VNC server.
  • 22: SSH server.

You can find port information on all the tools in the supervisor configuration.

πŸ“– For more information about port tunneling/forwarding, we recommend this guide.

Copy Data via SCP (click to expand...)

SCP allows files and directories to be securely copied to, from, or between different machines via SSH connections. For example, to copy a local file (./local-file.txt) into the /workspace folder inside the workspace, execute:

scp ./local-file.txt my-workspace:/workspace

To copy the /workspace directory from my-workspace to the working directory of the local machine, execute:

scp -r my-workspace:/workspace .

πŸ“– For more information about scp, we recommend this guide.

Sync Data via Rsync (click to expand...)

Rsync is a utility for efficiently transferring and synchronizing files between different machines (e.g., via SSH connections) by comparing the modification times and sizes of files. The rsync command will determine which files need to be updated each time it is run, which is far more efficient and convenient than using something like scp or sftp. For example, to sync all content of a local folder (./local-project-folder/) into the /workspace/remote-project-folder/ folder inside the workspace, execute:

rsync -rlptzvP --delete --exclude=".git" "./local-project-folder/" "my-workspace:/workspace/remote-project-folder/"

If you have some changes inside the folder on the workspace, you can sync those changes back to the local folder by changing the source and destination arguments:

rsync -rlptzvP --delete --exclude=".git" "my-workspace:/workspace/remote-project-folder/" "./local-project-folder/"

You can rerun these commands each time you want to synchronize the latest copy of your files. Rsync will make sure that only updates will be transferred.

πŸ“– You can find more information about rsync on this man page.

Mount Folders via SSHFS (click to expand...)

Besides copying and syncing data, an SSH connection can also be used to mount directories from a remote machine into the local filesystem via SSHFS. For example, to mount the /workspace directory of my-workspace into a local path (e.g. /local/folder/path), execute:

sshfs -o reconnect my-workspace:/workspace /local/folder/path

Once the remote directory is mounted, you can interact with the remote file system the same way as with any local directory and file.
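
To unmount the directory again on Linux, execute the following command (on Mac, umount /local/folder/path should work instead):

fusermount -u /local/folder/path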

πŸ“– For more information about sshfs, we recommend this guide.

Remote Development

The workspace can be integrated and used as a remote runtime (also known as remote kernel/machine/interpreter) for a variety of popular development tools and IDEs, such as Jupyter, VS Code, PyCharm, Colab, or Atom Hydrogen. Thereby, you can connect your favorite development tool running on your local machine to a remote machine for code execution. This enables a local-quality development experience with remote-hosted compute resources.

These integrations usually require a passwordless SSH connection from the local machine to the workspace. To set up an SSH connection, please follow the steps explained in the SSH Access section.

Jupyter - Remote Kernel (click to expand...)

The workspace can be added to a Jupyter instance as a remote kernel by using the remote_ikernel tool. If you have installed remote_ikernel (pip install remote_ikernel) on your local machine, the SSH setup script of the workspace will automatically offer you the option to set up a remote kernel connection.

When running kernels on remote machines, the notebooks themselves will be saved onto the local filesystem, but the kernel will only have access to the filesystem of the remote machine running the kernel. If you need to sync data, you can make use of rsync, scp, or sshfs as explained in the SSH Access section.

In case you want to manually set up and manage remote kernels, use the remote_ikernel command-line tool, as shown below:

# Replace my-workspace with the name of a workspace SSH connection
remote_ikernel manage --add \
    --interface=ssh \
    --kernel_cmd="ipython kernel -f {connection_file}" \
    --name="ml-server (Python)" \
    --host="my-workspace"

You can use the remote_ikernel command line functionality to list (remote_ikernel manage --show) or delete (remote_ikernel manage --delete <REMOTE_KERNEL_NAME>) remote kernel connections.

VS Code - Remote Machine (click to expand...)

The Visual Studio Code Remote - SSH extension allows you to open a remote folder on any remote machine with SSH access and work with it just as you would if the folder were on your own machine. Once connected to a remote machine, you can interact with files and folders anywhere on the remote filesystem and take full advantage of VS Code's feature set (IntelliSense, debugging, and extension support). The extension discovers and works out of the box with passwordless SSH connections as configured by the workspace SSH setup script. To enable your local VS Code application to connect to a workspace:

  1. Install Remote - SSH extension inside your local VS Code.
  2. Run the SSH setup script of a selected workspace as explained in the SSH Access section.
  3. Open the Remote-SSH panel in your local VS Code. All configured SSH connections should be automatically discovered. Just select any configured workspace connection you would like to connect to, as shown below:

πŸ“– You can find additional features and information about the Remote SSH extension in this guide.

Tensorboard

Tensorboard provides a suite of visualization tools to make it easier to understand, debug, and optimize your experiment runs. It includes logging features for scalar, histogram, model structure, embeddings, and text & image visualization. The workspace comes pre-installed with the jupyter_tensorboard extension, which integrates Tensorboard into the Jupyter interface with functionality to start, manage, and stop instances. You can open a new instance for a valid logs directory, as shown below:

If you have opened a Tensorboard instance in a valid log directory, you will see the visualizations of your logged data:

Tensorboard can be used in combination with many other ML frameworks besides Tensorflow. By using the tensorboardX library, you can log from basically any Python-based library. PyTorch also has a direct Tensorboard integration, as described here.

If you prefer to see Tensorboard directly within your notebook, you can make use of the following Jupyter magics:

%load_ext tensorboard
%tensorboard --logdir /workspace/path/to/logs

Hardware Monitoring

The workspace provides two pre-installed web-based tools that help developers get insights into everything happening on the system and figure out performance bottlenecks during model training and other experimentation tasks.

Netdata (Open Tool -> Netdata) is a real-time hardware and performance monitoring dashboard that visualizes processes and services on your Linux system. It monitors metrics about CPU, GPU, memory, disks, networks, processes, and more.

Glances (Open Tool -> Glances) is another web-based hardware monitoring dashboard that can be used as an alternative to Netdata.

Netdata and Glances will show you the hardware statistics for the entire machine on which the workspace container is running.

Run as a job

A job is defined as any computational task that runs for a certain time to completion, such as model training or a data pipeline.

The workspace image can also be used to execute arbitrary Python code without starting any of the pre-installed tools. This provides a seamless way to productize your ML projects since the code that has been developed interactively within the workspace will have the same environment and configuration when run as a job via the same workspace image.

Run Python code as a job via the workspace image (click to expand...)

To run Python code as a job, you need to provide a path or URL to a code directory (or script) via EXECUTE_CODE. The code can either already be mounted into the workspace container or downloaded from a version control system (e.g., git or svn), as described in the following sections. The selected code path needs to be executable by Python. In case the selected code is a directory (e.g., whenever you download the code from a VCS), you need to put a __main__.py file at the root of this directory. The __main__.py needs to contain the code that starts your job.
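
As a minimal sketch (the ml-job directory name and the print statement are purely illustrative), such a code directory only needs a __main__.py entry point; the sections below show how to execute it:

mkdir ml-job
printf 'print("ML job started")\n' > ml-job/__main__.py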

Run code from version control system

You can execute code directly from Git, Mercurial, Subversion, or Bazaar by using the pip-vcs format as described in this guide. For example, to execute code from a subdirectory of a git repository, just run:

docker run --env EXECUTE_CODE="git+https://github.com/ml-tooling/ml-workspace.git#subdirectory=resources/tests/ml-job" mltooling/ml-workspace:0.12.1

πŸ“– For additional information on how to specify branches, commits, or tags please refer to this guide.

Run code mounted into the workspace

In the following example, we mount and execute the current working directory (expected to contain our code) into the /workspace/ml-job/ directory of the workspace:

docker run -v "${PWD}:/workspace/ml-job/" --env EXECUTE_CODE="/workspace/ml-job/" mltooling/ml-workspace:0.12.1

Install Dependencies

In the case that the pre-installed workspace libraries are not compatible with your code, you can install or change dependencies by adding one or more of the following files to your code directory: environment.yml, setup.sh, or requirements.txt.

The execution order is 1. environment.yml -> 2. setup.sh -> 3. requirements.txt

Test job in interactive mode

You can test your job code within the workspace (started normally with interactive tools) by executing the following Python script:

python /resources/scripts/execute_code.py /path/to/your/job

Build a custom job image

It is also possible to embed your code directly into a custom job image, as shown below:

FROM mltooling/ml-workspace:0.12.1

# Add job code to image
COPY ml-job /workspace/ml-job
ENV EXECUTE_CODE=/workspace/ml-job

# Install requirements only
RUN python /resources/scripts/execute_code.py --requirements-only

# Execute only the code at container startup
CMD ["python", "/resources/docker-entrypoint.py", "--code-only"]

Pre-installed Libraries and Interpreters

The workspace is pre-installed with many popular interpreters, data science libraries, and Ubuntu packages:

  • Interpreters: Python 3.8 (Miniconda 3), NodeJS 14, Scala, Perl 5
  • Python libraries: Tensorflow, Keras, Pytorch, Sklearn, XGBoost, MXNet, Theano, and many more
  • Package managers: conda, pip, apt-get, npm, yarn, sdk, poetry, gdebi...

The full list of installed tools can be found within the Dockerfile.

For every minor version release, we run vulnerability, virus, and security checks within the workspace using safety, clamav, trivy, and snyk via docker scan to make sure that the workspace environment is as secure as possible. We are committed to fixing and preventing all high- or critical-severity vulnerabilities. You can find some up-to-date reports here.

Extensibility

The workspace provides a high degree of extensibility. Within the workspace, you have full root & sudo privileges to install any library or tool you need via terminal (e.g., pip, apt-get, conda, or npm). You can open a terminal in one of the following ways:

  • Jupyter: New -> Terminal
  • Desktop VNC: Applications -> Terminal Emulator
  • JupyterLab: File -> New -> Terminal
  • VS Code: Terminal -> New Terminal

Additionally, pre-installed tools such as Jupyter, JupyterLab, and Visual Studio Code each provide their own rich ecosystem of extensions. The workspace also contains a collection of installer scripts for many commonly used development tools and libraries (e.g., PyCharm, Zeppelin, RStudio, Starspace). You can find and execute all tool installers via Open Tool -> Install Tool. Those scripts can also be executed from the Desktop VNC (double-click on the script within the Tools folder on the Desktop VNC).

Example (click to expand...)

For example, to install the Apache Zeppelin notebook server, simply execute:

/resources/tools/zeppelin.sh --port=1234

After installation, refresh the Jupyter website and the Zeppelin tool will be available under Open Tool -> Zeppelin. Other tools might only be available within the Desktop VNC (e.g., atom or pycharm) or might not provide any UI at all (e.g., starspace, docker-client).

As an alternative to extending the workspace at runtime, you can also customize the workspace Docker image to create your own flavor as explained in the FAQ section.



FAQ

How to customize the workspace image (create your own flavor)? (click to expand...)

The workspace can be extended in many ways at runtime, as explained here. However, if you would like to customize the workspace image with your own software or configuration, you can do that via a Dockerfile, as shown below:

# Extend from any of the workspace versions/flavors
FROM mltooling/ml-workspace:0.12.1

# Run your customizations, e.g.
RUN \
    # Install r-runtime, r-kernel, and r-studio web server from provided install scripts
    /bin/bash $RESOURCES_PATH/tools/r-runtime.sh --install && \
    /bin/bash $RESOURCES_PATH/tools/r-studio-server.sh --install && \
    # Cleanup layer - removes unnecessary cache files
    clean-layer.sh

Finally, use docker build to build your customized Docker image.
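
For example (my-workspace:0.1.0 is an arbitrary tag; run the commands in the directory containing your Dockerfile):

docker build -t my-workspace:0.1.0 .
docker run -p 8080:8080 my-workspace:0.1.0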

πŸ“– For a more comprehensive Dockerfile example, take a look at the Dockerfile of the R-flavor.

How to update a running workspace container? (click to expand...)

To update a running workspace instance to a more recent version, the running Docker container needs to be replaced with a new container based on the updated workspace image.

All data within the workspace that is not persisted to a mounted volume will be lost during this update process. As mentioned in the persist data section, a volume is expected to be mounted into the /workspace folder. All tools within the workspace are configured to make use of the /workspace folder as the root directory for all source code and data artifacts. During an update, data within other directories will be removed, including installed/updated libraries or certain machine configurations. We have integrated a backup and restore feature (CONFIG_BACKUP_ENABLED) for various selected configuration files/folders, such as the user's Jupyter/VS-Code configuration, ~/.gitconfig, and ~/.ssh.

Update Example (click to expand...)

If the workspace is deployed via Docker (Kubernetes will have a different update process), you need to remove the existing container (via docker rm) and start a new one (via docker run) with the newer workspace image. Make sure to use the same configuration, volume, name, and port. For example, if a workspace (image version 0.8.7) was started with this command:

docker run -d \
    -p 8080:8080 \
    --name "ml-workspace" \
    -v "/path/on/host:/workspace" \
    --env AUTHENTICATE_VIA_JUPYTER="mytoken" \
    --restart always \
    mltooling/ml-workspace:0.8.7

and needs to be updated to version 0.9.1, you need to:

  1. Stop and remove the running workspace container: docker stop "ml-workspace" && docker rm "ml-workspace"
  2. Start a new workspace container with the newer image and same configuration: docker run -d -p 8080:8080 --name "ml-workspace" -v "/path/on/host:/workspace" --env AUTHENTICATE_VIA_JUPYTER="mytoken" --restart always mltooling/ml-workspace:0.9.1
How to configure the VNC server? (click to expand...)

If you want to directly connect to the workspace via a VNC client (not using the noVNC webapp), you might be interested in changing certain VNC server configurations. To configure the VNC server, you can provide/overwrite the following environment variables at container start (via docker run option: --env):

| Variable | Description | Default |
| --- | --- | --- |
| VNC_PW | Password of the VNC connection. This password only needs to be secure if the VNC server is directly exposed. If it is used via noVNC, it is already protected based on the configured authentication mechanism. | vncpassword |
| VNC_RESOLUTION | Default desktop resolution of the VNC connection. When using noVNC, the resolution will be dynamically adapted to the window size. | 1600x900 |
| VNC_COL_DEPTH | Default color depth of the VNC connection. | 24 |
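
For example, to start the workspace with a different resolution and VNC password (a sketch using the variables above):

docker run -p 8080:8080 --env VNC_RESOLUTION="1920x1080" --env VNC_PW="my-vnc-password" mltooling/ml-workspace:0.12.1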
How to use a non-root user within the workspace? (click to expand...)

Unfortunately, we currently do not support using a non-root user within the workspace. We plan to provide this capability and already started with some refactoring to allow this configuration. However, this still requires a lot more work, refactoring, and testing from our side.

Using root-user (or users with sudo permission) within containers is generally not recommended since, in case of system/kernel vulnerabilities, a user might be able to break out of the container and be able to access the host system. Since it is not very common to have such problematic kernel vulnerabilities, the risk of a severe attack is quite minimal. As explained in the official Docker documentation, containers (even with root users) are generally quite secure in preventing a breakout to the host. And compared to many other container use-cases, we actually want to provide the flexibility to the user to have control and system-level installation permissions within the workspace container.

How to create and use a virtual environment? (click to expand...)

The workspace comes preinstalled with various common tools to create isolated Python environments (virtual environments). The following sections provide a quick intro on how to use these tools within the workspace. You can find information on when to use which tool here. Please refer to the documentation of the given tool for additional usage information.

venv (recommended):

To create a virtual environment via venv, execute the following commands:

# Create environment in the working directory
python -m venv my-venv
# Activate environment in shell
source ./my-venv/bin/activate
# Optional: Create Jupyter kernel for this environment
pip install ipykernel
python -m ipykernel install --user --name=my-venv --display-name="my-venv ($(python --version))"
# Optional: Close environment session
deactivate

pipenv (recommended):

To create a virtual environment via pipenv, execute the following commands:

# Create environment in the working directory
pipenv install
# Activate environment session in shell
pipenv shell
# Optional: Create Jupyter kernel for this environment
pipenv install ipykernel
python -m ipykernel install --user --name=my-pipenv --display-name="my-pipenv ($(python --version))"
# Optional: Close environment session
exit

virtualenv:

To create a virtual environment via virtualenv, execute the following commands:

# Create environment in the working directory
virtualenv my-virtualenv
# Activate environment session in shell
source ./my-virtualenv/bin/activate
# Optional: Create Jupyter kernel for this environment
pip install ipykernel
python -m ipykernel install --user --name=my-virtualenv --display-name="my-virtualenv ($(python --version))"
# Optional: Close environment session
deactivate

conda:

To create a virtual environment via conda, execute the following commands:

# Create environment (globally)
conda create -n my-conda-env
# Activate environment session in shell
conda activate my-conda-env
# Optional: Create Jupyter kernel for this environment
python -m ipykernel install --user --name=my-conda-env --display-name="my-conda-env ($(python --version))"
# Optional: Close environment session
conda deactivate

Tip: Shell Commands in Jupyter Notebooks:

If you install and use a virtual environment via a dedicated Jupyter Kernel and use shell commands within Jupyter (e.g. !pip install matplotlib), the wrong python/pip version will be used. To use the python/pip version of the selected kernel, do the following instead:

import sys
!{sys.executable} -m pip install matplotlib
How to install a different Python version? (click to expand...)

The workspace provides three easy options to install different Python versions alongside the main Python instance: pyenv, pipenv (recommended), conda.

pipenv (recommended):

To install a different python version (e.g. 3.7.8) within the workspace via pipenv, execute the following commands:

# Install python version
pipenv install --python=3.7.8
# Activate environment session in shell
pipenv shell
# Check python installation
python --version
# Optional: Create Jupyter kernel for this environment
pipenv install ipykernel
python -m ipykernel install --user --name=my-pipenv --display-name="my-pipenv ($(python --version))"
# Optional: Close environment session
exit

pyenv:

To install a different python version (e.g. 3.7.8) within the workspace via pyenv, execute the following commands:

# Install python version
pyenv install 3.7.8
# Make globally accessible
pyenv global 3.7.8
# Activate python version in shell
pyenv shell 3.7.8
# Check python installation
python3.7 --version
# Optional: Create Jupyter kernel for this python version
python3.7 -m pip install ipykernel
python3.7 -m ipykernel install --user --name=my-pyenv-3.7.8 --display-name="my-pyenv (Python 3.7.8)"

conda:

To install a different python version (e.g. 3.7.8) within the workspace via conda, execute the following commands:

# Create environment with python version
conda create -n my-conda-3.7 python=3.7.8
# Activate environment session in shell
conda activate my-conda-3.7
# Check python installation
python --version
# Optional: Create Jupyter kernel for this python version
pip install ipykernel
python -m ipykernel install --user --name=my-conda-3.7 --display-name="my-conda ($(python --version))"
# Optional: Close environment session
conda deactivate

Tip: Shell Commands in Jupyter Notebooks:

If you install and use another Python version via a dedicated Jupyter Kernel and use shell commands within Jupyter (e.g. !pip install matplotlib), the wrong python/pip version will be used. To use the python/pip version of the selected kernel, do the following instead:

import sys
!{sys.executable} -m pip install matplotlib
Can I publish a port other than the default to access a tool inside the container? (click to expand...) You can do this, but please be aware that such a port is then not protected by the workspace's authentication mechanism! For security reasons, we therefore highly recommend using the Access Ports functionality of the workspace.
System and Tool Translations (click to expand...) If you want to configure a language other than English in your workspace and some tools are not translated properly, have a look at this issue. Try to comment out the 'exclude translations' line in `/etc/dpkg/dpkg.cfg.d/excludes` and re-install/configure the package, as shown in the sketch below.
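
The following sketch assumes the excludes file uses the standard path-exclude entries of minimized Ubuntu images; <package-name> is a placeholder for the affected package:

# Comment out the locale exclusion and re-install the affected package
sed -i 's|^path-exclude=/usr/share/locale|#&|' /etc/dpkg/dpkg.cfg.d/excludes
apt-get update && apt-get install --reinstall -y <package-name>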


Known Issues

Too small shared memory might crash tools or scripts (click to expand...)

Certain desktop tools (e.g., recent versions of Firefox) or libraries (e.g., Pytorch - see Issues: 1, 2) might crash if the shared memory size (/dev/shm) is too small. The default shared memory size of Docker is 64MB, which might not be enough for a few tools. You can provide a higher shared memory size via the --shm-size docker run option:

docker run --shm-size=2G mltooling/ml-workspace:0.12.1
Multiprocessing code is unexpectedly slow (click to expand...)

In general, the performance of running code within Docker is nearly identical to running it directly on the machine. However, in case you have limited the container's CPU quota (as explained in this section), the container can still see the full count of CPU cores available on the machine, and there is no technical way to prevent this. Many libraries and tools will use the full CPU count (e.g., via os.cpu_count()) to set the number of threads used for multiprocessing/multithreading. This might cause the program to start more threads/processes than it can efficiently handle with the available CPU quota, which can tremendously slow down the overall performance. Therefore, it is important to set the available CPU count or the maximum number of threads explicitly to the configured CPU quota. The workspace provides capabilities to detect the number of available CPUs automatically, which is used to configure a variety of common libraries via environment variables such as OMP_NUM_THREADS or MKL_NUM_THREADS. It is also possible to explicitly set the number of available CPUs at container startup via the MAX_NUM_THREADS environment variable (see the configuration section). The same environment variable can also be used to read the number of available CPUs at runtime.

Even though the automatic configuration capabilities of the workspace will fix a variety of inefficiencies, we still recommend configuring the number of available CPUs with all libraries explicitly. For example:

import os

# MAX_NUM_THREADS is set by the workspace at container startup
MAX_NUM_THREADS = int(os.getenv("MAX_NUM_THREADS"))

# Set in pytorch
import torch
torch.set_num_threads(MAX_NUM_THREADS)

# Set in tensorflow (TensorFlow 1.x API; in TensorFlow 2.x, use tf.compat.v1)
import tensorflow as tf
config = tf.ConfigProto(
    device_count={"CPU": MAX_NUM_THREADS},
    inter_op_parallelism_threads=MAX_NUM_THREADS,
    intra_op_parallelism_threads=MAX_NUM_THREADS,
)
tf_session = tf.Session(config=config)

# Set session for keras
import keras.backend as K
K.set_session(tf_session)

# Set in sklearn estimator (X, y: your training data)
from sklearn.linear_model import LogisticRegression
LogisticRegression(n_jobs=MAX_NUM_THREADS).fit(X, y)

# Set for multiprocessing pool (process_item and lst are placeholders
# for your worker function and input list)
from multiprocessing import Pool

with Pool(MAX_NUM_THREADS) as pool:
    results = pool.map(process_item, lst)
Nginx terminates with SIGILL core dumped error (click to expand...)

If you encounter the following error within the container logs when starting the workspace, it will most likely not be possible to run the workspace on your hardware:

exited: nginx (terminated by SIGILL (core dumped); not expected)

The OpenResty/Nginx binary package used within the workspace requires a CPU with SSE4.2 support (see this issue). Unfortunately, some older CPUs do not support SSE4.2 and, therefore, will not be able to run the workspace container. On Linux, you can check whether your CPU supports SSE4.2 by looking at the flags section in the output of cat /proc/cpuinfo. If you encounter this problem, feel free to notify us by commenting on the following issue: #30.



Contribution

Development

Requirements: Docker and Act need to be installed on your machine to execute the build process.

To simplify the process of building this project from scratch, we provide build-scripts - based on universal-build - that run all necessary steps (build, test, and release) within a containerized environment. To build and test your changes, execute the following command in the project root folder:

act -b -j build

Under the hood, it uses the build.py files in this repo, which are based on the universal-build library. If you want to build locally, you can also execute this command in the project root folder to build the Docker container:

python build.py --make

For additional script options:

python build.py --help

Refer to our contribution guides for more detailed information on our build scripts and development process.


Licensed Apache 2.0. Created and maintained with ❀️   by developers from Berlin.

    2021-01-13 13:28:34,212 INFO success: sshd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: jupyter entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: vncserver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: cron entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: filebrowser entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: glances entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: netdata entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: novnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: ungit entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: vscode entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:34,212 INFO success: xrdp entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2021-01-13 13:28:36.904169: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
    2021-01-13 14:00:01,277 [INFO] Starting configuration backup.
    

    As you can see, update-ca-certificates causes a crash. After about 5 minutes the system keeps going and runs into the next issue, since the path /opt/conda/envs/python2/lib/python2.7 does not exist. I suppose this old Python 2.7 path is no longer valid since you're using Python 3.x, so I'm not that bothered by it. But the update-ca-certificates bug leaves me unable to install e.g. RStudio within the container. I hope that using another Ubuntu base image will fix this issue.
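
    Until the base image is fixed, one possible workaround is to generate the certificate on the host and mount it, so the container skips its own certificate-generation path. This is only a sketch, not a verified fix: it assumes the workspace picks up a mounted cert.crt from /resources/ssl, and the key file name cert.key is an assumption.

    # Generate a self-signed certificate on the host (the CN is a placeholder).
    mkdir -p ./ssl
    openssl req -x509 -nodes -newkey rsa:4096 \
        -keyout ./ssl/cert.key -out ./ssl/cert.crt \
        -days 365 -subj "/CN=localhost"

    # Mount it so the container does not need to generate its own certificate.
    docker run -d \
        -p 8080:8080 \
        --gpus all \
        --env WORKSPACE_SSL_ENABLED="true" \
        -v "${PWD}/ssl:/resources/ssl" \
        mltooling/ml-workspace-gpu:0.12.1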

    By the way, would it be a lot of work for you to upgrade R 3.6.1 to R 4.0.3?

    Thanks for your wonderful software.

    bug stale 
    opened by Someone894 13
  • Update?

    Update?

    Can we expect updates anytime soon? I really like using ml-workspace, but since most of its software is outdated, it has become inconvenient to upgrade everything via pip/apt every time I deploy a new container.
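
    Until a new image is released, one workaround is to bake the upgrades into a derived image, so they persist across deployments instead of being repeated in every container. A minimal sketch; the upgraded packages below are placeholders, not a recommended set:

    FROM mltooling/ml-workspace:0.12.1
    # Hypothetical example upgrades; substitute the packages you actually need.
    RUN pip install --upgrade scikit-learn pandas && \
        apt-get update && \
        apt-get install -y --no-install-recommends htop && \
        clean-layer.sh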

    feature stale 
    opened by Emporea 12
  • Is this repository still active?

    Is this repository still active?

    Hello, just wanted to ask whether one can expect this repository to get updates in the future, or whether it has been abandoned?

    I use JupyterLab and VS Code on a daily basis, and each of them has received a lot of updates over the last half year or so, but ml-tooling hasn't had any updates since January.

    Is it because of corona?

    discussion 
    opened by Emporea 11
  • Ungit is not loading in the browser

    Ungit is not loading in the browser

    Describe the bug: The Ungit link from the workspace "Open Tool" menu is not loading the Ungit web interface.

    Expected behaviour: Open the web interface for Ungit

    Steps to reproduce the issue: 1. Open ML-Hub and log in to the server. 2. Create a new server with a given name. 3. After the server spawns, access the server. 4. Open Tool -> Ungit

    Technical details:

    • Workspace version : 0.8.7
    • Docker version :
    • Host Machine OS (Windows/Linux/Mac): Linux
    • Command used to start the workspace : ML-Hub
    • Browser (Chrome/Firefox/Safari): Chrome, Firefox

    Possible Fix:

    Additional context:

    bug next 
    opened by IslamAlam 11
  • Minimal Working Example with Non-root User (Jovyan) and Single User Spawn

    Minimal Working Example with Non-root User (Jovyan) and Single User Spawn

    MLWorkspace strikes me as a phenomenal tool with broad applicability. I'm wondering if you might be able to provide an example of how to fit it into a standard multi-user jupyterhub + dockerspawner, where a 'start-singleuser' command spawns an image in which the primary user is not root, but jupyter's standard jovyan, and permissions are fixed accordingly. I've been attempting this, but it's proving a bit elusive to me.

    Basically, is there a way of combining jupyter's base-notebook stack with mlworkspace to create a single-user compatible image that could be spawned from jupyterhub?

    (MLHub seems like a natural step in this direction, but this would be for users who are interested in integrating MLWorkspace into their existing workflows and who might be gun-shy about giving single users root permissions.)

    My thanks in advance.

    feature stale 
    opened by ColinConwell 10
  • Desktop GUI has some features not appearing in French

    Desktop GUI has some features not appearing in French

    Describe the bug:

    XFCE VNC does not get fully localized into French (first image below) after installing the locale and setting the environment variables.

    Expected behaviour:

    The XFCE applications (using Thunar as an example) should appear in French (second image below).

    Steps to reproduce the issue: In a Dockerfile:

    FROM mltooling/ml-workspace-minimal:0.12.1
    RUN \
        apt-get update && \
        apt-get install -y locales && \
        sed -i -e 's/# fr_FR.UTF-8 UTF-8/fr_FR.UTF-8 UTF-8/' /etc/locale.gen && \
        locale-gen && \
        dpkg-reconfigure --frontend=noninteractive locales && \
        clean-layer.sh
    ENV LANG=fr_FR.UTF-8 LANGUAGE=fr_FR.UTF-8 LC_ALL=fr_FR.UTF-8
    

    Build: sudo docker build -t mltest .
    Run: sudo docker run --rm -p 8080:8080 mltest
    Then navigate to the desktop GUI.

    Technical details:

    • Workspace version: 0.12.1 (base)
    • Docker version: 20.10.0
    • Host Machine OS (Windows/Linux/Mac): Linux
    • Command used to start the workspace : sudo docker run --rm -p 8080:8080 mltest
    • Browser (Chrome/Firefox/Safari): Firefox

    Possible Fix: I don't know :(

    Additional context: I've looked at and extended an XFCE-specific image here and was able to get the UI to appear fully in French (see second image below). That image is based on ubuntu:16.04.

    (Screenshots omitted: the first shows the partially localized desktop as observed; the second shows the fully localized UI that would be expected.)

    bug stale 
    opened by Jose-Matsuda 8
  • Docker image does not build

    Docker image does not build

    Describe the bug:

    When I build the image with docker, I get errors. I run the following command inside the project root folder: docker image build -t ml-workspace .

    After a while I get the error at step 50/97: RuntimeError: JupyterLab failed to build

    Expected behaviour:

    Success build.

    Steps to reproduce the issue:

    I run the following command inside the project root folder: docker image build -t ml-workspace .

    After a while I get the error at step 50/97:

    Step 50/97 : RUN     jupyter lab build &&     jupyter labextension install @jupyter-widgets/jupyterlab-manager &&     if [ "$WORKSPACE_FLAVOR" = "minimal" ]; then         jupyter lab clean &&         jlpm cache clean &&         rm -rf $CONDA_DIR/share/jupyter/lab/staging &&         clean-layer.sh &&         exit 0 ;     fi &&     jupyter labextension install @jupyterlab/toc &&     jupyter labextension install jupyterlab_tensorboard &&     jupyter labextension install @jupyterlab/git &&     pip install jupyterlab-git &&     jupyter serverextension enable --py jupyterlab_git &&     jupyter labextension install jupyter-matplotlib &&     if [ "$WORKSPACE_FLAVOR" = "light" ]; then         jupyter lab clean &&         jlpm cache clean &&         rm -rf $CONDA_DIR/share/jupyter/lab/staging &&         clean-layer.sh &&         exit 0 ;     fi &&     pip install --pre jupyter-lsp &&     jupyter labextension install @krassowski/jupyterlab-lsp &&     jupyter labextension install @jupyterlab/plotly-extension &&     jupyter labextension install jupyterlab-chart-editor &&     jupyter labextension install @pyviz/jupyterlab_pyviz &&     jupyter labextension install @lckr/jupyterlab_variableinspector &&     jupyter labextension install @ryantam626/jupyterlab_code_formatter &&     pip install jupyterlab_code_formatter &&     jupyter serverextension enable --py jupyterlab_code_formatter &&     jupyter lab build &&     jupyter lab clean &&     jlpm cache clean &&     rm -rf $CONDA_DIR/share/jupyter/lab/staging &&     clean-layer.sh
     ---> Running in 62ce5b6a94e6
    [LabBuildApp] JupyterLab 1.2.5
    [LabBuildApp] Building in /opt/conda/share/jupyter/lab
    [LabBuildApp] Building jupyterlab assets (build:prod:minimize)
    An error occured.
    RuntimeError: JupyterLab failed to build
    

    Technical details:

    • Workspace version : 0.9.1
    • Docker version : 19.03.05
    • Host Machine OS (Windows/Linux/Mac): Windows
    • Command used to start the workspace :
    • Browser (Chrome/Firefox/Safari):

    Possible Fix:

    Additional context:
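
    A common workaround for JupyterLab asset-build failures (not verified as the fix for this particular report) is to give the Node.js build more memory and skip the minimized production build:

    # Raise the Node.js heap limit, then rebuild without minimization.
    export NODE_OPTIONS=--max-old-space-size=4096
    jupyter lab build --dev-build=False --minimize=False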

    bug 
    opened by xprizedevops 8
  • Jupyter Notebook and JupyterLab seem to use only one CPU in multi-CPU configs

    Jupyter Notebook and JupyterLab seem to use only one CPU in multi-CPU configs

    Describe the bug:

    We have a setup where we have some cloud machines with one gpu and multiple cpus, which we use for training some models. In such a context, we've tried running our usual notebooks from the ml-workspace docker image with Jupyter Notebook as well as Jupyter Lab, however we've noticed that during training or multi-process operations, only one CPU on the machine is being used. We've noticed this in netstat as well as using htop.

    Expected behaviour:

    When running jupyter notebooks that use multiprocess operations, all cpus on the host machine should be used.

    Steps to reproduce the issue:

    Here is a piece of code we ran specifically to test this behaviour:

    import multiprocessing

    # Busy-loop forever: each worker process should fully occupy one CPU.
    def f(x):
        while True:
            x * x

    pool = multiprocessing.Pool(3)
    pool.map(f, range(100))

    Technical details:

    • Workspace version : 0.8.7
    • Docker version : 18.09.7
    • Host Machine OS (Windows/Linux/Mac): Ubuntu 18.04 LTS
    • Command used to start the workspace : docker run -d -p 8081:8080 --name "ml-workspace" -v ${WORKSPACE_DIR}:/workspace -v /data/:/data --runtime nvidia --env NVIDIA_VISIBLE_DEVICES="all" --env AUTHENTICATE_VIA_JUPYTER="token" --shm-size 512m --restart always $IMAGE
    • Browser (Chrome/Firefox/Safari):

    Possible Fix:

    Our suspicion is that there may be some restrictions in the setup/run configuration for Jupyter Notebook (see the diagnostic sketch below).

    Additional context:
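
    One way to narrow this down (a hypothetical diagnostic, assuming the container is named ml-workspace) is to compare the CPUs the container may actually be scheduled on with the host's CPU count:

    # CPUs Python reports vs. CPUs the container process may be scheduled on:
    docker exec ml-workspace python -c "import os; print(os.cpu_count(), len(os.sched_getaffinity(0)))"
    # CPU count on the host, for comparison:
    nproc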

    bug next 
    opened by andr0idsensei 7
  • Develop build fails due to Nginx, plus updates related to the Ubuntu base upgrade

    Develop build fails due to Nginx, plus updates related to the Ubuntu base upgrade

    What kind of change does this PR introduce?

    • [x] Bugfix
    • [ ] New Feature
    • [x] Feature Improvement
    • [ ] Refactoring
    • [ ] Documentation
    • [x] Other, please describe:

    Description:

    Currently, installing Flair via pip fails during the build of the latest develop branch.

        Requirement already satisfied, skipping upgrade: ipython-genutils==0.2.0 in /opt/conda/lib/python3.7/site-packages (from flair==0.4.4->-r /resources/libraries/requirements-full.txt (line 98)) (0.2.0)
        Collecting sqlitedict>=1.6.0
        Downloading https://files.pythonhosted.org/packages/0f/1c/c757b93147a219cf1e25cef7e1ad9b595b7f802159493c45ce116521caff/sqlitedict-1.6.0.tar.gz
        Collecting tiny-tokenizer[all]
        Downloading https://files.pythonhosted.org/packages/8d/0f/aa52c227c5af69914be05723b3deaf221805a4ccbce87643194ef2cdde43/tiny_tokenizer-3.1.0.tar.gz
        ERROR: Packages installed from PyPI cannot depend on packages which are not also hosted on PyPI.
        tiny-tokenizer depends on SudachiDict_core@ https://object-storage.tyo2.conoha.io/v1/nc_2520839e1f9641b08211a5c85243124a/sudachi/SudachiDict_core-20190927.tar.gz 
        The command '/bin/sh -c ln -s -f $CONDA_DIR/bin/python /usr/bin/python &&     apt-get update &&     pip install --upgrade pip &&     if [ "$WORKSPACE_FLAVOR" = "minimal" ]; then         conda install -y --update-all nomkl ;     else         conda install -y --update-all mkl ;     fi &&     conda install -y --update-all             'python='$PYTHON_VERSION             tqdm             pyzmq             cython             graphviz             numpy             matplotlib             scipy             requests             urllib3             pandas             six             future             protobuf             zlib             boost             psutil             PyYAML             python-crontab             ipykernel             cmake             Pillow             'ipython=7.10.*'             'notebook=6.0.*'             'jupyterlab=1.2.*' &&     pip install --no-cache-dir --upgrade -r ${RESOURCES_PATH}/libraries/requirements-minimal.txt &&     if [ "$WORKSPACE_FLAVOR" = "minimal" ]; then         fix-permissions.sh $CONDA_DIR &&         clean-layer.sh &&         exit 0 ;     fi &&     apt-get install -y --no-install-recommends libopenmpi-dev openmpi-bin &&     conda install -y mkl-include &&     conda install -y numba &&     conda install -y 'tensorflow=2.0.*' &&     conda install -y -c pytorch "pytorch==1.3.*" torchvision cpuonly &&     pip install --no-cache-dir --upgrade -r ${RESOURCES_PATH}/libraries/requirements-light.txt &&     if [ "$WORKSPACE_FLAVOR" = "light" ]; then         fix-permissions.sh $CONDA_DIR &&         clean-layer.sh &&         exit 0 ;     fi &&     apt-get install -y --no-install-recommends liblapack-dev libatlas-base-dev libeigen3-dev pandoc libblas-dev &&     conda install -y -c pytorch faiss-cpu &&     pip install --no-cache-dir --upgrade -r ${RESOURCES_PATH}/libraries/requirements-full.txt &&     python -m spacy download en &&     cd $CONDA_PYTHON_DIR/site-packages/spacy/lang &&     rm -rf tr pt da sv ca nb &&     fix-permissions.sh $CONDA_DIR &&     clean-layer.sh' returned a non-zero code: 1
        Executing: docker build -t ml-workspace:latest -t ml-workspace:latest  --build-arg ARG_WORKSPACE_VERSION=latest  --build-arg ARG_WORKSPACE_FLAVOR=full  --build-arg ARG_VCS_REF=8242e6b  --build-arg ARG_BUILD_DATE=2020-01-01T12:27:14Z ./
        Failed to build container
    

    The workaround is mentioned here.

    Checklist:

    • [x] I have read the CONTRIBUTING document.
    • [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
    • Fix issues related to the build of OpenResty, Nginx, and OpenSSL
    • Upgrade CUDA and apt packages for Ubuntu 18.04

    The Dockerfiles have been built and tested: https://hub.docker.com/u/imansour

    opened by IslamAlam 6
  • Julia Kernel

    Julia Kernel

    Feature description:

    It would be great if there could be a script in tools for installing a Julia kernel that would be available from Jupyter.
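
    A minimal sketch of what such a tool script might look like; the Julia version, download URL, and install locations below are assumptions, not a tested recipe:

    # Download and unpack a Julia release (version and URL are placeholders).
    JULIA_VERSION=1.6.7
    wget -q "https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-${JULIA_VERSION}-linux-x86_64.tar.gz" -O /tmp/julia.tar.gz
    tar -xzf /tmp/julia.tar.gz -C /opt
    ln -s /opt/julia-${JULIA_VERSION}/bin/julia /usr/local/bin/julia
    # Installing IJulia registers a Julia kernel with the local Jupyter setup.
    julia -e 'using Pkg; Pkg.add("IJulia")'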

    Problem and motivation:

    It's extremely easy to get Python or R going with ml-workspace, but it would be nice if Julia were as straightforward.

    Personally, I'm interested in getting more familiar with Julia and am also interested in using ml-workspace for my working environment. That's pretty much it. I think it would be useful for other Julia users.

    Is this something you're interested in working on?

    I'd be willing to try to figure it out.

    feature stale 
    opened by sterlinm 6
  • I'm not able to access a file inside a directory in workspace, like workspace/folder1/file1. How can this be done?

    I'm not able to access a file inside a directory in workspace, like workspace/folder1/file1. How can this be done?

    Describe the bug:

    Expected behaviour:

    Steps to reproduce the issue:

    Technical details:

    • Workspace version :
    • Docker version :
    • Host Machine OS (Windows/Linux/Mac):
    • Command used to start the workspace :
    • Browser (Chrome/Firefox/Safari):

    Possible Fix:

    Additional context:

    bug 
    opened by mahendranmahendran 1
  • update dockerfile with https for pyenv-doctor

    update dockerfile with https for pyenv-doctor

    What kind of change does this PR introduce?

    • [x] Bugfix
    • [ ] New Feature
    • [ ] Feature Improvement
    • [ ] Refactoring
    • [ ] Documentation
    • [ ] Other, please describe:

    Description: The build fails at 318. Changing the URI for pyenv-doctor to HTTPS fixes the build.

    Checklist:

    • [x] I have read the CONTRIBUTING document.
    • [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
    opened by jraviotta 0
  • Create updatedockerfile minimal

    Create updatedockerfile minimal

    What kind of change does this PR introduce?

    • [ ] Bugfix
    • [ ] New Feature
    • [ ] Feature Improvement
    • [ ] Refactoring
    • [ ] Documentation
    • [ ] Other, please describe:

    Description:

    Checklist:

    • [ ] I have read the CONTRIBUTING document.
    • [ ] My changes don't require a change to the documentation, or if they do, I've added all required information.
    opened by baluvanam 0
Releases (v0.13.2)
  • v0.13.2 (Jul 13, 2021)

  • v0.12.1 (Jan 11, 2021)

    πŸ“ Documentation

    • Document security aspects with exposing other ports (#67).
    • Various minor updates to Readme.

    πŸ‘· Maintenance & Refactoring

    • Refactor spark flavor into separated tool scripts.
    • Use r-flavor as base image for the spark-flavor.
    • Add link to GitHub repo into tooling menu.

    🎁 Features & Improvements

    • Add gpu-r flavor.

    🚨 Bug Fixes

    • Fix error if a folder is named notebooks (#63).

    ⬆ Dependencies

    • Update code-server, ungit, and filebrowser to newest versions.
    • Update Jupyter to 6.1.6.
    • Update python minimal and light dependencies.
    • Various other dependency updates and fixes.
  • v0.11.0 (Dec 13, 2020)

    πŸ‘· Maintenance & Refactoring

    • Change default branch from master to main

    🚨 Bug Fixes

    • Fix memory error with numba introduced by icc_rt
    • Fix and update tensorboard magic
    • Update tutorial notebooks

    ⬆ Dependencies

    • Add catboost and pycaret
    • Install lightgbm gpu version in gpu-flavor
    • Update rapids installer to version 0.17
  • v0.10.4 (Dec 13, 2020)

    🎁 Features & Improvements

    • Updated all dependencies to up-to-date versions
    • Added additional dependencies based on best-of-ml-python list
    • Implemented a new build-, test- and release-workflow
    • Apply trivy and synk vulnerability scans for additional security
    • Implemented and documented improved support for virtual/isolated Python environments
    • A variety of other fixes and improvements for better stability
Owner
Machine Learning Tooling
Open-source machine learning tooling to boost your productivity.