Rust bindings for the C++ api of PyTorch.

Overview

tch-rs

Rust bindings for the C++ API of PyTorch. The goal of the tch crate is to provide thin wrappers around the C++ PyTorch API (a.k.a. libtorch). It aims to stay as close as possible to the original C++ API. More idiomatic Rust bindings could then be developed on top of this. The documentation can be found on docs.rs.

The code generation part for the C api on top of libtorch comes from ocaml-torch.

Getting Started

This crate requires the C++ PyTorch library (libtorch) in version v1.9.0 to be available on your system. You can either:

  • Use the system-wide libtorch installation (default).
  • Install libtorch manually and let the build script know about it via the LIBTORCH environment variable.
  • When a system-wide libtorch can't be found and LIBTORCH is not set, the build script will download a pre-built binary version of libtorch. By default a CPU version is used. The TORCH_CUDA_VERSION environment variable can be set to cu111 in order to get a pre-built binary using CUDA 11.1.

System-wide Libtorch

The build script will look for a system-wide libtorch library in the following locations:

  • In Linux: /usr/lib/libtorch.so
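
The lookup described above can be summarised in a small standalone sketch. This is not the crate's actual build script: `LibtorchSource` and `resolve_libtorch` are hypothetical names, and checking LIBTORCH before the system-wide location is an assumption, since the text does not say which of the two takes precedence.

```rust
use std::path::{Path, PathBuf};

/// Where libtorch would come from (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum LibtorchSource {
    /// A manual install pointed to by the LIBTORCH environment variable.
    Manual(PathBuf),
    /// The system-wide library, e.g. /usr/lib/libtorch.so on Linux.
    System(PathBuf),
    /// A pre-built binary download; `None` means the CPU build.
    Download { cuda: Option<String> },
}

/// Assumed precedence: LIBTORCH, then the system-wide library, then a download.
fn resolve_libtorch(
    libtorch_env: Option<&str>,
    system_lib: &Path,
    system_lib_exists: bool,
    cuda_version: Option<&str>,
) -> LibtorchSource {
    if let Some(dir) = libtorch_env {
        return LibtorchSource::Manual(PathBuf::from(dir));
    }
    if system_lib_exists {
        return LibtorchSource::System(system_lib.to_path_buf());
    }
    LibtorchSource::Download {
        cuda: cuda_version.map(str::to_owned),
    }
}

fn main() {
    let system_lib = Path::new("/usr/lib/libtorch.so");
    let libtorch = std::env::var("LIBTORCH").ok();
    let cuda = std::env::var("TORCH_CUDA_VERSION").ok();
    let source = resolve_libtorch(
        libtorch.as_deref(),
        system_lib,
        system_lib.exists(),
        cuda.as_deref(),
    );
    println!("libtorch source: {:?}", source);
}
```

Running this prints which of the three options would apply to the current environment, which can help when it is unclear which libtorch a build picked up.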

Libtorch Manual Install

  • Get libtorch from the PyTorch website download section and extract the content of the zip file.
  • For Linux users, add the following to your .bashrc or equivalent, where /path/to/libtorch is the path to the directory that was created when unzipping the file.
export LIBTORCH=/path/to/libtorch
export LD_LIBRARY_PATH=${LIBTORCH}/lib:$LD_LIBRARY_PATH
  • For Windows users, assuming that X:\path\to\libtorch is the unzipped libtorch directory.

    • Navigate to Control Panel -> View advanced system settings -> Environment variables.
    • Create the LIBTORCH variable and set it to X:\path\to\libtorch.
    • Append X:\path\to\libtorch\lib to the Path variable.

    If you prefer to temporarily set environment variables, in PowerShell you can run

$Env:LIBTORCH = "X:\path\to\libtorch"
$Env:Path += ";X:\path\to\libtorch\lib"
  • You should now be able to run some examples, e.g. cargo run --example basics.

Windows Specific Notes

As per the PyTorch docs, the Windows debug and release builds are not ABI-compatible. This can lead to segfaults if the wrong build of libtorch is used.

Examples

Basic Tensor Operations

This crate provides a tensor type which wraps PyTorch tensors. Here is a minimal example of how to perform some tensor operations.

extern crate tch;
use tch::Tensor;

fn main() {
    let t = Tensor::of_slice(&[3, 1, 4, 1, 5]);
    let t = t * 2;
    t.print();
}

Training a Model via Gradient Descent

PyTorch provides automatic differentiation for most tensor operations it supports. This is commonly used to train models using gradient descent. The optimization is performed over variables which are created via a nn::VarStore by defining their shapes and initializations.

In the example below, my_module uses two variables x1 and x2 whose initial values are 0. The forward pass applied to tensor xs returns xs * x1 + exp(xs) * x2.

Once the model has been generated, a nn::Sgd optimizer is created. Then on each step of the training loop:

  • The forward pass is applied to a mini-batch of data.
  • A loss is computed as the sum of squared errors between the model output and the mini-batch ground truth.
  • Finally an optimization step is performed: gradients are computed and variables from the VarStore are modified accordingly.
extern crate tch;
use tch::nn::{Module, OptimizerConfig};
use tch::{kind, nn, Device, Tensor};

fn my_module(p: nn::Path, dim: i64) -> impl nn::Module {
    let x1 = p.zeros("x1", &[dim]);
    let x2 = p.zeros("x2", &[dim]);
    nn::func(move |xs| xs * &x1 + xs.exp() * &x2)
}

fn gradient_descent() {
    let vs = nn::VarStore::new(Device::Cpu);
    let my_module = my_module(vs.root(), 7);
    let mut opt = nn::Sgd::default().build(&vs, 1e-2).unwrap();
    for _idx in 1..50 {
        // Dummy mini-batches made of zeros.
        let xs = Tensor::zeros(&[7], kind::FLOAT_CPU);
        let ys = Tensor::zeros(&[7], kind::FLOAT_CPU);
        let loss = (my_module.forward(&xs) - ys).pow(2).sum(kind::Kind::Float);
        opt.backward_step(&loss);
    }
}
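
To make the optimization step concrete, here is a plain-Rust sketch of the same model with gradients written out by hand instead of relying on PyTorch's automatic differentiation. It does not use tch: the `forward` and `sgd_step` helpers are hypothetical, and the targets are set to ones (an assumption) because all-zero mini-batches would give zero gradients.

```rust
/// Forward pass from the text: xs * x1 + exp(xs) * x2, element-wise.
fn forward(xs: &[f64], x1: &[f64], x2: &[f64]) -> Vec<f64> {
    let mut out = Vec::with_capacity(xs.len());
    for i in 0..xs.len() {
        out.push(xs[i] * x1[i] + xs[i].exp() * x2[i]);
    }
    out
}

/// One plain SGD step on the sum of squared errors.
/// Returns the loss computed before the update.
fn sgd_step(xs: &[f64], ys: &[f64], x1: &mut [f64], x2: &mut [f64], lr: f64) -> f64 {
    let out = forward(xs, x1, x2);
    let mut loss = 0.0;
    for i in 0..xs.len() {
        let err = out[i] - ys[i];
        loss += err * err;
        // d(loss)/d(x1[i]) = 2 * err * xs[i]
        // d(loss)/d(x2[i]) = 2 * err * exp(xs[i])
        x1[i] -= lr * 2.0 * err * xs[i];
        x2[i] -= lr * 2.0 * err * xs[i].exp();
    }
    loss
}

fn main() {
    let dim = 7;
    let mut x1 = vec![0.0; dim];
    let mut x2 = vec![0.0; dim];
    // Inputs of zeros as in the example above; targets of ones are an
    // assumption made so that the gradients are not all zero.
    let xs = vec![0.0; dim];
    let ys = vec![1.0; dim];
    for _ in 1..50 {
        sgd_step(&xs, &ys, &mut x1, &mut x2, 1e-2);
    }
    println!("x1 = {:?}", x1);
    println!("x2 = {:?}", x2);
}
```

With xs fixed at zero, only x2 receives a non-zero gradient, so x1 stays at its initial value while x2 moves towards the target.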

Writing a Simple Neural Network

The nn api can be used to create neural network architectures, e.g. the following code defines a simple model with one hidden layer and trains it on the MNIST dataset using the Adam optimizer.

extern crate anyhow;
extern crate tch;
use anyhow::Result;
use tch::{nn, nn::Module, nn::OptimizerConfig, Device};

const IMAGE_DIM: i64 = 784;
const HIDDEN_NODES: i64 = 128;
const LABELS: i64 = 10;

fn net(vs: &nn::Path) -> impl Module {
    nn::seq()
        .add(nn::linear(
            vs / "layer1",
            IMAGE_DIM,
            HIDDEN_NODES,
            Default::default(),
        ))
        .add_fn(|xs| xs.relu())
        .add(nn::linear(vs, HIDDEN_NODES, LABELS, Default::default()))
}

pub fn run() -> Result<()> {
    let m = tch::vision::mnist::load_dir("data")?;
    let vs = nn::VarStore::new(Device::Cpu);
    let net = net(&vs.root());
    let mut opt = nn::Adam::default().build(&vs, 1e-3)?;
    for epoch in 1..200 {
        let loss = net
            .forward(&m.train_images)
            .cross_entropy_for_logits(&m.train_labels);
        opt.backward_step(&loss);
        let test_accuracy = net
            .forward(&m.test_images)
            .accuracy_for_logits(&m.test_labels);
        println!(
            "epoch: {:4} train loss: {:8.5} test acc: {:5.2}%",
            epoch,
            f64::from(&loss),
            100. * f64::from(&test_accuracy),
        );
    }
    Ok(())
}

More details on the training loop can be found in the detailed tutorial.
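
For readers unfamiliar with the two metrics used in the loop above, the following plain-Rust sketch shows what they compute on small inputs. It assumes that cross_entropy_for_logits is a log-softmax followed by a mean negative log-likelihood and that accuracy_for_logits is the fraction of rows whose argmax matches the label; `mean_cross_entropy` and `accuracy` are hypothetical helper names, not part of tch.

```rust
/// Index of the largest value in a row of logits.
fn argmax(row: &[f64]) -> usize {
    let mut best = 0;
    for (i, v) in row.iter().enumerate() {
        if *v > row[best] {
            best = i;
        }
    }
    best
}

/// Mean over rows of the negative log-softmax probability of the true label.
fn mean_cross_entropy(logits: &[Vec<f64>], labels: &[usize]) -> f64 {
    let mut total = 0.0;
    for (row, &label) in logits.iter().zip(labels) {
        // Subtract the row maximum before exponentiating for numerical stability.
        let max = row[argmax(row)];
        let mut sum_exp = 0.0;
        for v in row {
            sum_exp += (v - max).exp();
        }
        let log_sum_exp = max + sum_exp.ln();
        total += log_sum_exp - row[label];
    }
    total / logits.len() as f64
}

/// Fraction of rows whose highest logit is at the label index.
fn accuracy(logits: &[Vec<f64>], labels: &[usize]) -> f64 {
    let mut correct = 0usize;
    for (row, &label) in logits.iter().zip(labels) {
        if argmax(row) == label {
            correct += 1;
        }
    }
    correct as f64 / logits.len() as f64
}

fn main() {
    let logits = vec![vec![2.0, 1.0], vec![0.0, 3.0]];
    let labels = [0, 0];
    println!("loss: {:8.5}", mean_cross_entropy(&logits, &labels));
    println!("acc:  {:5.2}%", 100.0 * accuracy(&logits, &labels));
}
```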

Using some Pre-Trained Model

The pretrained-models example illustrates how to use a pre-trained computer vision model on an image. The weights, which have been extracted from the PyTorch implementation, are available for download as resnet18.ot and resnet34.ot.

The example can then be run via the following command:

cargo run --example pretrained-models -- resnet18.ot tiger.jpg

This should print the top 5 imagenet categories for the image. The code for this example is pretty simple.

    // First the image is loaded and resized to 224x224.
    let image = imagenet::load_image_and_resize(image_file)?;

    // A variable store is created to hold the model parameters.
    let vs = tch::nn::VarStore::new(tch::Device::Cpu);

    // Then the model is built on this variable store, and the weights are loaded.
    let resnet18 = tch::vision::resnet::resnet18(vs.root(), imagenet::CLASS_COUNT);
    vs.load(weight_file)?;

    // Apply the forward pass of the model to get the logits and convert them
    // to probabilities via a softmax.
    let output = resnet18
        .forward_t(&image.unsqueeze(0), /*train=*/ false)
        .softmax(-1);

    // Finally print the top 5 categories and their associated probabilities.
    for (probability, class) in imagenet::top(&output, 5).iter() {
        println!("{:50} {:5.2}%", class, 100.0 * probability)
    }
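
The last two steps, softmax followed by a top-5 selection, can be sketched without libtorch. The `softmax` and `top` functions below are simplified stand-ins written for illustration: they work on a plain slice of logits and return class indices, whereas the example above prints class names.

```rust
/// Softmax over a slice of logits.
fn softmax(logits: &[f64]) -> Vec<f64> {
    // Subtract the maximum before exponentiating for numerical stability.
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|v| (v - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// The `k` most probable (probability, class index) pairs, most probable first.
fn top(probabilities: &[f64], k: usize) -> Vec<(f64, usize)> {
    let mut indexed: Vec<(f64, usize)> = probabilities
        .iter()
        .enumerate()
        .map(|(i, &p)| (p, i))
        .collect();
    // Sort in descending order of probability.
    indexed.sort_by(|a, b| b.0.total_cmp(&a.0));
    indexed.truncate(k);
    indexed
}

fn main() {
    let logits = [1.0, 3.0, 2.0, 0.5];
    let probabilities = softmax(&logits);
    for (probability, class) in top(&probabilities, 2) {
        println!("class {:3} {:5.2}%", class, 100.0 * probability);
    }
}
```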

External material:

  • A tutorial showing how to use Torch to compute option prices and greeks.

License

tch-rs is distributed under the terms of both the MIT license and the Apache license (version 2.0), at your option.

See LICENSE-APACHE, LICENSE-MIT for more details.

Comments
  • Trying to run basic examples, but I think I have some issues with my config

    Hello there :) I wanted to give this crate a go and was attracted by the apparent simplicity of usage. I downloaded libtorch 1.5 from the PyTorch website and set the corresponding environment variables. I tried to start a new cargo project and just put:

    [dependencies]
    tch = "0.1.6"
    

    in my Cargo.toml file.

    I used the first main from the examples:

    extern crate tch;
    use tch::Tensor;
    
    fn main() {
        let t = Tensor::of_slice(&[3, 1, 4, 1, 5]);
        let t = t * 2;
        t.print();
    }
    

    When I try to run it, I encounter some compilation issues:

       Compiling torch-sys v0.1.6
    error: failed to run custom build command for `torch-sys v0.1.6`
    
    Caused by:
      process didn't exit successfully: `F:\RustProjects\tt_torch_rl_demo\target\debug\build\torch-sys-d475e3cf6635366d\build-script-build` (exit code: 1)
    --- stdout
    cargo:rustc-link-search=native=C:\SDKs\libtorch\libtorch-1.5\lib
    TARGET = Some("x86_64-pc-windows-msvc")
    OPT_LEVEL = Some("0")
    HOST = Some("x86_64-pc-windows-msvc")
    CXX_x86_64-pc-windows-msvc = None
    CXX_x86_64_pc_windows_msvc = None
    HOST_CXX = None
    CXX = None
    CXXFLAGS_x86_64-pc-windows-msvc = None
    CXXFLAGS_x86_64_pc_windows_msvc = None
    HOST_CXXFLAGS = None
    CXXFLAGS = None
    CRATE_CC_NO_DEFAULTS = None
    CARGO_CFG_TARGET_FEATURE = Some("fxsr,sse,sse2")
    DEBUG = Some("true")
    running: "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.24.28314\\bin\\HostX64\\x64\\cl.exe" "-nologo" "-MD" "-Z7" "-Brepro" "-I" "C:\\SDKs\\libtorch\\libtorch-1.5\\include" "-I" "C:\\SDKs\\libtorch\\libtorch-1.5\\include/torch/csrc/api/include" "-FoF:\\RustProjects\\tt_torch_rl_demo\\target\\debug\\build\\torch-sys-51e7d731766f4f38\\out\\libtch/torch_api.o" "-c" "libtch/torch_api.cpp"
    torch_api.cpp
    C:\SDKs\libtorch\libtorch-1.5\include\torch\csrc\api\include\torch/data/worker_exception.h(18): warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
    libtch/torch_api.cpp(380): error C2248: 'torch::autograd::Engine::Engine': cannot access protected member declared in class 'torch::autograd::Engine'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/csrc/autograd/engine.h(213): note: see declaration of 'torch::autograd::Engine::Engine'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/csrc/autograd/engine.h(159): note: see declaration of 'torch::autograd::Engine'
    libtch/torch_api.cpp(394): error C2039: 'beta1': is not a member of 'torch::optim::AdamOptions'
    C:\SDKs\libtorch\libtorch-1.5\include\torch\csrc\api\include\torch/optim/adam.h(21): note: see declaration of 'torch::optim::AdamOptions'
    libtch/torch_api.cpp(394): error C3536: 'options': cannot be used before it is initialized
    libtch/torch_api.cpp(450): error C2039: 'options': is not a member of 'torch::optim::Adam'
    C:\SDKs\libtorch\libtorch-1.5\include\torch\csrc\api\include\torch/optim/adam.h(49): note: see declaration of 'torch::optim::Adam'
    libtch/torch_api.cpp(450): error C2039: 'options': is not a member of 'torch::optim::RMSprop'
    C:\SDKs\libtorch\libtorch-1.5\include\torch\csrc\api\include\torch/optim/rmsprop.h(54): note: see declaration of 'torch::optim::RMSprop'
    libtch/torch_api.cpp(450): error C2039: 'options': is not a member of 'torch::optim::SGD'
    C:\SDKs\libtorch\libtorch-1.5\include\torch\csrc\api\include\torch/optim/sgd.h(48): note: see declaration of 'torch::optim::SGD'
    libtch/torch_api.cpp(463): error C2039: 'options': is not a member of 'torch::optim::Adam'
    C:\SDKs\libtorch\libtorch-1.5\include\torch\csrc\api\include\torch/optim/adam.h(49): note: see declaration of 'torch::optim::Adam'
    libtch/torch_api.cpp(463): error C2039: 'options': is not a member of 'torch::optim::RMSprop'
    C:\SDKs\libtorch\libtorch-1.5\include\torch\csrc\api\include\torch/optim/rmsprop.h(54): note: see declaration of 'torch::optim::RMSprop'
    libtch/torch_api.cpp(463): error C2039: 'options': is not a member of 'torch::optim::SGD'
    C:\SDKs\libtorch\libtorch-1.5\include\torch\csrc\api\include\torch/optim/sgd.h(48): note: see declaration of 'torch::optim::SGD'
    libtch/torch_api.cpp(699): error C2039: 'isGenericList': is not a member of 'c10::IValue'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/csrc/jit/runtime/interpreter.h(13): note: see declaration of 'c10::IValue'
    libtch/torch_api.cpp(751): error C2039: 'isGenericList': is not a member of 'c10::IValue'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/csrc/jit/runtime/interpreter.h(13): note: see declaration of 'c10::IValue'
    libtch/torch_api.cpp(751): error C2039: 'toGenericList': is not a member of 'c10::IValue'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/csrc/jit/runtime/interpreter.h(13): note: see declaration of 'c10::IValue'
    libtch/torch_api.cpp(785): error C2039: 'toGenericList': is not a member of 'c10::IValue'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/csrc/jit/runtime/interpreter.h(13): note: see declaration of 'c10::IValue'
    libtch/torch_api.cpp(785): error C3536: 'vec': cannot be used before it is initialized
    libtch/torch_api.cpp(785): error C2109: subscript requires array or pointer type
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(878): error C2039: '_test_optional_float': is not a member of 'torch'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/custom_class.h(18): note: see declaration of 'torch'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(878): error C3861: '_test_optional_float': identifier not found
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(2364): error C2039: 'cudnn_convolution_backward_bias': is not a member of 'torch'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/custom_class.h(18): note: see declaration of 'torch'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(2364): error C3861: 'cudnn_convolution_backward_bias': identifier not found
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(2392): error C2039: 'cudnn_convolution_transpose_backward_bias': is not a member of 'torch'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/custom_class.h(18): note: see declaration of 'torch'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(2392): error C3861: 'cudnn_convolution_transpose_backward_bias': identifier not found
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(3703): error C2039: 'imag_out': is not a member of 'torch'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/custom_class.h(18): note: see declaration of 'torch'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(3703): error C3861: 'imag_out': identifier not found
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(3985): error C2660: 'at::leaky_relu_backward': function does not take 3 arguments
    C:\SDKs\libtorch\libtorch-1.5\include\ATen/Functions.h(14254): note: see declaration of 'at::leaky_relu_backward'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(3988): error C3536: 'outputs__': cannot be used before it is initialized
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(3985): error C2664: 'at::Tensor::Tensor(at::Tensor &&)': cannot convert argument 1 from 'int' to 'c10::intrusive_ptr<c10::TensorImpl,c10::UndefinedTensorImpl>'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(3988): note: No constructor could take the source type, or constructor overload resolution was ambiguous
    C:\SDKs\libtorch\libtorch-1.5\include\ATen/core/TensorBody.h(85): note: see declaration of 'at::Tensor::Tensor'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(3992): error C2039: 'leaky_relu_backward_out': is not a member of 'torch'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/custom_class.h(18): note: see declaration of 'torch'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(3992): error C3861: 'leaky_relu_backward_out': identifier not found
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(5958): error C2039: 'real_out': is not a member of 'torch'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/custom_class.h(18): note: see declaration of 'torch'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(5958): error C3861: 'real_out': identifier not found
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(6382): error C2660: 'at::rrelu_with_noise_backward': function does not take 6 arguments
    C:\SDKs\libtorch\libtorch-1.5\include\ATen/Functions.h(14406): note: see declaration of 'at::rrelu_with_noise_backward'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(6385): error C3536: 'outputs__': cannot be used before it is initialized
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(6382): error C2664: 'at::Tensor::Tensor(at::Tensor &&)': cannot convert argument 1 from 'int' to 'c10::intrusive_ptr<c10::TensorImpl,c10::UndefinedTensorImpl>'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(6385): note: No constructor could take the source type, or constructor overload resolution was ambiguous
    C:\SDKs\libtorch\libtorch-1.5\include\ATen/core/TensorBody.h(85): note: see declaration of 'at::Tensor::Tensor'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(6389): error C2039: 'rrelu_with_noise_backward_out': is not a member of 'torch'
    C:\SDKs\libtorch\libtorch-1.5\include\torch/custom_class.h(18): note: see declaration of 'torch'
    C:\Users\vidal\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.1.6\libtch\torch_api_generated.cpp.h(6389): error C3861: 'rrelu_with_noise_backward_out': identifier not found
    exit code: 2
    
    --- stderr
    
    
    error occurred: Command "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.24.28314\\bin\\HostX64\\x64\\cl.exe" "-nologo" "-MD" "-Z7" "-Brepro" "-I" "C:\\SDKs\\libtorch\\libtorch-1.5\\include" "-I" "C:\\SDKs\\libtorch\\libtorch-1.5\\include/torch/csrc/api/include" "-FoF:\\RustProjects\\tt_torch_rl_demo\\target\\debug\\build\\torch-sys-51e7d731766f4f38\\out\\libtch/torch_api.o" "-c" "libtch/torch_api.cpp" with args "cl.exe" did not execute successfully (status code exit code: 2).
    
    

    Do you have any idea of what I should do or what is missing on my system?

    I'm on Windows 10 using CLion with the Rust plugin.

    Thank you in advance.

    Nicolas

    opened by NicolasVidal 24
  • GRU::zero_state -> wrong shape

    Hey,

    The GRU::zero_state is missing a dimension which makes the seq method fail. According to the PyTorch doc, you can make it work by changing it to:

    fn zero_state(&self, batch_dim: i64) -> GRUState {
        let dim = if self.config.bidirectional { 2 } else { 1 };
        let shape = [dim * self.config.num_layers, batch_dim, self.hidden_dim];
        GRUState(Tensor::zeros(&shape, (Kind::Float, self.device)))
    }
    

    Should be a similar problem with the LSTM struct.

    Cheers

    opened by vegapit 21
  • Linking error tch 0.1.1 & Centos 7

    When I try to run a binary using tch = 0.1.1 on centos 7, I get a linking error: /usr/bin/ld: warning: libgomp-8bba0e50.so.1, needed by /home/maxence/zezima/market-analytics/target/release/build/torch-sys-c846cc9c89b11e6f/out/libtorch/libtorch/lib/libc10.so, not found (try using -rpath or -rpath-link). Note that I'm not installing libtorch manually.

    When I look at the content of the OUT_DIR of the build script, I can see libgomp-8bba0e50.so.1. So the lib is not missing, just not linked.

    See the full error output https://gist.github.com/jean-airoldie/c908704722181f2d0dfc27bf64bcf668.

    opened by jean-airoldie 17
  • New tutorial on the most fundamental functionalities

    A promise is a promise...

    Here is a tutorial I wrote covering the most fundamental use cases for this library.

    Let me know if I got anything wrong, cheers.

    opened by vegapit 15
  • `Cannot initialize CUDA without ATen_cuda library`

    Hi! First, thanks for your work regarding PyTorch.

    Background

    I have run into several problems when trying to run a project using rust-bert, a Rust-native implementation of Transformer-based models which uses tch-rs. The CPU version ran just fine, but the CUDA version did not. Initially, I started a thread on the rust-bert repository with possibly more detailed information, but I'll summarize it here:

    Problem

    First, switching from Device::Cpu to Device::Cuda made it stop working and generated the following error:

    TorchError { c_error: "Cannot initialize CUDA without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don\'t directly use any symbols from the CUDA library! One common culprit is a lack of -Wl,--no-as-needed in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don\'t depend on any of their symbols. You can check if this has occurred by using ldd on your binary to see if there is a dependency on *_cuda.so library. (initCUDA at C:\\b\\windows\\pytorch\\aten\\src\\ATen/detail/CUDAHooksInterface.h:63)\n(no backtrace available)" }.

    Trying to fix it, I installed CUDA 10.2.89, updated graphics drivers, tried release and debug modes, always deleting the cargo build directory to force a fresh build. All of this did not change anything.

    Then I tried manually installing PyTorch 1.5 and setting environment variables (LIBTORCH and PATH with the LibTorch path, TORCH_CUDA_VERSION as 10.2). After that, the previous error did not even show up, because a different runtime error aborted the process before anything else could happen:

    error: process didn't exit successfully: `target\release\phrase-set-variations.exe` (exit code: 0xc0000135, STATUS_DLL_NOT_FOUND)

    Now, not even the CPU version runs, not even after reverting the environment variable changes. :( I was not able to resolve that error with Google, so I'm asking you for help here.

    Environment

    • CUDA 10.2.89
    • tch = "0.1.7"
    • Windows 10
    • GeForce GTX 1060
    • rust-bert = "0.7.0"

    Question

    If any of you has an idea of what I could do next, I would really appreciate some hints :)

    opened by johannesvollmer 13
  • Suggesting a roadmap for v0.1

    Hi Laurent

    First of all, I wanted to thank you again for making this happen. Given the pace of development, and since I would love to see an amazing NN crate for Rust, below are my suggestions for the v0.1 release.

    • [ ] Improve error handling:
      • [x] Use failure crate for error handling.
      • [ ] Less panic and use unsafe_torch_err! more often.
      • [ ] Handling device errors #16
    • [ ] Various idiomatic Rust improvements:
      • [ ] Customizable optimizers #18
    • [ ] More unit test coverage.
    • [ ] Improve overall documentations.
      • [x] For module level docs use //!
      • [ ] Add doc examples for the more important methods/functions.
      • [ ] Cross-reference modules.
    • [ ] Decouple implementations from codegen.
    • [ ] Complete tutorials at least as much as the ocaml-torch equivalent.
    • [ ] Integration with Rust ndarray.
    • [ ] GPU build and testing:
      • [x] Local
      • [ ] CI (no free option)
    • [ ] Cover as much as PyTorch API as possible. (see how it goes?)
      • [ ] Linalg ops for dense and sparse tensors.
      • [ ] Add as much nn ops as possible in nn.
      • [ ] Initializers.
      • [ ] Data loading and augmentations.
      • [ ] Multiprocessing with rayon.
      • [ ] Distributed (though it's harder).
    • [ ] Pytorch extensions C++ <--> C <--> Rust
    • [ ] Subcrate core, vision, model_zoo, ffi inside tch through virtual workspace manifest.

    Since you've put in a lot of effort so far and I guess functionality-wise you want to make this crate mimic your other similar projects, please let us know of any other plans so that we are on the same page.

    opened by ehsanmok 13
  • DLL error : Status Ordinal Not Found

    I'm getting this error :

    error: process didn't exit successfully: `target\debug\neural1.exe` (exit code: 0xc0000138, STATUS_ORDINAL_NOT_FOUND)
    

    From what I understood, it might come from my installation of LibTorch, so here are the steps I've followed:

    1. Download LibTorch for c++, CPU, release
    2. Unzip the libtorch folder somewhere
    3. Added an environment variable called LIBTORCH with the path of the libtorch folder
    4. Added the libtorch folder to the PATH
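
The manual install instructions in the README append the lib subdirectory (X:\path\to\libtorch\lib) to Path rather than the libtorch folder itself. The following sketch checks whether that subdirectory is present in the search path. `lib_dir_on_path` is a hypothetical helper, and a positive result does not by itself rule out other causes of the error.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// True when `<libtorch>/lib` is one of the entries of the given PATH value.
fn lib_dir_on_path(libtorch: &Path, path_var: &OsStr) -> bool {
    let lib_dir = libtorch.join("lib");
    std::env::split_paths(path_var).any(|entry| entry == lib_dir)
}

fn main() {
    match (std::env::var_os("LIBTORCH"), std::env::var_os("PATH")) {
        (Some(libtorch), Some(path_var)) => {
            let found = lib_dir_on_path(Path::new(&libtorch), &path_var);
            println!("libtorch lib directory on PATH: {}", found);
        }
        _ => println!("LIBTORCH or PATH is not set"),
    }
}
```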
    opened by ykafia 12
  • creating first tensor takes 4 seconds

    Consider the following code:

    extern crate tch;
    use tch::{Cuda, Tensor};
    
    pub fn main() {
        println!("cuda: {:?}", Cuda::is_available());
    
        let opts = (tch::Kind::Float, tch::Device::Cuda(1));
    
        let start = std::time::Instant::now();
        let x_empty = Tensor::empty(&[5, 3], opts);
        let mid = std::time::Instant::now();
    
        let x_rand = Tensor::rand(&[5, 3], opts);
        let x_zeros = Tensor::zeros(&[5, 3], opts);
        let t = Tensor::of_slice(&[5, 3]);
    
        let end = std::time::Instant::now();
    
        println!("time to create 1st tensor: {:?}", mid - start);
        println!("time to create next 3 tensor: {:?}", end - mid);
    
        println!("start: {:?}", start);
        println!("mid: {:?}", mid);
        println!("end: {:?}", end);
    }
    

    I get results of:

    cuda: true
    time to create 1st tensor: 4.124049426s
    time to create next 3 tensor: 907.468µs
    start: Instant { tv_sec: 28481, tv_nsec: 825629454 }
    mid: Instant { tv_sec: 28485, tv_nsec: 949678880 }
    end: Instant { tv_sec: 28485, tv_nsec: 950586348 }
    

    Clearly I am doing something wrong, as it should not take 4 seconds to initialize CUDA. What am I doing wrong?

    opened by zeroexcuses 12
  • Convenient indexing methods

    I'm wondering whether we could have a convenient slicing function that automatically applies select(), narrow(), masked_index() or index_select() to tensors, just like in PyTorch. Because of the limitations of Index and IndexMut, we could name a polymorphic method tensor.i(), whose implementation depends on the input type. This snippet illustrates the idea.

    trait TensorIndex<T> {
        fn i(&self, index: T) -> Tensor;
    }
    
    impl TensorIndex<Range> for Tensor {...}
    

    I looked into how PyTorch handles slice indexes of distinct types, and summarize them into these categories

    type | impl
    --- | ---
    tuple of {integer, range, list of {integer, range}} | Each tuple component corresponds to one dimension. For example, tensor[0, :2, [1, 3, 5]] selects the 0th row on the first dim, up to the 2nd row on the second dim, and applies index_select() on the third dim.
    integer or range | I treat it as a degenerate case of the above.
    tensor | basically masked_index()

    I think Rust is capable of providing above semantics. However, unlike Python, we cannot have mixed typed slices. We need to play with macros to cope with explosive combinations of mixed-type tuples. So I leave the thought here and seek if anyone knows the best way.
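
The trait-based idea above can be tried out on a toy type. In this sketch `Tensor` is a hypothetical one-dimensional wrapper around a Vec (not tch::Tensor), and each impl only mimics the corresponding operation; mixed-type tuples, which the post identifies as the hard part, are not covered.

```rust
use std::ops::Range;

/// A toy 1-D tensor standing in for tch::Tensor (assumption for illustration).
#[derive(Debug, PartialEq)]
struct Tensor(Vec<i64>);

/// One method name, with behaviour chosen by the index type.
trait TensorIndex<T> {
    fn i(&self, index: T) -> Tensor;
}

/// An integer index picks a single element, in the spirit of select().
impl TensorIndex<usize> for Tensor {
    fn i(&self, index: usize) -> Tensor {
        Tensor(vec![self.0[index]])
    }
}

/// A range index takes a contiguous slice, in the spirit of narrow().
impl TensorIndex<Range<usize>> for Tensor {
    fn i(&self, index: Range<usize>) -> Tensor {
        Tensor(self.0[index].to_vec())
    }
}

/// A list of indices gathers arbitrary elements, in the spirit of index_select().
impl TensorIndex<&[usize]> for Tensor {
    fn i(&self, index: &[usize]) -> Tensor {
        Tensor(index.iter().map(|&j| self.0[j]).collect())
    }
}

fn main() {
    let t = Tensor(vec![10, 20, 30, 40, 50]);
    println!("{:?}", t.i(1usize));
    println!("{:?}", t.i(0usize..2));
    println!("{:?}", t.i(&[1usize, 3][..]));
}
```

Because the index type selects the impl, adding a new kind of index only requires another impl of the trait rather than a new method name.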

    opened by jerry73204 12
  • libtorch error when LD_LIBRARY_PATH is not set

    When LD_LIBRARY_PATH isn't set up to point to a local PyTorch install (such as during CLion builds/runs), CUDA calls fail with an error message such as

    TorchError { c_error: "Cannot initialize CUDA without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. .... 
    

    From the docs it seems like a copy of pytorch is getting downloaded which isn't cuda compatible. Is this intentional?

    opened by lumost 11
  • Translating `if torch.sum(gt_score) < 1:  return torch.sum(pred_score + pred_geo) * 0`

    Hi, I'm working on converting https://github.com/SakuraRiven/EAST to Rust using tch-rs,

    but I'm struggling to convert this line of code: https://github.com/SakuraRiven/EAST/blob/cec7ae98f9c21a475b935f74f4c3969f3a989bd4/loss.py#L31-L32

    I want it to return early from the function based on that condition. Any ideas?

    opened by mdrokz 10
  • How to pad mini-batches in a training loop?

    First of all, thanks for this wonderful library!

    I have been reading through the tests and examples but I'm struggling to figure out one basic use case, so I hope you don't mind me asking here. What is the idiomatic way to iterate through a labeled dataset and pad the mini-batches before each step? I understand tch::data::Iter2 provides something akin to pytorch data loaders but the Iter2 constructor requires me to input the data as tensors, which means padding all data samples to the maximum length. As I'm working with language data this is not feasible in my case. I looked at the translation example but it doesn't seem to use batching. Is my best option to implement something like tch::data::TextData{Iter} for labeled data myself, or have I missed some essential piece of the API somewhere?

    Thanks a lot.

    opened by trpstra 0
  • How to specify key/value types for `IValue::GenericDict`?

    I am trying to run a torchscript model with .method_is. The model has a method defined as follows:

        @torch.jit.export
        def generate(self, batch: Dict[str, torch.Tensor]) -> Tuple[torch.Tensor, torch.Tensor]:
            ...
    

    I am trying to construct an input as follows:

        let input = IValue::GenericDict(vec![                                                             
            (IValue::String("text".to_owned()), IValue::Tensor(Tensor::of_slice(&[/* ... */]).reshape(&[-1, 2]))),  
            (IValue::String("text_len".to_owned()), IValue::Tensor(Tensor::of_slice(&[29, 38]))),
            (IValue::String("start_index".to_owned()), IValue::Tensor(Tensor::of_slice(&[1, 1]))),  
        ]);
        let output = model.method_is("generate", &[input])?;
    

    The attempt above failed and produced a runtime error:

    Error: Internal torch error: generate() Expected a value of type 'Dict[str, Tensor]' for argument 'batch' but instead found type 'Dict[Any, Any]'.
    Position: 1
    Declaration: generate(__torch__.dp.model.model.ForwardTransformer self, Dict(str, Tensor) batch) -> ((Tensor, Tensor))
    Exception raised from checkArg at /path/to/libtorch/include/ATen/core/function_schema_inl.h:336 (most recent call first):
    

    Is it possible to specify key/value types for GenericDict, so that it won't be recognized as Dict[Any, Any]?

    opened by Contextualist 2
  • rebuild for torch-sys is triggered everytime

    rebuild for torch-sys is triggered everytime

    Hi, I have exported LD_LIBRARY_PATH and LIBTORCH as required and have a working tch-rs with CUDA. However, every time I run the tests, torch-sys is rebuilt ("Compiling torch-sys v0.10.0"), which takes a long time.

    It works fine on my other PC, which is CPU-only and does not set the LIBTORCH environment variable.

    This may be related to this PR: https://github.com/LaurentMazare/tch-rs/pull/184

    opened by doofin 0
  • New `VarStore.load` capabilities

    New `VarStore.load` capabilities

    Hello,

    I have noticed that the library now seems able to load PyTorch bin or pt pickle archives, which is a great addition. I have just tried this new feature and am unfortunately encountering errors.

    I am trying to load the following model file, which returns an OrderedDict when loaded in Python with torch.load("path/to/pytorch_model.bin")

    Loading the same model with Rust causes the following error:

    Error: Tch tensor error: Internal torch error: Unrecognized data format
    Exception raised from _load_parameters_bytes at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\mobile\import_data.cpp:269 (most recent call first):
    00007FFC387CAD1200007FFC387CACB0 c10.dll!c10::Error::Error [<unknown file> @ <unknown line number>]
    00007FFC387CA72E00007FFC387CA6E0 c10.dll!c10::detail::torchCheckFail [<unknown file> @ <unknown line number>]
    00007FFBF4BE7D0000007FFBF4BE7B40 torch_cpu.dll!torch::jit::_load_parameters [<unknown file> @ <unknown line number>]
    00007FF784B2A6F900007FF784B2A660 codebert.exe!at_loadz_callback_with_device [C:\Users\Guillaum\.cargo\registry\src\github.com-1ecc6299db9ec823\torch-sys-0.10.0\libtch\torch_api.cpp @ 419]
    00007FF783F4C3F900007FF783F4C2C0 codebert.exe!tch::wrappers::tensor::Tensor::loadz_multi_with_device<ref$<ref$<std::path::PathBuf> > > [C:\Users\Guillaum\.cargo\registry\src\github.com-1ecc6299db9ec823\tch-0.10.1\src\wrappers\tensor.rs @ 681]
    00007FF783F5528600007FF783F55170 codebert.exe!tch::nn::var_store::VarStore::named_tensors<ref$<std::path::PathBuf> > [C:\Users\Guillaum\.cargo\registry\src\github.com-1ecc6299db9ec823\tch-0.10.1\src\nn\var_store.rs @ 175]
    00007FF783F554BB00007FF783F55480 codebert.exe!tch::nn::var_store::VarStore::load<std::path::PathBuf> [C:\Users\Guillaum\.cargo\registry\src\github.com-1ecc6299db9ec823\tch-0.10.1\src\nn\var_store.rs @ 188]
    

    The error seems to come from libtorch itself, but you may know from your own experiments whether there are limitations on the formats that can be loaded?

    opened by guillaume-be 3
  • More flexible build script for Android support

    More flexible build script for Android support

    I wanted to be able to bind tch to PyTorch Mobile (libtorch on Android). To this end, I made a few changes to the build script:

    • I added target-specific environment variables for the path to libtorch (e.g. LIBTORCH_LIB_aarch64-linux-android), so that a different libtorch can be used for each architecture and OS.
    • I removed the include and lib suffixes that the build script added to the LIBTORCH_INCLUDE and LIBTORCH_LIB paths, so users are now expected to include these in the paths themselves. This is a breaking change, but it was necessary to make things work on Android, where the folders have different names.
    • PyTorch has two possible builds for Android, a "lite" version and a normal one. They use different names for the .so file, so I added a LIBTORCH_LITE file to tell the build script which one we are linking against.
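
    The target-specific lookup described in the first bullet could be sketched roughly as follows. This is not the code from the PR: `resolve_for_target` and the `lookup` closure are hypothetical, and the fallback to the plain variable name when no target-specific one is set is an assumption.

```rust
/// Resolves a setting such as `LIBTORCH_LIB`, preferring the target-specific
/// variable (`<base>_<target>`) and falling back to the plain one.
/// `lookup` abstracts over the environment so the logic can be tested.
fn resolve_for_target<F>(base: &str, target: &str, lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    let specific = format!("{}_{}", base, target);
    lookup(&specific).or_else(|| lookup(base))
}
```

    In a build script, this might be called as `resolve_for_target("LIBTORCH_LIB", &std::env::var("TARGET").unwrap(), |k| std::env::var(k).ok())`, since Cargo sets `TARGET` to the target triple when running build scripts.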

    I also made some changes to the code, namely support for moving tensors to the Vulkan device. I was disappointed to find out that PyTorch support for Vulkan is pretty half-baked, but still.

    opened by laptou 0
  • Crash printing simple tensor

    Crash printing simple tensor

    This causes a crash:

    
    use tch::{Device, Kind};

    fn main() {
        let x = tch::Tensor::rand(&[2, 2], (Kind::Uint8, Device::Cpu));
        println!("{x}");
    }
    
    
    

    error:

    
         Running `target/release/neural_net`
    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Torch("\"check_uniform_bounds\" not implemented for 'Byte'\nException raised from operator() at ../aten/src/ATen/native/DistributionTemplates.h:274 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f7cff1afd4b in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7f7cff1ab6fe in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libc10.so)\nframe #2: <unknown function> + 0x16f5b9c (0x7f7ce74f5b9c in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #3: at::native::uniform_(at::Tensor&, double, double, c10::optional<at::Generator>) + 0x2e (0x7f7ce74ec4fe in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #4: <unknown function> + 0x248681e (0x7f7ce828681e in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #5: <unknown function> + 0x2488b60 (0x7f7ce8288b60 in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #6: at::_ops::uniform_::call(at::Tensor&, double, double, c10::optional<at::Generator>) + 0x183 (0x7f7ce7f73083 in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #7: at::native::rand(c10::ArrayRef<long>, 
c10::optional<at::Generator>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x12a (0x7f7ce77d921a in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #8: at::native::rand(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x4d (0x7f7ce77d935d in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #9: <unknown function> + 0x25fbcfd (0x7f7ce83fbcfd in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #10: at::_ops::rand::redispatch(c10::DispatchKeySet, c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0xf3 (0x7f7ce7f08f63 in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #11: <unknown function> + 0x242331e (0x7f7ce822331e in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #12: at::_ops::rand::call(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x161 (0x7f7ce7f55041 in /home/threadexception/Desktop/neural_net/target/release/build/torch-sys-2fa15c8de74b3ddd/out/libtorch/libtorch/lib/libtorch_cpu.so)\nframe #13: <unknown function> + 0x20d56 (0x55db69e5ed56 in target/release/neural_net)\nframe #14: <unknown function> + 0x118a9 (0x55db69e4f8a9 in target/release/neural_net)\nframe #15: <unknown function> + 0xfbca (0x55db69e4dbca in target/release/neural_net)\nframe #16: <unknown function> + 0xfb53 (0x55db69e4db53 in 
target/release/neural_net)\nframe #17: <unknown function> + 0xfb69 (0x55db69e4db69 in target/release/neural_net)\nframe #18: <unknown function> + 0x3847c (0x55db69e7647c in target/release/neural_net)\nframe #19: <unknown function> + 0xfc85 (0x55db69e4dc85 in target/release/neural_net)\nframe #20: <unknown function> + 0x27510 (0x7f7ce5b6a510 in /lib64/libc.so.6)\nframe #21: __libc_start_main + 0x89 (0x7f7ce5b6a5c9 in /lib64/libc.so.6)\nframe #22: <unknown function> + 0xfa85 (0x55db69e4da85 in target/release/neural_net)\n")', /home/threadexception/.cargo/registry/src/github.com-1ecc6299db9ec823/tch-0.10.1/src/wrappers/tensor_generated.rs:13914:39
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    
    
    
    opened by terrarier2111 1
Owner
Laurent Mazare
Code for the paper "There is no Double-Descent in Random Forests"

Code for the paper "There is no Double-Descent in Random Forests" This repository contains the code to run the experiments for our paper called "There

2 Jan 14, 2022
This is the official repository of XVFI (eXtreme Video Frame Interpolation)

XVFI This is the official repository of XVFI (eXtreme Video Frame Interpolation), https://arxiv.org/abs/2103.16206 Last Update: 20210607 We provide th

Jihyong Oh 195 Dec 29, 2022
Boostcamp AI Tech 3rd / Basic Paper reading w.r.t Embedding

Boostcamp AI Tech 3rd: Basic Paper Reading w.r.t Embedding TL;DR We plan to run a study group on the foundational papers that form the main line of word/sentence embedding research from 1992 to 2018.

Soyeon Kim 14 Nov 14, 2022
Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.

Restormer: Efficient Transformer for High-Resolution Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,

Syed Waqas Zamir 906 Dec 30, 2022
[NeurIPS'21] Projected GANs Converge Faster

[Project] [PDF] [Supplementary] [Talk] This repository contains the code for our NeurIPS 2021 paper "Projected GANs Converge Faster" by Axel Sauer, Ka

798 Jan 04, 2023
The world's largest toxicity dataset.

The Toxicity Dataset by Surge AI Saving the internet is fun. Combing through thousands of online comments to build a toxicity dataset isn't. That's wh

Surge AI 134 Dec 19, 2022
Official repository for: Continuous Control With Ensemble DeepDeterministic Policy Gradients

Continuous Control With Ensemble Deep Deterministic Policy Gradients This repository is the official implementation of Continuous Control With Ensembl

4 Dec 06, 2021
Code for "On the Effects of Batch and Weight Normalization in Generative Adversarial Networks"

Note: this repo has been discontinued, please check code for newer version of the paper here Weight Normalized GAN Code for the paper "On the Effects

Sitao Xiang 182 Sep 06, 2021
Pre-trained model, code, and materials from the paper "Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation" (MICCAI 2019).

Adaptive Segmentation Mask Attack This repository contains the implementation of the Adaptive Segmentation Mask Attack (ASMA), a targeted adversarial

Utku Ozbulak 53 Jul 04, 2022
Joint deep network for feature line detection and description

SOLD² - Self-supervised Occlusion-aware Line Description and Detection This repository contains the implementation of the paper: SOLD² : Self-supervis

Computer Vision and Geometry Lab 427 Dec 27, 2022
Implementation of Monocular Direct Sparse Localization in a Prior 3D Surfel Map (DSL)

DSL Project page: https://sites.google.com/view/dsl-ram-lab/ Monocular Direct Sparse Localization in a Prior 3D Surfel Map Authors: Haoyang Ye, Huaiya

Haoyang Ye 93 Nov 30, 2022
In this project I played with mlflow, streamlit and fastapi to create a training and prediction app on digits

Fastapi + MLflow + streamlit Setup env. I hope I covered all. pip install -r requirements.txt Start app Go in the root dir and run these Streamlit str

76 Nov 23, 2022
Cl datasets - PyTorch image dataloaders and utility functions to load datasets for supervised continual learning

Continual learning datasets Introduction This repository contains PyTorch image

berjaoui 5 Aug 28, 2022
Improving Contrastive Learning by Visualizing Feature Transformation, ICCV 2021 Oral

Improving Contrastive Learning by Visualizing Feature Transformation This project hosts the codes, models and visualization tools for the paper: Impro

Bingchen Zhao 83 Dec 15, 2022
Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis"

StrengthNet Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis" https://arxiv.org/abs/2110

RuiLiu 65 Dec 20, 2022
Code for our ICCV 2021 Paper "OadTR: Online Action Detection with Transformers".

Code for our ICCV 2021 Paper "OadTR: Online Action Detection with Transformers".

66 Dec 15, 2022
Code for the upcoming CVPR 2021 paper

The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth Jamie Watson, Oisin Mac Aodha, Victor Prisacariu, Gabriel J. Brostow and Michael

Niantic Labs 496 Dec 30, 2022
code for Multi-scale Matching Networks for Semantic Correspondence, ICCV

MMNet This repo is the official implementation of ICCV 2021 paper "Multi-scale Matching Networks for Semantic Correspondence.". Pre-requisite conda cr

joey zhao 25 Dec 12, 2022
Air Pollution Prediction System using Linear Regression and ANN

AirPollution Pollution Weather Prediction System: Smart Outdoor Pollution Monitoring and Prediction for Healthy Breathing and Living Publication Link:

Dr Sharnil Pandya, Associate Professor, Symbiosis International University 19 Feb 07, 2022
DM-ACME compatible implementation of the Arm26 environment from Mujoco

ACME-compatible implementation of Arm26 from Mujoco This repository contains a customized implementation of Mujoco's Arm26 model, that can be used wit

1 Dec 24, 2021