Python wrapper of LSODA (solving ODEs) which can be called from within numba functions.

Overview

numbalsoda

numbalsoda is a python wrapper to the LSODA method in ODEPACK, which is for solving ordinary differential equation initial value problems. LSODA was originally written in Fortran. numbalsoda is a wrapper to a C++ re-write of the original code: https://github.com/dilawar/libsoda

numbalsoda also wraps the dop853 explicit Runge-Kutta method from this repository: https://github.com/jacobwilliams/dop853

This package is very similar to scipy.integrate.solve_ivp (see here), when you set method = 'LSODA' or method = DOP853. But, scipy.integrate.solve_ivp invokes the python interpreter every time step which can be slow. Also, scipy.integrate.solve_ivp can not be used within numba jit-compiled python functions. In contrast, numbalsoda never invokes the python interpreter during integration and can be used within a numba compiled function which makes numbalsoda a lot faster than scipy for most problems, and achieves similar performance to Julia's DifferentialEquations.jl in some cases (see benchmark folder).

Installation

Conda:

conda install -c conda-forge numbalsoda

Pip:

python -m pip install numbalsoda

Basic usage

from numbalsoda import lsoda_sig, lsoda, dop853
from numba import njit, cfunc
import numpy as np

@cfunc(lsoda_sig)
def rhs(t, u, du, p):
    du[0] = u[0]-u[0]*u[1]
    du[1] = u[0]*u[1]-u[1]*p[0]

funcptr = rhs.address # address to ODE function
u0 = np.array([5.,0.8]) # Initial conditions
data = np.array([1.0]) # data you want to pass to rhs (data == p in the rhs).
t_eval = np.linspace(0.0,50.0,1000) # times to evaluate solution

# integrate with lsoda method
usol, success = lsoda(funcptr, u0, t_eval, data = data)

# integrate with dop853 method
usol1, success1 = dop853(funcptr, u0, t_eval, data = data)

# usol = solution
# success = True/False

The variables u, du and p in the rhs function are pointers to an array of floats. Therefore, operations like np.sum(u) or len(u) will not work. However, you can use the function nb.carray() to make a numpy array out of the pointers. For example:

import numba as nb

@cfunc(lsoda_sig)
def rhs(t, u, du, p):
    u_ = nb.carray(u, (2,))
    p_ = nb.carray(p, (1,))
    # ... rest of rhs goes here using u_ and p_

Above, u_ and p_ are numpy arrays build out of u and p, and so functions like np.sum(u_) will work.

Also, note lsoda can be called within a jit-compiled numba function (see below). This makes it much faster than scipy if a program involves many integrations in a row.

@njit
def test():
    usol, success = lsoda(funcptr, u0, t_eval, data = data)
    return usol
usol = test() # this works!

@njit
def test_sp():
    sol = solve_ivp(f_scipy, t_span, u0, t_eval = t_eval, method='LSODA')
    return sol
sol = test_sp() # this does not work :(

Passing data to the right-hand-side function

In the examples shown above, we passed a an single array of floats to the right-hand-side function:

# ...
data = np.array([1.0])
usol, success = lsoda(funcptr, u0, t_eval, data = data)

However, sometimes you might want to pass more data types than just floats. For example, you might want to pass several integers, an array of floats, and an array of integers. One way to achieve this is with generating the cfunc using a function like this:

def make_lsoda_func(param1, param2, param3):
    @cfunc(lsoda_sig)
    def rhs(t, x, du, p):
        # Here param1, param2, and param3
        # can be accessed.
        du[0] = param1*t
        # etc...
    return rhs
    
rhs = make_lsoda_func(10.0, 5, 10000)
funcptr = rhs.address
# etc...

The only drawback of this approach is if you want to do many successive integrations where the parameters change because it would required re-compiling the cfunc between each integration. This could be slow.

But! It is possible to pass arbitrary parameters without re-compiling the cfunc, but it is a little tricky. The notebook passing_data_to_rhs_function.ipynb gives an example that explains how.

Comments
  • Installation with pip fails

    Installation with pip fails

    Good evening,

    I am working on a project involving the long-time integration of an ODE, and the results you show definitely look very promising, but I cannot install the package ...

    I am working on mac os X v12.4 with a M1 chip. Python v3.10.8, pip v22.3.1, setuptools v65.5.1.

    To really test only the install of numbalsoda and remove all possible side effects from pre-installed packages, I put myself in a new virtualenv. πŸ”΄ I am getting the following error when doing pip install numbalsoda:

    pip install numbalsoda
    Collecting numbalsoda
      Downloading numbalsoda-0.3.4.tar.gz (241 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 241.3/241.3 kB 2.8 MB/s eta 0:00:00
      Installing build dependencies ... done
      Getting requirements to build wheel ... done
      Preparing metadata (pyproject.toml) ... done
    Collecting numpy
      Downloading numpy-1.23.5-cp310-cp310-macosx_11_0_arm64.whl (13.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.4/13.4 MB 9.5 MB/s eta 0:00:00
    Collecting numba
      Downloading numba-0.56.4-cp310-cp310-macosx_11_0_arm64.whl (2.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.4/2.4 MB 9.4 MB/s eta 0:00:00
    Requirement already satisfied: setuptools in ./py3env/lib/python3.10/site-packages (from numba->numbalsoda) (65.5.1)
    Collecting llvmlite<0.40,>=0.39.0dev0
      Downloading llvmlite-0.39.1-cp310-cp310-macosx_11_0_arm64.whl (23.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.1/23.1 MB 10.3 MB/s eta 0:00:00
    Building wheels for collected packages: numbalsoda
      Building wheel for numbalsoda (pyproject.toml) ... error
      error: subprocess-exited-with-error
      
      Γ— Building wheel for numbalsoda (pyproject.toml) did not run successfully.
      β”‚ exit code: 1
      ╰─> [73 lines of output]
          /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/setuptools/dist.py:770: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
            warnings.warn(
          /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/setuptools/config/setupcfg.py:508: SetuptoolsDeprecationWarning: The license_file parameter is deprecated, use license_files instead.
            warnings.warn(msg, warning_class)
          
          
          --------------------------------------------------------------------------------
          -- Trying "Ninja" generator
          --------------------------------
          ---------------------------
          ----------------------
          -----------------
          ------------
          -------
          --
          Not searching for unused variables given on the command line.
          -- The C compiler identification is AppleClang 13.1.6.13160021
          -- Detecting C compiler ABI info
          -- Detecting C compiler ABI info - done
          -- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang - skipped
          -- Detecting C compile features
          -- Detecting C compile features - done
          -- The CXX compiler identification is AppleClang 13.1.6.13160021
          -- Detecting CXX compiler ABI info
          -- Detecting CXX compiler ABI info - done
          -- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ - skipped
          -- Detecting CXX compile features
          -- Detecting CXX compile features - done
          -- Configuring done
          -- Generating done
          -- Build files have been written to: /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_cmake_test_compile/build
          --
          -------
          ------------
          -----------------
          ----------------------
          ---------------------------
          --------------------------------
          -- Trying "Ninja" generator - success
          --------------------------------------------------------------------------------
          
          Configuring Project
            Working directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build
            Command:
              cmake /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57 -G Ninja -DCMAKE_INSTALL_PREFIX:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.8 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPYTHON_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPYTHON_LIBRARY:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/lib/libpython3.10.dylib -DPython_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython_FIND_REGISTRY:STRING=NEVER -DPython3_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython3_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython3_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython3_FIND_REGISTRY:STRING=NEVER -DCMAKE_MAKE_PROGRAM:FILEPATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DSKBUILD=ON -DCMAKE_BUILD_TYPE:STRING=Release -DCMAKE_OSX_DEPLOYMENT_TARGET:STRING=12.0 -DCMAKE_OSX_ARCHITECTURES:STRING=arm64
          
          CMake Error at /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineFortranCompiler.cmake:33 (message):
            Could not find compiler set in environment variable FC:
          
            /usr/local/bin/gfortran.
          Call Stack (most recent call first):
            CMakeLists.txt:3 (project)
          
          
          CMake Error: CMAKE_Fortran_COMPILER not set, after EnableLanguage
          CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
          -- Configuring incomplete, errors occurred!
          See also "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build/CMakeFiles/CMakeOutput.log".
          Traceback (most recent call last):
            File "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/setuptools_wrap.py", line 632, in setup
              env = cmkr.configure(
            File "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 334, in configure
              raise SKBuildError(
          
          An error occurred while configuring with CMake.
            Command:
              cmake /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57 -G Ninja -DCMAKE_INSTALL_PREFIX:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.8 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPYTHON_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPYTHON_LIBRARY:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/lib/libpython3.10.dylib -DPython_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython_FIND_REGISTRY:STRING=NEVER -DPython3_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython3_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython3_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython3_FIND_REGISTRY:STRING=NEVER -DCMAKE_MAKE_PROGRAM:FILEPATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DSKBUILD=ON -DCMAKE_BUILD_TYPE:STRING=Release -DCMAKE_OSX_DEPLOYMENT_TARGET:STRING=12.0 -DCMAKE_OSX_ARCHITECTURES:STRING=arm64
            Source directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57
            Working directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build
          Please see CMake's output for more information.
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
      ERROR: Failed building wheel for numbalsoda
    Failed to build numbalsoda
    ERROR: Could not build wheels for numbalsoda, which is required to install pyproject.toml-based projects
    

    πŸ”΄ This error also occurs with numbalsoda v0.3.3.

    🟒 The pip install runs fine for numbalsoda v0.2.1 however.

    Could you please advise @Nicholaswogan ?

    Thank you very much in advance ! Best, T. Bridel-Bertomeu

    opened by tbridel 5
  • [lsoda] 500 steps taken before reaching tout

    [lsoda] 500 steps taken before reaching tout

    I am running a simulation and if I understand correctly the maximum number of steps (500) is performed before the integration finishes. Is there a way to change this parameter?

    Thanks

    opened by nmourdo 5
  • Can I use 2d-array of 'u0'?

    Can I use 2d-array of 'u0'?

    I want to use the 2d-shape initial values of the multiple-particles system. But, I think I cannot use the slicing or indexing.

    def swing_lsoda(network):
        """
        rhs = swing_cover(network)
        funcptr = rhs.address
        ...
        data = np.array([1.0])
        usol, success = lsoda(funcptr, u0, t_eval, data = data)
        """
        @nb.cfunc(lsoda_sig) # network
        def rhs(t, u, du, p): # p0=m, p1=gamma, p2=P, p3=K
            # u is 2d-array 
            Interaction = p[3] * SinIntE(network, u[0])
            du[0] = u[1] #Thetas of nodes
            du[1] = 1/p[0]*(p[2] - p[1]*u[1] - Interaction) #Omega 
    

    How can I try it?

    opened by kimyoungjin06 4
  • Support for other integrators

    Support for other integrators

    Hi, thanks for this great package!

    I know this package is only supposed to deal with the lsoda integrator, but I was wondering if you could at least add support for an explicit integrator as well, maybe rk23. The reason I ask is that I want to solve a very large system of ODEs (>10^6), and because of that, implicit integrators are not really suitable (since they require a matrix solve Ax=b at each step), so I'm stuck with scipy's rk23 or rk45, which are fine, except that because the integration involves too many time steps, I think solve_ivp's explicit while loop is becoming the bottle-neck.

    opened by ma-sadeghi 4
  • Numba cache and iterative solving

    Numba cache and iterative solving

    Hi,

    I am trying to use NumbaLSODA to solve many systems iteratively. However, it seems that I can't out the function in the numba cache and so I have a big overhead at each iteration. I was wondering if you have an advice about it.

    This is my code. First I put all numba related function inside a file

    numba_func.py

    NOPYTHON = True
    CACHE = True
    PARALLEL = True
    
    @nb.cfunc(lsoda_sig, nopython=NOPYTHON, cache=CACHE)
    def rhs(x, i, di, p):
        di[0] = ...
        di[1] = ...
        di[2] = ...
    
    funcptr = rhs.address
    
    @nb.njit('(float64)(float64, float64, float64, int16)', nopython=NOPYTHON, parallel=PARALLEL, cache=CACHE)
    def solve(a, b, c, funcptr):
    
        w0 = np.array([a, b, ...], dtype=np.float64)
        p  = np.array([c], dtype=np.float64)
    
        x = np.linspace(0, 100, 500)
    
        usol, success = lsoda(funcptr, w0, x, data=p)
        
        return usol[-1][1]
    
    

    Then I use another file to solve the systems one after the others

    
    from numba_func import solve, funcptr
    
    gs = []
    for a, b, c in zip(as, bs, cs):
        gs = np.append(gs, solve(a, b, c, funcptr))
    

    I got the following warning:

    NumbaWarning: Cannot cache compiled function "solve" as it uses dynamic globals (such as ctypes pointers and large global arrays)
    

    I guess the idea is to pass the variable funcptr correctly so that numba is happy but so far I failed...

    opened by edumur 4
  • Strange results when t_eval has repeat values

    Strange results when t_eval has repeat values

    Thanks for the great package!

    Summary: I'm trying to switch over from using scipy's odeint to NumbaLSODA as it seems much better! Sometimes the t_eval array I supply has repeated times at the end and this produces some strange output. This was okay with odeint though, would it be possible to allow it for NumbaLSODA too? I think it is trying to integrate an infinitely small timestep rather than just evaluating the same timestep multiple times.

    Details: I'm trying to add this to LEGWORK so that we can speed up how we evolve the orbits of binary stars due to gravitational wave emission. However, if the evolution gets too close to the merger then LSODA produces a bunch of warnings about the timesteps being too small (since the derivatives get so large). To avoid this, I set any timesteps that are too close to the merger equal to the last allowable timestep - this leads to repeated values and the weird results that I see. (I don't just trim the array since I do this in a 2D array with a collection of binaries and so each individual binary might have a different length array which would wouldn't work in a 2D array I think.)

    MWE: Evolve binary evolution due to gravitational wave emission using Peters (1964)

    from numba import cfunc
    import numba as nb
    import numpy as np
    from NumbaLSODA import lsoda_sig, lsoda
    
    @cfunc(lsoda_sig)
    def de_dt(t, e, de, data):
        e_ = nb.carray(e, (1,))
        beta, c_0 = nb.carray(data, (2,))
        de[0] = -19 / 12 * beta / c_0**4 * (e_[0]**(-29.0 / 19.0) * (1 - e_[0]**2)**(3.0/2.0)) \
            / (1 + (121./304.) * e_[0]**2)**(1181./2299.)
    
    ecc = 0.5
    beta = 2.47e22
    c_0 = 1.677e10
    t_merge = 1.81e17
    
    funcptr = de_dt.address
    u0 = np.array([ecc])
    data = np.array([beta, c_0])
    t_eval = np.array([0.0, 0.1, 0.2, 0.5, 0.8, 0.9, 0.9, 0.9]) * t_merge
    
    ecc_evol, _ = lsoda(funcptr, u0, t_eval=t_eval, data=data)
    print(ecc_evol)
    

    This outputs

    [[ 5.00000000e-01]
     [ 4.83244343e-01]
     [ 4.64840518e-01]
     [ 3.95592532e-01]
     [ 2.82404708e-01]
     [ 2.15245963e-01]
     [-1.36311572e+57]
     [-1.36311572e+57]]
    

    So you can see that as soon as the timesteps repeat, you get this hugely small number (and eccentricity should be between 0 and 1)

    opened by TomWagg 3
  • Cannot import dop853 from numbalsoda

    Cannot import dop853 from numbalsoda

    Hi! I've just installed your library and got this error when running the code shown in the readme. ImportError: cannot import name 'dop853' from 'numbalsoda' Could I be missing a library? Thanks in advance

    opened by samirop 2
  • do you have a tip how to workaround the global variable?

    do you have a tip how to workaround the global variable?

    Hi,

    thanks for the great code, it's very fast.

    I'm struggling though with a need for a global variable, or maybe an argument that is updated by the ODE itself

    @nb.cfunc(lsoda_sig)
    def particle_ode(t, y_, dydt, p_):
        y = nb.carray(y_,(3,))
        p = nb.carray(p_,(11,))
        rho1, rho, nu , Cd, g, rhop, d, zu, zl, Vc0, trec = p
       
        global tzl # <---- here 
    
                               
        zp   = y[0]                # particle position      [m]
        V    = y[1]                # particle velocity      [m/s]
        tzl =  y[2] 
        Cam = 0.5                  # added mass coefficient [-]
    
        # determine Vc
        if (y[0]<= zl):
            Vc = Vc0
            tzl = t     # < ----- here I need to adjust this as we move through with y[0]
        else:
            Vc = Vc0 * np.exp(-1*((t-tzl)/trec))
        FS = (rho(zp) - rho1)*Vc*g
    
        # force balance on the particle 
        dzpdt = V
        dVdt  = ( (rhop-rho(zp)) * Vp(d) * g \
                - 0.5 *Cd(V*d/nu(zp))* rho(zp) * Ap(d) * np.abs(V) * V \
                - FS \
                ) / (rhop*Vp(d) + Cam*rho(zp)*Vp(d))
    
        dydt = [dzpdt, dVdt, 0.0 ]
    funcptr = particle_ode.address
    
    opened by alexlib 2
  • Unexpected behavior of NumbaLSODA and nb.cufnc in parallel mode

    Unexpected behavior of NumbaLSODA and nb.cufnc in parallel mode

    Hi There, Thanks for sharing this package. I am seeing up to 20x speedup in my code. As I am simulating the evolution of multiple initial conditions, I want to speed up the code further by using parallel=True flag in Numba. I adapted the example you provided in Stackoverflow for my code to use NumbaLSODA with parallel=True flag in Numba. However, I observed that the speedup was marginal even with 8 cores . In some cases, my parallel code was actually slower (!).

    After debugging on the 2 state example using the code below, I narrowed the reason down to the way the parameters in the rhs function are defined. if the parameters are either global variables or passed using the data argument in lsoda, the parallel = True speeds up the code almost linearly versus the number of cores. However, if the parameters are defined locally as in f_local below, the speed up is marginal. In fact, the parallel version using f_local is slower than the series version using f_global or f_param.

    I thought that local declarations of variables is a good coding practice and it speeds up code execution, but this does not seem to be the case with Numba and NumbaLSODA. I do not know if this behavior is caused by cfunc in Numba or NumbaLSODA. It was definitely unexpected and I thought I will bring it to your notice. Do you know the reason for this behavior? Are local constants not compiled just as global constants? I am unable to dig deeper into the reason and was wondering if you could help. Best Amar

    from NumbaLSODA import lsoda_sig, lsoda
    from matplotlib import pyplot as plt
    import numpy as np
    import numba as nb
    import time
    
    a_glob=np.array([1.5,1.5])
    
    @nb.cfunc(lsoda_sig)
    def f_global(t, u_, du, p): # variable a is global
        u = nb.carray(u_, (2,))
        du[0] = u[0]-u[0]*u[1]*a_glob[0]
        du[1] = u[0]*u[1]-u[1]*a_glob[1]
    
        
    @nb.cfunc(lsoda_sig)
    def f_local(t, u_, du, p): # variable a is local
        u = nb.carray(u_, (2,))
        a = np.array([1.5,1.5]) 
        du[0] = u[0]-u[0]*u[1]*a[0]
        du[1] = u[0]*u[1]-u[1]*a[1]
    
        
    @nb.cfunc(lsoda_sig)
    def f_param(t, u_, du, p): # pass in a as a paameter
        u = nb.carray(u_, (2,))
        du[0] = u[0]-u[0]*u[1]*p[0]
        du[1] = u[0]*u[1]-u[1]*p[1]
    
    funcptr_glob = f_global.address
    funcptr_local = f_local.address
    funcptr_param = f_param.address
    t_eval = np.linspace(0.0,20.0,201)
    np.random.seed(0)
    a = np.array([1.5,1.5])
    
    @nb.njit(parallel=True)
    def main(n, flag):
    #     a = np.array([1.5,1.5])
        u1 = np.empty((n,len(t_eval)), np.float64)
        u2 = np.empty((n,len(t_eval)), np.float64)
        for i in nb.prange(n):
            u0 = np.empty((2,), np.float64)
            u0[0] = np.random.uniform(4.5,5.5)
            u0[1] = np.random.uniform(0.7,0.9)
            if flag ==1: # global
                usol, success = lsoda(funcptr_glob, u0, t_eval, rtol = 1e-8, atol = 1e-8)
            if flag ==2: # local
                usol, success = lsoda(funcptr_local, u0, t_eval, rtol = 1e-8, atol = 1e-8)
            if flag ==3: # param
                usol, success = lsoda(funcptr_param, u0, t_eval, data = a, rtol = 1e-8, atol = 1e-8)
            u1[i] = usol[:,0]
            u2[i] = usol[:,1]
        return u1, u2
    
    @nb.njit(parallel=False)
    def main_series(n, flag): # same function as above but with parallel flag = False
    #     a = np.array([1.5,1.5])
    u1 = np.empty((n,len(t_eval)), np.float64)
    u2 = np.empty((n,len(t_eval)), np.float64)
    for i in nb.prange(n):
        u0 = np.empty((2,), np.float64)
        u0[0] = np.random.uniform(4.5,5.5)
        u0[1] = np.random.uniform(0.7,0.9)
        if flag ==1: # global
            usol, success = lsoda(funcptr_glob, u0, t_eval, rtol = 1e-8, atol = 1e-8)
        if flag ==2: # local
            usol, success = lsoda(funcptr_local, u0, t_eval, rtol = 1e-8, atol = 1e-8)
        if flag ==3: # param
            usol, success = lsoda(funcptr_param, u0, t_eval, data = a, rtol = 1e-8, atol = 1e-8)
        u1[i] = usol[:,0]
        u2[i] = usol[:,1]
    return u1, u2
    
    n = 100
    # calling first time for JIT compiling
    u1, u2 = main(n,1)
    u1, u2 = main(n,2)
    u1, u2 = main(n,3)
    
    u1, u2 = main_series(n,1)
    u1, u2 = main_series(n,1)
    u1, u2 = main_series(n,1)
    
    # Running code for large number 
    n = 10000
    a1 = time.time()
    u1, u2 = main(n,1) # global
    a2 = time.time()
    print(a2 - a1) # this is fast
    
    
    a1 = time.time()
    u1, u2 = main(n,2) # local
    a2 = time.time()
    print(a2 - a1) # this is slow
    
    a1 = time.time()
    u1, u2 = main(n,3) # param
    a2 = time.time()
    print(a2 - a1) # this is fast and almost identical performance as global
    
    a1 = time.time()
    u1, u2 = main_series(n,1) # global
    a2 = time.time()
    print(a2 - a1) # this is faster than local + parallel
    
    a1 = time.time()
    u1, u2 = main_series(n,2) # local
    a2 = time.time()
    print(a2 - a1) # this is slow
    
    a1 = time.time()
    u1, u2 = main_series(n,3) # param
    a2 = time.time()
    print(a2 - a1) # this is fast and almost identical performance as global
    
    opened by amar-iastate 2
  • Easier passing data to rhs

    Easier passing data to rhs

    Thanks for the package, some 400x faster than plain Python for the system I'm looking at. I just wanted to point out that Numba functions close over their local scope so passing arbitrary (constant?) data can be much easier than in your example. Mine looks like this:

    def make_lsoda_func(c,R,dstar,E,F,N):
        @cfunc(lsoda_sig)
        def rhs(t, x, du, p):
            codim3(t,x,c,R,dstar,E,F,N,du)
        return rhs
    
    opened by maedoc 2
  • Potential interpolant issue

    Potential interpolant issue

    Hi @Nicholaswogan,

    Great work on this repository, it looks like you managed a consistent and quite high speed-up compared to SciPy. I have been working with LSODA recently using solve_ivp, and I have encountered an issue relative to the interpolant (PR at SciPy). I am wondering if this implementation of LSODA also has a similar issue, as it looks to me that the root of the problem lies within the FORTRAN implementation. In particular, LSODA::intdy seems very similar to me to DINTDY in ODEPACK, which could well mean that the issue is present. Let me know if you can have a look.

    opened by Patol75 2
  • Addition of a Runge-Kutta version

    Addition of a Runge-Kutta version

    Good afternoon @Nicholaswogan.

    As discussed, here is a PR for the addition of the RK45 explicit solver. I still have a few things to add like the comparison to Scipy and the performance tests, but you can already have a look.

    Note that I only implemented Fehlberg RK45 here, but I wrote it in a sufficient general manner to implement any explicit RK described by its Butcher tableau.

    To-Do list

    • [x] It seems there is a bug with machine-epsilon requested absolute tolerance (atol = 1e-30): the timesteps get infinitesimally smaller and the solver never finishes.
    opened by tbridel 1
  • Adding other Runge-Kutta time integrators

    Adding other Runge-Kutta time integrators

    Hye again !

    Could you please share the steps one would have to take to add a new Runge-Kutta integrator to the project ?

    I'm happy to implement it push a PR at some point, but I wanted to know if you had maybe a process that would make it easier to add a new Butcher-tableau based RK integrator ?

    Thanks ! T. Bridel-Bertomeu

    opened by tbridel 2
  • Adaptive timestepping and hijacking the

    Adaptive timestepping and hijacking the "step" method

    Hi again @Nicholaswogan !

    I was wondering if you ever thought about the adaptive timestepping capability that exists in Scipy LSODA ? Is anything like this possible with your implementation ?

    Thanks ! Best, T. Bridel-Bertomeu

    opened by tbridel 12
  • Installation via pip produces error when importing numbalsoda

    Installation via pip produces error when importing numbalsoda

    Hi Nick, thank you very much for building this Python package, it definitely appears very promising!

    I installed the latest version of numbalsoda (0.3.4) via pip and the installation was successful. I am using Python 3.9.7 and conda version 22.9.0.

    However, when trying to import it, I get the following error:

    'The procedure entry point ZSt28​__throw​_bad_array_new_lengthv could not be located in the dynamic link library C:\Users...\numbalsoda\liblsoda.dll'

    Installing numbalsoda via conda (conda install -c conda-forge numbalsoda) solves the issue and the package is successfully imported into the Python script. Conda installs version 0.3.2, and thus I was wondering if the difference in the versions may play a role in this issue.

    Thank you very much!

    opened by kf120 0
  • Backward in time using numbalsoda

    Backward in time using numbalsoda

    Hi

    I am using numblsoda to solve couples of first order differential equations. I need to solve those equations backward in time for some particular situations. I could not figure out how I can do that. I tried to use decreasing time step but it does not work as it works for solve_ivp and odeint.

    for instance in this code how can I go backward in time?

    from numbalsoda import lsoda_sig, lsoda, dop853
    from numba import njit, cfunc
    import numpy as np
    
    @cfunc(lsoda_sig)
    def rhs(t, u, du, p):
        du[0] = u[0]-u[0]*u[1]
        du[1] = u[0]*u[1]-u[1]*p[0]
    
    funcptr = rhs.address # address to ODE function
    u0 = np.array([5.,0.8]) # Initial conditions
    data = np.array([1.0]) # data you want to pass to rhs (data == p in the rhs).
    t_eval = np.arrange(0.0,50.0,0.1) # times to evaluate solution
    
    
    usol, success = lsoda(funcptr, u0, t_eval, data = data)
    

    I tried t_eval = np.arrange(50, 0, -0.1) but it does not work. It would be great if someone could help me to fix this issue.

    opened by meysam-motaharfar 1
  • Make NumPy types for u0 and usol configurable

    Make NumPy types for u0 and usol configurable

    First of all, awesome code! I got a speedup of an order of magnitude basically for free!

    But, as far as I can tell, u0 and usol are required to be of type np.float64. However, often times np.float32 is more than enough.

    Maybe I just don't see the option to change it, but if it doesn't exist, it'd be nice if we could change it.

    Thank you!

    opened by juhannc 1
Releases(v0.3.5)
Owner
Nick Wogan
Planetary Scientist, PhD Student, developing models of atmospheric evolution.
Nick Wogan
Where2Act: From Pixels to Actions for Articulated 3D Objects

Where2Act: From Pixels to Actions for Articulated 3D Objects The Proposed Where2Act Task. Given as input an articulated 3D object, we learn to propose

Kaichun Mo 69 Nov 28, 2022
Training BERT with Compute/Time (Academic) Budget

Training BERT with Compute/Time (Academic) Budget This repository contains scripts for pre-training and finetuning BERT-like models with limited time

Intel Labs 263 Jan 07, 2023
gitγ€ŠCommonsense Knowledge Base Completion with Structural and Semantic Context》(AAAI 2020) GitHub: [fig1]

Commonsense Knowledge Base Completion with Structural and Semantic Context Code for the paper Commonsense Knowledge Base Completion with Structural an

AI2 96 Nov 05, 2022
Forecasting with Gradient Boosted Time Series Decomposition

ThymeBoost ThymeBoost combines time series decomposition with gradient boosting to provide a flexible mix-and-match time series framework for spicy fo

131 Jan 08, 2023
My course projects for the 2021 Spring Machine Learning course at the National Taiwan University (NTU)

ML2021Spring There are my projects for the 2021 Spring Machine Learning course at the National Taiwan University (NTU) Course Web : https://speech.ee.

Ding-Li Chen 15 Aug 29, 2022
[ICLR 2021] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

CPT: Efficient Deep Neural Network Training via Cyclic Precision Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin Accep

26 Oct 25, 2022
Simple STAC Catalogs discovery tool.

STAC Catalog Discovery Simple STAC discovery tool. Just paste the STAC Catalog link and press Enter. Details STAC Discovery tool enables discovering d

Mykola Kozyr 21 Oct 19, 2022
StarGAN2 for practice

StarGAN2 for practice This version of StarGAN2 (coined as 'Post-modern Style Transfer') is intended mostly for fellow artists, who rarely look at scie

vadim epstein 87 Sep 24, 2022
This is the pytorch implementation for the paper: *Learning Accurate Performance Predictors for Ultrafast Automated Model Compression*, which is in submission to TPAMI

SeerNet This is the pytorch implementation for the paper: Learning Accurate Performance Predictors for Ultrafast Automated Model Compression, which is

3 May 01, 2022
Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation", Haoxiang Wang, Han Zhao, Bo Li.

Bridging Multi-Task Learning and Meta-Learning Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Trainin

AI Secure 57 Dec 15, 2022
Code for the IJCAI 2021 paper "Structure Guided Lane Detection"

SGNet Project for the IJCAI 2021 paper "Structure Guided Lane Detection" Abstract Recently, lane detection has made great progress with the rapid deve

Jinming Su 27 Dec 08, 2022
Leveraging Social Influence based on Users Activity Centers for Point-of-Interest Recommendation

SUCP Leveraging Social Influence based on Users Activity Centers for Point-of-Interest Recommendation () Direct Friends (i.e., users who follow each o

Kosar 8 Nov 26, 2022
Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020

XDVioDet Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020. The proj

peng 64 Dec 12, 2022
Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight)

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight) Abstract Due to the limited and even imbalanced dat

Hanzhe Hu 99 Dec 12, 2022
tensorflow code for inverse face rendering

InverseFaceRender This is tensorflow code for our project: Learning Inverse Rendering of Faces from Real-world Videos. (https://arxiv.org/abs/2003.120

Yuda Qiu 18 Nov 16, 2022
Decentralized Reinforcment Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

Decentralized Reinforcement Learning This is the code complementing the paper Decentralized Reinforcment Learning: Global Decision-Making via Local Ec

40 Oct 30, 2022
Minimal implementation of PAWS (https://arxiv.org/abs/2104.13963) in TensorFlow.

PAWS-TF 🐾 Implementation of Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples (PAWS)

Sayak Paul 43 Jan 08, 2023
This repository consists of Blender python scripts and corresponding assets to generate variants of the CANDLE dataset

candle-simulator This repository consists of Blender python scripts and corresponding assets to generate variants of the IITH-CANDLE dataset. The rend

1 Dec 15, 2021
library for nonlinear optimization, wrapping many algorithms for global and local, constrained or unconstrained, optimization

NLopt is a library for nonlinear local and global optimization, for functions with and without gradient information. It is designed as a simple, unifi

Steven G. Johnson 1.4k Dec 25, 2022
Tensorflow implementation of DeepLabv2

TF-deeplab This is a Tensorflow implementation of DeepLab, compatible with Tensorflow 1.2.1. Currently it supports both training and testing the ResNe

Chenxi Liu 21 Sep 27, 2022