Taichi is a parallel programming language for high-performance numerical computations.

Overview

Taichi (太极) is a parallel programming language for high-performance numerical computations. It is embedded in Python, and its just-in-time compiler offloads compute-intensive tasks to multi-core CPUs and massively parallel GPUs.

Advanced features of Taichi include spatially sparse computing, differentiable programming [examples], and quantized computation.

Please check out our SIGGRAPH 2020 course on Taichi basics: YouTube, Bilibili, slides (pdf).

Chinese video tutorial: [Bilibili], [slides]

Examples (More...)

Installation

python3 -m pip install taichi

Supported OS: Windows, Linux, Mac OS X; Python: 3.6-3.9 (64-bit only); Backends: x64 CPUs, CUDA, Apple Metal, Vulkan, OpenGL Compute Shaders.

Please build from source for other configurations (e.g., your CPU is ARM, or you want to try out our experimental C backend).

Developer Installation

Please follow this doc to learn how to build Taichi from source. Note that Taichi requires LLVM-10.0.0, and it is recommended to use our prebuilt LLVM libraries for each platform.

Contributors

Note: contributor avatars above are randomly shuffled.


We welcome feedback and comments. If you would like to contribute to Taichi, please check out our Contributor Guidelines.

If you use Taichi in your research, please cite related papers:

Links

Security

Please disclose security issues responsibly to [email protected].


1. TaichiZoo is still in its Beta version. If you've encountered any issue, please do not hesitate to file a bug.

Comments
  • [IR] Use JIT compilation/evaluation for systematic constant folding

    Related issue = #510

    We now support more operation types than just add and mul. This also adds many unary ops beyond casting.

    opened by archibate 103
  • pip-installed Taichi crashes on Google colab kernels

    Opening an empty CPU-backed notebook at https://colab.research.google.com and running the following code leads to crash:

    !apt install clang-7
    !apt install clang-format
    !pip install taichi-nightly
    
    import taichi as ti
    
    x, y = ti.var(ti.f32), ti.var(ti.f32)
    
    @ti.layout
    def xy():
      ti.root.dense(ti.ij, 16).place(x, y)
    
    @ti.kernel
    def laplace():
      for i, j in x:
        if (i + j) % 3 == 0:
          y[i, j] = 4.0 * x[i, j] - x[i - 1, j] - x[i + 1, j] - x[i, j - 1] - x[i, j + 1]
        else:
          y[i, j] = 0.0
    
    for i in range(10):
      x[i, i + 1] = 1.0
    
    laplace()
    
    for i in range(10):
      print(y[i, i + 1])
    

    And the relevant runtime logs say:

    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so:
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so:
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::Kernel::operator()()
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::Kernel::compile()
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::Program::compile(taichi::Tlang::Kernel&)
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::KernelCodeGen::compile(taichi::Tlang::Program&, taichi::Tlang::Kernel&)
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::CPUCodeGen::lower_cpp()
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::irpass::lower(taichi::Tlang::IRNode*)
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::LowerAST::visit(taichi::Tlang::Block*)
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so:
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so:
    Oct 30, 2019, 3:47:15 PM | WARNING | /lib/x86_64-linux-gnu/libc.so.6: abort
    Oct 30, 2019, 3:47:15 PM | WARNING | /lib/x86_64-linux-gnu/libc.so.6: gsignal
    Oct 30, 2019, 3:47:15 PM | WARNING | /lib/x86_64-linux-gnu/libc.so.6:
    Oct 30, 2019, 3:47:15 PM | WARNING | /usr/local/lib/python3.6/dist-packages/taichi/core/../lib/taichi_core.so: taichi::signal_handler(int)
    Oct 30, 2019, 3:47:15 PM | WARNING | ***************************
    Oct 30, 2019, 3:47:15 PM | WARNING | * Taichi Core Stack Trace *
    Oct 30, 2019, 3:47:15 PM | WARNING | ***************************
    Oct 30, 2019, 3:47:15 PM | WARNING | [E 10/30/19 14:47:15.371] Received signal 6 (Aborted)
    Oct 30, 2019, 3:47:15 PM | WARNING | [I 10/30/19 14:47:15.340] [base.cpp:[email protected]] Compilation time: 2889.9 ms
    Oct 30, 2019, 3:47:12 PM | WARNING | [T 10/30/19 14:47:12.056] [logging.cpp:[email protected]] Taichi core started. Thread ID = 122
    

    Could you please share some insight into the possible root cause, if you have one off the top of your head?

    welcome contribution stale 
    opened by znah 69
  • Advanced optimization

    Concisely describe the proposed feature

    With the new extensions introduced by #581, there is a lot of room to optimize the IR. I also found some feasible optimizations that are not directly related to the new extensions. For example, in this fragment of IR,

    ...
    <f32 x1> $5 = alloca
    if $26 {
      ...
    } else {
      ...
    }
    if $26 {
      ...
    } else {
      ...
    }
    <f32 x1> $83 = local load [ [$5[0]]] (the only statement about $5)
    ...
    

    we could merge the two if's together, change $83 to const [0], and then delete $5.
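
The if-merging step can be sketched on a toy IR. This is a minimal illustration, not Taichi's actual pass: the `Stmt` and `If` classes and the `merge_adjacent_ifs` function are hypothetical names, and the sketch assumes conditions are SSA values (such as $26) that cannot change between the two if's.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Stmt:
    """A plain, non-branching statement, identified only by its text."""
    text: str


@dataclass
class If:
    """An if/else statement whose condition is an SSA value name such as '$26'."""
    cond: str
    then_body: List["Node"] = field(default_factory=list)
    else_body: List["Node"] = field(default_factory=list)


Node = Union[Stmt, If]


def merge_adjacent_ifs(block: List[Node]) -> List[Node]:
    """Fuse neighbouring If nodes that test the same SSA condition.

    Safe only because an SSA value cannot be reassigned, so the condition
    has the same truth value in both statements.
    """
    merged: List[Node] = []
    for node in block:
        if isinstance(node, If):
            # Optimize the nested blocks first.
            node = If(node.cond,
                      merge_adjacent_ifs(node.then_body),
                      merge_adjacent_ifs(node.else_body))
            prev = merged[-1] if merged else None
            if isinstance(prev, If) and prev.cond == node.cond:
                # Concatenate the branches; re-run the merge because the
                # concatenation may create new adjacent if's inside.
                merged[-1] = If(prev.cond,
                                merge_adjacent_ifs(prev.then_body + node.then_body),
                                merge_adjacent_ifs(prev.else_body + node.else_body))
                continue
        merged.append(node)
    return merged


block = [
    Stmt("<f32 x1> $5 = alloca"),
    If("$26", [Stmt("t1")], [Stmt("e1")]),
    If("$26", [Stmt("t2")], [Stmt("e2")]),
    Stmt("<f32 x1> $83 = local load [ [$5[0]]]"),
]
optimized = merge_adjacent_ifs(block)
```

After the call, `optimized` holds three nodes: the alloca, a single `If` whose branches contain both original bodies in order, and the local load.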

    A list of optimizations I have done or am going to do:

    • [x] Basic algebraic simplification (#472, #502)
    • [x] Better algebraic simplification: -1 & a, 0 | a (#827)
    • [x] Lower linearized (#509)
    • [x] Variable optimization
      • [x] Dive into container statements to find local loads/stores for optimization (merge identical local loads, delete local stores if there are no following loads, etc.) (#662)
      • [x] Dive into container statements to merge identical global loads (#857)
      • [x] Optimize local loads of new alloca's to const [0] (#662)
      • [x] Local store elimination and forwarding (#788, #858, #859)
      • [x] Global store elimination and forwarding (#857)
    • [x] Merge adjacent if's with identical condition (#668)
    • [x] Move common statements in both branches outside if's (thanks to @archibate's discussion) (#727)
    • [x] Add a WholeKernelCSE pass (#727, #1082)
    • [x] Eliminate WhileControlStmt with cond == const [1] (#829)
    • [x] Eliminate assertions with non-zero const conditions (#877)
    • [x] Improve optimization for OffsetAndExtractBitsStmt (#851)
    • [x] DIE for stack pop (#1324)
    • [x] Allocate stacks with sizes on demand (#2438)
    • [x] Extract consts to top-level after offloading (#897)

    Additional comments

    For benchmarking, we may want to introduce a temporary boolean variable as the switch of optimization.

    Some nice slides: https://courses.cs.washington.edu/courses/cse401/08wi/lecture/opt-mark.v2.pdf

    feature request 
    opened by xumingkuan 64
  • [OpenGL] Ship OpenGL backend

    Concisely describe the proposed feature

    As we approach v0.6, it's time to think about how to ensure users/developers can easily access the OpenGL backend. I'm setting up testing environments on my GPU machines, yet I find that the OpenGL build process is not portable enough. For example, users may have to apt install libglew-dev libglfw3-dev on Ubuntu.

    Describe the solution you'd like (if any)

    Fortunately, the two libraries are fast to build ( < 10 seconds) and generate small binaries (< 1MB). So we can simply build from source and statically link them into libtaichi_core.so, just like LLVM.

    1. Add the desired versions of glew and glfw (3.3.2 I guess) as submodules of Taichi.
    2. Integrate their CMakeLists.txt using add_subdirectory. Make sure we use static linking.
    3. Update the buildbots to build OpenGL by default.

    Additional comments

    Ultimately we might want to remove the dependency on glew and glfw. We only use limited functionalities of these libraries. For example, we only use glfw to create the OpenGL context - not sure if there's an easy way without introducing a dependency. For now, we can stick to the current solution.

    feature request welcome contribution opengl 
    opened by yuanming-hu 45
  • Join us as a new Taichi contributor!

    Below are features that are relatively friendly to newcomers. Everyone is welcome to participate :-)

    Welcome contribution

    Issue | Link
    -- | --
    Support more atomic operations | https://github.com/taichi-dev/taichi/issues/2675
    Support efficient operations for large matrix | https://github.com/taichi-dev/taichi/issues/2696
    [refactor] Make Callable::Args/Ret serializable | https://github.com/taichi-dev/taichi/issues/2625
    Deprecate Expr::operator= | https://github.com/taichi-dev/taichi/issues/2684
    [vulkan] Clear unnecessary GLSL shader files on Vulkan backend | https://github.com/taichi-dev/taichi/issues/2711
    Put repeating code blocks in Github CI workflows into scripts. | https://github.com/taichi-dev/taichi/issues/2715
    Invoke materialize() only before calling a Taichi kernel | https://github.com/taichi-dev/taichi/issues/2730

    feature request welcome contribution 
    opened by lucywsq 40
  • Experimental Metal backend

    Is your feature request related to a problem? Please describe.

    I'd like to add a Metal backend to taichi, so as to allow Mac users to enjoy the GPU acceleration, too.

    Describe the solution you'd like

    Halide already supports a Metal backend, and there is a lot to learn from its codebase. Specifically, they used source-to-source codegen to translate Halide to Metal compute kernels (not LLVM IR). They also wrapped Metal APIs into C++ via the objc-runtime APIs (taichi is also using this approach for its GUI).

    After talking to @yuanming-hu , I think we can start by supporting dense first. This requires two things:

    • scattered read/write (access arbitrary memory address in the buffer)
    • atomic operations (e.g. atomic_add)

    Metal supports both fairly well.

    I also need to figure out if the memory returned by the existing/next-gen memory allocator can be used by Metal kernels.

    Describe alternatives you've considered

    Supporting OpenGL compute shader may be appealing to a broader audience scope. However, due to my dev environment setup and working experience, I feel more comfortable working on Metal. If this works, it should be useful for helping design the OpenGL backend as well.

    Additional context

    Some references

    • Halide
    • mtlpp: This is another library that exposes Metal APIs to C++, but it's implemented in Obj-C.
    feature request mac 
    opened by k-ye 36
  • Access out-of-bound checking on CPU backends

    Concisely describe the proposed feature

    Currently, out-of-bound tensor accesses lead to undefined behavior. Oftentimes Taichi will just yield a wrong result without a loud failure. A lot of users have complained about this. For example,

    import taichi as ti
    
    ti.init(arch=ti.cuda, debug=True)
    
    x = ti.var(ti.i32, shape=(8, 8, 8))
    
    @ti.kernel
    def boom():
      x[0, 0, 7] = 1 # OK
      x[0, 0, 8] = 1 # Access out of bound!
                     # Should trigger assertion failure instead of corrupting memory
                     # Should raise TaichiRuntimeError(
                     #      "Accessing Tensor of Size [8, 8, 8] with indices (0, 0, 8)")
    

    Describe the solution you'd like

    We can modify the codegen to add that check for every tensor access (before the lower_access IR pass) in debug mode.

    How to get the tensor bounds: [0, SNode::extractors[physical_index_position[i]].num_elements)

    Code Generation

    First generate a set of compares for bound checking, and then summarize the results using BinaryOp::and. Pass the result to an AssertStmt. Note that AssertStmt now takes only a condition and a fixed message. Maybe we need to extend it with (condition, format_str, values) so that "Accessing Tensor of Size [8, 8, 8] with indices (0, 0, 8)" can be generated via ("Accessing Tensor of Size [8, 8, 8] with indices (%d, %d, %d)", input_i, input_j, input_k).

      int physical_index_position[taichi_max_num_indices]{};
      // physical indices are (ti.i, ti.j, ti.k, ti.l, ...)
      // physical_index_position[i] =
      // which physical index does the i-th virtual index (the one exposed to
      // programmers) refer to? i.e. in a[i, j, k], "i", "j", and "k" are virtual
      // indices.
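
The check described under Code Generation can be sketched in plain Python: one comparison per index, the results combined with a logical and, and a formatted message on failure. The names `check_bounds` and `OutOfBoundsError` are hypothetical and only stand in for the generated IR and the proposed TaichiRuntimeError; this is not Taichi's implementation.

```python
from typing import Sequence


class OutOfBoundsError(IndexError):
    """Hypothetical stand-in for the runtime error proposed in this issue."""


def check_bounds(shape: Sequence[int], indices: Sequence[int]) -> None:
    """Raise OutOfBoundsError unless 0 <= indices[k] < shape[k] for every axis k."""
    if len(shape) != len(indices):
        raise ValueError("expected %d indices, got %d" % (len(shape), len(indices)))

    # One comparison per axis, summarized with a logical 'and'
    # (the role BinaryOp::and plays in the proposal).
    in_bounds = True
    for index, extent in zip(indices, shape):
        in_bounds = in_bounds and (0 <= index < extent)

    # The assertion: a condition plus a message formatted from runtime values.
    if not in_bounds:
        raise OutOfBoundsError(
            "Accessing Tensor of Size [%s] with indices (%s)"
            % (", ".join(map(str, shape)), ", ".join(map(str, indices)))
        )


check_bounds((8, 8, 8), (0, 0, 7))  # in bounds, returns silently

try:
    check_bounds((8, 8, 8), (0, 0, 8))
    message = None
except OutOfBoundsError as error:
    message = str(error)
```

The second call fails on the last axis, so `message` carries the tensor size and the offending indices in the format suggested above.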
    

    Additional comments

    • pbf2d.py sometimes crashes. It might be related to an access out-of-bound. Implementing this will help us diagnose.
    • The first step is to add bound checking on CPU (x64), and ultimately we may also want to implement that on GPUs. Taichi GPU programmers can, of course, switch back to CPUs to check for access out-of-bounds, which should work in most cases.
    feature request welcome contribution 
    opened by yuanming-hu 35
  • [Lang] Support Python-scope matrix/vector operations

    Related issue = #1008

    NOT READY YET

    Implementation plan and thoughts

    • [x] Figure out where to put the runtime scope checks for +, -, @ (actually it also covers other binary ops), and add a placeholder for now
    • [x] Implement Python-scope operations for Matrix/Vector.
    • [x] Implement Python-scope operations for tensor of Matrices (Matrix.at).
    • [x] Implement Python-scope print support for Matrix/Vector.
    • [x] Implement Python-scope print support for tensor of Matrices (Matrix.at).
    • [ ] Hook Proxy to get rid of Matrix.at.
    • [x] Add test cases.
      • [x] Test single tensor (matrix).
      • [x] Test tensor of tensors (matrices).
    • [ ] Update documentation.
    • [ ] ~(maybe in a separate PR) Add a matrix example.~

    Yak shaving

    • Removed a duplicate test case for numpy io, also fixed the flipped names.
    • Very tiny doc typo fix.

    Open Questions

    • ~It seems we've been using assert over raise Exception across the codebase. While they work similarly in most cases, assert statements are removed when the Python compilation is optimized (e.g. python -O examples/***.py), and thus assertions will be unsafely bypassed. Do we have special considerations about why we chose asserts?~
    • ~For the places that are raising exceptions, I noticed that a bare Exception is often thrown, which does not seem like good practice to me. As we start to discuss refactoring the operations and math libs, it might be a good chance to have more explicit Exceptions defined in exception.py?~

    Discussion moved to #1066
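
The assert concern raised in the first open question can be reproduced without starting a second interpreter: the built-in compile() takes an optimize argument, where 0 keeps assert statements and 1 strips them as python -O would. The function name `withdraw` and the sample source are made up for this illustration.

```python
SOURCE = '''
def withdraw(balance, amount):
    assert amount <= balance, "insufficient funds"
    return balance - amount
'''


def load_withdraw(optimize):
    """Compile SOURCE at the given optimization level and return withdraw()."""
    namespace = {}
    code = compile(SOURCE, "<assert-demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["withdraw"]


checked = load_withdraw(0)    # like plain `python`: asserts are kept
unchecked = load_withdraw(1)  # like `python -O`: asserts are removed

try:
    checked(10, 15)
    assertion_fired = False
except AssertionError:
    assertion_fired = True

silent_result = unchecked(10, 15)  # no error: the guard no longer exists
```

With the assert stripped, the invalid call goes through and returns a negative balance, which is why an explicit raise is the safer choice for checks that must always run.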

    stale GAMES201 
    opened by rexwangcc 34
  • Support break in non-parallel for statements

    Concisely describe the proposed feature

    @ti.kernel
    def sums():
      for i in range(n):
        is_prime[i] = 1
        for j in range(2, i):  # start at 2: j = 0 would divide by zero, j = 1 divides everything
          if i % j == 0:
            is_prime[i] = 0
          if j * j > i:
            break # This is not supported now...
    

    Describe the solution you'd like (if any)

    Add an IR pass that lowers non-parallel fors with break statements into while statements. While statements already support breaks (WhileControlStmt).
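
The proposed lowering can be shown at the source level in plain Python. Both functions below are made up for this illustration; the second is what the first would look like after a range-for with a break is rewritten as a while loop with an explicit counter.

```python
def smallest_divisor_for(n):
    """Original form: a range-for loop that exits early with break."""
    found = 0
    for j in range(2, n):
        if n % j == 0:
            found = j
            break
    return found


def smallest_divisor_while(n):
    """Lowered form: the same loop expressed as a while statement."""
    found = 0
    j = 2                # loop variable initialized to the range start
    while j < n:         # range end becomes the loop condition
        if n % j == 0:
            found = j
            break        # break is already supported inside while
        j += 1           # increment placed at the end of the body
    return found


results_match = all(
    smallest_divisor_for(n) == smallest_divisor_while(n) for n in range(2, 200)
)
```

The increment has to stay at the very end of the body, so a real pass would also need to route any `continue` through it.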

    feature request welcome contribution 
    opened by yuanming-hu 34
  • Import taichi encounter taichi_core error

    I built Taichi from source, and importing taichi shows the error below. export PYTHONPATH=$TAICHI_REPO_DIR/python:$PYTHONPATH is set in the .bashrc of my Ubuntu 21.10 machine.

    import taichi
    Share object taichi_core import failed, check this page for possible solutions: https://docs.taichi.graphics/lang/articles/misc/install
    Traceback (most recent call last):
      File "", line 1, in
      File "/home/linaro/ssd/git/taichi/python/taichi/init.py", line 3, in
        from taichi._funcs import *
      File "/home/linaro/ssd/git/taichi/python/taichi/_funcs.py", line 3, in
        from taichi.lang import impl, matrix, ops
      File "/home/linaro/ssd/git/taichi/python/taichi/lang/init.py", line 3, in
        from taichi._lib import core as _ti_core
      File "/home/linaro/ssd/git/taichi/python/taichi/_lib/init.py", line 1, in
        from taichi._lib.utils import ti_core as core
      File "/home/linaro/ssd/git/taichi/python/taichi/_lib/utils.py", line 103, in
        ti_core = import_ti_core()
      File "/home/linaro/ssd/git/taichi/python/taichi/_lib/utils.py", line 54, in import_ti_core
        raise e from None
      File "/home/linaro/ssd/git/taichi/python/taichi/_lib/utils.py", line 43, in import_ti_core
        from taichi._lib.core import
    ImportError: cannot import name 'taichi_core' from 'taichi._lib.core' (/home/linaro/ssd/git/taichi/python/taichi/_lib/core/init.py)

    question 
    opened by Peterritche 30
  • [Lang] [refactor] Deprecate "as_vector=True" in Matrix.to_numpy/to_torch

    Related issue = #940 #833 #1016

    This PR also makes the stage 3 of #923 easier to implement.

    I strongly object to the ultra-massive-centralized-multiplexer in ti.Matrix.__init__:

    def __init__(self, n=1, m=1, dt=None, shape=None,
                 empty=False, layout=None, needs_grad=False,
                 keep_raw=False, rows=None, cols=None):
       ... # 100+ lines of if-else-if's, within just **one function**
    

    Functions should be short and sweet, and do just one thing. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do that well.

    ... said the Linux Kernel Coding Style, and I believe the 80x24 rule applies to Python too.

    Another measure of the function is the number of local variables. They shouldn’t exceed 5-10, or you’re doing something wrong. Re-think the function, and split it into smaller pieces. A human brain can generally easily keep track of about 7 different things, anything more and it gets confused. You know you’re brilliant, but maybe you’d like to understand what you did 2 weeks from now.

    Never mind local variables: the arguments alone number 11.


    Also note that the API initializing a ti.Matrix is highly centralized, GAMES201 students can easily get confused by error messages when they're not using the correct argument specification:

    ti.Matrix(n, m, dt, shape)  # matrix tensor
    ti.Matrix(n, dt, shape)  # vector tensor
    ti.Matrix([[a, b], [c, d]])  # taichi-scope matrix
    ti.Matrix([a, b])  # taichi-scope vector
    ti.Matrix(cols=[u, v])  # taichi-scope matrix, with col vectors
    ti.Matrix(rows=[u, v])  # taichi-scope matrix, with row vectors
    ti.Matrix(n, m, empty=True)  # empty matrix
    

    Don't you think a better design could be:

    ti.Matrix.var(n, m, dt, shape)  # matrix tensor
    ti.Vector.var(n, dt, shape)  # vector tensor
    ti.Matrix([[a, b], [c, d]])  # taichi-scope matrix
    ti.Vector([a, b])  # taichi-scope vector
    ti.Matrix.cols([u, v])  # taichi-scope matrix, with col vectors
    ti.Matrix.rows([u, v])  # taichi-scope matrix, with row vectors
    ti.Matrix.empty(n, m)  # empty matrix
    

    to give ti.Matrix.__init__ less duty, just like ti.Matrix.ones and ti.Matrix.zeros currently do?

    Or at least we can make ti.Matrix.__init__ serve as a router just redirecting to the actual functions like ti.Matrix.init_as_local and ti.Matrix.init_as_var, instead of putting those implementations inside of ti.Matrix.__init__ if-else-if's?
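
The named-constructor design argued for above can be sketched with a toy class. `Mat` is hypothetical and is not ti.Matrix; it only shows how classmethods keep `__init__` down to a single job while each construction path gets its own name and its own error messages.

```python
class Mat:
    """Toy dense matrix, used only to illustrate named constructors."""

    def __init__(self, entries):
        # __init__ does one thing: store a rectangular list of rows.
        data = [list(row) for row in entries]
        if not data or any(len(row) != len(data[0]) for row in data):
            raise ValueError("entries must form a non-empty rectangular grid")
        self.entries = data
        self.n = len(data)
        self.m = len(data[0])

    @classmethod
    def rows(cls, vectors):
        """Build a matrix whose rows are the given vectors."""
        return cls(vectors)

    @classmethod
    def cols(cls, vectors):
        """Build a matrix whose columns are the given vectors."""
        return cls(zip(*vectors))  # transpose rows <-> columns

    @classmethod
    def zeros(cls, n, m):
        """Build an n-by-m matrix filled with zeros."""
        return cls([[0] * m for _ in range(n)])


u, v = [1, 2], [3, 4]
by_rows = Mat.rows([u, v])   # entries: [[1, 2], [3, 4]]
by_cols = Mat.cols([u, v])   # entries: [[1, 3], [2, 4]]
blank = Mat.zeros(2, 3)      # entries: [[0, 0, 0], [0, 0, 0]]
```

A wrong call such as `Mat.cols(3)` now fails inside `cols`, so the traceback names the constructor the user actually meant to use.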

    Note that we should always try our best to make a well-polished matrix/vector API before Taichi v1.0.0 comes.


    I'm writing this many lines of text to convince you to trust my refactor decision, so please take a good look; tl;dr's are unwanted.

    GAMES201 
    opened by archibate 29
  • ti.sym_eig on identity matrix returns nan

    Describe the bug

    ti.sym_eig on identity matrix returns nan.

    To Reproduce

    import taichi as ti
    
    ti.init()
    @ti.kernel
    def test():
        A = ti.Matrix([[1., 2., 3.], [2., 3., 4.], [3., 4., 8.]])
        I = ti.Matrix.identity(ti.f32,3)
        eig, V = ti.sym_eig(I)
        print(V)
        eig1, V1 = ti.sym_eig(A)
        print(V1)
    
    test()
    

    Output

    [[-nan, -nan, -nan], [-nan, -nan, -nan], [-nan, -nan, -nan]]
    [[0.325306, 0.231804, -0.916757], [0.466246, 0.804128, 0.368771], [0.822673, -0.547398, 0.153510]]
    
    opened by xuan-li 0
  • [rhi] Update `create_pipeline` API and add support of VkPipelineCache

    Issue: #6832

    Brief Summary

    The create_pipeline API has been updated to new standards.

    In addition, we have added (optional) support for backend caches such as VkPipelineCache. This makes running cached Taichi programs even faster.

    opened by bobcao3 1
  • [aot] Precommit.ci clang-tidy doesn't apply to C-API

    Currently precommit.ci doesn't apply clang-tidy to the C-API sources in c_api/src and c_api/include. This should be resolved once we no longer need to introduce significant changes to the C-API.

    opened by PENGUINLIONG 0
Releases(v1.3.0)
  • v1.3.0(Nov 30, 2022)

    Deprecation Notice

    • Using sparse data structures on the Metal backend is now deprecated. The support for Dynamic SNode has been removed in v1.3.0, and the support for Pointer/Bitmasked SNode will be removed in v1.4.0.
    • The packed switch in ti.init() is now deprecated and will be removed in v1.4.0. See the feature introduction below for details.
    • ti.Matrix.rotation2d() is now deprecated and will be removed in v1.4.0. Use ti.math.rotation2d() instead.
    • To clearly distinguish vectors from matrices, transpose() on a vector is no longer allowed. If you want something like a @ b.transpose(), write a.outer_product(b) instead.
    • Ndarray: The arguments of ndarray type annotation element_dim, element_shape and field_dim will be deprecated in v1.4.0. The field_dim is renamed to ndim to make it more intuitive. element_dim and element_shape will be replaced by passing a matrix type into dtype argument. For example, the ti.types.ndarray(element_dim=2, element_shape=(3,3)) will be replaced by ti.types.ndarray(dtype=ti.matrix(3,3)).

    New features

    Dynamic SNode

    To support variable-length fields, Taichi provides dynamic SNodes. You can now use the dynamic SNode on fields of different data types, even struct fields and matrix fields. You can use x[i].append(...) to append an element, use x[i].length() to get the length, and use x[i].deactivate() to clear the list as shown in the following code snippet.

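    import taichi as ti
    
    ti.init(arch=ti.cpu)  # assumption: any backend that supports dynamic SNodes works here
    
    pair = ti.types.struct(a=ti.i16, b=ti.i64)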
    pair_field = pair.field()
    
    block = ti.root.dense(ti.i, 4)
    pixel = block.dynamic(ti.j, 100, chunk_size=4)
    pixel.place(pair_field)
    l = ti.field(ti.i32)
    ti.root.dense(ti.i, 5).place(l)
    
    @ti.kernel
    def dynamic_pair():
        for i in range(4):
            pair_field[i].deactivate()
            for j in range(i * i):
                pair_field[i].append(pair(i, j + 1))
            # pair_field = [[],
            #              [(1, 1)],
            #              [(2, 1), (2, 2), (2, 3), (2, 4)],
            #              [(3, 1), (3, 2), ... , (3, 8), (3, 9)]]
            l[i] = pair_field[i].length()  # l = [0, 1, 4, 9]
    

    Packed Mode

    Packed mode was introduced in v0.8.0 to allow users to trade runtime performance for memory usage. In v1.3.0, after the elimination of runtime overhead in common cases, packed mode has become the default mode. There's no longer any automatic padding behavior behind the scenes, so users can use fields and SNodes without surprise.
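
To give a feel for the memory side of this trade-off, the sketch below compares element counts with and without padding. It assumes that the old non-packed mode padded every axis up to the next power of two; the release notes only mention "automatic padding", so treat that rule and the helper names as assumptions of this illustration.

```python
from math import prod


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("extent must be positive")
    return 1 << (n - 1).bit_length()


def element_counts(shape):
    """Return (packed, padded) element counts for a dense field shape.

    'padded' models the assumed old behaviour: each axis rounded up to a
    power of two. 'packed' allocates exactly what the shape asks for.
    """
    packed = prod(shape)
    padded = prod(next_power_of_two(extent) for extent in shape)
    return packed, padded


packed, padded = element_counts((18, 65))
```

Under that assumption a field of shape (18, 65) needs 1170 elements when packed but 32 * 128 = 4096 when padded.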

    Sparse Matrix

    We introduce the experimental sparse matrix and sparse solver on the CUDA backend. The API is the same as on the CPU backend. Currently, only the f32 data type and the LLT linear solver are supported on CUDA. You can only use ti.ndarray to compute SpMV and linear solver operations. The float64 data type and other linear solvers are under implementation.

    Improvements

    Python Frontend

    • Matrix slicing now supports augmented assign (e.g. +=) besides assign.

    Taichi Examples

    1. Our user https://github.com/Linyou contributed an excellent example on instant ngp renderer PR #6673. Run taichi_ngp to check it out!

    [Developers only] LLVM15 upgrade

    Starting from v1.3.0, Taichi has upgraded its LLVM dependency to version 15.0.0. If you're interested in contributing or simply building Taichi from source, please follow our installation doc for developers. Note this change has no impact on Taichi users.

    Highlights

    • Documentation
      • Update the documentation about Dynamic SNode (#6752) (by Lin Jiang)
      • Stop mentioning packed mode (#6755) (by Yi Xu)
    • Language and syntax
      • Add deprecation warning for the removal of the packed switch (#6753) (by Yi Xu)
    • Metal backend
      • Raise deprecate warning and error when using sparse snodes on metal (#6739) (by Lin Jiang)

    Full changelog

    • [aot] Revert C-API Device capability improvements (#6772) (by PENGUINLIONG)
    • [aot] C-API Device capability improvements (#6702) (by PENGUINLIONG)
    • [aot] C-API to get available archs (#6766) (by PENGUINLIONG)
    • [doc] Update sparse matrix document (#6719) (by pengyu)
    • [autodiff] Separate non-linear operators to an individual class (#6700) (by Mingrui Zhang)
    • [bug] Fix dereferencing nullptr (#6763) (by Yi Xu)
    • [Doc] Update the documentation about Dynamic SNode (#6752) (by Lin Jiang)
    • [doc] Update dev install about clang version (#6759) (by Ailing)
    • [build] Improve TI_WITH_CUDA guards for CUDA related test cases (#6698) (by Zhanlue Yang)
    • [Lang] Add deprecation warning for the removal of the packed switch (#6753) (by Yi Xu)
    • [lang] Improve sparse matrix building on GPU (#6748) (by pengyu)
    • [aot] JSON serde (#6754) (by PENGUINLIONG)
    • [bug] MatrixType bug fix: Fix error with to_numpy() and from_numpy() (#6726) (by Zhanlue Yang)
    • [Doc] Stop mentioning packed mode (#6755) (by Yi Xu)
    • [lang] Get the length of dynamic SNode by x.length() (#6750) (by Lin Jiang)
    • [llvm] Support nested struct with matrix return value on real function (#6734) (by Lin Jiang)
    • [Metal] [error] Raise deprecate warning and error when using sparse snodes on metal (#6739) (by Lin Jiang)
    • [build] Integrate backward_cpp to test targets for enabling C++ stack trace (#6697) (by Zhanlue Yang)
    • [aot] Load AOT module from memory (#6692) (#6714) (by PENGUINLIONG)
    • [ci] Add dockerfile.ubuntu-18.04.amdgpu (#6736) (by Zeyu Li)
    • [doc] Update LLVM10 -> LLVM15 in installation guide (#6747) (by Zhanlue Yang)
    • [misc] Fix warnings of taichi examples (#6740) (by PGZXB)
    • [example] Ti-example: instant ngp renderer (#6673) (by Youtian Lin)
    • [build] Use a separate prebuilt llvm15 binary for manylinux environment (#6732) (by Ailing)
  • v1.2.2(Nov 15, 2022)

    Molten-vk version is downgraded to v1.1.10 to fix a few GGUI issues.

    Full changelog:

    • [build] Downgrade molten-vk version to v1.1.10 (#6564) (by Zhanlue Yang)
  • v1.2.1(Nov 1, 2022)

    This is a bug fix release for v1.2.0.

    Full changelog:

    • [mesh] Fix MeshTaichi warnings in CUDA backend (#6369) (by Chang Yu)
    • [Bug] Fix cache_loop_invariant_global_vars pass (#6462) (by Lin Jiang)
  • v1.2.0(Oct 25, 2022)

    Starting from the v1.2.0 release, Taichi follows semantic versioning: regular releases cut from the master branch bump the MINOR version, and the PATCH version is bumped only when cherry-picking critical bug fixes.

    Deprecation Notice

    Indexing multi-dimensional ti.ndrange() with a single loop index will be disallowed in future releases.

    Highlights

    New features

    Offline Cache

    We introduced the offline cache on CPU and CUDA backends in v1.1.0. In this release, we support this feature on other backends, including Vulkan, OpenGL, and Metal.

    • If your code behaves abnormally, disable offline cache by setting the environment variable TI_OFFLINE_CACHE=0 or offline_cache=False in the ti.init() method call and file an issue with us on Taichi's GitHub repo.
    • See Offline cache for more information.

    GDAR (Global Data Access Rule)

    A checker is provided for detecting potential violations of global data access rules.

    1. The checker only works in debug mode. To enable it, set debug=True when calling ti.init().
    2. Set validation=True when using ti.ad.Tape() to validate the kernels captured by ti.ad.Tape(). If a violation occurs, the checker pinpoints the line of code breaking the rules.

    For example:

    import taichi as ti
    ti.init(debug=True)
    
    N = 5
    x = ti.field(dtype=ti.f32, shape=N, needs_grad=True)
    loss = ti.field(dtype=ti.f32, shape=(), needs_grad=True)
    b = ti.field(dtype=ti.f32, shape=(), needs_grad=True)
    
    @ti.kernel
    def func_1():
        for i in range(N):
            loss[None] += x[i] * b[None]
    
    @ti.kernel
    def func_2():
        b[None] += 100
    
    b[None] = 10
    with ti.ad.Tape(loss, validation=True):
        func_1()
        func_2()
    
    """
    taichi.lang.exception.TaichiAssertionError:
    (kernel=func_2_c78_0) Breaks the global data access rule. Snode S10 is overwritten unexpectedly.
    File "across_kernel.py", line 16, in func_2:
        b[None] += 100
        ^^^^^^^^^^^^^^
    """
    

    Improvements

    Performance

    Improved Vulkan performance with loops (#6072) (by Lin Jiang)

    Python Frontend

    • PrefixSumExecutor is added to improve the performance of prefix-sum operations. The legacy prefix-sum function allocates auxiliary GPU buffers at every call, which causes a noticeable performance problem. The new PrefixSumExecutor avoids these repeated allocations: for arrays of the same length, it only needs to be initialized once and can then perform any number of prefix-sum operations without redundant field allocations. The prefix-sum operation is currently supported only on the CUDA backend. (#6132) (by Yu Zhang)

      Usage:

      N = 100
      arr0 = ti.field(dtype, N)
      arr1 = ti.field(dtype, N)
      arr2 = ti.field(dtype, N)
      arr3 = ti.field(dtype, N)
      arr4 = ti.field(dtype, N)
      
      # initialize arr0, arr1, arr2, arr3, arr4, ...
      # ...
      
      # Perform an inclusive, in-place parallel prefix sum;
      # only one executor is needed for arrays of a given length.
      executor = ti.algorithms.PrefixSumExecutor(N)
      executor.run(arr0)
      executor.run(arr1)
      executor.run(arr2)
      executor.run(arr3)
      executor.run(arr4)
      
    • Runtime integer overflow detection for addition, subtraction, multiplication, and left-shift operators is now available on the Vulkan, CPU, and CUDA backends when debug mode is on. To use overflow detection on the Vulkan backend, you need to enable printing; overflow detection for 64-bit multiplication on the Vulkan backend additionally requires NVIDIA driver 510 or higher. (#6178) (#6279) (by Lin Jiang)

      For the following program:

      import taichi as ti
      
      ti.init(debug=True)
      
      @ti.kernel
      def add(a: ti.u64, b: ti.u64) -> ti.u64:
          return a + b
      
      add(2 ** 63, 2 ** 63)

      The following warning is printed at runtime:

      Addition overflow detected in File "/home/lin/test/overflow.py", line 7, in add:
          return a + b
                 ^^^^^
      
    • Printing is now supported on the Vulkan backend on Unix/Windows platforms. To enable printing on the Vulkan backend, follow the instructions at https://docs.taichi-lang.org/docs/master/debugging#applicable-backends (#6075) (by Ailing)
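
    For readers unfamiliar with the term, the inclusive, in-place prefix sum that PrefixSumExecutor performs in the usage above can be pictured with a plain-Python reference. This is only a sequential sketch of the semantics, not Taichi's parallel CUDA implementation, and the function name inclusive_prefix_sum_inplace is hypothetical.

```python
from itertools import accumulate


def inclusive_prefix_sum_inplace(values):
    """Overwrite values[i] with values[0] + ... + values[i] (inclusive scan)."""
    # accumulate() yields the running totals; slice assignment writes them
    # back into the same list, mirroring the in-place behaviour.
    values[:] = accumulate(values)
    return values


data = [3, 1, 4, 1, 5]
inclusive_prefix_sum_inplace(data)
print(data)  # [3, 4, 8, 9, 14]
```

    Each output element includes its own input element, which is what distinguishes an inclusive scan from an exclusive one.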

    GGUI

    • Setting the initial position of the GGUI window is now supported. See https://docs.taichi-lang.org/docs/master/ggui#create-a-window for details and usage. (#6156) (by Mocki)

    Taichi Examples

    Three new examples from community contributors are also merged in this release. They include:

    • Animating the fundamental solution of a Laplace equation (#6302) (by @bismarckkk)
    • Animating the Karman vortex street using LBM (#6249) (by @hietwl)
    • Animating the two-stream instability (#6185) (by JiaoLuhuai)

    You can view these examples by running ti example in a terminal and selecting the corresponding index.

    Important bug fixes

    • "ti.data_oriented" class instance now correctly releases its allocated memory upon garbage collection. (#6256) (by Zhanlue Yang)
    • "ti.field" objects can now be correctly indexed using non-i32 typed indices. (#6276) (by Zhanlue Yang)
    • "ti.select" and "ti.ifte" can now be printed correctly in Taichi kernels. (#6297) (by Zhanlue Yang)
    • Before this release, setting a u64 argument to a number of 2^63 or greater raised an error, and u64 return values were treated as i64 in Python (integers of 2^63 or greater were returned as negative numbers). This release fixes both bugs. (#6267) (#6364) (by Lin Jiang)
    • Taichi now raises an error when the number of loop variables does not match the dimension of the ndrange for loop instead of malfunctioning. (#6360) (by Lin Jiang)
    • Calling ti.append with a vector/matrix now raises a proper error message. (#6322) (by Ailing)
    • Division on unsigned integers now works properly on LLVM backends. (#6128) (by Yi Xu)
    • Operator ">>=" now works properly. (#6153) (by Yi Xu)
    • Numpy int is now allowed for SNode shape setting. (#6211) (by Yi Xu)
    • Dimension check for GlobalPtrStmt is now aware of whether it is a cell access. (#6275) (by Yi Xu)
    • Before this release, Taichi autodiff could fail in cases where the condition of an if statement depends on the index of an outer for-loop. This bug has been fixed in this release. (#6207) (by Mingrui Zhang)
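
    The u64 fix listed above is easier to follow with the bit-level arithmetic spelled out. The sketch below uses plain Python integers, not Taichi, to show why a u64 value of 2^63 or more appears negative when its 64 bits are read as an i64, and why add(2 ** 63, 2 ** 63) from the overflow-detection example wraps around. The helper names reinterpret_u64_as_i64 and add_u64 are made up for illustration.

```python
U64_MASK = (1 << 64) - 1  # largest value an unsigned 64-bit integer can hold


def reinterpret_u64_as_i64(value):
    """Read the 64 bits of an unsigned value as a two's-complement signed one."""
    value &= U64_MASK
    # Bit 63 is the sign bit in two's complement, so values with it set
    # map to value - 2^64.
    return value - (1 << 64) if value >= (1 << 63) else value


def add_u64(a, b):
    """Return (wrapped_sum, overflowed) for an unsigned 64-bit addition."""
    exact = a + b  # Python integers never overflow, so this is the true sum
    return exact & U64_MASK, exact > U64_MASK


print(reinterpret_u64_as_i64(2 ** 63))      # -9223372036854775808
print(reinterpret_u64_as_i64(2 ** 63 - 1))  # 9223372036854775807
print(add_u64(2 ** 63, 2 ** 63))            # (0, True)
```

    The first line reproduces the old symptom of large u64 return values showing up as negative numbers; the last shows that the true sum 2^64 does not fit in 64 bits, which is the condition the runtime overflow check reports.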

    Full changelog:

    • [Error] Deprecate ndrange with number of the loop variables != the dimension of the ndrange (#6422) (by Lin Jiang)
    • Adjust aot_demo.sh (by jim19930609)
    • [error] Warn Linux users about manylinux2014 build on startup (#6416) (by Proton)
    • [misc] Bug fix (by jim19930609)
    • [misc] Bump version (by jim19930609)
    • [vulkan] [bug] Stop using the buffer device address feature on macOS (#6415) (by Yi Xu)
    • [Lang] [bug] Allow filling a field with Expr (#6391) (by Yi Xu)
    • [misc] Rc v1.2.0 cherry-pick PR number 2 (#6384) (by Zhanlue Yang)
    • [misc] Revert PR 6360 (#6386) (by Zhanlue Yang)
    • [misc] Rc v1.2.0 c1 (#6380) (by Zhanlue Yang)
    • [bug] Fix potential bug in #6362 (#6363) (#6371) (by Zhanlue Yang)
    • [example] Add example "laplace equation" (#6302) (by 猫猫子Official)
    • [ci] Android Demo: leave Docker containers intact for debugging (#6357) (by Proton)
    • [autodiff] Skip gradient kernel compilation for validation kernel (#6356) (by Mingrui Zhang)
    • [autodiff] Move autodiff gdar checker to release (#6355) (by Mingrui Zhang)
    • [aot] Removed constraint on same-allocation copy (#6354) (by PENGUINLIONG)
    • [ci] Add new performance monitoring (#6349) (by Proton)
    • [dx12] Only use llvm to compile dx12. (#6339) (by Xiang Li)
    • [opengl] Fix with_opengl when TI_WITH_OPENGL is off (#6353) (by Ailing)
    • [Doc] Add instructions about running clang-tidy checks locally (by Ailing Zhang)
    • [build] Enable readability-redundant-member-init in clang-tidy check (by Ailing Zhang)
    • [build] Enable TI_WITH_VULKAN and TI_WITH_OPENGL for clang-tidy checks (by Ailing Zhang)
    • [build] Enable a few modernize checks in clang-tidy (by Ailing Zhang)
    • [autodiff] Recover kernel autodiff mode after validation (#6265) (by Mingrui Zhang)
    • [test] Adjust rtol for sparse_linear_solver tests (#6352) (by Ailing)
    • [lang] MatrixType bug fix: Fix array indexing with MatrixType-index (#6323) (by Zhanlue Yang)
    • [Lang] MatrixNdarray refactor part13: Add scalarization for TernaryOpStmt (#6314) (by Zhanlue Yang)
    • [Lang] MatrixNdarray refactor part12: Add scalarization for AtomicOpStmt (#6312) (by Zhanlue Yang)
    • [build] Enable a few modernize checks in clang-tidy (by Ailing Zhang)
    • [build] Enable google-explicit-constructor check in clang-tidy (by Ailing Zhang)
    • [build] Enable google-build-explicit-make-pair check in clang-tidy (by Ailing Zhang)
    • [build] Enable a few bugprone related rules in clang-tidy (by Ailing Zhang)
    • [build] Enable modernize-use-override in clang-tidy (by Ailing Zhang)
    • [ci] Use .clang-tidy for check_static_analyzer job (by Ailing Zhang)
    • [mesh] Support arm64 backend for MeshTaichi (#6329) (by Chang Yu)
    • [lang] Throw proper error message if calling ti.append with vector/matrix (#6322) (by Ailing)
    • [aot] Fixed buffer device address import (#6326) (by PENGUINLIONG)
    • [aot] Fixed export of get_instance_proc_addr (#6324) (by PENGUINLIONG)
    • [build] Allow building test when LLVM is off (#6327) (by Ailing)
    • [bug] Fix generating LLVM AOT module for the second time failed (#6311) (by PGZXB)
    • [aot] Per-parameter documentation in C-API header (#6317) (by PENGUINLIONG)
    • [ci] Revert "Add end-to-end CI tests for meshtaichi (#6321)" (#6325) (by Proton)
    • [ci] Add end-to-end CI tests for meshtaichi (#6321) (by yixu)
    • [doc] Update the document about offline cache (#6313) (by PGZXB)
    • [aot] Include taichi_cpu.h in taichi.h (#6315) (by Zhanlue Yang)
    • [Vulkan] [bug] Change the format string of 64bit unsigned integer type from %llu to %lu (#6308) (by Lin Jiang)
    • [mesh] Refactor MeshTaichi API (#6306) (by Chang Yu)
    • [lang] MatrixType bug fix: Allow dynamic_index=True when real_matrix_scalarize=True (#6304) (by Yi Xu)
    • [lang] MatrixType bug fix: Enable irpass::cfg_optimization if real_matrix_scalarize is on (#6300) (by Zhanlue Yang)
    • [metal] Enable offline cache by default on Metal (#6307) (by PGZXB)
    • [Vulkan] Add overflow detection on vulkan when debug=True (#6279) (by Lin Jiang)
    • [aot] Inline documentations (#6301) (by PENGUINLIONG)
    • [aot] Support exporting interop info for TiMemory on Cpu/Cuda backends (#6242) (by Zhanlue Yang)
    • [lang] MatrixType bug fix: Avoid checks for legacy Matrix-class when real_matrix is on (#6292) (by Zhanlue Yang)
    • [aot] Support setting vector/matrix argument in C++ wrapper of C-API (#6298) (by Ailing)
    • [lang] MatrixType bug fix: Fix MatrixType validations in build_call_if_is_type() (#6294) (by Zhanlue Yang)
    • [bug] Fix asserting failed when registering kernels with same name on Metal (#6271) (by PGZXB)
    • [ci] Add more release tests (#5839) (by Proton)
    • [lang] MatrixType bug fix: Allow indexing a matrix r-value (#6291) (by Yi Xu)
    • [bug] Fix duplicate runs with 'run_tests.py --cpp -k' when selecting AOT tests (#6296) (by Zhanlue Yang)
    • [bug] Fix segmentation fault with TextureOpStmt ir_printer (#6297) (by Zhanlue Yang)
    • [ci] Add taichi-aot-demo headless demos (#6280) (by Proton)
    • [bug] Serialize missing fields of metal::TaichiKernelAttributes and metal::KernelAttributes (#6270) (by PGZXB)
    • [metal] Implement offline cache cleaning on metal (#6272) (by PGZXB)
    • [aot] Reorganized C-API headers (#6199) (by PENGUINLIONG)
    • [lang] [bug] Fix setting integer arguments within u64 range but greater than i64 range (#6267) (by Lin Jiang)
    • [autodiff] Skip gdar checking for user defined grad kernel (#6273) (by Mingrui Zhang)
    • [bug] Fix AotModuleBuilder::add_compiled_kernel (#6287) (by PGZXB)
    • [Bug] [lang] Make dimension check for GlobalPtrStmt aware of whether it is a cell access (#6275) (by Yi Xu)
    • [refactor] Move setting visible device to vulkan instance initialization (by Ailing Zhang)
    • [bug] Add unit test to detect memory leak from data_oriented classes (#6278) (by Zhanlue Yang)
    • [aot] Ship runtime *.bc files with C-API for LLVM AOT (#6285) (by Zhanlue Yang)
    • [bug] Convert non-i32 type indices to i32 for GlobalPtrStmt (#6276) (by Zhanlue Yang)
    • [Doc] Renamed syntax.md to kernel_function.md, plus miscellaneous edits (#6277) (by Vissidarte-Herman)
    • [lang] Fixed validation scope (#6262) (by PENGUINLIONG)
    • [bug] Prevent ti.kernel from directly caching the passed-in arguments to avoid memory leak (#6256) (by Zhanlue Yang)
    • [autodiff] Add demote atomics before gdar checker (#6266) (by Mingrui Zhang)
    • [autodiff] Add grad check feature and related test (#6245) (by PhrygianGates)
    • [lang] Fixed contraction cast (#6255) (by PENGUINLIONG)
    • [Example] Add karman vortex street example (#6249) (by Zhao Liang)
    • [ci] Lift GitHub CI timeout (#6260) (by Proton)
    • [metal] Support offline cache on metal (#6227) (by PGZXB)
    • [dx12] Add DirectX-Headers as a submodule (#6259) (by Xiang Li)
    • [bug] Fix link error with TI_WITH_OPENGL:BOOL=ON but TI_WITH_VULKAN:BOOL=OFF (#6257) (by PGZXB)
    • [dx12] Disable DX12 for cpu only test. (#6253) (by Xiang Li)
    • [Lang] MatrixNdarray refactor part11: Fuse ExternalPtrStmt and PtrOffsetStmt (#6189) (by Zhanlue Yang)
    • [Doc] Rename index.md to hello_world.md (#6244) (by Vissidarte-Herman)
    • [Doc] Update syntax.md (#6236) (by Zhao Liang)
    • [spirv] Generate OpBitFieldUExtract for BitExtractStmt (#6208) (by Yi Xu)
    • [Bug] [lang] Allow numpy int as snode dimension (#6211) (by Yi Xu)
    • [doc] Update document about building and running Taichi C++ tests (#6228) (by PGZXB)
    • [misc] Disable the offline cache if printing ir is enabled (#6234) (by PGZXB)
    • [vulkan] [opengl] Enable offline cache by default on Vulkan and OpenGL (#6233) (by PGZXB)
    • [Doc] Update math_module.md (#6235) (by Zhao Liang)
    • [Doc] Update debugging.md (#6238) (by Zhao Liang)
    • [dx12] Add ti.dx12. (#6174) (by Xiang Li)
    • [lang] Set ret_type for AtomicOpStmt (#6213) (by Ailing)
    • [Doc] Update global settings (#6201) (by Olinaaaloompa)
    • [doc] Editorial updates (#6216) (by Vissidarte-Herman)
    • [Doc] Update hello world (#6191) (by Olinaaaloompa)
    • [Doc] Update math module (#6203) (by Olinaaaloompa)
    • [Doc] Update profiler (#6214) (by Olinaaaloompa)
    • [autodiff] Store if condition in adstack (#6207) (by Mingrui Zhang)
    • [Doc] Update debugging.md (#6212) (by Zhao Liang)
    • [Doc] Update debugging.md (#6200) (by Zhao Liang)
    • [bug] Fixed type inference error with ExternalPtrStmt (#6210) (by Zhanlue Yang)
    • [example] Request to add my code into examples (#6185) (by JiaoLuhuai)
    • [Lang] MatrixNdarray refactor part10: Remove redundant MatrixInitStmt generated from scalarization (#6171) (by Zhanlue Yang)
    • [aot] Apply ti_get_last_error_message() for all C-API test cases (#6195) (by Zhanlue Yang)
    • [llvm] [refactor] Merge create_call and call (#6192) (by Lin Jiang)
    • [build] Support executing manually-specified cpp tests for run_tests.py (#6206) (by Zhanlue Yang)
    • [doc] Editorial updates to field.md (#6202) (by Vissidarte-Herman)
    • [Lang] MatrixNdarray refactor part9: Add scalarization for AllocaStmt (#6168) (by Zhanlue Yang)
    • [Lang] Support GPU solve with analyzePattern and factorize (#6158) (by pengyu)
    • [Lang] MatrixField refactor 9/n: Allow dynamic index of matrix field when real_matrix=True (#6194) (by Yi Xu)
    • [Doc] Fixed broken links (#6193) (by Olinaaaloompa)
    • [ir] MatrixField refactor 8/n: Rename PtrOffsetStmt to MatrixPtrStmt (#6187) (by Yi Xu)
    • [Doc] Update field.md (#6182) (by Zhao Liang)
    • [bug] Relax dependent Pillow version (#6170) (by Ailing)
    • [Doc] Update data_oriented_class.md (#6181) (by Zhao Liang)
    • [Doc] Update kernels and functions (#6176) (by Zhao Liang)
    • [Doc] Update type.md (#6180) (by Zhao Liang)
    • [Doc] Update getting started (#6175) (by Zhao Liang)
    • [llvm] MatrixField refactor 7/n: Simplify codegen for TensorType allocation and access (#6169) (by Yi Xu)
    • [LLVM] Add runtime overflow detection on LLVM-based backends (#6178) (by Lin Jiang)
    • Revert "[LLVM] Add runtime overflow detection on LLVM-based backends" (#6177) (by Ailing)
    • [dx12] Add aot for dx12. (#6099) (by Xiang Li)
    • [LLVM] Add runtime overflow detection on LLVM-based backends (#6166) (by Lin Jiang)
    • [doc] C-API documentation & generator (#5736) (by PENGUINLIONG)
    • [gui] Support for setting the initial position of GGUI window (#6156) (by Mocki)
    • [metal] Maintain a print string table per kernel (#6160) (by PGZXB)
    • [Lang] MatrixNdarray refactor part8: Add scalarization for BinaryOpStmt with TensorType-operands (#6086) (by Zhanlue Yang)
    • [Doc] Refactor debugging (#6102) (by Olinaaaloompa)
    • [doc] Updated the position of Sparse Matrix (#6167) (by Vissidarte-Herman)
    • [Doc] Refactor global settings (#6071) (by Zhao Liang)
    • [Doc] Refactor external arrays (#6065) (by Zhao Liang)
    • [Doc] Refactor simt (#6151) (by Zhao Liang)
    • [Doc] Refactor Profiler (#6142) (by Olinaaaloompa)
    • [Doc] Add doc for math module (#6145) (by Zhao Liang)
    • [aot] Fixed texture interop (#6164) (by PENGUINLIONG)
    • [misc] Remove TI_UI namespace macros (#6163) (by Lin Jiang)
    • [llvm] Add comment about the structure of the CodeGen (#6150) (by Lin Jiang)
    • [Bug] [lang] Fix augmented assign for sar (#6153) (by Yi Xu)
    • [Test] Add scipy to test GPU sparse solver (#6162) (by pengyu)
    • [bug] Fix crashing when loading old offline cache files (for gfx backends) (#6157) (by PGZXB)
    • [lang] Remove print at the end of parallel sort (#6161) (by Haidong Lan)
    • [misc] Move some offline cache utils from analysis/ to util/ (#6155) (by PGZXB)
    • [Lang] Matrix/Vector refactor: support basic matrix ops (#6077) (by Mike He)
    • [misc] Remove namespace macros (#6154) (by Lin Jiang)
    • [Doc] Update gui_system (#6152) (by Zhao Liang)
    • [aot] Track layouts for imported image & tests (#6138) (by PENGUINLIONG)
    • [ci] Fix build cache problems (#6149) (by Proton)
    • [Misc] Add prefix sum executor to avoid multiple field allocations (#6132) (by YuZhang)
    • [opt] Cache loop-invariant global vars to local vars (#6072) (by Lin Jiang)
    • [aot] Improve C++ wrapper implementation (#6146) (by PENGUINLIONG)
    • [doc] Refactored ODOP (#6143) (by Vissidarte-Herman)
    • [Lang] Support basic sparse matrix operations on GPU. (#6082) (by Jiafeng Liu)
    • [Lang] MatrixField refactor 6/n: Add tests for MatrixField scalarization (#6137) (by Yi Xu)
    • [vulkan] Fix SPV physical ptr load alignment (#6139) (by Bob Cao)
    • [bug] Let every thread has its own CompileConfig (#6124) (by Lin Jiang)
    • [refactor] Remove redundant codegen of floordiv (#6135) (by Yi Xu)
    • [doc] Miscellaneous editorial updates (#6131) (by Vissidarte-Herman)
    • Revert "[spirv] Fixed OpLoad with physical address" (#6136) (by Lin Jiang)
    • [bug] [llvm] Fix is_same_type when the suffix of a type is the prefix of the suffix of the other type (#6126) (by Lin Jiang)
    • [bug] [vulkan] Only enable non_semantic_info cap when validation layer is on (#6129) (by Ailing)
    • [Llvm] Fix codegen for div (unsigned) (#6128) (by Yi Xu)
    • [Lang] MatrixField refactor 5/n: Lower access of matrix field element into CHI IR (#6119) (by Yi Xu)
    • [Lang] Fix invalid assertion for matrix values (#6125) (by Zhanlue Yang)
    • [opengl] Fix GLES support (#6121) (by Ailing)
    • [Lang] MatrixNdarray refactor part7: Add scalarization for UnaryOpStmt with TensorType-operand (#6080) (by Zhanlue Yang)
    • [doc] Editorial updates (#6116) (by Vissidarte-Herman)
    • [misc] Allow more commits in changelog generation (#6115) (by Yi Xu)
    • [aot] Import MoltenVK (#6090) (by PENGUINLIONG)
    • [vulkan] Instruct users to install vulkan sdk if they want to use validation layer (#6098) (by Ailing)
    • [ci] Use local caches on self-hosted runners, and code refactoring. (#5846) (by Proton)
    • [misc] Bump version to v1.1.4 (#6112) (by Taichi Gardener)
    • [doc] Fixed a broken link (#6111) (by Vissidarte-Herman)
    • [doc] Update explanation on data-layout (#6110) (by Qian Bao)
    • [Doc] Move developer utilities to contribution (#6109) (by Olinaaaloompa)
    • [Doc] Added Accelerate PyTorch (#6106) (by Vissidarte-Herman)
    • [Doc] Refactor ODOP (#6013) (by Zhao Liang)
    • [opengl] Support offline cache on opengl (#6104) (by PGZXB)
    • [build] Fix building with TI_WITH_OPENGL:BOOL=OFF and TI_WITH_DX11:BOOL=ON failed (#6108) (by PGZXB)
  • v1.1.3 (Sep 20, 2022)

    Highlights:

    • Aot module
      • Added texture interfaces to C-API (#5520) (by PENGUINLIONG)
    • Bug fixes
      • Disable vkCmdWriteTimestamp with MacOS to enable tests on Vulkan (#6020) (by Zhanlue Yang)
      • Fix printing i8/u8 (#5893) (by Yi Xu)
      • Fix wrong type cast in codegen of storing quant floats (#5818) (by Yi Xu)
      • Remove wrong optimization: Float x // 1 -> x (#5672) (by Yi Xu)
    • Build system
      • Clean up Taichi core cmake (#5595) (by Bo Qiao)
    • CI/CD workflow
      • Update torch and cuda version (#6054) (by pengyu)
    • Documentation
      • Refactor field (#6006) (by Zhao Liang)
      • Update docstring of pow() (#6046) (by Yi Xu)
      • Fix spelling of numerical and nightly in README.md (#6025) (by Lauchlin)
      • Added Accelerate Python (#5940) (by Vissidarte-Herman)
      • New FAQs added (#5784) (by Olinaaaloompa)
      • Update type cast (#5831) (by Zhao Liang)
      • Update global_settings.md (#5764) (by Zhao Liang)
      • Update init docstring (#5759) (by Zhao Liang)
      • Add introduction to quantized types (#5705) (by Yi Xu)
      • Add docs for GGUI's new features (#5647) (by Mocki)
      • Add introduction to forward mode autodiff (#5680) (by Mingrui Zhang)
      • Add doc about offline cache (#5646) (by Mingming Zhang)
      • Typo in the doc. (#5652) (by dongqi shen)
    • Error messages
      • Add error when breaking/continuing a static for inside non-static if (#5755) (by Lin Jiang)
      • Do not show warning when the offline cache path does not exist (#5747) (by Lin Jiang)
    • Language and syntax
      • Sort coo to build correct csr format sparse matrix on GPU (#6050) (by pengyu)
      • MatrixNdarray refactor part6: Add scalarization for LocalLoadStmt & GlobalLoadStmt with TensorType (#6024) (by Zhanlue Yang)
      • MatrixField refactor 4/n: Disallow invalid matrix field definition (#6074) (by Yi Xu)
      • Fixes matrix-vector multiplication (#6014) (by Mike He)
      • MatrixNdarray refactor part5: Add scalarization for LocalStoreStmt & GlobalStoreStmt with TensorType (#5946) (by Zhanlue Yang)
      • Deprecate SOA-layout for NdarrayMatrix/NdarrayVector (#6030) (by Zhanlue Yang)
      • Indexing for new local matrix implementation (#5783) (by Mike He)
      • Make scalar kernel arguments immutable (#5990) (by Lin Jiang)
      • Demote pow() with integer exponent (#6044) (by Yi Xu)
      • Support abs(i64) (#6018) (by Yi Xu)
      • MatrixNdarray refactor part4: Lowered TensorType to CHI IR level for elementwise-indexed MatrixNdarray (#5936) (by Zhanlue Yang)
      • MatrixNdarray refactor part3: Enable TensorType for MatrixNdarray at Frontend IR level (#5900) (by Zhanlue Yang)
      • Support linear system solving on GPU with cuSolver (#5860) (by pengyu)
      • MatrixNdarray refactor part2: Remove redundant members in python-scope AnyArray (#5885) (by Zhanlue Yang)
      • MatrixNdarray refactor part1: Refactor Taichi kernel argument to use TensorType (#5881) (by Zhanlue Yang)
      • MatrixNdarray refactor part0: Support direct TensorType construction in Ndarray and refactor use of element_shape (#5875) (by Zhanlue Yang)
      • Enable definition of local matrices/vectors (#5782) (by Mike He)
      • Build csr sparse matrix on GPU using coo format ndarray (#5838) (by pengyu)
      • Add @python_scope decorator for selected MatrixNdarray/VectorNdarray methods (#5844) (by Zhanlue Yang)
      • Make python scope comparison return 1 instead of -1 (#5840) (by daylily)
      • Allow implicit conversion of integer types in if conditions (#5763) (by daylily)
      • Support sparse matrix on GPU (#5185) (by pengyu)
      • Improve loop error message and remove the check for real type id (#5792) (by Zhao Liang)
      • Implement index validation for matrices/vectors (#5605) (by Mike He)
    • MeshTaichi
      • Fix nested mesh for (#6062) (by Chang Yu)
    • Vulkan backend
      • Track image layout internally (#5597) (by PENGUINLIONG)

    Full changelog:

    • [bug] [gui] Fix a bug of drawing mesh instancing that cpu/cuda objects have an offset when copying to vulkan object (#6028) (by Mocki)
    • [bug] Fix cleaning cache failed (#6100) (by PGZXB)
    • [aot] Support multi-target builds for Apple M1 (#6083) (by PENGUINLIONG)
    • [spirv] [refactor] Rename debug_ segment to names_ (#6094) (by Ailing)
    • [dx12] Update codegen for range_for and mesh_for (#6092) (by Xiang Li)
    • [gui] Direct image presentation & faster direct copy routine (#6085) (by Bob Cao)
    • [vulkan] Support printing in debug mode on vulkan backend (#6075) (by Ailing)
    • [bug] Fix crashing when loading old offline cache files (#6089) (by PGZXB)
    • [ci] Update prebuild binary for llvm 15. (#6091) (by Xiang Li)
    • [example] Add RHI examples (#5969) (by Bob Cao)
    • [aot] Pragma once in taichi.cpp (#6088) (by PENGUINLIONG)
    • [Lang] Sort coo to build correct csr format sparse matrix on GPU (#6050) (by pengyu)
    • [build] Refactor test infrastructure for AOT tests (#6064) (by Zhanlue Yang)
    • [Lang] MatrixNdarray refactor part6: Add scalarization for LocalLoadStmt & GlobalLoadStmt with TensorType (#6024) (by Zhanlue Yang)
    • [Lang] MatrixField refactor 4/n: Disallow invalid matrix field definition (#6074) (by Yi Xu)
    • [bug] Remove unnecessary lower() in AotModuleBuilder::add (#6068) (by PGZXB)
    • [lang] Preserve shape info for Vectors (#6076) (by Mike He)
    • [misc] Simplify PR template (#6063) (by Ailing)
    • [Bug] Disable vkCmdWriteTimestamp with MacOS to enable tests on Vulkan (#6020) (by Zhanlue Yang)
    • [bug] Set cfg.offline_cache after reset() (#6073) (by PGZXB)
    • [ci] [dx12] Enable dx12 build for windows cpu ci. (#6069) (by Xiang Li)
    • [ci] Upgrade conda cudatoolkit version to 11.3 (#6070) (by Proton)
    • [Mesh] [bug] Fix nested mesh for (#6062) (by Chang Yu)
    • [Lang] Fixes matrix-vector multiplication (#6014) (by Mike He)
    • [ir] MatrixField refactor 3/n: Add MatrixFieldExpression (#6010) (by Yi Xu)
    • [dx12] Drop code for llvm passes which prepare for DXIL generation. (#5998) (by Xiang Li)
    • [aot] Guard C-API interfaces with try-catch (#6060) (by PENGUINLIONG)
    • [CI] Update torch and cuda version (#6054) (by pengyu)
    • [Lang] MatrixNdarray refactor part5: Add scalarization for LocalStoreStmt & GlobalStoreStmt with TensorType (#5946) (by Zhanlue Yang)
    • [Lang] Deprecate SOA-layout for NdarrayMatrix/NdarrayVector (#6030) (by Zhanlue Yang)
    • [aot] Dump required device capability in AOT module meta (#6056) (by PENGUINLIONG)
    • [Doc] Refactor field (#6006) (by Zhao Liang)
    • [Lang] Indexing for new local matrix implementation (#5783) (by Mike He)
    • [lang] Reformat source indicator in Python convention (#6053) (by PENGUINLIONG)
    • [misc] Enable offline cache in frontend instead of C++ Side (#6051) (by PGZXB)
    • [lang] Remove redundant codegen of integer pow (#6048) (by Yi Xu)
    • [Doc] Update docstring of pow() (#6046) (by Yi Xu)
    • [Lang] Make scalar kernel arguments immutable (#5990) (by Lin Jiang)
    • [build] Fix compile error on gcc (#6047) (by PGZXB)
    • [llvm] [refactor] Split LLVMCompiledData of kernels and tasks (#6019) (by Lin Jiang)
    • [Lang] Demote pow() with integer exponent (#6044) (by Yi Xu)
    • [doc] Refactor type system (#5984) (by Zhao Liang)
    • [test] Change deprecated make_camera() to Camera() (#6009) (by Zihua Wu)
    • [doc] Fix a typo in README.md (#6033) (by OccupyMars2025)
    • [misc] Lazy load spirv code from disk during offline cache (#6000) (by PGZXB)
    • [aot] Fixed compilation on Linux distros (#6043) (by PENGUINLIONG)
    • [bug] [test] Run C-API tests correctly on Windows (#6038) (by PGZXB)
    • [aot] C-API texture support and tests (#5994) (by PENGUINLIONG)
    • [Doc] Fix spelling of numerical and nightly in README.md (#6025) (by Lauchlin)
    • [doc] Fixed a format issue (#6023) (by Vissidarte-Herman)
    • [doc] Indenting (#6022) (by Vissidarte-Herman)
    • [Lang] Support abs(i64) (#6018) (by Yi Xu)
    • [lang] Merge ti_core.make_index_expr and ti_core.subscript (#5993) (by Zhanlue Yang)
    • [llvm] [refactor] Remove the use of vector with size=1 (#6002) (by Lin Jiang)
    • [bug] [test] Fix patch_os_environ_helper (#6017) (by Lin Jiang)
    • [ci] Remove legacy perf monitoring (to be reworked) (#6015) (by Proton)
    • Fix (#5999) (by PGZXB)
    • [doc] Format updates (#6016) (by Olinaaaloompa)
    • [refactor] Turn on torch_io tests for opengl, vulkan and dx11 backend (#5997) (by Ailing)
    • [ci] Adjust Windows GPU task buildbot tag (#6008) (by Proton)
    • Fixed compilation (#6005) (by PENGUINLIONG)
    • [autodiff] Avoid initializing Field with None (#6007) (by Yi Xu)
    • [doc] Cloth simulation tutorial (#6004) (by Olinaaaloompa)
    • [Lang] MatrixNdarray refactor part4: Lowered TensorType to CHI IR level for elementwise-indexed MatrixNdarray (#5936) (by Zhanlue Yang)
    • [llvm] [refactor] Link modules instead of cloning modules (#5962) (by Lin Jiang)
    • [dx12] Drop code for dxil generation. (#5958) (by Xiang Li)
    • [ci] Windows Build: Use PowerShell 7 (pwsh) (#5996) (by Proton)
    • Use CUDA primary context to work with PyTorch and Numba. (#5992) (by Haidong Lan)
    • [vulkan] Implement offline cache cleaning on vulkan (#5968) (by PGZXB)
    • [ir] MatrixField refactor 2/n: Rename GlobalVariableExpression to FieldExpression (#5989) (by Yi Xu)
    • [build] Add option to generate dependency graph of cmake targets (#5966) (by Ailing)
    • [autodiff] Fix matrix dual (#5985) (by Mingrui Zhang)
    • [doc] Add a note about the offline cache in the print_ir part (#5986) (by Zihua Wu)
    • [ir] MatrixField refactor 1/n: Make a GlobalVariableExpression solely represent a field (#5980) (by Yi Xu)
    • [aot] [opengl] Add GL interop so that we can export GL memory (#5956) (by Ailing)
    • [ci] Temporarily lower CUDA tests parallelism by 1 (4->3) (#5981) (by Proton)
    • [aot] Temporarily allow Vulkan extensions to be automatically enabled in C-API runtimes (#5976) (by PENGUINLIONG)
    • [cuda] Clear cuda context after init (#5891) (by Haidong Lan)
    • [ir] MatrixField refactor 0/n: Remove redundant code for SNode expressions (#5964) (by Yi Xu)
    • [test] [gui] Skip test of drawing mesh instances on Windows platform (#5971) (by Mocki)
    • [test] [gui] Add a test of fetching depth attachment (#5947) (by Mocki)
    • [test] [gui] Add a test of the display of wireframe mode (#5967) (by Mocki)
    • [aot] Support i8/u8 args in cgraph (#5961) (by Ailing)
    • RHI fixes & improvements (#5950) (by Bob Cao)
    • [test] [gui] Add a test of drawing part of mesh instances (#5965) (by Mocki)
    • [bug] Force CMAKE_OSX_ARCHITECTURES in sync with host processor's architecture to avoid ABI issues on Mac m1 (#5952) (by Zhanlue Yang)
    • [doc] Add explanation of usage of TI_VISIBLE_DEVICE and CUDA_VISIBLE_DEVICES (#5910) (by Mocki)
    • [test] [gui] Add a test of drawing mesh instances (#5963) (by Mocki)
    • [test] [gui] Add a test of drawing part of lines (#5957) (by Mocki)
    • [doc] Refactor kernels and functions (#5943) (by Zhao Liang)
    • [dx12] Drop code for dx12 codegen. (#5953) (by Xiang Li)
    • [test] [gui] Add a test of drawing part of mesh (#5955) (by Mocki)
    • [test] [gui] Add a test of drawing part of particles (#5951) (by Mocki)
    • [test] [gui] Add a test of drawing lines (#5948) (by Mocki)
    • [llvm] [refactor] Refactor and add code about llvm modules (#5941) (by Lin Jiang)
    • [llvm] [refactor] Move the generation of struct for function to the context (#5937) (by Lin Jiang)
    • [Doc] Added Accelerate Python (#5940) (by Vissidarte-Herman)
    • [test] [gui] Add a test of fetching color attachment (#5920) (by Mocki)
    • [refactor] Refactor the implementation of cleaning offline cache files (#5934) (by PGZXB)
    • Revert "[vulkan] Less sync overhead for GGUI & Device API Examples (#5880)" (#5945) (by PENGUINLIONG)
    • [vulkan] Less sync overhead for GGUI & Device API Examples (#5880) (by Bob Cao)
    • [autodiff] Fix adjoint checkbit type in gdar checker (#5938) (by Mingrui Zhang)
    • [bug] Fix undefined symbol error by isolating CacheManager as a separate target (#5931) (by PGZXB)
    • [aot] Fix LLVM submit (#5930) (by PENGUINLIONG)
    • [refactor] [llvm] Unify llvm_type() and get_data_type() (#5927) (by Yi Xu)
    • [llvm] Add attributes to LLVMCompiledData (#5929) (by Lin Jiang)
    • [llvm] [refactor] Move the common parts of compilation to the base class (#5926) (by Lin Jiang)
    • Update index.md (#5928) (by Zhao Liang)
    • [doc] Refactor "Getting Started" (#5902) (by Zhao Liang)
    • [autodiff] Support clear gradients by type (#5911) (by Mingrui Zhang)
    • [Lang] MatrixNdarray refactor part3: Enable TensorType for MatrixNdarray at Frontend IR level (#5900) (by Zhanlue Yang)
    • [aot] Taichi C-API C++ wrapper (#5899) (by PENGUINLIONG)
    • [llvm] Enhance function is_same_type (#5922) (by Lin Jiang)
    • [Lang] Support linear system solving on GPU with cuSolver (#5860) (by pengyu)
    • Bugfix: Minor issue for CUPTI kernel profiler (#5879) (by Jack He)
    • [Error] [lang] Add error when breaking/continuing a static for inside non-static if (#5755) (by Lin Jiang)
    • [llvm] [refactor] Rename methods in KernelCodeGen (#5919) (by Lin Jiang)
    • [refactor] [ir] Remove legacy LocalAddress / VectorElement / create_vector_or_scalar_type() (#5918) (by Yi Xu)
    • [autodiff] Fix validate autodiff kernel name lost (#5912) (by Mingrui Zhang)
    • [misc] Disable parallel compilation (#5916) (by Lin Jiang)
    • [refactor] [ir] Remove legacy LaneAttribute (#5901) (by Yi Xu)
    • [aot] [test] Fix c api aot tests for vulkan and opengl backend (by Ailing Zhang)
    • [aot] Add C API for opengl backend (by Ailing Zhang)
    • [doc] Updated forward-mode autodiff (#5894) (by Vissidarte-Herman)
    • [Lang] MatrixNdarray refactor part2: Remove redundant members in python-scope AnyArray (#5885) (by Zhanlue Yang)
    • [refactor] [ir] Remove legacy LaneAttribute usage from ExternalPtrStmt/GlobalPtrStmt (#5898) (by Yi Xu)
    • [ci] Switch windows cpu build to llvm 15. (#5832) (by Xiang Li)
    • [Lang] MatrixNdarray refactor part1: Refactor Taichi kernel argument to use TensorType (#5881) (by Zhanlue Yang)
    • [Bug] [lang] Fix printing i8/u8 (#5893) (by Yi Xu)
    • [misc] Remove FrontendEvalStmt (#5897) (by PGZXB)
    • [refactor] [ir] Remove legacy ElementShuffleStmt (#5892) (by Yi Xu)
    • [bug] Remove mistakenly-added C-API compilation from release pipeline (#5890) (by Zhanlue Yang)
    • [llvm] Fix PtrOffset address for shared array in llvm 15. (#5867) (by Xiang Li)
    • [vulkan] [refactor] [bug] Redesign gfx::OfflineCacheManager to unify compilation of kernels on vulkan  (#5889) (by PGZXB)
    • [doc] Updated Quantized data types (#5886) (by Vissidarte-Herman)
    • [Lang] MatrixNdarray refactor part0: Support direct TensorType construction in Ndarray and refactor use of element_shape (#5875) (by Zhanlue Yang)
    • [refactor] [ir] Remove legacy stmt width (#5882) (by Yi Xu)
    • [bug] [ggui] Fix cpu vulkan interop build (#5865) (by Ailing)
    • [refactor] Separate texture args from scalar arg declaration (#5878) (by Ailing)
    • [Lang] Enable definition of local matrices/vectors (#5782) (by Mike He)
    • [gfx] Unify the implementation of offline cache for gfx backends (#5868) (by PGZXB)
    • [bug] Improve error message with GlobalPtrStmt indexing (#5841) (by Zhanlue Yang)
    • [aot] Workaround build structure to export GGUI symbols in libtaichi_export_core.so (#5870) (by Ailing)
    • [doc] Updated supported backend DX 11 (#5845) (by Vissidarte-Herman)
    • [bug] Enabled NdarrayType & MatrixType annotation parsing for ti.func (#5814) (by Zhanlue Yang)
    • Removed Unexpected debug code in repo (#5866) (by PENGUINLIONG)
    • [aot] C-API error handling mechanism (#5847) (by PENGUINLIONG)
    • [vulkan] Detect and set device-capabilities for aot::TargetDevice used in offline cache (#5843) (by PGZXB)
    • [Lang] Build csr sparse matrix on GPU using coo format ndarray (#5838) (by pengyu)
    • [gui] Add some built-int math APIs for building translation, scale and rotation matrix (#5827) (by Mocki)
    • [Lang] Add @python_scope decorator for selected MatrixNdarray/VectorNdarray methods (#5844) (by Zhanlue Yang)
    • [Lang] Make python scope comparison return 1 instead of -1 (#5840) (by daylily)
    • [vulkan] Support offline cache on Vulkan (#5825) (by PGZXB)
    • [Lang] Allow implicit conversion of integer types in if conditions (#5763) (by daylily)
    • making gravity option used (#5836) (by Michael Xu)
    • [autodiff] Add grad type for SNode (#5805) (by Mingrui Zhang)
    • [Doc] New FAQs added (#5784) (by Olinaaaloompa)
    • [dx12] Drop code for dx12. (#5816) (by Xiang Li)
    • [Doc] Update type cast (#5831) (by Zhao Liang)
    • [autodiff] Fix global data access rule checker memory allocation (#5801) (by Mingrui Zhang)
    • [misc] Bump version to v1.1.3 (#5823) (by Ailing)
    • [doc] Updated compilation warnings (#5808) (by Vissidarte-Herman)
    • Fixed crash in SPIR-V CodeGen when a const is declared twice (#5813) (by PENGUINLIONG)
    • [test] Refactor opengl and vulkan cpp aot tests (#5812) (by Ailing)
    • [Bug] [type] Fix wrong type cast in codegen of storing quant floats (#5818) (by Yi Xu)
    • [autodiff] Support shift ptr in dynamic index (#5770) (by Mingrui Zhang)
    • [ci] Regenerate AOT binaries for every Android smoke test (#5815) (by Proton)
    • [ci] Disable show_env job on fork repo (#5811) (by Bo Qiao)
    • [bug] Fix: GraphBuilder::Sequential unable to handle Matrix-type argument (#5806) (by Zhanlue Yang)
    • [build] [bug] Fix Taichi build with vulkan on and opengl off (#5807) (by Bo Qiao)
    • [test] Add a cgraph test with template args in ti.func (#5803) (by Ailing)
    • [Lang] Support sparse matrix on GPU (#5185) (by pengyu)
    • [test] Enable llvm aot tests for vulkan and opengl backend (#5795) (by Ailing)
    • Fixed a display issue (#5796) (by Vissidarte-Herman)
    • [Lang] Improve loop error message and remove the check for real type id (#5792) (by Zhao Liang)
    • [Doc] Update global_settings.md (#5764) (by Zhao Liang)
    • [ci] Workaround nightly C++ test crash (#5789) (by Proton)
    • [llvm] Disable f16 atomic hack for llvm 15. (#5756) (by Xiang Li)
    • [aot] Workaround C-API build structure to include GGUI symbols (#5787) (by Zhanlue Yang)
    • [ci] Skip C++ tests on macOS + CPU (#5778) (by Proton)
    • [Doc] Update init docstring (#5759) (by Zhao Liang)
    • [doc] Fix struct to numpy example typo (#5781) (by Garry Ling)
    • [test] Get rid of utils.py in aot python scripts (#5785) (by Ailing)
    • [refactor] Refactor C++ aot test to accommodate multiple backends (by Ailing Zhang)
    • [test] Expand python aot test coverage for opengl backend (by Ailing Zhang)
    • [opengl] Fix target device for opengl aot (by Ailing Zhang)
    • Fixed GGUI scene mesh memory leak (#5779) (by PENGUINLIONG)
    • [autodiff] [test] Recover the forward mode test cases (#5696) (by Mingrui Zhang)
    • [ci] Test the offline cache every day (#5768) (by PGZXB)
    • [lang] Update scan impl with shared memory usage (#5762) (by Bo Qiao)
    • Revert "[aot] Added basic infrastructure for gui_utils interfaces - unimplemented (#5688)" (#5760) (by Zhanlue Yang)
    • Fixed window size crash (#5765) (by PENGUINLIONG)
    • [llvm] Support SharedArray global when lower PtrOffsetStmt. (#5758) (by Xiang Li)
    • [ci] Prevent using the offline cache of previous jobs (#5734) (by Lin Jiang)
    • [gui] Add GGUI set_image support for non-Vector fields and numpy ndarrays. (#5654) (by Carbene)
    • [Lang] Implement index validation for matrices/vectors (#5605) (by Mike He)
    • [aot] Added basic infrastructure for gui_utils interfaces - unimplemented (#5688) (by Zhanlue Yang)
    • [Error] Do not show warning when the offline cache path does not exist (#5747) (by Lin Jiang)
    • [vulkan] Fixed query pool invalid usage (#5717) (by PENGUINLIONG)
    • [llvm] Fix crash caused on ByVal Attribute when switch to llvm 15. (#5745) (by Xiang Li)
    • [bug] Fix incorrect autodiff_mode information in offline cache key (#5737) (by Mingming Zhang)
    • [lang] Add parallel scan prefix sum utility (#5697) (by Bo Qiao)
    • [AOT] Added texture interfaces to C-API (#5520) (by PENGUINLIONG)
    • [autodiff] Move the global data access rule checker to experimental (#5719) (by Mingrui Zhang)
    • [ci] Rename lite test CI label, force full test on rc branch (#5732) (by Proton)
    • [ci] Fix TI_SKIP_CPP_TESTS (#5720) (by Proton)
    • [misc] Bump version to v1.1.1 (#5726) (by Taichi Gardener)
    • Improve Windows build script (#5611) (by PENGUINLIONG)
    • Remove miscommited file (#5727) (by Bo Qiao)
    • [test] Fix autodiff test for unsupported shift ptr (#5723) (by Mingrui Zhang)
    • [Doc] [type] Add introduction to quantized types (#5705) (by Yi Xu)
    • Fix shared array for all Vulkan versions. (#5722) (by Haidong Lan)
    • [autodiff] Clear all dual fields when exiting context manager (#5716) (by Mingrui Zhang)
    • [bug] Support indexing via np.integer for field (#5712) (by Ailing)
    • [vulkan] Relax number of array args for each kernel (#5689) (by Ailing)
    • [bug] Properly delete functions of a SNode Tree (#5710) (by Lin Jiang)
    • [Doc] Add docs for GGUI's new features (#5647) (by Mocki)
    • [gui] Support set_image with texture (#5655) (by PENGUINLIONG)
    • [Doc] Add introduction to forward mode autodiff (#5680) (by Mingrui Zhang)
    • [autodiff] Fix AdStackAllocaStmt not correctly backup (#5692) (by Mingrui Zhang)
    • [doc] Rename ti.struct_class to ti.dataclass (#5706) (by Yi Xu)
    • [gui] GGUI renames (#5704) (by PENGUINLIONG)
    • [Vulkan] Track image layout internally (#5597) (by PENGUINLIONG)
    • [ci] Confine show_environ task to Linux bots (#5677) (by Proton)
    • [build] [refactor] Decouple GUI source files from taichi_core target (#5676) (by Bo Qiao)
    • [ci] Rename libcommon.sh -> common-utils.sh, remove expore core build task (#5673) (by Proton)
    • [ci] Temporarily disable a M1 vulkan test (#5701) (by Proton)
    • [doc] Add comments to explain the commit Id for Build Andriod Demos CI pipeline (#5700) (by Zhanlue Yang)
    • [bug] Fix ndarray arg with shape=(1,) in cgraph (#5666) (by Ailing)
    • [doc] Fix typo (#5695) (by Proton)
    • [ci] Temporarily disable M1 vulkan tests (bot3 is ill) (#5698) (by Proton)
    • [build] Add commit id to taichi-aot-demo for Build-Andriod-Demos CI pipeline (#5693) (by Zhanlue Yang)
    • [bug] Fix potential bug of loading of offline cache (#5682) (by Mingming Zhang)
    • [aot] Add Comet Demo to AOT test cases (#5671) (by Zhanlue Yang)
    • [Doc] Add doc about offline cache (#5646) (by Mingming Zhang)
    • [lang] Make vector a real class (#5653) (by PENGUINLIONG)
    • [spirv] [bug] Fix invalid CFG error when simulated atomic is present (#5678) (by Ailing)
    • [ci] Fix python/examples crashes in non-lite test mode (#5670) (by Proton)
    • [Bug] [opt] Remove wrong optimization: Float x // 1 -> x (#5672) (by Yi Xu)
    • [bug] [ir] Change the way to convert for-loop with break to while-loop (#5674) (by Lin Jiang)
    • [aot] Add SPH to AOT test cases (#5642) (by Zhanlue Yang)
    • [ci] Only run portion of tests on PR draft (#5626) (by Proton)
    • [autodiff] Use external array fill to initialize and clear seed (#5656) (by Mingrui Zhang)
    • [bug] Fix bug that kernel names are not correctly captured by the profiler (#5651) (by Mingming Zhang)
    • [lang] Give warning about printing in Vulkan (#5661) (by PENGUINLIONG)
    • Support exporting vertex velocity (#5644) (by YuZhang)
    • [aot] Add helper function to construct Ndarray in C-API (#5641) (by Zhanlue Yang)
    • [gui] GGUI scene APIs are broken (#5658) (by PENGUINLIONG)
    • [Doc] Typo in the doc. (#5652) (by dongqi shen)
    • [autodiff] Add the global data access rule checker (by mingrui)
    • [autodiff] Add gradient visited for global data access rule checker (by mingrui)
    • [autodiff] Print more specific error message that autodiff does not support to_numpy (#5630) (by PhrygianGates)
    • [ci] Drop py36 in nightly and release (#5640) (by Ailing)
    • [misc] Explicitly specify base tag commit when running make_changelog.py (#5632) (by Ailing)
    • [aot] Rewrite mpm88 aot test with C-API (#5615) (by Zhanlue Yang)
    • [Build] Clean up Taichi core cmake (#5595) (by Bo Qiao)
  • v1.1.2 (Aug 18, 2022)

    This is a bug fix release for v1.1.0. Full changelog:

    • [misc] Bump version to v1.1.2
    • [Bug] [type] Fix wrong type cast in codegen of storing quant floats (https://github.com/taichi-dev/taichi/pull/5818)
    • [bug] Fix incorrect autodiff_mode information in offline cache key (https://github.com/taichi-dev/taichi/pull/5737)
    • [Error] Do not show warning when the offline cache path does not exist (https://github.com/taichi-dev/taichi/pull/5747)
    • [autodiff] Support shift ptr in dynamic index (https://github.com/taichi-dev/taichi/pull/5770)
  • v1.1.0 (Aug 10, 2022)

    Highlights

    New features

    Quantized data types

    High-resolution simulations can deliver great visual quality, but are often limited by the capacity of the onboard GPU memory. This release adds quantized data types, allowing you to define your own integers, fixed-point numbers, or floating-point numbers of arbitrary number of bits that may strike a balance between your hardware limits and simulation effects. See Using quantized data types for a comprehensive introduction.
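
The memory-versus-precision trade-off described above can be illustrated with a minimal pure-Python sketch of fixed-point quantization. It does not use Taichi's quantized type API; the helper names `quantize` and `dequantize`, the value range, and the bit widths are hypothetical choices made only for illustration.

```python
def quantize(x, bits, max_value):
    """Map a float in [-max_value, max_value] to a signed integer of `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # largest representable magnitude
    scale = max_value / levels            # real value represented by one integer step
    q = round(x / scale)
    return max(-levels, min(levels, q))   # clamp to the representable range


def dequantize(q, bits, max_value):
    """Recover an approximate float from its quantized integer code."""
    levels = 2 ** (bits - 1) - 1
    return q * (max_value / levels)


x = 0.7312
q8 = quantize(x, bits=8, max_value=1.0)
x8 = dequantize(q8, bits=8, max_value=1.0)
q16 = quantize(x, bits=16, max_value=1.0)
x16 = dequantize(q16, bits=16, max_value=1.0)

# Fewer bits save memory but increase the round-trip error.
error8 = abs(x - x8)
error16 = abs(x - x16)
```

Halving the bit width halves the storage per value, at the cost of a coarser step size; picking the width is the balance between hardware limits and simulation quality that the paragraph above refers to.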

    Offline cache

    A Taichi kernel is implicitly compiled the first time it is called. The compilation results are kept in an online in-memory cache to reduce the overhead of subsequent calls: as long as the kernel function is unchanged, it can be loaded and launched directly. The in-memory cache, however, is lost when the program terminates, so on the next run Taichi has to recompile every kernel and rebuild the cache. As a result, the first launch of a kernel is always slow. To address this problem, this release adds the offline cache feature, which dumps the compilation cache to disk for future runs and drastically reduces the first-launch overhead in subsequent runs. Taichi now constructs and maintains an offline cache by default. The following table shows the launch overhead of running cornell_box on the CUDA backend with and without offline cache:

    |                                 | Time spent on compilation and cached data loading |
    | ------------------------------- | ------------------------------------------------- |
    | Offline cache disabled          | 24.856s                                           |
    | Offline cache enabled (1st run) | 25.435s                                           |
    | Offline cache enabled (2nd run) | 0.677s                                            |

    Note that, for now, the offline cache feature works only on the CPU and CUDA backends. If your code behaves abnormally, disable the offline cache by setting the environment variable TI_OFFLINE_CACHE=0 or by calling ti.init(offline_cache=False), and file an issue with us on Taichi's GitHub repo. See Offline cache for more information.
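
The general idea of an on-disk compilation cache can be sketched in plain Python. This is not Taichi's implementation: `expensive_compile`, `compile_with_cache`, the SHA-256 cache key, and the pickle file layout are all hypothetical stand-ins chosen for illustration.

```python
import hashlib
import os
import pickle
import tempfile

compile_calls = 0  # counts how often the slow path actually runs


def expensive_compile(source):
    """Stand-in for a slow compiler; returns a fake 'compiled artifact'."""
    global compile_calls
    compile_calls += 1
    return {"source": source, "artifact": source.upper()}


def compile_with_cache(source, cache_dir):
    """Return the compiled artifact, loading it from disk when available."""
    # The key depends only on the source text, so an unchanged kernel hits the cache.
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".pkl")
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = expensive_compile(source)
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


cache_dir = tempfile.mkdtemp()
first = compile_with_cache("def kernel(): pass", cache_dir)       # compiles
second = compile_with_cache("def kernel(): pass", cache_dir)      # loaded from disk
changed = compile_with_cache("def kernel(): return 1", cache_dir)  # compiles again
```

In this sketch, only a change to the source text invalidates an entry, which mirrors the behavior described above: an unchanged kernel is loaded rather than recompiled.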

    Forward-mode automatic differentiation

    Adds forward-mode automatic differentiation via ti.ad.FwdMode. Unlike the existing reverse-mode automatic differentiation, which computes a vector-Jacobian product (vJp), forward mode computes a Jacobian-vector product (Jvp) when evaluating derivatives. Forward-mode automatic differentiation is therefore much more efficient when the number of a function's outputs is greater than the number of its inputs. Read this example, which demonstrates Jacobian matrix computation in forward mode and reverse mode.
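
To show why forward mode suits functions with few inputs and many outputs, here is a minimal pure-Python sketch based on dual numbers. It does not use ti.ad.FwdMode; the `Dual` class, `dual_sin`, and the example function `f` are hypothetical and exist only to illustrate a Jacobian-vector product.

```python
import math


class Dual:
    """A number carrying a value and a directional derivative (tangent)."""

    def __init__(self, value, tangent=0.0):
        self.value = value
        self.tangent = tangent

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.tangent + other.tangent)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (a * b)' = a * b' + a' * b
        return Dual(self.value * other.value,
                    self.value * other.tangent + self.tangent * other.value)

    __rmul__ = __mul__


def dual_sin(x):
    """sin for dual numbers, using the chain rule for the tangent."""
    return Dual(math.sin(x.value), math.cos(x.value) * x.tangent)


def f(x):
    """One input, three outputs: f(x) = (x*x, 3*x + 1, sin(x))."""
    return [x * x, 3 * x + 1, dual_sin(x)]


# Seeding the single input with tangent 1.0 yields df_i/dx for every output
# in one forward pass.
outputs = f(Dual(2.0, 1.0))
jacobian_column = [o.tangent for o in outputs]
```

With one input and three outputs, a single forward pass produces the whole Jacobian column, whereas reverse mode would need one pass per output.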

    SharedArray (experimental)

    A GPU's shared memory is a small, fast memory that is visible to all threads within a thread block (or workgroup in Vulkan). It is widely used in scenarios where performance is a crucial concern. To give you access to your GPU's shared memory, this release adds the SharedArray API under the namespace ti.simt.block. The following diagram illustrates the performance benefits of Taichi's SharedArray. With SharedArray, Taichi Lang is comparable to or even outperforms the equivalent CUDA code.

    [Figure: n-body benchmarking]

    Texture (experimental)

    Taichi now supports texture bilinear sampling and raw texel fetch on both the Vulkan and OpenGL backends. This feature leverages the hardware texture unit and reduces the need to hand-write bilinear interpolation code in image processing tasks. It also provides an easy way to do texture mapping in tasks such as rasterization or ray-tracing. On the Vulkan backend, Taichi additionally supports image load and store. You can directly manipulate texels of an image and use this very image in subsequent texture mapping.
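
For reference, the bilinear interpolation that a hardware texture unit performs can be written by hand as in the sketch below. This is plain Python rather than Taichi's texture API; `bilinear_sample` is a hypothetical helper, and it assumes integer texel positions with clamping at the edges, whereas real graphics APIs differ in their texel-center conventions.

```python
def bilinear_sample(image, u, v):
    """Sample a 2D grid (list of rows) at fractional coordinates (u, v).

    u indexes columns and v indexes rows, both in texel units; coordinates
    outside the image are clamped to the border.
    """
    height, width = len(image), len(image[0])
    u = min(max(u, 0.0), width - 1)
    v = min(max(v, 0.0), height - 1)
    x0, y0 = int(u), int(v)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = u - x0, v - y0
    # Interpolate along x on the two neighboring rows, then along y.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


image = [[0.0, 10.0],
         [20.0, 30.0]]
center = bilinear_sample(image, 0.5, 0.5)  # equal blend of all four texels
corner = bilinear_sample(image, 1.0, 1.0)  # exactly the bottom-right texel
```

Sampling through the texture unit replaces this kind of manual code with a single hardware-accelerated lookup.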

    Note that the current texture and image APIs are in the early stages and subject to change. In the future we plan to support bindless textures to extend to tasks such as ray-tracing. We also plan to extend full texture support to all backends that support texture APIs.

    Run ti example simple_texture to see an example of texture support!

    Improvements

    GGUI

    • Supports fetching and storing the depth information of the current scene:
      • In a Taichi field: ti.ui.Window.get_depth_buffer(field);
      • In a NumPy array: ti.ui.Window.get_depth_buffer_as_numpy().
    • Supports drawing 3D lines using Scene.lines(vertices, width).
    • Supports drawing mesh instances. You can pass a list of transformation matrices (ti.Matrix.field(4, 4, ti.f32, shape=N)) and call ti.ui.Scene.mesh_instance(vertices, transforms=TransformMatrixField) to put various mesh instances at different places.
    • Supports showing the wireframe of a mesh when calling Scene.mesh() or Scene.mesh_instance() by setting show_wireframe=True.

    Syntax

    • Taichi dataclass: Taichi now recommends using the @ti.dataclass decorator to define struct types, or even attach functions to them. See Taichi dataclasses for more information.

      import math
      import taichi as ti
      from taichi.math import vec3

      @ti.dataclass
      class Sphere:
        center: vec3
        radius: ti.f32
        @ti.func
        def area(self):
          # a function to run in taichi scope
          return 4 * math.pi * self.radius * self.radius
        def is_zero_sized(self):
          # a python scope function
          return self.radius == 0.0
      
    • As shown in the dataclass example above, vec2, vec3, and vec4 in the taichi.math module (same for ivec and uvec) can be directly used as type hints. The numeric precision of these types is determined by default_ip or default_fp in ti.init().

    • More flexible instantiation for a struct or dataclass: In earlier releases, to instantiate a taichi.types.struct or a taichi.dataclass, you had to explicitly write out a complete list of member-value pairs, like:

      ray = Ray(ro=vec3(0), rd=vec3(1, 0, 0), t=1.0)
      

      As of this release, you are given more options. The positional arguments are passed to the struct members in the order they are defined; the keyword arguments set the corresponding struct members. Unspecified struct members are automatically set to zero. For example:

      # use positional arguments to set struct members in order
      ray = Ray(vec3(0), vec3(1, 0, 0), 1.0)
      
      # ro is set to vec3(0) and t will be set to 0
      ray = Ray(vec3(0), rd=vec3(1, 0, 0))
      
      # both ro and rd are set to vec3(0)
      ray = Ray(t=1.0)
      
      # ro is set to vec3(1), rd=vec3(0) and t=0.0
      ray = Ray(1)
      
      # all members are set to 0.
      ray = Ray()
      
    • Supports calling fill() from both the Python scope and the Taichi scope. In earlier releases, you could only call fill(), a method of the ScalarField or MatrixField class, from the Python scope. As of this release, you can call this method from either the Python scope or the Taichi scope. See the following code snippet:

      x = ti.field(int, shape=(10, 10))
      x.fill(1)
      
      @ti.kernel
      def test():
          x.fill(-1)
      
    • More flexible initialization for customized matrix types: As the following code snippet shows, matrix types created using taichi.types.matrix() or taichi.types.vector() can be initialized more flexibly: Taichi automatically combines the inputs and converts them to a matrix whose shape matches the shape of the target matrix type.

      # mat2 and vec3 are also predefined in the ti.math module;
      # here they are created explicitly for illustration
      mat2 = ti.types.matrix(2, 2, float)
      vec3 = ti.types.vector(3, float)
      
      m = mat2(1)  # [[1., 1.], [1., 1.]]
      m = mat2(1, 2, 3, 4)  # [[1., 2.], [3., 4.]]
      m = mat2([1, 2], [3, 4])  # [[1., 2.], [3., 4.]]
      m = mat2([1, 2, 3, 4])  # [[1., 2.], [3., 4.]]
      v = vec3(1, 2, 3)
      m = mat2(v, 4)  # [[1., 2.], [3., 4.]]
      
    • Makes ti.f32(x) syntax sugar for ti.cast(x, ti.f32), if x is neither a literal nor of a compound data type. Same for other primitive types such as ti.i32, ti.u8, or ti.f64.

    • More convenient axes order adjustment: A common way to improve the performance of a Taichi program is to adjust the order of axes when laying out field data in memory. In earlier releases, this required in-depth knowledge of the data definition language (the SNode system) and could become an extra burden in situations where sparse data structures are not required. As of this release, Taichi supports specifying the order of axes when defining a Taichi field.

      # Before
      x = ti.field(ti.i32)
      y = ti.field(ti.i32)
      ti.root.dense(ti.i, M).dense(ti.j, N).place(x)  # row-major
      ti.root.dense(ti.j, N).dense(ti.i, M).place(y)  # column-major
      # New syntax
      x = ti.field(ti.i32, shape=(M, N), order='ij')
      y = ti.field(ti.i32, shape=(M, N), order='ji')
      # SoA vs. AoS example
      p = ti.Vector.field(3, ti.i32, shape=(M, N), order='ji', layout=ti.Layout.SOA)
      q = ti.Vector.field(3, ti.i32, shape=(M, N), order='ji', layout=ti.Layout.AOS)
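
To see what the axis order changes, the following pure-Python sketch computes the linear memory offset of element (i, j) under the two orders. It is an illustration only and does not reflect Taichi's internal layout code; `linear_offset` is a hypothetical helper, and a dense, unpadded layout is assumed.

```python
def linear_offset(i, j, shape, order):
    """Offset of element (i, j) in a dense, unpadded 2D array of the given shape."""
    M, N = shape
    if order == "ij":   # j varies fastest: row-major
        return i * N + j
    if order == "ji":   # i varies fastest: column-major
        return j * M + i
    raise ValueError("order must be 'ij' or 'ji'")


shape = (3, 4)  # M = 3 rows, N = 4 columns

# Stepping along j moves one element in row-major order but M elements in column-major order.
row_major_step = linear_offset(1, 3, shape, "ij") - linear_offset(1, 2, shape, "ij")
col_major_step = linear_offset(1, 3, shape, "ji") - linear_offset(1, 2, shape, "ji")
```

A loop whose innermost index matches the fastest-varying axis touches consecutive addresses, which is why choosing the right order can improve performance.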
      

    Important bug fixes

    • Fixed infinite loop when an integer pow() has a negative exponent (#5275)
    • Fixed numerical issues with matrix slicing (#4677)
    • Improved data type checks for ti.ndrange (#4478)

    API changes

    Added

    • ti.BitpackedFields
    • ti.from_paddle
    • ti.to_paddle
    • ti.FieldsBuilder.lazy_dual
    • ti.math module
    • ti.Texture
    • ti.ref
    • ti.dataclass
    • ti.simt.block.SharedArray

    Moved

    | Old API                      | New API                        |
    | ---------------------------- | ------------------------------ |
    | ti.clear_all_gradients       | ti.ad.clear_all_gradients      |
    | ti.Tape                      | ti.ad.Tape                     |
    | ti.FieldsBuilder.bit_array   | ti.FieldsBuilder.quant_array   |
    | ti.ui.Window.write_image     | ti.ui.Window.save_image        |
    | ti.ui.Window.GUI             | ti.ui.Window.get_gui           |

    Deprecated

    • ti.ui.make_camera: Please construct cameras with ti.ui.Camera instead.

    Deprecation notice

    Python 3.6

    As announced in the v1.0.0 release, we no longer provide official Python 3.6 wheels through PyPI. Users who need Taichi with Python 3.6 may still build from source, but support is not guaranteed.

    Taichi_GLSL

    The taichi_glsl package on pypi will no longer be maintained as of this release. GLSL-related features will be implemented in the official taichi.math module, which includes data types and handy functions for daily math and shader development:

    • Vector types: vec2, vec3, and vec4.
    • Matrix types: mat2, mat3, and mat4.
    • GLSL functions such as step(), clamp(), and smoothstep().

    MacOS 10.14

    Official support for MacOS Mojave (10.14, released in 2018) will be dropped starting from v1.2.0. Please upgrade your MacOS if possible or let us know if you have any concerns.

    Full changelog:

    • [misc] Update version to v1.1.0 (by Ailing Zhang)
    • [test] Fix autodiff test for unsupported shift ptr (#5723) (by Mingrui Zhang)
    • [Doc] [type] Add introduction to quantized types (#5705) (by Yi Xu)
    • [autodiff] Clear all dual fields when exiting context manager (#5716) (by Mingrui Zhang)
    • [bug] Support indexing via np.integer for field (#5712) (by Ailing)
    • [Doc] Add docs for GGUI's new features (#5647) (by Mocki)
    • [Doc] Add introduction to forward mode autodiff (#5680) (by Mingrui Zhang)
    • [autodiff] Fix AdStackAllocaStmt not correctly backup (#5692) (by Mingrui Zhang)
    • Fix shared array for all Vulkan versions. (#5721) (by Haidong Lan)
    • [misc] Rc v1.1.0 patch3 (#5709) (by Ailing)
    • [bug] RC v1.1.0 patch2 (#5683) (by Ailing)
    • [ci] Temporarily disable a M1 vulkan test (#5703) (by Proton)
    • [Doc] Add doc about offline cache (#5646) (#5686) (by Mingming Zhang)
    • [bug] Fix bug that kernel names are not correctly captured by the profiler (#5651) (#5669) (by Mingming Zhang)
    • [gui] GGUI scene APIs are broken (#5658) (#5667) (by PENGUINLIONG)
    • [release] v1.1.0 patch1 (#5649) (by Ailing)
    • [llvm] Compile serially when num_thread=0 (#5631) (by Lin Jiang)
    • [cuda] Reduce kernel profiler memory usage (#5623) (by Bo Qiao)
    • [doc] Add docstrings for texture related apis (by Ailing Zhang)
    • [Lang] Support from/to_image for textures and add tests (by Ailing Zhang)
    • [gui] Add wareframe mode for mesh & mesh_instance, add slider_int for Window.GUI. (#5576) (by Mocki)
    • avoid redundant compilation (#5607) (by yixu)
    • [misc] Enable offline cache by default (#5613) (by Mingming Zhang)
    • [Lang] Add parameter 'order' to specify layout for scalar, vector, matrix fields (#5617) (by Yi Xu)
    • [autodiff] [example] Add an example for computing Jacobian matrix (#5609) (by Mingrui Zhang)
    • [ci] Add PR tag for dx12. (#5614) (by Xiang Li)
    • fix ti.ui.Space (#5606) (by yixu)
    • [ci] Build Android export core (#5409) (by Proton)
    • [type] Rename module quantized_types to quant (#5608) (by Yi Xu)
    • [llvm] [aot] Add unit tests for Dynamic SNodes with LLVM AOT (#5594) (by Zhanlue Yang)
    • [build] Forcing write file encoding in misc/make_changelog.py (#5604) (by Proton)
    • [llvm] [aot] Add unit tests for Bitmasked SNodes with LLVM AOT (#5593) (by Zhanlue Yang)
    • [GUI] Shifted to a more commonly supported type for set_image (#5514) (by PENGUINLIONG)
    • [gui] Fix snode offset (mesh disappearing bug) (#5579) (by Bob Cao)
    • [refactor] Redesign loading, dumping and cleaning of offline cache (#5578) (by Mingming Zhang)
    • [autodiff] [test] Add more complex for loop test cases for forward mode (#5592) (by Mingrui Zhang)
    • fix num_triangles (#5602) (by yixu)
    • [cuda] Decouple update from sync in kernel profiler (#5589) (by Bo Qiao)
    • Removed unnecessary tags to work around a crowdIn issue. (#5590) (by Vissidarte-Herman)
    • [Lang] Change vec2/3/4 from function calls to types (#5556) (by Zhao Liang)
    • [vulkan] Enable shared array support for vulkan backend (#5583) (by Haidong Lan)
    • [aot] Avoid reserved words when generate C# AOT bindings (#5586) (by Proton)
    • [ci] Update llvm15 prebuild binary. (#5581) (by Xiang Li)
    • [doc] Removed a redundant line break to see if it will fix a CrowdIn issue (#5584) (by Vissidarte-Herman)
    • [type] Refine SNode with quant 10/n: Add validity checks and simplify BitStructType (#5573) (by Yi Xu)
    • [autodiff] [refactor] Rename the parameters to param for forward mode (#5582) (by Mingrui Zhang)
    • [doc] Format fix to work around a crowdIn issue (#5580) (by Vissidarte-Herman)
    • Update syntax.md (#5575) (by Zhao Liang)
    • [doc] Added an mdx-code-block escape hatch syntaxt to workaround a CrowdIn … (#5574) (by Vissidarte-Herman)
    • [Doc] Update external.md (#5547) (by Zhao Liang)
    • [doc] Add introductions to ambient_elements in llvm_sparse_runtime.md (#5567) (by Zhanlue Yang)
    • [refactor] Unify ways to set external array args (#5565) (by Ailing)
    • [Lang] [type] Refine SNode with quant 9/n: Rename some parameters in quant APIs (#5566) (by Yi Xu)
    • [opt] Improved warning messages for statements (#5564) (by Zhanlue Yang)
    • [bug] Fix android build for taichi-aot-demo (#5560) (by Ailing)
    • [opt] Added llvm::SeparateConstOffsetFromGEPPass() for shared_memory optimizations (#5494) (by Zhanlue Yang)
    • [Lang] [type] Refine SNode with quant 8/n: Replace bit_struct with ti.BitpackedFields (#5532) (by Yi Xu)
    • [build] Enforce local-scoped symbols in static llvm libs (#5553) (by Bo Qiao)
    • [refactor] Unify ways to set ndarray args (#5559) (by Ailing)
    • [gui] [vulkan] Support for drawing mesh instances (#5546) (by Mocki)
    • [llvm] [aot] Added taichi_sparse unit test to C-API for CUDA backend (#5531) (by Zhanlue Yang)
    • Add glFinish to wait_idle (#5538) (by Bo Qiao)
    • [autodiff] Skip ConstStmt when generating alloca for dual (#5554) (by Mingrui Zhang)
    • [ci] Fix macOS nightly build (#5552) (by Proton)
    • Fix potential bug of lang::Program that could be double finalized (#5550) (by Mingming Zhang)
    • [Error] Raise error when using the struct for in python scope (#5536) (by Lin Jiang)
    • [bug] Fix calling make_aot_kernel failed when offline_cache=True (#5537) (by Mingming Zhang)
    • [ci] Move macOS 10.15 workloads to self-hosted runners (#5539) (by Proton)
    • [build] [refactor] Utilize find_cuda_toolkit and clean some target dependencies (#5526) (by Bo Qiao)
    • [autodiff] [test] Add more for-loop tests for forward mode (#5525) (by Mingrui Zhang)
    • [Lang] [bug] Ensure non-i32 compatibility in while statement conditions (#5521) (by daylily)
    • [Lang] Improve error message for ggui on opengl backend (#5509) (by Zhao Liang)
    • [aot] Support texture and rwtexture in cgraph (#5528) (by Ailing)
    • [llvm] Add parallel compilation to CUDA backend (#5519) (by Lin Jiang)
    • [type] [refactor] Decouple quant from SNode 9/n: Remove exponent handling from SNode (#5510) (by Yi Xu)
    • [Lang] Fix numpy and taichi operations problem (#5506) (by Zhao Liang)
    • [Vulkan] Added an interface to get accumulated on-device execution time (#5488) (by PENGUINLIONG)
    • [Async] [refactor] Remove AsyncTaichi (#5523) (by Lin Jiang)
    • [misc] Fix warning at GGUI canvas.circles (#5424) (#5518) (by Proton)
    • [gui] Support rendering lines from a part of VBO (#5495) (by Mocki)
    • [ir] Cast indices of ExternalPtrStmt to ti.i32 (#5516) (by Yi Xu)
    • [Lang] Support syntax sugar for ti.cast (#5515) (by Yi Xu)
    • [Lang] Better struct initialization (#5481) (by Zhao Liang)
    • [example] Make implicit_fem fallback to CPU when CUDA is not available (#5512) (by Yi Xu)
    • [Lang] Make MatrixType support more ways of initialization (#5479) (by Zhao Liang)
    • [Vulkan] Fixed depth texture validation error (#5507) (by PENGUINLIONG)
    • [bug] Fix vulkan source when build for android (#5508) (by Bo Qiao)
    • [refactor] [llvm] Rename CodeGenCPU/CUDA/WASM and CodeGenLLVMCPU/CUDA/WASM (#5500) (by Lin Jiang)
    • [bug] Let the arguments in ti.init override the environment variables (#5497) (by Lin Jiang)
    • [misc] Add debug logging and TI_AUTO_PROF for offline cache (#5503) (by Mingming Zhang)
    • [misc] ti.Tape -> ti.ad.Tape (#5501) (by Zihua Wu)
    • [misc] Support jit offline cache for kernels that call real functions (#5477) (by Mingming Zhang)
    • [doc] Update cpp tests build doc (#5493) (by Bo Qiao)
    • [Lang] Support call field.fill in kernel functions (#5486) (by Zhao Liang)
    • [Lang] [bug] Make comparisons always return i32 (#5487) (by Yi Xu)
    • [gui] [vulkan] Support 3d-lines rendering (#5492) (by Mocki)
    • [autodiff] Switch off parts of store forwarding optimization for autodiff (#5464) (by Mingrui Zhang)
    • [llvm] [aot] Add LLVM to CAPI part 9: Added AOT field tests for LLVM backend in C-API (#5461) (by Zhanlue Yang)
    • [bug] [llvm] Fix GEP when allocating TLS buffer in struct for (#5473) (by Lin Jiang)
    • [gui] [vulkan] Modify some internal APIs (#5484) (by Mocki)
    • [Build] Remove TI_EMSCRIPTENED related code (#5483) (by Bo Qiao)
    • [type] [refactor] Decouple quant from SNode 8/n: Remove redundant handling of llvm15 in codegen_llvm_quant (#5480) (by Yi Xu)
    • [CUDA] Enable shared memory for CUDA (#5429) (by Haidong Lan)
    • [gui] [vulkan] A faster version of depth copy through ti.field/ti.ndarray (copy directly from vulkan to cuda/gpu/cpu) (#5455) (by Mocki)
    • [misc] Add missing members of XXXExpression and FrontendXXXStmt to result of ASTSerializer (#5471) (by Mingming Zhang)
    • [llvm] [aot] Added field tests for LLVM backend in CGraph (#5458) (by Zhanlue Yang)
    • [type] [refactor] Decouple quant from SNode 7/n: Rewrite BitStructStoreStmt codegen without SNode (#5475) (by Yi Xu)
    • [llvm] [aot] Add LLVM to CAPI part 8: Added CGraph tests for LLVM backend in C-API (#5456) (by Zhanlue Yang)
    • [build] [refactor] Rename taichi core and taichi python targets (#5451) (by Bo Qiao)
    • [llvm] [aot] Add LLVM to CAPI part 6: Handle Field initialization in C-API (#5444) (by Zhanlue Yang)
    • [llvm] [aot] Add LLVM to CAPI part 7: Added AOT kernel tests for LLVM backend in C-API (#5447) (by Zhanlue Yang)
    • [error] Throw proper error message when an Ndarray is passed in via ti.template (#5457) (by Ailing)
    • [type] [refactor] Decouple quant from SNode 6/n: Rewrite extract_quant_float() without SNode (#5448) (by Yi Xu)
    • [bug] Set SNode tree id to all SNodes (#5454) (by Lin Jiang)
    • [AOT] Support on-device event (#5433) (by PENGUINLIONG)
    • [llvm] [aot] Add LLVM to CAPI part 5: Added C-API tests for Vulkan and Cuda backend (#5440) (by Zhanlue Yang)
    • [llvm] [bug] Fixing the crash in release tests introduced by a typo in #5381 where we need a deep copy of arglist. (#5441) (by Proton)
    • [llvm] [aot] Add LLVM to CAPI part 4: Enabled C-API tests on CI & Added C-API tests for CPU backend (#5435) (by Zhanlue Yang)
    • [misc] Bump version to v1.0.5 (#5437) (by Proton)
    • [aot] Support specifying vk_api_version in CompileConfig (#5419) (by Ailing)
    • [Lang] Add append attribute to dynamic fields (#5413) (by Zhao Liang)
    • [Lang] Add inf and nan (#5270) (by Zhao Liang)
    • [Doc] Updated docsite structure (#5416) (by Vissidarte-Herman)
    • [ci] Run release tests (#5327) (by Proton)
    • [type] [refactor] Decouple quant from SNode 5/n: Rewrite load_quant_float() without SNode (#5422) (by Yi Xu)
    • [llvm] Allow using clang 15 for COMPILE_LLVM_RUNTIME (#5381) (by Xiang Li)
    • [opengl] Speedup compilation for Nvidia cards (#5430) (by Bob Cao)
    • [Bug] Fix infinite loop when exponent of integer pow is negative (#5275) (by Mike He)
    • [build] [refactor] Move spirv codegen and common targets (#5415) (by Bo Qiao)
    • [autodiff] Check not placed field.dual and add needs_dual (#5412) (by Mingrui Zhang)
    • [bug] Simplify scalar handling in cgraph and relax field_dim check (#5411) (by Ailing)
    • [gui] [vulkan] Support for getting depth information for python users. (#5410) (by Mocki)
    • [AOT] Adjusted C-API for nd-array type conformance (#5417) (by PENGUINLIONG)
    • [type] Decouple quant from SNode 4/n: Add exponent info to BitStructType (#5407) (by Yi Xu)
    • [llvm] Avoid creating new LLVM contexts when updating struct module (#5397) (by Lin Jiang)
    • [build] Enable C-API compilation on CI (#5403) (by Zhanlue Yang)
    • [Lang] Implement assignment by slicing (#5369) (by Mike He)
    • [llvm] [aot] Add LLVM to CAPI part 3: Adapted AOT interfaces for LLVM backend (#5402) (by Zhanlue Yang)
    • [AOT] Fixed Vulkan device import capability settings (#5400) (by PENGUINLIONG)
    • [llvm] [aot] Add LLVM to CAPI part 2: Adapted memory allocation interfaces for LLVM backend (#5396) (by Zhanlue Yang)
    • [autodiff] Add ternary operators for forward mode (#5405) (by Mingrui Zhang)
    • [llvm] [aot] Add LLVM to CAPI part 1: Implemented capi::LlvmRuntime class (#5393) (by Zhanlue Yang)
    • [ci] Add per test hard timeout limit (#5384) (by Proton)
    • [ci] Properly detect $DISPLAY (#5398) (by Proton)
    • [ci] Llvm15 clang10 ci (#5368) (by Xiang Li)
    • [llvm] [bug] Add stop grad to ASTSerializer (#5401) (by Lin Jiang)
    • [autodiff] Add test for ternary operators in reverse mode autodiff (#5395) (by Mingrui Zhang)
    • [llvm] [aot] Add numerical unit tests for LLVM-CGraph (#5319) (by Zhanlue Yang)
  • v1.0.4(Jul 12, 2022)

    Highlights:

    • Documentation
      • Fix typos (#5283) (by Kian-Meng Ang)
      • Update dev_install.md (#5266) (by Vissidarte-Herman)
      • Updated README command lines (#5199) (by Vissidarte-Herman)
      • Modify compilation warnings (#5180) (by Olinaaaloompa)
      • Updated odop.md, removing obsolete information (#5163) (by Vissidarte-Herman)
    • Language and syntax
      • Refine SNode with quant 7/n: Support placing QuantFixedType under quant_array (#5386) (by Yi Xu)
      • Add determinant for 1d case (#5375) (by Zhao Liang)
      • Make floor, ceil and round accept a dtype optional argument (#5307) (by Zhao Liang)
      • Rename struct_class to dataclass (#5365) (by Zhao Liang)
      • Improve ti example so that users can choose which example to run by entering numbers. (#5265) (by Zhao Liang)
      • Refine SNode with quant 5/n: Rename bit_array to quant_array (#5344) (by Yi Xu)
      • Make bit_vectorize a parameter of ti.loop_config (#5334) (by Yi Xu)
      • Refine SNode with quant 3/n: Turn bit_vectorize into an on/off switch (#5331) (by Yi Xu)
      • Add error message for missing init call (#5280) (by Zhao Liang)
      • Fix fractal gui close warning (#5281) (by Zhao Liang)
      • Refine SNode with quant 2/n: Enable struct for on bit_array with bit_vectorize off (#5253) (by Yi Xu)
      • Refactor indexing expressions in AST & enforce integer indices (#5138) (by daylily)

    Full changelog:

    • Revert "[llvm] (Decomp of #5251 11/n) Enable parallel compilation on CPU backend (#5394)" (by Proton)
    • [refactor] Default dtype of ndarray type should be None instead of f32 (#5391) (by Ailing)
    • [llvm] (Decomp of #5251 11/n) Enable parallel compilation on CPU backend (#5394) (by Lin Jiang)
    • [gui] [vulkan] Support for python users to control the start index and count number of particles & meshes data. (#5388) (by Mocki)
    • [autodiff] Support binary operators for forward mode (#5389) (by Mingrui Zhang)
    • [llvm] (Decomp of #5251 10/n) Make SNode tree compatible with parallel compilation (#5390) (by Lin Jiang)
    • [llvm] [refactor] (Decomp of #5251 9/n) Refactor CodeGen to support parallel compilation on LLVM backend (#5387) (by Lin Jiang)
    • [Lang] [type] Refine SNode with quant 7/n: Support placing QuantFixedType under quant_array (#5386) (by Yi Xu)
    • [llvm] [refactor] (Decomp of #5251 8/n) Refactor KernelCacheData (#5383) (by Lin Jiang)
    • [cuda] [type] Refine SNode with quant 6/n: Support __ldg for loading QuantFixedType and QuantFloatType (#5374) (by Yi Xu)
    • [doc] Add simt functions in operators (#5333) (by Bo Qiao)
    • [Lang] Add determinant for 1d case (#5375) (by Zhao Liang)
    • [lang] Texture image load store support (#5317) (by Bob Cao)
    • [bug] Cast scalar to right type before converting to uint64 (by Ailing Zhang)
    • [refactor] Check dtype mismatch in cgraph compilation and runtime (by Ailing Zhang)
    • [refactor] Check field_dim mismatch in cgraph compilation and runtime (by Ailing Zhang)
    • [test] Check repeated arg names in cgraph (by Ailing Zhang)
    • [llvm] [refactor] (Decomp of #5251 6/n) Let ModuleToFunctionConverter support multiple modules (#5372) (by Lin Jiang)
    • [Lang] Make floor, ceil and round accept a dtype optional argument (#5307) (by Zhao Liang)
    • [refactor] Rename the confused needs_grad (#5359) (by Mingrui Zhang)
    • [autodiff] Support unary ops for forward mode (#5366) (by Mingrui Zhang)
    • [llvm] (Decomp of #5251 7/n) Change the way to record the time of offline cache (#5373) (by Lin Jiang)
    • [llvm] (Decomp of #5251 5/n) Add the parallel compilation worker to LlvmProgramImpl (#5364) (by Lin Jiang)
    • [gui] [test] Fix bug in test_ggui.py when some pc env do not support ggui (#5370) (by Mocki)
    • [Lang] Rename struct_class to dataclass (#5365) (by Zhao Liang)
    • [llvm] Drop code for llvm 15. (#5313) (by Xiang Li)
    • [llvm] [aot] Rewrite LLVM AOT tests with LlvmRuntimeExecutor (#5358) (by Zhanlue Yang)
    • [example] Avoid f64 type in simulation/initial_value_problem.py (#5355) (by Proton)
    • [ci] testing: add retention-days for broken wheels (#5326) (by Proton)
    • [test] (Decomp of #5251 4/n) Delete tests for AsyncTaichi (#5357) (by Lin Jiang)
    • [llvm] [refactor] (Decomp of #5251 2/n) Make modulegen a virtual function and let LLVMCompiledData replace ModuleGenValue (#5353) (by Lin Jiang)
    • [gui] Support exporting gif && video in GGUI (#5354) (by Mocki)
    • [autodiff] Handle field accessing by zero for forward mode (#5339) (by Mingrui Zhang)
    • [llvm] [refactor] (Decomp of #5251 3/n) Remove codegen from OffloadedTask and let it replace OffloadedTaskCacheData (#5356) (by Lin Jiang)
    • [refactor] Turn off stack traceback info by default (#5347) (by Ailing)
    • [refactor] (Decomp of #5251 1/n) Move ParallelExecutor out of async engine (#5351) (by Lin Jiang)
    • [Lang] Improve ti example so that users can choose which example to run by entering numbers. (#5265) (by Zhao Liang)
    • [gui] Add get_view_matrix() and get_projection_matrix() APIs for camera (#5345) (by Mocki)
    • [bug] Added warning messages for implicit type conversion for RangeFor boundaries (#5322) (by Zhanlue Yang)
    • [example] Fix simulation/waterwave.py:update race condition (#5346) (by Proton)
    • [Lang] [type] Refine SNode with quant 5/n: Rename bit_array to quant_array (#5344) (by Yi Xu)
    • [llvm] [aot] Added CGraph tests for LLVM backend (#5305) (by Zhanlue Yang)
    • [autodiff] [test] Add for-loop tests for forward mode (#5336) (by Mingrui Zhang)
    • [example] Lower example GUI resolution to fit buildbot display (#5337) (by Proton)
    • [build] [bug] Fix build failure on macOS 10.14 (#5332) (by PGZXB)
    • [llvm] [aot] Replaced LlvmProgramImpl with LlvmRuntimeExecutor for LlvmAotModuleLoader (#5330) (by Zhanlue Yang)
    • [AOT] Fixed certain crashes in C-API (#5335) (by PENGUINLIONG)
    • [Lang] [type] Make bit_vectorize a parameter of ti.loop_config (#5334) (by Yi Xu)
    • [autodiff] Skip store forwarding to keep the GlobalLoadStmt alive (#5315) (by Mingrui Zhang)
    • [llvm] [aot] Modified ModuleToFunctionConverter to use LlvmRuntimeExecutor instead of LlvmProgramImpl (#5328) (by Zhanlue Yang)
    • [llvm] Changed LlvmProgramImpl to save cache_data_ with unique_ptr instead of raw object (#5329) (by Zhanlue Yang)
    • [Lang] [type] Refine SNode with quant 3/n: Turn bit_vectorize into an on/off switch (#5331) (by Yi Xu)
    • [misc] Fix a few compilation warnings (#5325) (by yekuang)
    • [bug] Accept numpy integers in ndrange (#5245) (#5323) (by Proton)
    • [misc] Implement cache file cleaning (#5310) (by PGZXB)
    • Fixed C-API build on Android (#5321) (by PENGUINLIONG)
    • [AOT] Save AOT module artifacts as zip archive (#5316) (by PENGUINLIONG)
    • [llvm] [aot] Added LLVM backend support for Compute Graph (#5294) (by Zhanlue Yang)
    • [AOT] Unity native plugin interfaces (#5273) (by PENGUINLIONG)
    • [autodiff] Check not placed field.grad when needs_grad = True (#5295) (by Mingrui Zhang)
    • [autodiff] Fix alloca block and add control flow test case for forward mode (#5301) (by Mingrui Zhang)
    • [refactor] Synchronize should always be called in non-async mode (#5302) (by Ailing)
    • [Lang] Add error message for missing init call (#5280) (by Zhao Liang)
    • Update prtags.json (#5304) (by Bob Cao)
    • [refactor] Get rid of ndarray host accessor kernels (by Ailing Zhang)
    • [refactor] Use device api for CPU/CUDA ndarray (by Ailing Zhang)
    • [refactor] Switch to using staging buffer for metal/vulkan/opengl (by Ailing Zhang)
    • [llvm] Use LlvmProgramImpl::cache_data_ to store compiled kernel info (#5290) (by Zhanlue Yang)
    • [opengl] Texture support in OpenGL (#5296) (by Bob Cao)
    • [build] [refactor] Cleanup backends folder and rename to RHI (#5288) (by Bo Qiao)
    • [Lang] Fix fractal gui close warning (#5281) (by Zhao Liang)
    • [autodiff] [test] Add atomic test for forward autodiff (#5286) (by Mingrui Zhang)
    • [dx11] Fix DX backend with new runtime & Better D3D11 buffer handling (#5244) (by Bob Cao)
    • [autodiff] Set default seed only for scalar parameter to avoid silent unexpected results (#5287) (by Mingrui Zhang)
    • test (#5292) (by Ailing)
    • [AOT] Added C-API for on-device memory copy (#5271) (by PENGUINLIONG)
    • [Doc] Fix typos (#5283) (by Kian-Meng Ang)
    • [autodiff] Support control flow for forward mode (by mingrui)
    • [autodiff] Support for-loop and mutation for forward mode (by mingrui)
    • [autodiff] Refactor dual field allocation (by mingrui)
    • [AOT] Refactor C-API codegen (#5272) (by PENGUINLIONG)
    • Update README.md (#5279) (by Taichi contributor)
    • [metal] Support memcpy_internal via buffer_copy (#5268) (by Ailing)
    • [bug] Fix missing old but useful metadata in offline cache (#5267) (by PGZXB)
    • [Lang] [type] Refine SNode with quant 2/n: Enable struct for on bit_array with bit_vectorize off (#5253) (by Yi Xu)
    • [Doc] Update dev_install.md (#5266) (by Vissidarte-Herman)
    • [build] [bug] Fix dependency for opengl_rhi target (by Bo Qiao)
    • Update fallback order, move opengl behind Vulkan (#5257) (by Bob Cao)
    • [opengl] Move OpenGL backend onto Gfx runtime (#5246) (by Bob Cao)
    • [build] [refactor] Move LLVM source files to target locations (#5254) (by Bo Qiao)
    • [bug] Fixed misuse of std::forward (#5237) (by Zhanlue Yang)
    • [AOT] Added safety checks to prevent hard crashes on failure (#5249) (by PENGUINLIONG)
    • [build] [refactor] Move shaders source files to runtime (#5247) (by Bo Qiao)
    • [example] Fix diff_sph example with --train (#5242) (by Mingrui Zhang)
    • [misc] Add filename option to ti.tools.VideoManager. (#5219) (by Qian Bao)
    • [bug] Throw exceptions when ndrange gets non-integral arguments (#5245) (by Mike He)
    • [build] [refactor] Move wasm and dx11 source files to target locations (#5235) (by Bo Qiao)
    • [type] [bug] Refine SNode with quant 1/n: Fix (atomic_)set_mask_b##N (#5238) (by Yi Xu)
    • [lang] 1d/3d texture support (#5233) (by Bob Cao)
    • [vulkan] Fix OpBranch for reversed RangeForStmt (#5241) (by Mingrui Zhang)
    • [build] Fix -Werror errors for TI_WITH_CUDA_TOOLKIT=ON (#5133) (#5216) (by Proton)
    • [ci] Enable pylint on examples (#5222) (by Proton)
    • [llvm] [aot] Split LlvmRuntimeExecutor from LlvmProgramImpl (#5207) (by Zhanlue Yang)
    • [type] [refactor] Decouple quant from SNode 3/n: Extend bit pointers (#5232) (by Yi Xu)
    • [vulkan] Codegen & runtime improvements (#5213) (by Bob Cao)
    • [gui] Fix the device memory leak when GGUI terminates (by Ailing Zhang)
    • [gui] Let gui and renderer manage the resource they own (by Ailing Zhang)
    • [AOT] Unity language binding generator (#5204) (by PENGUINLIONG)
    • [type] [refactor] Decouple quant from SNode 2/n: Remove physical_type from QuantIntType (#5223) (by Yi Xu)
    • [type] [refactor] Decouple quant from SNode 1/n: Add BitStructTypeBuilder (#5209) (by Yi Xu)
    • [build] [refactor] Move metal source files to target locations (#5208) (by Bo Qiao)
    • [lang] Export a few types from the share library (#5220) (by yekuang)
    • [llvm] [refactor] LLVMProgramImpl code clean up: part-5 (#5197) (by Zhanlue Yang)
    • [spirv] Fixed OpLoad with physical address (#5212) (by PENGUINLIONG)
    • [wip] Enable full wheel build when TI_EXPORT_CORE is on (#5211) (by Ailing)
    • [llvm] [refactor] LLVMProgramImpl code clean up: part-4 (#5189) (by Zhanlue Yang)
    • Move spdlog include to profiler.cpp (#5210) (by Ailing)
    • Fix ti gallery command bug (#5196) (by Zhao Liang)
    • [misc] Improve TI_STATIC_ASSERT compatibility (#5205) (by Yuanming Hu)
    • [llvm] [refactor] LLVMProgramImpl code clean up: part-3 (#5188) (by Zhanlue Yang)
    • Fixed C-API provision (#5203) (by PENGUINLIONG)
    • [lang] Improve error message when literal val is out of range of default dtype (#5191) (by Ailing)
    • [Lang] [ir] Refactor indexing expressions in AST & enforce integer indices (#5138) (by daylily)
    • Remove stale coverage from README.md (#5202) (by yekuang)
    • [ci] Slim cpu build image (#5198) (by Proton)
    • [build] [refactor] Move opengl source files to target locations (#5200) (by Bo Qiao)
    • [example] Fix dtype for metal backend and enforce vulkan (#5201) (by Mingrui Zhang)
    • [Doc] Updated README command lines (#5199) (by Vissidarte-Herman)
    • [llvm] [refactor] LLVMProgramImpl code clean up: part-2 (#5187) (by Zhanlue Yang)
    • [AOT] Support Matrix/Vector as graph arguments (#5165) (by Haidong Lan)
    • [refactor] Enable adaptive block_dim selection for CPU backend (#5190) (by Bo Qiao)
    • [Doc] Modify compilation warnings (#5180) (by Olinaaaloompa)
    • [ci] Save wheel to artifact when test fails (#5186) (by Proton)
    • [gui] Detailed error message when GGUI is not available (#5164) (by Proton)
    • [ci] Run C++ tests on Windows (#5176) (by Proton)
    • [lang] Texture support 3/n (Python changes) (#5174) (by Bob Cao)
    • [llvm] [refactor] LLVMProgramImpl code clean up: part-1 (#5181) (by Zhanlue Yang)
    • [AOT] Implementation of Taichi Runtime C-API (#5168) (by PENGUINLIONG)
    • [refactor] [autodiff] Clean redundant compiled functions and refactor kernel key (#5178) (by Mingrui Zhang)
    • [doc] Add badge on README.md (#5177) (by yanqingzhang)
    • [lang] Texture support 2/n (SPIR-V backend & runtime changes) (#5159) (by Bob Cao)
    • [build] Export cmake config to ease clients usage in Cmake (#5162) (by Bo Qiao)
    • [refactor] [autodiff] Refactor autodiff api and add corresponding tests (#5175) (by Mingrui Zhang)
    • [aot] [llvm] LLVM AOT Field part-4: Added AOT tests for Fields - CUDA backend (#5124) (by Zhanlue Yang)
    • [type] [refactor] Consistently use quant_xxx in quant-related names (#5166) (by Yi Xu)
    • [cuda] Disable reduction in non-full warps (#5161) (by Bob Cao)
    • [autodiff] Support basic operations for forward mode autodiff (by mingrui)
    • [autodiff] Add a context manager for forward mode autodiff (by mingrui)
    • [AOT] C-APIs for Taichi runtime distribution (#5150) (by PENGUINLIONG)
    • [cli] Improve user interface for CLI command ti example (#5153) (by Zhao Liang)
    • [Doc] Updated odop.md, removing obsolete information (#5163) (by Vissidarte-Herman)
    • [autodiff] [refactor] Refactor autodiff tape api and TapeImpl (#5154) (by Mingrui Zhang)
    • [type] [refactor] Separate CustomFixedType from CustomFloatType (#5149) (by Yi Xu)
    • [ui] Properly fix UTF-8 title string by converting to UTF16 (#5155) (by Bob Cao)
    • [aot] [llvm] LLVM AOT Field #3: Added AOT tests for Fields - CPU backend (#5121) (by Zhanlue Yang)
    • Bump version to v1.0.4 (#5157) (by Taichi Gardener)
    • [lang] Texture support 1/n (Context & Programs) (#5139) (by Bob Cao)
  • v1.0.3(Jun 13, 2022)

    Highlights:

    • Aot module
      • Support importing external Vulkan buffers (#5020) (by PENGUINLIONG)
      • Supported inclusion of taichi as subdirectory for AOT modules (#5007) (by PENGUINLIONG)
    • Bug fixes
      • Fix frontend type check for reading a whole bit_struct (#5027) (by Yi Xu)
      • Remove redundant AllocStmt when lowering FrontendWhileStmt (#4870) (by Zhanlue Yang)
    • Build system
      • Improve Windows build script (#4955) (by PENGUINLIONG)
      • Improved building on Windows (#4925) (by PENGUINLIONG)
      • Define Cmake OpenGL runtime target (#4887) (by Bo Qiao)
      • Use keywords instead of plain target_link_libraries CMake (#4864) (by Bo Qiao)
      • Define runtime build target (#4838) (by Bo Qiao)
      • Switch to scikit-build as the build backend (#4624) (by Frost Ming)
    • Documentation
      • Improve ODOP doc structure (#5089) (by Yi Xu)
      • Add documentation of Taichi Struct Classes. (#5075) (by bsavery)
      • Updated type system (#5054) (by Vissidarte-Herman)
      • Branding updates. Also tests netlify. (#4994) (by Vissidarte-Herman)
      • Fix netlify cache & sync doc without pr content (#5003) (by Justin)
      • Update trouble shooting URL in bug report template (#4988) (by Haidong Lan)
      • Updated URL (#4990) (by Vissidarte-Herman)
      • Fix docs deploy netlify test configuration (#4991) (by Justin)
      • Updated relative path (#4929) (by Vissidarte-Herman)
      • Updated broken links (#4912) (by Vissidarte-Herman)
      • Updated links that may break. (#4874) (by Vissidarte-Herman)
      • Add limitation about TLS optimization (#4877) (by Ailing)
    • Examples
      • Fix block_dim warning in ggui (#5128) (by Zhao Liang)
      • Update visual effects of mass_spring_3d_ggui.py (#5081) (by Zhao Liang)
      • Update mass_spring_3d_ggui.py to v2 (#3879) (by Alex Brown)
    • Language and syntax
      • Add more initialization routines for glsl matrix types (#5069) (by Zhao Liang)
      • Support constructing vector and matrix ndarray from ti.ndarray() (by ailzhang)
      • Disallow reading a whole bit_struct (#5061) (by Yi Xu)
      • Struct Classes implementation (#4989) (by bsavery)
      • Add short-circuit if-then-else operator (#5022) (by daylily)
      • Build sparse matrix from ndarray (#4841) (by pengyu)
      • Fix potential precision bug when using math vector and matrix types (#5032) (by Zhao Liang)
      • Refactor quant type definition APIs (#5036) (by Yi Xu)
      • Fix parameter name 'range' for ti.types.quant.fixed (#5006) (by Yi Xu)
      • Refactor quantized_types module and make quant APIs public (#4985) (by Yi Xu)
      • Add more functions to math module (#4939) (by Zhao Liang)
      • Support sparse matrix datatype and storage format configuration (#4673) (by pengyu)
      • Copy-free interaction between Taichi and PaddlePaddle (#4886) (by 0xzhang)
    • LLVM backend (CPU and CUDA)
      • Add AOT builder and loader (#5013) (by yekuang)
    • Metal backend
      • Support Ndarray (#4720) (by yekuang)
    • RFC
      • AOT for all SNodes (#4806) (by yekuang)
    • SIMT programming
      • Add match_all warp intrinsics (#4961) (by Zeyu Li)
      • Add match_any warp intrinsics (#4921) (by Zeyu Li)
      • Add uni_sync warp intrinsics (#4927) (by 0xzhang)
      • Add activemask warp intrinsics (#4918) (by Zeyu Li)
      • Add syncwarp warp intrinsics (#4917) (by Zeyu Li)
    • Vulkan backend
      • Fixed vulkan backend crash on AOT examples (#5047) (by PENGUINLIONG)
    • GitHub Actions/Workflows
      • Update release_test.sh (#4960) (by Chuandong Yan)

    Full changelog:

    • [aot] [llvm] LLVM AOT Field #2: Updated LLVM AOTModuleLoader & AOTModuleBuilder to support Fields (#5120) (by Zhanlue Yang)
    • [type] [refactor] Misc improvements to quant codegen (#5129) (by Yi Xu)
    • [ci] Enable yapf and isort on example files (#5140) (by Ailing)
    • [Example] Fix block_dim warning in ggui (#5128) (by Zhao Liang)
    • fix mass_spring_3d_ggui backend (#5127) (by Zhao Liang)
    • [lang] Texture support 0/n: IR changes (#5134) (by Bob Cao)
    • Editorial update (#5119) (by Olinaaaloompa)
    • [aot] [llvm] LLVM AOT Field #1: Adjust serialization/deserialization logics for FieldCacheData (#5111) (by Zhanlue Yang)
    • [aot][bug] Use cached compiled kernel pointer when it's added to graph (#5122) (by Ailing)
    • [aot] [llvm] LLVM AOT Field #0: Implemented FieldCacheData & refactored initialize_llvm_runtime_snodes() (#5108) (by Zhanlue Yang)
    • [autodiff] Add forward mode pipeline for autodiff pass (#5098) (by Mingrui Zhang)
    • [build] [refactor] Move Vulkan runtime out of backends dir (#5106) (by Bo Qiao)
    • [bug] Fix build without llvm backend crash (#5113) (by Bo Qiao)
    • [type] [llvm] [refactor] Fix function names in codegen_llvm_quant (#5115) (by Yi Xu)
    • [llvm] [refactor] Replace cast_int() with LLVM native integer cast (#5110) (by Yi Xu)
    • [type] [refactor] Remove redundant promotion for custom int in type_check (#5102) (by Yi Xu)
    • [Example] Update visual effects of mass_spring_3d_ggui.py (#5081) (by Zhao Liang)
    • [test] Save mpm88 graph in python and load in C++ test. (#5104) (by Ailing)
    • [llvm] [refactor] Move load_bit_pointer() to CodeGenLLVM (#5099) (by Yi Xu)
    • [refactor] Remove ndarray element shape from extra arg buffer (#5100) (by Haidong Lan)
    • [refactor] Update Ndarray constructor used in AOT runtime. (#5095) (by Ailing)
    • clean hidden override functions (#5097) (by Mingrui Zhang)
    • [llvm] [aot] CUDA-AOT PR #2: Implemented AOTModuleLoader & AOTModuleBuilder for LLVM-CUDA backend (#5087) (by Zhanlue Yang)
    • [Doc] Improve ODOP doc structure (#5089) (by Yi Xu)
    • Use pre-calculated runtime size array for gfx runtime. (#5094) (by Haidong Lan)
    • [bug] Minor fix for ndarray element_shape in graph mode (#5093) (by Ailing)
    • [llvm] [refactor] Use LLVM native atomic ops if possible (#5091) (by Yi Xu)
    • [autodiff] Extract shared components for reverse and forward mode (#5088) (by Mingrui Zhang)
    • [llvm] [aot] Add LLVM-CPU AOT tests (#5079) (by Zhanlue Yang)
    • [Doc] Add documentation of Taichi Struct Classes. (#5075) (by bsavery)
    • [build] [refactor] Change CMake global include_directories to target based function (#5082) (by Bo Qiao)
    • [autodiff] Allocate dual and adjoint snode (#5083) (by Mingrui Zhang)
    • [refactor] Make sure Ndarray shape is field shape (#5085) (by Ailing)
    • [llvm] [refactor] Merge AtomicOpStmt codegen in CPU and CUDA backends (#5086) (by Yi Xu)
    • [llvm] [aot] CUDA-AOT PR #1: Extracted common logics from CPUAotModuleImpl into LLVMAotModule (#5072) (by Zhanlue Yang)
    • [infra] Refactor Vulkan runtime into true Common Runtime (#5058) (by Bob Cao)
    • [refactor] Correctly set ndarray element_size and nelement (#5080) (by Ailing)
    • [cuda] [simt] Add assertions for warp intrinsics on old GPUs (#5077) (by Bo Qiao)
    • [Lang] Add more initialization routines for glsl matrix types (#5069) (by Zhao Liang)
    • [spirv] Specialize element shape for spirv codegen. (#5068) (by Haidong Lan)
    • [llvm] Specialize element shape for LLVM backend (#5071) (by Haidong Lan)
    • [doc] Fix broken link for github action status badge (#5076) (by Ailing)
    • [Example] Update mass_spring_3d_ggui.py to v2 (#3879) (by Alex Brown)
    • [refactor] Resolve comments from #5065 (#5074) (by Ailing)
    • [Lang] Support constructing vector and matrix ndarray from ti.ndarray() (by ailzhang)
    • [refactor] Pass element_shape and layout to C++ Ndarray (by ailzhang)
    • [refactor] Specialized Ndarray Type is (element_type, shape, layout) (by ailzhang)
    • [aot] [CUDA-AOT PR #0] Refactored compile_module_to_executable() to CUDAModuleToFunctionConverter (#5070) (by Zhanlue Yang)
    • [refactor] Split GraphBuilder out of Graph class (#5064) (by Ailing)
    • [build] [bug] Ensure the assets folder is copied to the project directory (#5063) (by Frost Ming)
    • [bug] Remove operator ! for Expr (#5062) (by Yi Xu)
    • [Lang] [type] Disallow reading a whole bit_struct (#5061) (by Yi Xu)
    • [Lang] Struct Classes implementation (#4989) (by bsavery)
    • [Lang] [ir] Add short-circuit if-then-else operator (#5022) (by daylily)
    • [bug] Ndarray type should include primitive dtype as well (#5052) (by Ailing)
    • [Doc] Updated type system (#5054) (by Vissidarte-Herman)
    • [bug] Added type promotion support for atan2 (#5037) (by Zhanlue Yang)
    • [Lang] Build sparse matrix from ndarray (#4841) (by pengyu)
    • Set host_write to false for opengl ndarray (#5038) (by Ailing)
    • [ci] Run cpp tests via run_tests.py (#5035) (by yekuang)
    • Exit CI builds when download of prebuilt packages fails (#5043) (by PENGUINLIONG)
    • [Vulkan] Fixed vulkan backend crash on AOT examples (#5047) (by PENGUINLIONG)
    • [Lang] Fix potential precision bug when using math vector and matrix types (#5032) (by Zhao Liang)
    • [Metal] Support Ndarray (#4720) (by yekuang)
    • [Lang] [type] Refactor quant type definition APIs (#5036) (by Yi Xu)
    • [aot] Bind graph APIs to python and add mpm88 example (#5034) (by Ailing)
    • [aot] Move ArgKind as first argument in Arg class (by ailzhang)
    • [aot] Serialize built graph, deserialize and run. (by ailzhang)
    • [ci] Disable win cpu docker job test (#5033) (by Bo Qiao)
    • [doc] Update OS names (#5030) (by Bo Qiao)
    • fix fast_gui rgba bug (#5031) (by Zhao Liang)
    • [Bug] [type] Fix frontend type check for reading a whole bit_struct (#5027) (by Yi Xu)
    • [AOT] Support importing external Vulkan buffers (#5020) (by PENGUINLIONG)
    • [SIMT] Add match_all warp intrinsics (#4961) (by Zeyu Li)
    • [bug] Revert freeing ndarray memory when python GC triggers (#5019) (by Ailing)
    • [ci] Fix nightly macos (#5018) (by Bo Qiao)
    • [Llvm] Add AOT builder and loader (#5013) (by yekuang)
    • [aot] Build and run graph without serialization (by Ailing Zhang)
    • [test] Unify kernel setup for ndarray related tests (by Ailing Zhang)
    • [ci] [build] Enable ccache for windows docker (#5001) (by Frost Ming)
    • [refactor] Move get ndarray data ptr to program (#5012) (by pengyu)
    • [bug] Fixed numerical error for Atomic-Sub between unsigned values with different number of bits (#5011) (by Zhanlue Yang)
    • [llvm] Add serializable LlvmLaunchArgInfo (#4992) (by yekuang)
    • [doc] Update community section (#4943) (by yanqingzhang)
    • [SIMT] Add match_any warp intrinsics (#4921) (by Zeyu Li)
    • [Lang] [type] Fix parameter name 'range' for ti.types.quant.fixed (#5006) (by Yi Xu)
    • [misc] Version bump: v1.0.2 -> v1.0.3 (#5008) (by Haidong Lan)
    • [AOT] Supported inclusion of taichi as subdirectory for AOT modules (#5007) (by PENGUINLIONG)
    • [Doc] Branding updates. Also tests netlify. (#4994) (by Vissidarte-Herman)
    • [refactor] Get rid of data_ptr_ in Ndarray (by Ailing Zhang)
    • [refactor] Move ndarray fast fill methods to Program (by Ailing Zhang)
    • [refactor] Free ndarray's memory when python GC triggers (by Ailing Zhang)
    • [refactor] Construct ndarray from existing DeviceAllocation. (by Ailing Zhang)
    • [test] Add test for Ndarray from DeviceAllocation (by Ailing Zhang)
    • [refactor] Program owns allocated ndarrays. (by Ailing Zhang)
    • [Doc] Fix netlify cache & sync doc without pr content (#5003) (by Justin)
    • [test] Fix a few mis-configured ndarray tests (#5000) (by Ailing)
    • Update README.md (by Vissidarte-Herman)
    • [Lang] [type] Refactor quantized_types module and make quant APIs public (#4985) (by Yi Xu)
    • [Doc] Update trouble shooting URL in bug report template (#4988) (by Haidong Lan)
    • [Doc] Updated URL (#4990) (by Vissidarte-Herman)
    • [Doc] Fix docs deploy netlify test configuration (#4991) (by Justin)
    • [llvm] Use serializer for LLVM cache (#4982) (by yekuang)
    • Provision of prebuilt LLVM 10 for VS2022 (#4987) (by PENGUINLIONG)
    • [Workflow] Update release_test.sh (#4960) (by Chuandong Yan)
    • [cuda] Add block and grid level intrinsic for cuda backend (#4977) (by YuZhang)
    • [bug] Fix infinite recursion of get_offline_cache_key_of_snode_impl() (#4983) (by PGZXB)
    • [misc] Add ASTSerializer::visit(ReferenceExpression *) (#4984) (by PGZXB)
    • [llvm] Support both BC and LL cache format (#4979) (by yekuang)
    • [refactor] Improve serializer and cleanup utils (#4980) (by yekuang)
    • [Build] Improve Windows build script (#4955) (by PENGUINLIONG)
    • [llvm] Make cache writer support BC format (#4978) (by yekuang)
    • [ci] [build] Containerize Windows CPU build and test (#4933) (by Bo Qiao)
    • [llvm] Make codegen produce static llvm::Module (#4975) (by yekuang)
    • [test] Add an ndarray test in C++. (#4972) (by Ailing)
    • [build] Fixed Illegal Instruction Error when importing PaddlePaddle module (#4969) (by Zhanlue Yang)
    • [llvm] Create ModuleToFunctionConverter (#4962) (by yekuang)
    • [bug] [simt] Fix the problem that some intrinsics are never called (#4957) (by Yi Xu)
    • [vulkan] Set kApiVersion to VK_API_VERSION_1_3 (#4970) (by Haidong Lan)
    • [ci] Add new buildbot with latest driver for Linux/Vulkan test (#4953) (by Bo Qiao)
    • [RFC] AOT for all SNodes (#4806) (by yekuang)
    • [llvm] Move cache directory to dump() (#4963) (by yekuang)
    • [lang] Add reference type support on real functions (#4889) (by Lin Jiang)
    • [refactor] Some renamings (#4959) (by yekuang)
    • [refactor] Add ArrayMetadata to store the array runtime size (#4950) (by yekuang)
    • [lang] [bug] Implement Expression serializing and fix some bugs (#4931) (by PGZXB)
    • [Lang] Add more functions to math module (#4939) (by Zhao Liang)
    • [Build] Improved building on Windows (#4925) (by PENGUINLIONG)
    • [ci] Fix Nightly (#4948) (by Bo Qiao)
    • [build] Limit -Werror to Clang-compiler only (#4947) (by Zhanlue Yang)
    • [refactor] [llvm] Remove struct_compiler_ as a member variable (#4945) (by yekuang)
    • [build] Turned off -Werror temporarily for issues with performance-bot (#4946) (by Zhanlue Yang)
    • [refactor] Remove unused snode_trees in ProgramImpl interface (#4942) (by yekuang)
    • [doc] Updated documentations for implicit type casting rules (#4885) (by Zhanlue Yang)
    • [build] Turn on -Werror on Linux and Mac platforms (#4928) (by Zhanlue Yang)
    • [build] Enable -Werror on Linux & Mac (#4941) (by Zhanlue Yang)
    • [SIMT] Add uni_sync warp intrinsics (#4927) (by 0xzhang)
    • [lang] Fix type check warnings for ti.Mesh (#4930) (by Chang Yu)
    • [Lang] Support sparse matrix datatype and storage format configuration (#4673) (by pengyu)
    • [Doc] Updated relative path (#4929) (by Vissidarte-Herman)
    • [refactor] Simplify Matrix's initializer (#4923) (by yekuang)
    • [build] Warning Suppression PR #4: Fixed warnings with MacOS (#4926) (by Zhanlue Yang)
    • [build] Warning Suppression PR #3: Eliminate warnings from third-party headers (#4920) (by Zhanlue Yang)
    • [SIMT] Add activemask warp intrinsics (#4918) (by Zeyu Li)
    • [build] Warning Suppression PR #1: Turned on -Wno-ignored-attributes & Removed unused functions (#4916) (by Zhanlue Yang)
    • [refactor] Create MatrixImpl to differentiate Taichi and Python scopes (#4853) (by yekuang)
    • [SIMT] Add syncwarp warp intrinsics (#4917) (by Zeyu Li)
    • [build] Warning Suppression PR #2: Fixed codebase warnings (#4909) (by Zhanlue Yang)
    • [test] Exit on error during Paddle windows test (#4910) (by Bo Qiao)
    • [Doc] Updated broken links (#4912) (by Vissidarte-Herman)
    • remove debug print (#4883) (by yixu)
    • [test] Cancel tests for Paddle on GPU (#4914) (by 0xzhang)
    • [Lang] [test] Copy-free interaction between Taichi and PaddlePaddle (#4886) (by 0xzhang)
    • Use Ninja generator on Windows and skip generator test (#4896) (by Frost Ming)
    • [vulkan] Add new VMA vulkan functions. (#4893) (by Bob Cao)
    • [vulkan] Fix typo for waitSemaphoreCount (#4892) (by Gabriel H)
    • [Build] [refactor] Define Cmake OpenGL runtime target (#4887) (by Bo Qiao)
    • [aot] [vulkan] Expose symbols for AOT (#4879) (by yekuang)
    • [bug] Fixed type promotion rule for bit-shift operations (#4884) (by Zhanlue Yang)
    • [Build] [refactor] Use keywords instead of plain target_link_libraries CMake (#4864) (by Bo Qiao)
    • [metal] Migrate runtime's MTLBuffer allocation to unified device API (#4865) (by yekuang)
    • [error] [lang] Improved error messages for illegal slicing or indexing to ti.field (#4873) (by Zhanlue Yang)
    • [Doc] Updated links that may break. (#4874) (by Vissidarte-Herman)
    • [metal] Complete Device API (#4862) (by yekuang)
    • [vulkan] Device API explicit semaphores (#4852) (by Bob Cao)
    • [build] Change the library output dir for export core (#4880) (by Frost Ming)
    • [refactor] Add ASTSerializer and use it to generate offline-cache-key (#4863) (by PGZXB)
    • [ci] Use the updated docker image for libtaichi_export_core (#4881) (by Bo Qiao)
    • [Doc] Add limitation about TLS optimization (#4877) (by Ailing)
    • [Build] [refactor] Define runtime build target (#4838) (by Bo Qiao)
    • [ci] Add libtaichi_export_core build for desktop in CI (#4871) (by Ailing)
    • [build] [bug] Fix a bug of skbuild that loses the root package_dir (#4875) (by Frost Ming)
    • [Bug] Remove redundant AllocStmt when lowering FrontendWhileStmt (#4870) (by Zhanlue Yang)
    • [misc] Bump version to v1.0.2 (#4867) (by Taichi Gardener)
    • [build] Install export core library to build dir (#4866) (by Frost Ming)
    • [Build] Switch to scikit-build as the build backend (#4624) (by Frost Ming)
  • v1.0.2(May 18, 2022)

    Highlights:

    The v1.0.2 release is a patch release that improves Taichi's stability on multiple platforms, especially for GGUI and the Vulkan backend.

    • Bug fixes
      • Remove redundant AllocStmt when lowering FrontendWhileStmt (#4870) (by Zhanlue Yang)
    • Build system
      • Define Cmake OpenGL runtime target (#4887) (by Bo Qiao)
      • Use keywords instead of plain target_link_libraries CMake (#4864) (by Bo Qiao)
      • Define runtime build target (#4838) (by Bo Qiao)
      • Switch to scikit-build as the build backend (#4624) (by Frost Ming)
    • Documentation
      • Add limitation about TLS optimization (#4877) (by Ailing)

    Full changelog:

    • [ci] Fix Nightly (#4948) (by Bo Qiao)
    • [ci] [build] Containerize Windows CPU build and test (#4933) (by Bo Qiao)
    • [vulkan] Set kApiVersion to VK_API_VERSION_1_3 (#4970) (by Haidong Lan)
    • [ci] Add new buildbot with latest driver for Linux/Vulkan test (#4953) (by Bo Qiao)
    • [vulkan] Add new VMA vulkan functions. (#4893) (by Bob Cao)
    • [vulkan] Fix typo for waitSemaphoreCount (#4892) (by Gabriel H)
    • [Build] [refactor] Define Cmake OpenGL runtime target (#4887) (by Bo Qiao)
    • [Build] [refactor] Use keywords instead of plain target_link_libraries CMake (#4864) (by Bo Qiao)
    • [vulkan] Device API explicit semaphores (#4852) (by Bob Cao)
    • [build] Change the library output dir for export core (#4880) (by Frost Ming)
    • [ci] Use the updated docker image for libtaichi_export_core (#4881) (by Bo Qiao)
    • [Doc] Add limitation about TLS optimization (#4877) (by Ailing)
    • [Build] [refactor] Define runtime build target (#4838) (by Bo Qiao)
    • [ci] Add libtaichi_export_core build for desktop in CI (#4871) (by Ailing)
    • [build] [bug] Fix a bug of skbuild that loses the root package_dir (#4875) (by Frost Ming)
    • [Bug] Remove redundant AllocStmt when lowering FrontendWhileStmt (#4870) (by Zhanlue Yang)
    • [misc] Bump version to v1.0.2 (#4867) (by Taichi Gardener)
    • [build] Install export core library to build dir (#4866) (by Frost Ming)
    • [Build] Switch to scikit-build as the build backend (#4624) (by Frost Ming)
  • v1.0.1(Apr 27, 2022)

    Highlights:

    • Automatic differentiation
      • Implement ti.ad.no_grad to skip autograd (#4751) (by Shawn Yao)
    • Bug fixes
      • Fix and refactor type check for atomic ops (#4858) (by Yi Xu)
      • Fix and refactor type check for local stores (#4843) (by Yi Xu)
      • Fix implicit cast warning for global stores (#4834) (by Yi Xu)
    • Documentation
      • Updated URL (#4847) (by Vissidarte-Herman)
      • LLVM sparse runtime design doc (#4790) (by yekuang)
      • Proofread Getting started (#4682) (by Vissidarte-Herman)
      • Editorial review to fields (advanced) (#4686) (by Vissidarte-Herman)
      • Update docstring for ti.Mesh (#4818) (by Chang Yu)
      • Remove redundant semicolon in path (#4801) (by gaoxinge)
    • Error messages
      • Show warning when serialize=True is set on a struct for (#4844) (by Lin Jiang)
      • Provide source code info in warnings (#4840) (by Yi Xu)
    • Language and syntax
      • Add single character property for vector swizzle && test (#4845) (by Zhao Liang)
      • Remove obsolete vectypes class (#4831) (by LiangZhao)
      • Add support for keyword arguments (#4794) (by Lin Jiang)
      • Support swizzles on all Matrix/Vector types (#4828) (by yekuang)
      • Add 2d and 3d rotation functions to math module (#4822) (by Zhao Liang)
      • Walkaround Vulkan backend behavior which changes cwd on Mac (#4812) (by TiGeekMan)
      • Add mod function to math module (#4809) (by Zhao Liang)
      • Support in-place operator of ti.Matrix in python scope (#4799) (by Lin Jiang)
      • Move short-circuit boolean logic into AST-to-IR passes (#4580) (by daylily)
      • Promote output type of log, exp, and sqrt ops (#4622) (by Andrew Sun)
      • Fix integral type promotion rules (e.g., u8 + u8 now leads to u8 instead of i32) (#4789) (by Yuanming Hu)
      • Add basic complex arithmetic and add a mandelbrot example (#4780) (by Zhao Liang)
    • SIMT programming
      • Add shfl_down_f32 intrinsic. (#4819) (by Chun Cai)

    Full changelog:

    • [gui] Avoid implicit type casts in staging_buffer (#4861) (by Yi Xu)
    • [lang] Add better error detection for swizzle patterns (#4860) (by yekuang)
    • [Bug] [ir] Fix and refactor type check for atomic ops (#4858) (by Yi Xu)
    • [Doc] Updated URL (#4847) (by Vissidarte-Herman)
    • [bug] Fix bug that building with TI_EXPORT_CORE:BOOL=ON failed (#4850) (by PGZXB)
    • [Error] Show warning when serialize=True is set on a struct for (#4844) (by Lin Jiang)
    • [lang] Group related Matrix methods closer (#4836) (by yekuang)
    • [Lang] Add single character property for vector swizzle && test (#4845) (by Zhao Liang)
    • [Bug] [ir] Fix and refactor type check for local stores (#4843) (by Yi Xu)
    • [Error] Provide source code info in warnings (#4840) (by Yi Xu)
    • [misc] Update pre-commit hooks (#4713) (by pre-commit-ci[bot])
    • [Bug] [ir] Fix implicit cast warning for global stores (#4834) (by Yi Xu)
    • [mesh] Remove link hints from ti.Mesh (#4825) (by yixu)
    • [Lang] Remove obsolete vectypes class (#4831) (by LiangZhao)
    • [doc] Fix doc link (#4835) (by yekuang)
    • [Doc] LLVM sparse runtime design doc (#4790) (by yekuang)
    • [Lang] Add support for keyword arguments (#4794) (by Lin Jiang)
    • [Lang] Support swizzles on all Matrix/Vector types (#4828) (by yekuang)
    • [test] Add simple test for offline-cache-key of compile-config (#4805) (by PGZXB)
    • [vulkan] Device API blending (#4815) (by Bob Cao)
    • [spirv] Fix int casts (#4814) (by Bob Cao)
    • [gui] Only call ImGui_ImplVulkan_Shutdown if it's initialized (#4827) (by Ailing)
    • [ci] Use a new PAT for project with org permission (#4826) (by Frost Ming)
    • [Lang] Add 2d and 3d rotation functions to math module (#4822) (by Zhao Liang)
    • [Doc] Proofread Getting started (#4682) (by Vissidarte-Herman)
    • [Doc] Editorial review to fields (advanced) (#4686) (by Vissidarte-Herman)
    • [bug] Fix bug that building with gcc9.4 will fail (#4823) (by PGZXB)
    • [SIMT] Add shfl_down_f32 intrinsic. (#4819) (by Chun Cai)
    • [workflow] Add issues to project when issue opened (#4816) (by Frost Ming)
    • [vulkan] Fix vulkan initialization on macOS with cpu backend (#4813) (by Bob Cao)
    • [Doc] [mesh] Update docstring for ti.Mesh (#4818) (by Chang Yu)
    • [vulkan] Fix Vulkan device score bug (#4803) (by Andrew Sun)
    • [Lang] Walkaround Vulkan backend behavior which changes cwd on Mac (#4812) (by TiGeekMan)
    • [misc] Add SNode to offline-cache key (#4716) (by PGZXB)
    • [Lang] Add mod function to math module (#4809) (by Zhao Liang)
    • [doc] Fix doc of running C++ tests (#4798) (by Yi Xu)
    • [Lang] Support in-place operator of ti.Matrix in python scope (#4799) (by Lin Jiang)
    • [Lang] [ir] Move short-circuit boolean logic into AST-to-IR passes (#4580) (by daylily)
    • [lang] Fix frontend type check for sqrt, log, exp (#4797) (by Yi Xu)
    • [Doc] Remove redundant semicolon in path (#4801) (by gaoxinge)
    • [Lang] [ir] Promote output type of log, exp, and sqrt ops (#4622) (by Andrew Sun)
    • [ci] Update ci images to use latest git (#4792) (by Bo Qiao)
    • [Lang] Fix integral type promotion rules (e.g., u8 + u8 now leads to u8 instead of i32) (#4789) (by Yuanming Hu)
    • [Lang] Add basic complex arithmetic and add a mandelbrot example (#4780) (by Zhao Liang)
    • Update index.md (#4791) (by Bob Cao)
    • [spirv] Add 16 bit float immediate number (#4787) (by Bob Cao)
    • [ci] Update ubuntu 18.04 image to use latest git (#4785) (by Frost Ming)
    • [lang] Store relations with 16-bit type (#4779) (by Chang Yu)
    • [Autodiff] Implement ti.ad.no_grad to skip autograd (#4751) (by Shawn Yao)
    • [misc] Remove some unnecessary attributes from offline-cache key of compile-config (#4770) (by PGZXB)
    • [doc] Update install instruction with "--upgrade" (#4775) (by Yuanming Hu)
    • Expose VboHelpers class (#4773) (by Ailing)
    • Bump version to v1.0.1 (#4774) (by Taichi Gardener)
    • [refactor] Merge Kernel.argument_names and argument_annotations (#4753) (by dongqi shen)
    • [dx11] Constant buffer binding and AtomicIncrement in RAND_STATE (#4650) (by quadpixels)
  • v1.0.0(Apr 13, 2022)

    v1.0.0 was released on April 13, 2022.

    Compatibility changes

    License change

    Taichi's license is changed from MIT to Apache-2.0 after a public vote in #4607.

    Python 3.10 support

    This release supports Python 3.10 on all supported operating systems (Windows, macOS, and Linux).

    Manylinux2014-compatible wheels

    Before v1.0.0, Taichi worked only on Linux distributions that support glibc 2.27+ (for example, Ubuntu 18.04+). As of v1.0.0, in addition to the normal Taichi wheels, Taichi provides manylinux2014-compatible wheels that work on most modern Linux distributions, including CentOS 7.

    • The normal wheels support all backends; the incoming manylinux2014-compatible wheels support the CPU and CUDA backends only. Choose the wheels that work best for you.
    • If you encounter any issue when installing the wheels, try upgrading your pip to the latest version first.
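
To see which wheel flavor is likely to apply on a given machine, you can inspect the local glibc version. The following is a minimal sketch that uses only the Python standard library. The helper names `parse_version` and `needs_manylinux_wheel` are hypothetical and are not part of Taichi, and the 2.27 threshold is taken from the requirement stated above, on the assumption that it still holds for the normal wheels.

```python
import platform


def parse_version(text):
    """Turn a version string such as '2.17' into a comparable tuple (2, 17)."""
    return tuple(int(part) for part in text.split(".")[:2])


def needs_manylinux_wheel(glibc_version, minimum=(2, 27)):
    """Return True when glibc is older than the assumed 2.27 baseline."""
    return parse_version(glibc_version) < minimum


# platform.libc_ver() reports ("glibc", "<version>") on glibc-based Linux
# and empty strings where it cannot tell (for example on Windows or macOS).
lib, version = platform.libc_ver()
if lib == "glibc" and version:
    choice = "manylinux2014-compatible" if needs_manylinux_wheel(version) else "normal"
    print(f"glibc {version}: the {choice} wheels should apply")
else:
    print("Not a glibc-based system; the glibc requirement does not apply")
```

CentOS 7 ships glibc 2.17, so `needs_manylinux_wheel("2.17")` is true there, while Ubuntu 18.04 (glibc 2.27) meets the baseline.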

    Deprecations

    • This release deprecates ti.ext_arr() and uses ti.types.ndarray() instead. ti.types.ndarray() supports both Taichi Ndarrays and external arrays, for example NumPy arrays.
    • Taichi plans to drop support for Python 3.6 in the next minor release (v1.1.0). If you have any questions or concerns, please let us know at #4772.

    New features

    Non-Python deployment solution

    By working together with OPPO US Research Center, Taichi delivers Taichi AOT, a solution for deploying kernels in non-Python environments, such as on mobile devices.

    Compiled Taichi kernels can be saved from a Python process, then loaded and run by the provided C++ runtime library. With a set of APIs, your Python/Taichi code can be easily deployed in any C++ environment. We demonstrate the simplicity of this workflow by porting the implicit FEM (finite element method) demo released in v0.9.0 to an Android application. Download the Android package and find out what Taichi AOT has to offer! If you want to try out this solution, please also check out the taichi-aot-demo repo.

    # In Python app.py
    module = ti.aot.Module(ti.vulkan) 
    module.add_kernel(my_kernel, template_args={'x': x})
    module.save('my_app')
    

    The following code snippet shows the C++ workflow for loading the compiled AOT modules.

    // Initialize Vulkan program pipeline
    taichi::lang::vulkan::VulkanDeviceCreator::Params evd_params;
    evd_params.api_version = VK_API_VERSION_1_2;
    auto embedded_device =
        std::make_unique<taichi::lang::vulkan::VulkanDeviceCreator>(evd_params);
    
    std::vector<uint64_t> host_result_buffer;
    host_result_buffer.resize(taichi_result_buffer_entries);
    taichi::lang::vulkan::VkRuntime::Params params;
    params.host_result_buffer = host_result_buffer.data();
    params.device = embedded_device->device();
    auto vulkan_runtime = std::make_unique<taichi::lang::vulkan::VkRuntime>(std::move(params));
    
    // Load AOT module saved from Python
    taichi::lang::vulkan::AotModuleParams aot_params{"my_app", vulkan_runtime.get()};
    auto module = taichi::lang::aot::Module::load(taichi::Arch::vulkan, aot_params);
    auto my_kernel = module->get_kernel("my_kernel");
    
    // Allocate device buffer
    taichi::lang::Device::AllocParams alloc_params;
    alloc_params.host_write = true;
    alloc_params.size = /*Ndarray size for `x`*/;
    alloc_params.usage = taichi::lang::AllocUsage::Storage;
    auto devalloc_x = embedded_device->device()->allocate_memory(alloc_params);
    
    // Execute my_kernel without Python environment
    taichi::lang::RuntimeContext host_ctx;
    host_ctx.set_arg_devalloc(/*arg_id=*/0, devalloc_x, /*shape=*/{128}, /*element_shape=*/{3, 1});
    my_kernel->launch(&host_ctx);
    

    Note that the C++ runtime library currently supports only the Vulkan backend. The Taichi team is working on supporting more backends.

    Real functions (experimental)

    All Taichi functions are inlined into the Taichi kernel at compile time. However, the kernel becomes lengthy and takes longer to compile if it contains too many Taichi function calls. This becomes especially obvious if a Taichi function involves compile-time recursion. For example, the following code calculates the Fibonacci numbers recursively:

    @ti.func
    def fib_impl(n: ti.template()):
        if ti.static(n <= 0):
            return 0
        if ti.static(n == 1):
            return 1
        return fib_impl(n - 1) + fib_impl(n - 2)
    
    @ti.kernel
    def fibonacci(n: ti.template()):
        print(fib_impl(n))
    

    In this code, fib_impl() recursively calls itself until n reaches 1 or 0. The total number of calls to fib_impl() increases exponentially as n grows, so the length of the kernel also increases exponentially. When n reaches 25, it takes more than a minute to compile the kernel.
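
The growth can be made concrete with a small plain-Python tally (no Taichi involved). This sketch assumes that every call site of fib_impl() expands into its own inlined instance, with no sharing between identical sub-calls. The function name `count_inlined_calls` is hypothetical and only serves the illustration.

```python
def count_inlined_calls(n):
    """Number of fib_impl() instances expanded for a compile-time argument n."""
    if n <= 1:
        return 1  # base cases (n <= 0 or n == 1) expand to a single instance
    # One instance for this call plus everything expanded by the two sub-calls.
    return 1 + count_inlined_calls(n - 1) + count_inlined_calls(n - 2)


for n in (5, 10, 15, 20, 25):
    print(n, count_inlined_calls(n))
```

Under this assumption the tally is 15 for n = 5 and 242785 for n = 25, which is in line with the compile time climbing so steeply.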

    This release introduces the "real function", a new type of Taichi function that compiles independently instead of being inlined into the kernel. It is an experimental feature and supports only scalar arguments and scalar return values for now.

    You can use it by decorating the function with @ti.experimental.real_func. For example, the following is the real function version of the code above.

    @ti.experimental.real_func
    def fib_impl(n: ti.i32) -> ti.i32:
        if n <= 0:
            return 0
        if n == 1:
            return 1
        return fib_impl(n - 1) + fib_impl(n - 2)
    
    @ti.kernel
    def fibonacci(n: ti.i32):
        print(fib_impl(n))
    

    The length of the kernel does not increase as n grows because the kernel only makes a call to the function instead of inlining the whole function. As a result, the code takes far less than a second to compile regardless of the value of n.

    The main differences between a normal Taichi function and a real function are listed below:

    • You can write return statements in any part of a real function, while you cannot write return statements inside the scope of non-static if / for / while statements in a normal Taichi function.
    • A real function can be called recursively at runtime, while a normal Taichi function only supports compile-time recursion.
    • The return value and arguments of a real function must be type hinted, while the type hints are optional in a normal Taichi function.

    Type annotations for literals

    Previously, you could not explicitly give a type to a literal. For example,

    @ti.kernel
    def foo():
        a = 2891336453  # i32 overflow (>2^31-1)
    

    In the code snippet above, 2891336453 is first turned into a default integer type (ti.i32 if not changed). This causes an overflow. Starting from v1.0.0, you can write type annotations for literals:

    @ti.kernel
    def foo():
        a = ti.u32(2891336453)  # similar to 2891336453u in C
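
To see why the literal needs an unsigned type, the range check can be reproduced in plain Python. This is only a sketch of two's-complement reinterpretation with the standard `struct` module. It does not claim to show what Taichi itself does on overflow, and `reinterpret_as_i32` is a hypothetical helper name.

```python
import struct


def reinterpret_as_i32(value):
    """Reinterpret the low 32 bits of a non-negative integer as a signed 32-bit value."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


literal = 2891336453
I32_MAX = 2**31 - 1
U32_MAX = 2**32 - 1

print(literal > I32_MAX)            # True: too large for a signed 32-bit integer
print(literal <= U32_MAX)           # True: fits in an unsigned 32-bit integer
print(reinterpret_as_i32(literal))  # -1403630843
```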
    

    Top-level loop configurations

    You can use ti.loop_config to control the behavior of the subsequent top-level for-loop. Available parameters are:

    • block_dim: Sets the number of threads in a block on GPU.
    • parallelize: Sets the number of threads to use on CPU.
    • serialize: If you set serialize to True, the for-loop runs serially, and you can write break statements inside it (applies only to range/ndrange for-loops). Setting serialize to True is equivalent to setting parallelize to 1.

    Here are two examples:

    @ti.kernel
    def break_in_serial_for() -> ti.i32:
        a = 0
        ti.loop_config(serialize=True)
        for i in range(100):  # This loop runs serially
            a += i
            if i == 10:
                break
        return a
    
    break_in_serial_for()  # returns 55
    
    n = 128
    val = ti.field(ti.i32, shape=n)
    
    @ti.kernel
    def fill():
        ti.loop_config(parallelize=8, block_dim=16)
        # If the kernel is run on the CPU backend, 8 threads will be used to run it
        # If the kernel is run on the CUDA backend, each block will have 16 threads
        for i in range(n):
            val[i] = i
    

    math module

    This release adds a math module to support GLSL-standard vector operations and to make it easier to port GLSL shader code to Taichi. For example, vector types, including vec2, vec3, vec4, mat2, mat3, and mat4, and functions, including mix(), clamp(), and smoothstep(), act similarly to their counterparts in GLSL. See the following examples:

    Vector initialization and swizzling

    You can use the rgba, xyzw, uvw properties to get and set vector entries:

    import taichi as ti
    import taichi.math as tm
    
    @ti.kernel
    def example():
        v = tm.vec3(1.0)  # (1.0, 1.0, 1.0)
        w = tm.vec4(0.0, 1.0, 2.0, 3.0)
        v.rgg += 1.0  # v = (2.0, 3.0, 1.0)
        w.zxy += tm.sin(v)
    

    Matrix multiplication

    Each Taichi vector is implemented as a column vector. Ensure that you put the matrix before the vector in a matrix multiplication.

    @ti.kernel
    def example():
        M = ti.Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
        v = tm.vec3(1, 2, 3)
        w = (M @ v).xyz  # [1, 2, 3]
    

    GLSL-standard functions

    @ti.kernel
    def example():
        v = tm.vec3(0., 1., 2.)
        w = tm.smoothstep(0.0, 1.0, v.xyz)
        w = tm.clamp(w, 0.2, 0.8)
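
For readers unfamiliar with GLSL, the three functions named above can be written out as scalar reference code. The sketch below follows the definitions in the GLSL specification and runs in plain Python. It is not Taichi's implementation, which is assumed to apply the same formulas element-wise.

```python
def clamp(x, min_val, max_val):
    """GLSL clamp: limit x to the range [min_val, max_val]."""
    return min(max(x, min_val), max_val)


def mix(a, b, t):
    """GLSL mix: linear blend of a and b with weight t."""
    return a * (1.0 - t) + b * t


def smoothstep(edge0, edge1, x):
    """GLSL smoothstep: Hermite interpolation between edge0 and edge1."""
    t = clamp((x - edge0) / (edge1 - edge0), 0.0, 1.0)
    return t * t * (3.0 - 2.0 * t)


# Same inputs as the kernel above, applied component by component.
v = (0.0, 1.0, 2.0)
w = tuple(smoothstep(0.0, 1.0, x) for x in v)
w = tuple(clamp(x, 0.2, 0.8) for x in w)
print(w)  # (0.2, 0.8, 0.8)
```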
    

    CLI command ti gallery

    This release introduces a CLI command ti gallery, allowing you to select and run Taichi examples in a pop-up window. To do so:

    1. Open a terminal and run:
    ti gallery
    

    A window pops up.

    2. Click to run any example in the pop-up window. The console prints the corresponding source code at the same time.

    Improvements

    Enhanced matrix type

    As of v1.0.0, Taichi accepts matrix or vector types as parameters and return values. You can use ti.types.matrix or ti.types.vector as the type annotations.

    Taichi also supports basic, read-only matrix slicing. Use the mat[:,:] syntax to quickly retrieve a specific portion of a matrix. See Slicings for more information.

    The following code example shows how to get the numbers in the four corners of a 3x3 matrix mat:

    import taichi as ti
    
    ti.init()
    
    @ti.kernel
    def foo(mat: ti.types.matrix(3, 3, ti.i32)) -> ti.types.matrix(2, 2, ti.i32):
        corners = mat[::2, ::2]
        return corners
      
    mat = ti.Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    corners = foo(mat)  # [[1 3] [7 9]]
    

    Note that in a slice, the lower bound, the upper bound, and the stride must be constant integers. If you want to use a variable index together with a slice, you should set ti.init(dynamic_index=True). For example:

    import taichi as ti
    
    ti.init(dynamic_index=True)
    
    @ti.kernel
    def foo(mat: ti.types.matrix(3, 3, ti.i32), ind: ti.i32) -> ti.types.matrix(3, 1, ti.i32):
        col = mat[:, ind]
        return col
      
    mat = ti.Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    col = foo(mat, 2)  # [3 6 9]
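
The two slices used above select the same elements as the following nested-list expressions in plain Python. This sketch only illustrates which entries `mat[::2, ::2]` and `mat[:, ind]` pick. It does not use Taichi, and it returns the column as a flat list rather than a 3x1 matrix.

```python
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# mat[::2, ::2]: every second row, then every second column of those rows.
corners = [row[::2] for row in mat[::2]]
print(corners)  # [[1, 3], [7, 9]]

# mat[:, ind]: all rows, one column chosen by a runtime index.
ind = 2
col = [row[ind] for row in mat]
print(col)  # [3, 6, 9]
```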
    

    More flexible Autodiff: Kernel Simplicity Rule removed

    Flexibility is key to the user experience of an automatic-differentiation (AD) system. Before v1.0.0, Taichi's AD system required that a differentiable Taichi kernel consist only of multiple simply nested for-loops (shown in task1 below). This was once called the Kernel Simplicity Rule (KSR). KSR prevented Taichi's users from writing differentiable kernels with multiple serial for-loops (shown in task2 below) or with a mixture of serial for-loops and non-for statements (shown in task3 below).

    # OK: multiple simply nested for-loops
    @ti.kernel
    def task1():
        for i in range(2):
            for j in range(3):
                for k in range(3):
                    y[None] += x[None]
    
    # Error: multiple serial for-loops
    @ti.kernel
    def task2():
        for i in range(2):
            for j in range(3):
                y[None] += x[None]
            for j in range(3):
                y[None] += x[None]
    
    # Error: a mixture of serial for-loop and non-for
    @ti.kernel
    def task3():
        for i in range(2):
            y[None] += x[None]
            for j in range(3):
                y[None] += x[None]
    

    With KSR removed in this release, code with different kinds of for-loop structures can be differentiated, as shown in the snippet below.

    # OK: A complicated control flow that is still differentiable in Taichi
    for j in range(2):
        for i in range(3):
            y[None] += x[None]
        for i in range(3):
            for ii in range(2):
                y[None] += x[None]
            for iii in range(2):
                y[None] += x[None]
                for iv in range(2):
                    y[None] += x[None]
        for i in range(3):
            for ii in range(2):
                for iii in range(2):
                    y[None] += x[None]
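
Because every statement in this control flow adds x to y, y is linear in x and its derivative equals the number of times the statement runs. The plain-Python sketch below mirrors the loop structure and checks that count with a finite difference. It does not use Taichi's AD system. The function name `complicated_flow` is hypothetical, and y is assumed to start at zero.

```python
def complicated_flow(x):
    """Plain-Python mirror of the loop nest above, returning the final y."""
    y = 0.0
    for j in range(2):
        for i in range(3):
            y += x
        for i in range(3):
            for ii in range(2):
                y += x
            for iii in range(2):
                y += x
                for iv in range(2):
                    y += x
        for i in range(3):
            for ii in range(2):
                for iii in range(2):
                    y += x
    return y


# y = 78 * x, so the slope is 78 wherever it is measured.
x0, h = 1.5, 0.5
slope = (complicated_flow(x0 + h) - complicated_flow(x0)) / h
print(complicated_flow(1.0))  # 78.0
print(slope)                  # 78.0
```

Under the stated assumptions, a reverse-mode AD pass over the same structure would be expected to report a gradient of 78 for x.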
    

    Taichi provides a demo that shows how to implement a differentiable simulator using this enhanced Taichi AD system.

    f-string support in an assert statement

    This release supports including an f-string in an assert statement as an error message. You can include scalar variables in the f-string. See the example below:

    import taichi as ti
    
    ti.init(debug=True)
    
    @ti.kernel
    def assert_is_zero(n: ti.i32):
        assert n == 0, f"The number is {n}, not zero"
    
    assert_is_zero(42)  # TaichiAssertionError: The number is 42, not zero
    

    Note that the assert statement works only in debug mode.

    Documentation changes

    Taichi language reference

    This release comes with the first version of the Taichi language specification, which aims to describe the syntax and semantics of the Taichi language exhaustively. It serves as a reference for Taichi's users and developers when they need to determine whether a specific behavior is correct, buggy, or undefined.

    API changes

    Deprecated

    | Deprecated     | Replaced by          |
    | -------------- | -------------------- |
    | ti.ext_arr()   | ti.types.ndarray()   |

    Full changelog

    • [example] Add diff sph demo (#4769) (by Mingrui Zhang)
    • [autodiff] Fix nullptr during adjoint codegen (#4771) (by Ye Kuang)
    • [bug] Fix kernel profiler on CPU backend (#4768) (by Lin Jiang)
    • [example] Fix taichi_dynamic example (#4767) (by Yi Xu)
    • [aot] Provide a convenient API to set devallocation as argument (#4762) (by Ailing)
    • [Lang] Deprecate ti.pyfunc (#4764) (by Lin Jiang)
    • [misc] Bump version to v1.0.0 (#4763) (by Yi Xu)
    • [SIMT] Add all_sync warp intrinsics (#4718) (by Yongmin Hu)
    • [doc] Taichi spec: calls, unary ops, binary ops and comparison (#4663) (by squarefk)
    • [SIMT] Add any_sync warp intrinsics (#4719) (by Yongmin Hu)
    • [Doc] Update community standard (#4759) (by notginger)
    • [Doc] Propose the RFC process (#4755) (by Ye Kuang)
    • [Doc] Fixed a broken link (#4758) (by Vissidarte-Herman)
    • [Doc] Taichi spec: conditional expressions and simple statements (#4728) (by Xiangyun Yang)
    • [bug] [lang] Let matrix initialize to the target type (#4744) (by Lin Jiang)
    • [ci] Fix ci nightly (#4754) (by Bo Qiao)
    • [doc] Taichi spec: compound statements, if, while (#4658) (by Lin Jiang)
    • [build] Simplify build command for android (#4752) (by Ailing)
    • [lang] Add PolygonMode enum for rasterizer (#4750) (by Ye Kuang)
    • [Aot] Support template args in AOT module add_kernel (#4748) (by Ye Kuang)
    • [lang] Support in-place operations on math vectors (#4738) (by Lin Jiang)
    • [ci] Add python 3.6 and 3.10 to nightly release (#4740) (by Bo Qiao)
    • [Android] Fix Android get height issue (#4743) (by Ye Kuang)
    • Updated logo (#4745) (by Vissidarte-Herman)
    • [Error] Raise an error when non-static condition is passed into ti.static_assert (#4735) (by Lin Jiang)
    • [Doc] Taichi spec: For (#4689) (by Lin Jiang)
    • [SIMT] [cuda] Use correct source lane offset for warp intrinsics (#4734) (by Bo Qiao)
    • [SIMT] Add shfl_xor_i32 warp intrinsics (#4642) (by Yongmin Hu)
    • [Bug] Fix warnings (#4730) (by Peng Yu)
    • [Lang] Add vector swizzle feature to math module (#4629) (by TiGeekMan)
    • [Doc] Taichi spec: static expressions (#4702) (by Lin Jiang)
    • [Doc] Taichi spec: assignment expressions (#4725) (by Xiangyun Yang)
    • [mac] Fix external_func test failures on arm backend (#4733) (by Ailing)
    • [doc] Fix deprecated tools APIs in docs, tests, and examples (#4729) (by Yi Xu)
    • [ci] Switch to self-hosted PyPI for nightly release (#4706) (by Bo Qiao)
    • [Doc] Taichi spec: boolean operations (#4724) (by Xiangyun Yang)
    • [doc] Fix deprecated profiler APIs in docs, tests, and examples (#4726) (by Yi Xu)
    • [spirv] Ext arr name should include arg id (#4727) (by Ailing)
    • [SIMT] Add shfl_sync_i32/f32 warp intrinsics (#4717) (by Yongmin Hu)
    • [Lang] Add 2x2/3x3 matrix solve with Gauss elimination (#4634) (by Peng Yu)
    • [metal] Tweak Device to support Ndarray (#4721) (by Ye Kuang)
    • [build] Fix non x64 linux builds (#4715) (by Bob Cao)
    • [Doc] Fix 4 typos in doc (#4714) (by Jiayi Weng)
    • [simt] Subgroup reduction primitives (#4643) (by Bob Cao)
    • [misc] Remove legacy LICENSE.txt (#4708) (by Yi Xu)
    • [gui] Make GGUI VBO configurable for mesh (#4707) (by Yuheng Zou)
    • [Docs] Change License from MIT to Apache-2.0 (#4701) (by notginger)
    • [Doc] Update docstring for module misc (#4644) (by Zhao Liang)
    • [doc] Proofread GGUI.md (#4676) (by Vissidarte-Herman)
    • [refactor] Remove Expression::serialize and add ExpressionHumanFriendlyPrinter (#4657) (by PGZXB)
    • [Doc] Remove extension_libraries in doc site (#4696) (by LittleMan)
    • [Lang] Let assertion error message support f-string (#4700) (by Lin Jiang)
    • [Doc] Taichi spec: prims, attributes, subscriptions, slicings (#4697) (by Yi Xu)
    • [misc] Add compile-config to offline-cache key (#4681) (by PGZXB)
    • [refactor] Remove legacy usage of ext_arr/any_arr in codebase (#4698) (by Yi Xu)
    • [doc] Taichi spec: pass, return, break, and continue (#4656) (by Lin Jiang)
    • [bug] Fix chain assignment (#4695) (by Lin Jiang)
    • [Doc] Refactored GUI.md (#4672) (by Vissidarte-Herman)
    • [misc] Update linux version name (#4685) (by Jiasheng Zhang)
    • [bug] Fix ndrange when start > end (#4690) (by Lin Jiang)
    • [bug] Fix bugs in test_offline_cache.py (#4674) (by PGZXB)
    • [Doc] Fix gif link (#4694) (by Ye Kuang)
    • [Lang] Add math module to support glsl-style functions (#4683) (by LittleMan)
    • [Doc] Editorial updates (#4688) (by Vissidarte-Herman)
    • Editorial updates (#4687) (by Vissidarte-Herman)
    • [ci] [windows] Add Dockerfile for Windows build and test (CPU) (#4667) (by Bo Qiao)
    • [Doc] Taichi spec: list and dictionary displays (#4665) (by Yi Xu)
    • [CUDA] Fix the fp32 to fp64 promotion due to incorrect fmax/fmin call (#4664) (by Haidong Lan)
    • [misc] Temporarily disable a flaky test (#4669) (by Yi Xu)
    • [bug] Fix void return (#4654) (by Lin Jiang)
    • [Workflow] Use pre-commit hooks to check codes (#4633) (by Frost Ming)
    • [SIMT] Add ballot_sync warp intrinsics (#4641) (by Wimaxs)
    • [refactor] [cuda] Refactor offline-cache and support it on arch=cuda (#4600) (by PGZXB)
    • [Error] [doc] Add TaichiAssertionError and add assert to the lang spec (#4649) (by Lin Jiang)
    • [Doc] Taichi spec: parenthesized forms; expression lists (#4653) (by Yi Xu)
    • [Doc] Updated definition of a 0D field. (#4651) (by Vissidarte-Herman)
    • [Doc] Taichi spec: variables and scope; atoms; names; literals (#4621) (by Yi Xu)
    • [doc] Fix broken links and update docs. (#4647) (by Chengchen(Rex) Wang)
    • [Bug] Fix broken links (#4646) (by Peng Yu)
    • [Doc] Refactored field.md (#4618) (by Vissidarte-Herman)
    • [gui] Allow to configure the texture data type (#4630) (by Gabriel H)
    • [vulkan] Fixes the string comparison when querying extensions (#4638) (by Bob Cao)
    • [doc] Add docstring for ti.loop_config (#4625) (by Lin Jiang)
    • [SIMT] Add shfl_up_i32/f32 warp intrinsics (#4632) (by Yu Zhang)
    • [Doc] Examples directory update (#4640) (by dongqi shen)
    • [vulkan] Choose better devices (#4614) (by Bob Cao)
    • [SIMT] Implement ti.simt.warp.shfl_down_i32 and add stubs for other warp-level intrinsics (#4616) (by Yuanming Hu)
    • [refactor] Refactor Identifier::id_counter to global and local counter (#4581) (by PGZXB)
    • [android] Disable XDG on non-supported platform (#4612) (by Gabriel H)
    • [gui] [aot] Allow set_image to use user VBO (#4611) (by Gabriel H)
    • [Doc] Add docstring for Camera class in ui module (#4588) (by Zhao Liang)
    • [metal] Implement buffer_fill for unified device API (#4595) (by Ye Kuang)
    • [Lang] Matrix 3x3 eigen decomposition (#4571) (by Peng Yu)
    • [Doc] Set up the basis of Taichi specification (#4603) (by Yi Xu)
    • [gui] Make GGUI VBO configurable for particles (#4610) (by Yuheng Zou)
    • [Doc] Update with Python 3.10 support (#4609) (by Bo Qiao)
    • [misc] Bump version to v0.9.3 (#4608) (by Taichi Gardener)
    • [Lang] Deprecate ext_arr/any_arr in favor of types.ndarray (#4598) (by Yi Xu)
    • [Doc] Adjust CPU GUI document layout (#4605) (by Peng Yu)
    • [Doc] Refactored Type system. (#4584) (by Vissidarte-Herman)
    • [lang] Fix vector matrix ndarray to numpy layout (#4597) (by Bo Qiao)
    • [bug] Fix bug that caching kernels with same AST will fail (#4582) (by PGZXB)
    diff-sph-demo.gif(7.17 MB)
    libtaichi_export_core.so(10.02 MB)
    taichi-aot-demo.gif(9.64 MB)
    taichi-gallery.jpg(228.38 KB)
    TaichiAOT.apk(15.24 MB)
  • v0.9.2(Mar 23, 2022)

    Highlights:

    • CI/CD workflow
      • Generate manylinux2014-compatible wheels with CUDA backend in release workflow (#4550) (by Yi Xu)
    • Command line interface
      • Fix a few bugs in taichi gallery command (#4548) (by Zhao Liang)
    • Documentation
      • Fixed broken links. (#4563) (by Vissidarte-Herman)
      • Refactored README.md (#4549) (by Vissidarte-Herman)
      • Create CODE_OF_CONDUCT (#4564) (by notginger)
      • Update syntax.md (#4557) (by Vissidarte-Herman)
      • Update docstring for ndrange (#4486) (by Zhao Liang)
      • Minor updates: It is recommended to type hint arguments and return values (#4510) (by Vissidarte-Herman)
      • Refactored Kernels and functions. (#4496) (by Vissidarte-Herman)
      • Add initial variable and fragments (#4457) (by Justin)
    • Language and syntax
      • Add taichi gallery command for user to choose and run example in gui (#4532) (by TiGeekMan)
      • Add ti.serialize and ti.loop_config (#4525) (by Lin Jiang)
      • Support simple matrix slicing (#4488) (by Xiangyun Yang)
      • Remove legacy ways to construct matrices (#4521) (by Yi Xu)

    Full changelog:

    • [lang] Replace keywords in python (#4606) (by Jiasheng Zhang)
    • [lang] Fix py36 block_dim bug (#4601) (by Jiasheng Zhang)
    • [ci] Fix release script bug (#4599) (by Jiasheng Zhang)
    • [aot] Support return in vulkan aot (#4593) (by Ailing)
    • [ci] Release script add test for tests/python/examples (#4590) (by Jiasheng Zhang)
    • [misc] Write version info right after creation of uuid (#4589) (by Jiasheng Zhang)
    • [gui] Make GGUI VBO configurable (#4575) (by Ye Kuang)
    • [test] Fix ill-formed test_binary_func_ret (#4587) (by Yi Xu)
    • Update differences_between_taichi_and_python_programs.md (#4583) (by Vissidarte-Herman)
    • [misc] Fix a few warnings (#4572) (by Ye Kuang)
    • [aot] Remove redundant module_path argument (#4573) (by Ailing)
    • [bug] [opt] Fix some bugs when deal with real function (#4568) (by Xiangyun Yang)
    • [build] Guard llvm usage inside TI_WITH_LLVM (#4570) (by Ailing)
    • [aot] [refactor] Add make_new_field for Metal (#4559) (by Bo Qiao)
    • [llvm] [lang] Add support for multiple return statements in real function (#4536) (by Lin Jiang)
    • [test] Add test for offline-cache (#4562) (by PGZXB)
    • Format updates (#4567) (by Vissidarte-Herman)
    • [aot] Add KernelTemplate interface (#4558) (by Ye Kuang)
    • [test] Eliminate the warnings in test suite (#4556) (by Frost Ming)
    • [Doc] Fixed broken links. (#4563) (by Vissidarte-Herman)
    • [Doc] Refactored README.md (#4549) (by Vissidarte-Herman)
    • [Doc] Create CODE_OF_CONDUCT (#4564) (by notginger)
    • [misc] Reset counters in Program::finalize() (#4561) (by PGZXB)
    • [misc] Add TI_CI env to CI/CD (#4551) (by Jiasheng Zhang)
    • [ir] Add basic tests for Block (#4553) (by Ye Kuang)
    • [refactor] Fix error message (#4552) (by Ye Kuang)
    • [Doc] Update syntax.md (#4557) (by Vissidarte-Herman)
    • [gui] Hack to make GUI.close() work on macOS (#4555) (by Ye Kuang)
    • [aot] Fix get_kernel API semantics (#4554) (by Ye Kuang)
    • [opt] Support offline-cache for kernel with arch=cpu (#4500) (by PGZXB)
    • [CLI] Fix a few bugs in taichi gallery command (#4548) (by Zhao Liang)
    • [ir] Small optimizations to codegen (#4442) (by Bob Cao)
    • [CI] Generate manylinux2014-compatible wheels with CUDA backend in release workflow (#4550) (by Yi Xu)
    • [misc] Metadata update (#4539) (by Jiasheng Zhang)
    • [test] Parametrize the test cases with pytest.mark (#4546) (by Frost Ming)
    • [Doc] Update docstring for ndrange (#4486) (by Zhao Liang)
    • [build] Default symbol visibility to hidden for all targets (#4545) (by Gabriel H)
    • [autodiff] Handle multiple, mixed Independent Blocks (IBs) within multi-levels serial for-loops (#4523) (by Mingrui Zhang)
    • [bug] [lang] Cast the arguments of real function to the desired types (#4538) (by Lin Jiang)
    • [Lang] Add taichi gallery command for user to choose and run example in gui (#4532) (by TiGeekMan)
    • [bug] Fix bug that calling std::getenv when cpp-tests running will fail (#4537) (by PGZXB)
    • [vulkan] Fix performance (#4535) (by Bob Cao)
    • [Lang] Add ti.serialize and ti.loop_config (#4525) (by Lin Jiang)
    • [Lang] Support simple matrix slicing (#4488) (by Xiangyun Yang)
    • Update vulkan_api.cpp (#4533) (by Bob Cao)
    • [lang] Quick fix for mesh_local analyzer (#4529) (by Chang Yu)
    • [test] Show arch info in the verbose test report (#4528) (by Frost Ming)
    • [aot] Add binding_id of root/gtmp/rets/args bufs to CompiledOffloadedTask (#4522) (by Ailing)
    • [vulkan] Relax a few test precisions for vulkan (#4524) (by Ailing)
    • [build] Option to use LLD (#4513) (by Bob Cao)
    • [misc] [linux] Implement XDG Base Directory support (#4514) (by ruro)
    • [Lang] [refactor] Remove legacy ways to construct matrices (#4521) (by Yi Xu)
    • [misc] Make result of irpass::print hold more information (#4517) (by PGZXB)
    • [refactor] Misc improvements over AST helper functions (#4398) (by daylily)
    • [misc] [build] Bump catch external library 2.13.3 -> 2.13.8 (#4516) (by ruro)
    • [autodiff] Reduce the number of ad stack using knowledge of derivative formulas (#4512) (by Mingrui Zhang)
    • [ir] [opt] Fix a bug about 'continue' stmt in cfg_build (#4507) (by Xiangyun Yang)
    • [Doc] Minor updates: It is recommended to type hint arguments and return values (#4510) (by Vissidarte-Herman)
    • [ci] Fix the taichi repo name by hardcode (#4506) (by Frost Ming)
    • [build] Guard dx lib search with TI_WITH_DX11 (#4505) (by Ailing)
    • [ci] Reduce the default device memory usage for GPU tests (#4508) (by Bo Qiao)
    • [Doc] Refactored Kernels and functions. (#4496) (by Vissidarte-Herman)
    • [aot] [refactor] Refactor AOT field API for Vulkan (#4490) (by Bo Qiao)
    • [ci] Fix: fill in the pull request body created by bot (#4503) (by Frost Ming)
    • [ci] Skip in steps rather than the whole job (#4499) (by Frost Ming)
    • [ci] Add a Dockerfile for building manylinux2014-compatible Taichi wheels with CUDA backend (#4491) (by Yi Xu)
    • [ci] Automate release publishing (#4428) (by Frost Ming)
    • [fix] dangling ti.func decorator in euler.py (#4492) (by Zihua Wu)
    • [ir] Fix a bug in simplify pass (#4489) (by Xiangyun Yang)
    • [test] Add test for recursive real function (#4477) (by Lin Jiang)
    • [Doc] Add initial variable and fragments (#4457) (by Justin)
    • [misc] Add a convenient script for testing compatibility of Taichi releases. (#4485) (by Chengchen(Rex) Wang)
    • [misc] Version bump: v0.9.1 -> v0.9.2 (#4484) (by Chengchen(Rex) Wang)
    • [ci] Update gpu docker image to test python 3.10 (#4472) (by Bo Qiao)
  • v0.9.1(Mar 8, 2022)

    Highlights:

    • CI/CD workflow
      • Cleanup workspace before window test (#4405) (by Jian Zeng)
    • Documentation
      • Update docstrings for functions in ops (#4465) (by Zhao Liang)
      • Update docstring for functions in misc (#4474) (by Zhao Liang)
      • Update docstrings in misc (#4446) (by Zhao Liang)
      • Update docstring for functions in operations (#4427) (by Zhao Liang)
      • Update PyTorch interface documentation (#4311) (by Andrew Sun)
      • Update docstring for functions in operations (#4413) (by Zhao Liang)
      • Update docstring for functions in operations (#4392) (by Zhao Liang)
      • Fix broken links (#4368) (by Ye Kuang)
      • Re-structure the articles: getting-started, gui (#4360) (by Ye Kuang)
    • Error messages
      • Add error message when the number of elements in kernel arguments exceed (#4444) (by Xiangyun Yang)
      • Add error for invalid snode size (#4460) (by Lin Jiang)
      • Add error messages for wrong type annotations of literals (#4462) (by Yi Xu)
      • Remove the mentioning of ti.pyfunc in the error message (#4429) (by Lin Jiang)
    • Language and syntax
      • Support sparse matrix builder datatype configuration (#4411) (by Peng Yu)
      • Support type annotations for literals (#4440) (by Yi Xu)
      • Support simple matrix slicing (#4420) (by Xiangyun Yang)
      • Support kernel to return a matrix type value (#4062) (by Xiangyun Yang)
    • Vulkan backend
      • Enable Vulkan device selection when using cuda (#4330) (by Bo Qiao)

    Full changelog:

    • [bug] [llvm] Initialize the field to 0 when finalizing a field (#4463) (by Lin Jiang)
    • [Doc] Update docstrings for functions in ops (#4465) (by Zhao Liang)
    • [Error] Add error message when the number of elements in kernel arguments exceed (#4444) (by Xiangyun Yang)
    • [Doc] Update docstring for functions in misc (#4474) (by Zhao Liang)
    • [metal] Support device memory allocation/deallocation (#4439) (by Ye Kuang)
    • update docstring for exceptions (#4475) (by Zhao Liang)
    • [llvm] Support real function with single scalar return value (#4452) (by Lin Jiang)
    • [refactor] Remove LLVM logic from the generic Device interface (#4470) (by PGZXB)
    • [lang] Add decorator ti.experimental.real_func (#4458) (by Lin Jiang)
    • [ci] Add python 3.10 into nightly test and release (#4467) (by Bo Qiao)
    • [bug] Fix metal linker error when TI_WITH_METAL=OFF (#4469) (by Bo Qiao)
    • [Lang] Support sparse matrix builder datatype configuration (#4411) (by Peng Yu)
    • [Error] Add error for invalid snode size (#4460) (by Lin Jiang)
    • [aot] [refactor] Refactor AOT runtime API to use module (#4437) (by Bo Qiao)
    • [misc] Optimize version check (#4461) (by Jiasheng Zhang)
    • [Error] Add error messages for wrong type annotations of literals (#4462) (by Yi Xu)
    • [misc] Remove some warnings (#4453) (by PGZXB)
    • [refactor] Move literal construction to expr module (#4448) (by Yi Xu)
    • [bug] [lang] Enable break in the outermost for not in the outermost scope (#4447) (by Lin Jiang)
    • [Doc] Update docstrings in misc (#4446) (by Zhao Liang)
    • [llvm] Support real function which has scalar arguments (#4422) (by Lin Jiang)
    • [Lang] Support type annotations for literals (#4440) (by Yi Xu)
    • [misc] Remove an unnecessary function (#4443) (by PGZXB)
    • [metal] Expose BufferMemoryView (#4432) (by Ye Kuang)
    • [Lang] Support simple matrix slicing (#4420) (by Xiangyun Yang)
    • [Doc] Update docstring for functions in operations (#4427) (by Zhao Liang)
    • [metal] Add Unified Device API skeleton code (#4431) (by Ye Kuang)
    • [refactor] Refactor llvm-offloaded-task-name mangling (#4418) (by PGZXB)
    • [Doc] Update PyTorch interface documentation (#4311) (by Andrew Sun)
    • [misc] Add deserialization tool for benchmarks (#4278) (by rocket)
    • [misc] Add matrix operations to micro-benchmarks (#4190) (by rocket)
    • [Error] Remove the mentioning of ti.pyfunc in the error message (#4429) (by Lin Jiang)
    • [metal] Add AotModuleLoader (#4423) (by Ye Kuang)
    • [Doc] Update docstring for functions in operations (#4413) (by Zhao Liang)
    • [vulkan] Support templated kernel in aot module (#4417) (by Ailing)
    • [vulkan] [aot] Add aot namespace Vulkan (#4419) (by Bo Qiao)
    • [Lang] Support kernel to return a matrix type value (#4062) (by Xiangyun Yang)
    • [test] Add a test for the ad_gravity example (#4404) (by FZC)
    • [Doc] Update docstring for functions in operations (#4392) (by Zhao Liang)
    • [CI] Cleanup workspace before window test (#4405) (by Jian Zeng)
    • [build] Enforce compatibility with manylinux2014 when TI_WITH_VULKAN=OFF (#4406) (by Yi Xu)
    • [ci] Update tag to projects (#4400) (by Bo Qiao)
    • [ci] Reduce test parallelism for m1 (#4394) (by Bo Qiao)
    • [aot] [vulkan] Add AotKernel and its Vulkan impl (#4387) (by Ye Kuang)
    • [vulkan] [aot] Move add_root_buffer to public members (#4396) (by Gabriel H)
    • [llvm] Remove LLVM functions related to a SNode tree from the module when the SNode tree is destroyed (#4356) (by Lin Jiang)
    • [test] disable several workflows on forks (#4393) (by Jian Zeng)
    • [ci] Windows build exits on the first error (#4391) (by Bo Qiao)
    • [misc] Upgrade test and docker image to support python 3.10 (#3986) (by Bo Qiao)
    • [aot] [vulkan] Output shapes/dims to AOT exported module (#4382) (by Gabriel H)
    • [test] Merge the py38 only cases into the main test suite (#4378) (by Frost Ming)
    • [vulkan] Refactor Runtime to decouple the SNodeTree part (#4380) (by Ye Kuang)
    • [lang] External Ptr alias analysis & demote atomics (#4273) (by Bob Cao)
    • [example] Fix implicit_fem example command line arguments (#4372) (by bx2k)
    • [mesh] Constructing mesh from data in memory (#4375) (by bx2k)
    • [refactor] Move aot_module files (#4374) (by Ye Kuang)
    • [test] Add test for exposed top-level APIs (#4361) (by Yi Xu)
    • [refactor] Move arch files (#4373) (by Ye Kuang)
    • [build] Build with Apple clang-13 (#4370) (by Ailing)
    • [test] [example] Add a test for print_offset example (#4355) (by Zhi Qi)
    • [test] Add a test for the game_of_life example (#4365) (by 0xzhang)
    • [test] Add a test for the nbody example (#4366) (by 0xzhang)
    • [Doc] Fix broken links (#4368) (by Ye Kuang)
    • [ci] Run vulkan and metal separately on M1 (#4367) (by Ailing)
    • [Doc] Re-structure the articles: getting-started, gui (#4360) (by Ye Kuang)
    • [Vulkan] Enable Vulkan device selection when using cuda (#4330) (by Bo Qiao)
    • [misc] Version bump: v0.9.0->v0.9.1 (#4363) (by Ailing)
    • [dx11] Materialize runtime, map and unmap (#4339) (by quadpixels)
  • v0.9.0(Feb 22, 2022)

    Highlights

    New features

    1. Dynamic indexing of matrices (experimental)

    In previous versions of Taichi, a matrix could be accessed only with a constant index. As a result, you could not perform operations such as clamping the minimum element of a vector to 0:

    @ti.kernel
    def clamp():
        ...  # assume we have an n-d vector A
        min_index = 0
        for i in range(n):
            if A[i] < A[min_index]:
                min_index = i
        A[min_index] = 0
    

    Of course, you may use the following workaround leveraging loop unrolling. It is, however, neither intuitive nor efficient:

    @ti.kernel
    def clamp():
        ...  # assume we have an n-d vector A
        min_index = 0
        for i in ti.static(range(n)):
            if A[i] < A[min_index]:
                min_index = i
        for i in ti.static(range(n)):
            if i == min_index:
                A[i] = 0
    

    With this new experimental feature of dynamic indexing of matrices, you can now run the former code snippet smoothly. The feature can be enabled by setting ti.init(dynamic_index=True).
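
    The difference between the two snippets can be modeled in plain Python, without Taichi. The sketch below only illustrates the control flow: clamp_min_dynamic and clamp_min_unrolled are hypothetical names, lists stand in for Taichi vectors, and the second function keeps the running minimum in a scalar so that every subscript is a loop counter, which is what loop unrolling requires.

```python
# Plain-Python model of the two approaches (no Taichi required).
# Function names are hypothetical; lists stand in for Taichi vectors.


def clamp_min_dynamic(values):
    """Write through an index that is only known at run time."""
    out = list(values)
    min_index = 0
    for i in range(len(out)):
        if out[i] < out[min_index]:
            min_index = i
    out[min_index] = 0  # subscript is a runtime value
    return out


def clamp_min_unrolled(values):
    """Use only loop counters as subscripts, as an unrolled loop must."""
    out = list(values)
    min_index = 0
    min_value = out[0]  # running minimum kept in a scalar
    for i in range(len(out)):
        if out[i] < min_value:
            min_value = out[i]
            min_index = i
    # Second pass: compare every counter against the runtime index.
    for i in range(len(out)):
        if i == min_index:
            out[i] = 0
    return out


sample = [3.0, -1.5, 2.0, 7.0]
print(clamp_min_dynamic(sample))
print(clamp_min_unrolled(sample))
```

    Both calls should print the same list. The unrolled variant needs a second pass over all elements to perform a single write, which is the inefficiency described above.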

    In v0.9.0, a new implicit FEM (Finite Element Method) example (https://github.com/taichi-dev/taichi/blob/master/python/taichi/examples/simulation/implicit_fem.py) is added, which also illustrates the benefit of having this feature. In this example, a huge (12 × 12) Hessian matrix is constructed for implicit time integration. Without dynamic indexing, the whole matrix construction loop needs to be unrolled, which takes 70 seconds to compile; with dynamic indexing, a traditional loop version can be applied, and the compilation time is shortened to 2.5 seconds.
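
    A rough way to see why unrolling hurts compilation is to count the statements the compiler has to process. The sketch below is a toy model in plain Python, not code from the implicit FEM example: H and entry are placeholder names, and statement counts are only a proxy for compile time.

```python
# Toy model: compare the amount of source a compiler sees for an
# unrolled versus a looped 12 x 12 matrix construction.
# `H` and `entry` are placeholder names, not taken from the example.


def unrolled_source(n):
    """One assignment per (i, j) pair, as full unrolling produces."""
    return [f"H[{i}, {j}] = entry({i}, {j})" for i in range(n) for j in range(n)]


def looped_source(n):
    """A regular nested loop, possible when indices may be runtime values."""
    return [
        f"for i in range({n}):",
        f"    for j in range({n}):",
        "        H[i, j] = entry(i, j)",
    ]


unrolled = unrolled_source(12)
looped = looped_source(12)
print(f"unrolled: {len(unrolled)} statements, looped: {len(looped)} lines")
```

    The unrolled form grows quadratically with the matrix size, while the looped form stays constant.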

    2. Vulkan backend on macOS

    This release adds support for the ti.vulkan backend on macOS 10.15+, so you can now run GGUI on your MacBook. Run the following GGUI examples to try it yourself.

    # prerequisites: taichi >= v0.9.0 and macOS >= 10.15
    # run GGUI examples
    ti example fractal3d_ggui 
    ti example fem128_ggui
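
    The two prerequisites in the comment above can be checked before launching the examples. This is a minimal sketch using only the Python standard library; parse_version and meets_minimum are hypothetical helpers, and the Taichi version string is passed in by hand rather than read from the installed package.

```python
import platform

MIN_MACOS = (10, 15)     # "macOS >= 10.15" from the prerequisites
MIN_TAICHI = (0, 9, 0)   # "taichi >= v0.9.0" from the prerequisites


def parse_version(text):
    """Turn '10.15.7' into (10, 15, 7); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(version_text, minimum):
    """Return True if the dotted version string is at least `minimum`."""
    version = parse_version(version_text)
    return bool(version) and version >= minimum


current_macos = platform.mac_ver()[0]  # empty string on non-macOS systems
if current_macos:
    print("macOS prerequisite met:", meets_minimum(current_macos, MIN_MACOS))
else:
    print("Not running on macOS; the 10.15+ prerequisite does not apply.")

# Version string supplied by hand for illustration.
print("Taichi prerequisite met:", meets_minimum("0.9.0", MIN_TAICHI))
```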
    

    3. Compatibility with Google Colab

    Earlier versions of Taichi would crash when run in the Google Colab notebook environment (see #235 for more information). In this release, we refactored our compiler implementation so that Taichi is compatible with Google Colab.

    Feel free to run !pip install taichi to install Taichi and start your Colab journey with it.

    Improvements

    1. More stabilized, better-organized APIs

    Ensuring that developers use the right set of APIs is critical to the long-term stability of Taichi's APIs. In this release, we started to reorganize Taichi's package structure and deprecate some obsolete or internal APIs. The following table lists some critical APIs that may concern you.

    | Category          | Deprecated API                 | Replaced with                            |
    | ----------------- | ------------------------------ | ---------------------------------------- |
    | Builtin           | max()                          | ti.max()                                 |
    | Builtin           | min()                          | ti.min()                                 |
    | Atomic operation  | obj.atomic_add()               | ti.atomic_add()                          |
    | Image-specific    | ti.imread()                    | ti.tools.imread()                        |
    | Image-specific    | ti.imwrite()                   | ti.tools.imwrite()                       |
    | Image-specific    | ti.imshow()                    | ti.tools.imshow()                        |
    | Profiler-specific | ti.print_profile_info()        | ti.profiler.print_scoped_profiler_info() |
    | Profiler-specific | ti.print_kernel_profile_info() | ti.profiler.print_kernel_profiler_info() |

    For a representative list of APIs deprecated in this release, see this Google doc.
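
    When migrating a code base, the table above can be turned into a small lookup that flags deprecated calls. The following is a sketch in plain Python, not a tool shipped with Taichi: find_deprecated_calls is a hypothetical helper, it only scans for the ti.-prefixed entries with a regular expression, and the bare max()/min() and obj.atomic_add() rows are listed but not detected because they would require real parsing.

```python
import re

# Deprecated API -> replacement, copied from the table above.
REPLACEMENTS = {
    "max()": "ti.max()",
    "min()": "ti.min()",
    "obj.atomic_add()": "ti.atomic_add()",
    "ti.imread()": "ti.tools.imread()",
    "ti.imwrite()": "ti.tools.imwrite()",
    "ti.imshow()": "ti.tools.imshow()",
    "ti.print_profile_info()": "ti.profiler.print_scoped_profiler_info()",
    "ti.print_kernel_profile_info()": "ti.profiler.print_kernel_profiler_info()",
}


def find_deprecated_calls(source):
    """Return (line_number, deprecated, replacement) for ti.-prefixed hits."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for old, new in REPLACEMENTS.items():
            if not old.startswith("ti."):
                continue  # ambiguous without parsing; skipped in this sketch
            name = old[:-2]  # drop the trailing "()"
            # The lookbehind avoids matching names such as "multi.imread(".
            pattern = r"(?<![\w.])" + re.escape(name) + r"\s*\("
            if re.search(pattern, line):
                hits.append((number, old, new))
    return hits


sample = """import taichi as ti
img = ti.imread("input.png")
ti.tools.imwrite(img, "copy.png")
ti.print_kernel_profile_info()
"""

for line_number, old, new in find_deprecated_calls(sample):
    print(f"line {line_number}: replace {old} with {new}")
```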

    2. Better error reporting

    A lengthy traceback in an error report is usually distracting and makes it hard to locate the code that caused the error. In this release, we removed the trivial traceback frames that do not concern developers from our error reports to improve the debugging experience.

    Taking the following code snippet as an example:

    import taichi as ti
    
    ti.init()
    
    @ti.func
    def bar(a):
        a = a + 2j
    
    @ti.kernel
    def foo():
        bar(1)
    
    foo()
    

    Before v0.9.0, the error message looked like this:

    [Taichi] Starting on arch=x64
    Traceback (most recent call last):
      File "error.py", line 13, in <module>
        foo()
      File "/path_to_taichi/lang/kernel_impl.py", line 709, in wrapped
        return primal(*args, **kwargs)
      File "/path_to_taichi/lang/kernel_impl.py", line 636, in __call__
        key = self.ensure_compiled(*args)
      File "/path_to_taichi/lang/kernel_impl.py", line 627, in ensure_compiled
        self.materialize(key=key, args=args, arg_features=arg_features)
      File "/path_to_taichi/lang/kernel_impl.py", line 493, in materialize
        taichi_kernel = _ti_core.create_kernel(taichi_ast_generator,
      File "/path_to_taichi/lang/kernel_impl.py", line 488, in taichi_ast_generator
        compiled()
      File "error.py", line 11, in foo
        bar(1)
      File "/path_to_taichi/lang/kernel_impl.py", line 76, in decorated
        return fun.__call__(*args)
      File "/path_to_taichi/lang/kernel_impl.py", line 156, in __call__
        ret = self.compiled(*args)
      File "error.py", line 7, in bar
        a = a + 2j
      File "/path_to_taichi/lang/common_ops.py", line 16, in __add__
        return ti.add(self, other)
      File "/path_to_taichi/lang/ops.py", line 78, in wrapped
        return imp_foo(a, b)
      File "/path_to_taichi/lang/ops.py", line 63, in imp_foo
        return foo(x, y)
      File "/path_to_taichi/lang/ops.py", line 427, in add
        return _binary_operation(_ti_core.expr_add, _bt_ops_mod.add, a, b)
      File "/path_to_taichi/lang/ops.py", line 173, in _binary_operation
        a, b = wrap_if_not_expr(a), wrap_if_not_expr(b)
      File "/path_to_taichi/lang/ops.py", line 36, in wrap_if_not_expr
        return Expr(a) if not is_taichi_expr(a) else a
      File "/path_to_taichi/lang/expr.py", line 33, in __init__
        self.ptr = impl.make_constant_expr(arg).ptr
      File "/path_to_taichi/lang/util.py", line 196, in wrapped
        return func(*args, **kwargs)
      File "/path_to_taichi/lang/impl.py", line 414, in make_constant_expr
        raise ValueError(f'Invalid constant scalar expression: {type(val)}')
    ValueError: Invalid constant scalar expression: <class 'complex'>
    

    In v0.9.0, the error message looks like this:

    Traceback (most recent call last):
      File "/path_to_test/error.py", line 13, in <module>
        foo()
      File "/path_to_taichi/lang/kernel_impl.py", line 732, in wrapped
        raise type(e)('\n' + str(e)) from None
    taichi.lang.exception.TaichiTypeError: 
    On line 11 of file "/path_to_test/error.py", in foo:
        bar(1)
        ^^^^^^
    On line 7 of file "/path_to_test/error.py", in bar:
        a = a + 2j
            ^^^^^^
    Invalid constant scalar data type: <class 'complex'>
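
    The shorter report relies on a standard Python mechanism that is visible in the traceback above: re-raising with from None, which suppresses the chained inner exception. The sketch below reproduces that pattern in plain Python. KernelError, deep_internal_call, and wrapped are hypothetical stand-ins, and Taichi's own implementation evidently does more, such as adding the source-line annotations.

```python
import traceback


class KernelError(TypeError):
    """Hypothetical stand-in for an error raised while compiling a kernel."""


def deep_internal_call():
    # Stands in for the long chain of library-internal frames.
    raise KernelError("Invalid constant scalar data type: <class 'complex'>")


def wrapped():
    try:
        deep_internal_call()
    except Exception as e:
        # Same pattern as in the traceback above: build a fresh exception of
        # the same type and drop the chained context with `from None`.
        raise type(e)("\n" + str(e)) from None


try:
    wrapped()
except KernelError as err:
    caught = err

report = "".join(
    traceback.format_exception(type(caught), caught, caught.__traceback__)
)
print(report)
```

    The printed report contains the frame for wrapped but no frame for deep_internal_call, because the original exception is kept only as a suppressed context.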
    

    3. Revamped Taichi's documentation site

    To improve the readability and user-friendliness of our documentation, we restructured Taichi's documentation site and incorporated API reference into it.

    Join our discussions to build the next Taichi release for you!

    We believe that our community plays a pivotal role in the development of the Taichi programming language. In that spirit, we encourage you to take an active part in our GitHub Discussions, propose potential changes, and contribute your ideas. Together, we improve the Taichi language release by release, for you and for every developer.

    The following is a selected list of hot topics for you to start with:

    • #4086
    • #4183

    Specifically, because beginners to Taichi sometimes get lost in different APIs such as ti.Vector, ti.types.vector, ti.Vector.field, we plan to make them clearer and would like to have your opinions on these proposed practices:

    • Always keep type identifiers in lowercase.
    • Always use ti.types.vector to define a vector type.
    • After having type definitions like my_vec2i = ti.types.vector(2, ti.i32), use my_vec2i([5, 10]) for a vector object.
    • For simplicity, we preserve ti.vector([1, 2]) as a shortcut for ti.types.vector()([1, 2]), which automatically infers missing type information of the object.
    • Use ti.field(dtype=my_vec2i, shape=100) for a field object.

    API changes

    See this Google doc for a representative list of APIs deprecated in this release.

    Deprecation notice

    Python 3.6 has reached EOL as of December 2021. The next major Taichi release (e.g. v1.0) will be the last official release for Python 3.6, and we're actively working on adding support for Python 3.10.

    Full changelog:

    • [test] Add a test for simple_derivative example (#4323) (by TinyBox)
    • [example] Add implicit fem example (#4352) (by bx2k)
    • [opengl] Use element shape as compile information for OpenGL backend (#4284) (by Haidong Lan)
    • [ci] Exit on error windows test script (#4354) (by Bo Qiao)
    • [bug] Update children_offsets & stride info to align as elem_stride (#4345) (by Ailing)
    • [gui] Update GGUI examples to use vulkan backend if available (#4353) (by Ailing)
    • [ci] Use conda python for m1 jobs (#4351) (by Ailing)
    • [lang] Add support for operators "is" and "is not" in static scope and deprecate them (#4349) (by Lin Jiang)
    • [ci] Increase ci test parallelism (#4348) (by Bo Qiao)
    • [opengl] Remove support for dynamic snode (by Ailing Zhang)
    • [error] Let deprecation warnings display only once (#4346) (by Lin Jiang)
    • [ci] Fix generate_example_videos.py (#4347) (by Ailing)
    • [test] Add a test for autodiff/regression (#4322) (by TinyBox)
    • [ci] Install requirements and matplotlib for GPU tests (#4336) (by Bo Qiao)
    • [gui] [refactor] Avoid exposing different APIs with different GGUI_AVAILABLE values (#4329) (by Yi Xu)
    • [lang] Remove logical_and and logical_or from TaichiOperation (#4326) (by Lin Jiang)
    • [lang] Add deprecation warnings to atomic ops (#4325) (by Lin Jiang)
    • [refactor] Allow more build types from setup.py (#4313) (by Bo Qiao)
    • [refactor] make class Expr constructor explicit (#4272) (by Retrospection)
    • [doc] More revision on a new language (#4321) (by Ye Kuang)
    • [lang] Hide internal apis about Fields (#4302) (by Xiangyun Yang)
    • [Doc] Avoid log(0) problem in _funcs._randn() and update primitive_types.py (#4317) (by Zhao Liang)
    • [refactor] Remove Ndarray torch implementation and tests (#4307) (by Bo Qiao)
    • [Doc] Revise "Why a new programming language" (#4306) (by Ye Kuang)
    • [lang] Move sparse_matrix_builder from taichi.linalg to taichi.types (#4301) (by Ailing)
    • [lang] Make ti.cfg an alias of runtime cfg (#4264) (by Ailing)
    • [refactor] Refactor ForLoopDecoratorRecorder (#4309) (by PGZXB)
    • [lang] Hide dtype and needs_grad from SNode (#4308) (by Yi Xu)
    • [vulkan] Reduce runtime host overhead (#4282) (by Bob Cao)
    • [lang] Remove Matrix.value (#4300) (by Lin Jiang)
    • [lang] Hide internal APIs of FieldsBuilder (#4305) (by Yi Xu)
    • [lang] Hide pad_key and ndarray*_to_numpy in Ndarray (#4298) (by Bo Qiao)
    • [lang] Hide internal functions in SNode and _Root (#4303) (by Yi Xu)
    • [lang] Hide ndarray*_from_numpy (#4297) (by Bo Qiao)
    • [lang] Hide internal functions in Matrix and Struct (#4295) (by Lin Jiang)
    • [lang] Hide subscript in Matrix (#4299) (by Lin Jiang)
    • [lang] Hide initialize_host_accessor in Ndarray (#4296) (by Bo Qiao)
    • [lang] Hide internal functions in TaichiOperation (#4288) (by Lin Jiang)
    • [lang] Hide get_element_size and get_nelement in Ndarray (#4294) (by Bo Qiao)
    • [lang] Hide fill_by_kernel in Ndarray (#4293) (by Bo Qiao)
    • Hide data handle (#4292) (by Bo Qiao)
    • [lang] Remove CompoundType from taichi.types (#4291) (by Ailing)
    • [lang] Hide get_addr and type_assert in api docs (#4290) (by Ailing)
    • [lang] Only expose start_recording/stop_recording for now (#4289) (by Ailing)
    • [docs] Hide unnecessary methods in annotation classes (#4287) (by Ailing)
    • [llvm] Use GEP for array access instead of ptrtoint/inttoptr (#4276) (by Yi Xu)
    • [lang] Fix bls_buffer allocation of x64 crashed in py3.10 (#4275) (by Chang Yu)
    • [misc] Code cleanup in benchmarks (#4280) (by rocket)
    • [doc] Improve operators page (#4073) (by Lin Jiang)
    • [spirv] Fix buffer info compare to fix external array bind point (#4277) (by Bob Cao)
    • [bug] Disallow function definition inside ti.func/kernel (#4274) (by Lin Jiang)
    • [refactor] Remove global instance of DecoratorRecorder (#4254) (by PGZXB)
    • [misc] Add stencil_2d to micro-benchmarks (#4176) (by rocket)
    • [refactor] Remove support for raise statement (#4262) (by Lin Jiang)
    • [refactor] Re-expose important implementation classes (#4268) (by Yi Xu)
    • [llvm] Add missing pre-processor macro in cpp-tests when LLVM is disabled (#4269) (by PGZXB)
    • Add more camera controls (#4212) (by Yu Zhang)
    • [vulkan] Test & build macOS 10.15 MoltenVK (#4259) (by Bob Cao)
    • [vulkan] Use TI_VISIBLE_DEVICE to select vulkan device (#4255) (by Bo Qiao)
    • [misc] Remove some unnecessary #include lines (#4265) (by PGZXB)
    • [lang] Expose mesh_patch_idx at top level (#4260) (by Ailing)
    • [Bug] Only ban passing non contiguous torch tensors to taichi kernels. (#4258) (by Ailing)
    • [ci] Run on pull_request_target to access the secrets (#4253) (by Frost Ming)
    • [misc] Update master version to 0.9.0 (#4248) (by Ailing)
    • [misc] Use test_utils.approx directly (#4252) (by Ailing)
    • [ci] Move _testing.py into tests folder (#4247) (by Ailing)
    • [refactor] Remove get_current_program() and global variable current_program (#4246) (by PGZXB)
    • [Doc] Update sparse computation doc (#4060) (by Peng Yu)
    • [Error] Raise an error when breaking the outermost loop (#4235) (by Lin Jiang)
    • [ci] Disable Vulkan backend for mac1015 release. (#4245) (by Ailing)
    • [Refactor] Move ti.quant & ti.type_factory under ti.types.quantized_types (#4233) (by Yi Xu)
    • [doc] Major revision to the field (advanced) document (#4156) (by Haidong Lan)
    • [vulkan] Disable buffer device address if int64 is not supported (#4244) (by Bob Cao)
    • [CUDA] Fix random generator routines for f32 and f64 to make sure the returned value is in [0, 1) (#4243) (by Zhao Liang)
    • [ci] Create PR card in projects automatically (#4229) (by Frost Ming)
    • [refactor] Remove dependency on get_current_program() in lang::BinaryOpExpression (#4242) (by PGZXB)
    • [Refactor] Add require_version configuration in ti.init() (#4151) (by ZHANG Zhi)
    • [ci] Disable Vulkan backend for mac1014 release. (#4241) (by Ailing)
    • [refactor] Remove global scope_stack and dependencies on it (#4237) (by PGZXB)
    • [refactor] Remove lang::current_ast_builder() and dependencies on it (#4239) (by PGZXB)
    • [vulkan] Add buffer device address (physical pointers) support & other improvements (#4221) (by Bob Cao)
    • [Refactor] Avoid exposing ti.tape (#4234) (by Bo Qiao)
    • [lang] Annotate constants with dtype without casting. (#4224) (by Ailing)
    • [refactor] Remove legacy ti.benchmark() and ti.benchmark_plot() (#4222) (by Xiangyun Yang)
    • [misc] Add memcpy to micro-benchmarks (#4220) (by Bo Qiao)
    • [Refactor] Merge ti.tools.image.imdisplay() into ti.tools.image.imshow() (#4144) (by Zhao Liang)
    • [Refactor] Rename and move memory profiler info under ti.profiler (#4227) (by Xiangyun Yang)
    • [Bug] Ban passing torch view tensors into taichi kernel (#4225) (by Ailing)
    • [refactor] Remove dependency on get_current_program() in lang::FrontendForStmt (#4228) (by PGZXB)
    • [metal] Give random seeds a unique value (#4206) (by Ye Kuang)
    • [autodiff] Refactor the IB identification and optimize the checker for global atomics and purely nested loops (#4154) (by Mingrui Zhang)
    • [doc] Add the step of setting "TI_WITH_VULKAN" for linux (#4209) (by Neko Null)
    • [doc] Add instruction to install clang-format-10 on M1 Mac (#4219) (by Lin Jiang)
    • [Refactor] Move public APIs of ti.tools outside top level (#4218) (by Yi Xu)
    • [Refactor] Move ti.parallel_sort under _kernels (#4217) (by Yi Xu)
    • [refactor] Remove top level all (#4214) (by Yi Xu)
    • [vulkan] Support Vulkan 1.3 (#4211) (by Bob Cao)
    • [CI] Update release workflow (#4215) (by Jian Zeng)
    • [Refactor] Move ti.taichi_logo to examples (#4216) (by Yi Xu)
    • [vulkan] Fix MoltenVK support (#4205) (by Bob Cao)
    • [Refactor] Rename tools.util to tools.async_utils and hide functions inside (#4201) (by Yi Xu)
    • [spirv] SPIR-V / Vulkan NDArray (#4202) (by Bob Cao)
    • [misc] Export visibility of symbols required for Vulkan AOT execution (#4203) (by Gabriel H)
    • [misc] Test unified doc & api preview. (#4186) (by Ailing)
    • [refactor] Remove dependency on get_current_program() in exported functions of SNode (#4192) (by PGZXB)
    • [refactor] Export some functions which depend on current_ast_builder() as members of ASTBuilder (#4131) (by PGZXB)
    • [Refactor] Do not expose StructField and SourceBuilder to users (#4200) (by Yi Xu)
    • [Error] Add function name to traceback (#4195) (by Lin Jiang)
    • [Refactor] Remove redundant set_gdb_trigger (#4198) (by Yi Xu)
    • [javascript] Avoid using C++ inline asm when TI_EMSCRIPTENED (JS 6/n) (#4109) (by Dunfan Lu)
    • [javascript] Disable stack trace logging when TI_EMSCRIPTENED (JS 9/n) (#4117) (by Dunfan Lu)
    • [javascript] Support TI_EMSCRIPTENED option as an env var (JS 3/n) (#4106) (by Dunfan Lu)
    • [Refactor] Rename and move kernel profiler APIs (#4194) (by Yi Xu)
    • [doc] Update the doc for differentiable programming (#4057) (by Mingrui Zhang)
    • [misc] Add math operators to micro-benchmarks (#4122) (by rocket)
    • [misc] Add atomic operators to micro-benchmarks (#4169) (by rocket)
    • [dx11] Fix parse_reference_count signature (#4189) (by quadpixels)
    • [Doc] update demo code in readme doc (#4193) (by 箱子)
    • [bug] [opengl] Process child nodes to compute alignment (#4191) (by Ailing)
    • [refactor] Remove dependency on current_ast_builder() in lang::For and cpp_tests (#4185) (by PGZXB)
    • [refactor] Add TI_DLL_EXPORT to control symbol visibility (#4177) (by Ye Kuang)
    • [refactor] Remove is_signed/is_integral from top level. (#4182) (by Ailing)
    • [refactor] Move version_check out of taichi.lang. (#4178) (by Ailing)
    • [refactor] Remove locale_encode from top level. (#4179) (by Ailing)
    • [refactor] Remove dependency on get_current_program() in lang::Ndarray (#4162) (by PGZXB)
    • [Refactor] Clean up helper functions in tools.util (#4174) (by Yi Xu)
    • [refactor] Remove bit_vectorize from top level. (#4158) (by Ailing)
    • [test] [example] Add a test for taichi_logo example (#4170) (by Isaac)
    • [Refactor] Remove inspect for modules in lang init (#4173) (by Bo Qiao)
    • remove KernelDefError KernelArgError InvalidOperationError (#4166) (by Lin Jiang)
    • [Refactor] Expose runtime/snode ops properly (#4167) (by Yi Xu)
    • [opengl] Use && instead of and in C++ code (#4171) (by Dunfan Lu)
    • [Refactor] Move core_vec(i) to gui and hide (#4172) (by Yi Xu)
    • [ci] Fix concurrent run issue (#4168) (by Frost Ming)
    • [Refactor] Rename and move scoped profiler info under ti.profiler (#4165) (by Yi Xu)
    • [spirv] Move external arrays into separate buffers (#4121) (by Bob Cao)
    • [doc] Improve Fields documentation (#4063) (by rocket)
    • [refactor] Move functions in __init__ to misc (#4150) (by Xiangyun Yang)
    • [refactor] Remove dependency on get_current_program() and lang::current_ast_builder() in lang::Expr (#4103) (by PGZXB)
    • [refactor] Expose ti.abs and ti.pow (#4157) (by Lin Jiang)
    • [Doc] Update README.md (#4139) (by Ye Kuang)
    • [Refactor] Do not expose TapeImpl to users (#4148) (by Yi Xu)
    • [Refactor] Remove unnecessary exposure related to matrix and mesh (#4152) (by Lin Jiang)
    • [Refactor] Do not expose internal function in field, exception, expr, any_array , _ndrange, _ndarray (#4137) (by Xiangyun Yang)
    • [Refactor] Do not expose taichi.snode (#4149) (by Bo Qiao)
    • [refactor] [ir] Remove load_if_ptr and move pointer dereferencing to frontend-to-IR passes (#4104) (by daylily)
    • [Refactor] Do not expose internal function in ti.lang.impl (#4134) (by Xiangyun Yang)
    • [Refactor] Prevent modules in lang being wild imported and exposed (#4140) (by Bo Qiao)
    • Move __getattr__ back to __init__.py (#4142) (by Lin Jiang)
    • [Refactor] Avoid exposing real and integer types API (#4129) (by Bo Qiao)
    • [Refactor] Do not expose functions in taichi.lang.util to users (#4128) (by Yi Xu)
    • [Refactor] Do not expose main to users (#4136) (by Yi Xu)
    • [doc] Revise doc for GUI system. (#4006) (by Jiasheng Zhang)
    • [refactor] Remove critical/debug/error/trace/warn/info/is_logging_effective from top level (#4133) (by Ailing)
    • [Refactor] Remove supported_log_levels (#4120) (by Bo Qiao)
    • [ci] Fix approx in autodiff example test (#4132) (by Ailing)
    • [bug] Fix starred expression when the value is not a list (#4130) (by Lin Jiang)
    • Support compiling taichi in x86 (#4107) (by Dunfan Lu)
    • [javascript] Avoid all usages of glfw/vulkan/volk when TI_EMSCRIPTENED (JS 5/n) (#4108) (by Dunfan Lu)
    • [javascript] [misc] Remove redundant pybind include in Taichi Core Library (#4110) (by Dunfan Lu)
    • [test] Remove allclose at top level. (by Ailing Zhang)
    • [test] Remove approx at top level. (by Ailing Zhang)
    • [test] Remove get_rel_eps() at top level. (by Ailing Zhang)
    • [test] Replace make_temp_file with tempfile (by Ailing Zhang)
    • [Refactor] Remove exposure of internal functions in taichi.lang.ops (#4101) (by Lin Jiang)
    • [misc] Fix the changelog generator to only count current branch commits (#4126) (by Frost Ming)
    • [build] Handle empty TAICHI_EMBIND_SOURCE (#4127) (by Ailing)
    • [ci] Use GHA workflow to control the concurrency (#4116) (by Frost Ming)
    • [misc] Version bump: v0.8.11 -> v0.8.12 (#4125) (by Ailing)
    • [refactor] Remove dependency on lang::current_ast_builder() in lang::ConstantFold (#4123) (by PGZXB)
    • [doc] Add doc about difference between taichi and python programs (#3996) (by Lin Jiang)
    • [doc] Update docs about kernels and functions (#4044) (by Lin Jiang)
    • [misc] Add containers and end-to-end result to micro-benchmarks (#4081) (by rocket)
    • [opengl] Use element_size as alignment in root buffer. (#4095) (by Ailing)
    • [javascript] Add TI_EMSCRIPTENED to cmake options (JS 1/n) (#4093) (by Dunfan Lu)
    • [javascript] Add Javascript PR tag (JS 0/n) (#4094) (by Dunfan Lu)
    • [opt] Remove legacy vectorization pass (#4096) (#4099) (by daylily)
    • [refactor] [ir] Refactor ExternalFuncCallExpression into a frontend statement (#4098) (by daylily)
    • [spirv] Add names to buffer struct types and pointers (#4092) (by Bob Cao)
    • [test] Add a test for the minimization example (#4091) (by Zydiii)
    • [refactor] Remove dependency on get_current_program() in ui/backends/vulkan (#4076) (by PGZXB)
    • [refactor] [ir] Use InternalFuncCall for retrieving thread index (#4090) (by daylily)
    • Add images to GL device API (#4084) (by Bob Cao)
    • [dx11] Add underlying DX11 device, memory allocation, and some tests (#3971) (by quadpixels)
    • [Bug] [lang] Ban redefinition of template and matrix arguments in Taichi kernel (#4080) (by Lin Jiang)
    • [bug] Fix warnings on external functions on windows (#4079) (by Lin Jiang)
    • [aot] [vulkan] Provide range_hint for range_for offloaded tasks in vulkan backend. (by Ailing Zhang)
    • [refactor] Reuse SNode tree id (#4056) (by Lin Jiang)
    • [bug] Fix ndrange with star arguments (#4077) (by Lin Jiang)
    • [aot] [opengl] Provide range_hint for range_for offloaded tasks in (by Ailing Zhang)
    • [misc] Add a minimal example for micro-benchmarks (#4031) (by rocket)
    • [doc] Refactor type system doc: primitive types (#4055) (by Yi Xu)
    • [misc] Migrate benchmarks to a new version (#4059) (by rocket)
    • [refactor] Re-export some functions called directly by ASTTransformer.* as member of ASTBuilder (#4034) (by PGZXB)
    • [autodiff] Fix the local allocas defining in inner loop raising runtime error (#4041) (by Mingrui Zhang)
    • [opengl] Make sure ndarray arg bind indices are sorted. (#4069) (by Ailing)
    • [doc] Improve operator page (#4067) (by Bo Qiao)
    • [ir] [refactor] Make ReturnStmt support a vector of stmts (#4028) (by Xiangyun Yang)
    • [Lang] [bug] Stop misusing non-template argument types to determine template reinstantiation (#4049) (by Xiangyun Yang)
    • [misc] Version bump: v0.8.10 -> v0.8.11 (#4053) (by rocket)
  • v0.8.11(Jan 25, 2022)

    This is a bug fix release for v0.8.10.

    If you have seen excessive warnings like the ones below on Windows, please upgrade to this release.

    a.py:11: UserWarning: Calling non-taichi function "ti.random". Scope inside the function is not processed by the Taichi AST transformer. The function may not work as expected. Proceed with caution! Maybe you can consider turning it into a @ti.func?
      a[i] = ti.pow(ti.random(), 2)
    a.py:11: UserWarning: Calling non-taichi function "ti.pow". Scope inside the function is not processed by the Taichi AST transformer. The function may not work as expected. Proceed with caution! Maybe you can consider turning it into a @ti.func?
      a[i] = ti.pow(ti.random(), 2)

    • Bug fixes
      • [bug] Fix warnings on external functions on windows (#4079) (by Lin Jiang)

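    For readers who cannot upgrade right away, one possible stopgap is to filter that specific message with Python's standard `warnings` module. The sketch below is not part of Taichi: the helper names are hypothetical, and the message prefix is copied from the warning text quoted in this release note. Upgrading remains the fix this release describes.

```python
import warnings

# Prefix copied from the warning text in the release note.
# warnings.filterwarnings treats it as a regex matched at the start of the message.
NOISY_PREFIX = "Calling non-taichi function"


def silence_external_function_warnings():
    """Ignore only UserWarnings whose message starts with NOISY_PREFIX (hypothetical helper)."""
    warnings.filterwarnings("ignore", message=NOISY_PREFIX, category=UserWarning)


def count_surviving_warnings(messages):
    """Emit each message as a UserWarning and count those that pass the filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")       # record everything by default
        silence_external_function_warnings()  # then drop only the noisy message
        for message in messages:
            warnings.warn(message, UserWarning)
    return len(caught)
```

    In a real script, `silence_external_function_warnings()` would be called once before the kernels are defined; `count_surviving_warnings` exists only to show that unrelated warnings still get through.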
    Full changelog:

    • [bug] Fix warnings on external functions on windows (#4079) (by Lin Jiang)
    • [misc] Version bump: v0.8.10 -> v0.8.11 (#4053) (by rocket)
    • [test] [example] Add test and video generator for cornell box. (#4045) (by Ailing)
  • v0.8.10(Jan 18, 2022)

    Highlights:

    • AOT
      • Add a generic set of AOT structs (#3973) (by Ye Kuang)
      • Switch vulkan aot to use taichi::aot::ModuleData. (by Ailing Zhang)
      • Convert opengl aot to dump ModuleData. (#3991) (by Ailing)
    • Language and syntax
      • Use FrontendExprStmt in place of FrontendEvalStmt (#3978) (by daylily)
      • Get global vars by using globals (#3949) (by Lin Jiang)
      • Support static short circuit bool operations (#3958) (by Lin Jiang)
      • Experimental automatic mesh_local (#3989) (by Chang Yu)
      • Support nested mesh-for (#3990) (by Chang Yu)
    • Performance
      • Accelerate whole_kernel_cse pass (#3957) (by Xiangyun Yang)
      • Get rid of some no-ops in linear seek (by Ailing Zhang)
      • Reduce kernel launch context construction overhead (#3947) (by Haidong Lan)
      • Refactor func body to reduce python overhead and improve readability (#3984) (by Haidong Lan)
      • Get store_to_load_forwarding work with local tensors across basic blocks (#3942) (by Yi Xu)
    • Documentation
      • Update Docs preview settings. (#4021) (by Chengchen(Rex) Wang)
      • Add doc for compile-time recursion (#3994) (by Lin Jiang)
      • Add an operation page (#4004) (by Bo Qiao)
      • Improve type system documentation (#4002) (by Bo Qiao)
    • Error messages
      • Add TaichiTypeError (#3964) (by Lin Jiang)
      • Produce a warning when users call external functions (#4007) (by Lin Jiang)
      • Shorten the length of traceback of TaichiCompilationError (#3965) (by Lin Jiang)
      • Raise exception when encountering undefined name (#3951) (by Lin Jiang)
    • Bug fixes
      • Fix bug that building with TI_WITH_LLVM=OFF will fail (#4043) (by PGZXB)
      • Treat PtrOffsetStmt as random-initialized (#3998) (by Yi Xu)
      • GGUI imwrite BGRA to RGBA conversion (#4018) (by Bob Cao)

    Full changelog:

    • [bug] Fix bug that building with TI_WITH_LLVM=OFF will fail (#4043) (by PGZXB)
    • [doc] Improve type system documentation (#4002) (by Bo Qiao)
    • [Error] Add error message when non-0d numpy ndarray is given to initialize expression (#4030) (by Lin Jiang)
    • [Error] Produce a warning when users call external functions (#4007) (by Lin Jiang)
    • [aot] Use element_shape instead of row_num & column_num for CompiledFieldData. (by Ailing Zhang)
    • [vulkan] [aot] Switch vulkan aot to use taichi::aot::ModuleData. (by Ailing Zhang)
    • [vulkan] Fix gtmp type (#4042) (by Bob Cao)
    • [doc] Add an operation page (#4004) (by Bo Qiao)
    • [ir] [refactor] Split stmt typechecking to the frontend (#3875) (by daylily)
    • [build] Disable LTO for mac. (#4027) (by Ailing)
    • [autodiff] Restrict Independent Block scope for cases with atomic operations on global variables (#3897) (by Mingrui Zhang)
    • [gui] GGUI imwrite BGRA to RGBA conversion (#4018) (by Bob Cao)
    • [test] [example] Add a test and a video generator for mpm99 (#3995) (by Yi Xu)
    • [doc] [ci] Update Docs preview settings. (#4021) (by Chengchen(Rex) Wang)
    • [vulkan] [aot] Throw error for templated kernels in vulkan aot. (by Ailing Zhang)
    • [bug] [opt] Treat PtrOffsetStmt as random-initialized (#3998) (by Yi Xu)
    • [ci] Keep macOS actions run on macOS-10.15 (#4014) (by rocket)
    • [vulkan] [aot] Enable aot tests for vulkan backend. (#4000) (by Ailing)
    • [mesh] [opt] Experimental automatic mesh_local (#3989) (by Chang Yu)
    • [doc] Add doc for compile-time recursion (#3994) (by Lin Jiang)
    • [refactor] [opengl] Get rid of some no-ops in linear seek (by Ailing Zhang)
    • [opengl] Do not promote simple ExternalTensorShapeAlongAxisStmt into globals. (by Ailing Zhang)
    • [build] Upgrade SPIRV-Headers and SPIRV-Tools to their latest commits (#3967) (by PGZXB)
    • [opengl] [aot] Convert opengl aot to dump ModuleData. (#3991) (by Ailing)
    • [mesh] Support multiple major relations in one mesh-for loop (#3987) (by Chang Yu)
    • [refactor] Refactor func body to reduce python overhead and improve readability (#3984) (by Haidong Lan)
    • [opt] Add more strict alias analysis for ExternalPtrStmt (#3992) (by Ailing)
    • [opt] Accelerate whole_kernel_cse pass (#3957) (by Xiangyun Yang)
    • [mesh] [opt] Support nested mesh-for (#3990) (by Chang Yu)
    • [Lang] Provide sparse matrix shape (#3959) (by Peng Yu)
    • [refactor] Remove dependency on get_current_program() in backends/cpu and backends/cuda (#3956) (by PGZXB)
    • [mesh] Demote from-end element attribute atomic op (#3923) (by Chang Yu)
    • [ci] Support rebase and rerun command in comment for CI bot (#3952) (by Frost Ming)
    • [refactor] [ci] Enable identifier naming in clang-tidy (#3960) (by Bo Qiao)
    • [refactor] [ir] Use FrontendExprStmt in place of FrontendEvalStmt (#3978) (by daylily)
    • [example] [test] Fix misuse of logical operators in examples and tests (#3976) (by Yi Xu)
    • [Error] Shorten the length of traceback of TaichiCompilationError (#3965) (by Lin Jiang)
    • [aot] Add a generic set of AOT structs (#3973) (by Ye Kuang)
    • [Error] Add TaichiTypeError (#3964) (by Lin Jiang)
    • [aot] Add task_type for OpenGL (#3962) (by Ye Kuang)
    • [Error] Raise exception when encountering undefined name (#3951) (by Lin Jiang)
    • [misc] [cuda] Set the toolkit used by KernelProfiler at runtime (#3945) (by rocket)
    • [refactor] Get global vars by using globals (#3949) (by Lin Jiang)
    • [refactor] Support static short circuit bool operations (#3958) (by Lin Jiang)
    • [perf] [refactor] Reduce kernel launch context construction overhead (#3947) (by Haidong Lan)
    • [refactor] Move python/taichi/lang/meta.py to python/taichi/_kernels.py (by Ailing Zhang)
    • [refactor] Remove import taichi in taichi/lang/impl.py (by Ailing Zhang)
    • [refactor] Remove ndarray_use_torch from pybind (#3946) (by Bo Qiao)
    • [ci] Test opengl backend on windows (#3924) (by Frost Ming)
    • [Error] Do not show body in exceptions in nodes with body (#3940) (by Lin Jiang)
    • [opt] Get store_to_load_forwarding work with local tensors across basic blocks (#3942) (by Yi Xu)
    • [refactor] [ir] Remove legacy stmts from CHI IR (#3943) (by Yi Xu)
    • [Error] Shorten the length of traceback of exceptions thrown by ASTTransformer (#3873) (by lin-hitonami)
    • [misc] Version bump: v0.8.9 -> v0.8.10 (#3935) (by Bo Qiao)
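    The changelog entries above share a loose shape: optional bracketed tags, a title, zero or more PR numbers, and an author. For anyone who wants to search or group these entries, the following is a minimal parsing sketch. The regular expressions and the `Entry` / `parse_entry` names are assumptions made for illustration, not Taichi's own changelog tooling.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

TAG_RE = re.compile(r"\s*\[([^\]]+)\]")        # leading "[tag]" groups
AUTHOR_RE = re.compile(r"\s*\(by (.+)\)\s*$")  # trailing "(by Author)"
PR_RE = re.compile(r"\s*\(#(\d+)\)\s*$")       # trailing "(#1234)"


@dataclass
class Entry:
    tags: List[str]
    title: str
    prs: List[int]
    author: Optional[str]


def parse_entry(line: str) -> Entry:
    """Split one changelog line into tags, title, PR numbers and author."""
    text = line.strip().lstrip("•").strip()

    # Peel off every leading [tag]; entries may have none, one, or several.
    tags = []
    match = TAG_RE.match(text)
    while match:
        tags.append(match.group(1))
        text = text[match.end():]
        match = TAG_RE.match(text)

    # The author, if present, is the last parenthesised group.
    author = None
    match = AUTHOR_RE.search(text)
    if match:
        author = match.group(1)
        text = text[:match.start()]

    # Some entries carry more than one PR number; keep them in written order.
    prs = []
    match = PR_RE.search(text)
    while match:
        prs.insert(0, int(match.group(1)))
        text = text[:match.start()]
        match = PR_RE.search(text)

    return Entry(tags, text.strip(), prs, author)
```

    Entries without a PR number or without tags parse to an empty `prs` or `tags` list rather than raising, which matches how a few lines in these changelogs are written.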
  • v0.8.9(Jan 4, 2022)

    Highlights:

    • Android
      • Add initial support of Android in GGUI (#3845) (by Gabriel H)
    • Bug fixes
      • Query device attribute when using cuda 11 and above (#3930) (by Bo Qiao)
      • Fix the ttf path (#3931) (by Xiangyun Yang)
    • Language and syntax
      • Initial matrix argument support for ti.kernel (#3905) (by Xiangyun Yang)
      • Enable dynamic indexing of matrix field elements when possible (#3865) (by Yi Xu)
    • Miscellaneous
      • Support logging on Android platforms (#3849) (by Gabriel H)
    • Refactor
      • Remove all occurrences of print_preprocessed and print_ast (#3911) (by Xiangyun Yang)
      • Deprecate excepthook and completely remove _taichi_skip_traceback (#3902) (by Haidong Lan)
    • Tests
      • Add initial AOT CPP test (#3850) (#3899) (by Gabriel H)
      • Add initial AOT CPP test (#3850) (by Gabriel H)

    Full changelog:

    • [ci] Install torch for windows release (#3932) (by Bo Qiao)
    • [Bug] fix the ttf path (#3931) (by Xiangyun Yang)
    • [Bug] [cuda] Query device attribute when using cuda 11 and above (#3930) (by Bo Qiao)
    • [refactor] Optimize vector and matrix ndarray fill (#3921) (by Bo Qiao)
    • [opt] [ir] [refactor] Remove exceptions from offload pass (#3925) (by Xiangyun Yang)
    • [opt] [refactor] Remove the variable_optimization pass (#3927) (by Mingkuan Xu)
    • [doc] Add a tutorial: Run Ndarray Taichi program (#3908) (by Vissidarte-Herman)
    • [opt] [ir] [refactor] Remove exceptions from lower_ast pass (#3916) (by Xiangyun Yang)
    • [cuda] Use cuMemsetD32 to fill scalar ndarray (#3907) (by Bo Qiao)
    • [ci] Add self-hosted Windows buildbot for GPU testing (#3852) (by Frost Ming)
    • [Lang] Initial matrix argument support for ti.kernel (#3905) (by Xiangyun Yang)
    • [doc] Add example of color_edit_3 (#3919) (by Vineyo)
    • [perf] Clear global_vars/matrix_fields after materialize() (#3914) (by Yi Xu)
    • [gui] Update camera.py (#3898) (by stamnug)
    • [bug] Enable field-related checks in materialize() not only in first call (#3906) (by Yi Xu)
    • [Refactor] Remove all occurrences of print_preprocessed and print_ast (#3911) (by Xiangyun Yang)
    • [perf] Avoid using property for simple attributes to reduce python overhead. (by Ailing Zhang)
    • [opengl] Respect max_block_dim in ti.init (by Ailing Zhang)
    • [dx11] Add DX11 device interface definition (#3880) (by quadpixels)
    • [Refactor] Deprecate excepthook and completely remove _taichi_skip_traceback (#3902) (by Haidong Lan)
    • [opengl] Optimize range_for for ndarrays (by Ailing Zhang)
    • [Test] [aot] Add initial AOT CPP test (#3850) (#3899) (by Gabriel H)
    • [refactor] Merge taichi/lang/linalg_impl.py into _funcs.py (by Ailing Zhang)
    • [refactor] Remove import taichi in taichi/lang/quant_impl.py (by Ailing Zhang)
    • [refactor] Remove import taichi in taichi/lang/util.py (by Ailing Zhang)
    • [cuda] Increase saturating grid dim to reduce tail effect (#3855) (by Bo Qiao)
    • [opengl] Reduce repeated read to args buffer. (by Ailing Zhang)
    • [refactor] Remove import taichi from lang/__init__.py (#3889) (by Yi Xu)
    • Revert "[Test][aot] Add initial AOT CPP test (#3850)" (#3890) (by Ye Kuang)
    • [Android] [gui] Add initial support of Android in GGUI (#3845) (by Gabriel H)
    • [Test][aot] Add initial AOT CPP test (#3850) (by Gabriel H)
    • [Lang] Enable dynamic indexing of matrix field elements when possible (#3865) (by Yi Xu)
    • [cuda] Hide debug info (#3878) (by Ye Kuang)
    • [refactor] Remove legacy helper functions for testing (#3874) (by Yi Xu)
    • [refactor] Remove import taichi from expr.py (#3871) (by Yi Xu)
    • [refactor] Remove import taichi from field.py (#3870) (by Yi Xu)
    • [refactor] Remove import taichi from mesh.py (#3869) (by Yi Xu)
    • [refactor] Remove import taichi in ast_transformer (#3827) (by lin-hitonami)
    • [refactor] Remove import taichi from struct.py (#3866) (by lin-hitonami)
    • [autodiff] Provide stmt name for auto-diff related assert info (#3864) (by Mingrui Zhang)
    • [lang] Cleanup parallel sort utility (#3858) (by Dunfan Lu)
    • [refactor] Enforce destruction order of OpenGlRuntime. (#3861) (by Ailing)
    • [refactor] Add _MatrixFieldElement class (#3862) (by Yi Xu)
    • [opt] Add more accurate alias analysis for ExternalPtrStmt (#3859) (by Yi Xu)
    • [mesh] A small fix for mesh loop syntax in python frontend (#3836) (by bx2k)
    • [test] Add more complicated tests for building and destroying SNodeTrees (#3415) (by ysh329)
    • [lang] Calculate dynamic indexing strides of matrix field elements (#3854) (by Yi Xu)
    • [bug] Fix autodiff for ceil. (#3844) (by Ailing)
    • [refactor] Remove import taichi in matrix.py (#3842) (by lin-hitonami)
    • [Misc] [android] Support logging on Android platforms (#3849) (by Gabriel H)
    • [opengl] Allocate new arg bufs per kernel launch (#3848) (by Ailing)
    • [opengl] Only sync per ti.kernel when there's external array arg. (by Ailing Zhang)
    • [refactor] Rename is_external_array -> is_array and arr_bufs_ -> ext_arr_bufs_ (by Ailing Zhang)
    • [opengl] Only bind buffers per ti.kernel. (by Ailing Zhang)
    • [misc] Version bump: v0.8.8 -> v0.8.9 (#3846) (by Yi Xu)
  • v0.8.8(Dec 21, 2021)

    Highlights:

    • Android
      • Add initial support of the platform (#3755) (by Gabriel H)
    • Bug fixes
      • Fix copying a Matrix/Struct in Taichi scope (#3838) (by Yi Xu)
      • Modify the implementation of norm_sqr() (#3803) (by Yi Xu)
    • Documentation
      • Tweak API docstrings (#3820) (by Ye Kuang)
      • Refactored contribution guidelines (#3789) (by Vissidarte-Herman)
      • Fix a misspelling (#3773) (by 张皓)
    • Examples
      • Fix constraint correction (#3740) (by Peng Yu)
    • GUI
      • Fix gui.text crash bug (#3770) (by Peng Yu)
    • Language and syntax
      • Use ndarray own memory allocator by default (#3843) (by Bo Qiao)
      • User-friendly exception when copying between ti.field (#3442) (by J. Li)
      • Enforce single deterministic return in taichi kernels and functions (#3795) (by lin-hitonami)
      • Add "In" support in static scope (#3792) (by lin-hitonami)
      • Support sparse solver datatype configuration (#3733) (by Peng Yu)
      • Fix pylint rule C0321 (multiple-statements) and C0325 (superfluous-parens). (#3762) (by kxxt)
      • Enforce members of a matrix field to have same shape (#3761) (by Yi Xu)
      • Fix pylint rule W1309 (f-string-without-interpolation) (#3757) (by kxxt)
    • LLVM backend (CPU and CUDA)
      • Remove the dependency of llvm-as (#3562) (by Tianshu Xu)
    • Metal backend
      • Pass random seed to metal backend (#3724) (by Jian Zeng)
    • Performance improvements
      • Unnecessary assignment as it is redefined (#3753) (by skywf)
    • Vulkan backend
      • Update AOT Loader support to new API (#3766) (by Gabriel H)
      • Add support of loading AOT modules and fields (#3703) (by Gabriel H)

    Full changelog:

    • [misc] Fix postsubmit status in README.md by replacing it with publishing checks (#3840) (by Velaciela)
    • [Lang] Use ndarray own memory allocator by default (#3843) (by Bo Qiao)
    • memset before reusing memory (#3841) (by Bo Qiao)
    • [refactor] Recover 'in' operator in vector_to_fast_image() (#3839) (by Yi Xu)
    • [Bug] [lang] Fix copying a Matrix/Struct in Taichi scope (#3838) (by Yi Xu)
    • [refactor] Remove import taichi in common_ops.py (#3824) (by lin-hitonami)
    • [vulkan] Fix command serial ordering & uses less queue submits (#3818) (by Bob Cao)
    • [ci] Enforce using ninja on Windows in both release and normal testing (#3837) (by Yi Xu)
    • [lang] Fix ndarray cuda dealloc when using preallocated memory (#3829) (by Bo Qiao)
    • [Lang] User-friendly exception when copying between ti.field (#3442) (by J. Li)
    • [refactor] Remove import taichi in kernel_impl.py (#3825) (by lin-hitonami)
    • Update TaichiCXXFlags.cmake (#3823) (by Bob Cao)
    • [bug] Fix static grouped for (#3822) (by lin-hitonami)
    • [misc] Silent & non-blocking version check (#3816) (by Jiasheng Zhang)
    • [doc] Editorial updates to contributor_guide.md (#3806) (by Vissidarte-Herman)
    • [Vulkan] Update AOT Loader support to new API (#3766) (by Gabriel H)
    • [Doc] Tweak API docstrings (#3820) (by Ye Kuang)
    • [ci] Use clang+Ninja to build Taichi on windows (#3735) (by Bob Cao)
    • [LLVM] Remove the dependency of llvm-as (#3562) (by Tianshu Xu)
    • [build] No need to build main executable (#3804) (by Frost Ming)
    • [lang] Add parallel sort utility (#3790) (by Dunfan Lu)
    • [misc] [bug] Fix legacy benchmarks/run.py (#3812) (by rocket)
    • [refactor] Move CompiledFieldData to aot namespace. (#3797) (by Ailing)
    • [docs] Update README.md to include command to install nightly. (#3809) (by Ailing)
    • [Lang] Enforce single deterministic return in taichi kernels and functions (#3795) (by lin-hitonami)
    • [bug] Fix typo in opengl codegen. (#3801) (by Ailing)
    • [bug] Use NdarrayRwKeys for ndarray host reader & writer caching. (#3805) (by Ailing)
    • [Bug] [opengl] Modify the implementation of norm_sqr() (#3803) (by Yi Xu)
    • [bug] Create core folder if it doesn't exist. (#3799) (by Ailing)
    • [doc] Update CONTRIBUTING.md to include contribution opportunities (#3794) (by Ye Kuang)
    • [Doc] Refactored contribution guidelines (#3789) (by Vissidarte-Herman)
    • [refactor] Move ti.lib to ti._lib and move ti.core to ti._lib.core (#3731) (by lin-hitonami)
    • [Lang] Add "In" support in static scope (#3792) (by lin-hitonami)
    • [ir] [llvm] Add offset_bytes_in_parent_cell to SNode (#3793) (by Yi Xu)
    • [bug] [opengl] Avoid using new as variable name in generated glsl. (#3786) (by Ailing)
    • [refactor] Move diagnose.py and cc_compose.py to tools/ (#3788) (by lin-hitonami)
    • [misc] Increase the kernel number recorded by CUPTI (#3780) (by rocket)
    • [vulkan] Try to enable Vulkan test in macOS presubmit (#3456) (by Bob Cao)
    • [llvm] [bug] Support atomic min/max for unsigned int type (#3779) (by Chang Yu)
    • [Lang] Support sparse solver datatype configuration (#3733) (by Peng Yu)
    • [gui] Fix vector to fast image (#3778) (by Bob Cao)
    • [Example] [bug] Fix constraint correction (#3740) (by Peng Yu)
    • [refactor] Remove get_type_size() from JITSession (#3777) (by Yi Xu)
    • [Lang] Fix pylint rule C0321 (multiple-statements) and C0325 (superfluous-parens). (#3762) (by kxxt)
    • [misc] Remove more deprecated APIs (#3774) (by Zack Wu)
    • [GUI] Fix gui.text crash bug (#3770) (by Peng Yu)
    • Fix spirv types (#3772) (by Dunfan Lu)
    • [Doc] Fix a misspelling (#3773) (by 张皓)
    • [Lang] Enforce members of a matrix field to have same shape (#3761) (by Yi Xu)
    • [bug] Enable int32 atomic ops for opengl backend. (#3760) (by Ailing)
    • [gui] Fix ggui canvas.set_image (#3767) (by Dunfan Lu)
    • [test] Enable ndarray tests (#3759) (by Bo Qiao)
    • [Lang] Fix pylint rule W1309 (f-string-without-interpolation) (#3757) (by kxxt)
    • [Android] Add initial support of the platform (#3755) (by Gabriel H)
    • Update dev_install.md (#3758) (by Vissidarte-Herman)
    • [misc] Speed-up builds by removing LLVM includes from llvm_program.h (#3756) (by Bob Cao)
    • [Perf] Unnecessary assignment as it is redefined (#3753) (by skywf)
    • [ci] Update performance monitoring (#3741) (by rocket)
    • [opengl] Don't serialize ext_arr_access in aot. (#3749) (by Ailing)
    • [refactor] Move ti.randn out of taichi.lang. (#3742) (by Ailing)
    • [bug] Temporarily remove runtime_ usage in aot_module_builder. (#3746) (by Ailing)
    • [misc] Update title check to be more robust. (#3747) (by Ailing)
    • [Metal] Pass random seed to metal backend (#3724) (by Jian Zeng)
    • [Vulkan] Add support of loading AOT modules and fields (#3703) (by Gabriel H)
    • [refactor] Let build_xxx in ASTTransformer return node.ptr instead of node (#3695) (by lin-hitonami)
    • [doc] Fix a heading level in syntax (#3738) (by Ran)
    • [misc] Remove deprecated Python APIs (#3725) (by Zack Wu)
    • [ci] Auto generate manylinux dockerfile (#3699) (by Bo Qiao)
    • version bump (#3736) (by lin-hitonami)
    • [misc] Support Clang build on windows (#3732) (by Bob Cao)
    • [test] Refine the way to test the laplace example (#3721) (by Yi Xu)
  • v0.8.7(Dec 7, 2021)

    Full changelog:

    • [refactor] Merge taichi.misc into taichi.tools. (by Ailing Zhang)
    • [refactor] Merge taichi.lang.types into taichi.types. (by Ailing Zhang)
    • Add /bigobj flag to allow Debug builds on windows (#3730) (by Bob Cao)
    • [misc] Add an option to skip version check (#3729) (by Jiasheng Zhang)
    • [refactor] Separate Cpp examples into different files (#3728) (by Dunfan Lu)
    • [ci] [bug] Fix release in CI (#3719) (by lin-hitonami)
    • [Lang] Support annotated assignment (#3709) (by Ziwen Ye)
    • [vulkan] Further decouple SPIRV codegen from Vulkan runtime (#3711) (by Dunfan Lu)
    • [refactor] First clang-tidy pass (#3407) (by Taichi Gardener)
    • [misc] Remove legacy Windows-related scripts (#3722) (by Yi Xu)
    • [misc] Error handle and TLS (#3718) (by Jiasheng Zhang)
    • fix tab and long line (#3669) (by lin-hitonami)
    • [bug] Fix template arguments of ti.func with default values (#3716) (by lin-hitonami)
    • [refactor] Rename CompiledProgram to CompiledTaichiKernel. (by Ailing Zhang)
    • [refactor] Add doc string for ndarray element_shapes & field_dim. (by Ailing Zhang)
    • Update euler.py (#3715) (by skywf)
    • [misc] Use ccache & prefer Ninja over Make as cmake (#3712) (by Ailing)
    • fix the script path in release workflow (#3713) (by Frost Ming)
    • [ci] Use the same script for docker and system build & test (#3698) (by Frost Ming)
    • [example] SPIR-V AOT example in C++ (#3707) (by Dunfan Lu)
    • [LLVM] Add missing pre-processor macro when LLVM is disabled (#3702) (by Gabriel H)
    • [example] Build cpp examples (#3705) (by Dunfan Lu)
    • [doc] Improve the dev install doc (#3685) (by Ye Kuang)
    • [cuda] Enable block splitting for cuda caching memory allocator (#3677) (by Bo Qiao)
    • [ci] Run cpp tests in unix_docker_test.sh (#3693) (by Jian Zeng)
    • [refactor] [opengl] Properly save both scalar args and array args. (by Ailing Zhang)
    • [refactor] [opengl] For each taichi kernel, save a {kernel_name, CompiledProgram} pair instead of a vector. (by Ailing Zhang)
    • [refactor] [opengl] Use CompiledOffloadedTask instead of CompiledKernel. (by Ailing Zhang)
    • [refactor] [opengl] Remove dtype & total_size_hint in serialized aot file. (by Ailing Zhang)
    • [Lang] Make example_any_arrays optional. (by Ailing Zhang)
    • [Lang] Add optional element_shapes & field_dim annotation to ndarray. (by Ailing Zhang)
    • [vulkan] Isolate SPIR-V codegen, cleanup Vulkan backend (#3676) (by Bob Cao)
    • [refactor] Turn off empty root buffer warning for finalize_for_aot. (#3681) (by Ailing)
    • [misc] Handle outermost error and change metadata server address to latest (#3686) (by Jiasheng Zhang)
    • [refactor] Remove ext_arr_map in favor of CompiledArrayData. (by Ailing Zhang)
    • [refactor] Remove unused functions in misc. (#3671) (by Ailing)
    • [ci] Fix cancel workflow to only cancel the same branch (#3678) (by Frost Ming)
    • [refactor] Rename testing.py to _testing.py (#3668) (by Yi Xu)
    • [Vulkan] add initial support of AOT (#3647) (by Gabriel H)
    • [refactor] Move image.py from misc/ to tools/ and remove it from top level package (#3672) (by Yi Xu)
    • [opengl] [refactor] Expose use_gles in CompileConfig. (#3662) (by Ailing)
    • checkout the repo for cancel (#3670) (by Frost Ming)
    • [refactor] [opengl] Move ndarray aot information inside kernels. (#3658) (by Ailing)
    • [ci] Add sccache to ci building process on Linux and Mac jobs (#3559) (by lin-hitonami)
    • [refactor] Remove np2ply, patterns, video from top level package (#3660) (by Yi Xu)
    • [ci] Set token in cancel workflow (#3664) (by Frost Ming)
    • [ci] Run code format check for doc-only changes as well. (#3665) (by Ailing)
    • [bug] Fix postsubmit (#3661) (by lin-hitonami)
    • [cuda] Query the max block count from the hardware (#3657) (by Bob Cao)
    • Restructured dev_install.md (#3616) (by Vissidarte-Herman)
    • [error] Let type_check throw TypeError (#3650) (by lin-hitonami)
    • Attempt to unite postsubmit and presubmit (#3654) (by Frost Ming)
    • [Refactor] Deprecate Expr::operator= (#3596) (by Jun)
    • Use helper fill function (#3655) (by Bo Qiao)
    • [perf] Accelerate _inside_class() (#3653) (by Yi Xu)
    • [Bug] [lang] Fix copying Matrix/StructField elements in Taichi scope (#3649) (by Yi Xu)
    • [refactor] Ignore filename prefix in opengl aot files. (#3648) (by Ailing)
    • [error] Add line number and source code to exception (#3637) (by lin-hitonami)
    • [Lang] Cuda caching allocator for ndarray 1/n (#3581) (by Bo Qiao)
    • [refactor] Move GUI from misc to ui. (by Ailing Zhang)
    • [refactor] Get rid of tools/file.py (#3645) (by Yi Xu)
    • [refactor] Get rid of tools/messenger.py. (by Ailing Zhang)
    • [refactor] Get rid of ti.task. (by Ailing Zhang)
    • [refactor] Remove primitive_types module in top level package. (by Ailing Zhang)
    • [Bug] [lang] Fix copying a Matrix/Struct from Python scope to Taichi scope (#3638) (by Yi Xu)
    • [Lang] Add option short_circuit_operators for short-circuiting boolean ops (#3632) (by daylily)
    • Fix build script for docker build (#3629) (by Frost Ming)
    • [gui] Fix incorrect shading when first calling mesh/particles (#3628) (by Chang Yu)
    • [misc] Temporarily disable performance monitoring for testing offline (#3626) (by rocket)
    • [refactor] Get rid of ASTTransformerTotal and rename IRBuilder to ASTTransformer (#3610) (by lin-hitonami)
    • [bug] Correctly support serializing maps & vectors to json. (by Ailing Zhang)
    • [bug][opengl] Only add atomicAdd functions in generated code for arrs when they're used. (by Ailing Zhang)
    • [refactor] Do not expand to absolute path for saved shaders. (by Ailing Zhang)
    • Respect drawing order specified by user (#3614) (by Dunfan Lu)
    • [bug] Revert part of #3569 so that tests are not skipped. (#3620) (by Ailing)
    • [lang] Disable signal handlers when TI_DISABLE_SIGNAL_HANDLERS=1 (#3613) (by Ye Kuang)
    • [misc] Do not let check_version block users (#3619) (by Jiasheng Zhang)
    • fix unix build script for nightly build (#3618) (by Frost Ming)
    • [Lang] Implement opt_level for accelerating compiling (#3434) (by squarefk)
    • [gui] Fix vulkan glfw image count (#3604) (by Bob Cao)
    • [misc] Remove legacy torch_io.py (#3609) (by Yi Xu)
    • [bug] Enable reassignment of scalar arguments (#3607) (by lin-hitonami)
    • [misc] Fix upload release error handling (#3606) (by Jiasheng Zhang)
    • [opengl] Serialize ndarrays and ndarray-based kernels in AOT. (by Ailing Zhang)
    • [opengl] Support taichi ndarray on opengl backend and enable tests. (by Ailing Zhang)
    • [llvm] Make taichi's Ndarray carry a ptr to its DeviceAllocation. (by Ailing Zhang)
    • Temporarily disable flaky test (#3603) (by Bo Qiao)
    • [Opt] [ir] [refactor] Remove exception from simplify pass (#3317) (by lin-hitonami)
    • [misc] Version bump: v0.8.6 -> v0.8.7 (#3602) (by Jiasheng Zhang)
    • [mesh] Make ti.Mesh compatible with dynamic index (#3599) (by Yi Xu)
  • v0.8.6(Nov 23, 2021)

    Notes:

    We added a function that periodically checks version information on ti.init to remind users when a new version has been released. However, this function was not fully tested when 0.8.6 was released, and its error handling was not well implemented: Taichi crashes when it fails to check the version (for example, due to network issues). For this reason, 0.8.6 has been removed from the official releases on PyPI. Please upgrade to a newer version if you are on this version. We are sorry for the inconvenience.
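To illustrate the failure mode described in this note, here is a minimal sketch of a version check that cannot crash or stall its caller. It is not Taichi's actual implementation: `check_version_safely`, `fetch_latest`, and the `timeout` parameter are hypothetical names chosen for this example, and the fetcher is passed in so that the sketch runs without network access.

```python
import threading


def check_version_safely(fetch_latest, current, timeout=3.0):
    """Return a reminder string if a newer version exists, else None.

    `fetch_latest` is a hypothetical callable that returns the latest
    version as a tuple of ints, e.g. (0, 8, 7). Any failure inside it
    (network error, malformed reply) is swallowed, and a slow check is
    abandoned once `timeout` seconds have passed.
    """
    result = {}

    def worker():
        try:
            latest = tuple(fetch_latest())
            if latest > tuple(current):
                result["latest"] = latest
        except Exception:
            # A failed check must never propagate to the caller.
            pass

    # A daemon thread does not keep the interpreter alive if it hangs.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)

    latest = result.get("latest")
    if latest is None:
        return None
    return "A newer version is available: " + ".".join(map(str, latest))


def failing_fetch():
    # Simulates the network problem mentioned in the note above.
    raise ConnectionError("network unreachable")


print(check_version_safely(failing_fetch, (0, 8, 6)))
print(check_version_safely(lambda: (0, 8, 7), (0, 8, 6)))
```

The first call prints `None` because the simulated network error is contained; the second prints a reminder for 0.8.7. Running the check in a daemon thread with a bounded `join` is one possible way to keep a slow or failing check from blocking program start-up.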

    Full changelog:

    • [Lang] Check version when importing taichi instead of when using ti (#3598) (by Jiasheng Zhang)
    • [Lang] Add ti.round op (#3541) (by gaoxinge)
    • [Bug] [ir] Fix the IdentifyValuesUsedInOtherOffloads pass (#3597) (by Yi Xu)
    • [ci] Fix release (#3594) (by Jiasheng Zhang)
    • [perf] Add async_mode restriction to ti.sync() of external_arrays in class Kernel (#3535) (by rocket)
    • [Lang] Add time check before performing version check (#3589) (by Jiasheng Zhang)
    • [Doc] Fix example link in README. (#3584) (by egolearner)
    • [gui] GGUI initial alpha transparency support (#3592) (by Bob Cao)
    • [ci] Turn off vulkan build on macos (#3591) (by Jiasheng Zhang)
    • [ci] Fix release bug (#3585) (by Jiasheng Zhang)
    • [misc] Update documentations after the examples directory changed (#3587) (by Velaciela)
    • [bug] Fix missing tests/ folder in postsubmit & release workflows. (#3583) (by Ailing)
    • [refactor] Avoid copying examples/ folder when building taichi. (by Ailing Zhang)
    • [Test] Remove ti test from taichi package. (by Ailing Zhang)
    • [ci] Fix postsubmit mac build failure (#3579) (by Jiasheng Zhang)
    • [misc] MoltenVK dynamic library & allow overriding CLANG_EXECUTABLE (#3565) (by Bob Cao)
    • [Lang] Add get_element_size for ndarray (#3576) (by Bo Qiao)
    • [ci] Remove usage of build.py in favor of setup.py (#3537) (by Frost Ming)
    • [Bug] [llvm] Fix FP<->UInt castings (#3560) (by Yi Xu)
    • [Bug] [vulkan] Fix data type alignment for arguments and return values (#3571) (by Yi Xu)
    • ci (#3569) (by Tianshu Xu)
    • [Bug] [metal] Fix data type alignment for arguments and return values (#3564) (by Yi Xu)
    • [Mesh] [opt] Support mesh-for for multi-CPUs & demote atomic stmts in BLS for x64 (by Chang Yu)
    • [Mesh] Remove Matrix.with_entries() & support ti.mesh_patch_idx() (by Chang Yu)
    • [Mesh] Support mesh-for for CPU backend (by Chang Yu)
    • [Mesh] [refactor] Migrate ti.Mesh to refactored frontend (by Chang Yu)
    • [Mesh] Add type_check for ti.Mesh frontend (by g1n0st)
    • [Mesh] Add CI tests for ti.Mesh (by Chang Yu)
    • [Mesh] Decouple metadata (by Chang Yu)
    • [Mesh] Fix misc & restore code formatter (by g1n0st)
    • [Mesh] Reduce SNode trees allocation for ti.Mesh (by Chang Yu)
    • [Mesh] Refactor serialize() for frontend IR (by Chang Yu)
    • [Mesh] Fix bugs to enable reordered mesh attribute (by Chang Yu)
    • [Mesh] Fix bugs to enable nested relation access (by Chang Yu)
    • [Mesh] [refactor] Refactor frontend (by Chang Yu)
    • [Mesh] [opt] Demote no relation access mesh-for to range-for (by Chang Yu)
    • [Mesh] Quick fix bugs after rebase (by g1n0st)
    • [Mesh] Fix Layout.AOS (by Chang Yu)
    • [Mesh] Quick fix bugs after rebase (by g1n0st)
    • [Mesh] [fix] Fix failed caching mapping only (by Chang Yu)
    • [Mesh] Add experimental compile configs (by Chang Yu)
    • [Mesh] [refactor] Remove MeshAttribute in mesh class (by Chang Yu)
    • [Mesh] [opt] Make mesh attribute local (by Chang Yu)
    • [Mesh] [Lang] Quick fix rebase conflicts (by g1n0st)
    • [Mesh] Add analyzer to determine which mapping should be cached (by g1n0st)
    • [Mesh] Clean MeshAttributeSet in DecoratorRecorder (by Chang Yu)
    • [Mesh] [refactor] Add global to reordered index mapping type (by Chang Yu)
    • [Mesh] [refactor] Unified MeshRelationAccessStmt and MeshRelationSizeStmt (by g1n0st)
    • [Mesh] [refactor] Rename to_string functions (by g1n0st)
    • [Mesh] Use ti.axes instead of ti.indices (by g1n0st)
    • [Mesh] Add ti.mesh_local() (by g1n0st)
    • [Mesh] [refactor] Divide make_mesh_index_mapping_local pass into multiple functions (by g1n0st)
    • [Mesh] [test] Delete outdated mesh-for test (by g1n0st)
    • [Mesh] [opt] Optimize reordered index mapping case (by g1n0st)
    • [Mesh] [refactor] from_type() as statement attribute (by g1n0st)
    • [Mesh] Add optimization pass to make index mapping local (by g1n0st)
    • [Mesh] Set MeshTaichi as extension (by g1n0st)
    • [Mesh] [refactor] Rename make_mesh_attribute_local to demote_mesh_statements (by g1n0st)
    • [Mesh] Support low-to-high and same-order relation access (by g1n0st)
    • [Mesh] Add analysis pass to gather mesh thread local variables (by g1n0st)
    • [Mesh] Add analysis pass to gather mesh_for relation types (by g1n0st)
    • [Mesh] Clean up field template based residual & fix bugs (by g1n0st)
    • [Mesh] Id property to interact with non-mesh field (by g1n0st)
    • [Mesh] MeshRelationAccessStmt & MeshIndexConversionStmt backend implementation (by g1n0st)
    • [Mesh] Frontend Impl (by g1n0st)
    • [Mesh] Fix code format (by g1n0st)
    • [Mesh] [IR] Add MeshRelationAccessStmt & MeshIndexConversionStmt (by g1n0st)
    • [IR] Quick fix rebase conflict (by g1n0st)
    • [Mesh] [Lang] New ti.Mesh frontend class prototype & MeshRelationSize statement and expression (by g1n0st)
    • [Mesh] Fix mesh-for in multiple passes (by g1n0st)
    • [Mesh] Fix type_check pass for body_prologue in OffloadedStmt (by g1n0st)
    • [Mesh] Add make_mesh_thread_local pass (by g1n0st)
    • [Mesh] Make the type of loop in MeshPatchIndexStmt explicit (by g1n0st)
    • [Mesh] Make get num_patches behave correctly (by g1n0st)
    • [Mesh] Removed wildcard import in python (by g1n0st)
    • [Mesh] Add MeshPatchIndexStmt statement (by g1n0st)
    • [Mesh] Add a backend Mesh class prototype & relation based mesh_for (by g1n0st)
    • [Mesh] [refactor] Create a new pass called make_mesh_attribute_local (by g1n0st)
    • [Mesh] A simple BLS pass to do the local to global mapping (by bx2k)
    • [Mesh] A Frontend ti.Mesh class prototype (by g1n0st)
    • [Mesh] Fix meshfor at simplify pass (by bx2k)
    • [Mesh] A simple meshfor frontend enable to print index (by bx2k)
    • [Mesh] Add Meshfor prototype with dirty hacks (by bx2k)
    • [LLVM] Fix casting f64 to f16 (#3561) (by Tianshu Xu)
    • [misc] Improve the mechanism to find clang (#3379) (by Tianshu Xu)
    • [CI] Add version database update in CD (#3540) (by Jiasheng Zhang)
    • [Doc] Move build Taichi from source one level up (#3551) (by tison)
    • [Misc] [refactor] Move symbol versioning to a new file (#3426) (by Bo Qiao)
    • [misc] Fix redefined reset function (#3227) (#3521) (by u2386)
    • [ci] Dockerfile for CPU manylinux2014 compliant (#3542) (by Bo Qiao)
    • [Lang] [bug] Fix numpy from and to ndarray matrix (#3549) (by Bo Qiao)
    • [Lang] Remove disable_local_tensor and empty() from Matrix (#3546) (by Yi Xu)
    • [refactor] Remove with_entries() and keep_raw from Matrix (#3539) (by Yi Xu)
    • [lang] Limit torch based ndarray to cpu/cuda backend. (#3545) (by Ailing)
    • [Refactor] Fix typo and type in docstring, and format too long string (#3530) (by gaoxinge)
    • [Lang] Add check version function to taichi main (#3526) (by Jiasheng Zhang)
    • [bug] Remove fallback in C++ code (#3538) (by lin-hitonami)
    • [refactor] Remove empty_copy() and copy() from Matrix/Struct (#3536) (by Yi Xu)
    • [misc] Reorg CI stages (#3525) (by Tianshu Xu)
    • [ci] Remove build_and_test_cpu_required from CI to save time (#3534) (by lin-hitonami)
    • [refactor] Remove variable() of Matrix/Struct and empty() of Matrix/StructType (#3531) (by Yi Xu)
    • [ci] Disable arch fallback on CI (#3474) (by lin-hitonami)
    • [refactor] Decouple KernelProfilerBase::sync() from Program::synchronize() (#3504) (by rocket)
    • [Lang] Add deepcopy for ndarray (#3473) (by Bo Qiao)
    • [Lang] Fix pylint rule E1101 (#3500) (by DeepDuke)
    • [Lang] Implement ti.global_thread_idx (#3319) (by Shay P.C)
    • [refactor] Remove the old AST builder from python frontend (#3527) (by lin-hitonami)
    • [Lang] Remove disable_local_tensor in most cases (#3524) (by Yi Xu)
    • [opengl] Use separate ssbo for external arrays. (by Ailing Zhang)
    • [refactor] Rename Context to RuntimeContext. (by Ailing Zhang)
    • [Lang] Enable local tensors as writeback binary operation results (#3517) (by Yi Xu)
    • fix (#3523) (by Dunfan Lu)
    • [metal] Add a TI_WITH_METAL option (#3510) (by Dunfan Lu)
    • [Lang] Fix pylint rule W0621 (#3498) (by ZONEPG)
    • [refactor] Enable the new ast builder by default (#3516) (by lin-hitonami)
    • [ci] Add a helper script for Dockerfile generation. (#3509) (by Chengchen(Rex) Wang)
    • [ci] Fix release action now being able to be triggered manually (#3520) (by Jiasheng Zhang)
    • Create CONTRIBUTING.md (#3518) (by Tianshu Xu)
    • Update unix_build.sh (#3511) (by Bob Cao)
    • [ci] Minor fix for windows release upload. (#3513) (by Ailing)
    • [llvm] Add a TI_WITH_LLVM option (#3507) (by Dunfan Lu)
    • [CUDA] Fix a misuse of std::move: address of stack memory associated with temporary object of type std::lock_guard<std::mutex> returned to caller (#3502) (by Twice)
    • [Lang] Add W0101 rule for pylint (#3497) (by Ligeng Zhu)
    • Add PEP 517 build specification (#3495) (by Frost Ming)
    • [Lang] Fix pylint rule W0622 (#3501) (by Mark Huang)
    • [Lang] Fix pylint rule R1710 (#3496) (by licktion)
    • [Lang] Fix pylint rule W0612 (#3151) (#3488) (by klein)
    • [ci] Fix pylint conflicts (#3503) (by Ye Kuang)
    • [Lang] Fix pylint rule C0209 (#3489) (by zstone12)
    • [Lang] Fix pylint rule W0404 (#3477) (by Dustyposa)
    • fix pylint W0235 (#3486) (by ImPerat0R_)
    • [IR] Enforce type check for all expressions (#3461) (by Yi Xu)
    • [Lang] Fix pylint rule W0101 (#3493) (by darkSheep)
    • [Lang] Fix pylint rule W0108 (#3482) (by Isaac)
    • [Lang] Fix pylint rule R0201 (#3494) (by Alex Chi)
    • [Lang] Fix pylint rule C0200 (#3480) (by Keming)
    • [Lang] Fix pylint rule R1705 (#3491) (by IceCodeNew)
    • [Lang] Fix pylint rule R1732 (#3490) (by IceCodeNew)
    • [Lang] Fix pylint rule R1703 (#3472) (by HHHJH)
    • [Lang] Fix pylint rule R0205 (#3487) (by Isaac)
    • [Lang] Fix pylint rule R0402 (#3483) (by Yu Dou)
    • [misc] Enable clang-tidy check in CI (#3475) (by Tianshu Xu)
    • [Doc] Fix wrong example about TAICHI_CMAKE_ARGS (#3485) (by Jun)
    • [Lang] Fix pylint rule W0201 (#3476) (by Dustyposa)
    • [Lang] Fix pylint rule W0611 (#3478) (by ZHANG Zhi)
    • [docs] Remove ti format in doc. (#3479) (by Ailing)
    • [Lang] Enable local tensors as arithmetic operation results (#3468) (by Yi Xu)
    • [Lang] Fix pylint rule W0401 (#3471) (by Alkaid)
    • [Lang] Change the type error to a real exception (#3439) (by Frost Ming)
    • Report error if upload fails (#3462) (by Frost Ming)
    • [opengl] Unify windows path to posix format in python. (#3470) (by Ailing)
    • [Bug] [opt] Visit RangeForStmt in IdentifyValuesUsedInOtherOffloads (#3466) (by Yi Xu)
    • [vulkan] Link to MoltenVK on macOS (#3445) (by Bob Cao)
    • [vulkan] Basic Bitmasked support (#3412) (by Bob Cao)
    • [opengl] Make windows path in saved aot json human readable. (#3460) (by Ailing)
    • [refactor] [bug] Eliminate failing tests on the new AST builder (#3441) (by lin-hitonami)
    • [misc] Add default values of TI_VERSION (#3459) (by Tianshu Xu)
    • [misc] Temporarily disable clang-tidy check. (#3458) (by Ailing)
    • [misc] Remove regex in TextSerializer. (#3454) (by Ailing)
    • [IR] Add type_check for Atomic/SNodeOpExpression (#3444) (by Yi Xu)
    • [IR] Remove EvalExpression (#3448) (by Yi Xu)
    • [misc] Version bump: v0.8.5->v0.8.6. (#3457) (by Ailing)
  • v0.8.5(Nov 10, 2021)

    Full changelog:

    • [misc] Fix python wheel versioning. (#3450) (by Ailing)
    • [misc] Version bump: v0.8.4->v0.8.5. (#3447) (by Ailing)
    • [IR] Add type inference for loop variables (#3437) (by Yi Xu)
    • [LLVM] Fix link (#3443) (by Tianshu Xu)
    • [IR] Add type_check for RangeAssumption/LoopUnique/ExternalTensorShapeAlongAxisExpression (#3436) (by Yi Xu)
    • [llvm] Support atomic operations of f16 (#3428) (by Tianshu Xu)
    • [Doc] Put back content from old docs/lang/api/atomic.md (#3440) (by Yi Xu)
    • [refactor] Add Assert, BoolOp, NamedExpr and dict to the new AST builder (#3398) (by lin-hitonami)
    • [Bug] Revert #3428 (#3438) (by Tianshu Xu)
    • [gui] GGUI Tests (#3430) (by Dunfan Lu)
    • [gui] Fix canvas.lines on macOS (#3432) (by Dunfan Lu)
    • [ci] Fix aws machine not removing container (#3435) (by Jiasheng Zhang)
    • [gui] Show f16 image as f32. (#3433) (by Ailing)
    • [ci] Move required cpu check to AWS machine (#3427) (by Jiasheng Zhang)
    • [Refactor] Simplify runtime function definition (#3429) (by Tianshu Xu)
    • [lang] [refactor] Use preallocated memory via device allocation for Ndarray (#3395) (by Bo Qiao)
    • [refactor] Add ListComp and DictComp to the new AST builder (#3400) (by lin-hitonami)
    • [ci] Add build script for win (#3410) (by Frost Ming)
    • [OpenGL] Add mem_offset_in_parent to serialized file in AOT. (#3418) (by Ailing)
    • Update gui.md: comprehend widgets example (#3424) (by FantasyVR)
    • [refactor] Add for and while to the new frontend AST builder (#3353) (by lin-hitonami)
    • [CI] Recheck the title format when user updates the title (#3403) (by Manjusaka)
    • [Refactor] Use wrapped create_call (#3421) (by Tianshu Xu)
    • [Doc] Update dev install about m1 prebuilt llvm. (#3419) (by Ailing)
    • [opengl] Save opengl aot data in json format. (#3417) (by Ailing)
    • [IR] Add type_check for expressions related to fields and matrices (#3377) (by Yi Xu)
    • [misc] Update "get_largest_pot" in scalar.h + Bug Fix (#3405) (by Niclas Schwalbe)
    • [refactor] Remove Matrix.new (#3408) (by Yi Xu)
    • [refactor] Remove handling for real types in set_arg_int. (#3388) (by Ailing)
    • [CI] Skip the full test when the PR only changes docs (#3399) (by Manjusaka)
    • [LLVM] Fix logging formats (#3404) (by Tianshu Xu)
    • Update GLFW (#3406) (by Bob Cao)
    • [IR] Fix continue statement in struct for and add a related test (#3282) (by bx2k)
    • Fix non 4 byte element external array in SPIR-V codegen & enable f16 test for Vulkan (#3396) (by Bob Cao)
    • [misc] Add new issues templates (#3390) (by Tianshu Xu)
    • [refactor] Add SparseMatrixBuilder and any_array support in kernel argument in the new AST builder (#3352) (by lin-hitonami)
    • [Doc] Put back content from old docs/lang/api/arithmetics.md (#3394) (by Yi Xu)
    • [Refactor] Move the cuda codegen part of atan2/pow to codegen_cuda.cpp (#3392) (by Jian Zeng)
    • [Doc] Fix an API typo in GGUI doc (#3397) (by Chang Yu)
    • [example] Create inital_value_problem.py (#3383) (by Niclas Schwalbe)
    • [gui] Allowing recreating GGUI windows after ti.reset() (#3389) (by Dunfan Lu)
    • [ci] Add clang-tidy in CI (#3354) (by Tianshu Xu)
    • [metal] Add mem_offset_in_parent to AOT module (#3245) (by Ye Kuang)
    • [vulkan] FP16 support, fix a few bugs & warnings (#3387) (by Bob Cao)
    • [Lang] fp16 interacts with pytorch. (by Ailing Zhang)
    • [llvm] Basic f16 support (by Ailing Zhang)
    • [cuda] Perf: Use the min between saturation grid dim and const range-for dim (#3314) (by Bob Cao)
    • [IR] Add type_check for TernaryOpExpression (#3381) (by Yuheng Zou)
    • [lang] [refactor] Use DeviceAllocation for Ndarray (#3366) (by Bo Qiao)
    • [Test] Fix uninitialized tests for ndarray (#3365) (by Bo Qiao)
    • repush (#3376) (by Ye Kuang)
    • feat: define str method for DataType (#3370) (by Jian Zeng)
    • [opengl] Do not use macros in GLSL codegen (#3369) (by Ye Kuang)
    • [Lang] Call external function with llvm bitcode (#2873) (by squarefk)
    • [opengl] Off-screen context using EGL & Support GLES (#3358) (by Bob Cao)
    • [misc] Move configured headers out of the binary directory (#3363) (by Tianshu Xu)
    • [misc] Use configured headers to track version and commit hash (#3349) (by Tianshu Xu)
    • [IR] Add type_check for UnaryOpExpression (#3355) (by Yi Xu)
    • [gui] Fix IMGUI when GGUI is running in headless mode (#3357) (by Dunfan Lu)
    • [doc] Ggui image IO and headless docs (#3359) (by Dunfan Lu)
    • [ir] Add missing constructors for TypedConstant (#3351) (by Yi Xu)
    • [Lang] Better type error messages (#3345) (by Yi Xu)
    • [gui] Headless GGUI (#3348) (by Dunfan Lu)
    • [refactor] Add IfExp, static assign and AugAssign to the new frontend AST builder (#3299) (by lin-hitonami)
    • [Refactor] Python frontend refactor: build_call part, support format print and fstring (#3342) (by Jiasheng Zhang)
    • [refactor] Add compare to the new python frontend ast builder (#3344) (by lin-hitonami)
    • [gui] Fix bug where VBO/IBO cannot exceed 128 MB(#3347) (by Dunfan Lu)
    • [ci] Set GITHUB_CONTEXT for performance monitoring (#3343) (by rocket)
    • [opengl] Only run preprocess_kernels when glslc is available. (#3341) (by Ailing)
    • [CI] Linux CD containerization (#3339) (by Jiasheng Zhang)
    • [opengl] Expose allow_nv_shader_extension in compileconfig. (#3340) (by Ailing)
    • [ci] Update windows postsubmit job timeout to 90mins. (#3336) (by Ailing)
    • [ci] Update postsubmit job performance_monitoring (#3296) (by rocket)
    • [benchmark] Store the benchmark results as json files (#3294) (by rocket)
    • [docs] Only preview english doc. (#3338) (by Ailing)
    • [IR] Support frontend type inference in simple cases (#3302) (by Yi Xu)
    • [gui] GGUI Image IO (well, it's actually just O...) (#3333) (by Dunfan Lu)
    • [opengl] Provide an option to disable NV extensions during codegen (#3331) (by Ye Kuang)
    • [OpenGl] Support preprocessing glsl code in aot. (by Ailing Zhang)
    • [OpenGl] Set result_buffer for opengl aot. (by Ailing Zhang)
    • [vulkan] Fix GGUI (#3330) (by Dunfan Lu)
    • [misc] Port taichi to FreeBSD (#3325) (by Inoki)
    • [Refactor] Fix compiler warnings (#3322) (by Tianshu Xu)
    • Fix GL Device (#3315) (by Bob Cao)
    • [Doc] Update dev install about m1. (#3321) (by Ailing)
    • [Lang] Support f-string (by Jian Zeng)
    • [OpenGl] Merge Retr SSBO into Args. (#3313) (by Ailing)
    • [CI] Containerize CI (#3291) (by Jiasheng Zhang)
    • [vulkan] Release vulkan on macOS (#3305) (by Dunfan Lu)
    • [GUI] Fix re-use buffer bug in ggui (#3311) (by YuZhang)
    • [ci] Add Dockerfile to support minimum CPU (#3277) (by Bo Qiao)
    • [Lang] Add suppress_warning argument to ti.matrix initialization (#3310) (by Zhehao Li)
    • Remove Vulkan SDK dependency (#3307) (by Bob Cao)
    • [Bug] Add missing integral datatype support for ti.min/ti.max (#3248) (by FantasyVR)
    • [opengl] Remove extra semicolon in glsl generated code. (#3309) (by Ailing)
    • Merge GLBufId::Extr into Args. (#3306) (by Ailing)
    • [opengl] Fix typo in printing glsl kernel. (#3308) (by Ailing)
    • [Doc] update the security email address (#3297) (by Manjusaka)
    • [opengl] Recover GLSL printing (#3300) (by Ye Kuang)
  • v0.8.4(Oct 27, 2021)

    Full changelog:

    • [misc] Version bump: v0.8.3->v0.8.4 (#3295) (by rocket)
    • [refactor] Finalize root FieldsBuilder only when it is not finalized (#3288) (by Ye Kuang)
    • [bug] Add default value to print_preprocessed_ir (#3292) (by lin-hitonami)
    • [Doc] Correct the note about dev installation (#3289) (by Tianshu Xu)
    • [refactor] [misc] Refactoring benchmark code for performance monitoring (#3269) (by rocket)
    • [Lang] Support more SNode trees for LLVM backends (#3279) (by Chang Yu)
    • [Refactor] Taichi frontend AST builder without generating code (#3037) (by lin-hitonami)
    • [ci] Reduce the artifacts retention duration to 20 days (#3286) (by Ye Kuang)
    • [ir] [refactor] Remove ptr_if_global in C++ Expr class (#3285) (by Yi Xu)
    • [doc] Add using clang++ for submodules in dev install instructions (#3273) (by Mingrui Zhang)
    • Update sparse.md (#3266) (by rockeyshao)
    • [vulkan] Indexed load codegen (#3259) (by Bob Cao)
    • [opengl] Remove listgen support (#3257) (by Ye Kuang)
    • [llvm] Separate compile_snode_tree_types from materialize_snode_tree in LLVM backends (#3267) (by Yi Xu)
    • [Lang] Add element shape to Ndarray (#3264) (by Bo Qiao)
    • Update write_test.md (#3263) (by FantasyVR)
    • [ci] Add benchmark to postsubmit workflow (#3220) (by rocket)
    • [ci] Move extract zip into ci_download.py (#3251) (by Frost Ming)
    • [Lang] Fix string format not supporting keyword formatting (#3256) (by yihong)
    • [vulkan] Force u8 capability on Apple (#3252) (by Dunfan Lu)
    • [vulkan] Catch std::runtime_error from button_id_to_name/buttom_name_to_id. (#3260) (by 0xzhang)
    • [cuda] Add CUDA version check (#3249) (by 0xzhang)
    • Fix Vulkan GGUI on CPU rendering (swiftshaders) (#3253) (by Bob Cao)
    • fix int / uint types & fix atomic op type mismatches (#3179) (by Bob Cao)
    • [test] Fix unrecognized test names in test_bls_assume_in_range.py (#3250) (by Yi Xu)
    • [ci] Add non-root user and conda environment (#3226) (by Bo Qiao)
    • [vulkan] Support for ti.u8 in vulkan (#3247) (by Dunfan Lu)
    • [lang] Make dynamic indexing compatible with BLS (#3244) (by Yi Xu)
    • [Bug] Fix indentation error when using tab indents (#3203) (by YuZhang)
    • [Bug] Remove the dataclass decorator from CuptiMetric, as it is not supported in Python 3.6 (#3246) (by rocket)
    • [opt] Enable CFG optimization for local tensors (#3237) (by Yi Xu)
    • [bug] Fix silent int overflow in index calculation. (#3177) (by Ailing)
    • [doc] Update build badges on readme file. (#3235) (by Chengchen(Rex) Wang)
    • [Bug] Fix the always-in-cpu from_numpy in ti.ndarray (#3239) (by Yi Xu)
    • [ci] Fix ti testing with no supported arch. (#3236) (by Ailing)
    • [refactor] Get rid of unnecessary snodes map. (#3234) (by Ailing)
    • [lang] Make dynamic indexing compatible with grouped loop indices of struct fors (#3218) (by Yi Xu)
    • [Lang] Add a new transformation pass to rename module alias (#3180) (by Ce Gao)
    • [doc] Add doc string for _logging.py (#3209) (by 0xzhang)
    • [Refactor] Improve the core module code and remove unused imports (#3225) (by Frost Ming)
    • [doc] Update docsite link (#3232) (by Yuanming Hu)
    • [vulkan] Enable GGUI for Vulkan Backend (#3176) (by Dunfan Lu)
    • [opengl] Decouple AOT builder from the runtime (#3207) (by Ye Kuang)
    • [vulkan] Fix fence timeout (#3229) (by Dunfan Lu)
    • [lang] Make dynamic indexing compatible with grouped loop indices of ndrange fors (#3228) (by Yi Xu)
    • [lang] [opt] Memory allocator for ti.ndarray (#3020) (by Bo Qiao)
    • [refactor] Optimize Expression::serialize (#3221) (by 庄天翼)
    • [vulkan] Trying to fix external memory allocation (#3222) (by Bob Cao)
    • [doc] Remove the reference to libtinfo5 (#3219) (by Ye Kuang)
    • Constrain the python version in package metadata (#3217) (by Frost Ming)
    • [Example] Fix the ti example problems of sparse matrix demos (#3215) (by Jiafeng Liu)
    • Revert "[misc] Revert #3175 #3164 (#3185)" (#3212) (by Bob Cao)
    • [Lang] [bug] Fix support staticmethod decorator for data_oriented class (#3186) (by yihong)
    • [Doc] Update sparse_matrix.md (#3200) (by FantasyVR)
    • [Lang] Use ti.linalg.sparse_matrix_builder as kernel parameters (#3210) (by Jiafeng Liu)
    • [vulkan] Clear unnecessary GLSL shader files on Vulkan backend. (#3211) (by 0xzhang)
    • [refactor] [CUDA] Add kernel attributes for KernelProfiler (#3196) (by rocket)
    • [refactor] [misc] Add device name for KernelProfiler (#3194) (by rocket)
    • [opengl] Make AOT builder independent of the runtime (#3204) (by Ye Kuang)
    • [llvm] Fix llvm sparse when there are more than 1 snode trees (#3205) (by Dunfan Lu)
    • [opengl] OpenGL AOT Module Builder & Making GL CompiledProgram serializable (#3202) (by Bob Cao)
    • [Lang] Reuse sparse matrix builder (#3199) (by FantasyVR)
    • [ci] Fix ci shell scripts do not return error code (#3189) (by Jiasheng Zhang)
    • [Lang] Fixed pylint error C0121 (#3135) (by deepakdinesh1123)
    • [gui] Accept array of vectors as indices for lines and triangles (#3181) (by Jiasheng Zhang)
    • [vulkan] Fix some vulkan stuff (#3198) (by Dunfan Lu)
    • [refactor] Remove redundant code in snode_rw_accessors_bank (#3192) (by Yi Xu)
  • v0.8.3(Oct 14, 2021)

    Full changelog:

    • [refactor] Rename nparray to external_array in Python-side (#3191) (by Yi Xu)
    • [misc] Version bump: v0.8.2->v0.8.3 (#3188) (by Bo Qiao)
    • [misc] Revert #3175 #3164 (#3185) (by Ye Kuang)
    • [Doc] Improve the documentation for ODOP (#3006) (by ljcc0930)
    • [CUDA] Update CUPTI profiling toolkit, add NVPW_MetricsEvaluator and its APIs for CUDA_VERSION >= 11.4 (#3172) (by rocket)
    • fix (#3175) (by Dunfan Lu)
    • [vulkan] Isolate vulkan runtime (#3164) (by Bob Cao)
    • [bug] Fix the mapping from virtual axes to physical axes again (#3170) (by Yi Xu)
    • [ci] Move pytest/pylint out of runtime dep. (#3169) (by Ailing)
    • [ci] Move code_format.py out of taichi package. (#3171) (by Ailing)
    • [gui] Use Device API memcpy in GGUI (#3163) (by Dunfan Lu)
  • v0.8.2(Oct 12, 2021)

    Full changelog:

    • [bug] Fix the mapping from virtual axes to physical axes (#3159) (by Yi Xu)
    • [ci] Get rid of requirements_lint.txt. (#3161) (by Ailing)
    • [ci] Enable macos 10.14 in release.yml wip. (#3158) (by Ailing)
    • [refactor] [CUDA] Add optional metrics for KernelProfiler in Python scope (#3049) (by rocket)
    • [misc] Fix build on macos 10.14. (#3155) (by Ailing)
    • [bug] Fix Cpu/CudaDevice memory deallocation bug (#3157) (by Ye Kuang)
    • Revert "[Bug] Fix argument shadowed when there is a local variable with the same name (#3105)" (#3153) (by Ailing)
    • [Lang] Use ti.linalg.SparseMatrixBuilder as sparse matrix builder (#3152) (by FantasyVR)
    • [Doc] Add docstring and documents for sparse matrix (#3119) (by Jiafeng Liu)
    • [vulkan] Inter-device memcpy API (#3137) (by Dunfan Lu)
    • [Lang] Improve sparse matrix/solver API (#3145) (by FantasyVR)
    • Fix and cleanup GGUI examples (#3144) (by Dunfan Lu)
    • [ci] [vulkan] Upgrade Dockerfile to enable Vulkan tests (#2970) (by Bo Qiao)
    • [Example] An implicit mass spring demo with sparse matrix (#3116) (by FantasyVR)
    • [misc] Version bump: v0.8.1->v0.8.2 (#3149) (by Yi Xu)
    • [refactor] Get rid of the indirection between KernelCodeGen::compile and KernelCodeGen::codegen. (by Ailing Zhang)
    • [misc] Enable flushing when printing preprocessed ast. (by Ailing Zhang)
    • [Bug] Throw proper error message when creating more than 32 snode trees in LLVM backend. (by Ailing Zhang)
    • [Cuda] [opt] Fix duplicate shared memory allocation (#3140) (by Chang Yu)
    • [CI] Add pip cache (#3139) (by zstone12)
    • rgb_to_hex: fix typo and use bit operator instead of multiplication (#3136) (by gaoxinge)
    • [llvm] Establish a correspondence between SNodes and DevicePtr (#3120) (by Dunfan Lu)
    • [Metal] Support multiple SNode roots in codegen/runtime (#3090) (by Ye Kuang)
    • [Autodiff] Throw proper error message for unsupported ti.random. (#3131) (by Ailing)
    • [test] Parametrize test_ad_basics.py (#3129) (by Ye Kuang)
    • [ci] Fix sparse_solver for pylint. (#3132) (by Ailing)
    • [refactor] Enable user selected metrics for CUDA, remove print() method from class KernelProfiler (#3048) (by rocket)
    • [misc] Delete accidentally included spirv dump (#3130) (by Bob Cao)
    • [perf] Reduce SNodeTree materialization time in LLVM backends (#3127) (by Yi Xu)
    • [Lang] Move sparse matrix/solver into subfolder (#3115) (by FantasyVR)
    • [refactor] Enable pylint checking in ci and minor cleanup. (by Ailing Zhang)
    • [refactor] Move imports to the top level. (by Ailing Zhang)
    • [vulkan] Make atomic helper functions inline (#3118) (by Chang Yu)
    • [perf] Constant folding optimization (#3108) (by Bob Cao)
    • [Bug] Fix argument shadowed when there is a local variable with the same name (#3105) (by lin-hitonami)
    • [ci] Add back fixes for github actions windows image upgrade. (#3117) (by Ailing)
    • [doc] Update dev install about develop/install. (#3113) (by Ailing)
    • [cuda] Remove unified memory allocator (#3098) (by Dunfan Lu)
    • [refactor] Make ext_arr/any_arr/template to taichi.type.annotations. (by Ailing Zhang)
    • [refactor] Make type annotations simple, remove dep on high level data structures. (by Ailing Zhang)
    • [ci] Run shell scripts in CI (#3034) (by Jiasheng Zhang)
    • [Lang] Support chained assignments (#3062) (by Ce Gao)
    • [metal] Rearrange how KernelManager is initialized (#3109) (by Ye Kuang)
    • [GUI] Add context manager support for ui.Gui (#3055) (by Xuanwo)
    • [Lang] [bug] Fix support property decorator for data_oriented class (#3052) (by yihong)
    • [refactor] Work around some cyclic imports. (by Ailing Zhang)
    • [refactor] Remove importing outside top level. (by Ailing Zhang)
    • [refactor] Stop overriding taichi.core from taichi/lang/impl.py. (by Ailing Zhang)
    • [metal] Separate runtime and snodes initialization (#3093) (by Ye Kuang)
    • [ci] Copy paste linux & windows fixes from presubmit to release. (#3103) (by Ailing)
    • [Doc] Add documentation for gui system and install trouble shooting. (#2985) (by Jiasheng Zhang)
    • [doc] Fix the link to the QuanTaichi paper in README.md (#3102) (by Yi Xu)
    • [doc] Remove docs subpath and update references across the codebase (#3085) (by Chengchen(Rex) Wang)
    • [Lang] Use more user-friendly exception when converting from numpy array (#3058) (by Ce Gao)
    • renable sfg test (#3097) (by Dunfan Lu)
    • [cuda] Disable unified memory and make CI pass (#3067) (by Dunfan Lu)
    • [cuda] Fix a memory alignment bug in pre-allocated memory allocator (#3096) (by Dunfan Lu)
    • Fix minor bugs in GL Device (#3091) (by Bob Cao)
    • [metal] Pull out the Runtime MSL code into its own module (#3086) (by Ye Kuang)
    • [Doc] Fix an example in the documentation for coordinate offsets (#3089) (by 张皓)
    • [Example] A stable fluid demo with sparse matrix (#3081) (by Jiafeng Liu)
    • [doc] Add Vulkan into README.md (#3088) (by Bob Cao)
    • [refactor] Style police (#3082) (by Bob Cao)
    • [GUI] Extract the type casts of environment variables in misc.gui.gui into a reusable function (#3065) (by Dream Hunter)
    • [lang] Add a method to get all SNodes under a root (#3083) (by Ye Kuang)
    • [Test] Allow ti test work with -a cpu, cuda. (#3066) (by Ailing)
    • [refactor] Move the AST utils into taichi.lang.ast (#3063) (by Ye Kuang)
    • [refactor] Get rid of settings.py. (by Ailing Zhang)
    • [refactor] Get rid of build and load_module. (by Ailing Zhang)
    • [refactor] Move primitive_types.py to type folder. (by Ailing Zhang)
    • [refactor] Move record.py out of ti.core. (by Ailing Zhang)
    • [refactor] Move logging out of ti.core. (by Ailing Zhang)
    • [ci] Try moving all torch tests to single thread. (by Ailing Zhang)
    • [Doc] Update developer installation (#3070) (by Bo Qiao)
    • [Lang] Fix python AST print format issues in python/taichi/lang/transformer.py (#3061) (by Ce Gao)
    • [ci] Extend windows timeout. (#3068) (by Ailing)
    • [ci] Try fixing windows. (#3064) (by Ailing)
    • [Doc][autodiff] Add a section about customized gradient in autodiff. (#3054) (by Ailing)
    • [Doc] Update Type system (#3043) (by Tiantian Liu)
  • v0.8.1(Sep 29, 2021)

    Full changelog:

    • [vulkan] Disable Vulkan validation layer (#3050) (by Ye Kuang)
    • [Doc] Update Python test doc (#3011) (by ljcc0930)
    • [Doc] Improve the documentation for profiler (#3014) (by rocket)
    • [Doc] Fix Arch Linux building guide clang dependence (#3042) (by Cinderella)
    • [Doc] Update kernels and functions (#2999) (by Mingrui Zhang)
    • [Doc] Update cpp_style.md (#3040) (by Ye Kuang)
    • [Doc] Developer installation update (#2996) (by Bo Qiao)
    • [Lang] [bug] Fix subscripting user classes in Taichi kernels (#3047) (by Yi Xu)
    • [misc] Version bump: v0.8.0->v0.8.1 (#3044) (by Yi Xu)
    • [Doc] Add docstring for init(), reset() and a few constants (#3026) (by Ye Kuang)
    • [gui] Fix mouse position error after window had been resized (#3041) (by Dunfan Lu)
    • [Doc] Performance tuning update (#2997) (by Bo Qiao)
    • [Doc] Update field (#2994) (by FantasyVR)
    • [Doc] Update interacting with external arrays (#3000) (by FantasyVR)
    • [Autodiff] Rename complex_kernel and complex_kernel_grad. (#3035) (by Ailing)
    • [Doc] Update contributor guidelines (#2991) (by Yi Xu)
    • [LLVM] Rename runtime memory allocation function (#3036) (by Bo Qiao)
    • [benchmark] [CUDA] Add the implementation of class CuptiToolkit (#2923) (by rocket)
    • [Doc] Add documentation for TLS/BLS (#2990) (by Ye Kuang)
    • [Doc] Update differentiable programming doc. (#2993) (by Ailing)
    • [Doc] Fix note format on webpage and code highlights for metaprogramming (#3027) (by Mingrui Zhang)
    • Disable validation layer with release builds (#3028) (by Bob Cao)
    • [Doc] Update the Type system section (#3010) (by Tiantian Liu)
    • [Doc] Add documentation for sparse computation (#2983) (by Yuanming Hu)
    • [Doc] Update metaprogramming doc (#3009) (by Mingrui Zhang)
    • [Doc] Improve the documentation for debugging and GGUI (#3002) (by Chang Yu)
    • [ci] Fix release.yml syntax error (#3022) (by Jiasheng Zhang)
    • [misc] Get changelogs via tags instead of commit messages (#3021) (by Yi Xu)
    • [Doc] Update documentation for fields (advanced) (#3012) (by Yi Xu)
    • [Doc] Improve the documentation for C++ style guide (#3001) (by Ye Kuang)
    • [Doc] Fix typo in data_oriented docstring (#3005) (by ljcc0930)
    • [Doc] Remove 'Versioning and releases'; Fix 'Documentation writing guide' (#2987) (by Yi Xu)
    • [misc] Add Vulkan as a target for ti diagnose (#2995) (by Yuheng Zou)
    • [Opt] [ir] [refactor] Remove exceptions from IR pass extract_constant (#2966) (by lin-hitonami)
    • [refactor] [benchmark] Add ti.profiler in python scope (#2922) (by rocket)
    • [refactor] Private field names and function restructure in cc backend. (#2989) (by Jiasheng Zhang)
    • [cpu] Cpu device 1/n: memory allocation (#2984) (by Dunfan Lu)
    • [refactor] Refactored and unified CC backend, removed CCProgram and use CCProgramImpl instead. (#2978) (by Jiasheng Zhang)
    • [CI] Better CI title check info (#2986) (by yihong)
    • [ci] Disable fail-fast matrix on release jobs. (#2982) (by Ailing)
    • [cuda] Cuda Device API 1/n: memory allocation (#2981) (by Dunfan Lu)
    • [misc] Add deactivate_all_snodes and ti.FieldsBuilder.deactivate_all (#2967) (by ljcc0930)
    • [misc] Remove unnecessary symlink step. (#2976) (by Ailing)
    • [ci] Releases must be done on buildbot-ubuntu machine. (#2977) (by Ailing)
    • [ci] Fix OOM on nightly release. (#2975) (by Ailing)
    • [Lang] Add error message for printing an incompletely-defined field (#2979) (by yihong)
    • [refactor] Minimize Python context (#2971) (by Yi Xu)
    • [Doc] Installation with mirror source (#2946) (by FantasyVR)
  • v0.8.0(Sep 23, 2021)

    Highlights in v0.8.0

    Packed Mode

    Previously in Taichi, all non-power-of-two dimensions of a field were automatically padded to a power of two. For instance, a field of shape (18, 65) would have an internal shape of (32, 128). Although the padding has benefits, such as allowing fast and convenient bitwise operations for coordinate handling, it could consume much more memory than users expected.
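
The memory cost of this padding can be estimated with a few lines of plain Python. The sketch below only illustrates the rounding rule described above; the helper names (`next_power_of_two`, `padded_shape`, `element_count`) are hypothetical and are not part of the Taichi API.

```python
def next_power_of_two(n: int) -> int:
    """Round n up to the nearest power of two (n must be >= 1)."""
    if n < 1:
        raise ValueError("dimension must be a positive integer")
    return 1 << (n - 1).bit_length()


def padded_shape(shape):
    """Shape after rounding every dimension up to a power of two."""
    return tuple(next_power_of_two(d) for d in shape)


def element_count(shape):
    """Total number of elements in a field of the given shape."""
    total = 1
    for d in shape:
        total *= d
    return total


shape = (18, 65)
padded = padded_shape(shape)
# Ratio of allocated elements to requested elements.
overhead = element_count(padded) / element_count(shape)
print(f"requested {shape}, padded to {padded}, overhead {overhead:.2f}x")
```

For the (18, 65) example, the padded field holds 4096 elements instead of 1170, which is roughly 3.5 times the requested storage.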

    For users who need lower memory usage, we now introduce an optional packed mode. In packed mode, no padding is applied, so a field's internal shape is no larger than its declared shape even when some of its dimensions are not powers of two. The downside is that runtime performance regresses slightly.

    The packed argument of ti.init() controls whether packed mode is used:
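
To see why power-of-two extents make coordinate handling cheap, compare two ways of splitting a linear index into (row, column). This is a generic sketch that assumes a row-major layout; it is not Taichi's actual index computation, and both function names are hypothetical.

```python
def unflatten_pow2(linear: int, log2_cols: int):
    """Split a linear index when the row width is 2**log2_cols: shift and mask only."""
    row = linear >> log2_cols
    col = linear & ((1 << log2_cols) - 1)
    return row, col


def unflatten_general(linear: int, cols: int):
    """Split a linear index for an arbitrary row width: needs division and modulo."""
    return divmod(linear, cols)


# Padded layout: 128 columns per row, so log2_cols = 7.
print(unflatten_pow2(17 * 128 + 64, 7))
# Packed layout: 65 columns per row.
print(unflatten_general(17 * 65 + 64, 65))
```

Both calls recover the coordinate (17, 64). The padded variant uses only bitwise operations, while the packed variant needs integer division.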

    ti.init()  # default: packed=False
    a = ti.field(ti.i32, shape=(18, 65))  # padded to (32, 128)
    
    ti.init(packed=True)
    a = ti.field(ti.i32, shape=(18, 65))  # no padding
    

    GGUI

    A new GUI system, codenamed GGUI, has been added to Taichi. GGUI renders on the GPU, which makes it much faster than the original ti.gui and allows it to render 3D meshes and particles. It also comes with a brand-new set of immediate-mode widget APIs.

    Sample 3D code:

    window = ti.ui.Window("Hello Taichi", (1920, 1080))
    
    canvas = window.get_canvas()
    scene = ti.ui.Scene()
    camera = ti.ui.make_camera()
    
    while window.running:
    
        camera.position(...)
        camera.lookat(...)
        scene.set_camera(camera)
    
        scene.point_light(pos=(...), color=(...))
    
        # vertices, centers, etc. are taichi fields
        scene.mesh(vertices, ...)
        scene.particles(centers, radius, ...)
    
        canvas.scene(scene)
        window.show()
    

    Sample IMGUI code:

    window = ti.ui.Window("Hello Taichi", (500, 500))
    canvas = window.get_canvas()
    
    gx, gy, gz = (0, -9.8, 0)
    color = (0.0, 0.0, 0.0)  # background color as an RGB tuple (illustrative value)
    
    while window.running:
    
        window.GUI.begin("Greetings", 0.1, 0.1, 0.8, 0.15)
        window.GUI.text("Welcome to TaichiCon !")
        if window.GUI.button("Bye"):
            window.running = False
        window.GUI.end()
    
        window.GUI.begin("Gravity", 0.1, 0.3, 0.8, 0.3)
        gx = window.GUI.slider_float("x", gx, -10, 10)
        gy = window.GUI.slider_float("y", gy, -10, 10)
        gz = window.GUI.slider_float("z", gz, -10, 10)
        window.GUI.end()
    
        canvas.set_background_color(color)
        window.show()
    

    For more examples, please check out examples/ggui_examples in the taichi repo.

    Dynamic SNode Allocation

    Previously in Taichi, new fields could not be allocated after a kernel had been executed. A new class, FieldsBuilder, now supports dynamic allocation.

    FieldsBuilder exposes the same data structure declaration API as the previous root, such as dense() and pointer(). After declaring the layout, call finalize() to compile the FieldsBuilder into an SNodeTree object.

    Example usage for FieldsBuilder:

    import taichi as ti
    ti.init()
    
    @ti.kernel
    def func(v: ti.template()):
        for I in ti.grouped(v):
            v[I] += 1
    
    fb = ti.FieldsBuilder()
    x = ti.field(dtype = ti.f32)
    fb.dense(ti.ij, (5, 5)).place(x)
    fb_snode_tree = fb.finalize()  # Finalize the FieldsBuilder; returns an SNodeTree
    func(x)
    
    fb2 = ti.FieldsBuilder()
    y = ti.field(dtype = ti.f32)
    fb2.dense(ti.i, 5).place(y)
    fb2_snode_tree = fb2.finalize()  # Finalize the FieldsBuilder; returns an SNodeTree
    func(y)
    

    Additionally, root is now implicitly implemented by a FieldsBuilder, so fields can still be allocated directly under root.

    import taichi as ti
    ti.init() # ti.root = ti.FieldsBuilder()
    
    @ti.kernel
    def func(v: ti.template()):
        for I in ti.grouped(v):
            v[I] += 1
    
    x = ti.field(dtype = ti.f32)
    ti.root.dense(ti.ij, (5, 5)).place(x)
    func(x) # automatically called ti.root.finalize()
    # ti.root = new ti.FieldsBuilder()
    
    y = ti.field(dtype = ti.f32)
    ti.root.dense(ti.i, 5).place(y)
    func(y) # automatically called ti.root.finalize()
    

    Furthermore, calling finalize() on a FieldsBuilder returns a finalized SNodeTree object. If the fields under this SNodeTree are no longer needed, call destroy() manually to return their memory to the memory pool.

    For example:

    import taichi as ti
    ti.init()
    
    @ti.kernel
    def func(v: ti.template()):
        for I in ti.grouped(v):
            v[I] += 1
    
    fb = ti.FieldsBuilder()
    x = ti.field(dtype = ti.f32)
    fb.dense(ti.ij, (5, 5)).place(x)
    fb_snode_tree = fb.finalize()  # Finalize the FieldsBuilder; returns an SNodeTree
    func(x)
    
    fb_snode_tree.destroy()
    # func(x) cannot be used anymore
    

    Full changelog:

    • [doc] Fix several typos in doc (#2972) (by Ziyi Wu)
    • [opengl] Runtime refactor 1/n (#2965) (by Bob Cao)
    • [refactor] Avoid passing device strings into torch (#2968) (by Yi Xu)
    • [misc] Fix typos in examples/simulation/fractal.py (#2882) (by Yilong Li)
    • [opt] Support atomic min/max in warp reduction optimization (#2956) (by Yi Xu)
    • [Bug] Add GIL that was accidentally removed in PR #2939 back (#2964) (by lin-hitonami)
    • [misc] Support clean command to setup.py. (by Ailing Zhang)
    • [misc] Fix some build warnings. (by Ailing Zhang)
    • [doc] Add docstring for GGUI python API (#2958) (by Dunfan Lu)
    • [gui] Move all ggui kernels to python by using taichi fields as staging buffers (#2957) (by Dunfan Lu)
    • [opt] Add conservative alias analysis for ExternalPtrStmt (#2952) (by Yi Xu)
    • [opengl] Move old runtime onto Device API (#2945) (by Bob Cao)
    • [Lang] Remove deprecated usage of ti.Matrix.init (#2950) (by Yi Xu)
    • [Lang] Add data_handle property to Ndarray (#2947) (by Yi Xu)
    • [misc] Throw proper error if real function is not properly annotated. (#2943) (by Ailing)
    • [gui] Fix normal bug when default fp is not f32. (#2944) (by Dunfan Lu)
    • [opengl] Device API: Adding GL error checks & correct memory mapping flags (#2941) (by Bob Cao)
    • [Lang] Support configure sparse solver ordering (#2907) (by FantasyVR)
    • [refactor] remove Program::KernelProxy (#2939) (by lin-hitonami)
    • [doc] Update README.md (#2940) (by Yuanming Hu)
    • [opengl] Initial Device API work (#2925) (by Bob Cao)
    • [Lang] Support ti_print for wasm (#2910) (by squarefk)
    • [Lang] Fix ti func with template and add corresponding tests (#2871) (by squarefk)
    • [doc] Update README.md (#2937) (by Yuanming Hu)
    • [metal] Fix metal codegen to make OSX 10.14 work (#2935) (by Ye Kuang)
    • [Doc] Add developer installation to README.md (#2933) (by Ye Kuang)
    • [misc] Edit preset indices (#2932) (by ljcc0930)
    • fractal example (#2931) (by Dunfan Lu)
    • [refactor] Exchange compiled_grad_functions and compiled_functions in kernel_impl.py (#2930) (by Yi Xu)
    • [Misc] Update doc links (#2928) (by FantasyVR)
    • Disable a few vulkan flaky tests. (#2926) (by Ailing)
    • [llvm] Remove duplicated set dim attribute for GlobalVariableExpression (#2929) (by Ailing)
    • [ci] Artifact uploading before test in release.yml (#2921) (by Jiasheng Zhang)
    • [bug] Fix the Bug that cannot assign a value to a scalar member in a struct from python scope (#2894) (by JeffreyXiang)
    • [misc] Update examples (#2924) (by Taichi Gardener)
    • [ci] Enable tmate session if release test fails. (#2919) (by Ailing)
    • [refactor] [CUDA] Wrap the default profiling tool as EventToolkit , add a new class for CUPTI toolkit (#2916) (by rocket)
    • [metal] Fix upperbound for list-gen and struct-for (#2915) (by Ye Kuang)
    • [ci] Fix linux release forgot to remove old taichi (#2914) (by Jiasheng Zhang)
    • [Doc] Add docstring for indices() and axes() (#2917) (by Ye Kuang)
    • [refactor] Rename SNode::n to SNode::num_cells_per_container (#2911) (by Ye Kuang)
    • Enable deploy preview if changes are detected in docs. (#2913) (by Ailing)
    • [refactor] [CUDA] Add traced_records_ for KernelProfilerBase, refactoring KernelProfilerCUDA::sync() (#2909) (by rocket)
    • [ci] Moved linux release to github action (#2905) (by Jiasheng Zhang)
    • [refactor] [CUDA] Move KernelProfilerCUDA from program/kernel_profiler.cpp to backends/cuda/cuda_profiler.cpp (#2902) (by rocket)
    • [wasm] Fix WASM AOT module builder order (#2904) (by Ye Kuang)
    • [CUDA] Add a compilation option for CUDA toolkit (#2899) (by rocket)
    • [vulkan] Support for multiple SNode trees in Vulkan (#2903) (by Dunfan Lu)
    • add destroy snode tree api (#2898) (by Dunfan Lu)
  • v0.7.32(Sep 9, 2021)

    Full changelog:

    • [vulkan] Turn off Vulkan by default and add dev install instructions (#2897) (by Dunfan Lu)
    • [Misc] Fix the path in conda_env.yaml (#2895) (by Ce Gao)
    • [ci] Rollback buggy Dockerfile (by Dunfan Lu)
    • [bug] [opt] Disable putting pointers into global tmp buffer (#2888) (by Yi Xu)
    • [Llvm] Increase the number of arguments allowed in a kernel (#2886) (by Yi Xu)
    • [gui] GGUI fix undefined variable (#2885) (by Mingrui Zhang)
    • [ci] Build and Release Vulkan in CI/CD (#2881) (by Dunfan Lu)
    • [Lang] Refine semantics of ti.any_arr (#2875) (by Yi Xu)
    • [refactor] OpenGL program impl (#2878) (by Dunfan Lu)
    • [refactor] Vulkan program impl (#2876) (by Dunfan Lu)
    • Clean up sparse matrix (#2872) (by squarefk)
    • Re-enable sfg test on CUDA (#2874) (by Bo Qiao)
    • [refactor] Unify llvm_program_ and metal_program_ in Program class. (by Ailing Zhang)
    • [refactor] Let LlvmProgramImpl inherit ProgramImpl. (by Ailing Zhang)
    • [refactor] Init ProgramImpl from MetalProgramImpl. (by Ailing Zhang)
    • [CUDA] [bug] Fix CUDA error "allocate_global (DataType type) misaligned address" (#2863) (by rocket)
    • [Lang] Experimental SpMV and direct linear solvers (#2853) (by FantasyVR)
    • [refactor] Get rid of some unnecessary get_current_program(). (by Ailing Zhang)
    • [ci] Enable CI on pushing to master. (#2865) (by Ailing)
    • [Lang] Support fill, from_numpy, to_numpy for ti.ndarray (#2868) (by Yi Xu)
  • v0.7.31(Sep 2, 2021)

    Full changelog:

    • [doc] Remove links from documentation articles to API reference (#2866) (by Chengchen(Rex) Wang)
    • [Lang] Support struct fors on ti.any_arr (#2857) (by Yi Xu)
    • [ci] M1 release (#2855) (by Jiasheng Zhang)
    • [Doc] Update the API reference section. (#2856) (by Chengchen(Rex) Wang)
    • [LLVM] [Bug] fix typo of PR #2781 (#2854) (by rocket)
    • [Vulkan] Use reference counting based wrapper layer (#2849) (by Bob Cao)
    • [gui] Two sided mesh (#2851) (by Dunfan Lu)
    • [refactor] Make ti.ext_arr a special case of ti.any_arr (#2850) (by Yi Xu)
    • [Lang] Add ti.Vector.ndarray and ti.Matrix.ndarray (#2808) (by Yi Xu)
    • [ci] No need to specify arch on M1 CI. (#2845) (by Ailing)
    • [Lang] Customized struct support (#2627) (by Andrew Sun)
    • [Lang] Fix ti test parameters (#2830) (by squarefk)
    • [ci] Enable verbose on M1 CI to collect more info on hanging jobs. (#2844) (by Ailing)
    • [ci] Fixed bug of wrong os parameter (#2843) (by Jiasheng Zhang)
    • [gui] GGUI 17/n: doc (#2842) (by Dunfan Lu)
    • [gui] GGUI 16/n: examples (#2841) (by Dunfan Lu)
    • [refactor] Move FrontendContext from global into Callable class. (by Ailing Zhang)
    • [refactor] Decouple AsyncEngine with Program. (by Ailing Zhang)
    • [refactor] Decouple MemoryPool with Program. (by Ailing Zhang)
    • [refactor] Decouple opengl codegen compile with Program. (by Ailing Zhang)
    • [refactor] Unify compile() for LlvmProgramImpl and MetalProgramImpl. (by Ailing Zhang)
    • [refactor] Initial MetalProgramImpl implementation. (by Ailing Zhang)
    • [gui] GGUI small fixups (#2840) (by Dunfan Lu)
    • [ci] Fixed bugs of double env in release.yml (#2838) (by Jiasheng Zhang)
    • [Lang] Let rescale_index support SNode as input parameter (#2826) (by Jack12xl)
    • [refactor] Minor cleanup in program.cpp. (by Ailing Zhang)
    • [gui] GGUI 15/n: Python-side code (#2832) (by Dunfan Lu)
    • [gui] GGUI 14/n: Shaders (#2829) (by Dunfan Lu)
    • [Lang] Experimental sparse matrix support on CPUs (#2792) (by FantasyVR)
    • [Vulkan] Add relaxed FIFO presentation mode (#2828) (by Bob Cao)
    • [ci] Conditional build matrix on release (#2819) (by Jiasheng Zhang)
    • [gui] GGUI 13/n: Pybind stuff (#2825) (by Dunfan Lu)
    • [gui] GGUI 12/n: Window and Canvas (#2824) (by Dunfan Lu)
    • [gui] GGUI 11/n: Renderer (#2818) (by Dunfan Lu)
    • [gui] GGUI 7.5/n: Avoid requiring CUDA toolchains to compile GGUI (#2821) (by Dunfan Lu)
    • [vulkan] Let me pass (#2823) (by Dunfan Lu)
    • [ci] Add timeout for every job in presubmit.yml (#2820) (by Jiasheng Zhang)
    • [Vulkan] Device API Multi-streams, multi-queue, and initial multi-thread support (#2802) (by Bob Cao)
    • [Doc] Fix example path and conda instruction link (#2815) (by Bo Qiao)
    • [Lang] Fix unfolding subscripting inside ti.external_func_call() (#2806) (by squarefk)
    • Enable tensor subscripting as input for external function call (#2812) (by squarefk)
    • [gui] GGUI 10/n: IMGUI (#2809) (by Dunfan Lu)
    • [Vulkan] [ci] Enable and release Vulkan (#2795) (by Chang Yu)
    • [Vulkan] Fixing floating point load/store/atomics on global temps and context buffers (#2796) (by Bob Cao)
    • [gui] GGUI 9/n: Renderables and Scene (#2803) (by Dunfan Lu)
    • [vulkan] Fix bug in empty root buffer (#2807) (by Chang Yu)
    • [Lang] Fix tensor based grouped ndrange for (#2800) (by squarefk)
    • [gui] GGUI 8/n: Renderable class (#2798) (by Dunfan Lu)
  • v0.7.30(Aug 25, 2021)

    Full changelog:

    • [gui] GGUI 7/n: Vertex class, and kernels for updating VBO/IBO/texture (#2797) (by Dunfan Lu)
    • [Refactor] Make Layout an Enum class and move it away from impl.py (#2774) (by Yi Xu)
    • app context and swap chain (#2794) (by Dunfan Lu)
    • [Refactor] Move snode_tree_buffer_manager and llvm_runtime to private. (by Ailing Zhang)
    • [Refactor] Move llvm_context_host/device to private. (by Ailing Zhang)
    • [Refactor] Further cleanup Program constructor. (by Ailing Zhang)
    • [Refactor] Move check_runtime_error to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Simplify synchronize and materialize_runtime in Program. (by Ailing Zhang)
    • [gui] GGUI 5/n: remove some stuff (#2793) (by Dunfan Lu)
    • [ci] Added gpu test timeout (#2791) (by Jiasheng Zhang)
    • [Vulkan] [test] Fix Vulkan CI test bug & Enable tests for Vulkan backend (#2776) (by Yu Chang)
    • [vulkan] Graphics Device API (#2789) (by Dunfan Lu)
    • [ci] Changed nightly tag from pre-release to post-release (#2786) (by Jiasheng Zhang)
    • [ci] Add Apple M1 buildbot (#2731) (by ljcc0930)
    • [Refactor] Move a few helpers in LlvmProgramImpl to private. (by Ailing Zhang)
    • [Refactor] Cleanup llvm specific apis in program.h (by Ailing Zhang)
    • [Ir] Clean up frontend ir for global tensor and local tensor (#2773) (by squarefk)
    • [Refactor] Only prepare sandbox for cc backend. (#2775) (by Ailing)
    • [gui] Remove DPI settings (#2767) (by Ye Kuang)
    • [Doc] Add instructions for how to use conda (#2764) (by Ye Kuang)
    • [Lang] Enable treating external arrays as Taichi vector/matrix fields (#2727) (by Yi Xu)
    • [Opt] [ir] Optimize offload (#2673) (by squarefk)
    • [Refactor] Add initialize_llvm_runtime_system to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Add materialize_snode_tree to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move initialize_llvm_runtime_snodes to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move clone_struct_compiler_initial_context to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move is_cuda_no_unified_memory to CompileConfig. (by Ailing Zhang)
    • [Refactor] Add maybe_initialize_cuda_llvm_context to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Add get_snode_num_dynamically_allocated to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move print_memory_profiler_info to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move print_list_manager_info to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move runtime_query to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move get_llvm_context to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move preallocated_device_buffer into LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move thread_pool into LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move runtime_mem_info into LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move llvm_runtime to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move llvm_context_device to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move llvm_context_host to LlvmProgramImpl. (by Ailing Zhang)
    • [Refactor] Move snode_tree_buffer_manager into LlvmProgramImpl. (by Ailing Zhang)
    • [Doc] Several dev install doc improvements. (#2741) (by Ailing)
    • [ci] Fix clang-format version to 10 (#2739) (by Ailing)
    • [Test] Unify tests decorators with @ti.test() (#2674) (by squarefk)
    • [Doc] Fix --user in dev install instruction. (#2732) (by Ailing)
    • [test] Fix potential memory error when DecoratorRecorder hasn't been reset correctly (#2735) (by squarefk)
    • [ci] Fix GPU buildbot paths and configs (#2728) (by Yi Xu)
    • [ci] Reduce the number of python wheels built nightly (#2726) (by Jiasheng Zhang)
    • [ir] Internal function call now supports arguments and i32 return value (#2722) (by Yuanming Hu)
    • [vulkan] Fix dumb memory error (#2721) (by Bob Cao)
    • [refactor] Remove unnecessary constructor argument (#2720) (by saltyFamiliar)
    • [Misc] Add submodule Eigen (#2707) (by FantasyVR)
    • Pin yapf version to 0.31.0. (#2710) (by Ailing)
    • [vulkan] [test] Support full atomic operations on Vulkan backend (#2709) (by Yu Chang)
    • [Vulkan] Move vulkan to device API (#2695) (by Bob Cao)
    • [Doc] Update developer install doc to use setup.py. (#2706) (by Ailing)
    • [ci] Fixed pypi version (#2708) (by Jiasheng Zhang)
    • [Lang] Redesign Ndarray class and add ti.any_arr() annotation (#2703) (by Yi Xu)
    • [Bug] Close the kernel context when failing to compile AST (#2704) (by Calvin Gu)
    • [Doc] Add gdb debug instructions to dev utilities. (#2702) (by Ailing)
    • [vulkan] Better detect Vulkan availability (#2699) (by Yu Chang)
    • [benchmark] [refactor] Move fill() and reduction() into Membound suite, calculate the geometric mean of the time results (#2697) (by rocket)
    • [ci] Added taichi nightly auto release to github action (#2670) (by Jiasheng Zhang)
    • [Misc] Keep debug symbols/line numbers in taichi_core.so by setting DEBUG=1. (#2694) (by Ailing)
    • [vulkan] [test] Enable Vulkan backend on OS X (#2692) (by Yu Chang)
    • [Refactor] Remove is_release() in the codebase. (#2691) (by Ailing)
    • [doc] fix outdated links of examples in examples.md (#2693) (by Yu Chang)
    • [Refactor] Make is_release always True and delete runtime dep on TAICHI_REPO_DIR. (#2689) (by Ailing)
    • [CUDA] Save an extra host to device copy if arg_buffer is already on device. (#2688) (by Ailing)
    • [Refactor] Allow taichi_cpp_tests run in release mode as well. (#2686) (by Ailing)
    • [Refactor] Re-enable gdb attach on crash. (#2687) (by Ailing)
    • [Lang] Add a Ndarray class to serve as an alternative to dense scalar fields (#2676) (by Yi Xu)
    • [gui] GGUI 4/n: Vulkan GUI backend utils (#2672) (by Dunfan Lu)
    • [Test] Smarter enumerating features and raise exception when not supported (#2679) (by squarefk)
    • [vulkan] Querying features like a mad man (#2671) (by Bob Cao)
    • [IR] Support local tensor (#2637) (by squarefk)
    • [vulkan] Check that additional extensions are supported before adding them (#2667) (by Dunfan Lu)
    • [vulkan] [test] Fix bugs detected by tests & Skip unnecessary tests for Vulkan backend (#2664) (by Yu Chang)
Owner
Taichi Developers