A library for efficient similarity search and clustering of dense vectors.

Overview

Faiss

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research.

News

See CHANGELOG.md for detailed information about latest features.

Introduction

Faiss contains several methods for similarity search. It assumes that the instances are represented as vectors and are identified by an integer, and that the vectors can be compared with L2 (Euclidean) distances or dot products. Vectors that are similar to a query vector are those that have the lowest L2 distance or the highest dot product with the query vector. It also supports cosine similarity, since this is a dot product on normalized vectors.
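
The relationship between the three criteria can be checked with a small brute-force sketch. It is plain Python for illustration only and does not use the Faiss API; the helper names (l2_sq, dot, normalize, top_k) and the toy data are assumptions made for this example.

```python
import math

def l2_sq(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot(a, b):
    """Dot (inner) product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit L2 norm."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def top_k(query, database, k, score, largest):
    """Brute-force search: ids of the k best database vectors."""
    ids = sorted(range(len(database)),
                 key=lambda i: score(query, database[i]),
                 reverse=largest)
    return ids[:k]

database = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [10.0, 2.0]]
query = [2.0, 1.8]

# Lowest L2 distance and highest dot product can disagree:
# a long vector far from the query still has a large dot product.
nearest_l2 = top_k(query, database, 2, l2_sq, largest=False)
nearest_ip = top_k(query, database, 2, dot, largest=True)

# Cosine similarity: dot product after normalizing both sides.
unit_db = [normalize(v) for v in database]
unit_q = normalize(query)
nearest_cos = top_k(unit_q, unit_db, 2, dot, largest=True)

# On unit vectors, ||a - b||^2 = 2 - 2 * (a . b), so the L2 ranking
# and the dot-product ranking coincide.
nearest_l2_unit = top_k(unit_q, unit_db, 2, l2_sq, largest=False)
```

In this toy data the L2 and dot-product rankings differ on the raw vectors, while on the normalized vectors both criteria return the same neighbors.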

Most of the methods, like those based on binary vectors and compact quantization codes, use only a compressed representation of the vectors and do not require keeping the original vectors. This generally comes at the cost of a less precise search, but these methods can scale to billions of vectors in main memory on a single server.
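
To make the idea of a compressed representation concrete, here is a minimal sketch of an 8-bit scalar quantizer in plain Python. It is an assumption chosen for brevity, not the codec Faiss uses (its quantizers are more elaborate), and the function names are hypothetical. Each float component is stored as one byte, and the reconstruction is approximate.

```python
def train_range(vectors):
    """Learn the global min and max over all components (the 'training' step)."""
    flat = [x for v in vectors for x in v]
    return min(flat), max(flat)

def encode(v, lo, hi):
    """Map each float component to one byte (0..255)."""
    scale = 255.0 / (hi - lo)
    return bytes(min(255, max(0, round((x - lo) * scale))) for x in v)

def decode(code, lo, hi):
    """Approximate reconstruction of the original floats from the byte code."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in code]

vectors = [[0.10, 0.90, 0.35], [0.75, 0.20, 0.60], [0.00, 1.00, 0.50]]
lo, hi = train_range(vectors)
codes = [encode(v, lo, hi) for v in vectors]  # only the codes need to be stored

# Reconstruction is lossy: each component is off by at most half a
# quantization step, which is the precision given up for the smaller storage.
max_err = max(abs(x - y)
              for v, c in zip(vectors, codes)
              for x, y in zip(v, decode(c, lo, hi)))
```

Distances computed on the decoded vectors are therefore slightly off, which is the "less precise search" mentioned above.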

The GPU implementation can accept input from either CPU or GPU memory. On a server with GPUs, the GPU indexes can be used as a drop-in replacement for the CPU indexes (e.g., replace IndexFlatL2 with GpuIndexFlatL2), and copies to/from GPU memory are handled automatically. Searches will be faster, however, if both input and output remain resident on the GPU. Both single- and multi-GPU usage are supported.

Building

The library is mostly implemented in C++, with optional GPU support provided via CUDA, and an optional Python interface. The CPU version requires a BLAS library. It compiles with a Makefile and can be packaged in a docker image. See INSTALL.md for details.

How Faiss works

Faiss is built around an index type that stores a set of vectors, and provides a function to search in them with L2 and/or dot product vector comparison. Some index types are simple baselines, such as exact search. Most of the available indexing structures correspond to various trade-offs with respect to

  • search time
  • search quality
  • memory used per index vector
  • training time
  • need for external data for unsupervised training

The optional GPU implementation provides what is likely (as of March 2017) the fastest exact and approximate (compressed-domain) nearest neighbor search implementation for high-dimensional vectors, fastest Lloyd's k-means, and fastest small k-selection algorithm known. The implementation is detailed here.

Full documentation of Faiss

The following are entry points for documentation:

Authors

The main authors of Faiss are:

Reference

Reference to cite when you use Faiss in a research paper:

@article{JDH17,
  title={Billion-scale similarity search with GPUs},
  author={Johnson, Jeff and Douze, Matthijs and J{\'e}gou, Herv{\'e}},
  journal={arXiv preprint arXiv:1702.08734},
  year={2017}
}

Join the Faiss community

For public discussion of Faiss or for questions, there is a Facebook group at https://www.facebook.com/groups/faissusers/

We monitor the issues page of the repository. You can report bugs, ask questions, etc.

License

Faiss is MIT-licensed.

Comments
  • faiss::gpu::runMatrixMult failure

    The full log: Faiss assertion err == CUBLAS_STATUS_SUCCESS failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with T = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at utils/MatrixMult.cu:141 Aborted (core dumped)

    I have successfully run demo_ivfpq_indexing_gpu, so I think Faiss was installed successfully.

    bug cant-repro 
    opened by hellolovetiger 36
  • No module named '_swigfaiss' for conda install

    Summary

    Platform

    OS: macOS 10.13.4

    Faiss version:

    Faiss compilation options:

    Running on :

    • [ ] CPU

    Reproduction instructions

    I installed with

    conda install faiss-cpu -c pytorch
    

    and got a No module named '_swigfaiss' error. I went into the faiss directory and tried to import again, but got the same error message. The troubleshooting section mentions that this error is caused by Faiss not being compiled. Since I used conda install, I suppose that is not the case?

    bug install 
    opened by hsiaoma 29
  • make py: fatal error: Python.h: No such file or directory

    I am also facing the same issue. I did the following steps:

    1. Cloned FAISS
    2. Updated makefile.inc with the Anaconda Python path and installed the necessary dependencies, such as libopenblas-dev, python-numpy, and python-dev
    3. make (after this step I cannot find any _swigfaiss.so file anywhere)
    4. make py (gave the following error)

    $ make py
    g++ -I. -fPIC -m64 -Wall -g -O3 -msse4 -mpopcnt -fopenmp -Wno-sign-compare -std=c++11 -fopenmp -g -fPIC -fopenmp -I~/anaconda2/envs/faissenv/include/python2.7/ -I~/anaconda2/envs/faissenv/lib/python2.7/site-packages/numpy/core/include -shared -o python/_swigfaiss.so python/swigfaiss_wrap.cxx libfaiss.a /usr/lib/libopenblas.so.0
    python/swigfaiss_wrap.cxx:154:21: fatal error: Python.h: No such file or directory
    compilation terminated.
    Makefile:84: recipe for target 'python/_swigfaiss.so' failed
    make: *** [python/_swigfaiss.so] Error 1

    I am able to run the C++ implementation; only the Python wrapper is failing. Let me know what I am setting wrong. As _swigfaiss.so is not generated, what went wrong while doing make?

    Originally posted by @Mahanteshambi in https://github.com/facebookresearch/faiss/issues/336#issuecomment-365565492

    question cant-repro install 
    opened by daisy-belle 24
  • Faiss import error when run in virtualenv by using own built Faiss-python

    Summary

    I have built faiss-core and faiss-python myself. I installed Python into a local virtual env, tried to import faiss, and got an error. I checked the egg file, and it does have _swigfaiss.so inside. I also checked the conda swigfaiss.py, which still uses the old swig_import_helper. I am not sure whether the error is caused by the removal of that helper when python/swigfaiss.py is generated by SWIG, as in this commit:

    https://github.com/facebookresearch/faiss/commit/7f5b22b0fff0882ce4afd93ce54cc2833a224909#diff-8cf6167d58ce775a08acafcfe6f40966

    $ ls faiss-1.5.2-py3.6/faiss
    __init__.py	__pycache__	_swigfaiss.so	swigfaiss.py
    

    Platform

    OS: centos 7

    Faiss version: 1.5.2

    Faiss compilation options:

     ./configure  --prefix=/usr --without-cuda --with-blas=/usr/lib64/libblas.so.3 --with-lapack=/usr/lib64/liblapack.so.3
    make
    sudo make install
    make py
    cd ~ && rm -rf env && python3 -m venv env
    source env/bin/activate
    cd ~/faiss && sudo make -C python install
    

    Running on:

    • [X] CPU
    • [ ] GPU

    Interface:

    • [ ] C++
    • [X] Python

    Reproduction instructions

    $ python
    Python 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 09:07:38)  [GCC 7.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import faiss
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/midas/env/lib/python3.6/site-packages/faiss-1.5.2-py3.6.egg/faiss/__init__.py", line 18, in <module>
      File "/home/midas/env/lib/python3.6/site-packages/faiss-1.5.2-py3.6.egg/faiss/swigfaiss.py", line 13, in <module>
    ImportError: cannot import name '_swigfaiss'
    
    install 
    opened by billyean 23
  • PyTorch tensor / Faiss index interoperability

    Summary: This diff allows for native usage of PyTorch tensors for Faiss indexes on both CPU and GPU. It is currently only implemented in this diff for things that inherit from faiss.Index, which covers the non-binary indices, and it patches the same functions on faiss.Index that were also covered by __init__.py for numpy interoperability.

    There must be uniformity among the inputs: if any array input is a Torch tensor, then all array inputs must be Torch tensors. Similarly, if any array input is a numpy ndarray, then all array inputs must be numpy ndarrays.

    If faiss.contrib.torch_utils is imported, it ensures that import faiss has already been performed to patch all of the functions using the base __init__.py numpy wrappers, and then patches the following functions again:

    add
    add_with_ids
    assign
    train
    search
    remove_ids
    reconstruct
    reconstruct_n
    range_search
    update_vectors
    search_and_reconstruct
    sa_encode
    sa_decode
    

    to allow usage of PyTorch CPU tensors, and additionally PyTorch GPU tensors if the index being used is on the GPU.

    numpy functionality is still available when faiss.contrib.torch_utils is imported; we pass through to the original patched numpy function when we detect numpy inputs.

    In addition, to allow for better (asynchronous) GPU usage without requiring the CPU to be involved, all of these functions which construct tensors/arrays for output now take optional arguments for storage (numpy or torch.Tensor) to be provided that will contain the output data. range_search is the only exception to this, as the size of the output data is indeterminate. The eventual GPU implementation will likely require the user to provide a maximum cap on the output size, and allow that to be passed instead. If the optional pre-allocated output values are presented by the user, they are used; otherwise, new return ndarray / Tensors are constructed as before and used for the return. If this feature were not provided on the GPU, then every execution would be completely serial as we would depend upon the CPU to allocate GPU memory before every operation. Instead, now this can function much like NN graph execution on the GPU, assuming that all of the data requirements are pre-allocated, so the execution will run at the full speed of the GPU and not be stalled sequentially launching kernels.

    This diff also exposes the GpuResources shared_ptr object owned by a GPU index. This is required for pytorch GPU so that we can perform proper stream ordering in Faiss with respect to the current pytorch stream. So, Faiss indices now perform more or less as any NN operation in Torch does.

    Note, however, that a Faiss index has its own setting on current device, and if the pytorch GPU tensor inputs are resident on a different device than what the Faiss index expects, a cross-device copy will be initiated. I may choose to make this an error in the future and require matching device to device.

    This diff also found a bug when passing GPU data directly to train() for GpuIndexIVFFlat and GpuIndexIVFScalarQuantizer, as I guess we never tested passing GPU data directly to these functions before. GpuIndexIVFPQ was doing the right thing however.

    The assign function is now also implemented on the GPU as well, and is now marked const to be in line with the search function.

    Also added better checking of non-contiguous inputs for both Torch tensors and numpy ndarrays.

    Updated the knn_gpu function with a base implementation always present that allows for usage of numpy arrays, which is overridden when torch_utils is imported to allow torch usage. This supports row/column major layout, float32/float16 data and int64/int32 indices for both numpy and torch.

    Reviewed By: mdouze

    Differential Revision: D24299400

    CLA Signed fb-exported 
    opened by wickedfoo 21
  • GPU issue when installing from conda

    Summary

    I installed Faiss from conda (GPU version).

    And I got ImportError: No module named 'swigfaiss'. Could you help me out? Did I forget anything?

    Platform

    OS: Ubuntu

    Faiss version:

    Faiss compilation options:

    Running on :

    • [ ] CPU
    • [x] GPU

    Reproduction instructions

    GPU install 
    opened by hminle 20
  • Speedup exhaustive_L2sqr_blas for AVX2, ARM NEON and AVX512

    Summary: Add a fused kernel for exhaustive_L2sqr_blas() call that combines a computation of dot product and the search for the nearest centroid. As a result, no temporary dot product values are written and read in RAM.
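
The idea of fusing the dot product with the nearest-centroid search can be sketched in scalar Python. This is only an illustration under stated assumptions: the real kernel is SIMD C++ code, and the function names below are hypothetical. The unfused version materializes every dot product in a temporary buffer before searching it; the fused version updates a running minimum as soon as each dot product is available.

```python
def nearest_centroid_unfused(x, centroids):
    """Two passes: store all dot products, then search them."""
    dots = [sum(a * b for a, b in zip(x, c)) for c in centroids]  # temporary buffer
    norms = [sum(b * b for b in c) for c in centroids]
    # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2; ||x||^2 is constant and dropped.
    scores = [n - 2.0 * d for d, n in zip(dots, norms)]
    return min(range(len(centroids)), key=scores.__getitem__)

def nearest_centroid_fused(x, centroids, centroid_norms):
    """One pass: compute each dot product and update the running minimum."""
    best_id, best_score = -1, float("inf")
    for i, c in enumerate(centroids):
        d = 0.0
        for a, b in zip(x, c):
            d += a * b
        score = centroid_norms[i] - 2.0 * d  # never written to a buffer
        if score < best_score:
            best_id, best_score = i, score
    return best_id

centroids = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0], [0.0, 3.0]]
centroid_norms = [sum(b * b for b in c) for c in centroids]  # computed once
points = [[0.9, 1.2], [3.5, 0.2], [0.1, 2.6]]
assignments = [nearest_centroid_fused(p, centroids, centroid_norms) for p in points]
```

Both functions return the same assignment; the fused one avoids the intermediate list of dot products.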

    Significantly speeds up the training of PQx[1] indices for low-dimensional PQ vectors ( 1, 2, 4, 8 ), and the effect is higher for higher values of [1]. AVX512 provides additional overloads for dimensionality of 12 and 16.

    The speedup is also beneficial for higher values of pq.cp.max_points_per_centroid (which is 256 by default).

    Speeds up IVFPQ training as well.

    The AVX512 kernel is not enabled by default, but I have seen it make training twice as fast as the AVX2 version. So, please feel free to use it by enabling AVX512 manually.

    Differential Revision: D41166766

    CLA Signed fb-exported 
    opened by alexanderguzhva 18
  • Does Faiss support searching from Disk?

    I checked this issue [#552] and also this demo file. However, the demo file is not about searching from disk; it shows how to save a trained index and load it into memory for searching. Does Faiss really support searching from disk? If it does, could you let me know where I can find out how to do it?

    question 
    opened by sam3oh5 18
  • _swigfaiss_avx2.so may not be loaded properly in conda

    Summary

    When I install faiss via conda, IndexPQFastScan is slower than IndexPQ. It seems that AVX2 is not activated properly because _swigfaiss_avx2.so is not loaded correctly.

    Platform

    OS: Ubuntu 20.04 on AWS EC2. (ami-0e039c7d64008bd84, c5.large)

    Faiss version: faiss-cpu 1.7.0 (pytorch/linux-64::faiss-cpu-1.7.0-py3.8_h2a577fa_0_cpu)

    Installed from: conda install -c pytorch faiss-cpu

    Faiss compilation options:

    Running on:

    • [x] CPU
    • [ ] GPU

    Interface:

    • [ ] C++
    • [x] Python

    Reproduction instructions

    I found that IndexPQFastScan is slower than IndexPQ for faiss 1.7.0 installed from conda. Here is the benchmark code.

    import faiss
    import numpy as np
    import time
    
    np.random.seed(123)
    D = 128
    N = 1000
    X = np.random.random((N, D)).astype(np.float32)
    M = 64
    nbits = 4
    
    pq = faiss.IndexPQ(D, M, nbits)
    pq.train(X)
    pq.add(X)
    
    pq_fast = faiss.IndexPQFastScan(D, M, nbits)
    pq_fast.train(X)
    pq_fast.add(X)
    
    t0 = time.time()
    d1, ids1 = pq.search(x=X[:3], k=5)
    t1 = time.time()
    print(f"pq: {(t1 - t0) * 1000} msec")
    
    t0 = time.time()
    d2, ids2 = pq_fast.search(x=X[:3], k=5)
    t1 = time.time()
    print(f"pq_fast: {(t1 - t0) * 1000} msec")
    
    assert np.allclose(ids1, ids2)
    

    The result is:

    pq: 0.4680156707763672 msec
    pq_fast: 1.6791820526123047 msec
    

    After investigating, the cause seems to be that _swigfaiss_avx2.so is not loaded correctly. If I rename _swigfaiss_avx2.so to _swigfaiss.so, the above code works as expected:

    cd ~/anaconda/lib/python3.8/site-packages/faiss/
    mv _swigfaiss.so _swigfaiss.so.bk
    mv _swigfaiss_avx2.so _swigfaiss.so
    

    Then the benchmark results in:

    pq: 0.8258819580078125 msec
    pq_fast: 0.07104873657226562 msec
    

    Here, IndexPQFastScan becomes much faster.

    The root cause seems to be that swigfaiss.py is somehow exactly the same as swigfaiss_avx2.py.

    diff swigfaiss.py swigfaiss_avx2.py     # same
    

    If I understand correctly, swigfaiss_avx2.py must load _swigfaiss_avx2.so. But currently swigfaiss_avx2.py is the same as swigfaiss.py and loads _swigfaiss.so.

    install 
    opened by matsui528 16
  • Indexing 1B vectors by creating smaller indexes on batches and merging them

    Need guidance...

    We'll have an application where we will stream a set of vectors (on the order of a billion). We cannot wait until we collect all the vectors to train an index (you recommend IMI at this scale). We are thinking of building indexes for smaller batches of vectors... once we have a batch ready, we could train the index from a sample, create an index for the batch and in the end merge all the indexes. I understand only IVF supports merging of indexes, wanted your thoughts on this approach.
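
The batch-and-merge approach described above can be sketched with toy inverted lists in plain Python. This does not call Faiss; the function names are hypothetical, and the sketch assumes the coarse centroids are trained once on an early sample and reused for every batch, since inverted lists can only be appended to each other when all batches share the same centroids.

```python
def assign(v, centroids):
    """Id of the closest centroid (coarse quantizer) under squared L2 distance."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(v, centroids[i])))

def build_batch_index(batch, first_id, centroids):
    """Inverted lists for one batch: list id -> [(vector id, vector), ...]."""
    lists = {i: [] for i in range(len(centroids))}
    for offset, v in enumerate(batch):
        lists[assign(v, centroids)].append((first_id + offset, v))
    return lists

def merge_into(target, other):
    """Merge by appending each inverted list; valid only with shared centroids."""
    for list_id, entries in other.items():
        target[list_id].extend(entries)

# Centroids trained once on an early sample, then reused for every batch.
centroids = [[0.0, 0.0], [10.0, 10.0]]
batch_a = [[0.5, 0.1], [9.0, 9.5]]
batch_b = [[11.0, 10.0], [0.2, 0.3], [0.0, 1.0]]

index = build_batch_index(batch_a, 0, centroids)
# Ids of the second batch are shifted so they stay unique after the merge.
merge_into(index, build_batch_index(batch_b, len(batch_a), centroids))
```

If each batch trained its own centroids instead, the list ids would not correspond across batches and a plain append would not be meaningful.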

    Thanks

    question GPU 
    opened by mvss80 16
  • CUDA 9 issue: results of GPU Index are not right?

    1. The result of the GPU index is not the same as the CPU result, even though they use the same dataset and the same index

    import numpy as np
    d = 64                           # dimension
    nb = 100000                      # database size
    nq = 10000                       # nb of queries
    np.random.seed(1234)             # make reproducible
    xb = np.random.random((nb, d)).astype('float32')
    xb[:, 0] += np.arange(nb) / 1000.
    xq = np.random.random((nq, d)).astype('float32')
    xq[:, 0] += np.arange(nq) / 1000.
    #=================================================================
    import faiss                   # make faiss available
    index = faiss.IndexFlatL2(d)   # build the index
    index.add(xb)                  # add vectors to the index
    k = 4                          # we want to see 4 nearest neighbors
    D, I = index.search(xq, k)     # actual search
    print(I[-5:])                  # neighbors of the 5 last queries
    print(D[-5:])
    
    del index, D, I
    #=================================================================
    print("=================")
    index = faiss.IndexFlatL2(d)   # build the index
    res = faiss.StandardGpuResources()
    index = faiss.index_cpu_to_gpu(res, 0, index)
    index.add(xb)                  # add vectors to the index
    k = 4                          # we want to see 4 nearest neighbors
    D, I = index.search(xq, k)     # actual search
    print(I[-5:])                  # neighbors of the 5 last queries
    print(D[-5:])
    
    del index, D, I
    
    exit(1)
    

    The result is

    [[ 9900 10500  9309  9831]
     [11055 10895 10812 11321]
     [11353 11103 10164  9787]
     [10571 10664 10632  9638]
     [ 9628  9554 10036  9582]]
    [[ 6.53157043  6.97875977  7.00392151  7.01379395]
     [ 4.33526611  5.23693848  5.31942749  5.70327759]
     [ 6.07269287  6.57675171  6.61395264  6.7322998 ]
     [ 6.63751221  6.64874268  6.85787964  7.00964355]
     [ 6.21836853  6.45251465  6.54876709  6.58129883]]
    =================
    number of GPUs: 1
    [[10500 10500  9831  9831]
     [10895 10895 10812 11321]
     [11103 11103  9787  9787]
     [10632 10632  9638  9638]
     [ 9628  9554  9582  9582]]
    [[ 6.53156281  6.97874451  7.00393677  7.01376343]
     [ 4.33531189  5.23696899  5.31942749  5.70326233]
     [ 6.07269287  6.57672119  6.61393738  6.73226929]
     [ 6.63748169  6.64871216  6.85783386  7.00959778]
     [ 6.21837616  6.45251465  6.54875183  6.58128357]]
    

    The results of the GPU index and the CPU index are not the same.

    2. Duplicate items in the GPU result

    As the result above shows, there are duplicate ids in the GPU result, but with different distances, like [10500 10500 9831 9831].

    Could someone tell me what the problem is and how to fix it? Thanks!

    bug GPU 
    opened by DrLai12club 16
  • Tests fail to link: undefined symbol: testing::AssertionSuccess()

    Summary

    ld: error: undefined symbol: testing::AssertionSuccess()
    >>> referenced by test_binary_flat.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_binary_flat.cpp.o:(BinaryFlat_accuracy_Test::TestBody())
    >>> referenced by test_dealloc_invlists.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_dealloc_invlists.cpp.o:((anonymous namespace)::test_dealloc_invlists(char const*))
    >>> referenced by test_ivfpq_codec.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_ivfpq_codec.cpp.o:(IVFPQ_codec_Test::TestBody())
    >>> referenced 533 more times
    
    ld: error: undefined symbol: testing::Message::Message()
    >>> referenced by test_binary_flat.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_binary_flat.cpp.o:(BinaryFlat_accuracy_Test::TestBody())
    >>> referenced by test_dealloc_invlists.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_dealloc_invlists.cpp.o:((anonymous namespace)::test_dealloc_invlists(char const*))
    >>> referenced by test_ivfpq_codec.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_ivfpq_codec.cpp.o:(IVFPQ_codec_Test::TestBody())
    >>> referenced 746 more times
    

    Platform

    OS: FreeBSD 13.1

    Faiss version: 1.7.3

    Installed from: FreeBSD port

    opened by yurivict 0
  • Have max_codes consider only subset entries in IndexIVF search

    Summary

    Hey! Nice work with V1.7.3! I have a feature request.

    Is there a way to have the max_codes stopping criterion in IndexIVF searches consider only those entries that actually belong to a subset, if one is specified? With the current implementation, to my understanding, the search is stopped when the number of scanned entries reaches max_codes. However, for subset searches, this might happen before we have actually scanned max_codes entries in the subset, as even entries not in the subset count towards this limit.

    Specifically, just as a proof of concept, all that would be necessary is to have scan_one_list not return list_size but instead return the number returned by scanner->scan_codes(list_size, codes, ids, simi, idxi, k); a few lines above. See here.

    Obviously, that's just a quick hack only for the IVF index. 🙃 I assume in order to not break the current behavior, this would need to be controlled via an additional search parameter for all indices that have the same behavior currently.
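
The difference between the two counting rules can be sketched in plain Python. This is a toy model of the behavior as described in this request, not Faiss code; scan_lists and the count_subset_only flag are hypothetical names standing in for the additional search parameter suggested above.

```python
def scan_lists(inverted_lists, max_codes, in_subset=None, count_subset_only=False):
    """Scan inverted lists in order; stop once the code budget is used up.

    Returns the ids that were actually compared (entries passing the filter).
    """
    compared, budget_used = [], 0
    for entries in inverted_lists:
        kept = [i for i in entries if in_subset is None or i in in_subset]
        compared.extend(kept)
        # Behavior described in the request: every entry of the list counts.
        # Proposed variant: only entries inside the subset count.
        budget_used += len(kept) if count_subset_only else len(entries)
        if budget_used >= max_codes:
            break
    return compared

lists = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
subset = {3, 6, 7, 9}

current = scan_lists(lists, max_codes=4, in_subset=subset)
proposed = scan_lists(lists, max_codes=4, in_subset=subset, count_subset_only=True)
```

With the current rule the budget of 4 is exhausted by the first list, so only one subset entry is compared; with the proposed rule the scan continues until 4 subset entries have been compared.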

    Faiss version: 1.7.3 (19f7696deedc93615c3ee0ff4de22284b53e0243)

    Running on:

    • [x] CPU
    • [ ] GPU

    Interface:

    • [x] C++
    • [ ] Python
    opened by wro-ableton 0
  • Scan exactly max_codes elements

    Summary: The max_codes search parameter for IVF indexes limits the number of distance computations that are performed. Previously, the number of distance computations could exceed max_codes because inverted lists were scanned completely. This diff changed this to scan the beginning of the last inverted list to reach max_codes exactly.
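
A toy model of the change, in plain Python with hypothetical function names, counts how many distance computations each rule performs for given inverted-list sizes. It only mirrors the summary above and is not the Faiss implementation.

```python
def scan_whole_lists(list_sizes, max_codes):
    """Earlier behavior as described: stop only after a full list is scanned."""
    scanned = 0
    for size in list_sizes:
        scanned += size
        if scanned >= max_codes:
            break
    return scanned

def scan_exactly(list_sizes, max_codes):
    """New behavior as described: scan only the beginning of the last list."""
    scanned = 0
    for size in list_sizes:
        scanned += min(size, max_codes - scanned)
        if scanned >= max_codes:
            break
    return scanned

sizes = [40, 70, 25]
before = scan_whole_lists(sizes, max_codes=100)  # overshoots the budget
after = scan_exactly(sizes, max_codes=100)       # hits the budget exactly
```

When max_codes exceeds the total number of entries in the visited lists, both rules scan everything and agree.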

    Differential Revision: D42367593

    CLA Signed fb-exported 
    opened by mdouze 2
  • search slow after time.sleep

    Platform

    OS: Ubuntu 18.04.6 LTS

    Faiss version: 1.5.3

    Installed from: pip install faiss

    Running on:

    • unknown

    Interface:

    • [x] Python

    build index: index = faiss.index_factory(self.d, "IDMap,Flat", faiss.METRIC_INNER_PRODUCT)
    save index: faiss.write_index(index, index_path)
    read index: faiss_index = faiss.read_index(index_path)

    Looping 100 times: with time.sleep(0.2) in the loop, some searches take more than 20 ms; without time.sleep(0.2), every search time is steady.

    #1
    for i in range(0, 100):
        time.sleep(0.2)
        s_time = time.time()
        D, I = faiss_index.search(feature, 10)
        print(time.time() - s_time)

    time(s): 0.033809662 0.001636744 0.001227379 0.000584841 0.000673294 0.001588345 0.000566244 0.025577307 0.000347614 0.000542164 0.00073719 0.000379801 0.000360966 0.000362158 0.000305891 0.000477791 0.000341892 0.000299692 0.027928352 0.000314474 0.000792265 0.000283957 0.000373125 0.000294924 0.000402451 0.000293255 0.000303745 0.000368595 0.000586987 0.0218997 0.000355959 0.000353813 0.000363588 0.000471115 0.000345945 0.00036335 0.000501871 0.000407934 0.000304461 0.025905132 0.000546932 0.000391483 0.000262737 0.000678778 0.000277281 0.000338316 0.000325441 0.000415325 0.000396729 0.000430822 0.025371552 0.000266314 0.000350237 0.000250816 0.000309944 0.000453234 0.000368357 0.000521183 0.000347614 0.000543833 0.000417709 0.051602125 0.000535011 0.00065589 0.00056839 0.000513554 0.000328541 0.000306129 0.00067091 0.00054121 0.00051856 0.00036788 0.02731204 0.000954151 0.00055337 0.000694036 0.000400543 0.000449419 0.00043416 0.000398636 0.000354052 0.000365257 0.033364534 0.000450373 0.000359058 0.004323483 0.000331402 0.000561714 0.000916481 0.000369787 0.000481844 0.000393391 0.000357866 0.025733948 0.000584841 0.000360727 0.000318527 0.000590801 0.000495434 0.000266552

    #2
    for i in range(0, 100):
        # time.sleep(0.2)
        s_time = time.time()
        D, I = faiss_index.search(feature, 10)
        print(time.time() - s_time)

    time(s): 0.046122789 0.000362396 0.00031805 0.000313759 0.000325203 0.00032568 0.000318527 0.000315666 0.000306606 0.000331163 0.000328302 0.000318289 0.000317335 0.000319004 0.00031662 0.00031805 0.000314713 0.000321388 0.000338554 0.000316143 0.000310659 0.000306129 0.000330448 0.000365973 0.000255823 0.000335455 0.00032115 0.000276089 0.000339508 0.000310898 0.000317812 0.00032568 0.000333309 0.00030756 0.000320435 0.000317812 0.00032258 0.000314236 0.000326872 0.000309706 0.000336885 0.000307322 0.000322104 0.00032711 0.00032711 0.000305414 0.000321388 0.000312805 0.000305891 0.00031805 0.000324965 0.00030899 0.000313282 0.000323772 0.000318527 0.000325918 0.000321627 0.000317097 0.000327587 0.000323296 0.000310898 0.000326872 0.000333548 0.000359297 0.000272274 0.000305414 0.000329018 0.000317335 0.000315666 0.000325441 0.00031662 0.000314474 0.00033021 0.000314951 0.000320911 0.00033021 0.000313282 0.000319958 0.000318289 0.000332832 0.000331879 0.000303507 0.000319242 0.000331879 0.000316381 0.000310659 0.000353813 0.000301838 0.000322819 0.00031662 0.000310183 0.000318766 0.000341415 0.000312328 0.00033021 0.000317335 0.000331402 0.000324726 0.000315905 0.000311375

    opened by safehumeng 0
  • GpuIndexFlatL2 doesn't produce distances for the last 8 queries

    Platform

    OS: Windows 10

    Faiss version: 1.7.3

    Installed from: Compiled using Visual Studio 17 2022

    Faiss compilation options: Using MKL 2202.2.1

    Cuda version: 12.0.0

    GPU: GTX 1060

    Running on:

    • [X] CPU
    • [X] GPU

    Interface:

    • [X] C++
    • [ ] Python

    Reproduction instructions

    Using the test file linked below, Faiss builds a CPU index and a GPU index, then performs a query search with the first 1000 vectors of a 100000-vector database. The code is copied directly from 1-Flat for the CPU portion and 4-GPU for the GPU portion.

    Consistently, the last 8 vectors from the distance matrix are all 0's. Whether querying 1000 elements, or 10000 elements, it's only the last 8 elements.

    6-GPU-CPU.zip

    Output of the program is as follows:

    Building data
    Make index
    is_trained = true
    ntotal = 100000
    I (5 first results)=
        0   723   254   152   403    92   368  1129   673   571
        1   995   136   183   223   555   880   671     5    68
        2   312   253    29   124   148   112   718   713   260
        3   983   467    88   786   327   326   684   367  1053
        4   403   112   643   430   679   142   733   119   382
    I (10 last results)=
      990   962  2284   863  1133  1683  1463  2339  1730  2228
      991  1026   995   540  1396   365  1348  1271  1861   975
      992   257   163   135  1489  1315   878  1017   219   777
      993  1331   210  1362   286   444  1329   608  1191   986
      994   155   134   631   469  1044   388  1042   766  1561
      995   511     1   664   991  1800   689    37   634   631
      996   770  1043   827  1264  1310  1828  1504  1535   876
      997  1288   920   742  1432   840  1174  1337  1041  1113
      998   689  1044   810  1229  2199  1448  2112  1888  1442
      999  1722   901  1161  1044  1251   505  1310   791   308
    D (10 last results)=
          0 6.46885 6.56971 6.80382 7.19488 7.25274 7.44602 7.56737 7.75592  7.8215
          0 5.75124 5.96521 6.00626 6.17735  6.6787 6.74106 6.87712 6.89094 6.89425
          0 5.82659 6.08222 6.16805 6.19852 6.25793 6.56962 6.60474 6.71429 6.72893
          0 6.79663 6.83468  6.9018 6.90929 7.06563 7.07221 7.15147 7.18442 7.20781
          0 6.02754 6.53414 6.62136 6.73151 6.83076 6.85785 6.86768 6.87643 6.89012
          0 5.52238 5.78548 5.80803 5.96521 5.97704 6.12522  6.1321 6.18419 6.51028
          0 5.73736 6.25742 6.38132 6.43517 6.63315 6.70425 6.81538 6.84794  6.8531
          0 6.59953 6.84864 7.11777 7.33908 7.38752 7.39641 7.48399 7.52819 7.60603
          0 5.54166 5.68894 5.72082 5.98355 6.49582 6.52649  6.5502 6.66038 6.66049
          0 6.26311 6.37093 6.39842 6.62256 6.73258 6.82148 6.83769 6.84539 6.91491
    is_trained = true
    ntotal = 100000
    I (5 first results)=
        0   723   254   152   403    92   368  1129   673   571
        1   995   136   183   223   555   880   671     5    68
        2   312   253    29   124   148   112   718   713   260
        3   983   467    88   786   327   326   684   367  1053
        4   403   112   643   430   679   142   733   119   382
    I (10 last results)=
      990   962  2284   863  1133  1683  1463  2339  1730  2228
      991  1026   995   540  1396   365  1348  1271  1861   975
      992   257   163   135  1489  1315   878  1017   219   777
      993  1331   210  1362   286   444  1329   608  1191   986
      994   155   134   631   469  1044   388  1042   766  1561
      995   511     1   664   991  1800   689    37   634   631
      996   770  1043   827  1264  1310  1828  1504  1535   876
      997  1288   920   742  1432   840  1174  1337  1041  1113
      998   689  1044   810  1229  2199  1448  2112  1888  1442
      999  1722   901  1161  1044  1251   505  1310   791   308
    D (10 last results)=
    7.62939e-06 6.46885 6.56971 6.80381 7.19488 7.25273 7.44602 7.56738 7.75592  7.8215
          0 5.75124 5.96521 6.00626 6.17735 6.67871 6.74106 6.87711 6.89094 6.89426
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
    
    cant-repro GPU 
    opened by JulianThijssen 1
Releases (v1.7.3)
  • v1.7.3 (Nov 30, 2022)

    Added

    • Sparse k-means routines and moved the generic kmeans to contrib
    • FlatDistanceComputer for all FlatCodes indexes
    • Support for fast accumulation of 4-bit LSQ and RQ
    • Product additive quantization support
    • Support per-query search parameters for many indexes + filtering by ids
    • write_VectorTransform and read_VectorTransform were added to the public API (by @AbdelrahmanElmeniawy)
    • Support for IDMap2 in index_factory by adding "IDMap2" to the prefix or suffix of the input string (by @AbdelrahmanElmeniawy)
    • Support for merging all IndexFlatCodes descendants (by @AbdelrahmanElmeniawy)
    • Remove and merge features for IndexFastScan (by @AbdelrahmanElmeniawy)
    • Performance improvements: 1) specialized the AVX2 pieces of code speeding up certain hotspots, 2) specialized kernels for vector codecs (this can be found in faiss/cppcontrib)

    Fixed

    • Fixed memory leak in OnDiskInvertedLists::do_mmap when the file is not closed (by @AbdelrahmanElmeniawy)
    • LSH correctly throws error for metric types other than METRIC_L2 (by @AbdelrahmanElmeniawy)
  • v1.7.2(Jan 10, 2022)

    Added

    • Support LSQ on GPU (by @KinglittleQ)
    • Support for exact 1D kmeans (by @KinglittleQ)
    • LUT-based search for additive quantizers
    • Autogenerated Python docstrings from Doxygen comments

    Changed

    • Cleanup of index_factory parsing
  • v1.6.4(Oct 22, 2020)

    Features

    • Arbitrary dimensions per sub-quantizer now allowed for GpuIndexIVFPQ.
    • Brute-force kNN on GPU (bfKnn) now accepts int32 indices.
    • Faiss CPU now supports Windows. Conda packages are available from the nightly channel.
  • v1.5.3(Jun 24, 2019)

    Bugfixes:

    • slow scanning of inverted lists (#836).

    Features:

    • add basic support for 6 new metrics in CPU IndexFlat and IndexHNSW (#848);
    • add support for IndexIDMap/IndexIDMap2 with binary indexes (#780).

    Misc:

    • throw python exception for OOM (#758);
    • make DistanceComputer available for all random access indexes;
    • gradually moving from long to int64_t for portability.
  • v1.5.2(May 30, 2019)

    The license was changed from BSD+Patents to MIT.

    Changelog:

    • propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas;
    • support for searching several inverted lists in parallel (parallel_mode != 0);
    • better support for PQ codes where nbit != 8 or 16;
    • IVFSpectralHash implementation: spectral hash codes inside an IVF;
    • 6-bit per component scalar quantizer (4 and 8 bit were already supported);
    • combinations of inverted lists: HStackInvertedLists and VStackInvertedLists;
    • configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch);
    • more test and demo code compatible with Python 3 (print with parentheses);
    • refactored benchmark code: data loading is now in a single file.
  • v1.5.1(May 30, 2019)

    Changelog:

    • a MatrixStats object, which reports useful statistics about a dataset;
    • an option to round coordinates during k-means optimization;
    • an alternative option for search in HNSW;
    • moved stats() and imbalance_factor() from IndexIVF to InvertedLists object;
    • range search is now available for IVFScalarQuantizer;
    • support for direct uint8_t codec in ScalarQuantizer;
    • renamed IndexProxy to IndexReplicas (now ;
    • better support for PQ code assignment with external index;
    • support for IMI2x16 (4B virtual centroids!);
    • support for k = 2048 search on GPU (instead of 1024);
    • most CUDA mem alloc failures now throw exceptions instead of terminating on an assertion;
    • support for renaming an ondisk invertedlists;
    • interrupt computations with interrupt signal (ctrl-C) in python;
    • simplified build system (with --with-cuda/--with-cuda-arch options);
    • updated example Dockerfile;
    • conda packages now depend on the cudatoolkit packages, which fixes some interference with pytorch. Consequently, faiss-gpu should now be installed with: conda install -c pytorch faiss-gpu cudatoolkit=10.0.
  • v1.5.0(May 30, 2019)

  • v1.4.0(Aug 31, 2018)

    Faiss 1.4.0

    Features:

    • automatic tracking of C++ references in Python
    • non-intel platforms supported -- some functions optimized for ARM
    • override nprobe for concurrent searches
    • support for floating-point quantizers in binary indexes

    Bug fixes:

    • no more segfaults in python (I know it's the same as the first feature but it's important!)
    • fix GpuIndexIVFFlat issues for float32 with 64 / 128 dims
    • fix sharding of flat indexes on GPU with index_cpu_to_gpu_multiple

    The Python interface of Faiss closely mimics the C++ interface. This means that all C++ functions, objects, fields and methods are visible and accessible in Python. This is done thanks to SWIG, which automatically generates Python classes from the C++ headers. The downside of this low-level access was that, until this release, there was no automatic tracking of C++ references in Python. For example:

    index = IndexIVFFlat(IndexFlatL2(10), 10, 100) 
    

    would crash. Python does not know that the IndexFlatL2 is referenced by the IndexIVFFlat, so the garbage collector deallocates the IndexFlatL2 while the IndexIVFFlat still references it. In Faiss 1.4.0, we added code to all such constructors that adds a Python-level reference to the object and prevents deallocation. With this upgrade, there should be no more crashes in pure Python; if you do encounter one, please report it right away as an issue.
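The reference-keeping idea can be illustrated without Faiss. The sketch below is a pure-Python analogy, not the Faiss code: Quantizer, UntrackedIndex, TrackedIndex and the referenced_objects attribute are hypothetical names, and a weak reference stands in for the raw C++ pointer that Python cannot see.

```python
import gc
import weakref


class Quantizer:
    """Hypothetical stand-in for a wrapped C++ object such as IndexFlatL2."""

    def __init__(self, d):
        self.d = d


class UntrackedIndex:
    """Holds only a non-owning handle, like a raw pointer on the C++ side."""

    def __init__(self, quantizer):
        # A weak reference does not keep the quantizer alive.
        self._quantizer_ref = weakref.ref(quantizer)

    def quantizer_alive(self):
        return self._quantizer_ref() is not None


class TrackedIndex(UntrackedIndex):
    """Also stores a Python-level reference, which prevents deallocation."""

    def __init__(self, quantizer):
        super().__init__(quantizer)
        # Assumption: any attribute that holds the object would do here.
        self.referenced_objects = [quantizer]


# Both indexes receive a temporary quantizer, as in the one-liner above.
untracked = UntrackedIndex(Quantizer(10))
tracked = TrackedIndex(Quantizer(10))
gc.collect()  # make collection explicit on interpreters without refcounting

print(untracked.quantizer_alive())  # False on CPython: the temporary is gone
print(tracked.quantizer_alive())    # True: the stored reference keeps it alive
```

In the untracked case the wrapper outlives the object it points to, which in C++ terms is a dangling pointer and the source of the crash described above.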

    Faiss was developed on 64-bit x86 platforms, Linux and Mac OS. There were quite a few locations in the code that shamelessly assumed that they were compiled with SSE support. Faiss 1.4.0 is portable to other hardware: it has pure C++ code for all operations, and SSE/AVX is only enabled if the appropriate macros are set. This was tested on an ARM platform, and a few operations were also optimized for ARM SIMD (in utils_simd.cpp).

    To compile on a non-x86 platform, you will need to provide a BLAS library (OpenBLAS works for aarch64) and remove x86-specific flags from makefile.inc (manually for now). Faiss is not portable to compilers other than g++/clang, though.

    The search-time parameters like nprobe for IndexIVF are set in the index object. What if you want to perform concurrent searches from several threads with different search parameters? This was not possible so far. Now there is an IVFSearchParameters object that can override the parameters set at the object level. See tests/test_params_override.cpp
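The override pattern can be sketched with a toy index. This is a pure-Python illustration under stated assumptions, not the Faiss API: ToyIVF and ToySearchParams are hypothetical names, the vectors are 1-D, and the centroids are fixed by hand. The point is that search reads nprobe from the per-call object when one is given and never mutates the value stored in the index, so concurrent callers cannot interfere with each other.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToySearchParams:
    """Hypothetical per-call override, in the role of IVFSearchParameters."""
    nprobe: int


class ToyIVF:
    """Toy 1-D inverted-file index with hand-picked centroids."""

    def __init__(self, centroids, nprobe=1):
        self.centroids = list(centroids)
        self.nprobe = nprobe  # index-level default shared by all callers
        self.lists = [[] for _ in self.centroids]

    def add(self, values):
        # Assign each value to the inverted list of its nearest centroid.
        for v in values:
            nearest = min(range(len(self.centroids)),
                          key=lambda i: abs(v - self.centroids[i]))
            self.lists[nearest].append(v)

    def search(self, q, k, params: Optional[ToySearchParams] = None):
        # Prefer the per-call value; self.nprobe is only read, never written.
        nprobe = params.nprobe if params is not None else self.nprobe
        order = sorted(range(len(self.centroids)),
                       key=lambda i: abs(q - self.centroids[i]))
        candidates = [v for i in order[:nprobe] for v in self.lists[i]]
        return sorted(candidates, key=lambda v: abs(q - v))[:k]


index = ToyIVF(centroids=[0.0, 10.0, 20.0], nprobe=1)
index.add([1.0, 4.9, 5.2, 9.0, 14.0, 21.0])

fast = index.search(4.95, k=2)                                # one list probed
accurate = index.search(4.95, k=2, params=ToySearchParams(nprobe=2))
print(fast)      # [4.9, 1.0]: 5.2 sits in a list that was not visited
print(accurate)  # [4.9, 5.2]
```

Setting index.nprobe from two threads would race; passing the parameter with each call keeps the index object unchanged during search.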

    Faiss' support for binary indexes is recent, and not so many index types are supported. To work around this, we added IndexBinaryFromFloat, a binary index that wraps around any floating-point index. This makes it possible, for example, to use an IndexHNSW as a quantizer for an IndexBinaryIVF. See tests/test_index_binary_from_float.py
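One way such a wrapper can work is sketched below in pure Python. It is an illustration under assumptions, not the Faiss implementation: ToyFlatL2 and ToyBinaryFromFloat are hypothetical classes, and the mapping of each bit to -1.0 or +1.0 is chosen so that the squared L2 distance is exactly four times the Hamming distance.

```python
def unpack_bits(code: bytes):
    """Expand packed bytes into one float per bit: +1.0 for 1, -1.0 for 0."""
    return [1.0 if (byte >> bit) & 1 else -1.0
            for byte in code for bit in range(8)]


class ToyFlatL2:
    """Minimal float index: brute-force squared L2 search."""

    def __init__(self):
        self.vectors = []

    def add(self, vectors):
        self.vectors.extend(vectors)

    def search(self, query, k):
        scored = [(sum((a - b) ** 2 for a, b in zip(query, v)), i)
                  for i, v in enumerate(self.vectors)]
        return sorted(scored)[:k]


class ToyBinaryFromFloat:
    """Binary-index facade that delegates storage and search to a float index."""

    def __init__(self, float_index):
        self.float_index = float_index

    def add(self, codes):
        self.float_index.add([unpack_bits(c) for c in codes])

    def search(self, code, k):
        # Each differing bit contributes (1 - (-1)) ** 2 = 4 to squared L2,
        # so dividing by 4 recovers the Hamming distance.
        hits = self.float_index.search(unpack_bits(code), k)
        return [(int(round(dist / 4.0)), idx) for dist, idx in hits]


binary_index = ToyBinaryFromFloat(ToyFlatL2())
binary_index.add([b"\x0f", b"\xff", b"\x00"])
results = binary_index.search(b"\x0e", k=3)
print(results)  # [(1, 0), (3, 2), (5, 1)] as (Hamming distance, id) pairs
```

Because the facade only needs add and search from the wrapped object, any float index with that interface could take the place of ToyFlatL2 in this sketch.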

    We also fixed a few bugs that correspond to github issues.

  • v1.3.0(Jul 12, 2018)

    Features:

    • Support for binary indexes (IndexBinaryFlat, IndexBinaryIVF)
    • Support fp16 encoding in scalar quantizer
    • Support for deduplication in IndexIVFFlat
    • Support for index serialization

    Bugs:

    • Fix MMAP bug for normal indexes
    • Fix propagation of io_flags in read func
    • Fix k-selection for CUDA 9
    • Fix race condition in OnDiskInvertedLists
  • v1.2.1(Mar 1, 2018)

Owner
Meta Research
FAST Aiming at the problems of cumbersome steps and slow download speed of GNSS data

FAST Aiming at the problems of cumbersome steps and slow download speed of GNSS data, a relatively complete set of integrated multi-source data download terminal software fast is developed. The softw

ChangChuntao 23 Dec 31, 2022
NudeNet: Neural Nets for Nudity Classification, Detection and selective censoring

NudeNet: Neural Nets for Nudity Classification, Detection and selective censoring Uncensored version of the following image can be found at https://i.

notAI.tech 1.1k Dec 29, 2022
Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"

TR-BERT Source code and dataset for "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference". The code is based on huggingface's transformers.

THUNLP 37 Oct 30, 2022
PyTorch implementation of Histogram Layers from DeepHist: Differentiable Joint and Color Histogram Layers for Image-to-Image Translation

deep-hist PyTorch implementation of Histogram Layers from DeepHist: Differentiable Joint and Color Histogram Layers for Image-to-Image Translation PyT

Winfried Lötzsch 10 Dec 06, 2022
A Robust Unsupervised Ensemble of Feature-Based Explanations using Restricted Boltzmann Machines

A Robust Unsupervised Ensemble of Feature-Based Explanations using Restricted Boltzmann Machines Understanding the results of deep neural networks is

Johan van den Heuvel 2 Dec 13, 2021
NR-GAN: Noise Robust Generative Adversarial Networks

Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter Code and checkpoints for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling

Takuhiro Kaneko 59 Dec 11, 2022
A TikTok-like recommender system for GitHub repositories based on Gorse

GitRec GitRec is the missing recommender system for GitHub repositories based on Gorse. Architecture The trending crawler crawls trending repositories

337 Jan 04, 2023
A wrapper around SageMaker ML Lineage Tracking extending ML Lineage to end-to-end ML lifecycles, including additional capabilities around Feature Store groups, queries, and other relevant artifacts.

ML Lineage Helper This library is a wrapper around the SageMaker SDK to support ease of lineage tracking across the ML lifecycle. Lineage artifacts in

AWS Samples 12 Nov 01, 2022
PyTorch implementation of "Continual Learning with Deep Generative Replay", NIPS 2017

pytorch-deep-generative-replay PyTorch implementation of Continual Learning with Deep Generative Replay, NIPS 2017 Results Continual Learning on Permu

Junsoo Ha 127 Dec 14, 2022
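Python implementation of the patchmatch and patchmatchstereo algorithms

patchmatch Python implementation of the patchmatch and patchmatchstereo algorithms. patchmatch follows a GitHub implementation; patchmatchstereo follows Dr. Li Yingsong's C++ code. Because patchmatchstereo has not been optimized at all and is written in Python, it is mainly meant to make it easier to analyze the algo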

Sanders Bao 11 Dec 02, 2022
This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

Prompt-Based Multi-Modal Image Segmentation This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation". The sys

Timo Lüddecke 305 Dec 30, 2022
A machine learning library for spiking neural networks. Supports training with both torch and jax pipelines, and deployment to neuromorphic hardware.

Rockpool Rockpool is a Python package for developing signal processing applications with spiking neural networks. Rockpool allows you to build network

SynSense 21 Dec 14, 2022
IsoGCN code for ICLR2021

IsoGCN The official implementation of IsoGCN, presented in the ICLR2021 paper Isometric Transformation Invariant and Equivariant Graph Convolutional N

horiem 39 Nov 25, 2022
EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation.

This repository contains data and code for our EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation. Please contact me at

9 Oct 28, 2022
Band-Adaptive Spectral-Spatial Feature Learning Neural Network for Hyperspectral Image Classification

Band-Adaptive Spectral-Spatial Feature Learning Neural Network for Hyperspectral Image Classification

258 Dec 29, 2022
ReGAN: Sequence GAN using RE[INFORCE|LAX|BAR] based PG estimators

Sequence Generation with GANs trained by Gradient Estimation Requirements: PyTorch v0.3 Python 3.6 CUDA 9.1 (For GPU) Origin The idea is from paper Se

40 Nov 03, 2022
Crowd-Kit is a powerful Python library that implements commonly-used aggregation methods for crowdsourced annotation and offers the relevant metrics and datasets

Crowd-Kit: Computational Quality Control for Crowdsourcing Documentation Crowd-Kit is a powerful Python library that implements commonly-used aggregat

Toloka 125 Dec 30, 2022
Implementation of algorithms for continuous control (DDPG and NAF).

DEPRECATION This repository is deprecated and is no longer maintained. Please see a more recent implementation of RL for continuous control at jax-sac.

Ilya Kostrikov 288 Dec 31, 2022
Pytorch implementation of DeepMind's differentiable neural computer paper.

DNC pytorch This is a Pytorch implementation of DeepMind's Differentiable Neural Computer (DNC) architecture introduced in their recent Nature paper:

Yuanpu Xie 91 Nov 21, 2022
A curated list of the top 10 computer vision papers in 2021 with video demos, articles, code and paper reference.

The Top 10 Computer Vision Papers of 2021 The top 10 computer vision papers in 2021 with video demos, articles, code, and paper reference. While the w

Louis-François Bouchard 118 Dec 21, 2022