The supported way to install Faiss is through conda. Stable releases, as well as pre-release nightly builds, are pushed regularly to the pytorch conda channel.
- The CPU-only faiss-cpu conda package is currently available on Linux (x86-64 and aarch64), OSX (arm64 only), and Windows (x86-64)
- faiss-gpu, containing both CPU and GPU indices, is available on Linux (x86-64 only) for CUDA 11.4 and 12.1
- faiss-gpu-raft [1], a package containing GPU indices provided by NVIDIA RAFT version 24.06, is available on Linux (x86-64 only) for CUDA 11.8 and 12.4.
To install the latest stable release:
# CPU-only version
$ conda install -c pytorch faiss-cpu=1.10.0
# GPU(+CPU) version
$ conda install -c pytorch -c nvidia faiss-gpu=1.10.0
# GPU(+CPU) version with NVIDIA RAFT
$ conda install -c pytorch -c nvidia -c rapidsai -c conda-forge faiss-gpu-raft=1.10.0
# GPU(+CPU) version using AMD ROCm not yet available
For faiss-gpu, the nvidia channel is required for CUDA, which is not published in the main anaconda channel.
For faiss-gpu-raft, the rapidsai, conda-forge and nvidia channels are required.
Nightly pre-release packages can be installed as follows:
# CPU-only version
$ conda install -c pytorch/label/nightly faiss-cpu
# GPU(+CPU) version
$ conda install -c pytorch/label/nightly -c nvidia faiss-gpu=1.10.0
# GPU(+CPU) version with NVIDIA cuVS (package built with CUDA 12.4)
conda install -c pytorch -c rapidsai -c conda-forge -c nvidia pytorch/label/nightly::faiss-gpu-cuvs 'cuda-version>=12.0,<=12.5'
# GPU(+CPU) version with NVIDIA cuVS (package built with CUDA 11.8)
conda install -c pytorch -c rapidsai -c conda-forge -c nvidia pytorch/label/nightly::faiss-gpu-cuvs 'cuda-version>=11.4,<=11.8'
# GPU(+CPU) version using AMD ROCm not yet available
In the above commands, pytorch-cuda=11 or pytorch-cuda=12 would select a specific CUDA version, if it’s required.
A combination of versions that installs GPU Faiss with CUDA and Pytorch (as of 2024-05-15):
conda create --name faiss_1.8.0
conda activate faiss_1.8.0
conda install -c pytorch -c nvidia faiss-gpu=1.8.0 pytorch=*=*cuda* pytorch-cuda=11 numpy
Faiss is also being packaged by conda-forge, the community-driven packaging ecosystem for conda. The packaging effort is collaborating with the Faiss team to ensure high-quality package builds.
Thanks to the comprehensive infrastructure of conda-forge, some build combinations may be supported there that are not available through the pytorch channel. To install, use
# CPU version
$ conda install -c conda-forge faiss-cpu
# GPU version
$ conda install -c conda-forge faiss-gpu
# NVIDIA cuVS and AMD ROCm version not yet available
You can tell which channel your conda packages come from by using conda list.
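As a minimal sketch of that check, the helper below filters the listing down to Faiss packages; the Channel column of the output shows where each one came from. The function name show_faiss_channel is hypothetical (it is not part of conda or Faiss), and the snippet assumes conda is on the PATH, printing a notice otherwise.

```shell
# Hypothetical helper: list installed packages whose name matches "faiss".
# The Channel column of `conda list` names the channel each package came from.
show_faiss_channel() {
    if command -v conda >/dev/null 2>&1; then
        conda list faiss
    else
        echo "conda not found on PATH"
    fi
}

show_faiss_channel
```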
If you are having problems using a package built by conda-forge, please raise an issue on the conda-forge package "feedstock".
Faiss can be built from source using CMake.
Faiss is supported on x86-64 machines on Linux, OSX, and Windows. It has been found to run on other platforms as well, see other platforms.
The basic requirements are:
- a C++17 compiler (with support for OpenMP version 2 or higher),
- a BLAS implementation (on Intel machines we strongly recommend using Intel MKL for best performance).
The optional requirements are:
- for GPU indices:
  - nvcc,
  - the CUDA toolkit,
- for AMD GPUs:
  - AMD ROCm,
- for using NVIDIA cuVS implementations:
  - libcuvs=24.12,
- for the python bindings:
  - python 3,
  - numpy,
  - and swig.
Indications for specific configurations are available in the troubleshooting section of the wiki.
cuVS contains state-of-the-art implementations of several algorithms for running approximate nearest neighbors and clustering on the GPU. It is built on top of the RAPIDS RAFT library of high performance machine learning primitives. Building Faiss with cuVS enabled allows a user to choose between regular GPU implementations in Faiss and cuVS implementations for specific algorithms.
The libcuvs dependency should be installed via conda:
- With CUDA 12.0 - 12.5:
conda install -c rapidsai -c conda-forge -c nvidia libcuvs=24.12 'cuda-version>=12.0,<=12.5'
- With CUDA 11.4 - 11.8
conda install -c rapidsai -c conda-forge -c nvidia libcuvs=24.12 'cuda-version>=11.4,<=11.8'
For more ways to install cuVS 24.12, refer to the RAPIDS Installation Guide.
$ cmake -B build .
This generates the system-dependent configuration/build files in the build/ subdirectory.
Several options can be passed to CMake, among which:
- general options:
  - -DFAISS_ENABLE_GPU=OFF in order to disable building GPU indices (possible values are ON and OFF),
  - -DFAISS_ENABLE_PYTHON=OFF in order to disable building python bindings (possible values are ON and OFF),
  - -DFAISS_ENABLE_CUVS=ON in order to use the NVIDIA cuVS implementations of the IVF-Flat, IVF-PQ and CAGRA GPU-accelerated indices (default is OFF, possible values are ON and OFF). Note: -DFAISS_ENABLE_GPU must be set to ON when enabling this option.
  - -DBUILD_TESTING=OFF in order to disable building C++ tests,
  - -DBUILD_SHARED_LIBS=ON in order to build a shared library (possible values are ON and OFF),
  - -DFAISS_ENABLE_C_API=ON in order to enable building C API (possible values are ON and OFF),
- optimization-related options:
  - -DCMAKE_BUILD_TYPE=Release in order to enable generic compiler optimization options (enables -O3 on gcc for instance),
  - -DFAISS_OPT_LEVEL=avx2 in order to enable the required compiler flags to generate code using optimized SIMD/Vector instructions. Possible values are:
    - on x86-64, generic, avx2, avx512, and avx512_spr (for avx512 features available since Intel(R) Sapphire Rapids), by increasing order of optimization,
    - on aarch64, generic and sve, by increasing order of optimization,
  - -DFAISS_USE_LTO=ON in order to enable Link-Time Optimization (default is OFF, possible values are ON and OFF).
- BLAS-related options:
  - -DBLA_VENDOR=Intel10_64_dyn -DMKL_LIBRARIES=/path/to/mkl/libs to use the Intel MKL BLAS implementation, which is significantly faster than OpenBLAS (more information about the values for the BLA_VENDOR option can be found in the CMake docs),
- GPU-related options:
  - -DCUDAToolkit_ROOT=/path/to/cuda-10.1 in order to hint to the path of the CUDA toolkit (for more information, see CMake docs),
  - -DCMAKE_CUDA_ARCHITECTURES="75;72" for specifying which GPU architectures to build against (see CUDA docs to determine which architecture(s) you should pick),
  - -DFAISS_ENABLE_ROCM=ON in order to enable building GPU indices for AMD GPUs (possible values are ON and OFF). -DFAISS_ENABLE_GPU must be ON when using this option.
- python-related options:
  - -DPython_EXECUTABLE=/path/to/python3.7 in order to build a python interface for a different python than the default one (see CMake docs).
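To illustrate how these options combine, here is a sketch of a CPU-only, AVX2-optimized configuration. The variable name FAISS_CMAKE_FLAGS is an assumption made for this example; every flag in it is taken from the list above. The snippet only prints the resulting command so that it can be reviewed before being run.

```shell
# Illustrative flag set (FAISS_CMAKE_FLAGS is a made-up variable name):
# CPU-only build, python bindings on, C++ tests off, Release mode, AVX2 code paths.
FAISS_CMAKE_FLAGS="-DFAISS_ENABLE_GPU=OFF -DFAISS_ENABLE_PYTHON=ON -DBUILD_TESTING=OFF -DCMAKE_BUILD_TYPE=Release -DFAISS_OPT_LEVEL=avx2"

# Dry run: print the configure command instead of executing it.
echo "cmake -B build . $FAISS_CMAKE_FLAGS"
```

To actually configure the build, run the printed command from the root of the source tree.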
$ make -C build -j faiss
This builds the C++ library (libfaiss.a by default, and libfaiss.so if -DBUILD_SHARED_LIBS=ON was passed to CMake).
The -j option enables parallel compilation of multiple units, leading to a faster build, but increasing the chances of running out of memory, in which case it is recommended to set the -j option to a fixed value (such as -j4).
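One way to choose a fixed value is to derive it from the number of CPUs and cap it. The following is only a sketch: the variable name njobs and the cap of 4 are assumptions for illustration, and the right limit depends on how much memory the machine has. The snippet prints the build command rather than running it.

```shell
# Detect the number of online CPUs; fall back to 1 if it cannot be determined.
njobs=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
case "$njobs" in ''|*[!0-9]*) njobs=1 ;; esac
if [ "$njobs" -lt 1 ]; then njobs=1; fi

# Cap the job count (4 is an arbitrary, memory-friendly limit for this sketch).
if [ "$njobs" -gt 4 ]; then njobs=4; fi

# Dry run: print the build command instead of executing it.
echo "make -C build -j$njobs faiss"
```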
If making use of optimization options, build the correct target before swigfaiss.
For AVX2:
$ make -C build -j faiss_avx2
For AVX512:
$ make -C build -j faiss_avx512
For AVX512 features available since Intel(R) Sapphire Rapids:
$ make -C build -j faiss_avx512_spr
This will ensure the creation of necessary files when building and installing the python package.
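The relationship between the FAISS_OPT_LEVEL value and the make target to build first can be sketched as a small lookup. The helper name faiss_target_for is hypothetical, and mapping generic to the plain faiss target is an assumption based on the default build command; only the targets named above are covered.

```shell
# Hypothetical helper: map a FAISS_OPT_LEVEL value to the make target to build first.
# Only the levels whose targets are named in this document are covered.
faiss_target_for() {
    case "$1" in
        generic)    echo "faiss" ;;
        avx2)       echo "faiss_avx2" ;;
        avx512)     echo "faiss_avx512" ;;
        avx512_spr) echo "faiss_avx512_spr" ;;
        *)          echo "no target known for opt level: $1" >&2; return 1 ;;
    esac
}

# Dry run: print the two build steps, in order, for an AVX2 build.
echo "make -C build -j $(faiss_target_for avx2)"
echo "make -C build -j swigfaiss"
```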
$ make -C build -j swigfaiss
$ (cd build/faiss/python && python setup.py install)
The first command builds the python bindings for Faiss, while the second one generates and installs the python package.
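A quick sanity check after installing is to import the module and print its version. The sketch below wraps that in a helper; check_faiss_python is a hypothetical name, and the snippet assumes that python refers to the interpreter the package was installed into.

```shell
# Hypothetical helper: print the installed Faiss version,
# or a hint if the module cannot be imported.
check_faiss_python() {
    python -c "import faiss; print(faiss.__version__)" 2>/dev/null \
        || echo "faiss is not importable from this Python environment"
}

check_faiss_python
```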
$ make -C build install
This will make the compiled library (either libfaiss.a or libfaiss.so on Linux) available system-wide, as well as the C++ headers. This step is not needed to install the python package only.
To run the whole test suite, make sure that cmake was invoked with -DBUILD_TESTING=ON, and run:
$ make -C build test
$ (cd build/faiss/python && python setup.py build)
$ PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
A basic usage example is available in demos/demo_ivfpq_indexing.cpp.
It creates a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.
It can be built with
$ make -C build demo_ivfpq_indexing
and subsequently run with
$ ./build/demos/demo_ivfpq_indexing
$ make -C build demo_ivfpq_indexing_gpu
$ ./build/demos/demo_ivfpq_indexing_gpu
This builds and runs the GPU equivalent of the CPU demo_ivfpq_indexing. It also shows how to transfer indexes to and from a GPU.
A longer example runs and evaluates Faiss on the SIFT1M dataset. To run it, please download the ANN_SIFT1M dataset from http://corpus-texmex.irisa.fr/ and unzip it to the subdirectory sift1M at the root of the source directory for this repository.
Then compile and run the following (after ensuring you have installed faiss):
$ make -C build demo_sift1M
$ ./build/demos/demo_sift1M
This is a demonstration of the high-level auto-tuning API. You can try setting a different index_key to find the indexing structure that gives the best performance.
The following script extends the demo_sift1M test to several types of indexes. This must be run from the root of the source directory for this repository:
$ mkdir tmp # graphs of the output will be written here
$ python demos/demo_auto_tune.py
It will cycle through a few types of indexes and find optimal operating points. You can play around with the types of indexes.
The example above also runs on GPU. Edit demos/demo_auto_tune.py at line 100 with the values
keys_to_test = keys_gpu
use_gpu = True
and you can run
$ python demos/demo_auto_tune.py
to test the GPU code.
Footnotes
[1] The vector search and clustering algorithms in NVIDIA RAFT have been formally migrated to NVIDIA cuVS. This package is being renamed to faiss-gpu-cuvs in the next stable release, which will use these GPU implementations from the pre-compiled libcuvs=24.12 binary.