On GPU haystack should install faiss-gpu instead of faiss-cpu #454

Closed
lalitpagaria opened this issue Sep 30, 2020 · 2 comments
Labels
type:bug Something isn't working

Comments

@lalitpagaria
Contributor

Describe the bug
When installing Haystack via pip on a GPU machine, faiss-cpu gets installed. Ideally, faiss-gpu should be used on GPU. The deepset/haystack-gpu Docker image has the same issue, since it uses the same requirements.txt.

Error message
NA

Expected behavior
As stated in #446, making the faiss dependency optional would solve this. Creating environment-specific requirements files would also help users install additional dependencies, as proposed on Stack Overflow.

Additional context
NA

To Reproduce
Run on Colab or check the output of this tutorial:

!nvidia-smi

!pip install git+https://github.com/deepset-ai/haystack.git
...
...
Collecting faiss-cpu
  Downloading https://files.pythonhosted.org/packages/1d/84/9de38703486d9f00b1a63590887a318d08c52f10f768968bd7626aee75da/faiss_cpu-1.6.3-cp36-cp36m-manylinux2010_x86_64.whl (7.2MB)
     |████████████████████████████████| 7.2MB 15.4MB/s 
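As a quick way to confirm the behaviour above in a given environment, the following sketch lists which FAISS distributions pip has installed. It relies only on the standard-library `importlib.metadata` module (Python 3.8+); the helper name `installed_faiss_distributions` is made up for this illustration and is not part of Haystack or FAISS.

```python
from importlib import metadata


def installed_faiss_distributions():
    """Return {distribution name: version} for any installed FAISS wheels.

    Hypothetical helper for illustration; it only inspects pip metadata
    and does not import faiss itself.
    """
    found = {}
    for name in ("faiss-cpu", "faiss-gpu"):
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed; skip it.
            continue
    return found


if __name__ == "__main__":
    print(installed_faiss_distributions())
```

On a GPU runtime where the install pulled in faiss-cpu, as in the log above, the result would be expected to contain a `faiss-cpu` entry and no `faiss-gpu` entry.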

System:

  • GPU/CPU: GPU
  • Haystack version (commit or version number): latest
@lalitpagaria lalitpagaria added the type:bug Something isn't working label Sep 30, 2020
@tholor
Member

tholor commented Oct 1, 2020

Good point. So far this is only present as a comment:

# for using FAISS with GPUs, install faiss-gpu
faiss-cpu

We wanted to add Windows-specific installation in setup.py anyway (#446), so we could add a GPU-specific case there, too.
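One way such a GPU-specific case might look is sketched below. This is not Haystack's actual setup.py: the extras names, the `faiss_requirement` helper, and the `nvidia-smi` heuristic are all assumptions made for illustration. The idea is to keep FAISS out of the core requirements and expose CPU and GPU variants as optional extras.

```python
import shutil


def gpu_available() -> bool:
    """Crude heuristic (an assumption): a visible nvidia-smi binary suggests an NVIDIA GPU."""
    return shutil.which("nvidia-smi") is not None


def faiss_requirement(has_gpu: bool) -> str:
    """Pick the FAISS distribution name for the given hardware."""
    return "faiss-gpu" if has_gpu else "faiss-cpu"


# Hypothetical extras that could be passed to setuptools.setup(extras_require=...),
# so users would opt in with e.g. `pip install <package>[faiss-gpu]`.
EXTRAS_REQUIRE = {
    "faiss": ["faiss-cpu"],
    "faiss-gpu": ["faiss-gpu"],
}

if __name__ == "__main__":
    print("Suggested FAISS package:", faiss_requirement(gpu_available()))
```

Explicit extras are generally more predictable than detecting hardware at install time, because a wheel built on one machine can be installed on another; the detection helper is shown only to suggest which extra to pick.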

@tholor
Member

tholor commented Oct 5, 2020

So far we couldn't see significant speed-ups with GPU. From our current understanding, a GPU only helps for datasets with more than 1 million documents. Below that, the overhead of copying data to the GPU is probably too large. Furthermore, the retriever usually handles single queries, while GPUs shine with "batch queries".

In our particular case, we will therefore keep the FAISS CPU version in the Docker image because a) typical users will have datasets with fewer than 1 million docs and b) even for bigger datasets it's currently beneficial to reserve the GPU for the reader (with concurrent API requests we get better utilization of all available resources).

Therefore, closing this for now. We might do more detailed benchmarking in the future.

@tholor tholor closed this as completed Oct 5, 2020