efficiency of remove_ids and updating vector #1021

Closed
Dr1rD opened this issue Nov 9, 2019 · 3 comments
Comments

@Dr1rD

Dr1rD commented Nov 9, 2019

Hi, I'm using an IndexBinaryIVF index, which currently holds 300 million vectors. Search, add_with_ids and remove_ids requests arrive in random order all the time. My questions are:

  1. The remove_ids method is time-consuming. It takes 400 ms to remove a single vector.
    id_sel = faiss.IDSelectorBatch(id.size, faiss.swig_ptr(id))
    index.remove_ids(id_sel)
    Is there a way to improve the efficiency of remove_ids?
  2. As far as I know, faiss does not enforce unique ids. So, to update one vector, I call remove_ids on its id before calling add_with_ids.
    Is there a better way to implement vector updates?

Thanks.

@XueRonger

I have the same questions.

@mdouze
Contributor

mdouze commented Nov 19, 2019

As stated in
https://github.com/facebookresearch/faiss/wiki/Special-operations-on-indexes#removing-elements-from-an-index
remove_ids does a full pass over the index, which is why it is so slow.
There is an update_vectors method, but it is implemented only for IVFFlat.

It is possible to extend update_vectors to other index types and to implement an efficient single-element remove. I'll mark this as an enhancement.
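
To illustrate the difference between a full-pass remove and a single-element remove, here is a minimal pure-Python sketch. It is not faiss code: ToyInvertedLists, direct_map, remove_by_scan, remove_one and update are hypothetical names made up for this example, and the id-to-position table is just one assumed way such a remove could work. The idea is that a side table from id to (list number, offset) lets one entry be deleted by swapping it with the last entry of its list, without scanning the whole index.

```python
class ToyInvertedLists:
    """Toy stand-in for an IVF-style index; NOT the faiss implementation."""

    def __init__(self, nlist):
        # One Python list of (id, code) entries per inverted list.
        self.lists = [[] for _ in range(nlist)]
        # Hypothetical side table: id -> (list number, offset in that list).
        self.direct_map = {}

    def add_with_id(self, list_no, vec_id, code):
        if vec_id in self.direct_map:
            raise KeyError(f"id {vec_id} is already stored")
        self.direct_map[vec_id] = (list_no, len(self.lists[list_no]))
        self.lists[list_no].append((vec_id, code))

    def remove_by_scan(self, ids):
        """Full pass over every entry: cost grows with the index size."""
        ids = set(ids)
        removed = 0
        for list_no, entries in enumerate(self.lists):
            kept = [entry for entry in entries if entry[0] not in ids]
            removed += len(entries) - len(kept)
            self.lists[list_no] = kept
        # Offsets have shifted, so the side table is rebuilt from scratch.
        self.direct_map = {
            vec_id: (list_no, offset)
            for list_no, entries in enumerate(self.lists)
            for offset, (vec_id, _) in enumerate(entries)
        }
        return removed

    def remove_one(self, vec_id):
        """Single-element remove: look up the position, swap in the last entry."""
        list_no, offset = self.direct_map.pop(vec_id)
        entries = self.lists[list_no]
        last = entries.pop()
        if offset < len(entries):
            # The removed entry was not the last one: fill its slot.
            entries[offset] = last
            self.direct_map[last[0]] = (list_no, offset)

    def update(self, list_no, vec_id, code):
        """Update = remove the old entry (if present), then add the new one."""
        if vec_id in self.direct_map:
            self.remove_one(vec_id)
        self.add_with_id(list_no, vec_id, code)


if __name__ == "__main__":
    index = ToyInvertedLists(nlist=2)
    index.add_with_id(0, 10, b"a")
    index.add_with_id(0, 11, b"b")
    index.update(1, 11, b"z")  # id 11 moves from list 0 to list 1
    print(index.lists)
```

The trade-off in this sketch is extra memory for the side table and the fact that remove_one changes the order of entries inside a list, so any code that relies on stable offsets would need to consult direct_map.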

@mdouze
Contributor

mdouze commented Mar 30, 2020
