Faiss document store create duplicate vector_ids #392
Comments
It will be fixed by PR #385.
Hey @lalitpagaria, good catch! I can reproduce this bug. I saw #385, but I think we can have a simpler solution for this bug. I will therefore create a separate PR so that the fix is not mixed with the bigger changes of #385, which will need a more careful review and discussion.
Fixed by #395.
I'm not sure if this is related. I'll try specifying index_buffer_size. My version is very fresh, from the source.
Describe the bug
Faiss document store will generate duplicate vector ids when the number of documents in the write_documents and update_embeddings functions is greater than the configured index_buffer_size.
Error message
No error message
Expected behavior
All written documents should have unique vector_ids
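A minimal sketch of how this expectation could be checked, assuming the vector_ids of the written documents have already been collected into a plain list. The helper name duplicate_vector_ids is hypothetical and is not part of the Faiss document store API.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def duplicate_vector_ids(vector_ids: Iterable[Hashable]) -> List[Hashable]:
    """Return every vector_id that occurs more than once, in first-seen order.

    Hypothetical helper for illustration only; how the ids are read back
    from the document store is left to the caller.
    """
    counts = Counter(vector_ids)
    return [vector_id for vector_id, n in counts.items() if n > 1]


# An empty result means every document got a unique vector_id.
print(duplicate_vector_ids(["0", "1", "2", "3"]))
# A non-empty result lists the ids that were handed out more than once.
print(duplicate_vector_ids(["0", "1", "0", "1", "0"]))
```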
Additional context
Using the enumerate index as the vector_id causes this issue. My PR will fix it, because it refactors the vector_id generation logic.
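The mechanism can be illustrated with a small standalone sketch. It assumes that documents are processed in batches of index_buffer_size and that the id is taken from an enumerate call inside each batch; the function names below are hypothetical and this is not the actual document store code. The second function shows one possible remedy, a running offset carried across batches, which is not necessarily the approach taken in #385 or #395.

```python
from typing import Iterable, List


def chunks(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def assign_ids_per_batch(docs: List[str], index_buffer_size: int) -> List[int]:
    """Assumed buggy pattern: the enumerate index restarts at 0 in every batch."""
    ids = []
    for batch in chunks(docs, index_buffer_size):
        for i, _doc in enumerate(batch):
            ids.append(i)
    return ids


def assign_ids_with_offset(
    docs: List[str], index_buffer_size: int, first_id: int = 0
) -> List[int]:
    """One possible remedy: add a running offset that survives across batches."""
    ids = []
    next_id = first_id
    for batch in chunks(docs, index_buffer_size):
        for i, _doc in enumerate(batch):
            ids.append(next_id + i)
        next_id += len(batch)  # advance past the ids used by this batch
    return ids


docs = [f"doc-{n}" for n in range(5)]
print(assign_ids_per_batch(docs, index_buffer_size=2))    # [0, 1, 0, 1, 0]
print(assign_ids_with_offset(docs, index_buffer_size=2))  # [0, 1, 2, 3, 4]
```

With five documents and a buffer size of two, the per-batch version hands out ids 0 and 1 repeatedly, while a buffer size of five or more hides the problem because only one batch is processed.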
To Reproduce
The following test data will reproduce this issue
System:
All systems