MB-64513: Removing modification of max codes based on filtered document size #40

Merged: 1 commit into master on Dec 17, 2024

@Likith101 Likith101 commented Dec 11, 2024

Currently, max codes is the number of vectors scanned at the faiss level. Without prefiltering, every scanned vector also incurs a distance computation.

With prefiltering, however, the logic is slightly different. Max codes is reduced from a percentage of the total number of vectors to a percentage of the filtered documents, which implies that we scan max codes percent of the filtered vectors. The faiss code paths show that this is not the case: faiss scans up to max codes vectors and checks each one against the filter before the distance computation. Max codes therefore takes precedence over filtering, and there is a very high chance that the scan stops before it reaches the filtered vectors.
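The interaction described above can be illustrated with a toy model. This is a minimal sketch, not faiss code: `scan_with_budget`, the scan order, and the ID values are all hypothetical, and it assumes that every visited vector counts against the budget whether or not it passes the filter.

```python
def scan_with_budget(scan_order, allowed_ids, max_codes):
    """Toy model of a budgeted scan with a prefilter (not the faiss implementation).

    Assumption: every visited vector counts against max_codes, and the
    filter only decides whether a distance is computed for that vector.
    """
    scanned = 0
    computed = []  # IDs for which a distance would be computed
    for vec_id in scan_order:
        if scanned >= max_codes:
            break  # budget exhausted; remaining vectors are never visited
        scanned += 1
        if vec_id in allowed_ids:
            computed.append(vec_id)
    return scanned, computed


# 100 vectors in scan order; the filter admits every tenth one (10 in total).
scan_order = list(range(100))
allowed_ids = set(range(5, 100, 10))

# Budget sized as 50% of the filtered set (5): no eligible vector is reached.
print(scan_with_budget(scan_order, allowed_ids, max_codes=5))   # (5, [])

# Budget sized as 50% of all vectors (50): half of the eligible vectors are reached.
print(scan_with_budget(scan_order, allowed_ids, max_codes=50))  # (50, [5, 15, 25, 35, 45])
```

In this model, shrinking the budget to match the filtered set only shortens the scan; it does nothing to steer the scan toward vectors that pass the filter.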

This fix removes the dependency between prefiltering and max codes: max codes is no longer reduced from a percentage of the total vector count to a percentage of the filtered vector count.
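The change in how the budget is sized might be sketched as follows. The function names, the `percent` parameter, and the counts are assumptions made for illustration; the actual code touched by this PR is not shown in this conversation.

```python
def max_codes_before_fix(percent, total_vectors, filtered_vectors=None):
    """Hypothetical pre-fix sizing: the budget shrinks when a prefilter is present."""
    base = total_vectors if filtered_vectors is None else filtered_vectors
    return base * percent // 100


def max_codes_after_fix(percent, total_vectors, filtered_vectors=None):
    """Hypothetical post-fix sizing: the filtered count is ignored."""
    return total_vectors * percent // 100


# Illustrative numbers: 1,000,000 vectors, 2,000 of which pass the filter.
print(max_codes_before_fix(10, 1_000_000, filtered_vectors=2_000))  # 200
print(max_codes_after_fix(10, 1_000_000, filtered_vectors=2_000))   # 100000
```

With these illustrative numbers, the pre-fix budget would let the scan visit only 200 of the 1,000,000 vectors, while the post-fix budget is the same with or without a filter.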

@Likith101 Likith101 changed the title Removing modification of max codes based on filtered document size MB-64513: Removing modification of max codes based on filtered document size Dec 16, 2024
@Likith101 Likith101 merged commit 2127bb0 into master Dec 17, 2024
@metonymic-smokey

I will update the design doc to reflect this change.

@abhinavdangeti abhinavdangeti deleted the maxCodes branch December 17, 2024 14:57