-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Random sample in local mode #705
Random sample in local mode #705
Conversation
remote_client = init_remote(prefer_grpc=prefer_grpc) | ||
init_client(remote_client, fixture_points) | ||
|
||
compare_client_results(local_client, remote_client, searcher.random_query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the random implementations between local and server equivalent?
I would expect those to return different values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this test we basically sample all the points available in the collection
In random_query
we use limit=100
and we generate 100 points in fixture_points = generate_fixtures(100)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We return all points postprocessed to be sorted by ID. Just to make sure we can return all of them
random_scores = np.random.rand(len(self.ids)) | ||
random_order = np.argsort(random_scores) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: np.random.permutation(self.inv_ids)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we need the internal ids to filter against the mask, not the external ones
my comments are mostly nit, other than that looks good |
c49579f
to
41c2fd5
Compare
* pre-implement random sampling * generate models * add conversions and tests * fix mypy lints * tests: add test for sample random conversion * use camelcase Sample.Random * review fixes * fix mypy --------- Co-authored-by: George Panchuk <[email protected]>
* universal-search: Query Group API and local mode * requires different logic for gRPC error assertion * you come to me at runtime for a compile time issue * suddenly throwing a different error * test more group key types * extend limit of prefetches during group_by * rescoring is the issue * one problem at a time please * code review * regen clients * code review * add lookup_from to query_points_groups * test with_lookup * test and fix gRPC * drop dedicated conversion * Update qdrant_client/qdrant_client.py Co-authored-by: Luis Cossío <[email protected]> * regen async * Distribution-based score fusion in local mode (#703) * pre-implement dbsf * add dbsf congruence tests * mypy lints * add conversions * tests: add test for dbsf conversion --------- Co-authored-by: George Panchuk <[email protected]> * Random sample in local mode (#705) * pre-implement random sampling * generate models * add conversions and tests * fix mypy lints * tests: add test for sample random conversion * use camelcase Sample.Random * review fixes * fix mypy --------- Co-authored-by: George Panchuk <[email protected]> * fix: add type ignore for mypy * fix: fix type hints for 3.8 * fix: do not run mypy on async client generator in CI, simplify condition * Grpc comparison in tests (#726) * add parametrized fixture for using grpc too * compare grpc and http without running each setup twice * fix: fix exception types in invalid types test * fix: remove random seed which led to a erroneous sequence --------- Co-authored-by: Luis Cossío <[email protected]> --------- Co-authored-by: Luis Cossío <[email protected]> Co-authored-by: George Panchuk <[email protected]>
* universal-search: Query Group API and local mode * requires different logic for gRPC error assertion * you come to me at runtime for a compile time issue * suddenly throwing a different error * test more group key types * extend limit of prefetches during group_by * rescoring is the issue * one problem at a time please * code review * regen clients * code review * add lookup_from to query_points_groups * test with_lookup * test and fix gRPC * drop dedicated conversion * Update qdrant_client/qdrant_client.py Co-authored-by: Luis Cossío <[email protected]> * regen async * Distribution-based score fusion in local mode (#703) * pre-implement dbsf * add dbsf congruence tests * mypy lints * add conversions * tests: add test for dbsf conversion --------- Co-authored-by: George Panchuk <[email protected]> * Random sample in local mode (#705) * pre-implement random sampling * generate models * add conversions and tests * fix mypy lints * tests: add test for sample random conversion * use camelcase Sample.Random * review fixes * fix mypy --------- Co-authored-by: George Panchuk <[email protected]> * fix: add type ignore for mypy * fix: fix type hints for 3.8 * fix: do not run mypy on async client generator in CI, simplify condition * Grpc comparison in tests (#726) * add parametrized fixture for using grpc too * compare grpc and http without running each setup twice * fix: fix exception types in invalid types test * fix: remove random seed which led to a erroneous sequence --------- Co-authored-by: Luis Cossío <[email protected]> --------- Co-authored-by: Luis Cossío <[email protected]> Co-authored-by: George Panchuk <[email protected]>
Implements random sampling in local mode and adds the required conversions