Fix read consistency when performing batch search #587

Closed
wants to merge 117 commits into from

@Apmats (Contributor) commented Apr 9, 2024

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

NirantK and others added 30 commits August 31, 2023 11:36
* * fix(qdrant_fastembed.py): change method name from encode to embed in QdrantFastembedMixin class
* fix(qdrant_fastembed.py): change method name from encode to embed

* * fix(qdrant_fastembed.py): change DEFAULT_EMBEDDING_MODEL from "sentence-transformers/all-MiniLM-L6-v2" to "BAAI/bge-small-en"

* * fix(qdrant_fastembed.py): add support for different embedding types in QdrantFastembedMixin
* feat(qdrant_fastembed.py): add embed_type parameter to embed method in QdrantFastembedMixin

* * fix(qdrant_fastembed.py): change vector field name from "text-{model_name}" to "fast-{model_name}"

* * fix(qdrant_fastembed.py): add comments to clarify code intent

* * refactor(qdrant_fastembed.py): remove unused import 'Sequence' from typing module

* * test(test_fast_embed.py): refactor test_add_without_query function to simplify parameter handling and remove unnecessary checks and assignments

* * fix(qdrant_fastembed.py): fix type error in NamedVector Query

* * fix(qdrant_fastembed.py): change method name from `embed` to `query_embed` in QdrantFastembedMixin class
* fix(qdrant_fastembed.py): fix query_vectors assignment in `query` method of QdrantFastembedMixin class
* add tests for vector operations

* rename vectors->points in vector update API

* fix tests
* tests: add some values count congruence tests

* fix: fix incorrect behaviour of value_counts, isempty, isnull in local mode

* fix: update isempty behaviour

* tests: add test cases for isnull and isempty
* sync API

* fix tests

* backward compatibility version up
* up version

* up integration tests version
* fix fastembed for pydantic 1.x

* add extras setup in readme

* move fastembed from group to extras

* lock
* fix: add init from conversion, support init from in local mode

* fix: fix mypy

* more test coverage

---------

Co-authored-by: generall <[email protected]>
* WIP: recommendation api update

* implement local new reco + add new tests

* refactor calculate_best_scores, normalize score assertion

* fix raw vectors in new recommend

* fix mypy errors, add new reco on group recommend

* fix more lints from ci

* fix pyright lint

* formatting

* Add descriptions to enums

* no mutable default argument

* edit score comparison precision based on magnitude

* Address review comments

* remove custom message for type ignore

* add coverage test

* improve coverage test a little

* add more fixtures for conversion test

---------

Co-authored-by: Luis Cossío <[email protected]>
* regenerate client

* regen grpc client

* document build process

* work from Zein in #272

* update rest client with optional field

* code refactor and better naming

* remove shapely as a dependency

* fix typing

---------

Co-authored-by: zzzz-vincent <[email protected]>
Co-authored-by: generall <[email protected]>
* update payload selector and tests

* fix mypy
* new: manually implemented async qdrant client

* fix: remove await before sync call

* fix: make upload collection, records and migrate synchronous

* fix: add init method to async client base

* refactoring: remove redundant import

* new: add super().__init__ in qdrant remote, update import in http

* new: mvp async qdrant client generator

* new: fix mypy, update generator script, refactoring

* fix: fix test script

* new: update generator launch script, update async files

* new: refactor async client generator

* refactoring: remove redundant operations, add comments, refactor

* new: add isort, black and autoflake to dev dependencies

* fix: add more checks, fix type hints

* fix: do not check types in async client generator for python3.8

* new: do not type check async_qdrant_fastembed

* fix: fix pyright run

* new: update async client tests

* fix: update versions in CI

* new: update pre-commit python version, update autogenerated files

* new: update generated files, add tests for async generator, update generator script

* fix: exclude generator test from 3.8

* fix: fix python version condition

* fix: add python target version for black

* fix: generate async client only on python 3.9
* update interface and version for fastembed

* fix types

* fix types

* regen async

* use python 3.11 to check compatibility

* fix docstring

* regen async

* propagate batch size
joein and others added 27 commits February 8, 2024 18:30
* apply fast_sigmoid fn to context pair score

* remove redundant else statement

* better NaN and float32 handling

* remove unused import
* generate rest client (only points_api)

* generate grpc client (points part only)

* add local mode implementation

* update collections_api.py

* add grpc conversions

* fix problematic StartFrom Union order

* add basic congruence test

* route order_by in qdrant clients

* fix local mode `start_from` logic

* add library stubs for dateutil

* generate async client

* generate the rest of the rest client 👻

* test datetime values too

* add int Range and int StartFrom

* update points.proto and points_pb2.py

* fix conversion of range interface

* add conversion fixtures and use better conversion of datetime to timestamp

* remove integer range

* restore `value_by_key()`

* OrderBy grpc to rest conversions

* generate async client

* use `OrderByInterface` instead of only `OrderBy`

* - use equivalent qdrant core datetime parsing,
- use type alias for OrderingValue, instead of custom class,
- better random date generation

* Drop custom datetime parser implementation

* nit fixes

* fix flakiness of the congruence test by subsorting

* better datetime to timestamp conversion

* uncomment conversion fixtures, add Direction

* restore poetry.lock and remove dateutil dep

* poetry lock --no-update

* rename datetime.py to datetime_utils.py

* add StrictInt to StartFrom union

* use more date formats in fixtures

* move conversion test to `test_validate_conversions.py`

* drop `%:z` formats

* add more complex timezones to payload fixtures, fix deserialization

* update datetime parsing test
* update grpc client

* update rest client
* Fix for close() method of QdrantLocal class

* new: update async client

---------

Co-authored-by: George Panchuk <[email protected]>
#480)

* expose grpc channel-level compression settings in base functions

* expose grpc channel-level compression settings in remote classes

* expose grpc channel-level compression settings in client

* raise TypeError for compression

* added test cases for gRPC channel-level compression

* move grpc_compression parameter from client's signature to **kwargs

* use grpc.Compression instead of creating new enum qdrant.grpc.Compression in qdrant/grpc/__init__.py

* refactor grpc_compression type hint

* fix: Compression instead of grpc.Compression in type hint

* tests: move and update tests

* chore: remove magic method

* fix: fix async client generator, update precommit dependencies

* fix: update isort options

* fix: update dev dependencies

---------

Co-authored-by: George Panchuk <[email protected]>
* fix typo

gPRC -> gRPC

* generate async client

---------

Co-authored-by: TJ Bai <[email protected]>
Co-authored-by: Luis Cossío <[email protected]>
* new: add min should clause

* fix: local mode fix payload filter type hint

* tests: extend conversion

* fix: fix param name

* fix: fix grpc structure, update name
* new: add collection exists interface

* tests: fix version comparison

* tests: add collection exists type stub, add async test
* fix: turn on test no fastembed

* fix: ignore input

* Update fastembed tests job name

Co-authored-by: Luis Cossío <[email protected]>

---------

Co-authored-by: Luis Cossío <[email protected]>
* Update fastembed to v0.2.1

* chore(qdrant_fastembed.py): update DEFAULT_EMBEDDING_MODEL

* fix(fastembed integration): upgrade to latest version

* Prefer black over ruff

* Prefer black over ruff

* Remove hardcoded directory structure from Qdrant Client checks

* new: deprecate current default model, deprecate max token length, update fastembed

* fix: make embedding_model_name method sync

* fix: update poetry lock

* refactor: use list_supported_models() (#501)

* fix: fix fastembed check

* fix: fix fastembed class var assignment

* fix: remove fastembed deprecation from qdrant client (#524)

---------

Co-authored-by: George Panchuk <[email protected]>
Co-authored-by: Anush <[email protected]>
* fix: do not set tz if it is not provided in local mode

* refactor: remove commented out code

* fix: rollback tzinfo, add tzinfo to conditions in local mode

* fix: slightly increase chance of datetime payload generation
* fix: handle tzinfo in HH format in local mode

* fix: fix mypy

* fix: replace dt split by trying adding minutes to tz (#540)

* refactoring: remove redundant import
* new: propagate timeout from methods to httpx

* refactor: refactor timeout propagation (#535)

* tests: add timeout test

* refactor: remove redundant kwargs
* fix: fix mypy and pyright

* fix: fix nested key case, add tests

* fix: fix mypy

* fix: fix mypy again

* new: extend jsonpath support, update set and get value by key in local mode

* fix: uncomment tests, fix set payload call, fix corner case

* refactor: split payload_value extractor into several files

* docs: update docstring

* refactor: refactor local mode payload setter (#544)

* refactor: refactor local mode payload setter

* fix: fix mypy

* fix: return docstring

* fix: address review issues

* fix: fix extraction from by invalid keys

* fix: move payload tests to a separate folder to be recognized by pytest

* tests: add nested array filters

* review fixes

* fix: fix async

* fix: set payload by key handle escaped quotes

---------

Co-authored-by: generall <[email protected]>
* WIP: hybrid search with fastembed

* hybrid queries with fastembed

* test for hybrid

* fix typo

* new: extend hybrid search tests, fix mypy, small refactoring (#554)

* refactor: align model name parameters in setters, update tests

* fix: fix async

* fix: add a good test, fix sparse vectors in query batch

* refactoring: reduce branching, refactor fastembed tests

---------

Co-authored-by: George <[email protected]>
* fix: remove redundant pytest import breaking qdrant-client

* refactoring: remove redundant imports
netlify bot commented Apr 9, 2024

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 8d7d31d
🔍 Latest deploy log https://app.netlify.com/sites/poetic-froyo-8baba7/deploys/6615108d32f5030008dd1264
😎 Deploy Preview https://deploy-preview-587--poetic-froyo-8baba7.netlify.app
@Apmats Apmats changed the base branch from master to dev April 9, 2024 09:56
@Apmats Apmats closed this Apr 9, 2024