Fix read consistency when performing batch search #587

Closed
wants to merge 117 commits into from
Commits
9c62dae
Fix API Param in README
NirantK Aug 31, 2023
c2403e3
Merge pull request #274 from qdrant/Fix-API-parameter-in-README
NirantK Aug 31, 2023
38b9ef2
sync indexed_only and binary quant params
generall Aug 21, 2023
001591c
Fixes for test, type errors (#266)
NirantK Aug 23, 2023
67d2538
add tests for vector operations (#275)
generall Sep 4, 2023
02935f8
Congruence values count (#268)
joein Sep 4, 2023
41fd801
sync API (#280)
generall Sep 7, 2023
cc34721
fix README for fastembed
generall Sep 7, 2023
c1e640f
V1.5.0 (#281)
generall Sep 7, 2023
27d4e21
fix fastembed for pydantic 1.x (#285)
generall Sep 11, 2023
4aa7e4b
bump to v1.5.1 (#287)
joein Sep 11, 2023
52db932
fix: explicitly convert np vector to list for previous versions of py…
joein Sep 11, 2023
352aee6
bump to v1.5.2 (#290)
joein Sep 11, 2023
9777800
bump version 1.5.3
generall Sep 12, 2023
3fca54a
new: add kwargs to QdrantBase (#292)
joein Sep 12, 2023
82ecb10
bump version 1.5.4
generall Sep 12, 2023
63a5b3e
fix: fix type hint for get_args_subscribed and batch operation (#301)
joein Sep 25, 2023
a8e47f1
fix: add init from conversion, support init from in local mode (#317)
joein Sep 29, 2023
4ebc828
Recommendation api update + local mode (#314)
generall Oct 5, 2023
d2a8b43
python 3.12 support (#326)
generall Oct 6, 2023
5b155f8
Geo polygons support (#325)
agourlay Oct 6, 2023
66ae8f7
update payload selector and tests (#330)
generall Oct 6, 2023
056e5a1
bump version 1.6.0
generall Oct 6, 2023
ca0eb70
Adding forgotton wait to the upload records (#332)
alimohammad1995 Oct 7, 2023
7204c31
Merge pull request #331 from qdrant/v1.6.0
timvisee Oct 9, 2023
2a07215
Async qdrant client (#319)
joein Oct 13, 2023
0cfc45a
update interface and version for fastembed (#340)
generall Oct 16, 2023
834e40c
fix: convert score to float in local mode for pydantic (#337)
joein Oct 16, 2023
30788ed
update async readme
generall Oct 16, 2023
cb2159e
bump version 1.6.1
generall Oct 16, 2023
ea9aa64
fix: add missing parameters in recommend batch (#338)
joein Oct 15, 2023
98c2de1
bump version 1.6.2
generall Oct 16, 2023
3ba3c1f
fix _embed_documents performance
generall Oct 16, 2023
3de2d67
regen async
generall Oct 16, 2023
a54448b
bump version 1.6.3
generall Oct 16, 2023
2b53d10
new: async local mode generator (#345)
joein Oct 20, 2023
6875996
Bump version to 1.6.4
joein Oct 26, 2023
3253f99
Merge pull request #342 from qdrant/docs-link
NirantK Oct 19, 2023
6504774
Merge pull request #349 from qdrant/docs-tinker
NirantK Oct 24, 2023
bd8549e
Add Executable Quickstart Example (#352)
NirantK Oct 26, 2023
82ad21a
Rewrite Docstrings + Change Netlify build to branch from Pip (#359)
NirantK Oct 31, 2023
86b38a1
Add New API to Docs (#361)
NirantK Nov 1, 2023
d610061
Remove upload collection example and update (#366)
NirantK Nov 2, 2023
8692dd4
new: add multithreading for local mode in Linux (#369)
joein Nov 14, 2023
e207d1d
bump version 1.6.5
generall Nov 15, 2023
91d44df
force_disable_check_same_thread flag (#376)
generall Nov 15, 2023
f1faca4
bump version 1.6.6
generall Nov 15, 2023
ff7a152
fix LocalCollection initialization
generall Nov 15, 2023
c5c02ea
bump version 1.6.7
generall Nov 15, 2023
a3748ac
fix async_qdrant_local
generall Nov 15, 2023
59b8f91
bump version 1.6.8
generall Nov 15, 2023
994dc01
fix force_disable_check_same_thread for LocalCollection load again
generall Nov 15, 2023
8875b45
bump version 1.6.9
generall Nov 15, 2023
fd2bf9c
update recommend formula
coszio Nov 1, 2023
298e67b
Add Discovery API and local mode (#368)
coszio Nov 22, 2023
e7970ef
new: switch async generator version to 3.10.x (#387)
joein Nov 30, 2023
3d3e09f
new: prohibit setting more than one param from host, url, location, p…
joein Nov 30, 2023
cbc9dda
Sparse vectors API and local mode (#378)
agourlay Dec 2, 2023
e53bc6f
Manhattan and shard key (#391)
generall Dec 4, 2023
621030e
New public sparse vector configuration (#393)
agourlay Dec 6, 2023
6cc515c
new: add wait parameters to snapshots, update status codes, update ty…
joein Dec 6, 2023
6426d7f
new: update poetry.lock
joein Dec 7, 2023
38dfa12
bump version to v1.7.0
joein Dec 7, 2023
6d019e6
Merge pull request #395 from qdrant/release-v1.7.0
timvisee Dec 8, 2023
efb876f
Fix Vector Count Mismatch during Client Migration #403 (#428)
shivas1516 Jan 11, 2024
3964a49
include fastembed model into the list of all models (#412)
generall Dec 26, 2023
f7cd365
new: expose grpc options in client (#401)
joein Jan 4, 2024
65919af
fix: do not try to close none objects in local mode (#398)
joein Jan 4, 2024
cfa886d
fix: forbid nan in payload in local mode to restore congruence (#397)
joein Jan 4, 2024
9ff5755
fix: fix portalocker being deleted before closing the client (#421)
joein Jan 5, 2024
48a21cb
new: add shard keys to upload_collection and upload_records (#396)
joein Jan 5, 2024
cef5bfd
new: unlock python upper cap (#419)
joein Jan 5, 2024
cdef1eb
fix: fix division by zero in cosine similarity in local mode (#425)
joein Jan 6, 2024
fff677d
fix: replace data with content to fix httpx deprecation warning (#426)
joein Jan 6, 2024
1bd9c2e
Allow fastembed embedding model params configuration in parity with f…
praveen-palanisamy Jan 7, 2024
da41292
Update the urlllib3 package dependency to make it compatible with v2 …
lingcoder Jan 8, 2024
6bafdcc
fix: fix upsert check in local mode (#432)
joein Jan 19, 2024
66161cc
fix: fix local mode loading with dense and sparse vectors (#433)
joein Jan 19, 2024
4eacd6e
Fix sort sparse vectors (#442)
joein Jan 19, 2024
cd41258
fix: align timeout type for qdrant client and its methods (#443)
joein Jan 19, 2024
2b8f3d2
new: deprecate upload records, update tests, prohibit migration of co…
joein Jan 19, 2024
ce5b0db
bump version to v1.7.1
joein Jan 19, 2024
a901a98
fix: fix setting grpc options in sync and async channels (#467)
joein Jan 30, 2024
11314e7
fix: convert some types to python jsonable types (#462)
joein Jan 30, 2024
640fe5a
fix: add missing methods to type stub (#454)
joein Jan 30, 2024
876e171
fix: fix implicit ids in upload collection with paralell > 1 (#460)
joein Jan 31, 2024
7e7d4f2
Bump to v1.7.2 (#471)
joein Jan 31, 2024
acabf39
Adds missing import in documentation example (#464)
oulianov Feb 3, 2024
d23c4d4
Fix gRPC conversion for sparse search batch (#484)
agourlay Feb 8, 2024
820a34b
fix: add grpc_grace param to close in QdrantClient (#477)
joein Feb 8, 2024
eda201a
Bump version to v1.7.3
joein Feb 8, 2024
3123236
Fix potential edge case scoring in context search (#474)
coszio Feb 8, 2024
ee22ab2
Local mode of `order_by` parameter in `scroll` + datetime support (#491)
coszio Mar 1, 2024
21863d1
Generate rest and grpc clients for 1.8 (#512)
coszio Mar 1, 2024
94e4889
Fix for close() method of QdrantLocal class (#505)
FranckZibi Mar 1, 2024
7df1e57
new: update qdrant version in backward compatibility tests (#513)
joein Mar 1, 2024
90586d2
Remove deprecated field conversions (#502)
coszio Mar 1, 2024
4aea36f
feat: Expose Setting for GRPC Channel-Level Compression at Client Sid…
geetu040 Mar 1, 2024
da91181
Fix grpc typo (#515)
joein Mar 1, 2024
5ba8d99
docs: add batching recommendation to readme (#516)
joein Mar 1, 2024
f184add
Add support for datetime ranges (#517)
coszio Mar 1, 2024
0849671
new: add min should clause (#519)
joein Mar 2, 2024
ee67ccf
new: add collection exists interface (#518)
joein Mar 2, 2024
3068e03
fix: do not try to close already closed file (#521)
joein Mar 2, 2024
7365f83
fix: turn on test no fastembed (#522)
joein Mar 4, 2024
cb0aa80
Upgrade FastEmbed Version (#493)
NirantK Mar 5, 2024
c63c62e
bump version to v1.8.0 (#526)
joein Mar 6, 2024
e25380a
Break out of retry loop upon successful upsert (#531)
almostimplemented Mar 11, 2024
4f08d7e
fix: make datetime range conditions tz aware (#538)
joein Mar 13, 2024
5232d55
fix: handle tzinfo in HH format in local mode (#537)
joein Mar 13, 2024
40a34c4
new: propagate timeout from methods to httpx (#534)
joein Mar 13, 2024
5de0f3d
Add key set payload (#536)
joein Mar 26, 2024
0d7e46c
WIP: hybrid search with fastembed (#553)
generall Mar 27, 2024
71255c2
bump version to v1.8.1
joein Mar 27, 2024
437738c
Remove pytest import (#556)
joein Mar 27, 2024
8e3ea58
bump version to v1.8.2
joein Mar 27, 2024
8d7d31d
Fix read consistency when performing batch search
Apmats Apr 9, 2024
2 changes: 1 addition & 1 deletion .github/workflows/integration-tests-macos.yml
@@ -13,11 +13,11 @@ jobs:
strategy:
matrix:
python-version:
- '3.7.x'
- '3.8.x'
- '3.9.x'
- '3.10.x'
- '3.11.x'
- '3.12.x'
os:
- macos-latest

20 changes: 17 additions & 3 deletions .github/workflows/integration-tests.yml
@@ -18,6 +18,7 @@ jobs:
- '3.9.x'
- '3.10.x'
- '3.11.x'
- '3.12.x'
os:
- ubuntu-latest

@@ -35,19 +36,32 @@ jobs:
run: |
python -m pip install poetry
poetry config virtualenvs.create false
poetry install --no-interaction --no-ansi
poetry install --no-interaction --no-ansi --all-extras
- name: Run async client generation tests
run: |
if [[ ${{ matrix.python-version }} == "3.10.x" ]]; then
./tests/async-client-consistency-check.sh
fi
shell: bash
- name: Run Python doc tests
run: |
python -m doctest qdrant_client/local/local_collection.py
- name: Run integration tests
run: |
./tests/integration-tests.sh
shell: bash
- name: Backward compatibility integration tests
run: |
export RUNNER_OS=${{ runner.os }}
QDRANT_VERSION='v1.3.2' ./tests/integration-tests.sh
QDRANT_VERSION='v1.7.4' ./tests/integration-tests.sh
shell: bash
- name: Run fastembed tests without fastembed
run: |
pip3 uninstall fastembed -y
pytest tests/test_fastembed.py
shell: bash
- name: Check conversion coverage
run: |
export RUNNER_OS=${{ runner.os }}
./tests/coverage-test.sh
shell: bash

7 changes: 5 additions & 2 deletions .github/workflows/type-checkers.yml
@@ -8,7 +8,7 @@ jobs:
strategy:
fail-fast: true
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
os: [ubuntu-latest]

name: Python ${{ matrix.python-version }} test
@@ -28,7 +28,10 @@ jobs:

- name: mypy
run: |
poetry run mypy . --disallow-incomplete-defs --disallow-untyped-defs
if [[ ${{ matrix.python-version }} != "3.8" ]] || [[ ! -d "tools/async_client_generator" ]]; then
# async_qdrant_fastembed.py is autogenerated and erases type ignore statements from the source code
poetry run mypy . --exclude "async_qdrant_fastembed.py" --disallow-incomplete-defs --disallow-untyped-defs
fi

- name: pyright
run: |
6 changes: 6 additions & 0 deletions .gitignore
@@ -1,10 +1,16 @@
venv
.DS_Store
__pycache__
.pytest_cache
.idea
.vscode
.devcontainer
.coverage
htmlcov
*.iml
dist
*.tar.gz
local_cache/*/*
.python-version
docs/source/examples/local_cache/*
docs/source/examples/path/to/db/*
10 changes: 5 additions & 5 deletions .pre-commit-config.yaml
@@ -1,7 +1,7 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
#default_language_version:
# python: python3.8
default_language_version:
python: python3.10

exclude: 'qdrant_client/(grpc|http|models)/'

@@ -14,15 +14,15 @@ repos:
- id: check-added-large-files

- repo: https://github.com/psf/black
rev: 23.1.0
rev: 23.12.1
hooks:
- id: black
name: "Black: The uncompromising Python code formatter"
args: ["--line-length", "99"]

- repo: https://github.com/PyCQA/isort
rev: 5.12.0
rev: 5.13.2
hooks:
- id: isort
name: "Sort Imports"
args: ["--profile", "black"]
args: ["--profile", "black", "--py", "310"]
64 changes: 50 additions & 14 deletions README.md
@@ -15,11 +15,12 @@
<a href="https://github.com/qdrant/qdrant-client/blob/master/LICENSE"><img src="https://img.shields.io/badge/License-Apache%202.0-success" alt="Apache 2.0 License"></a>
<a href="https://qdrant.to/discord"><img src="https://img.shields.io/badge/Discord-Qdrant-5865F2.svg?logo=discord" alt="Discord"></a>
<a href="https://qdrant.to/roadmap"><img src="https://img.shields.io/badge/Roadmap-2023-bc1439.svg" alt="Roadmap 2023"></a>
<a href="https://python-client.qdrant.tech/"><img src="docs/images/api-icon.svg" width="30px"></a>
</p>

# Python Qdrant Client

Client library and SDK for the [Qdrant](https://github.com/qdrant/qdrant) vector search engine.
Client library and SDK for the [Qdrant](https://github.com/qdrant/qdrant) vector search engine. Python Client API Documentation is available [here](https://python-client.qdrant.tech/).

Library contains type definitions for all Qdrant API and allows to make both Sync and Async requests.

@@ -70,7 +71,7 @@ Local mode is useful for development, prototyping and testing.
## Fast Embeddings + Simpler API

```
pip install fastembed qdrant-client
pip install qdrant-client[fastembed]
```

FastEmbed is a library for creating fast vector embeddings on CPU. It is based on ONNX Runtime and allows to run inference on CPU with GPU-like performance.
@@ -85,16 +86,24 @@ client = QdrantClient(":memory:") # or QdrantClient(path="path/to/db")

# Prepare your documents, metadata, and IDs
docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
metadatas = [
metadata = [
{"source": "Langchain-docs"},
{"source": "Linkedin-docs"},
]
ids = [42, 2]

# Use the new add method
client.add(collection_name="demo_collection", docs={"documents": docs, "metadatas": metadatas, "ids": ids})
client.add(
collection_name="demo_collection",
documents=docs,
metadata=metadata,
ids=ids
)

search_result = client.query(collection_name="demo_collection", query_texts=["This is a query document"])
search_result = client.query(
collection_name="demo_collection",
query_text="This is a query document"
)
print(search_result)
```

@@ -154,6 +163,9 @@ import numpy as np
from qdrant_client.models import PointStruct

vectors = np.random.rand(100, 100)
# NOTE: consider splitting the data into chunks to avoid hitting the server's payload size limit
# or use `upload_collection` or `upload_points` methods which handle this for you
# WARNING: uploading points one-by-one is not recommended due to requests overhead
client.upsert(
collection_name="my_collection",
points=[
@@ -215,23 +227,47 @@ client = QdrantClient(host="localhost", grpc_port=6334, prefer_grpc=True)

## Async client

Async methods are available in raw autogenerated clients.
Usually, you don't need to use them directly, but if you need extra performance, you can access them directly.

### Async gRPC
Starting from version 1.6.1, all python client methods are available in async version.

Example of using raw async gRPC client:
To use it, just import `AsyncQdrantClient` instead of `QdrantClient`:

```python
from qdrant_client import QdrantClient, grpc
from qdrant_client import AsyncQdrantClient, models
import numpy as np
import asyncio

async def main():
# Your async code using QdrantClient might be put here
client = AsyncQdrantClient(url="http://localhost:6333")

await client.create_collection(
collection_name="my_collection",
vectors_config=models.VectorParams(size=10, distance=models.Distance.COSINE),
)

await client.upsert(
collection_name="my_collection",
points=[
models.PointStruct(
id=i,
vector=np.random.rand(10).tolist(),
)
for i in range(100)
],
)

client = QdrantClient(prefer_grpc=True, timeout=3.0)
res = await client.search(
collection_name="my_collection",
query_vector=np.random.rand(10).tolist(), # type: ignore
limit=10,
)

grpc_collections = client.async_grpc_collections
print(res)

res = await grpc_collections.List(grpc.ListCollectionsRequest(), timeout=1.0)
asyncio.run(main())
```

Both, gRPC and REST API are supported in async mode.
More examples can be found [here](./tests/test_async_qdrant_client.py).

### Development
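Review note on the README change above: the new comment in the upsert example recommends splitting data into chunks to stay under the server's payload size limit. A minimal sketch of that idea follows, assuming a hypothetical helper named `iter_batches` (it is not part of `qdrant_client`) and an arbitrary batch size chosen by the caller. The README itself points to `upload_collection` and `upload_points` as the methods that handle batching for you, so this is only an illustration of the manual approach.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items.

    Hypothetical helper for illustration only; not a qdrant_client API.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    iterator = iter(items)
    while True:
        # islice consumes up to batch_size items from the shared iterator
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Intended use (requires a running Qdrant instance, so it is left as a comment):
#
#     for batch in iter_batches(points, 64):
#         client.upsert(collection_name="my_collection", points=batch)
```

Each batch becomes one request, which avoids both a single oversized payload and the per-request overhead of uploading points one by one.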
2 changes: 2 additions & 0 deletions docs/images/api-icon.svg
(File content not displayed.)
4 changes: 4 additions & 0 deletions docs/source/conf.py
@@ -32,6 +32,8 @@
"sphinx.ext.autodoc",
"sphinx.ext.viewcode",
"sphinx.ext.intersphinx",
"nbsphinx",
"IPython.sphinxext.ipython_console_highlighting"
]

# prevents sphinx from adding full path to type hints
@@ -68,6 +70,8 @@
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = [
"*qdrant_openapi_client*",
"*grpc*",
"*tests*" # tests are not part of the documentation
]
# -- Options for HTML output -------------------------------------------------

3 changes: 0 additions & 3 deletions docs/source/examples/upload_collection.rst

This file was deleted.
