Hybrid search throws error when prefer_grpc=True #673

Closed
StreetLamb opened this issue Jun 30, 2024 · 5 comments

@StreetLamb

I am running a simple hybrid search setup with fastembed and prefer_grpc=True:

from qdrant_client import QdrantClient

client = QdrantClient(
    url="http://qdrant.localhost:6333", api_key="xxx", prefer_grpc=True
)
client.set_model("BAAI/bge-small-en-v1.5")
client.set_sparse_model("prithivida/Splade_PP_en_v1")

if not client.collection_exists(collection_name="test"):
    client.create_collection(
        collection_name="test",
        vectors_config=client.get_fastembed_vector_params(),
        sparse_vectors_config=client.get_fastembed_sparse_vector_params(),
    )

client.add(
    collection_name="test",
    documents=["Hello there"],
)

However, I get the following error:

  File "/Users/xxx/Desktop/projects/test/test.py", line 34, in <module>
    client.add(
  File "/Users/xxx/Library/Caches/pypoetry/virtualenvs/app-w4MTL1IK-py3.12/lib/python3.12/site-packages/qdrant_client/qdrant_fastembed.py", line 513, in add
    self._validate_collection_info(collection_info)
  File "/Users/xxx/Library/Caches/pypoetry/virtualenvs/app-w4MTL1IK-py3.12/lib/python3.12/site-packages/qdrant_client/qdrant_fastembed.py", line 386, in _validate_collection_info
    sparse_vector_field_name in collection_info.config.params.sparse_vectors
TypeError: argument of type 'NoneType' is not iterable

The error occurs only when prefer_grpc=True. I checked the created collection's config: sparse_vectors contains fast-sparse-splade_pp_en_v1, so collection_info.config.params.sparse_vectors should not be None.
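The TypeError itself can be reproduced without a server, since Python's `in` operator fails whenever its right-hand side is None. The following is a minimal sketch, assuming only that the collection info returned over gRPC arrives with sparse_vectors unset (the traceback does not show why). `unsafe_check` and `has_sparse_field` are hypothetical helpers written for illustration; they are not part of qdrant-client.

```python
from typing import Mapping, Optional


def unsafe_check(sparse_vectors: Optional[Mapping[str, object]], name: str) -> bool:
    # Same shape as the expression in the traceback: a bare membership test.
    # Raises TypeError when sparse_vectors is None.
    return name in sparse_vectors


def has_sparse_field(sparse_vectors: Optional[Mapping[str, object]], name: str) -> bool:
    # Hypothetical None-safe variant: a missing mapping means "no sparse fields".
    return sparse_vectors is not None and name in sparse_vectors


field = "fast-sparse-splade_pp_en_v1"

try:
    unsafe_check(None, field)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print("unsafe check failed with:", error_message)
print("safe check on None:", has_sparse_field(None, field))
print("safe check on a populated mapping:", has_sparse_field({field: {}}, field))
```

The guard only avoids the crash; it would not explain why the value is None over gRPC when the collection config clearly contains the sparse vector.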

@joein
Member

joein commented Jul 2, 2024

Hi @StreetLamb

Thanks for pointing this out. It's actually a bug on our side.
I've proposed a fix in #674.

@aiorga-sherpas

A hybrid query written as shown in the documentation fails because the dense embedding is not recognized as a valid query model.

Snippet of code from the documentation:

from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

client.query_points(
    collection_name="{collection_name}",
    prefetch=[
        models.Prefetch(
            query=models.SparseVector(indices=[1, 42], values=[0.22, 0.8]),
            using="sparse",
            limit=20,
        ),
        models.Prefetch(
            query=[0.01, 0.45, 0.67, ...],  # <-- dense vector
            using="dense",
            limit=20,
        ),
    ],
    query=models.FusionQuery(fusion=models.Fusion.RRF),
)

Error:

 qdrant_client/conversions/conversion.py", line 2579, in convert_query
    raise ValueError(f"invalid Query model: {model}")  # pragma: no cover
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid Query model:

def convert_query(cls, model: rest.Query) -> grpc.Query:

There, the function receives a dense vector (a plain list) as model, so none of the conditions is met.
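The fall-through can be illustrated in isolation. This is a sketch under stated assumptions, not the library's code: `WrappedNearest` and `WrappedFusion` are local stand-in classes, and `convert_strict` / `convert_lenient` are hypothetical functions. It only shows how a converter that dispatches on wrapper types rejects a bare list, and how normalizing the list first avoids that.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WrappedNearest:
    # Stand-in for a "nearest neighbours" query wrapper (hypothetical).
    nearest: List[float]


@dataclass
class WrappedFusion:
    # Stand-in for a fusion query wrapper (hypothetical).
    fusion: str


def convert_strict(model) -> dict:
    # Dispatches only on wrapper types; anything else falls through to the error.
    if isinstance(model, WrappedNearest):
        return {"nearest": model.nearest}
    if isinstance(model, WrappedFusion):
        return {"fusion": model.fusion}
    raise ValueError(f"invalid Query model: {model}")


def convert_lenient(model) -> dict:
    # Wraps a bare dense vector before dispatching, so it is no longer rejected.
    if isinstance(model, list):
        model = WrappedNearest(nearest=model)
    return convert_strict(model)


dense = [0.01, 0.45, 0.67]

try:
    convert_strict(dense)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

print("strict:", strict_error)
print("lenient:", convert_lenient(dense))
```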

@joein
Member

joein commented Jul 8, 2024

Hi @aiorga-sherpas
Thanks for pointing it out, we'll fix it ASAP.

@joein
Member

joein commented Jul 8, 2024

#682

@joein
Member

joein commented Jul 8, 2024

Fixes should be available with qdrant-client==1.10.1.
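To check whether an environment already has that release, something like the following could be used. It is a rough sketch: `parse_release` and `has_fixes` are hypothetical helpers, the comparison looks only at the leading numeric parts of the first three version components (so a pre-release such as 1.10.1rc1 is treated like 1.10.1), and the 1.10.1 threshold is taken from the comment above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple


def parse_release(v: str) -> Tuple[int, int, int]:
    # Keep only the leading digits of the first three dot-separated components.
    parts = []
    for piece in v.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def has_fixes(installed: str, minimum: str = "1.10.1") -> bool:
    # True when the installed release is at least the release named in the thread.
    return parse_release(installed) >= parse_release(minimum)


try:
    installed = version("qdrant-client")
    print(f"qdrant-client {installed}: fixes included = {has_fixes(installed)}")
except PackageNotFoundError:
    print("qdrant-client is not installed in this environment")
```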

@joein joein closed this as completed Jul 10, 2024