Adding support for Threshold, Limit, and Order Arguments #12
base: master
…_neighbor method/scope. All test cases are passing and new ones added for new options.
Hi @sebscholl, thanks for the PR.
Makes sense. Do you believe it would be helpful to add this info to the docs (e.g., …
@ankane I believe that when the class method is used, as in Movie.nearest_neighbor(embedding, my_gen_embedding, ...), the results are not ordered by distance. Instead I'm getting …
P.S. I set Movie.unscoped {} but am still getting ORDER BY ID. AFAIK there is no way to order by distance with the gem.
P.P.S. I set …
Regarding thresholds, if others are working on this, here is some relevant code I came up with:
Some nuances to take note of:
@vestedpr-dev thanks for writing that out! I wasn't able to get this line to work in Rails 8, pg 17:
My actual command:
m.nearest_neighbors(:embeddings, distance: "cosine").where("embeddings <=> '[?]' <= ?", m.embeddings, 1 - 0.9).first(3)
Error:
I tried various ways of casting and finally got it to work this way, but it feels very inefficient:
# Convert embeddings array to a parameterized string safely
embeddings_array = self.embeddings.map { |v| ActiveRecord::Base.sanitize_sql(v) }.join(', ')
# Safely bind parameters to avoid SQL injection
MyModel
.where.not(id: self.id)
.where.not(embeddings: nil)
.where(Arel.sql("embeddings <=> ARRAY[#{embeddings_array}]::vector <= ?"), 1 - threshold)
.order(Arel.sql("embeddings <=> ARRAY[#{embeddings_array}]::vector"))
.limit(3)
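The casting problem above comes down to turning the Ruby array into pgvector's text format, a bracketed, comma-separated list such as [1,2,3]. A minimal sketch of that step follows. The helper names vector_literal and max_cosine_distance are hypothetical and not part of the gem. Each element goes through Float(), so non-numeric input raises before it can reach the SQL string (as far as I can tell, sanitize_sql returns a plain number or string unchanged, so it does no validation here). The similarity threshold is converted to the cosine distance that <=> returns.

```ruby
# Hypothetical helpers, not part of the neighbor gem.

# Build a pgvector text literal such as "[0.1,0.2,0.3]".
# Float() raises ArgumentError or TypeError for non-numeric input.
def vector_literal(values)
  "[#{values.map { |v| Float(v) }.join(',')}]"
end

# The <=> operator returns cosine distance, which is 1 - cosine similarity.
def max_cosine_distance(min_similarity)
  1.0 - min_similarity
end

puts vector_literal([0.1, 0.2, 0.3])
puts max_cosine_distance(0.75)
```

The literal could then probably be bound as an ordinary string parameter and cast in SQL, along the lines of where("embeddings <=> ?::vector <= ?", vector_literal(embeddings), max_cosine_distance(0.9)), which avoids interpolating the array into the query by hand.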
This seems to work for me using Postgres.
I wasn't able to get the …
bump - because this looks great! We should get it reviewed and merged.
This pull request adds 3 keyword arguments to the nearest_neighbor method. They are:
order
limit
threshold
Multiple Options
All options can be used at the same time or separately.
These options manipulate the SQL statement generated by ActiveRecord. All original test suites are intact and passing, and new tests were written for the new options.
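To make the effect on the generated SQL concrete, here is a rough sketch of how the three options could each contribute one clause. It is an illustration under assumptions, not the code from this pull request. The function nearest_neighbor_sql and its keyword names are hypothetical, threshold is assumed to be a maximum distance, order is assumed to be a sort direction, and the table and column names are assumed to be trusted identifiers.

```ruby
# Hypothetical sketch; not the implementation in this pull request.
def nearest_neighbor_sql(table:, column:, vector:, distance: "cosine",
                         threshold: nil, limit: nil, order: :asc)
  # pgvector distance operators.
  operators = {
    "euclidean"     => "<->",
    "cosine"        => "<=>",
    "inner_product" => "<#>"
  }
  operator  = operators.fetch(distance)
  direction = { asc: "ASC", desc: "DESC" }.fetch(order)

  # Float()/Integer() reject non-numeric values before interpolation.
  literal = "[#{vector.map { |v| Float(v) }.join(',')}]"
  expr    = "#{column} #{operator} '#{literal}'"

  sql = "SELECT * FROM #{table}"
  sql += " WHERE #{expr} <= #{Float(threshold)}" if threshold  # threshold
  sql += " ORDER BY #{expr} #{direction}"                      # order
  sql += " LIMIT #{Integer(limit)}" if limit                   # limit
  sql
end

puts nearest_neighbor_sql(table: "movies", column: "embedding",
                          vector: [1, 2, 3], threshold: 0.5, limit: 5)
```

Each option is independent, so omitting threshold or limit simply drops the corresponding clause, which matches the statement that the options can be used together or separately.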