-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move back to BM25 similarity #36431
base: main
Are you sure you want to change the base?
Move back to BM25 similarity #36431
Conversation
With the recent lucene upgrade we have temporarily adopted the LegacyBM25Similarity which exposes the same scores as BM25Similarity before the k1+1 factor was removed from the numerator of the scoring formula. This commit moves the default Elasticsearch similarity back to the Bm25Similarity and updates the scores that have changed in our docs and tests
Pinging @elastic/es-search |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question around the SQL tests, looks good otherwise.
@@ -326,7 +326,7 @@ public void testExplain() throws Exception { | |||
assertThat(searchResponse.getHits().getTotalHits().value, equalTo(1L)); | |||
Explanation explanation = searchResponse.getHits().getHits()[0].getExplanation(); | |||
assertThat(explanation.getValue(), equalTo(searchResponse.getHits().getHits()[0].getScore())); | |||
assertThat(explanation.toString(), startsWith("0.36464313 = Score based on 2 child docs in range from 0 to 1")); | |||
assertThat(explanation.toString(), startsWith(explanation.getValue() + " = Score based on 2 child docs in range from 0 to 1")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
2.288635 |Frank Herbert |Dune |604 |1965-06-01T00:00:00Z | ||
1.8893257 |Frank Herbert |Dune Messiah |331 |1969-10-15T00:00:00Z | ||
1.0402887 |Frank Herbert |Dune |604 |1965-06-01T00:00:00Z | ||
0.8587844 |Frank Herbert |Dune Messiah |331 |1969-10-15T00:00:00Z | ||
1.6086555 |Frank Herbert |Children of Dune |408 |1976-04-21T00:00:00Z |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look right? All the scores should change, and currently things aren't in score order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I will double check if I fixed all the issues in tests, and if so I will check why this is the case here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok this is really weird. Two documents have updated score, while the other two in this test (and other tests returning the score) seem to have the same score as before, which with the updated bm25 similarity should not be the case. Tests are green which is what concerns me the most. @costin @nik9000 would you know why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me add that I have just indexed the same documents used for these tests and verified manually that all of the 4 documents matching this query get an updated score with the bm25 change. Seems like something weird happens with these tests, not sure what exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A mixed cluster with nodes in 6x and 7 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possibly, yet I would expect these tests to run both in multi-node and single-node. which makes me expect that I should not get a green build with the current changes. Single-node should require to update all the scores. I may not be getting how these tests are run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javanna indeed, there is something wrong here. The tests pass and, looking at the logs it produces, it seems like it's not comparing the output from the csv-spec
with the actual output of the query being executed... am looking into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see @astefan thanks for looking! Would you have the chance to fix this upstream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I'll open a PR.
6fbc166
to
90b97d8
Compare
retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @javanna!
This allows users to opt-out of the updated bm25 similarities and use the deprecated legacy bm25 similarity. This helps especially cross-cluster search cases across clusters on multiple versions, otherwise sorting by score would lead to very weird results depending on the version of the data node that scores each doc.
retest this please |
run gradle build tests 1 |
Pinging @elastic/es-search (Team:Search) |
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
With the last lucene upgrade we have temporarily adopted the
LegacyBM25Similarity
which exposes the same scores as BM25Similarity before the k1+1 factor was removed from the numerator of the scoring formula.This PR changes the default Elasticsearch similarity back to the
Bm25Similarity
and updates the scores that have changed due to such change in our docs and tests.