Ability to set percentage influence of each function in function score query #15670
Comments
can't you just use the
@Vineeth-Mohan ping
Hello @s1monw, let me walk through the motivation here.
With this, I am seeing the following results:
As you can see, the score from field_value_factor always overshadows the score from random_score, so random_score has no effect on the ranking. That problem is the motivation for this issue. The percentage suggestion grew out of it, but I am finding it difficult to work out the maths behind it. The only solution I found was to determine the range of each function's score across all documents and use that range for the percentage influence. But since scoring happens per document, that won't be feasible. Let me know your thoughts on the subject.
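The range-based idea described above can be sketched outside Elasticsearch. This is only an illustration in plain Python, not anything the function score query offers: the document scores, the `influence` percentages, and the helper names are all made up. It also shows why the approach is awkward at query time, since the ranges need a full pass over every document before any single document can be scored.

```python
# Hypothetical per-document function outputs (illustrative values only).
docs = {
    "x": {"field_value_factor": 5000.0, "random_score": 0.2},
    "y": {"field_value_factor": 3000.0, "random_score": 0.9},
    "z": {"field_value_factor": 1000.0, "random_score": 0.5},
}

# Assumed percentage influence per function; must be chosen by the user.
influence = {"field_value_factor": 0.5, "random_score": 0.5}


def min_max_ranges(all_docs):
    """Collect (min, max) per function across ALL documents (a global pass)."""
    names = next(iter(all_docs.values())).keys()
    return {
        name: (
            min(d[name] for d in all_docs.values()),
            max(d[name] for d in all_docs.values()),
        )
        for name in names
    }


def blended(scores, ranges, influence):
    """Rescale each function output to [0, 1], then mix by influence."""
    total = 0.0
    for name, value in scores.items():
        lo, hi = ranges[name]
        normalized = (value - lo) / (hi - lo) if hi > lo else 0.0
        total += influence[name] * normalized
    return total


ranges = min_max_ranges(docs)
by_plain_sum = sorted(docs, key=lambda d: sum(docs[d].values()), reverse=True)
by_blend = sorted(docs, key=lambda d: blended(docs[d], ranges, influence), reverse=True)

print(by_plain_sum)  # ordering driven by field_value_factor alone
print(by_blend)      # ordering once both functions carry equal influence
```

With these sample values the plain sum simply follows field_value_factor, while the blended score lets random_score move document "y" to the top.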
@Vineeth-Mohan I can see what you are saying, and I admit it can be challenging. I personally don't see a good general way to apply normalization here. I see the function score feature as a toolset of primitives that lets (or forces) the user to ensure that each element of the equation has its relevant weight, etc. I wonder if others, e.g. @brwe, have some ideas?
It seems to me this is a case of "learning to rank". To find proper weights you would need to know what the expected ordering of results for different queries would be and then tune the weights accordingly. Without that, the only thing you can do now is guess.
@brwe another benefit of what's proposed here, if I understand correctly, is one could use score_mode In fact, on reading the docs at https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html, I initially interpreted that we could pass
The only other thing I could suggest is to apply a min/max score to each function, e.g. you could force
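One way to read the min/max suggestion is sketched below in plain Python. It is an assumption about what was meant, not an existing function score option: each function output is clamped to fixed, user-chosen bounds, rescaled to 0..1, and then weighted. The `bounds` and `weights` values are hypothetical. Unlike range discovery over all documents, fixed bounds can be applied to one document at a time.

```python
# Hypothetical fixed bounds per function, chosen up front by the user.
bounds = {"script_score": (1000.0, 2000.0), "decay": (0.0, 1.0)}

# Hypothetical weights expressing how much each function should matter.
weights = {"script_score": 0.3, "decay": 0.7}


def clamp_and_scale(value, lo, hi):
    """Clamp value into [lo, hi], then rescale it to the 0..1 interval."""
    clamped = max(lo, min(hi, value))
    return (clamped - lo) / (hi - lo)


def score(doc):
    """Weighted sum of the clamped, rescaled function outputs for one document."""
    return sum(weights[name] * clamp_and_scale(doc[name], *bounds[name]) for name in bounds)


# A script_score above the upper bound is capped, so it cannot swamp the decay.
outlier = {"script_score": 2500.0, "decay": 0.5}
print(score(outlier))
```

The trade-off is that values outside the bounds lose their ordering among themselves, so the bounds have to be picked with some knowledge of the data.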
Closing this in favour of #27588, where one of the desired features could be to normalize scores.
The function score query provides a good facility for implementing various aspects of the score, but it does not give direct control over the influence of each function.
For example, consider the query below:
There are 4 functions, and together they dictate the final score. Here, any one function, such as the script_score function, can take over the whole score. For instance, the value of the script_score might be in the range of 1000 to 2000 while the value of the decay function is between 0 and 1. The influence of the decay function therefore barely reaches the final score; the script_score dominates, and the rest of the functions may have little or no influence on the final score.
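The effect is easy to reproduce with a few made-up numbers. The sketch below is plain Python rather than an Elasticsearch query, the per-document values are hypothetical, and it assumes the function outputs are combined by a plain sum.

```python
# Hypothetical function outputs: script_score in 1000..2000, decay in 0..1.
docs = {
    "a": {"script_score": 1900.0, "decay": 0.10},
    "b": {"script_score": 1500.0, "decay": 0.95},
    "c": {"script_score": 1100.0, "decay": 0.99},
}


def combined(scores):
    """Plain sum of all function outputs for one document."""
    return sum(scores.values())


by_sum = sorted(docs, key=lambda d: combined(docs[d]), reverse=True)
by_script_only = sorted(docs, key=lambda d: docs[d]["script_score"], reverse=True)
by_decay_only = sorted(docs, key=lambda d: docs[d]["decay"], reverse=True)

print(by_sum)          # same order as by_script_only
print(by_script_only)
print(by_decay_only)   # the reverse order, which the sum ignores entirely
```

Even though the decay values rank the documents in exactly the opposite order, the combined ranking is identical to the script_score ranking.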
To fix this, it might be useful to have an influenceScore factor per function that specifies what percentage of the final score that function should contribute.
For example, the above query could be rewritten as:
Here, each function has an influenceScore that dictates its influence. This would help us fine-tune the score further.