Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can search script avoid hashtable lookups for field access? #25913

Closed
jpountz opened this issue Jul 26, 2017 · 15 comments
Closed

Can search script avoid hashtable lookups for field access? #25913

jpountz opened this issue Jul 26, 2017 · 15 comments
Assignees
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement Team:Core/Infra Meta label for core/infra team

Comments

@jpountz
Copy link
Contributor

jpountz commented Jul 26, 2017

Take this search script: Math.abs(doc['field'].value). On every document, we will call LeafDocLookup.get which mostly fetches script doc values for the field field in a hashtable and advances it to the current document. The hashtable lookup feels a bit unnecessary since we will always return the same ScriptDocValues object on a per-segment basis, is there a way we could avoid it? A hashtable lookup may look lightweight, but when everything that you do in your script is reading a doc value and applying a cheap function like Math.abs, it might not be negligible?

@jpountz jpountz added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache discuss labels Jul 26, 2017
@nik9000
Copy link
Member

nik9000 commented Jul 26, 2017

I think it is possible and probably dovetails into any work we would do to remove some of the hidden casts as well. I don't know the scheduling for something like that though.

@jdconrad
Copy link
Contributor

This can be explored more once we break apart the analysis step into a multi-pass compiler.

@jtibshirani
Copy link
Contributor

I looked into this when debugging the performance of script_score queries. The summary: the hash lookup can indeed contribute to latency, even when loading a single hard-coded field. There is also overhead from using Painless itself.

Our rally benchmarks include a comparison of a function_score and script_score query that compute the same result. Here’s the script query:

"query": {
  "script_score": {
    "query": {
      "match_all": {}
    },
    "script": {
      "source": "Math.log10(doc['population'].value * 1.2 + 2)"
    }
  }
}

Current performance
Scripting is ~30% slower.

| 50th percentile service time | field_value_function_score | 22.1627 | ms |
| 95th percentile service time | field_value_function_score | 22.5731 | ms |

| 50th percentile service time | field_value_script_score | 29.4205 | ms |
| 95th percentile service time | field_value_script_score | 30.0251 | ms |

Only load doc values once
Using this hacky branch, we load doc values for ‘population’ once, and always return them instead of performing a hash lookup. Now scripting is ~20% slower.

| 50th percentile service time | field_value_function_score | 20.2305 | ms |
| 95th percentile service time | field_value_function_score | 21.5232 | ms |

| 50th percentile service time | field_value_script_score | 24.483 | ms |
| 95th percentile service time | field_value_script_score | 25.679 | ms |

Use a custom script engine
The same branch defines a HackyScriptEngine that just computes the mathematical expression directly. It builds off the doc values hack, so this is an combined improvement. This seems to close the performance gap.

| 50th percentile service time | field_value_function_score | 19.8072 | ms |
| 95th percentile service time | field_value_function_score | 20.6771 | ms |

| 50th percentile service time | field_value_script_score | 19.9949 | ms |
| 95th percentile service time | field_value_script_score | 21.1168 | ms |

@jpountz
Copy link
Contributor Author

jpountz commented Apr 17, 2020

Thanks for benchmarking this @jtibshirani, this is very interesting. @stu-elastic and @jdconrad might have more insights, but I wonder that your second experiment with the HackyScriptEngine also gives type hints that your Painless script doesn't have? Would ScriptDocValues.Longs scriptValues = (ScriptDocValues.Longs) doc['population']; long value = scriptValues.value; return Math.log10(value * 1.2 + 2); be a fairer script to compare with?

@jtibshirani
Copy link
Contributor

jtibshirani commented Apr 17, 2020

I caught up with @jdconrad earlier and he suggested trying the following Painless script: "Math.log10(((long)doc['population'].value) * 1.2 + 2)". This significantly improved performance:

| 50th percentile service time | field_value_function_score |       19.12 |     ms |
| 95th percentile service time | field_value_function_score |      19.708 |     ms |

| 50th percentile service time |   field_value_script_score |     21.5198 |     ms |
| 95th percentile service time |   field_value_script_score |     22.0022 |     ms |

Again the numbers represent a combined improvement with the hacky docvalues change.

@mayya-sharipova
Copy link
Contributor

@jdconrad Is this a general recommendation to cast ScriptDocValues to explicit type in a user's script?

@jdconrad
Copy link
Contributor

@jtibshirani Thanks for the performance numbers! It's nice to know where to put effort to improve things.

@mayya-sharipova I don't feel comfortable recommending this generally because it's very fragile. For instance, if the user does a query across indices and the mappings don't match, they may end up with ClassCastExceptions. The user requires direct knowledge of the indices searched on and the mappings for those indices for each query. On the other hand, for a workaround with an advanced user, it'll definitely speed things up.

@jdconrad
Copy link
Contributor

jdconrad commented Apr 17, 2020

A note about this script "Math.log10(((long)doc['population'].value) * 1.2 + 2)

Casting to long here means that the rest of the math can be standard JVM ASM instructions written at compile-time because we promote all the values here to double (the 1.2 causes this). Without the explicit cast to a long, the runtime must call a method to figure out the type the def value returned from ScriptDocValues and then multiply it by 1.2.

@jpountz
Copy link
Contributor Author

jpountz commented Apr 18, 2020

@jdconrad This makes me wonder whether the suggestion of doing such hash lookups up-front on a per-segment basis would also enable optimizing this automatically, as Painless would know that doc['population'] is a ScriptDocValues.Longs?

@nik9000
Copy link
Member

nik9000 commented Apr 18, 2020 via email

@jdconrad
Copy link
Contributor

@jpountz My understanding is this isn't the case because we still don't know the type until runtime for two reasons. 1. the caching would be done using a member variable on each script instance that doesn't get set until the first instance invocation so we don't have the information when the class is actually written (maybe we could do some deferred writing of the script like Java does with lamdas) and 2. we don't have access to mappings in scripting (something that would be a nice for optimization), but even then you still have to contend with possibly different mappings across indices in a single query

@ywelsch suggested that we may able to union all mappings for a specific query, though, to see if there's a common type

@nik9000 I'll need to give your suggestion some more thought on how that would work. I do wonder if it's easier to optimize the hashing of this after semantic validation.

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@rjernst rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020
@jdconrad jdconrad removed the needs:triage Requires assignment of a team area label label Dec 9, 2020
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@nik9000
Copy link
Member

nik9000 commented Mar 6, 2024

ESQL avoids this so maybe we should call it "solved by ESQL"?

@nik9000
Copy link
Member

nik9000 commented Mar 6, 2024

That doesn't use scripts, but it's own syntax. But it does work.

@arpadkiraly arpadkiraly assigned rjernst and unassigned arpadkiraly Mar 11, 2024
@jpountz
Copy link
Contributor Author

jpountz commented Mar 22, 2024

ES|QL does indeed help there, but I'm also thinking of all runtime field definitions that will not benefit from this. Let's close for now, we can still reopen later if interest in this issue comes back.

@jpountz jpountz closed this as completed Mar 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement Team:Core/Infra Meta label for core/infra team
Projects
None yet
Development

No branches or pull requests