Speed up sorted scroll when the index sort matches the search sort #25138

Merged
merged 4 commits into from
Jun 12, 2017


jimczi commented Jun 8, 2017

A sorted scroll search can use early termination when the index sort matches the sort of the scroll search.
The optimization applies after the first request (which still needs to collect all documents):
each subsequent request adds a query that only matches documents that sort after the last document retrieved by the previous request.
Since the index is sorted, finding the documents that sort after the last document
only requires a binary search on each segment.
This change introduces a new query, `SortedSearchAfterDocQuery`, and applies it when possible.
Scrolls with this optimization search all documents on the first request and then terminate each segment early
after `size` documents on every subsequent request.
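
The per-segment binary search described above can be sketched roughly as follows. This is a simplified illustration, not the code of `SortedSearchAfterDocQuery`: the class `SortedSearchAfterSketch`, its method names, and the modelling of a segment as a `long[]` of ascending sort values with a `docBase` offset are all assumptions made for the example. Ties on the sort value are assumed to be broken by global doc ID.

```java
import java.util.Arrays;

// Hypothetical sketch: a segment is modelled as an array of sort values in
// ascending order, where the array index is the segment-local doc ID.
final class SortedSearchAfterSketch {

    // Returns the first local doc ID whose (value, global doc ID) pair sorts
    // strictly after (afterValue, afterGlobalDoc), or the segment length if none.
    static int firstDocAfter(long[] sortedValues, int docBase,
                             long afterValue, int afterGlobalDoc) {
        int lo = 0;
        int hi = sortedValues.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = Long.compare(sortedValues[mid], afterValue);
            if (cmp == 0) {
                // Assumed tie-break: equal sort values are ordered by global doc ID.
                cmp = Integer.compare(docBase + mid, afterGlobalDoc);
            }
            if (cmp > 0) {
                hi = mid;      // mid sorts after the last doc; look further left
            } else {
                lo = mid + 1;  // mid was already returned; look further right
            }
        }
        return lo;
    }

    // Collects at most `size` global doc IDs starting at the first matching doc,
    // then stops: this is the early termination on a sorted segment.
    static int[] nextPage(long[] sortedValues, int docBase,
                          long afterValue, int afterGlobalDoc, int size) {
        int first = firstDocAfter(sortedValues, docBase, afterValue, afterGlobalDoc);
        int count = Math.max(0, Math.min(size, sortedValues.length - first));
        int[] page = new int[count];
        for (int i = 0; i < count; i++) {
            page[i] = docBase + first + i;
        }
        return page;
    }

    public static void main(String[] args) {
        long[] segment = {1, 3, 3, 3, 7, 9};
        // The previous page ended at global doc 12, whose sort value is 3.
        System.out.println(Arrays.toString(nextPage(segment, 10, 3L, 12, 2)));
    }
}
```

In this sketch each follow-up request costs one binary search plus at most `size` collected documents per segment, instead of a pass over every matching document.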

Relates #6720

jimczi added the :Search/Search, >enhancement, review, and v6.0.0 labels Jun 8, 2017
@jpountz jpountz self-requested a review June 9, 2017 09:22

@jpountz jpountz left a comment


The change looks good and I'm fine with merging as-is, but I think it would be better to add early-termination support to TopFieldCollector by adding a trackTotalHits parameter?


jpountz commented Jun 9, 2017

Oops, actually your changes are unrelated to adding early-termination support to TopFieldCollector (which would still be a good thing to do, I think :)). LGTM!

@jimczi jimczi merged commit 7ab3d5d into elastic:master Jun 12, 2017
@jimczi jimczi deleted the feature/sorted_scroll_index_sort branch June 12, 2017 07:33

jimczi commented Jun 12, 2017

Thanks @jpountz !

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 12, 2017
* master:
  Do not swallow node lock failed exception
  Revert "Revert "Sense for VirtualBox and $HOME when deciding to turn on vagrant testing. (elastic#24636)""
  Aggregations bug: Significant_text fails on arrays of text. (elastic#25030)
  Speed up sorted scroll when the index sort matches the search sort (elastic#25138)
  TranslogTests.testWithRandomException ignored a possible simulated OOM when trimming files
  Adapt TranslogTests.testWithRandomException to checkpoint syncing on trim
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 13, 2017
* master:
  Explicitly reject duplicate data paths
  Do not swallow node lock failed exception
  Revert "Revert "Sense for VirtualBox and $HOME when deciding to turn on vagrant testing. (elastic#24636)""
  Aggregations bug: Significant_text fails on arrays of text. (elastic#25030)
  Speed up sorted scroll when the index sort matches the search sort (elastic#25138)
  TranslogTests.testWithRandomException ignored a possible simulated OOM when trimming files
  Adapt TranslogTests.testWithRandomException to checkpoint syncing on trim
  Change BWC versions on get mapping 404s
  Fix get mappings HEAD requests
  TranslogTests#commit didn't allow for a concurrent closing of a view
  Fix handling of exceptions thrown on HEAD requests
  Fix comment formatting in EvilLoggerTests
  Remove unneeded weak reference from prefix logger
  Test: remove faling test that relies on merge order
Labels
>enhancement, :Search/Search, v6.0.0-beta1

3 participants