[META] Performance Comparison of OS 3.0 Lucene 10 with Mainline #16934
@reta @rishabh6788 We can use this issue for any discussions related to Lucene 10 perf tests. Created it separately as the PR was getting crowded with lots of comments.
Catch All Triage - 1, 2, 3
Setup Details
The comparison was done on two EC2 instances (r5.xlarge), one running OS 3.0 and the other OS 2.19. Each server had a 4 GB heap and the same EBS configuration. Both indices were written using a single bulk indexing client to minimize index-order-related slowdowns showing up in the results.
Benchmarking Results
Attaching the summary for comparison between OS 2.19 and OS 3.0
Resolved/Fix known
This run for OS 3.0 contained the fix present in 17329 so
Open Areas
Some queries are consistently slower (< 5%) on the OS 3.0 server, irrespective of whether an OS 3.0 or OS 2.19 index is used.
The query below ONLY becomes slower (< 5%) when an OS 2.19 index is used with the OS 3.0 server. So, folks migrating to OS 3.0 without reindexing might see a slight slowdown.
The queries below perform better with an OS 2.19 index than with an OS 3.0 index. So they have room for improvement, but are overall better in OS 3.0.
Implicit Improvements
Overview
We will run the benchmarks from the PR using the OpenSearch CDK and OSB, and compare the runs for every workload with mainline (containing Lucene 9) to ensure there are no regressions in search or indexing. Lucene 10 has introduced explicit vectorization for comparing vectors and decoding postings, so we should ensure the benchmarks are run on data nodes whose CPUs support SIMD.
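One way to sanity-check a data node before a run is to look at the CPU feature flags the Linux kernel reports. The sketch below is only an illustration: the `simd_flags` helper and the set of flag names it looks for (`avx2`, `avx512f`, `asimd`, `sve`) are assumptions, not a list taken from Lucene, and it only reads `/proc/cpuinfo`-style text.

```python
# Assumed flag names of interest. Linux lists x86 extensions on the "flags"
# line and ARM extensions on the "Features" line of /proc/cpuinfo.
SIMD_FLAGS = {"avx2", "avx512f", "asimd", "sve"}


def simd_flags(cpuinfo_text):
    """Return the SIMD-related flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the flag-listing lines are inspected; other keys are ignored.
        if sep and key.strip().lower() in ("flags", "features"):
            found |= SIMD_FLAGS & set(value.split())
    return found


# Usage on a Linux node (hypothetical):
#   with open("/proc/cpuinfo") as f:
#       print(sorted(simd_flags(f.read())))
```

An empty result would suggest the node is not a good fit for measuring the vectorized code paths, though whether the JVM actually uses them also depends on JVM and Lucene settings not covered here.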
We should note down any improvements and try to correlate with changes in Lucene 10 that could have caused them.
We might encounter cases where we are using an older Lucene API that has a faster alternative available in Lucene 10. Any performance regression needs to be fixed before we can go ahead with preparing the RC for OpenSearch Version 3.0.
Related component
Other
To Reproduce
Compare the performance reports of these runs with the daily runs of mainline.
Expected behavior
There should be no performance regression in search or indexing.
Any improvements should be noted explicitly.
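As a rough illustration of how two runs could be compared, the sketch below computes the percentage change per operation and flags anything slower than a threshold. The function names, the dict-of-latencies input shape, and the 5% default threshold are assumptions for the example; this is not the OSB report format or its compare tooling.

```python
def percent_change(baseline, contender):
    """Percentage change of contender relative to baseline (positive = slower)."""
    return (contender - baseline) * 100.0 / baseline


def find_regressions(baseline, contender, threshold_pct=5.0):
    """Return {operation: percent change} for operations that got slower.

    Both inputs map an operation name to a latency-like metric where lower
    is better. Operations missing from the contender run are skipped.
    """
    regressions = {}
    for op, base in baseline.items():
        if op not in contender or base == 0:
            continue
        change = percent_change(base, contender[op])
        if change > threshold_pct:
            regressions[op] = change
    return regressions


def find_improvements(baseline, contender, threshold_pct=5.0):
    """Return {operation: percent change} for operations that got faster."""
    improvements = {}
    for op, base in baseline.items():
        if op not in contender or base == 0:
            continue
        change = percent_change(base, contender[op])
        if change < -threshold_pct:
            improvements[op] = change
    return improvements
```

For example, with a mainline latency of 100 ms and a Lucene 10 latency of 110 ms for the same operation, `find_regressions` would report a 10% slowdown for it, while an operation that moved from 200 ms to 202 ms would fall under the assumed threshold and not be reported.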