Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
83472: kvstreamer: improve the avg response size heuristic r=yuzefovich a=yuzefovich This commit improves the heuristic we use for estimating the average response size. Previously, we used a simple average, and now we multiple the average by 1.5. Having this multiple is good for a couple of reasons: - this allows us to fulfill requests that are slightly larger than the current average. For example, imagine that we're processing three requests of sizes 100B, 110B, and 120B sequentially, one at a time. Without the multiple, after the first request, our estimate would be 100B so the second request would come back empty (with ResumeNextBytes=110), so we'd have to re-issue the second request. At that point the average is 105B, so the third request would again come back empty and need to be re-issued with larger TargetBytes. Having the multiple allows us to handle such a scenario without any requests coming back empty. In particular, TPCH Q17 has similar setup. - this allows us to slowly grow the TargetBytes parameter over time when requests can be returned partially multiple times (i.e. Scan requests spanning multiple rows). Consider a case when a single Scan request has to return 1MB worth of data, but each row is only 100B. With the initial estimate of 1KB, every request would always come back with exactly 10 rows, and the avg response size would always stay at 1KB. We'd end up issuing 1000 of such requests. Having a multiple here allows us to grow the estimate over time, reducing the total number of requests needed. This multiple seems to fix the remaining perf regression on Q17 when comparing against the streamer OFF config. This commit also introduces a cluster setting that controls this multiple. Value of 1.5 was chosen using `tpchvec/bench` and this setting. Additionally, I introduced a similar cluster setting for the initial avg response size estimate (currently hard-coded at 1KiB) and used `tpchvec/bench`, and it showed that 1KiB value is pretty good. It was also the value mentioned in the RFC, so I decided to remove the corresponding setting. Addresses: #82159. Release note: None Co-authored-by: Yahor Yuzefovich <[email protected]>
- Loading branch information