-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[multistage] Support Window Functions #7213
Comments
LinkedIn Pinot team is currently working on this feature right now. We'll share the detailed design shortly. |
Can you share some high level details about the approach being taken? Is it based aggregation UDF or does it requires changes to pinot core itself? |
It will be good to add an interface for sliding window aggregates functions similar to transform and aggregate function. Once we have that, we can implement moving average etc |
Hi all, Here is the design document for supporting window functions in Pinot: Please feel free to leave any comments or feedback. Thanks, |
Thanks for putting this together. Very good write up. High level thoughts
Overall, this is a great initiative and let's try to think few steps ahead and lay the right foundation |
My understanding is that Window Functions would be applied over the resultset being produced by a query to show additional column aggregations by partitioning the resultset. match_recognize seems to work over the table produced by the FROM clause to output another table that matches certain pattern sequences. So they appear to be different in that sense, some of the internal operators for window function could apply to match_recognize as well. I think one could apply window functions over the output produced by match_recognize? |
Totally agree on that. Window function is not the only feature facing this (e.g. post aggregation in order-by, having clause with LIMIT, etc). Same for future features like joins and sub queries.
Do we know when we aim to finish the query spi layer? If it will take a while, we can first do the implementation on the execution part with the existing way of query compilation and then wire in with the new query spi layer once that's ready. |
@jackjlli Weixiang (@weixiangsun) is working on the gapfill function (#7422) which is quite similar to the window function. Both features require reading the results from previous rows. Can you sync up with each other on the design so that we don't duplicate the implementation? I feel a lot of the code can be shared across the 2 features. |
@kishoreg @mayankshriv @Jackie-Jiang @richardstartin we are planning to start window function implementation soon. Is there anything remaining with respect to design or any other related issue that needs to be discussed before we do so? |
Can you please address the comments from @Jackie-Jiang in the design doc? This is a complex feature and the design doc is mostly describing how Pinot operators work. It will help me a lot if you can expand on the following topics
Overall I am +1 on adding this feature but -1 on the design. |
@kishoreg @mayankshriv @Jackie-Jiang @richardstartin I got a bit sidetracked with other work after our last discussion on window functions, but would like to restart again. We have updated design based on the discussions that we had last time. There is also a fairly mature prototype that I am in process of cleaning up and finalizing. We would like to incorporate this code into Pinot. Please let me know your thoughts and comments. |
Are we planning to implement this on the broker side? I'm okay with that as an initial version, but that won't be the long term solution, and will have a lot of constraints due to the limited resource on a single node. Long term solution should be to implement it on the multi-stage engine. cc @walterddr to also chime in |
@Jackie-Jiang only the merge will be done on Broker side, the computation of window functions is on the server side. I think @siddharthteotia is going to be setting up time with you and @walterddr to discuss further. We do want to port the design to multi-stage engine as well. |
@amrishlal I don't see that in the design doc. If you have created a new one, please share it here. |
I'll be working on this issue |
Have a design document for phase 1 ready for review: https://docs.google.com/document/d/13CmFm4djI09JKF_Xty5acoXxJoxC9CLXAsmikgzgtIs/edit#heading=h.54yrqayotu6g |
Initial set of query planning changes have been merged. Implementation for runtime is in progress |
Closing this as the basic window functions are all supported |
Currently in pinot there is no way to query Pinot table in a sliding window fashion. E.g. suppose I want to get the 7-day average number of Covid-19 cases. For now in order to do this, we'd need to send multiple queries; one query for each window.
We would like Pinot to support aggregation functions (like sum, count, avg, T-digest, etc) on sliding window within one single query.
The text was updated successfully, but these errors were encountered: