-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Ballista query stage scheduler #1935
Conversation
I will take a look later this week. |
I will try and review this tomorrow or Wednesday. |
It's a little hard to tell from the diff which changes are specific to this PR vs. part of previous refactoring PRs but if I understand correctly this doesn't change the existing scheduler but just moves the scheduling logic to an encapsulated |
Hi @thinkharderdev, you are right. For #1908, #1909, #1910 and #1934, these 4 issues are mainly focusing on the refactoring and changing the scheduler server state changing to be event-based. While for #1936, I'll introduce stage and the state machine for it. With the state machine of the job stage, it will be much easier and more efficient to do error handling, speculative execution. In general, event-based processing is very suitable for handling the state machine. However, the PR for #1936 is really complicated. I'm still working on it. |
Cool. Looks good to me in general but would like to see the other PRs merged before approving. For work like this in the future (large refactorings broken into multiple PRs which depend on each other) it might be a good idea to do something similar to the @alamb @alamb @yjshen @realno @matthewmturner and @liukun4515 Does ^^ make sense? |
I actually prefer to avoid large feature branches (like I think the sequence of PRs from @yahoNanJing is actually a pretty good way (I think they are basically a stacked diff). I especially like keeping changes that "moving code around, but keeping the behavior the same" into their own PRs as I think they are easy to review and we can merge them quickly. In this case, I am sorry it has taken some time to get the in -- as you can see we were a little bottlenecked on review capacity with an influx of contributors. I hope this bottleneck is reduced over time The downside is that when you end up with 4-5 PRs chained together the overall goal sometimes is harder to see. I have found making a single tracking issue that explains the overall plan and lists the individual steps works well Like
I don't think there is any reason to have a 1-to-1 correspondence between issue and PR if the PRs are chained together. |
Thanks @alamb and @thinkharderdev. Actually, I prefer step by step. Otherwise, like arrow2, it may hard to be merged. For the distributed task scheduling, there's still two PR left, #1935 and #1983. Actually, the most important PR is #1983. All of other PRs in this chain are for finally achieving #1983. I just finished it today. And I will do some load testing with my team tomorrow to see how the performance is improved. In the future, another important feature is to add task error recovering. I think it will be much easier to work it on this new task scheduling framework. |
The PR looks good to me, thanks @yahoNanJing! One question I have is about testing, there will be a lot of major changes to Ballista, do we have good test coverage to ensure things are working as expected? Here is a related PR #1982, is Ballista integration test currently part of the pipeline? |
Thanks @realno. These PRs basically does not change behaviors. Therefore, I didn't add additional unit tests. However, for #1983, I have added unit test for both pull-staged-based task scheduling and push-staged-based task scheduling. For the pipeline, I'm not so sure whether the Ballista integration test be added or not. |
Yes, unit test for this PR is good. My comment was more general to Ballista overall, sorry didn't make that clear :)
I checked the pipeline definition the integration test is not part of it. I think at some point we should include it. @andygrove @alamb should we do that now? I guess one concern might be if we want it to block the whole DataFusion pipeline |
I think after see the big improvement in load testing #1530, It‘s worth doing such a big change! I will be glad to add UT for this change(if needed). |
Running some Ballista integration testing as part of CI is a long standing request and I think be a major benefit / improvement now that Ballista development is driving forward. @Ted-Jiang if you have time to look into that it would be great cc @andygrove |
Thanks @yahoNanJing -- great work so far. I wonder if you might be interested in presenting (perhaps at one of the Rust meetups) some of the work you have done with the scheduler? I know the original design was listed on apache/datafusion-ballista#24 but I am not sure if it has evolved during the implementation |
@alamb Sure, i will check 😊, Maybe add more test case in IT. |
Thanks @alamb. Our team will be very appreciated for the chance to present the scheduler changes in one of the meetups. However, there's still several PR under review, like #1983, #1842, #1862. It's better to finish all of them before the presentation. Then we can have a complete benchmark results and present the whole story. 😄 |
Sounds like a great plan to me 👍 If you don't mind me asking, who else is on your team? Are you working with @Ted-Jiang @liukun4515 and @mingmwang? (I am just trying to keep all the people and groups somewhat straight in my head) |
Which issue does this PR close?
Closes #1934.
Rationale for this change
Details see #1934.
What changes are included in this PR?
Put the stage generation logic into the channel processing.
Are there any user-facing changes?
It's blocked by #1912, #1913.