[TASK][EASY][SPARK] kyuubi.engine.submit.time is wrong for ApplicationMaster retries #5766
Comments
cc @wForget
Will the Spark engine still work well after ApplicationMaster failover?
Maybe not now, but we should make it work well.
I think the current behavior is by design, and I'm not sure changing it is a good idea, since Kyuubi can create a new engine when a new session comes in if the previous engine is broken. If we change the behavior, a retried engine attempt launched after the new engine will waste resources, which violates the original expectation.
Your concern is reasonable, but the attempted engine failure should not be triggered by timeout. So, I think this fix can be accepted, and we can recommend users to set
We can default spark.yarn.maxAppAttempts to 1
I think #5776 is not necessary
Thanks. after setting
# 🔍 Description
## Issue References 🔗

This pull request fixes #5766

## Describe Your Solution 🔧

As discussed in #5766 (comment), we should add `spark.yarn.maxAppAttempts=1` for spark engine when `spark.master` is `yarn`.

## Types of changes 🔖

- [X] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [ ] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5798 from wForget/KYUUBI-5766-2.

Closes #5766

6477dfd [wforget] fix
c50f656 [wforget] fix order
dbc1891 [wforget] comment
a493e29 [wforget] fix style
4fa0651 [wforget] fix test
b899646 [wforget] add test
954a30d [wforget] [KYUUBI #5766] Default `spark.yarn.maxAppAttempts` to 1 for spark engine

Authored-by: wforget <[email protected]>
Signed-off-by: wforget <[email protected]>
(cherry picked from commit 6a282fc)
Signed-off-by: wforget <[email protected]>
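For readers who want the idea behind that commit in miniature, here is a rough sketch of the defaulting rule it describes: set `spark.yarn.maxAppAttempts` to 1 when `spark.master` is `yarn`, unless the user has already chosen a value. The helper name `with_default_max_app_attempts` and the plain-dict representation of the engine configuration are assumptions made for illustration; this is not Kyuubi's actual code.

```python
def with_default_max_app_attempts(conf: dict[str, str]) -> dict[str, str]:
    """Return a copy of conf with spark.yarn.maxAppAttempts defaulted to "1" on YARN.

    Hypothetical helper for illustration only; not part of Kyuubi.
    """
    result = dict(conf)  # copy so the caller's configuration is not mutated
    if result.get("spark.master") == "yarn":
        # setdefault keeps an explicit user-provided value untouched
        result.setdefault("spark.yarn.maxAppAttempts", "1")
    return result


if __name__ == "__main__":
    print(with_default_max_app_attempts({"spark.master": "yarn"}))
```

Using a default rather than a forced override means a user who explicitly wants ApplicationMaster retries can still set a higher value.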
Code of Conduct
Search before asking
Describe the bug
When the Kyuubi Spark engine fails over (an ApplicationMaster retry), the kyuubi.engine.submit.time value it uses is wrong.
Affects Version(s)
1.7.0
Kyuubi Server Log Output
No response
Kyuubi Engine Log Output
No response
Kyuubi Server Configurations
No response
Kyuubi Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?