[KYUUBI #5766] Default spark.yarn.maxAppAttempts to 1 for spark engine #5798

Closed
wForget wants to merge 7 commits into master from wForget:KYUUBI-5766-2

Conversation

@wForget (Member) commented Dec 1, 2023

🔍 Description

Issue References 🔗

This pull request fixes #5766

Describe Your Solution 🔧

As discussed in #5766 (comment), we should add `spark.yarn.maxAppAttempts=1` for the Spark engine when `spark.master` is `yarn`.
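
A minimal, self-contained sketch of the intended behavior (identifiers below are illustrative; the actual change lives in SparkProcessBuilder and is shown later in this conversation):

```scala
import java.util.Locale

object MaxAppAttemptsDefaultSketch extends App {
  // Hypothetical stand-in for the engine's effective Spark configuration.
  val conf: Map[String, String] = Map("spark.master" -> "yarn")

  // When the cluster manager is YARN, default spark.yarn.maxAppAttempts to 1,
  // so a failed engine ApplicationMaster is not silently retried by YARN.
  val yarnDefault: Map[String, String] =
    if (conf.get("spark.master").exists(_.toLowerCase(Locale.ROOT).startsWith("yarn"))) {
      Map("spark.yarn.maxAppAttempts" -> "1")
    } else Map.empty

  println(yarnDefault) // Map(spark.yarn.maxAppAttempts -> 1)
}
```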

Types of changes 🔖

  - [x] Bugfix (non-breaking change which fixes an issue)
  - [ ] New feature (non-breaking change which adds functionality)
  - [ ] Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request ⚰️

Behavior With This Pull Request 🎉

Related Unit Tests


Checklists

📝 Author Self Checklist

  - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
  - [x] I have performed a self-review
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have made corresponding changes to the documentation
  - [x] My changes generate no new warnings
  - [ ] I have added tests that prove my fix is effective or that my feature works
  - [ ] New and existing unit tests pass locally with my changes
  - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

📝 Committer Pre-Merge Checklist

  - [x] Pull request title is okay.
  - [x] No license issues.
  - [x] Milestone correctly set?
  - [ ] Test coverage is ok
  - [x] Assignees are selected.
  - [x] Minimum number of approvals
  - [x] No changes are requested

Be nice. Be informative.

@codecov-commenter commented Dec 1, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (51c3e03) 61.29% compared to head (6477dfd) 61.27%.
Report is 2 commits behind head on master.

Files Patch % Lines
...ache/kyuubi/engine/spark/SparkProcessBuilder.scala 75.00% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #5798      +/-   ##
============================================
- Coverage     61.29%   61.27%   -0.03%     
  Complexity       23       23              
============================================
  Files           608      608              
  Lines         36027    36033       +6     
  Branches       4952     4952              
============================================
- Hits          22083    22078       -5     
- Misses        11556    11559       +3     
- Partials       2388     2396       +8     

☔ View full report in Codecov by Sentry.

def extraYarnConf: Map[String, String] = {
  if (clusterManager().exists(_.toLowerCase(Locale.ROOT).startsWith("yarn"))) {
    // set `spark.yarn.maxAppAttempts` to 1 to avoid invalid attempts.
    Map(YARN_MAX_APP_ATTEMPTS_KEY -> "1")
  } else {
    Map.empty
  }
}

Contributor:
YARN-5617. AMs only intended to run one attempt can be run more than once

Fix Version/s: 2.9.0, 3.0.0-alpha4, 2.8.2

Member:

maybe we should add such info to the comment

Member Author:
This seems to be a very old issue; do we still need to add a relevant comment to the code?

Member:
It would be better to leave a hint about the behavior change that happened after Hadoop 2.7.

@wForget requested review from pan3793 and cxzl25 on December 6, 2023 03:01
@@ -140,8 +140,9 @@ class SparkProcessBuilder(
       allConf = allConf ++ zkAuthKeytabFileConf(allConf)
     }
     // pass spark engine log path to spark conf
-    (allConf ++ engineLogPathConf ++ appendPodNameConf(allConf)).foreach { case (k, v) =>
-      buffer ++= confKeyValue(convertConfigKey(k), v)
+    (allConf ++ engineLogPathConf ++ extraYarnConf ++ appendPodNameConf(allConf)).foreach {

Contributor:
It looks like extraYarnConf will override the user's configuration.
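
For context on this concern: with Scala's `Map ++`, values from the right-hand operand win on duplicate keys, so whichever map is appended last overrides the others. A tiny illustration (values are made up):

```scala
object MapPrecedenceSketch extends App {
  val userConf   = Map("spark.yarn.maxAppAttempts" -> "3") // what the user asked for
  val ourDefault = Map("spark.yarn.maxAppAttempts" -> "1") // the injected default

  // Appending the default last silently overrides the user's setting ...
  println(userConf ++ ourDefault) // Map(spark.yarn.maxAppAttempts -> 1)
  // ... while appending it first lets the user's setting win.
  println(ourDefault ++ userConf) // Map(spark.yarn.maxAppAttempts -> 3)
}
```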

Member Author:
I have moved extraYarnConf to the front

Member:
Emm... if we really want to respect the user's conf here, we need to do what appendPodNameConf does.
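
A rough sketch of that idea, i.e. only injecting the default when the key is absent from the user-supplied conf (this mirrors the approach, not the actual appendPodNameConf or final patch code; the helper name is illustrative):

```scala
import java.util.Locale

// Illustrative helper: return the YARN default only when the user has not set it.
def yarnMaxAttemptsDefault(
    conf: Map[String, String],
    sparkMaster: Option[String]): Map[String, String] = {
  val isYarn = sparkMaster.exists(_.toLowerCase(Locale.ROOT).startsWith("yarn"))
  if (isYarn && !conf.contains("spark.yarn.maxAppAttempts")) {
    Map("spark.yarn.maxAppAttempts" -> "1")
  } else {
    Map.empty
  }
}
```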

Member Author:
thanks, fixed

Member Author:
The construction of allConf is too complex; I would prefer to put the key-value pairs into a mutable map. Let's optimize it later.
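
One possible shape of that later cleanup (purely illustrative, not part of this PR): accumulate everything into a single mutable map so the precedence of defaults, derived conf, and user conf is explicit in one place.

```scala
import scala.collection.mutable

object MutableConfSketch extends App {
  // Later puts override earlier ones, so precedence reads top to bottom.
  val effective = mutable.LinkedHashMap[String, String]()
  effective ++= Map("spark.yarn.maxAppAttempts" -> "1") // engine defaults first
  effective ++= Map("spark.yarn.maxAppAttempts" -> "3") // then user conf, which wins
  println(effective("spark.yarn.maxAppAttempts"))       // 3
}
```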

@wForget self-assigned this Dec 7, 2023
@wForget added this to the v1.8.1 milestone Dec 7, 2023
@wForget closed this in 6a282fc Dec 7, 2023
wForget added a commit that referenced this pull request Dec 7, 2023

Closes #5798 from wForget/KYUUBI-5766-2.

Closes #5766

6477dfd [wforget] fix
c50f656 [wforget] fix order
dbc1891 [wforget] comment
a493e29 [wforget] fix style
4fa0651 [wforget] fix test
b899646 [wforget] add test
954a30d [wforget] [KYUUBI #5766] Default `spark.yarn.maxAppAttempts` to 1 for spark engine

Authored-by: wforget <[email protected]>
Signed-off-by: wforget <[email protected]>
(cherry picked from commit 6a282fc)
Signed-off-by: wforget <[email protected]>
@wForget (Member Author) commented Dec 7, 2023

Thanks, merged to master and 1.8


Successfully merging this pull request may close these issues.

[TASK][EASY][SPARK] kyuubi.engine.submit.time is wrong for ApplicationMaster retries
4 participants