
[SPARK-37713][K8S] Assign namespace to executor configmap #34983

Closed · wants to merge 3 commits

Conversation

@dcoliversun (Contributor)

What changes were proposed in this pull request?

Since Spark 3.x, each executor pod needs a separate executor ConfigMap, but no namespace is assigned to the ConfigMap when it is built. K8s views a ConfigMap without a namespace as a global resource, so once pod access is restricted (global resources cannot be read), the executor cannot obtain its own ConfigMap.

Why are the changes needed?

Fixes an executor pod authorization error in K8s: when the executor pod cannot read global resources, the executor ConfigMap is not available to the executor.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Passes the integration tests and CIs.
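For context, the fix amounts to setting `metadata.namespace` when the executor ConfigMap is built. Here is a minimal sketch of the manifest shape in Python (illustrative only: Spark's actual implementation is Scala, in `KubernetesClientUtils.buildConfigMap`, using the fabric8 client; the function and sample values below are hypothetical):

```python
def build_config_map(name, data, namespace):
    """Return a ConfigMap manifest as a plain dict.

    Illustrative stand-in for Spark's Scala buildConfigMap; the change in
    this PR is the explicit metadata.namespace field below.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": name,
            # Without this field the manifest carries a null namespace and
            # the API server fills in a default, which broke restricted setups.
            "namespace": namespace,
        },
        "data": dict(data),
    }

cm = build_config_map("spark-exec-conf-map", {"spark.conf": "k=v"}, "spark-jobs")
print(cm["metadata"]["namespace"])  # → spark-jobs
```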

@AmplabJenkins

Can one of the admins verify this patch?

@dcoliversun (Contributor Author)

@dongjoon-hyun Would you review this code? Thanks :)

@dongjoon-hyun (Member) left a comment:

Could you add a test case, @dcoliversun ?

@dongjoon-hyun (Member)

Gentle ping, @dcoliversun .

@dcoliversun (Contributor Author)

Sorry for the late reply. I've been working on it for days. I couldn't add a test case because K8s allows a ConfigMap without `.metadata.namespace`: the request to the API server puts the namespace into the body, but `.metadata.namespace` is still null at request time. I think this change is a good one. Users will see this ConfigMap through the dashboard, and a null namespace in the executor ConfigMap YAML would confuse them.

@dongjoon-hyun (Member) left a comment:

Why is that a reason not to add a test case? You need to add a test case for your code, not for K8s itself.

> I couldn't add a test case because K8s allows a ConfigMap without `.metadata.namespace`.

It sounds like you tried to make K8s itself fail, but we know that Apache Spark works well with the previous config without any issue.

```scala
def buildConfigMap(
    configMapName: String,
    confFileMap: Map[String, String],
    configMapNameSpace: String,
```
dongjoon-hyun (Member):

Please try to get the namespace via `confFileMap` instead of changing this method signature.

dcoliversun (Contributor Author):

Done. The `buildConfigMap` method now gets the namespace via `confFileMap`.
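The reviewer's suggestion, resolving the namespace from the properties map already passed in rather than widening the signature, can be sketched like this (Python for illustration; Spark's real code is Scala, and `spark.kubernetes.namespace` is Spark's config key for the namespace, with Kubernetes' `default` as the usual fallback):

```python
# Resolve the namespace from the conf map instead of adding a parameter.
# "spark.kubernetes.namespace" is Spark's config key; "default" is the
# Kubernetes default namespace used as a fallback. The function name is
# a hypothetical stand-in for the Scala logic.
def resolve_namespace(conf_file_map):
    return conf_file_map.get("spark.kubernetes.namespace", "default")

print(resolve_namespace({"spark.kubernetes.namespace": "spark-jobs"}))  # → spark-jobs
print(resolve_namespace({}))  # → default
```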

@dongjoon-hyun (Member) left a comment:

I made two requests for this PR:

  • Provide test coverage for the newly added code path.
  • Avoid method signature changes whenever possible.

The rest of the code looks reasonable, @dcoliversun .

@dcoliversun (Contributor Author)

OK, I understand. I am working on it.


```scala
test("verify that configmap built as expected") {
  val configMapName = "testConfigMapName"
  val properties = Map(Config.KUBERNETES_NAMESPACE.key -> "default")
```
dongjoon-hyun (Member):

If you don't mind, could you use a random name here? Although the previous code doesn't attach the namespace, this `default` is too generic to be in the test case.

dcoliversun (Contributor Author):

OK, I'll change it.

```
@@ -82,8 +81,9 @@ class KubernetesClientUtilsSuite extends SparkFunSuite with BeforeAndAfter {
   }

   test("verify that configmap built as expected") {
-    val configMapName = "testConfigMapName"
     val properties = Map(Config.KUBERNETES_NAMESPACE.key -> "default")
+    val configMapName = java.util.UUID.randomUUID.toString
```
dongjoon-hyun (Member):

`import java.util.UUID`?

dcoliversun (Contributor Author):

It works without `import java.util.UUID`. Is this a code-import style limitation? If so, I can add the import :)

dongjoon-hyun (Member) · Dec 31, 2021:

@dcoliversun, Apache Spark recommends not hiding imports in the code. In particular, you use it twice, at lines 84 and 85; the repetition makes the code verbose and hurts readability.

dcoliversun (Contributor Author):

@dongjoon-hyun Thanks for your suggestion. I've imported `java.util.UUID` and added a name/namespace prefix to improve readability. Please review it 😊
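The randomized test fixture the reviewer asked for looks roughly like this in Python (the actual Spark test is Scala using `java.util.UUID`; the `configmap-`/`namespace-` prefixes mirror the readability prefixes mentioned above, but the exact values are assumptions, not taken from the merged test):

```python
import uuid

# Use random, non-generic fixture names instead of hard-coded values like
# "testConfigMapName" or "default", so the test cannot pass by accident
# through a name collision with a real cluster object.
config_map_name = f"configmap-{uuid.uuid4()}"
namespace = f"namespace-{uuid.uuid4()}"

print(config_map_name.startswith("configmap-"))  # → True
```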

@dcoliversun (Contributor Author)

Sorry, CI has some errors. I'm fixing them.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-37713][K8S] assign namespace to executor configmap [SPARK-37713][K8S] Assign namespace to executor configmap Dec 31, 2021
@dongjoon-hyun (Member) left a comment:

+1, LGTM. Thank you, @dcoliversun .

@dongjoon-hyun (Member)

cc @williamhyun

@dcoliversun (Contributor Author)

> +1, LGTM. Thank you, @dcoliversun .

@dongjoon-hyun Thanks for your review, which helps me a lot. HAPPY NEW YEAR

@dongjoon-hyun (Member)

Happy New Year, @dcoliversun ! :)

@dongjoon-hyun (Member) · Dec 31, 2021

K8s tests passed. Merged to master for Apache Spark 3.3.

@dongjoon-hyun (Member)

If you need this in next Apache Spark 3.2.1, please make a backporting PR against branch-3.2.

@dcoliversun dcoliversun deleted the SPARK-37713 branch December 31, 2021 06:15
@Yikun (Member) · Jan 17, 2022

FYI, this PR breaks the case of using a ConfigMap with a namespace specified (driver side); see #35215 .

You can use the command below to revert this commit manually until the PR above is merged:

```
git revert c4a9772f741836cdb399e35a45b3b5df48d9eea6
```

@GabeChurch

I'm noticing the same behavior as Yikun: this pull request breaks ConfigMap usage in non-"default" namespaces.

dongjoon-hyun added a commit that referenced this pull request Jan 29, 2022
### What changes were proposed in this pull request?

This is a follow-up of SPARK-37713 (#34983) to fix insufficient namespace propagation at configMap creation.

### Why are the changes needed?

This will recover the behavior on non-`default` namespace Spark jobs like the following.
```
NAMESPACE                          NAME                                   DATA   AGE
10e32ae0f2304236be9c687edf8a06a4   kube-root-ca.crt                       1      8s
10e32ae0f2304236be9c687edf8a06a4   spark-drv-fa3f387ea3871bbd-conf-map    2      5s
10e32ae0f2304236be9c687edf8a06a4   spark-exec-da5ea37ea38727f1-conf-map   2      3s
```

### Does this PR introduce _any_ user-facing change?

No. This is a bug fix of the unreleased patch.

### How was this patch tested?

Pass the CIs and manually run the integration tests, which were broken since the original PR.

```
$ mvn package -Pkubernetes -DskipTests
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --exclude-tags minikube,r
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with no resources & statefulset allocation
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- SPARK-37576: Rolling decommissioning
Run completed in 10 minutes, 34 seconds.
Total number of tests run: 22
Suites: completed 2, aborted 0
Tests: succeeded 22, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #35299

Closes #35359 from dongjoon-hyun/SPARK-37713.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>