Multiple fixes for GPU scripts #922

Merged: 12 commits merged on Sep 11, 2023

Contributor

@manosnoam manosnoam commented Sep 10, 2023

In provision-gpu.sh:

  • Fix shebang error: /bash/bin: bad interpreter
  • Skip creating machineset if there's already one with annotations of GPU > 0
  • Apply the manifest instead of create, in case machineset already exists
  • Fix machineset renaming, which appended "-gpu" to the name multiple times
  • Remove the incorrect GPU label flag when creating the machineset
  • Add GPU label to the new machineset
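
The renaming fix in the list above can be illustrated with a small sketch. It is not the code from this PR: the helper name `gpu_machineset_name` and the exact suffix handling are assumptions, shown only to make the "append once" idea concrete.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): derive the GPU machineset name from a
# source machineset name without stacking "-gpu" on repeated runs.
gpu_machineset_name() {
  local name="$1"
  local suffix="-gpu"
  # Strip every trailing occurrence of the suffix, then add it exactly once.
  while :; do
    case "$name" in
      *"$suffix") name="${name%"$suffix"}" ;;
      *) break ;;
    esac
  done
  printf '%s%s\n' "$name" "$suffix"
}
```

Calling the helper on a name that already ends in "-gpu" returns the same name, so re-running the provisioning script would not keep lengthening it.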

In gpu_deploy.sh:

  • Fix GPU_INSTALL_DIR to point to the relative directory
  • Make gpu_deploy.sh executable (chmod +x)
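
One common way to make a variable such as GPU_INSTALL_DIR independent of the caller's working directory is to resolve it from the script's own location. The sketch below assumes that approach; the PR text does not show the actual assignment, and the helper name `dir_of` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): print the absolute directory that
# contains the given file path.
dir_of() {
  # The subshell keeps the caller's working directory unchanged.
  (cd "$(dirname "$1")" && pwd)
}

# Inside a script, the script's own directory could then be resolved like this
# (the assignment is an assumption, only the variable name comes from the PR):
# GPU_INSTALL_DIR="$(dir_of "${BASH_SOURCE[0]}")"
```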

This fixes the following error when running provision-gpu.sh:
/bash/bin: bad interpreter: No such file or directory

Signed-off-by: manosnoam <[email protected]>
@github-actions
Contributor

Robot Results

| ✅ Passed | ❌ Failed | ⏭️ Skipped | Total | Pass % |
| --- | --- | --- | --- | --- |
| 359 | 0 | 0 | 359 | 100 |

The following command fails:
oc create -f /tmp/gpu-machineset.json -l gpu-machineset=true

Passing a label to oc create is not supported, so the command returns:
error: no objects passed to create

Signed-off-by: manosnoam <[email protected]>
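
A two-step alternative, consistent with the PR description (apply the manifest, then add the GPU label), might look like the sketch below. The function name and its arguments are hypothetical, and the label `gpu-machineset=true` is taken from the failing command above.

```shell
#!/bin/bash
# Sketch under stated assumptions (not the PR's code): apply the manifest,
# then label the resulting machineset in a separate step.
apply_and_label_machineset() {
  local manifest="$1"
  local name="$2"
  local namespace="$3"
  # `oc apply` creates the object, or updates it if it already exists.
  oc apply -f "$manifest" || return 1
  # `--overwrite` keeps the step repeatable when the label is already set.
  oc label machineset "$name" -n "$namespace" gpu-machineset=true --overwrite
}
```

A call could look like `apply_and_label_machineset /tmp/gpu-machineset.json <machineset-name> <namespace>`, where both placeholders depend on the cluster.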
@manosnoam manosnoam changed the title Fix shabang in provision-gpu.sh Fix provision-gpu.sh Sep 10, 2023
@manosnoam manosnoam changed the title Fix provision-gpu.sh Multiple fixes for GPU scripts Sep 11, 2023
@manosnoam
Contributor Author

Output example:

image

@manosnoam manosnoam requested a review from lugi0 September 11, 2023 09:18
lugi0 previously approved these changes Sep 11, 2023
set -e

# Check whether a machineset for GPU already exists
EXISTING_GPU_MACHINESET="$(oc get machineset -A -o jsonpath='{.items[?(@.metadata.labels.gpu-machineset=="true")].metadata.name}')"

I would personally opt to use the OpenShift annotation for GPU nodes, as it would be more robust than a custom label that we apply ourselves, but either way works.
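
The annotation-based check suggested in this comment could be sketched roughly as follows. The annotation key `machine.openshift.io/GPU` is an assumption that should be verified on the target cluster, and the function name is hypothetical.

```shell
#!/bin/bash
# Sketch (not the PR's code): print the names of machinesets whose GPU
# annotation is greater than zero.
# ASSUMPTION: the annotation key is machine.openshift.io/GPU.
find_gpu_machinesets() {
  # Emit one "name<TAB>gpu-count" line per machineset, then keep rows with count > 0.
  oc get machineset -A \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.machine\.openshift\.io/GPU}{"\n"}{end}' |
    awk -F '\t' '$2 + 0 > 0 { print $1 }'
}
```

Machinesets without the annotation produce an empty second column, which the filter treats as zero.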

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 0 (rating A)
  • Coverage: no coverage information
  • Duplication: 0.0%
@manosnoam manosnoam merged commit 834a854 into red-hat-data-services:master Sep 11, 2023
3 participants