fix: Downgrade ompi version to v4.1.5rc2 #7441

Merged
merged 6 commits into from
Jul 16, 2024

@krishung5 krishung5 commented Jul 12, 2024

What does the PR do?

Downgrade the Open MPI (ompi) version to v4.1.5rc2 to avoid UCC issues when running the TRT-LLM container on Kubernetes.
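
One way to confirm that a built container actually carries the pinned version is to parse the output of `mpirun --version`. The sketch below is illustrative only and not part of this PR: the helper names are hypothetical, and it assumes the output contains a line of the form `mpirun (Open MPI) X.Y.Z[rcN]`.

```python
import re

# Pin taken from the PR title; adjust if the pinned version changes.
EXPECTED_OMPI_VERSION = "4.1.5rc2"

# Assumption: `mpirun --version` prints a line like "mpirun (Open MPI) 4.1.5rc2".
_VERSION_RE = re.compile(r"\(Open MPI\)\s+v?(\d+\.\d+\.\d+(?:rc\d+)?)")


def parse_ompi_version(version_output):
    """Return the Open MPI version found in the given text, or None if absent."""
    match = _VERSION_RE.search(version_output)
    return match.group(1) if match else None


def is_expected_version(version_output, expected=EXPECTED_OMPI_VERSION):
    """Return True only if the parsed version equals the expected pin."""
    return parse_ompi_version(version_output) == expected
```

Inside the container, the text to check could be obtained with `subprocess.run(["mpirun", "--version"], capture_output=True, text=True).stdout` and passed to `is_expected_version`.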

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type box here and add the label to the GitHub PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

Where should the reviewer start?

Test plan:

TRT-LLM backend build and tests should be passing.

  • CI Pipeline ID: 16512052

Caveats:

Background

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

@krishung5 krishung5 added the "PR: build" (Changes that affect the build system or external dependencies) and "PR: fix" (A bug fix) labels Jul 12, 2024
@krishung5 krishung5 requested review from mc-nv and fpetrini15 July 12, 2024 21:29
@fpetrini15 fpetrini15 left a comment

LGTM, thanks for the fix 🚀

@krishung5

@fpetrini15 FYI - though the TRT-LLM team says there shouldn't be any issues with the downgrade, they are still trying to verify whether there is any perf regression. It might be worth waiting for their verification before merging this in.

@fpetrini15 fpetrini15 self-requested a review July 12, 2024 22:17
@fpetrini15 fpetrini15 left a comment

Dismissing approval until the TRT-LLM team verifies the fix

@krishung5

Rebased

@fpetrini15 fpetrini15 self-requested a review July 16, 2024 17:51
@fpetrini15 fpetrini15 merged commit 7a407d2 into r24.07 Jul 16, 2024
3 checks passed
@fpetrini15 fpetrini15 deleted the krish-ucx-mpi branch July 16, 2024 17:52
pvijayakrish pushed a commit that referenced this pull request Jul 23, 2024
Downgrade ompi version to v4.1.5rc2
Labels
  • PR: build (Changes that affect the build system or external dependencies)
  • PR: fix (A bug fix)
2 participants