Use a lower concurrency with more repetition for L0_memory_growth #7127

Merged: 2 commits into main from krish-fix-l0-mem-growth, Apr 23, 2024

Conversation

krishung5
Contributor

The test is failing because onnxruntime throws a CUDA OOM error. Decrease the concurrency and increase the repetition so that CUDA memory isn't exhausted, while PA still sends enough requests for us to observe whether memory grows.
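To illustrate the trade-off, here is a minimal sketch of the arithmetic, assuming each repetition issues one request per concurrent slot. The `LoadPlan` and `rebalance` names and all the numbers are hypothetical and are not taken from the actual test script.

```python
# Hypothetical illustration only; names and values are not from L0_memory_growth.
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadPlan:
    concurrency: int  # requests in flight at once (drives peak GPU memory use)
    repetition: int   # how many times the load loop is repeated

    @property
    def total_requests(self) -> int:
        # Assumption: each repetition issues one request per concurrent slot.
        return self.concurrency * self.repetition


def rebalance(plan: LoadPlan, factor: int) -> LoadPlan:
    """Lower concurrency by `factor` and raise repetition by the same factor."""
    if factor <= 0 or plan.concurrency % factor != 0:
        raise ValueError("factor must be positive and divide the concurrency evenly")
    return LoadPlan(plan.concurrency // factor, plan.repetition * factor)


before = LoadPlan(concurrency=20, repetition=5)
after = rebalance(before, factor=4)

# Fewer requests are in flight at once, but the total request count is unchanged.
print(before.total_requests, after.total_requests)
```

Under that assumption, peak in-flight load drops while the total number of requests stays the same, so a memory-growth trend should still be observable.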

@krishung5 krishung5 requested a review from Tabrizian April 17, 2024 20:43
Tabrizian
Tabrizian previously approved these changes Apr 17, 2024
@krishung5
Contributor Author

Use a larger window to avoid the intermittent PA instability issue.

@krishung5 krishung5 merged commit 7d1b015 into main Apr 23, 2024
3 checks passed
@krishung5 krishung5 deleted the krish-fix-l0-mem-growth branch April 23, 2024 19:49