This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Benchmarking : Remote push job #92

Merged
merged 1 commit into from
Mar 13, 2024

varun-sundar-rabindranath

SUMMARY:
Trigger minimal benchmarking on remote-push jobs.

TEST PLAN:
Jobs on this PR

@varun-sundar-rabindranath marked this pull request as draft on March 5, 2024 15:24
@varun-sundar-rabindranath marked this pull request as ready for review on March 5, 2024 19:01
Comment on lines 31 to 34
  AWS-AVX2-192G-4-A10G-96G-Benchmark:
    uses: ./.github/workflows/nm-benchmark.yml
    with:
      label: aws-avx2-192G-4-a10g-96G
Member

Maybe this could just be a single GPU? It's only a 7B model, so 4 GPUs seems like overkill.

Member

+1

Thanks guys.
4 GPU instance (Gi_per_thread = 4):

  • build time : 10 mins
  • bench time : 20 mins

1 GPU instance (Gi_per_thread = 12):

  • build time : 30 mins
  • bench time : 20 mins

I am trying to see if it builds with Gi_per_thread = 8.
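
To make the trade-off in these numbers concrete, here is a minimal sketch that totals the quoted build and bench times for each instance. The GPU-minutes figure rests on an assumption that is not stated in the thread: that every GPU on the instance stays reserved for the whole job, build included. The names `runs` and `summarize` are illustrative only.

```python
# Timings quoted in the comment above, in minutes.
runs = {
    "4 GPU (Gi_per_thread = 4)": {"gpus": 4, "build": 10, "bench": 20},
    "1 GPU (Gi_per_thread = 12)": {"gpus": 1, "build": 30, "bench": 20},
}


def summarize(run):
    """Return (wall-clock minutes, GPU-minutes) for one run."""
    total = run["build"] + run["bench"]
    # Assumption: all GPUs on the instance are held for the full job,
    # so GPU-minutes = number of GPUs * total wall-clock minutes.
    gpu_minutes = run["gpus"] * total
    return total, gpu_minutes


for name, run in runs.items():
    total, gpu_minutes = summarize(run)
    print(f"{name}: {total} min wall clock, {gpu_minutes} GPU-minutes")
```

Under that assumption the single-GPU instance takes longer end to end but holds fewer GPUs for less total GPU time; the extra wall-clock time comes entirely from the slower build, since the bench time is the same on both.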

Member

@andy-neuma left a comment

Looks good, but the model is a bit small, so check if we can use a single-GPU runner.

Comment on lines 31 to 34
  AWS-AVX2-192G-4-A10G-96G-Benchmark:
    uses: ./.github/workflows/nm-benchmark.yml
    with:
      label: aws-avx2-192G-4-a10g-96G
Member

+1

@varun-sundar-rabindranath merged commit fa8e147 into main on Mar 13, 2024
2 checks passed
@varun-sundar-rabindranath deleted the varun/remote-push-benchmark branch on March 13, 2024 17:17
3 participants