.github/workflows/remote-push.yml
Outdated
AWS-AVX2-192G-4-A10G-96G-Benchmark:
  uses: ./.github/workflows/nm-benchmark.yml
  with:
    label: aws-avx2-192G-4-a10g-96G
maybe this could just be a single GPU? it's just a 7b model so 4 GPUs seems like overkill
+1
Thanks guys.
4 GPU instance (Gi_per_thread = 4):
- build time : 10 mins
- bench time : 20 mins
1 GPU instance (Gi_per_thread = 12):
- build time : 30 mins
- bench time : 20 mins
I am trying to see if it builds with Gi_per_thread = 8.
looks good, but the model is a bit small, so check if we can use a single gpu runner.
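To make the trade-off in this thread concrete, here is a small sketch that tallies the end-to-end time for each runner from the build and bench numbers reported above. The dictionary layout and the `total_minutes` helper are illustrative names, not part of the workflow, and the sketch assumes the build and the benchmark run back to back.

```python
# Timings in minutes, copied from the numbers reported in the thread above.
# The keys and the helper below are illustrative, not part of the workflow.
timings = {
    "4 GPU (Gi_per_thread = 4)": {"build": 10, "bench": 20},
    "1 GPU (Gi_per_thread = 12)": {"build": 30, "bench": 20},
}


def total_minutes(stage_times):
    """Sum build and bench time, assuming the two stages run sequentially."""
    return stage_times["build"] + stage_times["bench"]


totals = {name: total_minutes(stages) for name, stages in timings.items()}

for name, minutes in totals.items():
    print(f"{name}: {minutes} min total")
```

On these numbers the bench time is the same on both runners, so the whole difference comes from the build step.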
.github/workflows/remote-push.yml
Outdated
AWS-AVX2-192G-4-A10G-96G-Benchmark:
  uses: ./.github/workflows/nm-benchmark.yml
  with:
    label: aws-avx2-192G-4-a10g-96G
+1
f83269e to fbecddb
997c64c to 960cb81
SUMMARY:
Trigger minimal benchmarking on remote-push jobs.
TEST PLAN:
Jobs on this PR