feat: Add E2E Tests for Phi-3 and Tuning #476

Merged
merged 64 commits into from
Jul 8, 2024
96e125c
fix: Only 1 GPU required - DDP not required
ishaansehgal99 Jun 18, 2024
1b4a132
feat: Begin addition of E2E for finetuning
ishaansehgal99 Jun 18, 2024
650eb91
feat: Add Phi3 inference infernece e2e
ishaansehgal99 Jun 18, 2024
7bd1b23
fix: comment to test new e2e
ishaansehgal99 Jun 18, 2024
9372bc7
fix: update image reference
ishaansehgal99 Jun 19, 2024
21c6337
feat: Add client-go for getting pod logs for tuning e2e
ishaansehgal99 Jun 21, 2024
8c48497
Update Makefile
ishaansehgal99 Jun 21, 2024
4364af0
fix: namespace for configmap creation
ishaansehgal99 Jun 21, 2024
d89dbc6
Merge branch 'Ishaan/e2e-tuning' of https://github.com/Azure/kaito in…
ishaansehgal99 Jun 21, 2024
3414e71
fix: Use updated test SKU
ishaansehgal99 Jun 21, 2024
a2fe6ee
fix: increase timeout
ishaansehgal99 Jun 21, 2024
8129a80
fix: increase timeout
ishaansehgal99 Jun 21, 2024
828d22a
fix: Allow tuning to create service
ishaansehgal99 Jun 21, 2024
edd8ae5
fix: Get preset name
ishaansehgal99 Jun 21, 2024
84730b9
Update Makefile
ishaansehgal99 Jun 24, 2024
bf3b32c
Update Makefile
ishaansehgal99 Jun 24, 2024
655b850
fix: remove push secret
ishaansehgal99 Jun 24, 2024
a67dba5
add some logging
ishaansehgal99 Jun 24, 2024
a9cf3de
requires push secret
ishaansehgal99 Jun 24, 2024
0d769c4
fix:copy secret code
ishaansehgal99 Jun 24, 2024
2bc0170
fix: additional logs
ishaansehgal99 Jun 24, 2024
4546cbb
fix: Change to using Phi3
ishaansehgal99 Jun 24, 2024
45fe354
fix: image tags
ishaansehgal99 Jun 24, 2024
9474bb8
fix: update dockerfile
ishaansehgal99 Jun 25, 2024
b44a4f2
revert dockerfile for now
ishaansehgal99 Jun 25, 2024
7623e4e
fix: remove deepspeed
ishaansehgal99 Jun 25, 2024
10d7823
remove deepspeed
ishaansehgal99 Jun 25, 2024
dfb3302
fix: specify target modules
ishaansehgal99 Jun 26, 2024
b895abc
fix: specify target modules
ishaansehgal99 Jun 26, 2024
3ef891d
fix: target modules
ishaansehgal99 Jun 26, 2024
dff4e1a
fix: in cluster loading
ishaansehgal99 Jun 26, 2024
9c88254
fix: more err msg
ishaansehgal99 Jun 26, 2024
cb15d57
fix: improve testing
ishaansehgal99 Jun 26, 2024
82c1865
fix: nil pointer exception
ishaansehgal99 Jun 26, 2024
fa04e50
fix: Using correct paths
ishaansehgal99 Jun 26, 2024
b6ce51e
revert timeout
ishaansehgal99 Jun 26, 2024
ed47711
fix: errs validation fix
ishaansehgal99 Jun 27, 2024
bcfd26c
Merge branch 'main' of https://github.com/Azure/kaito into Ishaan/e2e…
ishaansehgal99 Jun 27, 2024
0991c78
fix: resolve
ishaansehgal99 Jun 27, 2024
93b5069
Merge branch 'main' of https://github.com/Azure/kaito into Ishaan/e2e…
ishaansehgal99 Jul 2, 2024
806675e
feat: remove extraneous files
ishaansehgal99 Jul 3, 2024
cd89e20
resolve: defaults
ishaansehgal99 Jul 3, 2024
57c7863
resolve: defaults
ishaansehgal99 Jul 3, 2024
aecf955
resolve: defaults
ishaansehgal99 Jul 3, 2024
8c8ec1f
resolve: defaults
ishaansehgal99 Jul 3, 2024
fc4b45a
resolve: defaults
ishaansehgal99 Jul 3, 2024
6010fa0
resolve: defaults
ishaansehgal99 Jul 3, 2024
0401c71
resolve: defaults
ishaansehgal99 Jul 3, 2024
d725d79
resolve: defaults
ishaansehgal99 Jul 3, 2024
b8f65aa
feat: add util functions and preset test update
ishaansehgal99 Jul 3, 2024
767ead7
typo
ishaansehgal99 Jul 3, 2024
1f01414
feat: increase max parallel
ishaansehgal99 Jul 3, 2024
c6ac4f2
fix: dont use pointer for string
ishaansehgal99 Jul 3, 2024
0011adb
fix: Use public mcr image
ishaansehgal99 Jul 3, 2024
4f2d7fc
fix: image name
ishaansehgal99 Jul 4, 2024
4edba5a
Merge branch 'main' of https://github.com/Azure/kaito into Ishaan/e2e…
ishaansehgal99 Jul 4, 2024
5ee4e42
fix: secret
ishaansehgal99 Jul 4, 2024
dbb3be9
fix
ishaansehgal99 Jul 4, 2024
3080949
fix
ishaansehgal99 Jul 4, 2024
22d775c
fix: remove svc requirement
ishaansehgal99 Jul 5, 2024
942b1bb
fix: add back tests
ishaansehgal99 Jul 5, 2024
58d0c77
fix: lower steps
ishaansehgal99 Jul 5, 2024
365cbd9
Merge branch 'main' into Ishaan/e2e-tuning
ishaansehgal99 Jul 5, 2024
9598fb9
Merge branch 'main' into Ishaan/e2e-tuning
ishaansehgal99 Jul 7, 2024
fix: lower steps
ishaansehgal99 committed Jul 5, 2024
commit 58d0c77006e9fc25c36fdf7d749865b3f7979e25
2 changes: 1 addition & 1 deletion test/e2e/utils/utils.go
@@ -295,7 +295,7 @@ func GenerateE2ETuningConfigMapManifest(namespace string) *corev1.ConfigMap {
ddp_find_unused_parameters: false
save_strategy: "epoch"
per_device_train_batch_size: 1
- max_steps: 5 # Adding this line to limit training to 5 steps
+ max_steps: 2 # Adding this line to limit training to 2 steps

DataCollator:
mlm: true