Add functionality to select heads to evaluate error tables on and perform dry runs. #836
base: main
Conversation
Please fix the pre-commit check suggestions; otherwise this looks good.
@IsaacParker30 this addresses some of the things in #839; I should have checked the PR before writing it. I wonder if it makes sense to also implement the first part in this.
One small point: if you do not need the head stats at the end, you may not want to print them during training either.
@IsaacParker30 this will need a small rebase.
Added 2 new arguments:

--eval_heads
This allows the user to specify which heads they want to evaluate / print an error table for at the end of training. If not set, it defaults to evaluating all heads.
Usage: specify all the heads you want to evaluate as a single string, with heads separated by commas. For example:
--eval_heads='default,pt_head,head3'
An example use case would be replay fine-tuning, where the user doesn't want to evaluate the large foundation model database on the pt_head.
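For reference, here is a minimal sketch of how comma-separated head selection could be handled. The `--eval_heads` flag matches this PR, but the parser, the `select_eval_heads` helper, and the example head names are illustrative placeholders rather than the actual `run_train.py` code:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Illustrative parser: the real run_train.py defines many more options.
    parser = argparse.ArgumentParser(description="Sketch of head selection for evaluation")
    parser.add_argument(
        "--eval_heads",
        type=str,
        default=None,
        help=(
            "Comma-separated list of heads to evaluate at the end of training, "
            "e.g. --eval_heads='default,pt_head'. If omitted, all heads are evaluated."
        ),
    )
    return parser


def select_eval_heads(eval_heads_arg, available_heads):
    """Return the heads to build error tables for, defaulting to all heads."""
    if eval_heads_arg is None:
        return list(available_heads)
    requested = [h.strip() for h in eval_heads_arg.split(",") if h.strip()]
    unknown = [h for h in requested if h not in available_heads]
    if unknown:
        raise ValueError(f"Unknown heads requested for evaluation: {unknown}")
    return requested


if __name__ == "__main__":
    args = build_parser().parse_args(["--eval_heads=default,pt_head"])
    print(select_eval_heads(args.eval_heads, ["default", "pt_head", "head3"]))
    # -> ['default', 'pt_head']
```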
--dry_run
Adding this argument stops the run_train.py script just before model training is about to begin, i.e. just before the call to the tools.train() function. This allows the user to check that they've set their parameters correctly before starting an expensive training run.
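A rough sketch of that early-exit pattern is shown below. `tools.train` is the entry point named in the description, while the simplified `main()` and its setup comments are placeholders, not the real script:

```python
import argparse
import logging


def main() -> None:
    # Illustrative parser: the real run_train.py has many more options.
    parser = argparse.ArgumentParser(description="Sketch of the --dry_run early exit")
    parser.add_argument(
        "--dry_run",
        action="store_true",
        help="Set everything up but stop just before training begins.",
    )
    args = parser.parse_args()

    # ... model, optimizer, data loaders, loss, etc. would be configured here ...

    if args.dry_run:
        # Bail out right before the expensive part so the user can verify
        # that their settings are correct.
        logging.info("Dry run: stopping before the call to tools.train().")
        return

    # tools.train(model=model, optimizer=optimizer, ...)  # the real training call


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    main()
```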
I have tested:

Let me know if any changes are needed.