Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FSDP-QLoRA doc updates for TRL integration #1471

Merged
merged 2 commits into from
Jan 14, 2025

Conversation

blbadger
Copy link
Contributor

@blbadger blbadger commented Jan 8, 2025

Small documentation changes to maintain compatibility as the SFTTrainer class received updates as of TRL v0.13.0:

  1. max_seq_length and dataset_text_field moved to become SFTConfig class arguments from the SFTTrainer, causing errors when calling the trainer if these arguments are retained.
  2. The tokenizer argument is being deprecated in favor of the more general processing_class, which can be a tokenizer.

The doc currently does not define the training_arguments arg, but we can assume that this is a an instance of the SFTConfig class in which case max_seq_length and dataset_text_field would be defined there. I have removed these args from the SFTTrainer instantiation because of this, although these arguments could be defined in a config if that is preferable.

Swapping the tokenizer for processing_class deals with the second change. This modification is optional as the tokenizer kwarg is deprecated and swapped automatically, but will probably be useful going forward.

Copy link
Collaborator

@Titus-von-Koeller Titus-von-Koeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @blbadger for you first PR and this proactive attitude towards fixing things to the benefit of the community ❤️ Really appreciated!

@matthewdouglas matthewdouglas merged commit a9cfd1b into bitsandbytes-foundation:main Jan 14, 2025
@blbadger
Copy link
Contributor Author

Happy to contribute @Titus-von-Koeller !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants