Handle last token from generation prompt #1153
Merged
Some tokenizers, like the Llama BPE tokenizer, might merge tokens into one, e.g. when a space is followed by certain characters like `$`. In a previous PR we handled this case for chosen/rejected, but one small improvement is to also remove the extra space token from the end of the prompt used for generation in `get_batch_samples`. It should be left to the model whether the next generated token is just the space token or one that combines the space with other characters into a single token. To do so, we need to manually check the `prompt_ids` for chosen/rejected and for the prompt by itself, and choose the shortest, knowing that they can only differ by 1 token.

Given the following example, `prompt + chosen` will leave the space after `[/INST]` unchanged, but that won't be the case for `prompt + rejected`, since it starts with `$`.
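For illustration, here is a minimal sketch of the trimming idea (not the actual TRL code from this PR; the model name, example strings, and the `common_prefix_len` helper are placeholders):

```python
from transformers import AutoTokenizer

# Placeholder model name, used only to illustrate Llama-style BPE behavior.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

prompt = "[INST] How much does it cost? [/INST] "  # note the trailing space
chosen = "It costs five dollars."
rejected = "$5."

prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
chosen_full_ids = tokenizer(prompt + chosen, add_special_tokens=False)["input_ids"]
rejected_full_ids = tokenizer(prompt + rejected, add_special_tokens=False)["input_ids"]


def common_prefix_len(a, b):
    """Length of the shared token prefix between two id sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# The trailing space can merge with the first character of the completion
# (" $" becomes a single token), so the prompt prefix inside prompt + rejected
# may be one token shorter than prompt_ids. Keeping the shortest prefix as the
# generation prompt leaves it to the model whether to emit a bare space token
# or a merged one.
generation_prompt_len = min(
    len(prompt_ids),
    common_prefix_len(prompt_ids, chosen_full_ids),
    common_prefix_len(prompt_ids, rejected_full_ids),
)
generation_prompt_ids = prompt_ids[:generation_prompt_len]
```

Trimming the prompt this way avoids feeding the model a standalone space token that the chosen/rejected tokenization may never actually have produced.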
@kashif