Stop on multiple end tokens #1518
Conversation
I think we should think first of the overall API experience we want here. What about something like this?

# Default. Stop at gemma_lm.preprocessor.tokenizer.end_token_id, or error if
# self.preprocessor is None.
gemma_lm.generate(
    prompt,
    max_length=64,
    stop_token_ids="auto",
)

# Don't stop till max_length!
gemma_lm.generate(
    prompt,
    max_length=64,
    stop_token_ids=None,
)

# Custom. Provide multiple stop tokens; in this case we also stop on the
# literal word "stop".
gemma_lm.generate(
    prompt,
    max_length=64,
    stop_token_ids=[tokenizer.end_token_id, tokenizer.token_to_id("stop")],
)

I don't really like setting this on the tokenizer. Tokenizer special token ids are not generally set by a user. Every …
If we go with the above proposal, we should update the sampler API to also take in stop_token_ids. We can do this with Gemma at first, but we should eventually update all models to have a consistent API surface. We also might want to refactor a helper like this:

from keras import ops  # Keras 3 ops; adjust to the project's ops module if needed

def any_equal(inputs, values):
    """Return a mask that is True anywhere `inputs` has a value in `values`."""
    output = ops.equal(inputs, values[0])
    for value in values[1:]:
        # Compare against each remaining stop value and accumulate with a logical OR.
        output = ops.logical_or(output, ops.equal(inputs, value))
    return output
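For illustration, a minimal usage sketch of the helper above; the sample token ids and the expected mask are made up for this example, not taken from the PR:

import numpy as np

token_ids = np.array([[1, 5, 9],
                      [2, 3, 1]])
# Mark every position holding token id 1 or 9.
mask = any_equal(token_ids, (1, 9))
# mask -> [[True, False, True],
#          [False, False, True]]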
We're currently defaulting to a mix: if a preprocessor is specified we use "auto", otherwise we go with None. Should we error out if no preprocessor is specified, or just switch to None?
Discussed offline: we're going to do a full refactor and go with the saner choice of erroring if "auto" is specified with no preprocessor. The API will be more consistent for multi-token requirements.
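As a rough sketch of the agreed behavior (hypothetical code, not the actual keras-nlp implementation; the helper name normalize_stop_token_ids is made up):

def normalize_stop_token_ids(stop_token_ids, preprocessor):
    # "auto" needs a preprocessor to look up the tokenizer's end token.
    if stop_token_ids == "auto":
        if preprocessor is None:
            raise ValueError(
                "`stop_token_ids='auto'` requires a preprocessor. Pass explicit "
                "token ids, or `None` to always generate to `max_length`."
            )
        return (preprocessor.tokenizer.end_token_id,)
    # `None` disables early stopping entirely.
    if stop_token_ids is None:
        return ()
    return tuple(stop_token_ids)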
@mattdangerw this works for Gemma; if the overall method lgty, we can replicate it in the other models. Given that we're switching to …
Looks good generally! Dropped a few comments. Can probably start replicating to other models.
@@ -81,7 +81,7 @@ def wrapped_generate_function(
     import jax

     @jax.jit
-    def compiled_generate_function(inputs, end_token_id, state):
+    def compiled_generate_function(inputs, stop_token_ids, state):
Fairly technical JAX question, but what happens if we pass lists of differing lengths here? Does JAX automatically recompile? We could possibly pass static_argnames here, but I'm not sure of the right approach.
Every element in the list will be treated as an individual JAX variable and will trigger recompilation. I think static_argnames is the right call here; just checking that it works well with lists.
We will have to force this to be a tuple to guarantee that the stop tokens are immutable (static arguments must be hashable, and lists are not).
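A minimal sketch of the approach being discussed, assuming stop_token_ids is passed as a hashable tuple so jax.jit can treat it as static; the function body here is illustrative, not the PR's actual code:

from functools import partial

import jax
import jax.numpy as jnp

# Static arguments are baked into the compiled function: a new tuple value
# triggers a recompile, and unhashable values (like lists) raise an error.
@partial(jax.jit, static_argnames="stop_token_ids")
def compiled_generate_function(inputs, stop_token_ids, state):
    mask = jnp.zeros(inputs.shape, dtype=bool)
    for token_id in stop_token_ids:  # unrolled at trace time
        mask = mask | (inputs == token_id)
    return mask

inputs = jnp.array([[1, 5, 9]])
compiled_generate_function(inputs, (1, 9), None)  # tuple: hashable, compiles once
# compiled_generate_function(inputs, [1, 9], None)  # list: unhashable -> error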
Looks good to me! Few last comments.
* Add multitoken stopping
* Update gemma_causal_lm.py
* Add further multitoken support
* Formatting
* Revert tokenizer changes
* Move multi token stop to generative task
* None check
* None check
* Error message
* Add stop_token_ids
* Util testing
* Fix sampler tests
* All multitoken stop to all models
* Sampler multi token
* Formatting
* Tuple required
* Tuple docstring
* Pytorch GPU fix
* Numpy fix