This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
The closest thing is infer_next_token. You're right that there should be a lower-level API that evaluates the transformer and recalculates the logits, but leaves sampling up to you.
Hi there! I've just merged #280 which should address this by letting you define your own sampling strategy, which you can combine with infer_next_token. Can you let me know if that solves your problem?
The reason for not supporting arbitrary inference is that the previous tokens are required for continued sampling, so the session needs to know which token was chosen after inference.
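To make that dependency concrete, here is a toy sketch (not the crate's implementation). `ToySession`, `next_logits` and `accept` are hypothetical names invented for this illustration, and the "model" is a trivial rule over the token history. The point is only that the next logits are a function of the tokens the session has recorded, so the caller has to report back what it sampled.

```rust
/// Hypothetical session that owns the token history
/// (a stand-in for the cached context of a real model).
pub struct ToySession {
    history: Vec<u32>,
    vocab: usize,
}

impl ToySession {
    pub fn new(vocab: usize) -> Self {
        assert!(vocab > 0, "vocabulary must not be empty");
        ToySession { history: Vec::new(), vocab }
    }

    /// Logits for the next position, computed only from the recorded history.
    /// Toy rule: score 1.0 for (sum of history) % vocab, 0.0 everywhere else.
    pub fn next_logits(&self) -> Vec<f32> {
        let mut logits = vec![0.0; self.vocab];
        let sum: usize = self.history.iter().map(|&t| t as usize).sum();
        logits[sum % self.vocab] = 1.0;
        logits
    }

    /// The caller tells the session which token it actually sampled.
    pub fn accept(&mut self, token: u32) {
        self.history.push(token);
    }

    pub fn history(&self) -> &[u32] {
        &self.history
    }
}
```

If the caller samples a token from `next_logits()` but never calls `accept`, the following call returns the same logits again, because the session has no way to know which continuation was chosen.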
I was wondering if any of the crates exposes a simple, low-level API.
Something along the lines of:
(sessionState, tokenIds) -> (nextSessionState, next_logits)
This separates the network itself from the other fiddly bits, allowing you to implement any strategy for handling the logits (such as increasing/decreasing probabilities based on external criteria before sampling with top_p / top_k, restricting the logits to a subset using context-free grammars, or any number of other strategies, e.g. the #235 stuff).
It also removes the need to pass an RNG or to have a lot of parameters tied to a session, lets one use a different tokenizer (e.g. the Hugging Face one), and removes the need to set up progress-tracking callbacks.
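A minimal sketch of what such an interface could look like, written against a toy model rather than any real crate. Everything here (`SessionState`, `Model`, `ToyModel`, `adjust_logits`, `argmax`, `generate`) is a hypothetical name made up for illustration, and greedy selection stands in for top_p / top_k sampling so that no RNG is involved.

```rust
/// Hypothetical session state: just the tokens seen so far.
#[derive(Clone, Debug, PartialEq)]
pub struct SessionState {
    pub tokens: Vec<u32>,
}

/// Hypothetical interface: (state, token ids) -> (next state, next logits).
pub trait Model {
    fn step(&self, state: &SessionState, token_ids: &[u32]) -> (SessionState, Vec<f32>);
}

/// Toy stand-in for a network: favors the id after the last token seen.
pub struct ToyModel {
    pub vocab: usize,
}

impl Model for ToyModel {
    fn step(&self, state: &SessionState, token_ids: &[u32]) -> (SessionState, Vec<f32>) {
        let mut tokens = state.tokens.clone();
        tokens.extend_from_slice(token_ids);
        let mut logits = vec![0.0f32; self.vocab];
        if let Some(&last) = tokens.last() {
            logits[(last as usize + 1) % self.vocab] = 1.0;
        }
        (SessionState { tokens }, logits)
    }
}

/// Caller-side logit handling: add per-token biases, then mask out any
/// token that is not in `allowed` (e.g. tokens a grammar rules out).
pub fn adjust_logits(logits: &mut [f32], bias: &[(u32, f32)], allowed: Option<&[u32]>) {
    for &(id, b) in bias {
        if let Some(l) = logits.get_mut(id as usize) {
            *l += b;
        }
    }
    if let Some(allowed) = allowed {
        for (id, l) in logits.iter_mut().enumerate() {
            if !allowed.contains(&(id as u32)) {
                *l = f32::NEG_INFINITY;
            }
        }
    }
}

/// Greedy pick; ties resolve to the lowest id. Returns None for empty input.
pub fn argmax(logits: &[f32]) -> Option<u32> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &l) in logits.iter().enumerate() {
        match best {
            Some((_, b)) if l <= b => {}
            _ => best = Some((i, l)),
        }
    }
    best.map(|(i, _)| i as u32)
}

/// Generation loop owned entirely by the caller.
pub fn generate<M: Model>(model: &M, prompt: &[u32], steps: usize, allowed: &[u32]) -> Vec<u32> {
    let empty = SessionState { tokens: Vec::new() };
    let (mut state, mut logits) = model.step(&empty, prompt);
    let mut out = Vec::new();
    for _ in 0..steps {
        adjust_logits(&mut logits, &[], Some(allowed));
        let next = match argmax(&logits) {
            Some(t) => t,
            None => break,
        };
        out.push(next);
        let (s, l) = model.step(&state, &[next]);
        state = s;
        logits = l;
    }
    out
}
```

Because `step` takes and returns the state explicitly, the tokenizer, the sampling strategy and any progress reporting all live in the caller's loop. Swapping `argmax` for a random sampler would be the only place an RNG is needed.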