This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Lower level API? #267

Closed
spion opened this issue May 22, 2023 · 3 comments
Labels
issue:enhancement New feature or request

Comments

@spion

spion commented May 22, 2023

I was wondering whether any of the crates exposes a simple, low-level API.

Something along the lines of:

(sessionState, tokenIds) -> (nextSessionState, nextLogits)

This separates the network itself from the other fiddly bits, allowing you to implement any strategy for handling the logits: increasing or decreasing probabilities based on external criteria before sampling with top_p / top_k, restricting the logits to a subset using context-free grammars, or any number of other strategies (e.g. the ideas in #235).

It also removes the need to pass an RNG or to tie a lot of parameters to a session, lets one use a different tokenizer (e.g. the Hugging Face one), and removes the need to set up progress-tracking callbacks.
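To make the proposed shape concrete, here is a minimal Rust sketch of a pure evaluation step with two caller-side logit strategies (a bias and an allow-list mask). Every name in it (`SessionState`, `Evaluate`, `ToyModel`, `apply_bias`, `restrict_to`, `argmax`) is hypothetical and is not part of the llm crates; the toy model only stands in for a real network.

```rust
// Hypothetical sketch: none of these names come from the llm crates.

/// Per-conversation state (for a real model: the KV cache and position).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct SessionState {
    pub tokens_seen: Vec<u32>,
}

/// The proposed shape: (sessionState, tokenIds) -> (nextSessionState, nextLogits).
pub trait Evaluate {
    fn eval(&self, state: &SessionState, token_ids: &[u32]) -> (SessionState, Vec<f32>);
}

/// Toy stand-in for a network: favours the id that follows the last token seen.
pub struct ToyModel {
    pub vocab_size: usize,
}

impl Evaluate for ToyModel {
    fn eval(&self, state: &SessionState, token_ids: &[u32]) -> (SessionState, Vec<f32>) {
        let mut next = state.clone();
        next.tokens_seen.extend_from_slice(token_ids);
        let mut logits = vec![0.0_f32; self.vocab_size];
        if let Some(&last) = next.tokens_seen.last() {
            logits[(last as usize + 1) % self.vocab_size] = 5.0;
        }
        (next, logits)
    }
}

/// Caller-side strategy: add a bias to selected token ids.
pub fn apply_bias(logits: &mut [f32], biases: &[(u32, f32)]) {
    for &(id, bias) in biases {
        if let Some(logit) = logits.get_mut(id as usize) {
            *logit += bias;
        }
    }
}

/// Caller-side strategy: mask every token outside an allowed set,
/// as a grammar-constrained decoder might do at each step.
pub fn restrict_to(logits: &mut [f32], allowed: &[u32]) {
    for (id, logit) in logits.iter_mut().enumerate() {
        if !allowed.contains(&(id as u32)) {
            *logit = f32::NEG_INFINITY;
        }
    }
}

/// Greedy choice; returns None for an empty logit slice.
pub fn argmax(logits: &[f32]) -> Option<u32> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(id, _)| id as u32)
}
```

With this split, the caller calls `eval`, edits the returned logits however it likes, picks a token, and passes that token to the next `eval` call; no RNG, tokenizer, or callback is involved in the evaluation step itself.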

@philpax
Collaborator

philpax commented May 22, 2023

The closest thing is infer_next_token. You're right that there should be a lower-level API that evaluates the transformer and recalculates the logits, but leaves sampling up to you.

@philpax
Collaborator

philpax commented May 31, 2023

Hi there! I've just merged #280, which should address this by letting you define your own sampling strategy and combine it with infer_next_token. Can you let me know if that solves your problem?

The reason for not supporting arbitrary inference is that the previous tokens are required for continued sampling, so the session needs to know which token was chosen after inference.
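The point about previous tokens can be illustrated with a small Rust sketch: a sampler that penalises already-emitted tokens only works if the session records each chosen token before the next step. The names here (`TokenSampler`, `PenalisedGreedy`, `ToySession`) are hypothetical and do not reproduce the trait merged in #280; the fixed logits stand in for real model output.

```rust
// Hypothetical names throughout; this is not the llm crate's actual Sampler trait.

/// A sampling strategy that can inspect the tokens chosen so far.
pub trait TokenSampler {
    fn sample(&self, previous_tokens: &[u32], logits: &[f32]) -> u32;
}

/// Greedy sampler that subtracts a penalty from tokens already emitted.
pub struct PenalisedGreedy {
    pub penalty: f32,
}

impl TokenSampler for PenalisedGreedy {
    fn sample(&self, previous_tokens: &[u32], logits: &[f32]) -> u32 {
        let mut best = 0_u32;
        let mut best_score = f32::NEG_INFINITY;
        for (id, &logit) in logits.iter().enumerate() {
            let id = id as u32;
            let score = if previous_tokens.contains(&id) {
                logit - self.penalty
            } else {
                logit
            };
            if score > best_score {
                best = id;
                best_score = score;
            }
        }
        best
    }
}

/// Toy session that owns the token history so sampling can consult it.
#[derive(Default)]
pub struct ToySession {
    pub tokens: Vec<u32>,
}

impl ToySession {
    /// Stand-in for model evaluation: fixed logits that favour token 1.
    fn logits(&self) -> Vec<f32> {
        vec![0.5, 2.0, 1.5]
    }

    /// Samples one token and records it, so the next step sees it as history.
    pub fn next_token(&mut self, sampler: &dyn TokenSampler) -> u32 {
        let logits = self.logits();
        let token = sampler.sample(&self.tokens, &logits);
        self.tokens.push(token);
        token
    }
}
```

If `next_token` did not push the chosen token, the second call would see an empty history and pick token 1 again; because the history is recorded, the penalty applies and a different token wins.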

@spion
Author

spion commented May 31, 2023

The Sampler trait looks perfect at first glance; I'm going to give it a try in my project. Edit: Thank you!

@philpax philpax closed this as completed Jun 19, 2023