
Examples/Server API change breaks prompt as JSON array. #4676

Closed
jboero opened this issue Dec 28, 2023 · 8 comments

Comments

@jboero
Contributor

jboero commented Dec 28, 2023

My app uses the examples/server API /completions endpoint, and it recently stopped working, returning "no slots available" errors even though a slot is clearly available. It seems one slot is now required per item in the prompt array. Also, instead of simple message content, streaming results are now provided as an object containing an array of results.

Was there a documented change for this? Maybe it's better to use the ChatGPT-compatible /v1/chat/completions endpoint instead? It seems the built-in API isn't versioned.
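To illustrate the symptom described above, a request along these lines (a sketch of the examples/server JSON body; field values are illustrative) would previously have been handled as one prompt, but now appears to claim one slot per array element, so a server started with a single slot reports "no slots available":

```json
{
  "prompt": ["first prompt", "second prompt"],
  "n_predict": 32,
  "stream": true
}
```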

@yiding

yiding commented Jan 12, 2024

It seems like #4232 changed the behavior of the prompt parameter: when given an array, instead of interpreting the array as a single prompt defined by that array, the server now interprets it as multiple prompts to be evaluated in parallel. This change is not documented in the example server README, which still describes the old behavior.

This makes it impossible to send a single completion request whose prompt is an array with multiple elements. Concretely, you can no longer:

  1. pass in an array of tokens
  2. pass in a mixed array of strings and tokens.

In my case, I wanted to do this because I wanted to intersperse text and raw tokens in the prompt, without making tokenize requests first to tokenize the text.
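Under the old behavior, a mixed prompt could be expressed in one request along these lines (a sketch; the token ids are illustrative, not real tokenizations), with no prior call to the tokenize endpoint:

```json
{
  "prompt": ["Once upon a time", [318, 262], " the end."],
  "n_predict": 64
}
```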

I think at the very least the new behavior should be documented. Moreover, overloading the prompt parameter this way is confusing, since it changes the interpretation of the field and shadows the old behavior. Perhaps it would make more sense to have a separate parameter such as prompts (plural) for multi-prompt mode, mutually exclusive with prompt (singular).

cc @ziedbha who originally added this.

@jboero
Contributor Author

jboero commented Jan 12, 2024

Excellent, thank you for confirming my issue.

@ziedbha
Contributor

ziedbha commented Jan 13, 2024

Apologies, I haven't had time to look at this properly. What I can do is revert to the old behavior to unblock you, and revisit this later. Sorry again.

@yiding

yiding commented Jan 13, 2024

Commenting out this block is sufficient to restore the old behavior:

    // when a completion task's prompt array is not a singleton, we split it into multiple requests
    if (task.data.count("prompt") && task.data.at("prompt").size() > 1)
    {
        lock.unlock(); // entering new func scope
        return split_multiprompt_task(task);
    }

Personally I can just work off of a local commit, so no need to do any reverting on my account, not sure about @jboero

@jboero
Contributor Author

jboero commented Jan 13, 2024

No need to revert if this is the way it's meant to be. Or maybe version the API to support both?
Or should the guidelines recommend using the ChatGPT-compatible endpoint instead? To be fair, what started out as a simple example may be exceeding the purposes it was originally built for.

@ggerganov
Owner

> To be fair what started out as a simple example is maybe exceeding the purposes it was originally built for.

It is. It would be great to have a fully functional server with all the bells and whistles, but maintaining it requires extra effort - even more so without a way to create a CI with some simple tests. I'm planning to refactor a lot of the code, but it's not really high-priority atm since there are more important things to be done for the core lib first.

Hopefully the community will help out in the meantime (#4216)

@jboero
Contributor Author

jboero commented Jan 13, 2024 via email

@jboero
Contributor Author

jboero commented Jan 22, 2024

Closing this for now as it seems this was a feature change.

@jboero jboero closed this as completed Jan 22, 2024