feat: Add more prompting options to OpenAIAnswerGenerator #4138
Conversation
Hey @Timoeller, @tholor and I finished coding and reviewing a biggish update to this node in this PR #4038 last week. Could you build your changes from this PR on top of those? It should be merged into main now. Apologies for the extra work on this.
@retry_with_exponential_backoff(
    backoff_in_seconds=int(os.environ.get(HAYSTACK_REMOTE_API_BACKOFF_SEC, 1)),
    max_retries=int(os.environ.get(HAYSTACK_REMOTE_API_MAX_RETRIES, 5)),
    errors=(OpenAIRateLimitError, OpenAIError),
Adding OpenAIError here is very important since the OpenAI API breaks regularly.
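For context, a minimal sketch of what a retry decorator like the one in the diff above could look like. This is an illustration, not Haystack's actual implementation; the real decorator's signature and behavior may differ:

```python
import random
import time
from functools import wraps


def retry_with_exponential_backoff(backoff_in_seconds=1, max_retries=5, errors=(Exception,)):
    """Sketch: retry the wrapped function on the listed errors, waiting
    backoff * 2**attempt seconds (plus jitter) between attempts."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except errors:
                    if attempt == max_retries:
                        raise  # out of retries: surface the original error
                    time.sleep(backoff_in_seconds * 2**attempt + random.random())
        return wrapper
    return decorator


# Hypothetical flaky call that succeeds on the third attempt:
@retry_with_exponential_backoff(backoff_in_seconds=0, max_retries=2, errors=(ConnectionError,))
def flaky(state={"calls": 0}):
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient API failure")
    return "ok"
```

Adding OpenAIError to the `errors` tuple means any API-side failure, not just rate limiting, triggers the backoff-and-retry path.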
@Timoeller unfortunately, I didn't find time to finish it. However, I tried the merge to the best of my knowledge by converting instructions and runtime instructions to PromptTemplates.
temperature: float = 0.2,
presence_penalty: float = 0.1,
frequency_penalty: float = 0.1,
examples_context: Optional[str] = None,
examples: Optional[List[List[str]]] = None,
instructions: Optional[str] = None,
add_runtime_instructions: bool = False,
To be honest, I am not really convinced of adding those two params here. It feels like adding a piece here and there instead of building the needed functionality into the design of PromptTemplate. Finding a suitable design there is not easy but worth the effort, I think.
I'd propose splitting this PR:
- one PR that we can merge quickly: changes to the OpenAIError (important!), new stop_words param, clean_documents method
- one PR (or just an issue for now) describing the need / use case you have here for instructions, and how that could potentially look in a PromptTemplate
Just as a heads up, the OpenAIError has been added to main in this PR.
I agree, a properly designed PromptTemplate that allows for runtime instructions is what we need. When do you think we will have such a design plus associated testing?
The OpenAIAnswerGenerator class (or its internals) should be substituted by the PromptNode + template in the future anyway. That is why I think we should experiment quickly and proceed with the current implementation. If the PromptTemplate design is just around the corner, we don't need to proceed here.
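To make the discussion concrete, here is a hypothetical sketch of what a template with runtime-filled $documents and $query variables could look like. The class and method names are illustrative only, not Haystack's actual PromptTemplate API:

```python
from string import Template


class SimplePromptTemplate:
    """Illustrative only: a prompt text with $documents and $query
    placeholders that are filled in at query time."""

    def __init__(self, text: str):
        self.text = text

    def fill(self, documents: str, query: str) -> str:
        # substitute() raises KeyError if a placeholder has no value,
        # which surfaces malformed templates early
        return Template(self.text).substitute(documents=documents, query=query)


template = SimplePromptTemplate(
    "Create a concise and informative answer based on: $documents\nQuestion: $query"
)
prompt = template.fill(
    documents="Paris is the capital of France.",
    query="What is the capital of France?",
)
```

The open design question in this thread is essentially where such a fill step should live: inside the node (as in this PR) or inside a shared PromptTemplate abstraction.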
There's already a proposal out there and it's actively discussed. I would expect to have something implemented in the next few weeks. I would rather concentrate our efforts on finding a nice design there than on "experimenting on two things in parallel".
@Timoeller As the OpenAIError changes are already merged, I'd propose for this PR: either you reduce it to the "new stop_words param" + "clean_documents", or we close it.
`[["Q: What is human life expectancy in the United States?", "A: 78 years."]]`
:param stop_words: Up to four sequences where the API stops generating further tokens. The returned text does
    format you'd like.
:param instructions: Here you can initialize custom instructions as prompt. Defaults to 'Create a concise and informative answer...'
So this is basically the prompt text?
Maybe rephrasing to: "The text of the prompt instructing the model what you want it to do. The default prompt is: Create a concise and informative answer...".
(also adding a task for Docs to create guidelines for writing prompts)
:param stop_words: Up to four sequences where the API stops generating further tokens. The returned text does
    format you'd like.
:param instructions: Here you can initialize custom instructions as prompt. Defaults to 'Create a concise and informative answer...'
:param add_runtime_instructions: If you like to add the prompt instructions (the instructions around the question)
Is this a boolean? If yes, I think we should say it explicitly: Set to True to add additional instructions for the model at query time. By default, this setting uses the value from the instructions parameter. If you set it to True, you'll be able to add more instructions when typing a query. If you do that, separate the instructions from the query in this format: "Here go your instructions [separator (what separators can they use?)] and here goes the query". Use the $documents and $query variables in the instruction text. They will be replaced with actual values during runtime. For example: <can we give an example here?>
    "... <instructions> ... [SEPARATOR] <question>"
    Also make sure to mention "$documents" and "$query" in the <instructions>, so
    that those will be replaced correctly.
:param stop_words: Up to 4 sequences where the API stops generating further tokens. The returned text does
What's a "sequence"? Is it a string? If yes, I think it's clearer to say:
Specifies the strings that make the API stop generating more tokens. The returned text doesn't contain these strings. You can specify up to four such strings.
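To illustrate what stop sequences do conceptually (this is a local sketch of the behavior, not how the OpenAI API implements it server-side): generation halts at the first occurrence of any stop string, and the stop string itself is not part of the returned text.

```python
def apply_stop_words(text: str, stop_words: list[str]) -> str:
    """Sketch: truncate text at the earliest occurrence of any stop string.
    The stop string itself is excluded from the result."""
    cut = len(text)
    for stop in stop_words:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)  # keep the earliest cut point
    return text[:cut]


# Stops a Q/A-style completion from running into the next question:
answer = apply_stop_words("A: 78 years.\nQ: next question", ["\nQ:"])
```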
temp = query.split("[SEPARATOR]")
if len(temp) != 2:
    logger.error(
        "Instructions given to the OpenAIAnswerGenerator were not correct, please follow the structure "
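The parsing in the diff above can be sketched as a standalone helper. The function name and the choice to raise instead of log are illustrative, not the PR's actual code:

```python
def split_runtime_instructions(query: str, separator: str = "[SEPARATOR]"):
    """Sketch: split a combined '<instructions> [SEPARATOR] <question>' string
    into its two parts. Raises ValueError if the input does not contain
    exactly one separator."""
    parts = query.split(separator)
    if len(parts) != 2:
        raise ValueError(
            f"Expected exactly one '{separator}' between instructions and "
            f"question, got {len(parts)} part(s)."
        )
    instructions, question = (p.strip() for p in parts)
    return instructions, question


instructions, question = split_runtime_instructions(
    "Answer using $documents and $query. [SEPARATOR] What is Haystack?"
)
```

A query with zero or multiple separators is rejected outright rather than guessed at, which matches the strict len(temp) != 2 check in the diff.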
The instructions for OpenAIAnswerGenerator were incorrect. Make sure you follow the structure described in the docstrings (can we tell them explicitly which docstrings we mean here?)
@Timoeller this PR is quite old at this point; is it still valid?
Proposed Changes:
How did you test it?
Notes for the reviewer
Checklist
fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.