feat: Add more prompting options to OpenAIAnswerGenerator #4138
Conversation
Hey @Timoeller, @tholor and I finished coding and reviewing a biggish update to this node in this PR #4038 last week. Could you build your changes from this PR on top of those? It should be merged into main now. Apologies for the extra work on this.
@retry_with_exponential_backoff(
    backoff_in_seconds=int(os.environ.get(HAYSTACK_REMOTE_API_BACKOFF_SEC, 1)),
    max_retries=int(os.environ.get(HAYSTACK_REMOTE_API_MAX_RETRIES, 5)),
    errors=(OpenAIRateLimitError, OpenAIError),
Adding OpenAIError here is very important since the OpenAI API breaks regularly.
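For context, a minimal sketch of what a retry decorator like the one in the diff above could look like. This is an illustration, not Haystack's actual implementation; the real decorator's signature and behavior may differ:

```python
import random
import time
from functools import wraps


def retry_with_exponential_backoff(backoff_in_seconds=1, max_retries=5, errors=(Exception,)):
    """Sketch: retry the wrapped function on the listed errors, waiting
    backoff * 2**attempt seconds (plus jitter) between attempts."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except errors:
                    if attempt == max_retries:
                        raise  # out of retries: surface the original error
                    time.sleep(backoff_in_seconds * 2**attempt + random.random())
        return wrapper
    return decorator


# Hypothetical flaky call that succeeds on the third attempt:
@retry_with_exponential_backoff(backoff_in_seconds=0, max_retries=2, errors=(ConnectionError,))
def flaky(state={"calls": 0}):
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient API failure")
    return "ok"
```

Adding OpenAIError to the `errors` tuple means any API-side failure, not just rate limiting, triggers the backoff-and-retry path.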
@Timoeller unfortunately, I didn't find time to finish it. However, I tried the merge to the best of my knowledge by converting instructions and runtime instructions to PromptTemplates.
temperature: float = 0.2,
presence_penalty: float = 0.1,
frequency_penalty: float = 0.1,
examples_context: Optional[str] = None,
examples: Optional[List[List[str]]] = None,
instructions: Optional[str] = None,
add_runtime_instructions: bool = False,
To be honest, I am not really convinced of adding those two params here. It feels like adding a piece here and there instead of building the needed functionality into the design of PromptTemplate. Finding a suitable design there is not easy but worth the effort, I think.
I'd propose splitting this PR:
- one PR that we can merge quickly: changes to the OpenAIError (important!), new stop_words param, clean_documents method
- one PR (or just an issue for now) describing the need / use case you have here for instructions, and how that could potentially look in a PromptTemplate
Just as a heads up, the OpenAIError has been added to main in this PR.
I agree, a properly designed PromptTemplate that allows for runtime instructions is what we need. When do you think we will have such a design plus associated testing?
The OpenAIAnswerGenerator class (or its internals) should be substituted by the PromptNode + template in the future anyway. That is why I think we should experiment quickly and proceed with the current implementation. If the PromptTemplate design is just around the corner, we don't need to proceed here.
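To make the discussion concrete, here is a hypothetical sketch of what a template with runtime-filled $documents and $query variables could look like. The class and method names are illustrative only, not Haystack's actual PromptTemplate API:

```python
from string import Template


class SimplePromptTemplate:
    """Illustrative only: a prompt text with $documents and $query
    placeholders that are filled in at query time."""

    def __init__(self, text: str):
        self.text = text

    def fill(self, documents: str, query: str) -> str:
        # substitute() raises KeyError if a placeholder has no value,
        # which surfaces malformed templates early
        return Template(self.text).substitute(documents=documents, query=query)


template = SimplePromptTemplate(
    "Create a concise and informative answer based on: $documents\nQuestion: $query"
)
prompt = template.fill(
    documents="Paris is the capital of France.",
    query="What is the capital of France?",
)
```

The open design question in this thread is essentially where such a fill step should live: inside the node (as in this PR) or inside a shared PromptTemplate abstraction.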
There's already a proposal out there and it's actively discussed. I would expect to have something implemented in the next few weeks. I would rather concentrate our efforts on finding a nice design there than on "experimenting on two things in parallel".
@Timoeller As the OpenAIError changes are already merged, I'd propose for this PR: either you reduce it to the "new stop_words param" + "clean_documents", or we close it.
`[["Q: What is human life expectancy in the United States?", "A: 78 years."]]`
:param stop_words: Up to four sequences where the API stops generating further tokens. The returned text does
    format you'd like.
:param instructions: Here you can initialize custom instructions as prompt. Defaults to 'Create a concise and informative answer...'
So this is basically the prompt text?
Maybe rephrasing to: "The text of the prompt instructing the model what you want it to do. The default prompt is: Create a concise and informative answer...".
(also adding a task for Docs to create guidelines for writing prompts)
:param stop_words: Up to four sequences where the API stops generating further tokens. The returned text does
    format you'd like.
:param instructions: Here you can initialize custom instructions as prompt. Defaults to 'Create a concise and informative answer...'
:param add_runtime_instructions: If you like to add the prompt instructions (the instructions around the question)
Is this a boolean? If yes, I think we should say it explicitly: Set to True to add additional instructions for the model at query time. By default, this setting uses the value from the instructions parameter. If you set it to True, you'll be able to add more instructions when typing a query. If you do that, separate the instructions from the query in this format: "Here go your instructions [separator (what separators can they use?)] and here goes the query". Use the $documents and $query variables in the instruction text. They will be replaced with actual values during runtime. For example: <can we give an example here?>
    "... <instructions> ... [SEPARATOR] <question>"
    Also make sure to mention "$documents" and "$query" in the <instructions>, so
    that those will be replaced correctly.
:param stop_words: Up to 4 sequences where the API stops generating further tokens. The returned text does
What's a "sequence"? Is it a string? If yes, I think it's clearer to say:
Specifies the strings that make the API stop generating more tokens. The returned text doesn't contain these strings. You can specify up to four such strings.
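To illustrate what stop sequences do conceptually (this is a local sketch of the behavior, not how the OpenAI API implements it server-side): generation halts at the first occurrence of any stop string, and the stop string itself is not part of the returned text.

```python
def apply_stop_words(text: str, stop_words: list[str]) -> str:
    """Sketch: truncate text at the earliest occurrence of any stop string.
    The stop string itself is excluded from the result."""
    cut = len(text)
    for stop in stop_words:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)  # keep the earliest cut point
    return text[:cut]


# Stops a Q/A-style completion from running into the next question:
answer = apply_stop_words("A: 78 years.\nQ: next question", ["\nQ:"])
```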
temp = query.split("[SEPARATOR]")
if len(temp) != 2:
    logger.error(
        "Instructions given to the OpenAIAnswerGenerator were not correct, please follow the structure "
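The parsing in the diff above can be sketched as a standalone helper. The function name and the choice to raise instead of log are illustrative, not the PR's actual code:

```python
def split_runtime_instructions(query: str, separator: str = "[SEPARATOR]"):
    """Sketch: split a combined '<instructions> [SEPARATOR] <question>' string
    into its two parts. Raises ValueError if the input does not contain
    exactly one separator."""
    parts = query.split(separator)
    if len(parts) != 2:
        raise ValueError(
            f"Expected exactly one '{separator}' between instructions and "
            f"question, got {len(parts)} part(s)."
        )
    instructions, question = (p.strip() for p in parts)
    return instructions, question


instructions, question = split_runtime_instructions(
    "Answer using $documents and $query. [SEPARATOR] What is Haystack?"
)
```

A query with zero or multiple separators is rejected outright rather than guessed at, which matches the strict len(temp) != 2 check in the diff.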
The instructions for OpenAIAnswerGenerator were incorrect. Make sure you follow the structure described in the docstrings (can we tell them explicitly which docstrings we mean here?)
@Timoeller this PR is quite old at this point; is it still valid?
Proposed Changes:
How did you test it?
Notes for the reviewer
Checklist
fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.