
Features/multiturn conversation #217

Merged: 21 commits into pre-staging on Apr 30, 2024

Conversation

steffencruz (Collaborator) commented Apr 23, 2024

Adds multi-turn conversation capabilities to the validator. This is important for the following reasons:

  • Miners are assessed on their ability to continue a conversation over multiple turns
  • Miners are required to parse the entire conversation history (essential for a chat app)

Overview:

  • The approach reuses the QuestionAnsweringTask to create followup questions, with a dedicated followup prompt.
  • We generate a reference answer using a dedicated followup reference prompt, and use the normal QA reward stack.
  • After the initial task, all subsequent tasks are QA.
  • The initial context is used throughout the entire conversation.
  • The number of turns in the conversation is randomly determined: at each turn, the conversation continues with 50% probability, capped at 10 turns, so most conversations have fewer than 3 turns (see the sketch after this list). This seems to work fine.
  • We use the best completion in the conversation history (alternatively we could use the reference, or mix both with some probability).
  • We don't create a new challenge for subsequent turns. Instead we prompt the LLM to continue the conversation in a style consistent with the user's challenge, which seems to work well.
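
A minimal sketch of the turn-count sampling described above (constant and function names are illustrative, not the validator's actual API):

import random
from collections import Counter

P_CONTINUE = 0.5   # probability of continuing the conversation at each turn
MAX_TURNS = 10     # hard cap on conversation length

def sample_num_turns() -> int:
    """Coin-flip continuation: turn count follows a capped geometric distribution."""
    turns = 1
    while turns < MAX_TURNS and random.random() < P_CONTINUE:
        turns += 1
    return turns

# Most conversations are short: P(turns <= 3) = 1 - 0.5**3 = 87.5%.
print(Counter(sample_num_turns() for _ in range(10_000)))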

Followup prompt

A total of 6 iterations of prompt engineering were carried out. Each iteration was manually inspected, checked for obvious artefacts (QG+QA leakage, length) and run through a battery of GPT-4 evals. After refinement, the followup prompt now appears to produce good, continuous conversations.
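
For illustration, a followup prompt in this spirit might look like the template below. This is a hypothetical sketch, not the exact prompt merged in this PR; the "ask, don't answer" instruction mirrors the mt-qg+qa fix noted in the run list further down.

FOLLOWUP_PROMPT_TEMPLATE = """\
You are the user in a conversation about the context below.

Context:
{context}

Conversation so far:
{history}

Ask one natural followup question that continues the conversation in the
same style as the user. Ask the question only; do not answer it.
"""

def make_followup_prompt(context: str, history: list[str]) -> str:
    # Hypothetical helper: renders the context and conversation history into the template.
    return FOLLOWUP_PROMPT_TEMPLATE.format(context=context, history="\n".join(history))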

Reference answer quality was judged by reward distributions on tracked experiments. The plots below show no clear deterioration in rewards as the conversation turn increases, which indicates that the prompts are stable.
[plots: reward distributions by conversation turn]

run_paths = {
    'mt-base': 'opentensor-dev/alpha-validators/h4ywaxzb',     # my first attempt
    'mt-qg+qa': 'opentensor-dev/alpha-validators/v9ggnzha',    # adds extra instruction to not answer the followup
    'mt-gpt1': 'opentensor-dev/alpha-validators/9w4oroso',     # first gpt variation
    'mt-gpt2': 'opentensor-dev/alpha-validators/kc3lxgd0',     # second gpt variation
    'mt-gpt3': 'opentensor-dev/alpha-validators/0swpcny1',     # third gpt variation
    'mt-gpt2-v2': 'opentensor-dev/alpha-validators/d0iwnlc2',  # refinement on second gpt variation
}
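
These are Weights & Biases run paths (entity/project/run_id). As a sketch, the logged reward history for any of them can be pulled with the public wandb API, assuming you have access to the project:

import wandb

api = wandb.Api()
run = api.run(run_paths['mt-gpt2-v2'])   # entity/project/run_id
history = run.history()                  # pandas DataFrame of logged metrics
print(history.columns.tolist())          # inspect which reward keys were logged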

GPT-4 evals

The plot below shows the quality of the followup questions, using GPT-4 as a judge.

[plot: GPT-4 judge scores for followup question quality]
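
A minimal sketch of an LLM-as-judge eval in this spirit, using the OpenAI chat API (the rubric, scale, and prompt wording are assumptions; the actual eval prompts aren't shown in this PR):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """\
Rate how naturally the followup question below continues the conversation,
on a scale of 1 (poor) to 10 (excellent). Reply with the number only.

Conversation:
{history}

Followup question:
{question}
"""

def judge_followup(history: str, question: str) -> int:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(history=history, question=question)}],
        temperature=0,
    )
    return int(response.choices[0].message.content.strip())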

p-ferreira merged commit 046e40c into pre-staging on Apr 30, 2024
2 checks passed
p-ferreira mentioned this pull request on May 1, 2024
Hollyqui deleted the features/multiturn-conversation branch on August 2, 2024