Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate string output/input to Turn objects #1089

Open
wants to merge 90 commits into
base: main
Choose a base branch
from

Conversation

leondz
Copy link
Collaborator

@leondz leondz commented Jan 27, 2025

Resolves #602

summary

garak prompts & outputs are all captured as str

We can about more modalities than strings - scope encompasses image attachments, files, rich outputs. Though the Language part of LLM is requisite for garak scope to apply.

This PR abstracts the type sent to and received from generators away from str to Turn. Each Turn() instance corresponds to a turn in a conversation. Simple probes pose Turn() to their target. Attempts manage a message history of Turn() items instead of strings.

details

Turn class

Turn represents the content part of a query to or response from an LLM.

Turns have:

  • text: corresponds to what was previously prompt: str
  • parts: a list of data, default []
  • equality testing function
  • method for populating parts from file data
  • serialisation and deserialisation from dict

Turns should be None on when the model provided no response. If the response was returned but had e.g. no text component, use a Turn() with text==None.

This is only a part of a query sent to a target model. Turns are agnostic to metadata about which role is uttering it. A full LLM query might be composed of multiple turns.

Tests should all specifically use the Turn object.

NB it's now preferred to copy turn text locally instead of manipulating it in place. e.g.

lower_text = output.text.lower() # OK
output.text = output.text.lower() # disprefered

-- unless dealing things that need to edit to object, e.g. buffs

Updates to Attempt

Attempt is a core place this change affects.

  • attempt.prompt is now a Turn; refer to what was attempt.prompt, with attempt.prompt.text
  • attempt.outputs is now a list of Turns
  • attempt.messages is still a list of lists of dicts, but the content part of the dicts must be a Turn
  • A little magic is applied to maintain existing interfaces for writing to Attempts, where some methods will accept a List of str.
    • Delving into Attempt internals requires using Turns.
    • Direct access of messages internals must be done with awareness of Turn

type changes

  • Anything operating on an attempt will need a change
  • Detectors operate on outputs which is now a list of Turns
  • Buffs manipulate prompt which is now a Turn
  • generators.base.generate() should take a Turn and not a string
  • generators.base.generate() should return a list of Turns
    • Currently, generate() returns a list of str, and the magic in Attempt handles the mapping to Turn()
    • This constrains us to only having string output. We're cool with that for now I think
    • @jmartin-tech is leaving generators.base.generate() as returning a list of strings OK? Is it sensible to postpone migrating this to a list of Turns? Would appreciate your thoughts

Verification

  • python -m pytest tests/test_attempt.py
  • the full test suite with API keys enabled

@leondz leondz added the architecture Architectural upgrades label Jan 27, 2025
@leondz leondz requested a review from jmartin-tech January 27, 2025 12:14
@leondz
Copy link
Collaborator Author

leondz commented Feb 17, 2025

@jmartin-tech failing tests pass locally. they seem to centre on a class that's removed. could this be a caching thing?

@leondz leondz marked this pull request as ready for review February 18, 2025 10:08
@jmartin-tech
Copy link
Collaborator

Yes, test failures are a caching related issue. I will see about addressing and offering a fix here or in a separate PR.

@leondz
Copy link
Collaborator Author

leondz commented Feb 19, 2025

sorry about the extra churn in test json - went to match the pretty printing - json_pp ordered the arguments.

PR gtg for review.

@leondz leondz force-pushed the feature/turn_objects branch from 40633a1 to 061dbcc Compare February 21, 2025 09:38
@erickgalinkin erickgalinkin self-requested a review February 21, 2025 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
architecture Architectural upgrades
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable more complex prompts
2 participants