Enable more complex prompts #602

Open
jmartin-tech opened this issue Apr 11, 2024 · 1 comment · May be fixed by #1089
Assignees
Labels
architecture Architectural upgrades

Comments

@jmartin-tech
Collaborator

jmartin-tech commented Apr 11, 2024

The current generator interface expects to receive prompts as str; see: https://github.com/leondz/garak/blob/4127ae5092ad3acaba680a32011018fc564cc92a/garak/generators/base.py#L66

This initial simple submission process has worked to date; however, #587 shows an example of a query prompt that needs a more complex structure. In this case the multi-modal model accepts both text and image data to generate a response.

I propose adding an abstraction layer by implementing a Prompt base interface class that can be extended to model these more complex prompts for processing by each generator.

def generate(self, prompt: Prompt) -> List[str]:

or possibly also abstracting the response as well:

def generate(self, prompt: Prompt) -> List[PromptResponse]:
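As a minimal sketch of how the second signature might look in use: the PromptResponse dataclass, its text field, and the EchoGenerator class below are all hypothetical names made up for illustration, not existing garak classes, and the Prompt here is only a stand-in for the proposed base class.

```python
from dataclasses import dataclass
from typing import List, Optional


class Prompt:
    """Minimal stand-in for the proposed base class."""

    def __init__(self, text: Optional[str] = None):
        self.text = text

    def __str__(self) -> str:
        # Fall back to an empty string so str() never fails on a missing text.
        return self.text or ""


@dataclass
class PromptResponse:
    """Hypothetical response wrapper; the field name is an assumption."""

    text: str


class EchoGenerator:
    """Toy generator (not a garak class) showing the proposed signature."""

    def __init__(self, generations: int = 1):
        self.generations = generations

    def generate(self, prompt: Prompt) -> List[PromptResponse]:
        # A real generator would call a model here; this one echoes the text.
        return [PromptResponse(text=str(prompt)) for _ in range(self.generations)]


responses = EchoGenerator(generations=2).generate(Prompt("hello"))
print([r.text for r in responses])  # ['hello', 'hello']
```

Wrapping the output in an object would leave room for non-text responses later without changing the generate signature again.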

Prompts can then be further segmented into things like TextPrompt, MultiStepTextPrompt, VisualPrompt, VisualTextPrompt and other such constructs that build on the available base functions, allowing use with different and even mixed prompt modalities for models that can accept various input patterns.

Rough example:

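import logging

from PIL import Image

logger = logging.getLogger(__name__)


class Prompt:
    # Base class: every prompt carries at least a text segment.
    text = None

    def __str__(self):
        # Return an empty string rather than None so str() never raises.
        return self.text or ""


class TextPrompt(Prompt):
    def __init__(self, text: str):
        self.text = text


class VisualTextPrompt(Prompt):
    image = None

    def __init__(self, text: str, image_path: str):
        self.text = text
        try:
            # Keep the opened image so a generator can use it later.
            self.image = Image.open(image_path)
        except Exception:
            logger.error(f"No image found at: {image_path}")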
@jmartin-tech
Collaborator Author

Another recent finding related to multi-modal prompts is a need to define relationships between parts of the prompt. The case identified is that some models' request formats may have different expectations for referencing images in text. The current visual_jailbreak prompts include a placeholder in the text segment of the prompt that some models may need to remove or replace with an API-specific linking/embedding.
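One possible way to handle this, sketched under assumptions: the placeholder string "[IMAGE]", the render_text helper, and the image_ref parameter are hypothetical and chosen only for illustration. The actual placeholder used by the visual_jailbreak prompts is not stated here, and the dataclass below is a simplified stand-in for the VisualTextPrompt idea above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder token; the actual string used by the
# visual_jailbreak prompts is not stated in this issue.
IMAGE_PLACEHOLDER = "[IMAGE]"


@dataclass
class VisualTextPrompt:
    """Simplified stand-in: a text segment plus a path to an image."""

    text: str
    image_path: str


def render_text(prompt: VisualTextPrompt, image_ref: Optional[str] = None) -> str:
    """Return the text segment in the form a given model API expects.

    image_ref=None  -> the model takes the image separately, so the
                       placeholder is dropped from the text.
    image_ref="..." -> the API-specific reference replaces the placeholder.
    """
    if image_ref is None:
        # Remove the placeholder, then collapse the whitespace left behind.
        # Note: this also collapses newlines into single spaces.
        return " ".join(prompt.text.replace(IMAGE_PLACEHOLDER, " ").split())
    return prompt.text.replace(IMAGE_PLACEHOLDER, image_ref)


p = VisualTextPrompt(text="Describe [IMAGE] in one sentence.", image_path="cat.png")
print(render_text(p))                                  # Describe in one sentence.
print(render_text(p, image_ref="<img>cat.png</img>"))  # Describe <img>cat.png</img> in one sentence.
```

Keeping the placeholder in the prompt object and resolving it per generator would let each generator decide how, or whether, the image is referenced in the text.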

@leondz leondz self-assigned this Jan 15, 2025
@leondz leondz linked a pull request Jan 27, 2025 that will close this issue