Add Harmbench integration #45

Open
wants to merge 4 commits into
base: main

@adarshan-intel adarshan-intel commented Mar 7, 2025

Overview:
We could not run Harmbench because of environment issues, and the README commands from their repo did not work here.
From what I could understand, the --completions_path argument points to a file containing the test_case_id and generated_response for each model output.
As per @mariusarvinte's suggestion, this is implemented only for advbench_dataset and is currently disabled for all other types of data input.
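To make the file layout described above concrete, here is a minimal sketch of what such a completions file might look like and how it could be sanity-checked before handing it to an evaluation script. The field names `test_case_id` and `generated_response` are taken from the description above; the list-of-records layout and the `find_incomplete_records` helper are assumptions for illustration and have not been confirmed against the Harmbench repository.

```python
import json

# Assumed field names, taken from the PR description (not verified upstream).
REQUIRED_FIELDS = ("test_case_id", "generated_response")


def find_incomplete_records(records):
    """Return the indices of records that are missing a required field."""
    incomplete = []
    for index, record in enumerate(records):
        if not all(field in record for field in REQUIRED_FIELDS):
            incomplete.append(index)
    return incomplete


# Hypothetical file contents: one record per test case.
example_records = [
    {"test_case_id": 0, "generated_response": "<model output for test case 0>"},
    {"test_case_id": 1, "generated_response": "<model output for test case 1>"},
]

# Serialize the way the file would be written, then check it after a round trip.
serialized = json.dumps(example_records, indent=2)
missing = find_incomplete_records(json.loads(serialized))
```

If `missing` is empty, every record carries both fields. The actual schema should be checked against the Harmbench evaluation script before relying on this.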

Utils.py
Added a JSON helper function that incrementally updates the JSON output file.
Attack.py

  1. Transformed the continuation prompt (tags + user + assistant responses) into the assistant response only, using the LLmart.Transformer class.
  2. Enabled JSON logging only for data=advbench
  3. Indexed the JSON ids by subset id, as suggested by @mariusarvinte
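The incremental JSON update and subset-id indexing described above could be sketched roughly as follows. This is not the code from this PR: the function name `update_json_file`, its signature, and the flat id-to-response layout are hypothetical, and the atomic write via a temporary file is a design choice added here for illustration.

```python
import json
import os
import tempfile


def update_json_file(path, subset_id, response):
    """Insert or overwrite one entry in a JSON file, keyed by the subset id.

    Hypothetical helper: the name and the flat {id: response} layout are
    assumptions, not the implementation in Utils.py.
    """
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except FileNotFoundError:
        data = {}  # first write: start from an empty mapping

    # JSON object keys are always strings, so normalize the id up front.
    data[str(subset_id)] = response

    # Write to a temporary file in the same directory, then swap it in,
    # so an interrupted run does not leave a half-written JSON file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    os.replace(tmp_path, path)
    return data
```

With data.subset=[0,1,2,3], calling this once per sample with the sample's subset id keeps the ids in the file aligned with the dataset indices, even if only part of the subset has finished.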

Input command:
python -m llmart model=llama3-8b-instruct data=advbench_behavior data.subset=[0,1,2,3] loss=model steps=1
Execution Output:
harmbench_output

Note: The JSON output is structured in exactly the format expected by the Harmbench evaluation script.

Fixes #11

Successfully merging this pull request may close these issues.

More friendly output formats