
More friendly output formats #11

Open
dxoigmn opened this issue Jan 27, 2025 · 2 comments · May be fixed by #45

dxoigmn (Contributor) commented Jan 27, 2025

The primary way we inspect outputs is via Tensorboard. However, most other tools cannot consume Tensorboard summaries (protobufs). Your job is to research output formats that can feed into other tools (e.g., Inspect, HarmBench), and add support for these formats.

This is probably best done in conjunction with #12.

ajithraj-intel (Collaborator) commented Mar 5, 2025

@dxoigmn I'd like your help clarifying a few questions.

Q1. Is the intention of this issue to make the output of llmart look more like that of tools such as Inspect and HarmBench, i.e., an interface showing the generated adversarial prompts and their responses from custom models, like this?

Q2. Or is the intention to add support for more trackers like WandB, CometML, Aim, MLflow, ClearML, and DVCLive by modifying the section below?

# Setup Tensorboard
accelerator = Accelerator(
    log_with="tensorboard",
    project_dir=cfg.output_dir,
    dataloader_config=DataLoaderConfiguration(
        split_batches=cfg.data.split_batches
        if cfg.data.split_batches is not None
        else (True if cfg.data.n_train > 1 else False)
    ),
)

Q3. Or is the intention to convert the protobuf output to more universal formats like JSON, HDF5, CSV, etc.?

Q4. Could you please explain the relation of this issue to #12, and how you envision a solution that links and fixes both of them?

dxoigmn (Contributor, Author) commented Mar 5, 2025

> @dxoigmn I'd like your help clarifying a few questions.

> Q1. Is the intention of this issue to make the output of llmart look more like that of tools such as Inspect and HarmBench, i.e., an interface showing the generated adversarial prompts and their responses from custom models, like this?

It would be nice if our tool could output a dataset that is consumable by those tools. HarmBench, for example, has an evaluation script; our tool should output a file that we can then pass as --completions_path. I haven't looked too closely to see if Inspect has an analogous tool.
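For illustration only, a hedged sketch of what writing such a file could look like. The behavior key and the "test_case"/"generation" field names are my assumption about HarmBench's completions schema and should be verified against the HarmBench repo before relying on them:

import json

# Hypothetical schema: one list of records per behavior, each pairing an
# adversarial prompt with the model's response (field names unverified).
completions = {
    "some_behavior_id": [
        {
            "test_case": "<adversarial prompt produced by llmart>",
            "generation": "<model response to that prompt>",
        }
    ]
}

with open("completions.json", "w") as f:
    json.dump(completions, f, indent=2)  # pass this path as --completions_path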

> Q2. Or is the intention to add support for more trackers like WandB, CometML, Aim, MLflow, ClearML, and DVCLive by modifying the section below?

> LLMart/src/llmart/attack.py, lines 43 to 51 in 0a49cd9:
>
> # Setup Tensorboard
> accelerator = Accelerator(
>     log_with="tensorboard",
>     project_dir=cfg.output_dir,
>     dataloader_config=DataLoaderConfiguration(
>         split_batches=cfg.data.split_batches
>         if cfg.data.split_batches is not None
>         else (True if cfg.data.n_train > 1 else False)
>     ),
> )

No, we are happy with Tensorboard for now.

> Q3. Or is the intention to convert the protobuf output to more universal formats like JSON, HDF5, CSV, etc.?

Yes, one option is to write a separate tool that consumes Tensorboard outputs and turns them into something HarmBench's eval script can ingest.
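A minimal sketch of such a converter, assuming the run logs its loss as a scalar and the attack string as a text summary; the tag names ("loss", "attack") and the output schema are hypothetical and would need to match whatever attack.py actually writes:

import json
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

def export_run(logdir: str, out_path: str) -> None:
    # size_guidance=0 keeps all events instead of TensorBoard's default downsampling.
    ea = EventAccumulator(logdir, size_guidance={"scalars": 0, "tensors": 0})
    ea.Reload()  # parse the protobuf event files on disk

    # Scalar events carry (wall_time, step, value).
    losses = {e.step: e.value for e in ea.Scalars("loss")}

    # Text summaries are stored as string tensors.
    attacks = {
        e.step: e.tensor_proto.string_val[0].decode()
        for e in ea.Tensors("attack")
    }

    records = [
        {"step": step, "loss": losses.get(step), "attack": text}
        for step, text in sorted(attacks.items())
    ]
    with open(out_path, "w") as f:
        json.dump(records, f, indent=2)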

> Q4. Could you please explain the relation of this issue to #12, and how you envision a solution that links and fixes both of them?

The relationship is that whatever this new tool outputs should be the attack with the lowest loss, not just the last attack.
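Continuing the hypothetical converter sketched above, selecting the lowest-loss attack instead of the last one is a small change:

# Pick the step whose logged loss is minimal, rather than the final step
# (assumes the loss and attack tags were logged at the same steps).
best_step = min(losses, key=losses.get)
best_attack = attacks[best_step]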

adarshan-intel linked a pull request Mar 7, 2025 that will close this issue.