Temporarily taking down insecure coding benchmarks
Reviewed By: cerkiewny

Differential Revision: D67817270

fbshipit-source-id: ac65e8ab40629c995b414ad58e91dc44232d4e85
SimonWan authored and facebook-github-bot committed Jan 14, 2025
1 parent b110640 commit 8c5a89e
Showing 2 changed files with 6 additions and 2 deletions.
3 changes: 3 additions & 0 deletions CybersecurityBenchmarks/README.md
@@ -199,6 +199,9 @@ FRR benchmark is run in two steps:

## Running Secure Code Generation Benchmarks

Note: The Secure Code Generation Benchmarks are temporarily removed from the default benchmark list while our team identifies the best solution for relative imports, as outlined in [this PR](https://github.com/meta-llama/PurpleLlama/pull/71).
If you need to run these benchmarks, uncomment the relevant code in [run.py](https://github.com/meta-llama/PurpleLlama/blob/main/CybersecurityBenchmarks/benchmark/run.py) and resolve the relative imports yourself (for example, with the approach currently proposed in the PR).

### For Instruct Benchmark

```
5 changes: 3 additions & 2 deletions CybersecurityBenchmarks/benchmark/run.py
@@ -24,7 +24,8 @@
from .canary_exploit_benchmark import CanaryExploitBenchmark

from .frr_benchmark import FRRBenchmark
from .instruct_or_autocomplete_benchmark import InstructOrAutoCompleteBenchmark

# from .instruct_or_autocomplete_benchmark import InstructOrAutoCompleteBenchmark  # Temporarily removed Secure Code Generation Benchmarks, see https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks#running-secure-code-generation-benchmarks
from .interpreter_benchmark import InterpreterBenchmark
from .mitre_benchmark import MitreBenchmark
from .multiturn_phishing_benchmark import MultiturnPhishingBenchmark
@@ -33,7 +34,7 @@

LOG: logging.Logger = logging.getLogger(__name__)

Benchmark.register_benchmark(InstructOrAutoCompleteBenchmark)
# Benchmark.register_benchmark(InstructOrAutoCompleteBenchmark)  # Temporarily removed Secure Code Generation Benchmarks, see https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks#running-secure-code-generation-benchmarks
Benchmark.register_benchmark(MitreBenchmark)
Benchmark.register_benchmark(FRRBenchmark)
Benchmark.register_benchmark(PromptInjectionBenchmark)
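
This commit disables the benchmark by commenting out both its import and its registration. As a point of comparison only, the sketch below shows a guarded registration that skips a benchmark when its module cannot be imported. It is not the approach taken in this commit or in the linked PR: `REGISTRY` and `register_if_importable` are hypothetical names, and the standard-library `collections.OrderedDict` merely stands in for a benchmark class so the example runs on its own.

```python
import importlib
import logging
from typing import Dict

LOG: logging.Logger = logging.getLogger(__name__)

# Hypothetical stand-in for a benchmark registry (not the PurpleLlama API).
REGISTRY: Dict[str, type] = {}


def register_if_importable(module_name: str, class_name: str) -> bool:
    """Register class_name from module_name; skip it if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:  # also covers ModuleNotFoundError
        LOG.warning("Skipping %s: %s", class_name, exc)
        return False
    # Raises AttributeError if the module loads but lacks the class,
    # so a misspelled class name is not silently ignored.
    REGISTRY[class_name] = getattr(module, class_name)
    return True


# An importable module is registered; a missing one is skipped with a warning.
register_if_importable("collections", "OrderedDict")
register_if_importable("module_that_does_not_exist", "MissingBenchmark")
print(sorted(REGISTRY))  # ['OrderedDict']
```

The trade-off is that a guarded import can hide a genuinely broken module behind a log line, whereas commenting the code out, as done here, makes the removal explicit in the source.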
