Commit f6e8884

chore: separated code_auditor and code_auditor_with_judge
1 parent 5756535 commit f6e8884

4 files changed: +90 -16 lines changed

examples/code_auditor/task.yml

-16
@@ -13,8 +13,6 @@ guidance:
   - Don't make assumptions or hypotheticals and only report vulnerabilities that can be confirmed by the source code provided.
   - Prioritize reporting vulnerabilities that can lead to unauthorized access to the application, code execution, or other unauthorized actions.
   - Avoid reporting misconfigurations or other non-vulnerability issues such as improper error handling.
-  - ALWAYS use the judge tool to confirm whether or not a finding is a vulnerability.
-  - NEVER report a finding unless the judge tool confirms it is a vulnerability.
   - Use exclusively the report_findings tool to report your findings.
   - Your task is not complete until you have analyzed ALL the files in EVERY source code subfolder and reported ALL of your findings.
   - Analyze the files in a folder before moving on to the next folder.
@@ -26,20 +24,6 @@ prompt: >
 functions:
   - name: Report
     actions:
-      - name: judge_finding
-        description: Use this tool to ask an external expert to judge if a finding is a vulnerability or not. If the expert confirms it is a vulnerability, use the report_finding tool to report it.
-        example_payload: >
-          {
-            "title": "SQL Injection",
-            "severity": "HIGH",
-            "impact": "An unauthorized attacker could execute arbitrary SQL commands, leading to unauthorized access to the database.",
-            "description": "This is an example finding",
-            "evidence": "This is the evidence for the finding",
-            "file": "/full/path/to/vulnerable_file.py",
-            "proof": "This is the proof of concept for the finding"
-          }
-        judge: judge.yml
-
       - name: report_finding
         description: Use this tool to report your findings.
         example_payload: >
@@ -0,0 +1,7 @@
+Looks for vulnerabilities in the specified code folder. This is a variant of code_auditor that adds a judge AI to confirm vulnerabilities.
+
+### Example Usage
+
+```sh
+nerve -G "openai://gpt-4" -T code_auditor_with_judge -DTARGET_PATH=/path/to/code
+```
+26
@@ -0,0 +1,26 @@
+using:
+  # allows the agent to inspect folders and read files. No write access is provided.
+  - filesystem
+
+system_prompt: >
+  You are an expert application security professional judging if the security finding submitted by a junior auditor is a vulnerability or not.
+
+guidance:
+  - Don't make assumptions or hypotheticals and only confirm vulnerabilities that can be proven by the source code provided.
+  - ALWAYS check the file source and the code to confirm the finding.
+
+prompt: >
+  confirm by using the judge tool whether or not this is a vulnerability: $STDIN
+
+functions:
+  - name: Judge
+    actions:
+      - name: confirm_vulnerability
+        description: Use this tool to confirm whether or not the finding is a vulnerability.
+        complete_task: true
+        tool: echo "VULNERABILITY CONFIRMED"
+
+      - name: not_vulnerability
+        description: Use this tool to judge that the finding is not a vulnerability.
+        complete_task: true
+        tool: echo "NOT A VULNERABILITY, IGNORE"
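
The two judge actions above end the task by echoing one of two fixed verdict strings. As a minimal sketch, a caller could map those strings to a boolean like this. The helper name `is_confirmed` is hypothetical, and how nerve actually passes the judge's output back to the auditor is not shown in this commit, so treat this as an illustration only.

```python
# Verdict strings copied from the `tool:` commands of the two judge actions.
CONFIRMED = "VULNERABILITY CONFIRMED"
REJECTED = "NOT A VULNERABILITY, IGNORE"


def is_confirmed(verdict: str) -> bool:
    """Map a judge verdict string to a boolean (hypothetical helper)."""
    text = verdict.strip()  # `echo` appends a trailing newline
    if text == CONFIRMED:
        return True
    if text == REJECTED:
        return False
    # Anything else is unexpected; fail loudly rather than guess.
    raise ValueError(f"unrecognized judge verdict: {text!r}")
```

Raising on unknown output, instead of treating it as a rejection, keeps a malformed judge reply from silently dropping a finding.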
+57
@@ -0,0 +1,57 @@
+using:
+  # allows the agent to inspect folders and read files. No write access is provided.
+  - filesystem
+  # allows the agent to set the task as complete
+  - task
+
+system_prompt: >
+  You are an expert application security professional.
+  You are given access to a folder with the source code for an application to audit.
+  You are acting as a useful assistant that performs code auditing by reviewing the files in the folder and looking for potential vulnerabilities.
+
+guidance:
+  - Don't make assumptions or hypotheticals and only report vulnerabilities that can be confirmed by the source code provided.
+  - Prioritize reporting vulnerabilities that can lead to unauthorized access to the application, code execution, or other unauthorized actions.
+  - Avoid reporting misconfigurations or other non-vulnerability issues such as improper error handling.
+  - ALWAYS use the judge tool to confirm whether or not a finding is a vulnerability.
+  - NEVER report a finding unless the judge tool confirms it is a vulnerability.
+  - Use exclusively the report_findings tool to report your findings.
+  - Your task is not complete until you have analyzed ALL the files in EVERY source code subfolder and reported ALL of your findings.
+  - Analyze the files in a folder before moving on to the next folder.
+  - Make sure you reported everything you found and once you are done reporting ALL of your findings, set your task as complete.
+
+prompt: >
+  find vulnerabilities in source code in $TARGET_PATH and report your findings.
+
+functions:
+  - name: Report
+    actions:
+      - name: judge_finding
+        description: Use this tool to ask an external expert to judge if a finding is a vulnerability or not. If the expert confirms it is a vulnerability, use the report_finding tool to report it.
+        example_payload: >
+          {
+            "title": "SQL Injection",
+            "severity": "HIGH",
+            "impact": "An unauthorized attacker could execute arbitrary SQL commands, leading to unauthorized access to the database.",
+            "description": "This is an example finding",
+            "evidence": "This is the evidence for the finding",
+            "file": "/full/path/to/vulnerable_file.py",
+            "proof": "This is the proof of concept for the finding"
+          }
+        judge: judge.yml
+
+      - name: report_finding
+        description: Use this tool to report your findings.
+        example_payload: >
+          {
+            "title": "SQL Injection",
+            "severity": "HIGH",
+            "impact": "An unauthorized attacker could execute arbitrary SQL commands, leading to unauthorized access to the database.",
+            "description": "This is an example finding",
+            "evidence": "This is the evidence for the finding",
+            "file": "/full/path/to/vulnerable_file.py",
+            "proof": "This is the proof of concept for the finding"
+          }
+        alias: filesystem.append_to_file
+        define:
+          filesystem.append_to_file.target: findings.jsonl
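
The guidance and the two actions in this task describe a gate: a finding goes to judge_finding first and is appended to findings.jsonl through report_finding only if the judge confirms it. The Python sketch below illustrates that order of operations. The function `report_if_confirmed` and its `judge` callback are hypothetical names, and writing one JSON object per line is an assumption based on the `.jsonl` extension. In nerve the model itself issues the tool calls, so this is not how the tool is implemented.

```python
import json
from pathlib import Path
from typing import Callable


def report_if_confirmed(finding: dict,
                        judge: Callable[[dict], bool],
                        target: Path) -> bool:
    """Append `finding` as one JSON line to `target` only if `judge` confirms it.

    Returns True when the finding was written, False when it was rejected.
    """
    if not judge(finding):
        return False  # rejected findings are never written
    # Append mode mirrors the filesystem.append_to_file alias: earlier
    # findings stay in place and each new one becomes its own line.
    with target.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(finding) + "\n")
    return True
```

A caller would pass a payload shaped like the example_payload above, for instance `report_if_confirmed(finding, my_judge, Path("findings.jsonl"))`, where `my_judge` stands in for whatever produces the verdict.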
