examples/code_auditor/task.yml (-16)
@@ -13,8 +13,6 @@ guidance:
   - Don't make assumptions or hypotheticals and only report vulnerabilities that can be confirmed by the source code provided.
   - Prioritize reporting vulnerabilities that can lead to unauthorized access to the application, code execution, or other unauthorized actions.
   - Avoid reporting misconfigurations or other non-vulnerability issues such as improper error handling.
-  - ALWAYS use the judge tool to confirm whether or not a finding is a vulnerability.
-  - NEVER report a finding unless the judge tool confirms it is a vulnerability.
   - Use exclusively the report_findings tool to report your findings.
   - Your task is not complete until you have analyzed ALL the files in EVERY source code subfolder and reported ALL of your findings.
   - Analyze the files in a folder before moving on to the next folder.
@@ -26,20 +24,6 @@ prompt: >
 functions:
   - name: Report
     actions:
-      - name: judge_finding
-        description: Use this tool to ask an external expert to judge if a finding is a vulnerability or not. If the expert confirms it is a vulnerability, use the report_finding tool to report it.
-        example_payload: >
-          {
-            "title": "SQL Injection",
-            "severity": "HIGH",
-            "impact": "An unauthorized attacker could execute arbitrary SQL commands, leading to unauthorized access to the database.",
-            "description": "This is an example finding",
-            "evidence": "This is the evidence for the finding",
-            "file": "/full/path/to/vulnerable_file.py",
-            "proof": "This is the proof of concept for the finding"
-          }
-        judge: judge.yml
-
       - name: report_finding
         description: Use this tool to report your findings.
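The guidance and the `judge_finding` action removed above describe a gate: a finding is reported only after an external judge confirms it. As a rough illustration of that control flow (not the framework's implementation), here is a minimal Python sketch. The names `report_confirmed`, `judge`, and `report` are hypothetical stand-ins for the two tools and do not come from the repository.

```python
from typing import Callable, Dict, List

# A finding is modeled as a flat mapping, mirroring the example_payload keys.
Finding = Dict[str, str]


def report_confirmed(
    findings: List[Finding],
    judge: Callable[[Finding], bool],
    report: Callable[[Finding], None],
) -> List[Finding]:
    """Report only the findings the judge confirms; return the rejected ones.

    `judge` and `report` are hypothetical callables standing in for the
    judge_finding and report_finding tools.
    """
    rejected: List[Finding] = []
    for finding in findings:
        if judge(finding):
            report(finding)  # confirmed: pass it on to the reporting tool
        else:
            rejected.append(finding)  # not confirmed: never reported
    return rejected
```

With the 16 lines removed, the task file no longer asks for this gate, so every finding goes straight to `report_finding`.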
+  # allows the agent to inspect folders and read files. No write access is provided.
+  - filesystem
+  # allows the agent to set the task as complete
+  - task
+
+system_prompt: >
+  You are an expert application security professional.
+  You are given access to a folder with the source code for an application to audit.
+  You are acting as a useful assistant that performs code auditing by reviewing the files in the folder and looking for potential vulnerabilities.
+
+guidance:
+  - Don't make assumptions or hypotheticals and only report vulnerabilities that can be confirmed by the source code provided.
+  - Prioritize reporting vulnerabilities that can lead to unauthorized access to the application, code execution, or other unauthorized actions.
+  - Avoid reporting misconfigurations or other non-vulnerability issues such as improper error handling.
+  - ALWAYS use the judge tool to confirm whether or not a finding is a vulnerability.
+  - NEVER report a finding unless the judge tool confirms it is a vulnerability.
+  - Use exclusively the report_findings tool to report your findings.
+  - Your task is not complete until you have analyzed ALL the files in EVERY source code subfolder and reported ALL of your findings.
+  - Analyze the files in a folder before moving on to the next folder.
+  - Make sure you reported everything you found and once you are done reporting ALL of your findings, set your task as complete.
+
+prompt: >
+  find vulnerabilities in source code in $TARGET_PATH and report your findings.
+
+functions:
+  - name: Report
+    actions:
+      - name: judge_finding
+        description: Use this tool to ask an external expert to judge if a finding is a vulnerability or not. If the expert confirms it is a vulnerability, use the report_finding tool to report it.
+        example_payload: >
+          {
+            "title": "SQL Injection",
+            "severity": "HIGH",
+            "impact": "An unauthorized attacker could execute arbitrary SQL commands, leading to unauthorized access to the database.",
+            "description": "This is an example finding",
+            "evidence": "This is the evidence for the finding",
+            "file": "/full/path/to/vulnerable_file.py",
+            "proof": "This is the proof of concept for the finding"
+          }
+        judge: judge.yml
+
+      - name: report_finding
+        description: Use this tool to report your findings.
+        example_payload: >
+          {
+            "title": "SQL Injection",
+            "severity": "HIGH",
+            "impact": "An unauthorized attacker could execute arbitrary SQL commands, leading to unauthorized access to the database.",
+            "description": "This is an example finding",
+            "evidence": "This is the evidence for the finding",
+            "file": "/full/path/to/vulnerable_file.py",
+            "proof": "This is the proof of concept for the finding"
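Both `judge_finding` and `report_finding` use the same seven keys in their `example_payload`. As a hedged illustration of how a consumer might check a payload against that shape, here is a small Python sketch. The helper name `missing_fields` and the idea that all seven keys are mandatory are assumptions made for the example. The task file only shows them as an example payload.

```python
import json
from typing import List

# Keys taken from the example_payload in the task file. Treating all of
# them as required is an assumption made for this sketch.
REQUIRED_FIELDS = (
    "title",
    "severity",
    "impact",
    "description",
    "evidence",
    "file",
    "proof",
)


def missing_fields(payload_text: str) -> List[str]:
    """Return the expected keys that are absent from a JSON finding payload."""
    payload = json.loads(payload_text)
    if not isinstance(payload, dict):
        raise ValueError("finding payload must be a JSON object")
    return [name for name in REQUIRED_FIELDS if name not in payload]
```

For instance, `missing_fields('{"title": "SQL Injection", "severity": "HIGH"}')` lists the five keys that the payload still lacks, in the order they appear in the example.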