Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added semgrep and some fixes #74

Merged
merged 2 commits into from
Feb 24, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 65 additions & 0 deletions .github/workflows/semgrep.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
name: Semgrep SAST Scan

on:
pull_request:

jobs:
semgrep:
# User definable name of this GitHub Actions job.
name: semgrep/ci
# If you are self-hosting, change the following `runs-on` value:
runs-on: ubuntu-latest
container:
# A Docker image with Semgrep installed. Do not change this.
image: returntocorp/semgrep
# Skip any PR created by dependabot to avoid permission issues:
if: (github.actor != 'dependabot[bot]')
permissions:
# required for all workflows
security-events: write
# only required for workflows in private repositories
actions: read
contents: read

steps:
# Fetch project source with GitHub Actions Checkout.
- name: Checkout repository
uses: actions/checkout@v4

- name: Perform Semgrep Analysis
# @NOTE: This is the actual semgrep command to scan your code.
# Modify the --config option to 'r/all' to scan using all rules,
# or use multiple flags to specify particular rules, such as
# --config r/all --config custom/rules
run: semgrep scan -q --sarif --config auto --config "p/secrets" . > semgrep-results.sarif

- name: Pretty-Print SARIF Output
run: |
jq . semgrep-results.sarif > formatted-semgrep-results.sarif || echo "{}"
echo "Formatted SARIF Output (First 20 lines):"
head -n 20 formatted-semgrep-results.sarif || echo "{}"

- name: Validate JSON Output
run: |
if ! jq empty formatted-semgrep-results.sarif > /dev/null 2>&1; then
echo "⚠️ Semgrep output is not valid JSON. Skipping annotations."
exit 0
fi

- name: Add PR Annotations for Semgrep Findings
run: |
total_issues=$(jq '.runs[0].results | length' formatted-semgrep-results.sarif)
if [[ "$total_issues" -eq 0 ]]; then
echo "✅ No Semgrep issues found!"
exit 0
fi

jq -c '.runs[0].results[]' formatted-semgrep-results.sarif | while IFS= read -r issue; do
file=$(echo "$issue" | jq -r '.locations[0].physicalLocation.artifactLocation.uri')
line=$(echo "$issue" | jq -r '.locations[0].physicalLocation.region.startLine')
message=$(echo "$issue" | jq -r '.message.text')

if [[ -n "$file" && -n "$line" && -n "$message" ]]; then
echo "::error file=$file,line=$line,title=Semgrep Issue::${message}"
fi
done
4 changes: 2 additions & 2 deletions modelconverter/packages/hailo/exporter.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,8 @@

@contextmanager
def _replace_module(original, substitute):
original_module = importlib.import_module(original)
substitute_module = importlib.import_module(substitute)
original_module = importlib.import_module(original) # nosemgrep

Check failure on line 24 in modelconverter/packages/hailo/exporter.py

View workflow job for this annotation

GitHub Actions / semgrep/ci

Semgrep Issue

Untrusted user input in `importlib.import_module()` function allows an attacker to load arbitrary code. Avoid dynamic values in `importlib.import_module()` or use a whitelist to prevent running untrusted code.
substitute_module = importlib.import_module(substitute) # nosemgrep

Check failure on line 25 in modelconverter/packages/hailo/exporter.py

View workflow job for this annotation

GitHub Actions / semgrep/ci

Semgrep Issue

Untrusted user input in `importlib.import_module()` function allows an attacker to load arbitrary code. Avoid dynamic values in `importlib.import_module()` or use a whitelist to prevent running untrusted code.

sys.modules[original] = substitute_module
try:
Expand Down
15 changes: 13 additions & 2 deletions modelconverter/packages/multistage_exporter.py
Original file line number Diff line number Diff line change
Expand Up @@ -110,10 +110,21 @@
for out_name in output_dirs
}

# TODO: safe exec
local_scope = {}
safe_globals = {"__builtins__": {}}

try:
exec( # nosemgrep

Check failure on line 117 in modelconverter/packages/multistage_exporter.py

View workflow job for this annotation

GitHub Actions / semgrep/ci

Semgrep Issue

Detected the use of exec(). exec() can be dangerous if used to evaluate dynamic content. If this content can be input from outside the program, this may be a code injection vulnerability. Ensure evaluated content is not definable by external sources.
script, safe_globals, local_scope
)
except Exception as e:
raise RuntimeError(f"Error executing script: {e}")

if "run_script" not in local_scope:
raise RuntimeError(
"Error: `run_script` function not found in script."
)

exec(script, globals(), local_scope)
run_script = local_scope["run_script"]
arr = run_script(outputs)
np.save(dest / f"{i}.npy", arr)
Expand Down
17 changes: 15 additions & 2 deletions modelconverter/utils/nn_archive.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,8 +47,21 @@ def process_nn_archive(
elif tarfile.is_tarfile(path):
if untar_path.suffix == ".tar":
untar_path = MISC_DIR / untar_path.stem
with tarfile.open(path) as tar:
tar.extractall(untar_path)

def safe_members(tar):
"""Filter members to prevent path traversal attacks."""
safe_files = []
for member in tar.getmembers():
# Normalize path and ensure it's within the extraction folder
if not member.name.startswith("/") and ".." not in member.name:
safe_files.append(member)
else:
logger.warning(f"Skipping unsafe file: {member.name}")
return safe_files

tf = tarfile.open(path, mode="r")
tf.extractall(untar_path, members=safe_members(tf))

else:
raise RuntimeError(f"Unknown NN Archive path: `{path}`")

Expand Down
Loading