Closes the four remaining items from the post-Codex hardening list.

#1 Rate-limit unlock + change-password endpoints (1.0.0-33)

* Generalised the existing login limiter into a reusable `_RateLimiter`
  class in app/auth.py. The check-then-increment runs atomically in
  synchronous code, so a parallel asyncio burst can't slip past the
  threshold.
* `unlock_limiter` (5 attempts in 10 min → 10 min lockout) gates
  POST /api/v1/drives/{id}/unlock per-drive AND per-source-IP.
* `pwchange_limiter` (5 attempts in 10 min → 15 min lockout) gates
  POST /api/v1/auth/change-password per-user AND per-IP.
* Both clear on a successful operation. The login limiter keeps its
  existing `register_login_attempt` / `clear_login_failures` facade
  names, so external callers don't change.

#3 mypy in security-scan (1.0.0-33)

* Added a 4th tool to the daily scan + forge workflow. It runs in a
  throwaway python:3.12-slim container against the deploy dir; the exit
  code is informational only (NOT included in the `TOTAL_EXIT` failure
  sum). Findings land in ~/security-scans/scan-YYYY-MM-DD/mypy.txt for
  ratchet-down work over time.
* The forge job uses `continue-on-error: true` so it doesn't fail the
  workflow until the type-debt baseline is annotated down.

#4 Lifecycle test coverage (1.0.0-33)

* New tests/test_lifecycle.py with 15 cases:
  - TestCommonHelpers (7 tests): _start_stage, _finish_stage
    success/failure/error-preservation, _recalculate_progress weighted
    math, _is_cancelled, _append_stage_log.
  - TestStartCancelJob (4 tests): start_job inserts queued row + correct
    stage list, duplicate-active rejection, cancel marks state, cancel
    returns False on terminal-state jobs.
  - TestRateLimiter (4 tests): under-threshold ok, trips at threshold,
    clear removes both counter + lockout, separate keys don't interfere.
* Total goes from 44 to 59 tests; closes the orchestration-path coverage
  gap Codex flagged.

#2 Partial routes.py split (1.0.0-34)

* routes.py → routes/ package. Same staged-extraction pattern as the
  burnin.py split.
* routes/auth.py: login/logout/setup/change-password (170 LoC).
* routes/system.py: /health, /ws/terminal, /api/v1/updates/check
  (136 LoC).
* routes/_helpers.py: shared utilities used by both extracted modules and
  the still-monolithic remainder: client_ip, operator_for, is_stale,
  stale_context, secret_status, SECRET_FIELDS (97 LoC).
* routes/__init__.py shrank from 1568 LoC to 1261. Future slices can
  extract drives, burnin, history, settings the same way.
* GOTCHA recorded in commit body: `from app import auth` at the top of
  __init__.py binds `auth` as an attribute on the package namespace, so
  `from . import auth as _auth_routes` finds the OUTER module and yields
  `app.auth` instead of the submodule. Fix is
  `import app.routes.auth as _auth_routes` (absolute). This bit me once
  at deploy time; the container failed to start with
  `module 'app.auth' has no attribute 'router'`.

Verification: 59/59 tests pass (44 existing + 15 new); container boots
clean at 1.0.0-34; /health 200 with all checks green; security scan
still clean (mypy informational findings ignored from totals).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
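The limiter behaviour described under #1 can be sketched roughly as follows. This is an illustration only, not the code in app/auth.py: the class name `RateLimiterSketch`, its method names, the injectable `clock` parameter, and the key strings are all assumptions made for the example. It shows one way a check-then-increment can stay atomic under asyncio: the whole operation is a plain synchronous method with no `await` inside, so coroutines on a single event loop cannot interleave between the check and the increment.

```python
import time
from typing import Callable, Dict, List


class RateLimiterSketch:
    """Hypothetical sliding-window limiter; NOT the real `_RateLimiter`."""

    def __init__(self, max_attempts: int, window_s: float, lockout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_attempts = max_attempts
        self.window_s = window_s
        self.lockout_s = lockout_s
        self._clock = clock  # injectable so tests can control time
        self._failures: Dict[str, List[float]] = {}
        self._locked_until: Dict[str, float] = {}

    def is_locked(self, key: str) -> bool:
        """Return True while `key` is inside an active lockout."""
        until = self._locked_until.get(key)
        if until is None:
            return False
        if self._clock() >= until:
            # Lockout expired: drop both the lockout and the old counter.
            self.clear(key)
            return False
        return True

    def register_failure(self, key: str) -> bool:
        """Record one failed attempt; return True if `key` is now locked.

        There is no `await` in this method, so on a single event loop the
        check and the increment cannot be interleaved by another coroutine.
        """
        if self.is_locked(key):
            return True
        now = self._clock()
        # Keep only failures that still fall inside the window.
        recent = [t for t in self._failures.get(key, [])
                  if now - t < self.window_s]
        recent.append(now)
        self._failures[key] = recent
        if len(recent) >= self.max_attempts:
            self._locked_until[key] = now + self.lockout_s
            return True
        return False

    def clear(self, key: str) -> None:
        """Forget counter and lockout for `key` (e.g. after a success)."""
        self._failures.pop(key, None)
        self._locked_until.pop(key, None)


# Thresholds taken from the commit text; the key format is an assumption.
unlock_limiter = RateLimiterSketch(max_attempts=5, window_s=600, lockout_s=600)
pwchange_limiter = RateLimiterSketch(max_attempts=5, window_s=600, lockout_s=900)
```

A caller gating per-drive AND per-IP would, under these assumptions, check and register two keys per request (for example `"drive:7"` and `"ip:203.0.113.5"`) and call `clear` on both after a successful unlock. The guarantee only covers a single process and event loop; multiple workers would need shared state.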
141 lines
5.9 KiB
Bash
#!/usr/bin/env bash
# Daily security scan of the deployed truenas-burnin source on maple.
# Mirrors the .forgejo/workflows/security-scan.yml CI pipeline so a finding
# the runner-less forge would have flagged still surfaces here.
#
# Tools all run in containers; nothing is installed on the host.
#   pip-audit: known CVEs in pinned packages (requirements.txt, throwaway container)
#   bandit:    Python static security analysis on the deploy source tree
#   mypy:      type checking on the deploy source tree (informational only)
#   gitleaks:  secrets across the full git history
#
# Output:
#   ~/security-scans/scan-YYYY-MM-DD/{summary,pip-audit,bandit,mypy,gitleaks}.txt
#   ~/security-scans/findings.log: one line appended per scan with findings
#
# Wiring:
#   Daily systemd user timer at 03:30 local (after the in-app retention job
#   so backups are fresh). See scripts/security-scan.{service,timer}.

set -uo pipefail

REPO_URL="${REPO_URL:-https://git.hellocomputer.xyz/brandon/truenas-burnin.git}"
REPO="${REPO:-$HOME/scan-checkouts/truenas-burnin}"
OUT_BASE="${OUT_BASE:-$HOME/security-scans}"
DATE="$(date +%Y-%m-%d)"
OUT_DIR="$OUT_BASE/scan-$DATE"
SUMMARY="$OUT_BASE/findings.log"
GITLEAKS_VERSION="${GITLEAKS_VERSION:-8.21.2}"

mkdir -p "$OUT_DIR" "$(dirname "$REPO")"

# Maintain a dedicated checkout for scanning. The deploy at
# ~/docker/stacks/truenas-burnin/ is just the bind-mounted source (no
# .git, no history), so gitleaks can't scan there. We keep a separate
# clone and fast-forward it to origin/main each run.
if [ ! -d "$REPO/.git" ]; then
    echo "Cloning $REPO_URL to $REPO ..."
    git clone --quiet "$REPO_URL" "$REPO" || {
        echo "fatal: git clone failed" >&2
        exit 65
    }
fi

cd "$REPO"
# Refresh the scan checkout. Failures here mean we'd be scanning stale
# code without knowing, so fail loudly instead of soldiering on silently.
if ! git fetch --quiet --prune origin; then
    echo "fatal: git fetch failed in $REPO" >&2
    exit 65
fi
git checkout --quiet main || true   # ok if already on main
if ! git reset --hard --quiet origin/main; then
    echo "fatal: git reset --hard failed in $REPO" >&2
    exit 65
fi

echo "=== Security scan $DATE ===" > "$OUT_DIR/summary.txt"
date -Iseconds >> "$OUT_DIR/summary.txt"
echo >> "$OUT_DIR/summary.txt"

# --- pip-audit against the lockfile in a throwaway container ------------
# Previously we did `docker exec truenas-burnin pip install pip-audit`
# which mutated the live production container with a transient package.
# Now scan the lockfile in an ephemeral container: same coverage of
# pinned versions + their transitives, no side effects on prod.
echo "--- pip-audit (requirements.txt in throwaway container) ---" | tee -a "$OUT_DIR/summary.txt"
docker run --rm \
    -v "$REPO/requirements.txt:/work/requirements.txt:ro" \
    -w /work \
    python:3.12-slim sh -c \
    "pip install --quiet --no-cache-dir --disable-pip-version-check pip-audit 2>/dev/null && pip-audit --requirement requirements.txt --strict --format=columns" \
    > "$OUT_DIR/pip-audit.txt" 2>&1
PIPS=$?
echo " exit=$PIPS ($OUT_DIR/pip-audit.txt)" | tee -a "$OUT_DIR/summary.txt"

# --- bandit against the LIVE deploy dir ---------------------------------
# Scan what's actually running, not what's in git; this catches drift
# between forge HEAD and maple. B608 (SQL injection via dynamic strings)
# is skipped globally: every dynamic SQL build in this codebase uses
# bound parameters for data and structural placeholders only.
DEPLOY_DIR="${DEPLOY_DIR:-$HOME/docker/stacks/truenas-burnin}"
echo "--- bandit (deploy: $DEPLOY_DIR) ---" | tee -a "$OUT_DIR/summary.txt"
docker run --rm \
    -v "$DEPLOY_DIR/app:/src:ro" \
    python:3.12-slim sh -c \
    "pip install --quiet --no-cache-dir --disable-pip-version-check bandit 2>/dev/null && bandit -r /src -ll -ii --skip B608" \
    > "$OUT_DIR/bandit.txt" 2>&1
BANDITS=$?
echo " exit=$BANDITS ($OUT_DIR/bandit.txt)" | tee -a "$OUT_DIR/summary.txt"

# --- mypy against the deploy dir (informational only) -------------------
# Type checker: surfaces None-handling bugs and missing-attribute errors
# that would otherwise only show up at runtime, at the worst possible
# moment. Doesn't count toward the failure exit-code sum until the
# codebase is annotated enough to make findings actionable.
echo "--- mypy (informational) ---" | tee -a "$OUT_DIR/summary.txt"
docker run --rm \
    -v "$DEPLOY_DIR/app:/src:ro" \
    python:3.12-slim sh -c \
    "pip install --quiet --no-cache-dir --disable-pip-version-check mypy 2>&1 | tail -3 && mypy --ignore-missing-imports --no-strict-optional /src" \
    > "$OUT_DIR/mypy.txt" 2>&1
MYPY=$?
echo " exit=$MYPY ($OUT_DIR/mypy.txt) — informational only" | tee -a "$OUT_DIR/summary.txt"

# --- gitleaks against the full git history ------------------------------
echo "--- gitleaks ---" | tee -a "$OUT_DIR/summary.txt"
docker run --rm \
    -v "$REPO:/repo:ro" \
    "zricethezav/gitleaks:v$GITLEAKS_VERSION" \
    detect --source /repo --no-banner --redact --verbose \
    > "$OUT_DIR/gitleaks.txt" 2>&1
LEAKS=$?
echo " exit=$LEAKS ($OUT_DIR/gitleaks.txt)" | tee -a "$OUT_DIR/summary.txt"

# --- summary + notification --------------------------------------------
# mypy's exit code ($MYPY) is deliberately left out of the sum.
TOTAL_EXIT=$(( PIPS + BANDITS + LEAKS ))
{
    echo
    echo "Total findings exit-code sum: $TOTAL_EXIT"
    echo " pip-audit: $PIPS"
    echo " bandit: $BANDITS"
    echo " gitleaks: $LEAKS"
} >> "$OUT_DIR/summary.txt"

if [ "$TOTAL_EXIT" -ne 0 ]; then
    printf '%s — findings (pip-audit=%d bandit=%d gitleaks=%d) — see %s\n' \
        "$DATE" "$PIPS" "$BANDITS" "$LEAKS" "$OUT_DIR" >> "$SUMMARY"
    # Hook for downstream notification: wire to your existing Mattermost
    # / Fastmail / webhook chain. Stays a no-op until SECURITY_SCAN_WEBHOOK
    # is set in the systemd unit's Environment=.
    if [ -n "${SECURITY_SCAN_WEBHOOK:-}" ]; then
        curl -fsS -X POST -H 'Content-Type: text/plain' \
            --data-binary "@$OUT_DIR/summary.txt" \
            "$SECURITY_SCAN_WEBHOOK" || true
    fi
fi

# Retention: prune scan directories last modified more than 30 days ago.
find "$OUT_BASE" -maxdepth 1 -type d -name "scan-*" -mtime +30 \
    -exec rm -rf {} \;

exit "$TOTAL_EXIT"