nas-burnin/app
Brandon Walter 066fbbc403
fix: address Codex audit findings (1.0.0-28)
Addresses 12 of 13 findings from the Codex tech-debt + security review
of versions 1.0.0-22 through 1.0.0-27. Item #5 (live pool re-check
before start_job) is deferred because it would add an SSH round-trip
per start.

#1  Pool detection now treats zpool / lsblk / findmnt failures
    INDEPENDENTLY. Previously a single None blew away the whole map,
    so a host where lsblk lacks zfs_member info but zpool works would
    never lock pool members. Extended findmnt parser to recognise
    /dev/mapper/*, /dev/dm-*, /dev/md*, /dev/da*, /dev/ada* (LVM,
    devicemapper, MD RAID, FreeBSD CORE devnames).
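
A minimal sketch of the "independent failure" idea in #1, assuming each probe (zpool, lsblk, findmnt) is a callable that returns a device-to-owner dict, or None on failure. The function name `merge_pool_maps` and the dict shape are hypothetical, not taken from the codebase.

```python
from typing import Callable, Dict, Optional

# Assumption: each probe returns {device: owner} or None if the command failed.
Probe = Callable[[], Optional[Dict[str, str]]]


def merge_pool_maps(*probes: Probe) -> Dict[str, str]:
    """Merge results from several probes, skipping any probe that failed."""
    merged: Dict[str, str] = {}
    for probe in probes:
        result = probe()
        if result is None:
            # One failed probe no longer discards what the others found.
            continue
        for device, owner in result.items():
            # Earlier probes win when two probes report the same device.
            merged.setdefault(device, owner)
    return merged
```

With this shape, a host where one probe returns None still gets the entries reported by the remaining probes.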

#2  Admin role enforced on every settings mutation. New
    auth.require_admin() helper applied to GET /settings,
    POST /api/v1/settings, /test-smtp, /test-ssh. Previously any
    authenticated user (the CLI explicitly supports non-admin
    accounts) could rewrite SMTP/SSH/API secrets.
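
A rough sketch of the kind of role check #2 describes. The real signature of `auth.require_admin()` is not shown here, so the `user` argument, its `role` attribute, and the `Forbidden` exception are all assumptions made for illustration.

```python
from types import SimpleNamespace


class Forbidden(Exception):
    """Raised when a non-admin attempts an admin-only action (hypothetical)."""


def require_admin(user) -> None:
    # Assumption: the authenticated user object exposes a `role` attribute.
    if user is None or getattr(user, "role", None) != "admin":
        raise Forbidden("admin role required")


# Usage: call the check at the top of every settings handler.
require_admin(SimpleNamespace(username="root", role="admin"))  # passes silently
```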

#3  First-user setup race closed. auth.create_user() now accepts
    bootstrap_only=True which wraps the existence check + insert in
    BEGIN IMMEDIATE so two concurrent /api/v1/auth/setup requests
    can't both create admin accounts during the bootstrap window.
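
The BEGIN IMMEDIATE pattern from #3 could look roughly like the sketch below. The `users` table layout and the function name `create_first_admin` are assumptions; the connection is expected to be opened with `isolation_level=None` so the transaction statements are issued manually.

```python
import sqlite3


def create_first_admin(conn: sqlite3.Connection, username: str, password_hash: str) -> bool:
    """Insert the first user as admin only if the users table is still empty.

    Returns True if the row was created, False if a user already existed.
    """
    # BEGIN IMMEDIATE takes the write lock up front, so a second concurrent
    # caller blocks here instead of also seeing an empty table.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
        if count:
            conn.execute("ROLLBACK")
            return False
        conn.execute(
            "INSERT INTO users (username, password_hash, role) VALUES (?, ?, 'admin')",
            (username, password_hash),
        )
        conn.execute("COMMIT")
        return True
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```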

#4  Case-insensitive uniqueness enforced via new
    `uniq_users_username_nocase` index. Login does NOCASE lookup so
    without this `Admin` and `admin` could coexist as distinct rows.
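
For illustration, a case-insensitive unique index in SQLite can be declared as below. Only the index name comes from this change; the `users` table and its columns are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the real users table has more columns.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
conn.execute(
    "CREATE UNIQUE INDEX uniq_users_username_nocase "
    "ON users (username COLLATE NOCASE)"
)

conn.execute("INSERT INTO users (username) VALUES ('admin')")
try:
    # Differs only by case, so the NOCASE index should reject it.
    conn.execute("INSERT INTO users (username) VALUES ('Admin')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```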

#6  New `session_cookie_secure` setting (default False for LAN/dev
    deploys, set True in production behind HTTPS) flips the session
    cookie's Secure flag. When set, browsers only send the cookie over
    HTTPS, which prevents on-the-wire exposure if the dashboard is
    also reachable over plain HTTP.
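
A small standard-library sketch of how a boolean setting can drive the Secure attribute. The cookie name `session`, the other attributes, and the helper name `build_session_cookie` are assumptions; the app's web framework likely sets the cookie through its own API.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token: str, secure: bool) -> str:
    """Return the Set-Cookie value for a session token."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["httponly"] = True
    cookie["session"]["samesite"] = "Lax"
    cookie["session"]["path"] = "/"
    if secure:
        # Mirrors session_cookie_secure=True: browsers send it over HTTPS only.
        cookie["session"]["secure"] = True
    return cookie["session"].OutputString()
```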

#7  Audit trail bound to authenticated identity. Burn-in start /
    cancel / unlock / drive reset all now use `_operator_for(request)`
    which reads `request.state.current_user.full_name|username`
    instead of the body's operator field. Logged-in users can no
    longer spoof attribution. Drive reset's literal-"operator"
    fallback (window._operator was never set) is also fixed by this.
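
As a sketch of #7, the operator name can be derived from the authenticated user on the request rather than from the request body. The helper below is a simplified stand-in for `_operator_for(request)`; raising on a missing user is an assumption, and the request is modelled with `SimpleNamespace` so the example runs without a web framework.

```python
from types import SimpleNamespace


def operator_for(request) -> str:
    """Return the display name of the authenticated user for audit records."""
    user = getattr(request.state, "current_user", None)
    if user is None:
        # Assumption: unauthenticated requests never reach this point.
        raise PermissionError("no authenticated user on request")
    # Prefer full_name, fall back to username when it is empty or missing.
    return getattr(user, "full_name", None) or user.username


request = SimpleNamespace(
    state=SimpleNamespace(
        current_user=SimpleNamespace(full_name="Alice Example", username="alice")
    )
)
operator = operator_for(request)  # any "operator" field in the body is ignored
```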

#8  Login rate-limit race fixed. New `register_login_attempt()` is an
    atomic check-AND-increment in synchronous code (no awaits inside),
    so a parallel burst can't slip past the threshold.
    `record_login_failure()` removed; `clear_login_failures()` now
    also drops any active lockout on successful auth. Pre-existing
    bug where `tripped` was always False (so user_login_locked_out
    audit events never fired) also fixed.
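
The check-and-increment idea in #8 might be sketched as follows. Signatures, the threshold, the window, and the in-memory store are all assumptions. The key point is that the function body contains no `await`, so under a single-threaded asyncio event loop no other request can interleave between the check and the increment.

```python
import time
from typing import Dict, List, Optional

MAX_ATTEMPTS = 5        # assumed threshold
WINDOW_SECONDS = 300.0  # assumed sliding window

_attempts: Dict[str, List[float]] = {}


def register_login_attempt(key: str, now: Optional[float] = None) -> bool:
    """Record one attempt for `key`; return True if the limit is now exceeded."""
    now = time.monotonic() if now is None else now
    recent = [t for t in _attempts.get(key, []) if now - t < WINDOW_SECONDS]
    recent.append(now)          # increment ...
    _attempts[key] = recent
    return len(recent) > MAX_ATTEMPTS  # ... and check, in one synchronous step


def clear_login_failures(key: str) -> None:
    """Forget all recorded attempts (and thus any lockout) for `key`."""
    _attempts.pop(key, None)
```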

#9  NVMe surface_validate post-format check now mirrors the SSH path:
    fails on FAILED health AND on real SMART attribute failures,
    soft-passes SSH-only failures (logged), surfaces warnings to the
    stage log without failing.

#10 retention.backup_db() now writes to `.tmp` then atomic-renames
    into the canonical daily slot — an interrupted backup leaves the
    tmp behind but doesn't corrupt the real snapshot. Scheduler marks
    last_run_date only on (prune AND backup) success so a transient
    failure gets retried within the 03:00 hour.
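
A hedged sketch of the write-to-tmp-then-rename pattern from #10. It uses `sqlite3.Connection.backup` to produce the copy, which is an assumption about how the snapshot is taken; the signature of `backup_db` here is also illustrative.

```python
import os
import sqlite3
from pathlib import Path


def backup_db(src_path: str, dest_path: str) -> None:
    """Snapshot src_path into dest_path without exposing a partial file."""
    dest = Path(dest_path)
    tmp = dest.with_name(dest.name + ".tmp")
    tmp.unlink(missing_ok=True)  # discard leftovers from an interrupted run

    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(str(tmp))
    try:
        src.backup(dst)  # copy every page into the tmp file
    finally:
        dst.close()
        src.close()

    # os.replace is atomic on the same filesystem: readers see either the
    # previous snapshot or the complete new one, never a half-written file.
    os.replace(tmp, dest)
```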

#11 /health DB probe now exercises the WRITE path via a temp-table
    INSERT/SELECT/COMMIT round-trip. Previously only read PRAGMA
    journal_mode + a row count, which silently passes on read-only
    mounts and broken-WAL conditions.
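
One way to sketch a write-path probe like the one in #11. This version uses a small scratch table in the main schema, because SQLite stores `CREATE TEMP TABLE` tables in a separate temp database rather than in the main file; whether the real probe does the same is not stated here. The table name `_health_probe` and the function name are hypothetical.

```python
import sqlite3


def db_write_probe(conn: sqlite3.Connection) -> bool:
    """Return True only if an INSERT/SELECT/COMMIT round-trip succeeds."""
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS _health_probe (v INTEGER)")
        conn.execute("DELETE FROM _health_probe")
        conn.execute("INSERT INTO _health_probe (v) VALUES (1)")
        rows = conn.execute("SELECT v FROM _health_probe").fetchall()
        conn.execute("DELETE FROM _health_probe")
        conn.commit()
        return rows == [(1,)]
    except sqlite3.Error:
        # Read-only mounts and journal/WAL problems surface here.
        conn.rollback()
        return False
```

A read-only connection (for example one opened with `mode=ro`) fails at the first write statement, so the probe reports False where a read-only check would have passed.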

#12 security-scan.sh now fails loudly if `git fetch` or
    `git reset --hard origin/main` errors (was `|| true`, scanning
    stale code silently). pip-audit now runs in a throwaway
    python:3.12-slim container against requirements.txt instead of
    `docker exec`-ing into the live truenas-burnin container —
    cleaner separation, no transient package install on prod.

#13 Badblocks SSH stage no longer doubles its log_text. Previously
    appended every 20-line chunk during streaming AND the full
    accumulated output at end. Now only flushes the un-flushed tail
    (typically <20 lines). `result["output"]` stays in-memory only.
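
The flush logic in #13 can be illustrated with the sketch below: lines are appended to the log in chunks during streaming, and only the un-flushed tail is appended at the end. The names `stream_log` and `append`, and the default chunk size of 20, are assumptions based on the description above.

```python
from typing import Callable, Iterable, List


def stream_log(
    lines: Iterable[str],
    append: Callable[[str], None],
    chunk_size: int = 20,
) -> List[str]:
    """Append lines to a log in chunks; return the full output for in-memory use."""
    all_lines: List[str] = []
    flushed = 0  # number of lines already written via append()
    for line in lines:
        all_lines.append(line)
        if len(all_lines) - flushed >= chunk_size:
            append("\n".join(all_lines[flushed:]) + "\n")
            flushed = len(all_lines)
    # Flush only what has not been written yet, never the whole output again.
    if flushed < len(all_lines):
        append("\n".join(all_lines[flushed:]) + "\n")
    return all_lines
```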

Verification: all 44 unit tests pass in container; /health 200;
security scan returns 0 findings; deployed maple build is green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 18:48:16 -04:00
static feat: secret handling — status badges + redacted endpoint + rotation audit (1.0.0-26) 2026-05-02 18:15:57 -04:00
templates feat: secret handling — status badges + redacted endpoint + rotation audit (1.0.0-26) 2026-05-02 18:15:57 -04:00
__init__.py Initial commit — TrueNAS Burn-In Dashboard v0.5.0 2026-02-24 00:08:29 -05:00
auth.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
auth_cli.py feat: app-level login + hardening sweep (1.0.0-22 -> 1.0.0-23) 2026-05-02 11:08:29 -04:00
burnin.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
config.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
database.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
logging_config.py Initial commit — TrueNAS Burn-In Dashboard v0.5.0 2026-02-24 00:08:29 -05:00
mailer.py feat: app-level login + hardening sweep (1.0.0-22 -> 1.0.0-23) 2026-05-02 11:08:29 -04:00
main.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
models.py feat: pool-membership lock + cancellation hardening + smart_health refresh + tunables (1.0.0-13 -> 1.0.0-21) 2026-05-02 09:25:56 -04:00
notifier.py Stage 7: SSH architecture, SMART attribute monitoring, drive reset, and polish 2026-02-24 08:09:30 -05:00
poller.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
renderer.py Stage 7: SSH architecture, SMART attribute monitoring, drive reset, and polish 2026-02-24 08:09:30 -05:00
retention.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
routes.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
settings_store.py feat: pool-membership lock + cancellation hardening + smart_health refresh + tunables (1.0.0-13 -> 1.0.0-21) 2026-05-02 09:25:56 -04:00
ssh_client.py fix: address Codex audit findings (1.0.0-28) 2026-05-02 18:48:16 -04:00
terminal.py chore: re-sync deployed work that pre-dates this session 2026-05-02 09:24:42 -04:00
truenas.py chore: re-sync deployed work that pre-dates this session 2026-05-02 09:24:42 -04:00