Commit graph

3 commits

Author SHA1 Message Date
Brandon Walter
eb2a964171 fix: address Codex review of burnin package split (1.0.0-32)
Some checks are pending
Security scan / pip-audit (push) Waiting to run
Security scan / bandit (push) Waiting to run
Security scan / gitleaks (push) Waiting to run
Three LOW-severity findings from Codex's audit of the post-split
package, all small mechanical cleanups:

#1 routes.py:848 read burnin.UNLOCK_TTL_SECONDS — a snapshot alias
   bound at import time. After a test (or runtime) monkey-patches
   app.burnin.unlock.UNLOCK_TTL_SECONDS the API response would
   advertise the OLD value while grant_pool_unlock used the new one.
   Now reads burnin.unlock.UNLOCK_TTL_SECONDS directly so the API
   stays in sync with whatever the actual source-of-truth is.

#2 _stage_surface_validate_ssh() carried dead extraction scaffolding
   from when the badblocks logic was first inlined into burnin.py:
   _is_cancelled_sync (sync wrapper that does run_until_complete in
   a coroutine — would deadlock if ever called), last_logged_pct,
   on_progress, accumulated_lines, on_progress_async — none on any
   control-flow path. Plus result["output"] which was set but never
   read. All deleted; the inline _drain coroutines below already
   handle progress/log throttling correctly.

#3 The new module boundaries were leaking — root orchestration
   mutated _remote_pids and _unlock_grants directly even though
   kill.clear_remote_pid() and unlock.invalidate_grant() existed.
   Now using the helpers, so a future change to the storage shape
   only requires editing the owning module.

Bonus from Codex's check note: _get_client() now asserts
burnin._client is not None with a clear message instead of relying
on an obscure NoneType AttributeError if a stage is somehow called
before init().

Verified: 44/44 tests pass; container boots clean; /health 200.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:35:07 -04:00
Brandon Walter
19c2c0dc0f refactor: extract _common.py + stages.py from burnin (1.0.0-31)
Continues the staged burnin.py module split started in 1.0.0-30.
Two more clean extractions; orchestration (init, _run_job,
start_job, cancel_job, check_stuck_jobs, semaphore) intentionally
stays in __init__.py for now to avoid threading the TrueNASClient
through cross-module setters.

* app/burnin/_common.py — shared helpers with no upward deps:
  STAGE_ORDER + _STAGE_BASE_WEIGHTS + POLL_INTERVAL constants;
  _now / _db connection helper; _is_cancelled, _start_stage,
  _finish_stage, _cancel_stage, _set_stage_error, _update_stage_*,
  _append_stage_log, _store_smart_*, _recalculate_progress; SSE
  _push_update. Imports nothing from sibling burnin modules.

* app/burnin/stages.py — every per-stage implementation moved
  verbatim: _stage_precheck, _stage_smart_test +
  _stage_smart_test_api / _ssh, _stage_surface_validate +
  _surface_validate_nvme / _ssh / _truenas, _stage_timed_simulate,
  _stage_final_check, plus _badblocks_available, _nvme_cli_available,
  and _dispatch_stage. Pulls the shared helpers from _common,
  remote-PID setters from kill, and the live TrueNASClient via a
  lazy `_get_client()` helper that defers `from app import burnin`
  until call time so we don't trip a circular import.

* __init__.py shrank from ~1480 LoC to ~600. Re-exports every
  public name (start_job, cancel_job, init, check_stuck_jobs,
  PoolMemberError, UNLOCK_TTL_SECONDS, etc.) so external callers
  in routes.py / mailer.py / poller.py see the same surface.

State that didn't move: _semaphore, _client, _active_tasks remain
on the package root (with a runtime _client reference from routes.py
preserved). _run_job and start_job still live in __init__.py — full
task.py extraction would require giving stages access to _client
through a setter rather than the lazy lookup, deferred to a future
slice.

Verification: 44/44 unit tests pass in container; /health 200;
container boots clean. No public API change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:18:04 -04:00
Brandon Walter
9cbae44495 refactor: split burnin.py into a package — extract unlock + kill (1.0.0-30)
Some checks are pending
Security scan / pip-audit (push) Waiting to run
Security scan / bandit (push) Waiting to run
Security scan / gitleaks (push) Waiting to run
First slice of the planned tech-debt cleanup. burnin.py was 1667 lines
and growing; staged extraction gives smaller diffs to review and a
clear bisect target if anything regresses.

Mechanical move only — no behaviour change. The two extracted modules:

* app/burnin/unlock.py — _UnlockGrant, _unlock_grants, PoolMemberError,
  is_unlocked / unlock_expiry / grant_pool_unlock, plus the four
  *_TOKEN constants and UNLOCK_TTL_SECONDS. Owns its module-level
  state; opens its own DB connection in grant_pool_unlock so it
  doesn't depend on the parent package's _db() helper.

* app/burnin/kill.py — _remote_pids dict and the kill_remote_process /
  set_remote_pid / clear_remote_pid / get_remote_pid helpers. Pulled
  out of __init__.py so the asyncssh-ignores-signals workaround lives
  next to the state it operates on.

app/burnin/__init__.py re-exports every public symbol the rest of the
app imports — `from app import burnin; burnin.start_job(...)`,
`burnin.PoolMemberError`, `burnin.UNLOCK_TTL_SECONDS`, etc. all keep
working unchanged. Internal aliases `_remote_pids` and `_unlock_grants`
on the package root point at the SAME dict objects in the submodules,
so existing in-package mutations (set in stages, cleared in cleanup
callbacks) work without rewrite.

Test fix: tests/test_unlock_flow.py:test_expired_grant_returns_false
monkey-patches UNLOCK_TTL_SECONDS. The package-root alias is bound at
import time and won't propagate back to the submodule's read site, so
the test now patches `app.burnin.unlock.UNLOCK_TTL_SECONDS` directly.

Verification: 44/44 unit tests pass in container; /health 200;
container boots clean. routes.py, mailer.py, poller.py untouched —
the public API is identical.

Future: extract stages, task, _common in subsequent versions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:44:28 -04:00