truenas-burnin/CLAUDE.md
Brandon Walter fc33c0d11e docs: update CLAUDE.md for Stage 7; bump version to 1.0.0-7
Documents all Stage 7 features: SSH burn-in architecture, SMART attr
monitoring, drive reset, version badge, stats polish, new env vars,
new API routes, and real-TrueNAS cutover steps.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 08:13:21 -05:00


TrueNAS Burn-In Dashboard — Project Context

Drop this file in any new Claude session to resume work with full context. Last updated: 2026-02-24 (Stage 7)


What This Is

A self-hosted web dashboard for running and tracking hard-drive burn-in tests against a TrueNAS CORE instance. Deployed on maple.local (10.0.0.138).

Stages completed

| Stage | Description |
|-------|-------------|
| 1 | Mock TrueNAS CORE v2.0 API (15 drives, sda–sdo) |
| 2 | Backend core (FastAPI, SQLite/WAL, poller, TrueNAS client) |
| 3 | Dashboard UI (Jinja2, SSE live updates, dark theme) |
| 4 | Burn-in orchestrator (queue, concurrency, start/cancel) |
| 5 | History page, job detail page, CSV export |
| 6 | Hardening (retries, JSON logging, IP allowlist, poller watchdog) |
| 6b | UX overhaul (stats bar, alerts, batch, notifications, location, print, analytics) |
| 6c | Settings overhaul (editable form, runtime store, SMTP fix, stage selection) |
| 6d | Cancel SMART tests, Cancel All burn-ins, drag-to-reorder stages in modals |
| 7 | SSH burn-in execution, SMART attr monitoring, drive reset, version badge, stats polish |

File Map

truenas-burnin/
├── docker-compose.yml          # two services: mock-truenas + app
├── Dockerfile                  # app container
├── requirements.txt
├── .env.example
├── data/                       # SQLite DB lives here (gitignored, created on deploy)
│
├── mock-truenas/
│   ├── Dockerfile
│   └── app.py                  # FastAPI mock of TrueNAS CORE v2.0 REST API
│
└── app/
    ├── __init__.py
    ├── config.py               # pydantic-settings; reads .env
    ├── database.py             # schema, migrations, init_db(), get_db()
    ├── models.py               # Pydantic v2 models; StartBurninRequest has run_surface/run_short/run_long + profile property
    ├── settings_store.py       # runtime settings store — persists to /data/settings_overrides.json
    ├── ssh_client.py           # asyncssh client: smartctl parsing, badblocks streaming, test_connection
    ├── truenas.py              # httpx async client with retry (lambda factory pattern)
    ├── poller.py               # poll loop, SSE pub/sub, stale detection, stuck-job check
    ├── burnin.py               # orchestrator, semaphore, stages, check_stuck_jobs()
    ├── notifier.py             # webhook + immediate email alerts on job completion
    ├── mailer.py               # daily HTML email + per-job alert email
    ├── logging_config.py       # structured JSON logging
    ├── renderer.py             # Jinja2 + filters (format_bytes, format_eta, format_elapsed, …)
    ├── routes.py               # all FastAPI route handlers
    ├── main.py                 # app factory, IP allowlist middleware, lifespan
    │
    ├── static/
    │   ├── app.css             # full dark theme + mobile responsive
    │   └── app.js              # push notifications, batch, elapsed timers, inline edit
    │
    └── templates/
        ├── layout.html         # header nav: History, Stats, Audit, Settings, bell button
        ├── dashboard.html      # stats bar, failed banner, batch bar
        ├── history.html
        ├── job_detail.html     # + Print/Export button
        ├── audit.html          # audit event log
        ├── stats.html          # analytics: pass rate by model, daily activity, duration by size, failures by stage
        ├── settings.html       # editable 2-col form: SMTP + SSH (left) + Notifications/Behavior/Webhook/System (right)
        ├── job_print.html      # print view with client-side QR code (qrcodejs CDN)
        └── components/
            ├── drives_table.html   # checkboxes, elapsed time, location inline edit
            ├── modal_start.html    # single-drive burn-in modal
            └── modal_batch.html    # batch burn-in modal

Architecture Overview

Browser  ──HTMX SSE──▶  GET /sse/drives
                              │
                         poller.subscribe()
                              │
                         asyncio.Queue  ◀─── poller.run() notifies after each poll
                              │                    & after each burnin stage update
                         render drives_table.html
                         yield SSE "drives-update" event
  • Poller (poller.py): runs every POLL_INTERVAL_SECONDS (default 12s), calls TrueNAS /api/v2.0/disk and /api/v2.0/core/get_jobs, writes to SQLite, notifies SSE subscribers
  • Burn-in (burnin.py): asyncio.Semaphore(max_parallel_burnins) gates concurrency. Jobs are created immediately (queued state), semaphore gates actual execution. On startup, any interrupted running jobs → state=unknown; queued jobs are re-enqueued.
  • SSE (routes.py /sse/drives): one persistent connection per browser tab. Renders fresh drives_table.html HTML fragment on every notification.
  • HTMX (dashboard.html): hx-ext="sse" + sse-swap="drives-update" replaces #drives-tbody content without page reload.
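The poller-to-SSE fan-out can be sketched as a minimal asyncio pub/sub (illustrative names; the real poller.py also prunes dead subscribers on disconnect):

```python
import asyncio

class PubSub:
    """Minimal fan-out: each SSE connection gets its own queue."""
    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.discard(q)

    def notify(self, payload: str) -> None:
        # Called by the poller after each poll and each burn-in stage update.
        for q in self._subscribers:
            q.put_nowait(payload)

async def demo() -> str:
    bus = PubSub()
    q = bus.subscribe()
    bus.notify("drives-update")
    return await q.get()

print(asyncio.run(demo()))  # drives-update
```

Each SSE generator awaits its own queue, so a slow browser tab never blocks the poller or the other tabs.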

Database Schema (SQLite WAL mode)

-- drives: upsert by truenas_disk_id (the TrueNAS internal disk identifier)
drives (id, truenas_disk_id UNIQUE, devname, serial, model, size_bytes,
        temperature_c, smart_health, last_polled_at)

-- smart_tests: one row per drive+test_type combination (UNIQUE constraint)
smart_tests (id, drive_id FK, test_type CHECK('short','long'),
             state, percent, started_at, eta_at, finished_at, error_text,
             UNIQUE(drive_id, test_type))

-- burnin_jobs: one row per burn-in run (multiple per drive over time)
burnin_jobs (id, drive_id FK, profile, state CHECK(queued/running/passed/
             failed/cancelled/unknown), percent, stage_name, operator,
             created_at, started_at, finished_at, error_text)

-- burnin_stages: one row per stage per job
burnin_stages (id, burnin_job_id FK, stage_name, state, percent,
               started_at, finished_at, error_text,
               log_text TEXT,        -- raw smartctl/badblocks SSH output
               bad_blocks INTEGER)   -- bad sector count from surface_validate

-- audit_events: append-only log
audit_events (id, event_type, drive_id, job_id, operator, note, created_at)

-- drives columns added by migrations:
--   location TEXT, notes TEXT (Stage 6b)
--   smart_attrs TEXT            -- JSON blob of last SMART attribute snapshot (Stage 7)

-- smart_tests columns added by migrations:
--   raw_output TEXT             -- raw smartctl -a output (Stage 7)
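The migration-added columns above can be applied idempotently with a column-existence check. An illustrative sketch (not the project's actual database.py migration code):

```python
import sqlite3

def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, decl: str) -> None:
    """Idempotent ALTER TABLE: safe to run on every startup."""
    cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in cols:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drives (id INTEGER PRIMARY KEY)")
add_column_if_missing(conn, "drives", "smart_attrs", "TEXT")
add_column_if_missing(conn, "drives", "smart_attrs", "TEXT")  # second call is a no-op
print([row[1] for row in conn.execute("PRAGMA table_info(drives)")])  # ['id', 'smart_attrs']
```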

Burn-In Stage Definitions

```python
STAGE_ORDER = {
    "quick": ["precheck", "short_smart", "io_validate",  "final_check"],
    "full":  ["precheck", "surface_validate", "short_smart", "long_smart", "final_check"],
}
```

The UI originally exposed only the full profile (destructive); since Stage 6c, individual stages are selectable in the start modals. The quick profile exists for dev/testing only.


TrueNAS API Contracts Used

| Method | Endpoint | Notes |
|--------|----------|-------|
| GET | /api/v2.0/disk | List all disks |
| POST | /api/v2.0/smart/test | Start SMART test `{disks:[name], type:"SHORT"\|"LONG"}` |
| GET | /api/v2.0/core/get_jobs | Filter `[["method","=","smart.test"]]` |
| POST | /api/v2.0/core/job_abort | job_id positional arg |
| GET | /api/v2.0/smart/test/results/{disk} | Per-disk SMART results |

Auth: Authorization: Bearer {TRUENAS_API_KEY} header.


Config / Environment Variables

All read from .env via pydantic-settings. See .env.example for full list.

| Variable | Default | Notes |
|----------|---------|-------|
| APP_HOST | 0.0.0.0 | |
| APP_PORT | 8080 | |
| DB_PATH | /data/app.db | Inside container |
| TRUENAS_BASE_URL | http://localhost:8000 | Point at mock or real TrueNAS |
| TRUENAS_API_KEY | mock-key | Real API key for prod |
| TRUENAS_VERIFY_TLS | false | Set true for prod with valid cert |
| POLL_INTERVAL_SECONDS | 12 | |
| STALE_THRESHOLD_SECONDS | 45 | UI shows warning if data older than this |
| MAX_PARALLEL_BURNINS | 2 | asyncio.Semaphore limit |
| SURFACE_VALIDATE_SECONDS | 45 | Mock only — duration of surface stage |
| IO_VALIDATE_SECONDS | 25 | Mock only — duration of I/O stage |
| STUCK_JOB_HOURS | 24 | Hours before a running job is auto-marked unknown |
| LOG_LEVEL | INFO | |
| ALLOWED_IPS | (empty) | Empty = allow all. Comma-sep IPs/CIDRs |
| SMTP_HOST | (empty) | Empty = email disabled |
| SMTP_PORT | 587 | |
| SMTP_USER | (empty) | |
| SMTP_PASSWORD | (empty) | |
| SMTP_FROM | (empty) | |
| SMTP_TO | (empty) | Comma-separated |
| SMTP_REPORT_HOUR | 8 | Local hour (0-23) to send daily report |
| SMTP_ALERT_ON_FAIL | true | Immediate email when a job fails |
| SMTP_ALERT_ON_PASS | false | Immediate email when a job passes |
| WEBHOOK_URL | (empty) | POST JSON on burnin_passed/burnin_failed. Works with ntfy, Slack, Discord, n8n |
| TEMP_WARN_C | 46 | Temperature warning threshold (°C) |
| TEMP_CRIT_C | 55 | Temperature critical threshold — precheck fails above this |
| BAD_BLOCK_THRESHOLD | 0 | Max bad blocks allowed before surface_validate fails (0 = any bad = fail) |
| APP_VERSION | 1.0.0-7 | Displayed in header version badge |
| SSH_HOST | (empty) | TrueNAS SSH hostname/IP — empty disables SSH mode (uses mock/REST) |
| SSH_PORT | 22 | TrueNAS SSH port |
| SSH_USER | root | TrueNAS SSH username |
| SSH_PASSWORD | (empty) | TrueNAS SSH password (use key instead for production) |
| SSH_KEY | (empty) | TrueNAS SSH private key PEM string — loaded in-memory, never written to disk |

Deploy Workflow

First deploy (already done)

```shell
# On maple.local
cd ~/docker/stacks/truenas-burnin
docker compose up -d --build
```

Redeploy after code changes

```shell
# Copy changed files from mac to maple.local first, e.g.:
scp -P 2225 -r app/ brandon@10.0.0.138:~/docker/stacks/truenas-burnin/

# Then on maple.local:
ssh brandon@10.0.0.138 -p 2225
cd ~/docker/stacks/truenas-burnin
docker compose up -d --build
```

Reset the database (e.g. after schema changes)

```shell
# On maple.local — stop containers first
docker compose stop app
# Delete DB using alpine (container owns the file, sudo not available)
docker run --rm -v ~/docker/stacks/truenas-burnin/data:/data alpine rm -f /data/app.db
docker compose start app
```

Check logs

```shell
docker compose logs -f app
docker compose logs -f mock-truenas
```

Mock TrueNAS Server (mock-truenas/app.py)

  • 15 drives: sda–sdo
  • Drive mix: 3× ST12000NM0008 12TB, 3× WD80EFAX 8TB, 2× ST16000NM001G 16TB, 2× ST4000VN008 4TB, 2× TOSHIBA MG06ACA10TE 10TB, 1× HGST HUS728T8TAL5200 8TB, 1× Seagate Barracuda ST6000DM003 6TB, 1× FAIL001 (sdn) — always fails at ~30%
  • SHORT test: 90s simulated; LONG test: 480s simulated; tick every 5s
  • Debug endpoints:
    • POST /debug/reset — reset all jobs/state
    • GET /debug/state — dump current state
    • POST /debug/complete-all-jobs — instantly complete all running tests

Key Implementation Patterns

Retry pattern — lambda factory (NOT coroutine object)

```python
# CORRECT: pass a factory so each retry creates a fresh coroutine
r = await _with_retry(lambda: self._client.get("/api/v2.0/disk"), "get_disks")

# WRONG: coroutine is exhausted after first await, retry silently fails
r = await _with_retry(self._client.get("/api/v2.0/disk"), "get_disks")
```

SSE template rendering

```python
# Use templates.env.get_template().render() — not TemplateResponse (that's a Response object)
html = templates.env.get_template("components/drives_table.html").render(drives=drives)
yield {"event": "drives-update", "data": html}
```

Sticky thead scroll fix

```css
/* BOTH axes required on table-wrap for position:sticky to work on thead */
.table-wrap {
  overflow: auto;           /* NOT overflow-x: auto */
  max-height: calc(100vh - 130px);
}
thead { position: sticky; top: 0; z-index: 10; }
```

export.csv route ordering

```python
# MUST register export.csv BEFORE /{job_id} — FastAPI tries int() on "export.csv"
@router.get("/api/v1/burnin/export.csv")   # first
async def burnin_export_csv(...): ...

@router.get("/api/v1/burnin/{job_id}")     # second
async def burnin_get(job_id: int, ...): ...
```

Known Issues / Past Bugs Fixed

| Bug | Root Cause | Fix |
|-----|------------|-----|
| _execute_stages used STAGE_ORDER[profile], ignoring custom order | Stage order stored in DB but not read back | _run_job reads stages from burnin_stages ORDER BY id; _execute_stages accepts stages: list[str] |
| Poller stuck at 'running' after completion | _sync_history() had early-return guard when state=running | Removed guard — _sync_history only called when job not in active dict |
| DB schema tables missing after edit | Tables split into separate variable never passed to executescript() | Put all tables in single SCHEMA string |
| Retry not retrying | _with_retry(coro) — coroutine exhausted after first fail | Changed to _with_retry(factory: Callable[[], Coroutine]) |
| error_text overwritten | _finish_stage(success=False) overwrote error set by stage handler | _finish_stage omits error_text column in SQL when param is None |
| Cancelled stage showed 'failed' | _execute_stages called _finish_stage(success=False) on cancel | Check _is_cancelled(), call _cancel_stage() instead |
| export.csv returns 422 | Route registered after /{job_id}, FastAPI tries int("export.csv") | Move export route before parameterized route |
| Old drive names persist after mock rename | Poller upserts by truenas_disk_id, old rows stay | Delete app.db and restart |
| First row clipped behind sticky thead | overflow-x: auto only creates partial stacking context | Use overflow: auto (both axes) on .table-wrap |
| rm data/app.db permission denied | Container owns the file | Use docker run --rm -v .../data:/data alpine rm -f /data/app.db |
| First row clipped after Stage 6b | Stats bar added 70px but max-height not updated | max-height: calc(100vh - 205px) |
| SMTP "Connection unexpectedly closed" | _send_email used settings.smtp_port (587 default) even in SSL mode | Derive port from mode via _MODE_PORTS dict; SSL→465, STARTTLS→587, Plain→25 |
| SSL mode missing EHLO | smtplib.SMTP_SSL was created without calling ehlo() | Added server.ehlo() after both SSL and STARTTLS connections |
| profile NameError in _execute_stages | _execute_stages called _recalculate_progress(job_id, profile) but profile not in scope | Changed to _recalculate_progress(job_id) — profile param was unused |
| app_version Jinja2 global rendered as function | templates.env.globals["app_version"] was set to _get_app_version (a callable) | Set to the static string value directly: = _settings.app_version |

Feature Reference (Stage 7)

SSH Burn-In Architecture

ssh_client.py provides an optional SSH execution layer. When SSH_HOST is set (and key or password is present), all burn-in stages run real commands over SSH against TrueNAS. When SSH_HOST is empty, stages fall back to mock/REST simulation.

Dual-mode dispatch — each stage checks ssh_client.is_configured():

```python
if ssh_client.is_configured():
    ...  # run smartctl / badblocks over SSH
else:
    ...  # simulate with REST API or timed sleep (mock mode)
```

SSH client capabilities (ssh_client.py):

  • test_connection() → {"ok": bool, "error": str} — used by Test SSH button
  • get_smart_attributes(devname) → parse smartctl -a, return {health, raw_output, attributes, warnings, failures}
  • start_smart_test(devname, test_type) → smartctl -t short|long /dev/{devname}
  • poll_smart_progress(devname) → smartctl -a during test; returns {state, percent_remaining, output}
  • abort_smart_test(devname) → smartctl -X /dev/{devname}
  • run_badblocks(devname, on_progress, cancelled_fn) → streams badblocks -wsv -b 4096 -p 1; counts bad sectors from stdout (digit-only lines)

Key auth pattern — key is stored as PEM string in settings, never written to disk:

```python
asyncssh.connect(host, ..., client_keys=[asyncssh.import_private_key(pem_str)], known_hosts=None)
```

badblocks streaming — uses asyncssh.create_process() with parallel stdout/stderr draining via asyncio.gather. Progress updates written to DB every 20 lines to avoid excessive writes.
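The bad-sector counting from the streamed output can be sketched as a pure function (illustrative; `count_bad_blocks` is a hypothetical name, and the real run_badblocks counts incrementally while draining the stream):

```python
def count_bad_blocks(lines: list[str]) -> int:
    """badblocks prints each bad block number on its own line; progress and
    status lines always contain non-digit text, so filter on digit-only lines."""
    return sum(1 for line in lines if line.strip().isdigit())

sample = [
    "Testing with pattern 0xaa: done",
    "Reading and comparing:  12.5% done, 0:42 elapsed",
    "1048576",
    "1048577",
]
print(count_bad_blocks(sample))  # 2
```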

SMART Attribute Monitoring

Monitored attributes and their thresholds:

| ID | Name | Any non-zero → |
|----|------|----------------|
| 5 | Reallocated_Sector_Ct | FAIL |
| 10 | Spin_Retry_Count | WARN |
| 188 | Command_Timeout | WARN |
| 197 | Current_Pending_Sector | FAIL |
| 198 | Offline_Uncorrectable | FAIL |
| 199 | UDMA_CRC_Error_Count | WARN |

SMART attrs stored as JSON blob in drives.smart_attrs. Updated by final_check stage (SSH mode) or short_smart/long_smart REST mode. Displayed in drive drawer with colour-coded table + raw smartctl -a output.
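The threshold table above reduces to a small classifier. A sketch (function name is illustrative, not the project's actual helper):

```python
_FAIL_IDS = {5, 197, 198}   # reallocated / pending / offline-uncorrectable
_WARN_IDS = {10, 188, 199}  # spin retry / command timeout / UDMA CRC

def classify_attrs(attrs: dict[int, int]) -> str:
    """Return 'fail', 'warn', or 'ok' given {attr_id: raw_value}.
    Any non-zero raw value in a FAIL attribute trumps warnings."""
    if any(attrs.get(i, 0) > 0 for i in _FAIL_IDS):
        return "fail"
    if any(attrs.get(i, 0) > 0 for i in _WARN_IDS):
        return "warn"
    return "ok"

print(classify_attrs({5: 0, 199: 3}))  # warn
```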

Drive Reset Action

  • POST /api/v1/drives/{drive_id}/reset — clears smart_tests rows to idle, clears drives.smart_attrs, writes audit event, notifies SSE subscribers
  • Button appears in action column when can_reset = drive has no active burn-in AND has any non-idle smart state or smart attrs
  • Burn-in history (burnin_jobs, burnin_stages) is preserved — reset only affects SMART test state

New Routes (Stage 7)

| Method | Path | Description |
|--------|------|-------------|
| POST | /api/v1/drives/{id}/reset | Reset SMART state and attrs for a drive |
| POST | /api/v1/settings/test-ssh | Test SSH connection with current SSH settings |
| GET | /api/v1/updates/check | Check for latest release from Forgejo git.hellocomputer.xyz |

Check for Updates

Settings page has a "Check for Updates" button that fetches:

GET https://git.hellocomputer.xyz/api/v1/repos/brandon/truenas-burnin/releases/latest

Compares tag name against settings.app_version; shows "up to date" or "v{tag} available".

Version Badge

app_version set as Jinja2 global in renderer.py:

```python
templates.env.globals["app_version"] = _settings.app_version
```

Displayed in header as <span class="header-version">v{app_version}</span> (right side, muted).

Configurable Thresholds

renderer.py _temp_class now reads from settings instead of hardcoded values:

```python
if temp >= settings.temp_crit_c:  return "temp-crit"
if temp >= settings.temp_warn_c:  return "temp-warn"
```

precheck stage fails if temperature_c >= settings.temp_crit_c.

Surface validate fails if bad_blocks > settings.bad_block_threshold (default 0 = any bad sector = fail).

Cutting to Real TrueNAS (Next Steps)

When ready to test against a real TrueNAS CORE box:

  1. In Settings (or .env), set:
    • TrueNAS URL → https://10.0.0.X (real IP)
    • API Key → real API key
    • SSH Host → same IP as TrueNAS
    • SSH User → root (or sudoer with smartctl/badblocks access)
    • SSH Key → paste PEM key into textarea
  2. Click Test SSH Connection to verify before starting a burn-in
  3. TrueNAS CORE uses ada0, da0 device names (not sda). Mock drive names will differ.
  4. Delete app.db before first real poll to clear mock drive rows
  5. Comment out mock-truenas service in docker-compose.yml (optional — harmless to leave)
  6. Verify TrueNAS CORE v2.0 REST API:
    • GET /api/v2.0/disk returns list with name, serial, model, size, temperature
    • GET /api/v2.0/core/get_jobs with filter [["method","=","smart.test"]]
    • POST /api/v2.0/smart/test accepts {disks: [devname], type: "SHORT"|"LONG"}

Feature Reference (Stage 6b)

New Pages

| URL | Description |
|-----|-------------|
| /stats | Analytics — pass rate by model, daily activity last 14 days |
| /audit | Audit log — last 200 events with drive/operator context |
| /settings | Editable 2-col settings form (SMTP, Notifications, Behavior, Webhook) |
| /history/{id}/print | Print-friendly job report with QR code |

New API Routes (6b + 6c)

| Method | Path | Description |
|--------|------|-------------|
| PATCH | /api/v1/drives/{id} | Update notes and/or location |
| POST | /api/v1/settings | Save runtime settings to /data/settings_overrides.json |
| POST | /api/v1/settings/test-smtp | Test SMTP connection without sending email |

Notifications

  • Browser push: Bell icon in header → Notification.requestPermission(). Fires on job-alert SSE event (burnin pass/fail).
  • SSE alert event: job-alert event type on /sse/drives. JS listens via htmx:sseMessage.
  • Immediate email: send_job_alert() in mailer.py. Triggered by notifier.notify_job_complete() from burnin.py.
  • Webhook: notifier._send_webhook() — POST JSON to WEBHOOK_URL. Payload includes event, job_id, devname, serial, model, state, operator, error_text.
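An example webhook payload built from the field list above (all values illustrative):

```python
import json

# Illustrative burnin_failed payload — field set per notifier._send_webhook above;
# the concrete values here are made up for demonstration.
payload = {
    "event": "burnin_failed",
    "job_id": 42,
    "devname": "sdn",
    "serial": "FAIL001",
    "model": "EXAMPLE-MODEL",
    "state": "failed",
    "operator": "brandon",
    "error_text": "surface_validate: bad blocks over threshold",
}
print(json.dumps(payload, indent=2))
```

ntfy, Slack, Discord, and n8n each accept an arbitrary JSON POST body, which is why one payload shape serves all four.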

Stuck Job Detection

  • burnin.check_stuck_jobs() runs every 5 poll cycles (~1 min)
  • Jobs running longer than STUCK_JOB_HOURS (default 24h) → state=unknown
  • Logged at CRITICAL level; audit event written
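The stuck-job sweep reduces to one UPDATE. A sketch (epoch-second timestamps for brevity — an assumption; the real check_stuck_jobs() also logs and writes an audit event):

```python
import sqlite3, time

STUCK_JOB_HOURS = 24  # from settings

def check_stuck_jobs(conn: sqlite3.Connection, now: float) -> int:
    """Mark burn-in jobs running longer than STUCK_JOB_HOURS as state='unknown'.
    Returns the number of jobs flipped."""
    cutoff = now - STUCK_JOB_HOURS * 3600
    cur = conn.execute(
        "UPDATE burnin_jobs SET state='unknown' WHERE state='running' AND started_at < ?",
        (cutoff,),
    )
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE burnin_jobs (id INTEGER PRIMARY KEY, state TEXT, started_at REAL)")
now = time.time()
conn.execute("INSERT INTO burnin_jobs (state, started_at) VALUES ('running', ?)", (now - 25 * 3600,))
conn.execute("INSERT INTO burnin_jobs (state, started_at) VALUES ('running', ?)", (now - 1 * 3600,))
print(check_stuck_jobs(conn, now))  # 1
```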

Batch Burn-In

  • Checkboxes on each idle/selectable drive row
  • Batch bar appears in filter row when any drives selected
  • Uses existing POST /api/v1/burnin/start with multiple drive_ids
  • Requires operator name + explicit confirmation checkbox (no serial required)
  • JS checkedDriveIds Set persists across SSE swaps via restoreCheckboxes()

Drive Location

  • location and notes fields added to drives table via ALTER TABLE migration
  • Inline click-to-edit on location field in drive name cell
  • Saves via PATCH /api/v1/drives/{id} on blur/Enter; restores on Escape

Feature Reference (Stage 6c)

Settings Page

  • Two-column layout: SMTP card (left, wider) + Notifications / Behavior / Webhook stacked (right)
  • Read-only system card at bottom (TrueNAS URL, poll interval, etc.) — restart required badge
  • All changes save instantly via POST /api/v1/settings → settings_store.save() → /data/settings_overrides.json
  • Overrides loaded on startup in main.py lifespan via settings_store.init()
  • Connection mode dropdown auto-sets port: STARTTLS→587, SSL/TLS→465, Plain→25
  • Test Connection button at top of SMTP card — tests live settings without sending email
  • Brand logo in header is now a clickable <a href="/"> home link

SMTP Port Derivation

```python
# mailer.py — port is derived from mode, NOT from settings.smtp_port
_MODE_PORTS = {"starttls": 587, "ssl": 465, "plain": 25}
port = _MODE_PORTS.get(mode, 587)
```

Never use settings.smtp_port in mailer — it's kept in config for .env backward compat only.

Burn-In Stage Selection

StartBurninRequest no longer takes profile: str. Instead takes:

  • run_surface: bool = True — surface validate (destructive write test)
  • run_short: bool = True — Short SMART (non-destructive)
  • run_long: bool = True — Long SMART (non-destructive)

Profile string is computed as a property. Profiles: full, surface_short, surface_long, surface, short_long, short, long. Precheck and final_check always run.

STAGE_ORDER in burnin.py has all 7 profile combinations.

_recalculate_progress() uses _STAGE_BASE_WEIGHTS dict (per-stage weights) and computes overall % dynamically from actual burnin_stages rows — no profile lookup needed.
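The profile names map cleanly onto the three booleans. A sketch of the computed-profile logic (a plain function here; the real version is a property on the Pydantic model, which presumably also rejects the all-False case):

```python
def profile(run_surface: bool, run_short: bool, run_long: bool) -> str:
    """Compute the profile string from the three stage checkboxes."""
    parts = []
    if run_surface:
        parts.append("surface")
    if run_short:
        parts.append("short")
    if run_long:
        parts.append("long")
    if len(parts) == 3:
        return "full"
    return "_".join(parts)  # e.g. surface_short, short_long, long

print(profile(True, True, True))    # full
print(profile(False, True, False))  # short
```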

In the UI, both single-drive and batch modals show 3 checkboxes. If surface is unchecked:

  • Destructive warning is hidden
  • Serial confirmation field is hidden (single modal)
  • Confirmation checkbox is hidden (batch modal)

Table Scroll Fix

```css
.table-wrap {
  max-height: calc(100vh - 205px); /* header(44) + main-pad(20) + stats-bar(70) + filter-bar(46) + buffer */
}
```

If stats bar or other content height changes, update this offset.

Feature Reference (Stage 6d)

Cancel Functionality

| What | How |
|------|-----|
| Cancel running Short SMART | ✕ Short button appears in action col when short_busy; calls POST /api/v1/drives/{id}/smart/cancel with {type:"short"} |
| Cancel running Long SMART | ✕ Long button appears when long_busy; same route with {type:"long"} |
| Cancel individual burn-in | ✕ Burn-In button (was "Cancel") shown when bi_active; calls POST /api/v1/burnin/{id}/cancel |
| Cancel All Running | Red ✕ Cancel All Burn-Ins button appears in filter bar when any burn-in jobs are active; JS collects all .btn-cancel[data-job-id] and cancels each |

SMART cancel route (POST /api/v1/drives/{drive_id}/smart/cancel):

  1. Fetches all running TrueNAS jobs via client.get_smart_jobs()
  2. Finds job where arguments[0].disks contains the drive's devname
  3. Calls client.abort_job(tn_job_id)
  4. Updates smart_tests table row to state='aborted'

Stage Reordering

  • Default order changed to: Short SMART → Long SMART → Surface Validate (non-destructive first)
  • Drag handles (⠿) on each stage row in both single and batch modals
  • HTML5 drag-and-drop, no external library
  • getStageOrder(listId) reads current DOM order of checked stages
  • stage_order: ["short_smart","long_smart","surface_validate"] sent in API body
  • StartBurninRequest.stage_order: list[str] | None — validated against allowed stage names
  • burnin.start_job() accepts stage_order param; builds: ["precheck"] + stage_order + ["final_check"]
  • _run_job() reads stage names back from burnin_stages ORDER BY id — so custom order is honoured
  • Destructive warning / serial confirmation still triggered by stage-surface checkbox ID (order-independent)

NPM / DNS Setup

  • Proxy host: burnin.hellocomputer.xyz → http://10.0.0.138:8080
  • Authelia protection: recommended (no built-in auth in app)
  • DNS: burnin.hellocomputer.xyz CNAME → sandon.hellocomputer.xyz (proxied: false)