Initial commit — TrueNAS Burn-In Dashboard v0.5.0
Full-stack burn-in orchestration dashboard (Stages 1–6d complete): FastAPI backend, SQLite/WAL, SSE live dashboard, mock TrueNAS server, SMTP/webhook notifications, batch burn-in, settings UI, audit log, stats page, cancel SMART/burn-in, drag-to-reorder stages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
commit b73b5251ae
36 changed files with 8636 additions and 0 deletions
.env.example (new file, 39 lines)
@@ -0,0 +1,39 @@
APP_HOST=0.0.0.0
APP_PORT=8084
DB_PATH=/data/app.db

# Point at mock-truenas for dev, real TrueNAS for production
TRUENAS_BASE_URL=http://mock-truenas:8000
TRUENAS_API_KEY=your-api-key-here
TRUENAS_VERIFY_TLS=false

POLL_INTERVAL_SECONDS=12
STALE_THRESHOLD_SECONDS=45
MAX_PARALLEL_BURNINS=2

# Stuck job detection — jobs running longer than this are marked unknown
STUCK_JOB_HOURS=24

# Security — comma-separated IPs or CIDRs, empty = allow all
# ALLOWED_IPS=10.0.0.0/24,127.0.0.1

LOG_LEVEL=INFO

# SMTP — daily digest at SMTP_REPORT_HOUR, immediate alerts on fail/pass
# Leave SMTP_HOST empty to disable all email.
# SMTP_HOST=smtp.duocircle.com
# SMTP_PORT=587
# SMTP_SSL_MODE=starttls            # starttls (default) | ssl | plain
# SMTP_TIMEOUT=60                   # connection timeout in seconds
# SMTP_USER=you@domain.com
# SMTP_PASSWORD=yourpassword
# SMTP_FROM=burnin@domain.com
# SMTP_TO=brandon@domain.com
# SMTP_REPORT_HOUR=8
# SMTP_DAILY_REPORT_ENABLED=true    # set false to skip daily report without disabling alerts
# SMTP_ALERT_ON_FAIL=true
# SMTP_ALERT_ON_PASS=false

# Webhook — POST JSON on burnin_passed / burnin_failed
# Works with Slack, Discord, ntfy.sh, Gotify, n8n, Home Assistant, etc.
# WEBHOOK_URL=https://ntfy.sh/your-topic
.gitignore (vendored, new file, 31 lines)
@@ -0,0 +1,31 @@
# Python
__pycache__/
*.py[cod]
*.pyo
*.pyd
.Python
*.egg-info/
dist/
build/
.venv/
venv/
env/

# Environment
.env
.env.*
!.env.example

# App data (SQLite DB, settings overrides, uploads)
data/

# macOS
.DS_Store
.AppleDouble
.LSOverride

# Editors
.vscode/
.idea/
*.swp
*.swo
CLAUDE.md (new file, 449 lines)
@@ -0,0 +1,449 @@
# TrueNAS Burn-In Dashboard — Project Context

> Drop this file in any new Claude session to resume work with full context.
> Last updated: 2026-02-22 (Stage 6d)

---

## What This Is

A self-hosted web dashboard for running and tracking hard-drive burn-in tests
against a TrueNAS CORE instance. Deployed on **maple.local** (10.0.0.138).

- **App URL**: http://10.0.0.138:8084 (or http://burnin.hellocomputer.xyz)
- **Stack path on maple.local**: `~/docker/stacks/truenas-burnin/`
- **Source (local mac)**: `~/Desktop/claude-sandbox/truenas-burnin/`
- **Compose synced to maple.local** via `scp` or manual copy

### Stages completed

| Stage | Description | Status |
|-------|-------------|--------|
| 1 | Mock TrueNAS CORE v2.0 API (15 drives, sda–sdo) | ✅ |
| 2 | Backend core (FastAPI, SQLite/WAL, poller, TrueNAS client) | ✅ |
| 3 | Dashboard UI (Jinja2, SSE live updates, dark theme) | ✅ |
| 4 | Burn-in orchestrator (queue, concurrency, start/cancel) | ✅ |
| 5 | History page, job detail page, CSV export | ✅ |
| 6 | Hardening (retries, JSON logging, IP allowlist, poller watchdog) | ✅ |
| 6b | UX overhaul (stats bar, alerts, batch, notifications, location, print, analytics) | ✅ |
| 6c | Settings overhaul (editable form, runtime store, SMTP fix, stage selection) | ✅ |
| 6d | Cancel SMART tests, Cancel All burn-ins, drag-to-reorder stages in modals | ✅ |
| 7 | Cut to real TrueNAS | 🔲 future |
---

## File Map

```
truenas-burnin/
├── docker-compose.yml        # two services: mock-truenas + app
├── Dockerfile                # app container
├── requirements.txt
├── .env.example
├── data/                     # SQLite DB lives here (gitignored, created on deploy)
│
├── mock-truenas/
│   ├── Dockerfile
│   └── app.py                # FastAPI mock of TrueNAS CORE v2.0 REST API
│
└── app/
    ├── __init__.py
    ├── config.py             # pydantic-settings; reads .env
    ├── database.py           # schema, migrations, init_db(), get_db()
    ├── models.py             # Pydantic v2 models; StartBurninRequest has run_surface/run_short/run_long + profile property
    ├── settings_store.py     # runtime settings store — persists to /data/settings_overrides.json
    ├── truenas.py            # httpx async client with retry (lambda factory pattern)
    ├── poller.py             # poll loop, SSE pub/sub, stale detection, stuck-job check
    ├── burnin.py             # orchestrator, semaphore, stages, check_stuck_jobs()
    ├── notifier.py           # webhook + immediate email alerts on job completion
    ├── mailer.py             # daily HTML email + per-job alert email
    ├── logging_config.py     # structured JSON logging
    ├── renderer.py           # Jinja2 + filters (format_bytes, format_eta, format_elapsed, …)
    ├── routes.py             # all FastAPI route handlers
    ├── main.py               # app factory, IP allowlist middleware, lifespan
    │
    ├── static/
    │   ├── app.css           # full dark theme + mobile responsive
    │   └── app.js            # push notifications, batch, elapsed timers, inline edit
    │
    └── templates/
        ├── layout.html       # header nav: History, Stats, Audit, Settings, bell button
        ├── dashboard.html    # stats bar, failed banner, batch bar
        ├── history.html
        ├── job_detail.html   # + Print/Export button
        ├── audit.html        # audit event log
        ├── stats.html        # analytics: pass rate by model, daily activity
        ├── settings.html     # editable 2-col form: SMTP (left) + Notifications/Behavior/Webhook (right)
        ├── job_print.html    # print view with client-side QR code (qrcodejs CDN)
        └── components/
            ├── drives_table.html  # checkboxes, elapsed time, location inline edit
            ├── modal_start.html   # single-drive burn-in modal
            └── modal_batch.html   # batch burn-in modal
```
---

## Architecture Overview

```
Browser ──HTMX SSE──▶ GET /sse/drives
                          │
                    poller.subscribe()
                          │
                    asyncio.Queue ◀─── poller.run() notifies after each poll
                          │            & after each burnin stage update
                    render drives_table.html
                    yield SSE "drives-update" event
```

- **Poller** (`poller.py`): runs every `POLL_INTERVAL_SECONDS` (default 12s), calls
  TrueNAS `/api/v2.0/disk` and `/api/v2.0/core/get_jobs`, writes to SQLite,
  notifies SSE subscribers.
- **Burn-in** (`burnin.py`): `asyncio.Semaphore(max_parallel_burnins)` gates
  concurrency. Jobs are created immediately (queued state); the semaphore gates
  actual execution. On startup, any interrupted running jobs → state=unknown;
  queued jobs are re-enqueued.
- **SSE** (`routes.py /sse/drives`): one persistent connection per browser tab.
  Renders a fresh `drives_table.html` HTML fragment on every notification.
- **HTMX** (`dashboard.html`): `hx-ext="sse"` + `sse-swap="drives-update"`
  replaces `#drives-tbody` content without a page reload.
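The subscribe/notify flow in the diagram can be sketched as a minimal pub/sub class. Method names follow the interface described above; the real poller also polls TrueNAS and writes to SQLite, so this is a sketch, not the actual implementation:

```python
import asyncio


class Poller:
    """Minimal sketch of the SSE pub/sub pattern: each browser tab gets
    its own asyncio.Queue, and the poll loop pushes an event name into
    every queue after each poll or burn-in stage update."""

    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        # Called once per SSE connection.
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        # Called when the SSE connection closes.
        self._subscribers.discard(q)

    def notify(self, event: str = "drives-update") -> None:
        # Non-blocking fan-out to every subscriber.
        for q in self._subscribers:
            q.put_nowait(event)
```

Each SSE handler then awaits `q.get()` in a loop, re-renders the drives table fragment, and yields it as a `drives-update` event.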
---

## Database Schema (SQLite WAL mode)

```sql
-- drives: upsert by truenas_disk_id (the TrueNAS internal disk identifier)
drives (id, truenas_disk_id UNIQUE, devname, serial, model, size_bytes,
        temperature_c, smart_health, last_polled_at)

-- smart_tests: one row per drive+test_type combination (UNIQUE constraint)
smart_tests (id, drive_id FK, test_type CHECK('short','long'),
             state, percent, started_at, eta_at, finished_at, error_text,
             UNIQUE(drive_id, test_type))

-- burnin_jobs: one row per burn-in run (multiple per drive over time)
burnin_jobs (id, drive_id FK, profile, state CHECK(queued/running/passed/
             failed/cancelled/unknown), percent, stage_name, operator,
             created_at, started_at, finished_at, error_text)

-- burnin_stages: one row per stage per job
burnin_stages (id, burnin_job_id FK, stage_name, state, percent,
               started_at, finished_at, error_text)

-- audit_events: append-only log
audit_events (id, event_type, drive_id, job_id, operator, note, created_at)
```
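WAL mode is what lets the poller write while SSE handlers read concurrently. A minimal sketch of opening the database that way (illustrative only; the real `init_db()` in database.py also applies the schema and migrations):

```python
import sqlite3


def init_db(path: str) -> sqlite3.Connection:
    """Open the SQLite DB in WAL mode so readers are not blocked by the
    poller's writes. Sketch only -- schema/migration steps omitted."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")   # persists in the DB file
    conn.execute("PRAGMA foreign_keys=ON")    # enforce the FK columns above
    conn.row_factory = sqlite3.Row            # dict-style row access
    return conn
```

Note that `journal_mode=WAL` does not apply to in-memory databases, so it must be tested against a file-backed DB.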
---

## Burn-In Stage Definitions

```python
STAGE_ORDER = {
    "quick": ["precheck", "short_smart", "io_validate", "final_check"],
    "full": ["precheck", "surface_validate", "short_smart", "long_smart", "final_check"],
}
```

The UI only exposes the **full** profile (destructive); the quick profile exists for dev/testing.
Note: as of Stage 6c the UI switched to per-stage checkboxes and `STAGE_ORDER` grew to all
7 profile combinations — see Feature Reference (Stage 6c) below.
---

## TrueNAS API Contracts Used

| Method | Endpoint | Notes |
|--------|----------|-------|
| GET | `/api/v2.0/disk` | List all disks |
| POST | `/api/v2.0/smart/test` | Start SMART test `{disks:[name], type:"SHORT"\|"LONG"}` |
| GET | `/api/v2.0/core/get_jobs` | Filter `[["method","=","smart.test"]]` |
| POST | `/api/v2.0/core/job_abort` | `job_id` positional arg |
| GET | `/api/v2.0/smart/test/results/{disk}` | Per-disk SMART results |

Auth: `Authorization: Bearer {TRUENAS_API_KEY}` header.
---

## Config / Environment Variables

All read from `.env` via `pydantic-settings`. See `.env.example` for the full list.

| Variable | Default | Notes |
|----------|---------|-------|
| `APP_HOST` | `0.0.0.0` | |
| `APP_PORT` | `8080` | `.env.example` sets `8084` for this deployment |
| `DB_PATH` | `/data/app.db` | Inside container |
| `TRUENAS_BASE_URL` | `http://localhost:8000` | Point at mock or real TrueNAS |
| `TRUENAS_API_KEY` | `mock-key` | Real API key for prod |
| `TRUENAS_VERIFY_TLS` | `false` | Set true for prod with valid cert |
| `POLL_INTERVAL_SECONDS` | `12` | |
| `STALE_THRESHOLD_SECONDS` | `45` | UI shows warning if data older than this |
| `MAX_PARALLEL_BURNINS` | `2` | asyncio.Semaphore limit |
| `SURFACE_VALIDATE_SECONDS` | `45` | Mock only — duration of surface stage |
| `IO_VALIDATE_SECONDS` | `25` | Mock only — duration of I/O stage |
| `STUCK_JOB_HOURS` | `24` | Hours before a running job is auto-marked unknown |
| `LOG_LEVEL` | `INFO` | |
| `ALLOWED_IPS` | `` | Empty = allow all. Comma-sep IPs/CIDRs |
| `SMTP_HOST` | `` | Empty = email disabled |
| `SMTP_PORT` | `587` | |
| `SMTP_USER` | `` | |
| `SMTP_PASSWORD` | `` | |
| `SMTP_FROM` | `` | |
| `SMTP_TO` | `` | Comma-separated |
| `SMTP_REPORT_HOUR` | `8` | Local hour (0–23) to send daily report |
| `SMTP_ALERT_ON_FAIL` | `true` | Immediate email when a job fails |
| `SMTP_ALERT_ON_PASS` | `false` | Immediate email when a job passes |
| `WEBHOOK_URL` | `` | POST JSON on burnin_passed/burnin_failed. Works with ntfy, Slack, Discord, n8n |
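The `ALLOWED_IPS` semantics above (empty string allows everyone; otherwise the client must match one of the comma-separated IPs or CIDR networks) can be sketched with the stdlib `ipaddress` module. The helper name is illustrative, not the middleware's actual function:

```python
import ipaddress


def ip_allowed(client_ip: str, allowed: str) -> bool:
    """Return True if client_ip passes the ALLOWED_IPS filter.

    Empty/blank ALLOWED_IPS means allow all; single IPs parse as /32
    (or /128) networks, so one code path handles both forms.
    """
    if not allowed.strip():
        return True
    addr = ipaddress.ip_address(client_ip)
    for entry in allowed.split(","):
        if addr in ipaddress.ip_network(entry.strip(), strict=False):
            return True
    return False
```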
---

## Deploy Workflow

### First deploy (already done)
```bash
# On maple.local
cd ~/docker/stacks/truenas-burnin
docker compose up -d --build
```

### Redeploy after code changes
```bash
# Copy changed files from mac to maple.local first, e.g.:
scp -P 2225 -r app/ brandon@10.0.0.138:~/docker/stacks/truenas-burnin/

# Then on maple.local:
ssh brandon@10.0.0.138 -p 2225
cd ~/docker/stacks/truenas-burnin
docker compose up -d --build
```

### Reset the database (e.g. after schema changes)
```bash
# On maple.local — stop containers first
docker compose stop app
# Delete DB using alpine (container owns the file, sudo not available)
docker run --rm -v ~/docker/stacks/truenas-burnin/data:/data alpine rm -f /data/app.db
docker compose start app
```

### Check logs
```bash
docker compose logs -f app
docker compose logs -f mock-truenas
```
---

## Mock TrueNAS Server (`mock-truenas/app.py`)

- 15 drives: `sda`–`sdo`
- Drive mix: 3× ST12000NM0008 12TB, 3× WD80EFAX 8TB, 2× ST16000NM001G 16TB,
  2× ST4000VN008 4TB, 2× TOSHIBA MG06ACA10TE 10TB, 1× HGST HUS728T8TAL5200 8TB,
  1× Seagate Barracuda ST6000DM003 6TB, 1× **FAIL001** (sdn) — always fails at ~30%
- SHORT test: 90s simulated; LONG test: 480s simulated; tick every 5s
- Debug endpoints:
  - `POST /debug/reset` — reset all jobs/state
  - `GET /debug/state` — dump current state
  - `POST /debug/complete-all-jobs` — instantly complete all running tests
---

## Key Implementation Patterns

### Retry pattern — lambda factory (NOT coroutine object)
```python
# CORRECT: pass a factory so each retry creates a fresh coroutine
r = await _with_retry(lambda: self._client.get("/api/v2.0/disk"), "get_disks")

# WRONG: coroutine is exhausted after first await, retry silently fails
r = await _with_retry(self._client.get("/api/v2.0/disk"), "get_disks")
```
### SSE template rendering
```python
# Use templates.env.get_template().render() — not TemplateResponse (that's a Response object)
html = templates.env.get_template("components/drives_table.html").render(drives=drives)
yield {"event": "drives-update", "data": html}
```

### Sticky thead scroll fix
```css
/* BOTH axes required on table-wrap for position:sticky to work on thead */
.table-wrap {
  overflow: auto;   /* NOT overflow-x: auto */
  max-height: calc(100vh - 130px);
}
thead { position: sticky; top: 0; z-index: 10; }
```

### export.csv route ordering
```python
# MUST register export.csv BEFORE /{job_id} — FastAPI tries int() on "export.csv"
@router.get("/api/v1/burnin/export.csv")   # first
async def burnin_export_csv(...): ...

@router.get("/api/v1/burnin/{job_id}")     # second
async def burnin_get(job_id: int, ...): ...
```
---

## Known Issues / Past Bugs Fixed

| Bug | Root Cause | Fix |
|-----|-----------|-----|
| `_execute_stages` used `STAGE_ORDER[profile]`, ignoring custom order | Stage order stored in DB but never read back | `_run_job` reads stages from `burnin_stages ORDER BY id`; `_execute_stages` accepts `stages: list[str]` |
| Poller stuck at 'running' after completion | `_sync_history()` had an early-return guard when state=running | Removed the guard — `_sync_history` is only called when the job is not in the active dict |
| DB schema tables missing after edit | Tables split into a separate variable never passed to `executescript()` | Put all tables in a single `SCHEMA` string |
| Retry not retrying | `_with_retry(coro)` — coroutine exhausted after first failure | Changed to `_with_retry(factory: Callable[[], Coroutine])` |
| `error_text` overwritten | `_finish_stage(success=False)` overwrote the error set by the stage handler | `_finish_stage` omits the `error_text` column in SQL when the param is None |
| Cancelled stage showed 'failed' | `_execute_stages` called `_finish_stage(success=False)` on cancel | Check `_is_cancelled()`, call `_cancel_stage()` instead |
| export.csv returns 422 | Route registered after `/{job_id}`; FastAPI tries `int("export.csv")` | Move the export route before the parameterized route |
| Old drive names persist after mock rename | Poller upserts by `truenas_disk_id`, so old rows stay | Delete `app.db` and restart |
| First row clipped behind sticky thead | `overflow-x: auto` only creates a partial stacking context | Use `overflow: auto` (both axes) on `.table-wrap` |
| `rm data/app.db` permission denied | Container owns the file | Use `docker run --rm -v .../data:/data alpine rm -f /data/app.db` |
| First row clipped after Stage 6b | Stats bar added 70px but max-height was not updated | `max-height: calc(100vh - 205px)` |
| SMTP "Connection unexpectedly closed" | `_send_email` used `settings.smtp_port` (587 default) even in SSL mode | Derive port from mode via the `_MODE_PORTS` dict; SSL→465, STARTTLS→587, Plain→25 |
| SSL mode missing EHLO | `smtplib.SMTP_SSL` was created without calling `ehlo()` | Added `server.ehlo()` after both SSL and STARTTLS connections |
---

## Stage 7 — Cutting to Real TrueNAS (TODO)

When ready to test against a real TrueNAS CORE box:

1. In `.env` on maple.local, set:
   ```env
   TRUENAS_BASE_URL=https://10.0.0.203   # or whatever your TrueNAS IP is
   TRUENAS_API_KEY=your-real-key-here
   TRUENAS_VERIFY_TLS=false              # unless you have a valid cert
   ```
2. Comment out the `mock-truenas` service in `docker-compose.yml` (or leave it running — harmless)
3. Verify the TrueNAS CORE v2.0 API contract matches what `truenas.py` expects:
   - `GET /api/v2.0/disk` returns a list with `name`, `serial`, `model`, `size`, `temperature`
   - `GET /api/v2.0/core/get_jobs` with filter `[["method","=","smart.test"]]`
   - `POST /api/v2.0/smart/test` accepts `{disks: [devname], type: "SHORT"|"LONG"}`
4. Check that disk names match the expected format (TrueNAS CORE uses `ada0`, `da0`, etc. — not `sda`)
   - You may need to update mock drive names back or adjust poller logic
5. Delete `app.db` to clear mock drive rows before the first real poll
---

## Feature Reference (Stage 6b)

### New Pages
| URL | Description |
|-----|-------------|
| `/stats` | Analytics — pass rate by model, daily activity last 14 days |
| `/audit` | Audit log — last 200 events with drive/operator context |
| `/settings` | Editable 2-col settings form (SMTP, Notifications, Behavior, Webhook) |
| `/history/{id}/print` | Print-friendly job report with QR code |

### New API Routes (6b + 6c)
| Method | Path | Description |
|--------|------|-------------|
| `PATCH` | `/api/v1/drives/{id}` | Update `notes` and/or `location` |
| `POST` | `/api/v1/settings` | Save runtime settings to `/data/settings_overrides.json` |
| `POST` | `/api/v1/settings/test-smtp` | Test SMTP connection without sending email |

### Notifications
- **Browser push**: Bell icon in header → `Notification.requestPermission()`. Fires on `job-alert` SSE event (burnin pass/fail).
- **SSE alert event**: `job-alert` event type on `/sse/drives`. JS listens via `htmx:sseMessage`.
- **Immediate email**: `send_job_alert()` in mailer.py. Triggered by `notifier.notify_job_complete()` from burnin.py.
- **Webhook**: `notifier._send_webhook()` — POST JSON to `WEBHOOK_URL`. Payload includes event, job_id, devname, serial, model, state, operator, error_text.
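The webhook payload described above can be sketched like this. The exact key names are assumed from the field list; `notifier._send_webhook()` may serialize them differently:

```python
import json


def webhook_payload(event: str, job: dict) -> bytes:
    """Build the JSON body POSTed to WEBHOOK_URL on job completion.

    Field list from the Notifications section; key names are assumed.
    """
    body = {
        "event": event,                       # burnin_passed | burnin_failed
        "job_id": job["id"],
        "devname": job["devname"],
        "serial": job["serial"],
        "model": job["model"],
        "state": job["state"],
        "operator": job["operator"],
        "error_text": job.get("error_text"),  # None when the job passed
    }
    return json.dumps(body).encode()
```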
### Stuck Job Detection
- `burnin.check_stuck_jobs()` runs every 5 poll cycles (~1 min)
- Jobs running longer than `STUCK_JOB_HOURS` (default 24h) → state=unknown
- Logged at CRITICAL level; audit event written
### Batch Burn-In
- Checkboxes on each idle/selectable drive row
- Batch bar appears in the filter row when any drives are selected
- Uses the existing `POST /api/v1/burnin/start` with multiple `drive_ids`
- Requires operator name + explicit confirmation checkbox (no serial required)
- JS `checkedDriveIds` Set persists across SSE swaps via `restoreCheckboxes()`

### Drive Location
- `location` and `notes` fields added to the drives table via ALTER TABLE migration
- Inline click-to-edit on the location field in the drive name cell
- Saves via `PATCH /api/v1/drives/{id}` on blur/Enter; restores on Escape
## Feature Reference (Stage 6c)

### Settings Page
- Two-column layout: SMTP card (left, wider) + Notifications / Behavior / Webhook stacked (right)
- Read-only system card at bottom (TrueNAS URL, poll interval, etc.) with a restart-required badge
- All changes save instantly via `POST /api/v1/settings` → `settings_store.save()` → `/data/settings_overrides.json`
- Overrides loaded on startup in `main.py` lifespan via `settings_store.init()`
- Connection mode dropdown auto-sets port: STARTTLS→587, SSL/TLS→465, Plain→25
- Test Connection button at top of SMTP card — tests live settings without sending email
- Brand logo in header is now a clickable `<a href="/">` home link
### SMTP Port Derivation
```python
# mailer.py — port is derived from mode, NOT from settings.smtp_port
_MODE_PORTS = {"starttls": 587, "ssl": 465, "plain": 25}
port = _MODE_PORTS.get(mode, 587)
```
Never use `settings.smtp_port` in mailer — it's kept in config for `.env` backward compat only.
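Combining the port derivation with the EHLO fix from the Known Issues table gives roughly the following shape. Helper names are illustrative; the real `_send_email` also authenticates and sends the message:

```python
import smtplib
import ssl

_MODE_PORTS = {"starttls": 587, "ssl": 465, "plain": 25}


def connect_plan(mode: str) -> tuple[type, int, bool]:
    """Return (smtplib class, port, needs_starttls) for an SMTP_SSL_MODE."""
    port = _MODE_PORTS.get(mode, 587)
    cls = smtplib.SMTP_SSL if mode == "ssl" else smtplib.SMTP
    return cls, port, mode == "starttls"


def open_smtp(host: str, mode: str, timeout: int = 60) -> smtplib.SMTP:
    """Open a connection per mode, calling ehlo() after BOTH connection
    styles (the missing-EHLO bug from the table above). Sketch only."""
    cls, port, needs_starttls = connect_plan(mode)
    server = cls(host, port, timeout=timeout)
    server.ehlo()
    if needs_starttls:
        server.starttls(context=ssl.create_default_context())
        server.ehlo()  # re-EHLO after the TLS upgrade, per RFC 3207
    return server
```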
### Burn-In Stage Selection
`StartBurninRequest` no longer takes `profile: str`. Instead it takes:
- `run_surface: bool = True` — surface validate (destructive write test)
- `run_short: bool = True` — Short SMART (non-destructive)
- `run_long: bool = True` — Long SMART (non-destructive)

The profile string is computed as a property. Profiles: `full`, `surface_short`, `surface_long`,
`surface`, `short_long`, `short`, `long`. Precheck and final_check always run.

`STAGE_ORDER` in `burnin.py` has all 7 profile combinations.

`_recalculate_progress()` uses the `_STAGE_BASE_WEIGHTS` dict (per-stage weights) and computes
the overall % dynamically from the actual `burnin_stages` rows — no profile lookup needed.

In the UI, both the single-drive and batch modals show 3 checkboxes. If surface is unchecked:
- The destructive warning is hidden
- The serial confirmation field is hidden (single modal)
- The confirmation checkbox is hidden (batch modal)
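The weighted recalculation can be sketched as follows. The weight values here are hypothetical placeholders; the real `_STAGE_BASE_WEIGHTS` lives in burnin.py:

```python
# Hypothetical weights -- actual values are defined in burnin.py.
_STAGE_BASE_WEIGHTS = {
    "precheck": 1, "short_smart": 5, "long_smart": 30,
    "surface_validate": 60, "io_validate": 10, "final_check": 1,
}


def recalc_progress(stages: list[tuple[str, float]]) -> float:
    """Overall percent from (stage_name, stage_percent) rows.

    Computed from whatever burnin_stages rows exist for the job, so no
    profile lookup is needed -- any stage combination just works.
    """
    total = sum(_STAGE_BASE_WEIGHTS[name] for name, _ in stages)
    if total == 0:
        return 0.0
    done = sum(_STAGE_BASE_WEIGHTS[name] * pct / 100 for name, pct in stages)
    return round(100 * done / total, 1)
```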
### Table Scroll Fix
```css
.table-wrap {
  max-height: calc(100vh - 205px);  /* header(44) + main-pad(20) + stats-bar(70) + filter-bar(46) + buffer */
}
```
If the stats bar or other content height changes, update this offset.
## Feature Reference (Stage 6d)

### Cancel Functionality
| What | How |
|------|-----|
| Cancel running Short SMART | `✕ Short` button appears in action col when `short_busy`; calls `POST /api/v1/drives/{id}/smart/cancel` with `{type:"short"}` |
| Cancel running Long SMART | `✕ Long` button appears when `long_busy`; same route with `{type:"long"}` |
| Cancel individual burn-in | `✕ Burn-In` button (was "Cancel") shown when `bi_active`; calls `POST /api/v1/burnin/{id}/cancel` |
| Cancel All Running | Red `✕ Cancel All Burn-Ins` button appears in filter bar when any burn-in jobs are active; JS collects all `.btn-cancel[data-job-id]` and cancels each |

**SMART cancel route** (`POST /api/v1/drives/{drive_id}/smart/cancel`):
1. Fetches all running TrueNAS jobs via `client.get_smart_jobs()`
2. Finds the job where `arguments[0].disks` contains the drive's devname
3. Calls `client.abort_job(tn_job_id)`
4. Updates the `smart_tests` table row to `state='aborted'`
### Stage Reordering
- Default order changed to: **Short SMART → Long SMART → Surface Validate** (non-destructive first)
- Drag handles (⠿) on each stage row in both single and batch modals
- HTML5 drag-and-drop, no external library
- `getStageOrder(listId)` reads the current DOM order of checked stages
- `stage_order: ["short_smart","long_smart","surface_validate"]` sent in the API body
- `StartBurninRequest.stage_order: list[str] | None` — validated against allowed stage names
- `burnin.start_job()` accepts a `stage_order` param and builds: `["precheck"] + stage_order + ["final_check"]`
- `_run_job()` reads stage names back from `burnin_stages ORDER BY id` — so the custom order is honoured
- Destructive warning / serial confirmation still triggered by the `stage-surface` checkbox ID (order-independent)
## NPM / DNS Setup

- Proxy host: `burnin.hellocomputer.xyz` → `http://10.0.0.138:8080`
- Authelia protection: recommended (no built-in auth in app)
- DNS: `burnin.hellocomputer.xyz` CNAME → `sandon.hellocomputer.xyz` (proxied: false)
Dockerfile (new file, 10 lines)
@@ -0,0 +1,10 @@
FROM python:3.12-slim

WORKDIR /opt/app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app/ ./app/

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8084"]
SPEC.md (new file, 296 lines)
@@ -0,0 +1,296 @@
# TrueNAS Burn-In — Project Specification

**Version:** 0.5.0
**Status:** Active Development
**Audience:** Public / Open Source

---

## Overview

TrueNAS Burn-In is a self-hosted web dashboard that runs on a separate machine or VM and connects to a TrueNAS system via SSH to automate and monitor the drive burn-in process. It is designed for users who want to validate new hard drives before adding them to a ZFS pool — where reliability is non-negotiable.

The app is not a TrueNAS plugin and does not run on TrueNAS itself. It connects remotely over SSH to issue smartctl and badblocks commands, polls results, and presents everything through a dark-themed real-time dashboard. It is deployed via Docker Compose and configured through a Settings UI and `.env` file.

---

## Core Philosophy

- Drives going into a ZFS pool must be rock solid. The app's job is to surface any doubt about a drive before it earns a slot in the pool.
- Burn-in is always triggered manually. There is no scheduling or automation.
- Simplicity over features. The README and Settings UI should be sufficient for any technically capable user to be up and running without hand-holding.
- Recommend safe defaults. Warn loudly when users push limits (too many parallel jobs, destructive operations, high temperatures).
---

## Test Sequence

Every drive goes through the following stages in order. A failure at any stage stops that drive immediately.

### Stage 1 — Short SMART Test
```
smartctl -t short -d sat /dev/sdX
```
Polls for completion via:
```
smartctl -a -d sat /dev/sdX | grep -i remaining
```
Expected duration: ~2 minutes. If the test fails or reports any critical attribute violation, the drive is marked FAILED and no further tests run.
### Stage 2 — Long SMART Test
```
smartctl -t long -d sat /dev/sdX
```
Expected duration: varies by drive size (typically 3–6 hours for 8–12TB drives). Same polling approach. Same failure behavior.
### Stage 3 — Surface Scan (Badblocks, Destructive)
```
badblocks -wsv -b 4096 -p 1 /dev/sdX
```
This is a **destructive write test**. The UI must display a prominent warning before this stage begins, and again in the Settings page where the behavior is documented. The `-w` flag overwrites all data on the drive. This is intentional — these are new drives being validated before pool use.

**Failure threshold:** 2 or more bad blocks found triggers immediate abort and FAILED status. The threshold should be configurable in Settings (default: 2).
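Applying the threshold to badblocks' summary line can be sketched as follows, assuming the usual "Pass completed, N bad blocks found." wording:

```python
import re


def badblocks_verdict(summary_line: str, threshold: int = 2) -> tuple[int, str]:
    """Parse the bad-block count from badblocks' summary line and apply
    the configurable failure threshold (default 2, per the spec)."""
    m = re.search(r"(\d+)\s+bad\s+blocks?\s+found", summary_line)
    count = int(m.group(1)) if m else 0
    return count, ("FAILED" if count >= threshold else "PASSED")
```

In practice the orchestrator would abort the scan as soon as the live error count crosses the threshold rather than waiting for the summary line.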
---

## SMART Attributes to Monitor

The following attributes are checked after each SMART test and continuously during the burn-in run. Any non-zero value is flagged at least as a warning; attributes marked FAIL in the table below disqualify the drive immediately.

| ID  | Attribute              | Threshold  | Notes                                     |
|-----|------------------------|------------|-------------------------------------------|
| 5   | Reallocated_Sector_Ct  | > 0 = FAIL | Any reallocation is disqualifying for ZFS |
| 10  | Spin_Retry_Count       | > 0 = WARN | Mechanical stress indicator               |
| 188 | Command_Timeout        | > 0 = WARN | Drive not responding to commands          |
| 197 | Current_Pending_Sector | > 0 = FAIL | Sectors waiting to be reallocated         |
| 198 | Offline_Uncorrectable  | > 0 = FAIL | Unrecoverable read errors                 |
| 199 | UDMA_CRC_Error_Count   | > 0 = WARN | Likely cable/controller, flag for review  |
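The table above reduces to a small verdict function, sketched here with the attribute IDs grouped by severity (a sketch of the policy, not the app's actual checker):

```python
# Attribute IDs from the table: any non-zero raw value in FAIL_ATTRS
# disqualifies the drive; WARN_ATTRS flag it for review.
FAIL_ATTRS = {5, 197, 198}    # reallocated, pending, offline-uncorrectable
WARN_ATTRS = {10, 188, 199}   # spin retry, command timeout, UDMA CRC


def evaluate_smart(values: dict[int, int]) -> str:
    """Return 'FAIL', 'WARN', or 'PASS' for a dict of {attr_id: raw_value}."""
    if any(values.get(a, 0) > 0 for a in FAIL_ATTRS):
        return "FAIL"
    if any(values.get(a, 0) > 0 for a in WARN_ATTRS):
        return "WARN"
    return "PASS"
```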
---

## Failure Behavior

When a drive fails at any stage:

1. All remaining tests for that drive are immediately cancelled.
2. The drive is marked `FAILED` in the UI with the specific failure reason (e.g., `FAILED (SURFACE VALIDATE)`, `FAILED (REALLOCATED SECTORS)`).
3. An alert is fired immediately via whichever notification channels are enabled in Settings (email and/or webhook — both can fire simultaneously).
4. The failed drive's row is visually distinct in the dashboard and cannot be accidentally re-queued without an explicit reset action.

A **Reset** action clears the test state for a drive so it can be re-queued. It does not cancel in-progress tests — the Cancel button does that. Reset is only available on completed drives (passed, failed, or interrupted).
---

## UI

### Dashboard (Main View)

- **Stats bar:** Total drives, Running, Failed, Passed, Idle counts.
- **Filter chips:** All / Running / Failed / Passed / Idle — filters the table below.
- **Drive table columns:** Drive (device name + model), Serial, Size, Temp, Health, Short SMART, Long SMART, Burn-In, Actions.
- **Temperature display:** Color-coded. Green ≤ 45°C, Yellow 46–54°C, Red ≥ 55°C. Thresholds configurable in Settings.
- **Running tests:** Show an animated progress bar with percentage and elapsed time instead of a static badge.
- **Actions per drive:** Short, Long, Burn-In buttons. A Cancel button replaces Start when a test is running.
- **Row click:** Opens the Log Drawer for that drive.
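The temperature bands above reduce to a tiny helper, with the thresholds as parameters so Settings can override them (function name illustrative):

```python
def temp_color(temp_c: int, green_max: int = 45, red_min: int = 55) -> str:
    """Map a drive temperature to the dashboard color band.

    Defaults match the spec: green <= 45, yellow 46-54, red >= 55.
    """
    if temp_c <= green_max:
        return "green"
    if temp_c >= red_min:
        return "red"
    return "yellow"
```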
### Log Drawer

Slides up from the bottom of the page when a drive row is clicked. It does not navigate away — the table remains visible and scrollable above.

Three tabs:
- **badblocks** — live tail of badblocks stdout, with error lines and sector numbers highlighted in red.
- **SMART** — output of the last smartctl run for this drive, with monitored attribute values highlighted.
- **Events** — chronological timeline of everything that happened to this drive (test started, test passed, failure detected, alert sent, etc.).

Features:
- Auto-scroll toggle (on by default).
- Blinking cursor on the active output line of a running test.
- Close button, or click the same row again to dismiss.
- Failed drives show error lines in red with exact bad-block sector numbers.
### History Page

Per-drive history. Each drive (identified by serial number) has a log of every burn-in run ever performed, with timestamps, results, and duration. Not per-session — per individual drive.

### Audit Page

Application-level event log. Records: test started, test cancelled, settings changed, alert sent, container restarted, SSH connection lost/restored. Useful for debugging and for open source users troubleshooting their setup.

### Stats Page

Aggregate statistics across all drives and all time. Pass rate, average test duration by drive size, failure breakdown by failure type.

### Settings Page

Divided into sections:

**EMAIL (SMTP)**
- Host, Mode (STARTTLS/SSL/plain), Port, Timeout, Username, Password, From, To.
- Test Connection button.
- Enable/disable toggle.

**WEBHOOK**
- Single URL field. POST JSON payload on `burnin_passed` and `burnin_failed` events.
- Compatible with ntfy.sh, Slack, Discord, n8n, and any generic HTTP POST receiver.
- Leave blank to disable.

**NOTIFICATIONS**
- Daily Report toggle (sends full drive status email at a configured hour).
- Alert on Failure toggle (immediate — fires both email and webhook if both configured).
- Alert on Pass toggle.
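Scheduling the daily report at the configured hour reduces to computing how long to sleep until the next occurrence of that hour. A sketch using ordinary `datetime` arithmetic (the function name is illustrative, not the app's actual scheduler):

```python
from datetime import datetime, timedelta


def seconds_until_hour(hour: int, now: datetime) -> float:
    """Seconds from `now` until the wall clock next reaches `hour`:00."""
    target = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)  # today's slot already passed: use tomorrow's
    return (target - now).total_seconds()
```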

**BURN-IN BEHAVIOR**
- Max Parallel Burn-Ins (default: 2, max: 60).
- Warning displayed inline when set above 8: "Running many simultaneous surface scans may saturate your storage controller and produce unreliable results. Recommended: 2–4."
- Bad block failure threshold (default: 2).
- Stuck job threshold in hours (default: 24 — jobs running longer than this are auto-marked Unknown).

**TEMPERATURE**
- Warning threshold (default: 46°C).
- Critical threshold (default: 55°C).

**SSH**
- TrueNAS host/IP.
- Port (default: 22).
- Username.
- Authentication: SSH key (paste or upload) or password.
- Test Connection button.

**SYSTEM** *(restart required to change — set in .env)*
- TrueNAS API URL.
- Verify TLS toggle.
- Poll interval (default: 12s).
- Stale threshold (default: 45s).
- IP allowlist.
- Log level (DEBUG / INFO / WARN / ERROR).
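The allowlist semantics (comma-separated IPs or CIDRs, empty means allow all) map directly onto the standard `ipaddress` module. A sketch:

```python
import ipaddress


def ip_allowed(client_ip: str, allowed_ips: str) -> bool:
    """Check client_ip against a comma-separated list of IPs/CIDRs.

    An empty list allows everyone, matching the documented default.
    """
    entries = [e.strip() for e in allowed_ips.split(",") if e.strip()]
    if not entries:
        return True
    addr = ipaddress.ip_address(client_ip)
    # A bare IP like "127.0.0.1" parses as a /32 network, so membership works.
    return any(addr in ipaddress.ip_network(e, strict=False) for e in entries)
```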

**VERSION & UPDATES**
- Displays current version (starting at 0.5.0).
- "Check for Updates" button — queries the GitHub releases API and shows the latest version with a link if an update is available.
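The update check can hit GitHub's public `GET /repos/{owner}/{repo}/releases/latest` endpoint, whose JSON response includes a `tag_name` field. A sketch (the repo slug and helper names are illustrative, not the app's actual implementation):

```python
import json
import urllib.request


def parse_version(tag: str) -> tuple[int, ...]:
    """'v0.6.1' or '0.6.1' -> (0, 6, 1), suitable for tuple comparison."""
    return tuple(int(p) for p in tag.lstrip("v").split("."))


def update_available(current: str, latest_tag: str) -> bool:
    return parse_version(latest_tag) > parse_version(current)


def fetch_latest_tag(repo: str) -> str:
    """Query GitHub's releases API for the newest release tag, e.g. repo='owner/name'."""
    url = f"https://api.github.com/repos/{repo}/releases/latest"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["tag_name"]
```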

---

## Data Persistence

**SQLite** — single file, zero config, atomic writes. No data loss on container restart.

On restart, any drive that was in a `running` state is automatically transitioned to `interrupted`. The user sees "INTERRUPTED" in the burn-in column and must manually reset and re-queue the drive. The partial log up to the point of interruption is preserved and viewable in the Log Drawer.
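The restart reconciliation amounts to a single UPDATE run before the orchestrator accepts new work. A simplified `sqlite3` sketch (table shape reduced to the relevant columns; the orchestrator source uses the internal state name `unknown` for the same transition):

```python
import sqlite3


def mark_interrupted(db_path: str) -> int:
    """Flip any 'running' burn-in rows to 'interrupted' after a restart.

    Returns the number of rows changed.
    """
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "UPDATE burnin_jobs SET state='interrupted' WHERE state='running'"
        )
        conn.commit()
        return cur.rowcount
    finally:
        conn.close()
```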

Drive location labels persist in SQLite tied to serial number, so a drive's label survives container restarts and reappears automatically when the drive is detected again.

---

## Notifications

### Email

Standard SMTP. Fires on: burn-in failure (immediate), burn-in pass (if enabled), daily report (scheduled).

Failure email includes: drive name, serial number, size, failure stage, failure reason, bad block count (if applicable), SMART attribute snapshot, timestamp.
### Webhook

Single HTTP POST to the configured URL with a JSON body:

```json
{
  "event": "burnin_failed",
  "drive": "sda",
  "serial": "WDZ1A002",
  "size": "12 TB",
  "failure_reason": "SURFACE VALIDATE",
  "bad_blocks": 2,
  "timestamp": "2025-01-15T03:21:04Z"
}
```

Compatible with ntfy.sh, Slack incoming webhooks, Discord webhooks, and n8n HTTP trigger nodes.

Both email and webhook fire simultaneously when both are configured and enabled. The user controls each independently via Settings toggles.

---

## SSH Architecture

The app connects to TrueNAS over SSH from the host running the Docker container. It does not use the TrueNAS web API for drive operations — all smartctl and badblocks commands are issued directly over SSH.

Connection details are configured in Settings (not `.env`). Supports:

- Password authentication.
- SSH key authentication (key pasted or uploaded in the Settings UI).
- Custom port.
- Test Connection button that validates credentials before saving.

On SSH disconnection mid-test: the test process on TrueNAS may continue running (SSH disconnection does not kill the remote process if it was launched correctly with nohup or similar). The app marks the drive as `interrupted` in its own state, attempts to reconnect, and resumes polling if the process is still running. If the remote process is gone, the drive stays `interrupted`.
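Whether the remote process survives a dropped SSH session depends entirely on how it was launched: detaching it with `nohup`, backgrounding it, and echoing the PID lets the app re-poll after reconnecting. A sketch of building such a command line (the badblocks flags shown are typical for a destructive write-mode scan, not necessarily the app's exact invocation):

```python
import shlex


def detached_badblocks_cmd(device: str, log_path: str) -> str:
    """Build a remote shell command that survives SSH disconnects.

    nohup plus '&' detach the process from the SSH session; 'echo $!'
    returns the PID so the app can later check whether it is still alive.
    """
    inner = f"badblocks -ws {shlex.quote(device)}"
    return f"nohup {inner} > {shlex.quote(log_path)} 2>&1 & echo $!"
```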

---

## API

A REST API is available at `/api/v1/`. It is documented via OpenAPI at `/openapi.json` and browsable at `/api` in the dashboard. Version displayed: 0.1.0 (the API version is tracked independently from the app version).

Key endpoints:

- `GET /api/v1/drives` — list all drives with current status.
- `GET /api/v1/drives/{drive_id}` — single drive detail.
- `PATCH /api/v1/drives/{drive_id}` — update drive metadata (e.g., location label).
- `POST /api/v1/drives/{drive_id}/smart/start` — start a SMART test.
- `POST /api/v1/drives/{drive_id}/smart/cancel` — cancel a SMART test.
- `POST /api/v1/burnin/start` — start a burn-in job.
- `POST /api/v1/burnin/{job_id}/cancel` — cancel a burn-in job.
- `GET /sse/drives` — Server-Sent Events stream powering the real-time dashboard UI.
- `GET /health` — health check endpoint.

The API makes this app a strong candidate for MCP server integration, allowing an AI assistant to query drive status, start tests, or receive alerts conversationally.
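Any HTTP client can drive these endpoints. A minimal standard-library sketch that fetches the drive list and filters it (the `state` field name is an assumption about the response shape, not confirmed by the spec above):

```python
import json
import urllib.request


def get_drives(base_url: str) -> list[dict]:
    """GET /api/v1/drives and return the decoded JSON list."""
    with urllib.request.urlopen(f"{base_url}/api/v1/drives", timeout=10) as resp:
        return json.load(resp)


def running_drives(drives: list[dict]) -> list[dict]:
    # Assumed field name: each drive dict carries a 'state' key.
    return [d for d in drives if d.get("state") == "running"]
```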

---

## Deployment

Docker Compose. Minimum viable setup:

```bash
git clone https://github.com/yourusername/truenas-burnin
cd truenas-burnin
cp .env.example .env
# Edit .env for system-level settings (TrueNAS URL, poll interval, etc.)
docker compose up -d
```

Navigate to `http://your-vm-ip:port` and complete SSH and SMTP configuration in Settings.

All other configuration is done through the Settings UI — no manual file editing required beyond `.env` for system-level values.
---

## mock-truenas

A companion Docker service (`mock-truenas`) that simulates the TrueNAS API for UI development and testing without real hardware. It mocks drive discovery, SMART test responses, and badblocks progress. Used exclusively for development — not deployed in production.

### Testing on Real TrueNAS (v1.0 Milestone Plan)

To validate against real hardware:

1. Switch `TRUENAS_BASE_URL` in `.env` from `http://mock-truenas:8000` to your real TrueNAS IP/hostname.
2. Ensure SSH is enabled on TrueNAS (System → Services → SSH).
3. Configure SSH credentials in Settings and use Test Connection to verify.
4. Start with a single idle drive — run Short SMART only first.
5. Verify the log drawer shows real smartctl output.
6. If successful, proceed to Long SMART, then a full burn-in on a drive you're comfortable wiping.
7. Confirm an alert email is received on completion.
8. Scale to 2–4 drives simultaneously and monitor system resource warnings.

**v1.0 is considered production-ready when:** the app runs reliably on a real TrueNAS system with 10 simultaneous drives, a failure alert email is received correctly, and a passing drive's history is preserved across a container restart.

---

## Version

- App version starts at **0.5.0**.
- Displayed on the dashboard landing page header and in Settings.
- Update check in Settings queries the GitHub releases API.
- API version tracked separately, currently **0.1.0**.

---

## Out of Scope (v1.0)

- Scheduled or automated burn-in triggering.
- Non-destructive badblocks mode (read-only surface scan).
- Multi-TrueNAS support (single host only).
- User authentication / login wall (single-user, self-hosted, IP allowlist is sufficient).
- Mobile-optimized UI (desktop dashboard only).
0
app/__init__.py
Normal file
658
app/burnin.py
Normal file
"""
Burn-in orchestrator.

Manages a FIFO queue of burn-in jobs capped at MAX_PARALLEL_BURNINS concurrent
executions. Each job runs stages sequentially; a failed stage aborts the job.

State is persisted to SQLite throughout — DB is source of truth.

On startup:
- Any 'running' jobs from a previous run are marked 'unknown' (interrupted).
- Any 'queued' jobs are re-enqueued automatically.

Cancellation:
- cancel_job() sets DB state to 'cancelled'.
- Running stage coroutines check _is_cancelled() at POLL_INTERVAL boundaries
  and abort within a few seconds of the cancel request.
"""

import asyncio
import logging
import time
from datetime import datetime, timezone

import aiosqlite

from app.config import settings
from app.truenas import TrueNASClient

log = logging.getLogger(__name__)

# ---------------------------------------------------------------------------
# Stage definitions
# ---------------------------------------------------------------------------

STAGE_ORDER: dict[str, list[str]] = {
    # Legacy
    "quick": ["precheck", "short_smart", "io_validate", "final_check"],
    # Single-stage selectable profiles
    "surface": ["precheck", "surface_validate", "final_check"],
    "short": ["precheck", "short_smart", "final_check"],
    "long": ["precheck", "long_smart", "final_check"],
    # Two-stage combos
    "surface_short": ["precheck", "surface_validate", "short_smart", "final_check"],
    "surface_long": ["precheck", "surface_validate", "long_smart", "final_check"],
    "short_long": ["precheck", "short_smart", "long_smart", "final_check"],
    # All three
    "full": ["precheck", "surface_validate", "short_smart", "long_smart", "final_check"],
}

# Per-stage base weights used to compute overall job % progress dynamically
_STAGE_BASE_WEIGHTS: dict[str, int] = {
    "precheck": 5,
    "surface_validate": 65,
    "short_smart": 12,
    "long_smart": 13,
    "io_validate": 10,
    "final_check": 5,
}

POLL_INTERVAL = 5.0  # seconds between progress checks during active stages

# ---------------------------------------------------------------------------
# Module-level state (initialized in init())
# ---------------------------------------------------------------------------

_semaphore: asyncio.Semaphore | None = None
_client: TrueNASClient | None = None


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def _db():
    """Open a fresh WAL-mode connection. Caller must use 'async with'."""
    return aiosqlite.connect(settings.db_path)


# ---------------------------------------------------------------------------
# Init + startup reconciliation
# ---------------------------------------------------------------------------

async def init(client: TrueNASClient) -> None:
    global _semaphore, _client
    _semaphore = asyncio.Semaphore(settings.max_parallel_burnins)
    _client = client

    async with _db() as db:
        db.row_factory = aiosqlite.Row
        await db.execute("PRAGMA journal_mode=WAL")
        await db.execute("PRAGMA foreign_keys=ON")

        # Mark interrupted running jobs as unknown
        await db.execute(
            "UPDATE burnin_jobs SET state='unknown', finished_at=? WHERE state='running'",
            (_now(),),
        )

        # Re-enqueue previously queued jobs
        cur = await db.execute(
            "SELECT id FROM burnin_jobs WHERE state='queued' ORDER BY created_at"
        )
        queued = [r["id"] for r in await cur.fetchall()]
        await db.commit()

    for job_id in queued:
        asyncio.create_task(_run_job(job_id))

    log.info("Burn-in orchestrator ready (max_concurrent=%d)", settings.max_parallel_burnins)


# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------

async def start_job(drive_id: int, profile: str, operator: str,
                    stage_order: list[str] | None = None) -> int:
    """Create and enqueue a burn-in job. Returns the new job ID.

    If stage_order is provided (e.g. ["short_smart","long_smart","surface_validate"]),
    the job runs those stages in that order (precheck and final_check are always prepended/appended).
    Otherwise the preset STAGE_ORDER[profile] is used.
    """
    now = _now()

    # Build the actual stage list
    if stage_order is not None:
        stages = ["precheck"] + list(stage_order) + ["final_check"]
    else:
        stages = STAGE_ORDER[profile]

    async with _db() as db:
        db.row_factory = aiosqlite.Row
        await db.execute("PRAGMA journal_mode=WAL")
        await db.execute("PRAGMA foreign_keys=ON")

        # Reject duplicate active burn-in for same drive
        cur = await db.execute(
            "SELECT COUNT(*) FROM burnin_jobs WHERE drive_id=? AND state IN ('queued','running')",
            (drive_id,),
        )
        if (await cur.fetchone())[0] > 0:
            raise ValueError("Drive already has an active burn-in job")

        # Create job
        cur = await db.execute(
            """INSERT INTO burnin_jobs (drive_id, profile, state, percent, operator, created_at)
               VALUES (?,?,?,?,?,?) RETURNING id""",
            (drive_id, profile, "queued", 0, operator, now),
        )
        job_id = (await cur.fetchone())["id"]

        # Create stage rows in the desired execution order
        for stage_name in stages:
            await db.execute(
                "INSERT INTO burnin_stages (burnin_job_id, stage_name, state) VALUES (?,?,?)",
                (job_id, stage_name, "pending"),
            )

        await db.execute(
            """INSERT INTO audit_events (event_type, drive_id, burnin_job_id, operator, message)
               VALUES (?,?,?,?,?)""",
            ("burnin_queued", drive_id, job_id, operator, f"Queued {profile} burn-in"),
        )
        await db.commit()

    asyncio.create_task(_run_job(job_id))
    log.info("Burn-in job %d queued (drive_id=%d profile=%s operator=%s)",
             job_id, drive_id, profile, operator)
    return job_id


async def cancel_job(job_id: int, operator: str) -> bool:
    """Cancel a queued or running job. Returns True if state was changed."""
    async with _db() as db:
        db.row_factory = aiosqlite.Row
        await db.execute("PRAGMA journal_mode=WAL")

        cur = await db.execute(
            "SELECT state, drive_id FROM burnin_jobs WHERE id=?", (job_id,)
        )
        row = await cur.fetchone()
        if not row or row["state"] not in ("queued", "running"):
            return False

        await db.execute(
            "UPDATE burnin_jobs SET state='cancelled', finished_at=? WHERE id=?",
            (_now(), job_id),
        )
        await db.execute(
            "UPDATE burnin_stages SET state='cancelled' WHERE burnin_job_id=? AND state IN ('pending','running')",
            (job_id,),
        )
        await db.execute(
            """INSERT INTO audit_events (event_type, drive_id, burnin_job_id, operator, message)
               VALUES (?,?,?,?,?)""",
            ("burnin_cancelled", row["drive_id"], job_id, operator, "Cancelled by operator"),
        )
        await db.commit()

    log.info("Burn-in job %d cancelled by %s", job_id, operator)
    return True


# ---------------------------------------------------------------------------
# Job runner
# ---------------------------------------------------------------------------

async def _run_job(job_id: int) -> None:
    """Acquire semaphore slot, execute all stages, persist final state."""
    assert _semaphore is not None, "burnin.init() not called"

    async with _semaphore:
        if await _is_cancelled(job_id):
            return

        # Transition queued → running
        async with _db() as db:
            await db.execute("PRAGMA journal_mode=WAL")
            row = await (await db.execute(
                "SELECT drive_id, profile FROM burnin_jobs WHERE id=?", (job_id,)
            )).fetchone()
            if not row:
                return
            drive_id, profile = row[0], row[1]

            cur = await db.execute("SELECT devname, serial, model FROM drives WHERE id=?", (drive_id,))
            devname_row = await cur.fetchone()
            if not devname_row:
                return
            devname = devname_row[0]
            drive_serial = devname_row[1]
            drive_model = devname_row[2]

            await db.execute(
                "UPDATE burnin_jobs SET state='running', started_at=? WHERE id=?",
                (_now(), job_id),
            )
            await db.execute(
                """INSERT INTO audit_events (event_type, drive_id, burnin_job_id, operator, message)
                   VALUES (?,?,?,(SELECT operator FROM burnin_jobs WHERE id=?),?)""",
                ("burnin_started", drive_id, job_id, job_id, f"Started {profile} burn-in on {devname}"),
            )
            # Read stage order from DB (respects any custom order set at job creation)
            stage_cur = await db.execute(
                "SELECT stage_name FROM burnin_stages WHERE burnin_job_id=? ORDER BY id",
                (job_id,),
            )
            job_stages = [r[0] for r in await stage_cur.fetchall()]
            await db.commit()

        _push_update()
        log.info("Burn-in started", extra={"job_id": job_id, "devname": devname, "profile": profile})

        success = False
        error_text = None
        try:
            success = await _execute_stages(job_id, job_stages, devname, drive_id)
        except asyncio.CancelledError:
            pass
        except Exception as exc:
            error_text = str(exc)
            log.exception("Burn-in raised exception", extra={"job_id": job_id, "devname": devname})

        if await _is_cancelled(job_id):
            return

        final_state = "passed" if success else "failed"
        async with _db() as db:
            await db.execute("PRAGMA journal_mode=WAL")
            await db.execute(
                "UPDATE burnin_jobs SET state=?, percent=?, finished_at=?, error_text=? WHERE id=?",
                (final_state, 100 if success else None, _now(), error_text, job_id),
            )
            await db.execute(
                """INSERT INTO audit_events (event_type, drive_id, burnin_job_id, operator, message)
                   VALUES (?,?,?,(SELECT operator FROM burnin_jobs WHERE id=?),?)""",
                (f"burnin_{final_state}", drive_id, job_id, job_id,
                 f"Burn-in {final_state} on {devname}"),
            )
            await db.commit()

        # Build SSE alert for browser notifications
        alert = {
            "state": final_state,
            "job_id": job_id,
            "devname": devname,
            "serial": drive_serial,
            "model": drive_model,
            "error_text": error_text,
        }
        _push_update(alert=alert)
        log.info("Burn-in finished", extra={"job_id": job_id, "devname": devname, "state": final_state})

        # Fire webhook + immediate email in background (non-blocking)
        try:
            from app import notifier
            cur2 = None
            async with _db() as db2:
                db2.row_factory = aiosqlite.Row
                cur2 = await db2.execute(
                    "SELECT profile, operator FROM burnin_jobs WHERE id=?", (job_id,)
                )
                job_row = await cur2.fetchone()
                if job_row:
                    asyncio.create_task(notifier.notify_job_complete(
                        job_id=job_id,
                        devname=devname,
                        serial=drive_serial,
                        model=drive_model,
                        state=final_state,
                        profile=job_row["profile"],
                        operator=job_row["operator"],
                        error_text=error_text,
                    ))
        except Exception as exc:
            log.error("Failed to schedule notifications: %s", exc)


async def _execute_stages(job_id: int, stages: list[str], devname: str, drive_id: int) -> bool:
    for stage_name in stages:
        if await _is_cancelled(job_id):
            return False

        await _start_stage(job_id, stage_name)
        _push_update()

        try:
            ok = await _dispatch_stage(job_id, stage_name, devname, drive_id)
        except Exception as exc:
            log.error("Stage raised exception: %s", exc, extra={"job_id": job_id, "devname": devname, "stage": stage_name})
            ok = False
            await _finish_stage(job_id, stage_name, success=False, error_text=str(exc))
            _push_update()
            return False

        if not ok and await _is_cancelled(job_id):
            # Stage was aborted due to cancellation — mark it cancelled, not failed
            await _cancel_stage(job_id, stage_name)
        else:
            await _finish_stage(job_id, stage_name, success=ok)
        await _recalculate_progress(job_id)  # 'profile' is not in scope here; the param is unused anyway
|
||||
_push_update()
|
||||
|
||||
if not ok:
|
||||
return False
|
||||
|
||||
return True
|
||||
|
||||
|
||||
async def _dispatch_stage(job_id: int, stage_name: str, devname: str, drive_id: int) -> bool:
|
||||
if stage_name == "precheck":
|
||||
return await _stage_precheck(job_id, drive_id)
|
||||
elif stage_name == "short_smart":
|
||||
return await _stage_smart_test(job_id, devname, "SHORT", "short_smart")
|
||||
elif stage_name == "long_smart":
|
||||
return await _stage_smart_test(job_id, devname, "LONG", "long_smart")
|
||||
elif stage_name == "surface_validate":
|
||||
return await _stage_timed_simulate(job_id, "surface_validate", settings.surface_validate_seconds)
|
||||
elif stage_name == "io_validate":
|
||||
return await _stage_timed_simulate(job_id, "io_validate", settings.io_validate_seconds)
|
||||
elif stage_name == "final_check":
|
||||
return await _stage_final_check(job_id, devname)
|
||||
return True
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Individual stage implementations
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
async def _stage_precheck(job_id: int, drive_id: int) -> bool:
|
||||
"""Check SMART health and temperature before starting destructive work."""
|
||||
async with _db() as db:
|
||||
cur = await db.execute(
|
||||
"SELECT smart_health, temperature_c FROM drives WHERE id=?", (drive_id,)
|
||||
)
|
||||
row = await cur.fetchone()
|
||||
|
||||
if not row:
|
||||
return False
|
||||
|
||||
health, temp = row[0], row[1]
|
||||
|
||||
if health == "FAILED":
|
||||
await _set_stage_error(job_id, "precheck", "Drive SMART health is FAILED — refusing to burn in")
|
||||
return False
|
||||
|
||||
if temp and temp > 60:
|
||||
await _set_stage_error(job_id, "precheck", f"Drive temperature {temp}°C exceeds 60°C limit")
|
||||
return False
|
||||
|
||||
await asyncio.sleep(1) # Simulate brief check
|
||||
return True
|
||||
|
||||
|
||||
async def _stage_smart_test(job_id: int, devname: str, test_type: str, stage_name: str) -> bool:
|
||||
"""Start a TrueNAS SMART test and poll until complete."""
|
||||
tn_job_id = await _client.start_smart_test([devname], test_type)
|
||||
|
||||
while True:
|
||||
if await _is_cancelled(job_id):
|
||||
try:
|
||||
await _client.abort_job(tn_job_id)
|
||||
except Exception:
|
||||
pass
|
||||
return False
|
||||
|
||||
jobs = await _client.get_smart_jobs()
|
||||
job = next((j for j in jobs if j["id"] == tn_job_id), None)
|
||||
|
||||
if not job:
|
||||
return False
|
||||
|
||||
state = job["state"]
|
||||
pct = job["progress"]["percent"]
|
||||
|
||||
await _update_stage_percent(job_id, stage_name, pct)
|
||||
await _recalculate_progress(job_id, None)
|
||||
_push_update()
|
||||
|
||||
if state == "SUCCESS":
|
||||
return True
|
||||
elif state in ("FAILED", "ABORTED"):
|
||||
await _set_stage_error(job_id, stage_name,
|
||||
job.get("error") or f"SMART {test_type} test failed")
|
||||
return False
|
||||
|
||||
await asyncio.sleep(POLL_INTERVAL)
|
||||
|
||||
|
||||
async def _stage_timed_simulate(job_id: int, stage_name: str, duration_seconds: int) -> bool:
|
||||
"""Simulate a timed stage (surface validation / IO validation) with progress updates."""
|
||||
start = time.monotonic()
|
||||
|
||||
while True:
|
||||
if await _is_cancelled(job_id):
|
||||
return False
|
||||
|
||||
elapsed = time.monotonic() - start
|
||||
pct = min(100, int(elapsed / duration_seconds * 100))
|
||||
|
||||
await _update_stage_percent(job_id, stage_name, pct)
|
||||
await _recalculate_progress(job_id, None)
|
||||
_push_update()
|
||||
|
||||
if pct >= 100:
|
||||
return True
|
||||
|
||||
await asyncio.sleep(POLL_INTERVAL)
|
||||
|
||||
|
||||
async def _stage_final_check(job_id: int, devname: str) -> bool:
|
||||
"""Verify drive passed all tests by checking current SMART health in DB."""
|
||||
await asyncio.sleep(1)
|
||||
async with _db() as db:
|
||||
cur = await db.execute(
|
||||
"SELECT smart_health FROM drives WHERE devname=?", (devname,)
|
||||
)
|
||||
row = await cur.fetchone()
|
||||
|
||||
if not row or row[0] == "FAILED":
|
||||
await _set_stage_error(job_id, "final_check", "Drive SMART health is FAILED after burn-in")
|
||||
return False
|
||||
|
||||
return True
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# DB helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
async def _is_cancelled(job_id: int) -> bool:
|
||||
async with _db() as db:
|
||||
cur = await db.execute("SELECT state FROM burnin_jobs WHERE id=?", (job_id,))
|
||||
row = await cur.fetchone()
|
||||
return bool(row and row[0] == "cancelled")
|
||||
|
||||
|
||||
async def _start_stage(job_id: int, stage_name: str) -> None:
|
||||
async with _db() as db:
|
||||
await db.execute("PRAGMA journal_mode=WAL")
|
||||
await db.execute(
|
||||
"UPDATE burnin_stages SET state='running', started_at=? WHERE burnin_job_id=? AND stage_name=?",
|
||||
(_now(), job_id, stage_name),
|
||||
)
|
||||
await db.execute(
|
||||
"UPDATE burnin_jobs SET stage_name=? WHERE id=?",
|
||||
(stage_name, job_id),
|
||||
)
|
||||
await db.commit()
|
||||
|
||||
|
||||
async def _finish_stage(job_id: int, stage_name: str, success: bool, error_text: str | None = None) -> None:
|
||||
now = _now()
|
||||
state = "passed" if success else "failed"
|
||||
async with _db() as db:
|
||||
await db.execute("PRAGMA journal_mode=WAL")
|
||||
cur = await db.execute(
|
||||
"SELECT started_at FROM burnin_stages WHERE burnin_job_id=? AND stage_name=?",
|
||||
(job_id, stage_name),
|
||||
)
|
||||
row = await cur.fetchone()
|
||||
duration = None
|
||||
if row and row[0]:
|
||||
try:
|
||||
start = datetime.fromisoformat(row[0])
|
||||
if start.tzinfo is None:
|
||||
start = start.replace(tzinfo=timezone.utc)
|
||||
duration = (datetime.now(timezone.utc) - start).total_seconds()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Only overwrite error_text if one is passed; otherwise preserve what the stage already wrote
|
||||
if error_text is not None:
|
||||
await db.execute(
|
||||
"""UPDATE burnin_stages
|
||||
SET state=?, percent=?, finished_at=?, duration_seconds=?, error_text=?
|
||||
WHERE burnin_job_id=? AND stage_name=?""",
|
||||
(state, 100 if success else None, now, duration, error_text, job_id, stage_name),
|
||||
)
|
||||
else:
|
||||
await db.execute(
|
||||
"""UPDATE burnin_stages
|
||||
SET state=?, percent=?, finished_at=?, duration_seconds=?
|
||||
WHERE burnin_job_id=? AND stage_name=?""",
|
||||
(state, 100 if success else None, now, duration, job_id, stage_name),
|
||||
)
|
||||
await db.commit()
|
||||
|
||||
|
||||
async def _update_stage_percent(job_id: int, stage_name: str, pct: int) -> None:
|
||||
async with _db() as db:
|
||||
await db.execute("PRAGMA journal_mode=WAL")
|
||||
await db.execute(
|
||||
"UPDATE burnin_stages SET percent=? WHERE burnin_job_id=? AND stage_name=?",
|
||||
(pct, job_id, stage_name),
|
||||
)
|
||||
await db.commit()
|
||||
|
||||
|
||||
async def _cancel_stage(job_id: int, stage_name: str) -> None:
|
||||
now = _now()
|
||||
async with _db() as db:
|
||||
await db.execute("PRAGMA journal_mode=WAL")
|
||||
await db.execute(
|
||||
"UPDATE burnin_stages SET state='cancelled', finished_at=? WHERE burnin_job_id=? AND stage_name=?",
|
||||
(now, job_id, stage_name),
|
||||
)
|
||||
await db.commit()
|
||||
|
||||
|
||||
async def _set_stage_error(job_id: int, stage_name: str, error_text: str) -> None:
|
||||
async with _db() as db:
|
||||
await db.execute("PRAGMA journal_mode=WAL")
|
||||
await db.execute(
|
||||
"UPDATE burnin_stages SET error_text=? WHERE burnin_job_id=? AND stage_name=?",
|
||||
(error_text, job_id, stage_name),
|
||||
        )
        await db.commit()


async def _recalculate_progress(job_id: int, profile: str | None = None) -> None:
    """Recompute overall job % from actual stage rows. profile param is unused (kept for compat)."""
    async with _db() as db:
        db.row_factory = aiosqlite.Row
        await db.execute("PRAGMA journal_mode=WAL")

        cur = await db.execute(
            "SELECT stage_name, state, percent FROM burnin_stages WHERE burnin_job_id=? ORDER BY id",
            (job_id,),
        )
        stages = await cur.fetchall()
        if not stages:
            return

        total_weight = sum(_STAGE_BASE_WEIGHTS.get(s["stage_name"], 5) for s in stages)
        if total_weight == 0:
            return

        completed = 0.0
        current = None
        for s in stages:
            w = _STAGE_BASE_WEIGHTS.get(s["stage_name"], 5)
            st = s["state"]
            if st == "passed":
                completed += w
            elif st == "running":
                completed += w * (s["percent"] or 0) / 100
                current = s["stage_name"]

        pct = int(completed / total_weight * 100)
        await db.execute(
            "UPDATE burnin_jobs SET percent=?, stage_name=? WHERE id=?",
            (pct, current, job_id),
        )
        await db.commit()


# ---------------------------------------------------------------------------
# SSE push
# ---------------------------------------------------------------------------

def _push_update(alert: dict | None = None) -> None:
    """Notify SSE subscribers that data has changed, with optional browser notification payload."""
    try:
        from app import poller
        poller._notify_subscribers(alert=alert)
    except Exception:
        pass


# ---------------------------------------------------------------------------
# Stuck-job detection (called by poller every ~5 cycles)
# ---------------------------------------------------------------------------

async def check_stuck_jobs() -> None:
    """Mark jobs that have been 'running' beyond stuck_job_hours as 'unknown'."""
    threshold_seconds = settings.stuck_job_hours * 3600

    async with _db() as db:
        db.row_factory = aiosqlite.Row
        await db.execute("PRAGMA journal_mode=WAL")

        cur = await db.execute("""
            SELECT bj.id, bj.drive_id, d.devname, bj.started_at
            FROM burnin_jobs bj
            JOIN drives d ON d.id = bj.drive_id
            WHERE bj.state = 'running'
              AND bj.started_at IS NOT NULL
              AND (julianday('now') - julianday(bj.started_at)) * 86400 > ?
        """, (threshold_seconds,))
        stuck = await cur.fetchall()

        if not stuck:
            return

        now = _now()
        for row in stuck:
            job_id, drive_id, devname, started_at = row[0], row[1], row[2], row[3]
            log.critical(
                "Stuck burn-in detected — marking unknown",
                extra={"job_id": job_id, "devname": devname, "started_at": started_at},
            )
            await db.execute(
                "UPDATE burnin_jobs SET state='unknown', finished_at=? WHERE id=?",
                (now, job_id),
            )
            await db.execute(
                """INSERT INTO audit_events (event_type, drive_id, burnin_job_id, operator, message)
                   VALUES (?,?,?,?,?)""",
                ("burnin_stuck", drive_id, job_id, "system",
                 f"Job stuck for >{settings.stuck_job_hours}h — automatically marked unknown"),
            )

        await db.commit()

    _push_update()
    log.warning("Marked %d stuck job(s) as unknown", len(stuck))
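The weighted roll-up in `_recalculate_progress` can be exercised standalone. This is a minimal sketch: `WEIGHTS` and the stage names are hypothetical stand-ins for the real `_STAGE_BASE_WEIGHTS` table, and `5` is the same fallback weight the function uses.

```python
# Sketch of the weighted overall-progress computation from _recalculate_progress.
# WEIGHTS is an illustrative stand-in for _STAGE_BASE_WEIGHTS.
WEIGHTS = {"smart_short": 5, "surface_scan": 60, "io_stress": 30}


def overall_percent(stages: list[dict]) -> int:
    total = sum(WEIGHTS.get(s["stage_name"], 5) for s in stages)
    if total == 0:
        return 0
    done = 0.0
    for s in stages:
        w = WEIGHTS.get(s["stage_name"], 5)
        if s["state"] == "passed":
            done += w                                   # full weight for finished stages
        elif s["state"] == "running":
            done += w * (s.get("percent") or 0) / 100   # partial credit for the active stage
    return int(done / total * 100)


stages = [
    {"stage_name": "smart_short", "state": "passed", "percent": 100},
    {"stage_name": "surface_scan", "state": "running", "percent": 50},
    {"stage_name": "io_stress", "state": "pending", "percent": 0},
]
print(overall_percent(stages))  # (5 + 30) of 95 total → 36
```

A pending stage contributes nothing, so the overall bar only advances as stages actually pass — which is why the job row is updated from stage rows rather than incremented directly.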
55
app/config.py
Normal file
@ -0,0 +1,55 @@
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
    )

    app_host: str = "0.0.0.0"
    app_port: int = 8080
    db_path: str = "/data/app.db"

    truenas_base_url: str = "http://localhost:8000"
    truenas_api_key: str = "mock-key"
    truenas_verify_tls: bool = False

    poll_interval_seconds: int = 12
    stale_threshold_seconds: int = 45
    max_parallel_burnins: int = 2
    surface_validate_seconds: int = 45  # mock simulation duration
    io_validate_seconds: int = 25  # mock simulation duration

    # Logging
    log_level: str = "INFO"

    # Security — comma-separated IPs or CIDRs, e.g. "10.0.0.0/24,127.0.0.1"
    # Empty string means allow all (default).
    allowed_ips: str = ""

    # SMTP — daily status email at smtp_report_hour local time
    # Leave smtp_host empty to disable email.
    smtp_host: str = ""
    smtp_port: int = 587
    smtp_user: str = ""
    smtp_password: str = ""
    smtp_from: str = ""
    smtp_to: str = ""  # comma-separated recipients
    smtp_report_hour: int = 8  # local hour to send (0-23)
    smtp_daily_report_enabled: bool = True  # set False to skip daily report without disabling alerts
    smtp_alert_on_fail: bool = True  # immediate email when a job fails
    smtp_alert_on_pass: bool = False  # immediate email when a job passes
    smtp_ssl_mode: str = "starttls"  # "starttls" | "ssl" | "plain"
    smtp_timeout: int = 60  # connection + read timeout in seconds

    # Webhook — POST JSON payload on every job state change (pass/fail)
    # Leave empty to disable. Works with Slack, Discord, ntfy, n8n, etc.
    webhook_url: str = ""

    # Stuck-job detection: jobs running longer than this are marked 'unknown'
    stuck_job_hours: int = 24


settings = Settings()
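For fields like `app_port`, pydantic-settings resolves values from the process environment first, then the `.env` file, then the class default. A rough stdlib-only stand-in for that precedence (the `resolve` helper and `DEFAULTS` dict are illustrative, not part of the app):

```python
import os

# Illustrative defaults mirroring two fields of the Settings class above.
DEFAULTS = {"app_port": 8080, "smtp_report_hour": 8}


def resolve(field: str, env_file: dict[str, str]) -> int:
    # Precedence sketch: real environment > .env file > class default.
    # Keys are upper-cased to mimic case-insensitive matching.
    for source in (dict(os.environ), {k.upper(): v for k, v in env_file.items()}):
        val = source.get(field.upper())
        if val is not None:
            return int(val)
    return DEFAULTS[field]


# Make sure the process env doesn't interfere with the demo.
os.environ.pop("APP_PORT", None)
os.environ.pop("SMTP_REPORT_HOUR", None)

env_file = {"APP_PORT": "8084"}  # mirrors .env.example
print(resolve("app_port", env_file))          # 8084 — .env wins over the 8080 default
print(resolve("smtp_report_hour", env_file))  # 8 — falls back to the class default
```

This is why `.env.example` can set `APP_PORT=8084` while the class default stays 8080: whichever source defines the variable first in that order wins.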
143
app/database.py
Normal file
@ -0,0 +1,143 @@
import aiosqlite
from pathlib import Path

from app.config import settings

SCHEMA = """
CREATE TABLE IF NOT EXISTS drives (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    truenas_disk_id TEXT UNIQUE NOT NULL,
    devname TEXT NOT NULL,
    serial TEXT,
    model TEXT,
    size_bytes INTEGER,
    temperature_c INTEGER,
    smart_health TEXT DEFAULT 'UNKNOWN',
    last_seen_at TEXT NOT NULL,
    last_polled_at TEXT NOT NULL,
    notes TEXT,
    location TEXT
);

CREATE TABLE IF NOT EXISTS smart_tests (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    drive_id INTEGER NOT NULL REFERENCES drives(id) ON DELETE CASCADE,
    test_type TEXT NOT NULL CHECK(test_type IN ('short', 'long')),
    state TEXT NOT NULL DEFAULT 'idle',
    percent INTEGER DEFAULT 0,
    truenas_job_id INTEGER,
    started_at TEXT,
    eta_at TEXT,
    finished_at TEXT,
    error_text TEXT,
    UNIQUE(drive_id, test_type)
);

CREATE TABLE IF NOT EXISTS burnin_jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    drive_id INTEGER NOT NULL REFERENCES drives(id),
    profile TEXT NOT NULL,
    state TEXT NOT NULL DEFAULT 'queued',
    percent INTEGER DEFAULT 0,
    stage_name TEXT,
    operator TEXT NOT NULL,
    created_at TEXT NOT NULL,
    started_at TEXT,
    finished_at TEXT,
    error_text TEXT
);

CREATE TABLE IF NOT EXISTS burnin_stages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    burnin_job_id INTEGER NOT NULL REFERENCES burnin_jobs(id) ON DELETE CASCADE,
    stage_name TEXT NOT NULL,
    state TEXT NOT NULL DEFAULT 'pending',
    percent INTEGER DEFAULT 0,
    started_at TEXT,
    finished_at TEXT,
    duration_seconds REAL,
    error_text TEXT
);

CREATE TABLE IF NOT EXISTS audit_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    event_type TEXT NOT NULL,
    drive_id INTEGER REFERENCES drives(id),
    burnin_job_id INTEGER REFERENCES burnin_jobs(id),
    operator TEXT,
    message TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);

CREATE INDEX IF NOT EXISTS idx_smart_drive_type ON smart_tests(drive_id, test_type);
CREATE INDEX IF NOT EXISTS idx_burnin_jobs_drive ON burnin_jobs(drive_id, state);
CREATE INDEX IF NOT EXISTS idx_burnin_stages_job ON burnin_stages(burnin_job_id);
CREATE INDEX IF NOT EXISTS idx_audit_events_job ON audit_events(burnin_job_id);
"""


# Migrations for existing databases that predate schema additions.
# Each entry is tried with try/except — SQLite raises OperationalError
# ("duplicate column name") if the column already exists, which is safe to ignore.
_MIGRATIONS = [
    "ALTER TABLE drives ADD COLUMN notes TEXT",
    "ALTER TABLE drives ADD COLUMN location TEXT",
]


async def _run_migrations(db: aiosqlite.Connection) -> None:
    for sql in _MIGRATIONS:
        try:
            await db.execute(sql)
        except Exception:
            pass  # Column already exists — harmless

    # Remove the old CHECK(profile IN ('quick','full')) constraint if present.
    # SQLite can't ALTER a CHECK — requires a full table rebuild.
    cur = await db.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND name='burnin_jobs'"
    )
    row = await cur.fetchone()
    if row and "CHECK" in (row[0] or ""):
        await db.executescript("""
            PRAGMA foreign_keys=OFF;
            CREATE TABLE burnin_jobs_new (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                drive_id INTEGER NOT NULL REFERENCES drives(id),
                profile TEXT NOT NULL,
                state TEXT NOT NULL DEFAULT 'queued',
                percent INTEGER DEFAULT 0,
                stage_name TEXT,
                operator TEXT NOT NULL,
                created_at TEXT NOT NULL,
                started_at TEXT,
                finished_at TEXT,
                error_text TEXT
            );
            INSERT INTO burnin_jobs_new SELECT * FROM burnin_jobs;
            DROP TABLE burnin_jobs;
            ALTER TABLE burnin_jobs_new RENAME TO burnin_jobs;
            CREATE INDEX IF NOT EXISTS idx_burnin_jobs_drive ON burnin_jobs(drive_id, state);
            PRAGMA foreign_keys=ON;
        """)


async def init_db() -> None:
    Path(settings.db_path).parent.mkdir(parents=True, exist_ok=True)
    async with aiosqlite.connect(settings.db_path) as db:
        await db.execute("PRAGMA journal_mode=WAL")
        await db.execute("PRAGMA foreign_keys=ON")
        await db.executescript(SCHEMA)
        await _run_migrations(db)
        await db.commit()


async def get_db():
    db = await aiosqlite.connect(settings.db_path)
    db.row_factory = aiosqlite.Row
    try:
        await db.execute("PRAGMA journal_mode=WAL")
        await db.execute("PRAGMA foreign_keys=ON")
        yield db
    finally:
        await db.close()
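The try/except migration pattern above can be demonstrated synchronously with the stdlib `sqlite3` module — running the migrations twice proves they are idempotent (the toy `drives` table here is a reduced stand-in for the real schema):

```python
import sqlite3

MIGRATIONS = [
    "ALTER TABLE drives ADD COLUMN notes TEXT",
    "ALTER TABLE drives ADD COLUMN location TEXT",
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE drives (id INTEGER PRIMARY KEY, devname TEXT)")

for _ in range(2):  # second pass hits the "duplicate column name" path
    for sql in MIGRATIONS:
        try:
            db.execute(sql)
        except sqlite3.OperationalError:
            pass  # column already exists — safe to ignore

cols = [r[1] for r in db.execute("PRAGMA table_info(drives)")]
print(cols)  # ['id', 'devname', 'notes', 'location']
```

SQLite has no `ALTER TABLE ... ADD COLUMN IF NOT EXISTS`, so catching `OperationalError` is the conventional way to keep these migrations re-runnable on every startup.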
50
app/logging_config.py
Normal file
@ -0,0 +1,50 @@
"""
|
||||
Structured JSON logging configuration.
|
||||
|
||||
Each log line is a single JSON object:
|
||||
{"ts":"2026-02-21T21:34:36","level":"INFO","logger":"app.burnin","msg":"...","job_id":1}
|
||||
|
||||
Extra context fields (job_id, drive_id, devname, stage) are included when
|
||||
passed via the logging `extra=` kwarg.
|
||||
"""
|
||||
|
||||
import json
|
||||
import logging
|
||||
import traceback
|
||||
|
||||
from app.config import settings
|
||||
|
||||
# Standard LogRecord attributes to exclude from the "extra" dump
|
||||
_STDLIB_ATTRS = frozenset(logging.LogRecord("", 0, "", 0, "", (), None).__dict__)
|
||||
|
||||
|
||||
class _JsonFormatter(logging.Formatter):
|
||||
def format(self, record: logging.LogRecord) -> str:
|
||||
data: dict = {
|
||||
"ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
|
||||
"level": record.levelname,
|
||||
"logger": record.name,
|
||||
"msg": record.getMessage(),
|
||||
}
|
||||
# Include any non-standard fields passed via extra={}
|
||||
for key, val in record.__dict__.items():
|
||||
if key not in _STDLIB_ATTRS and not key.startswith("_"):
|
||||
data[key] = val
|
||||
if record.exc_info:
|
||||
data["exc"] = "".join(traceback.format_exception(*record.exc_info)).strip()
|
||||
return json.dumps(data)
|
||||
|
||||
|
||||
def configure() -> None:
|
||||
handler = logging.StreamHandler()
|
||||
handler.setFormatter(_JsonFormatter())
|
||||
|
||||
level = getattr(logging, settings.log_level.upper(), logging.INFO)
|
||||
root = logging.getLogger()
|
||||
root.setLevel(level)
|
||||
root.handlers = [handler]
|
||||
|
||||
# Quiet chatty third-party loggers
|
||||
logging.getLogger("httpx").setLevel(logging.WARNING)
|
||||
logging.getLogger("httpcore").setLevel(logging.WARNING)
|
||||
logging.getLogger("uvicorn.access").setLevel(logging.WARNING)
|
||||
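The `extra=`-to-JSON mechanism relies on the fact that anything passed via `extra` becomes an attribute on the `LogRecord`, so diffing against a blank record's `__dict__` isolates the custom fields. A self-contained sketch of that technique (a trimmed re-implementation, not the module above imported):

```python
import io
import json
import logging

# Attributes every LogRecord has by default — anything else came from extra={}.
_STDLIB_ATTRS = frozenset(logging.LogRecord("", 0, "", 0, "", (), None).__dict__)


class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        data = {"level": record.levelname, "msg": record.getMessage()}
        for key, val in record.__dict__.items():
            if key not in _STDLIB_ATTRS and not key.startswith("_"):
                data[key] = val  # custom context field from extra={}
        return json.dumps(data)


buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(JsonFormatter())
demo_log = logging.getLogger("demo")
demo_log.addHandler(handler)
demo_log.setLevel(logging.INFO)

demo_log.info("stage done", extra={"job_id": 7, "devname": "sda"})

line = json.loads(buf.getvalue())
print(line["job_id"], line["devname"])  # 7 sda
```

Because the baseline set is computed from a blank record on the same interpreter, new stdlib attributes (such as `taskName` on newer Pythons) are excluded automatically.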
453
app/mailer.py
Normal file
@ -0,0 +1,453 @@
"""
|
||||
Daily status email — sent at smtp_report_hour (local time) every day.
|
||||
|
||||
Disabled when SMTP_HOST is not set.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
import smtplib
|
||||
import ssl
|
||||
from datetime import datetime, timedelta, timezone
|
||||
from email.mime.multipart import MIMEMultipart
|
||||
from email.mime.text import MIMEText
|
||||
|
||||
import aiosqlite
|
||||
|
||||
from app.config import settings
|
||||
|
||||
log = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# HTML email template
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _chip(state: str) -> str:
|
||||
colours = {
|
||||
"PASSED": ("#1a4731", "#3fb950", "#3fb950"),
|
||||
"passed": ("#1a4731", "#3fb950", "#3fb950"),
|
||||
"FAILED": ("#4b1113", "#f85149", "#f85149"),
|
||||
"failed": ("#4b1113", "#f85149", "#f85149"),
|
||||
"running": ("#0d2d6b", "#58a6ff", "#58a6ff"),
|
||||
"queued": ("#4b3800", "#d29922", "#d29922"),
|
||||
"cancelled": ("#222", "#8b949e", "#8b949e"),
|
||||
"unknown": ("#222", "#8b949e", "#8b949e"),
|
||||
"idle": ("#222", "#8b949e", "#8b949e"),
|
||||
"UNKNOWN": ("#222", "#8b949e", "#8b949e"),
|
||||
}
|
||||
bg, fg, bd = colours.get(state, ("#222", "#8b949e", "#8b949e"))
|
||||
label = state.upper()
|
||||
return (
|
||||
f'<span style="background:{bg};color:{fg};border:1px solid {bd};'
|
||||
f'border-radius:4px;padding:2px 8px;font-size:11px;font-weight:600;'
|
||||
f'letter-spacing:.04em;white-space:nowrap">{label}</span>'
|
||||
)
|
||||
|
||||
|
||||
def _temp_colour(c) -> str:
|
||||
if c is None:
|
||||
return "#8b949e"
|
||||
if c < 40:
|
||||
return "#3fb950"
|
||||
if c < 50:
|
||||
return "#d29922"
|
||||
return "#f85149"
|
||||
|
||||
|
||||
def _fmt_bytes(b) -> str:
|
||||
if b is None:
|
||||
return "—"
|
||||
tb = b / 1_000_000_000_000
|
||||
if tb >= 1:
|
||||
return f"{tb:.0f} TB"
|
||||
return f"{b / 1_000_000_000:.0f} GB"
|
||||
|
||||
|
||||
def _fmt_dt(iso: str | None) -> str:
|
||||
if not iso:
|
||||
return "—"
|
||||
try:
|
||||
dt = datetime.fromisoformat(iso)
|
||||
if dt.tzinfo is None:
|
||||
dt = dt.replace(tzinfo=timezone.utc)
|
||||
return dt.astimezone().strftime("%Y-%m-%d %H:%M")
|
||||
except Exception:
|
||||
return iso or "—"
|
||||
|
||||
|
||||
def _drive_rows_html(drives: list[dict]) -> str:
|
||||
if not drives:
|
||||
return '<tr><td colspan="8" style="text-align:center;color:#8b949e;padding:24px">No drives found</td></tr>'
|
||||
|
||||
rows = []
|
||||
for d in drives:
|
||||
health = d.get("smart_health") or "UNKNOWN"
|
||||
temp = d.get("temperature_c")
|
||||
bi = d.get("burnin") or {}
|
||||
bi_state = bi.get("state", "—") if bi else "—"
|
||||
|
||||
short = d.get("smart_short") or {}
|
||||
long_ = d.get("smart_long") or {}
|
||||
short_state = short.get("state", "idle")
|
||||
long_state = long_.get("state", "idle")
|
||||
|
||||
row_bg = "#1c0a0a" if health == "FAILED" else "#0d1117"
|
||||
|
||||
rows.append(f"""
|
||||
<tr style="background:{row_bg};border-bottom:1px solid #30363d">
|
||||
<td style="padding:9px 12px;font-weight:600;color:#c9d1d9">{d.get('devname','—')}</td>
|
||||
<td style="padding:9px 12px;color:#8b949e;font-size:12px">{d.get('model','—')}</td>
|
||||
<td style="padding:9px 12px;font-family:monospace;font-size:12px;color:#8b949e">{d.get('serial','—')}</td>
|
||||
<td style="padding:9px 12px;text-align:right;color:#8b949e">{_fmt_bytes(d.get('size_bytes'))}</td>
|
||||
<td style="padding:9px 12px;text-align:right;color:{_temp_colour(temp)};font-weight:500">{f'{temp}°C' if temp is not None else '—'}</td>
|
||||
<td style="padding:9px 12px">{_chip(health)}</td>
|
||||
<td style="padding:9px 12px">{_chip(short_state)}</td>
|
||||
<td style="padding:9px 12px">{_chip(long_state)}</td>
|
||||
<td style="padding:9px 12px">{_chip(bi_state) if bi else '—'}</td>
|
||||
</tr>""")
|
||||
return "\n".join(rows)


def _build_html(drives: list[dict], generated_at: str) -> str:
    total = len(drives)
    failed_drives = [d for d in drives if d.get("smart_health") == "FAILED"]
    running_burnin = [d for d in drives if (d.get("burnin") or {}).get("state") == "running"]
    passed_burnin = [d for d in drives if (d.get("burnin") or {}).get("state") == "passed"]

    # Alert banner
    alert_html = ""
    if failed_drives:
        names = ", ".join(d["devname"] for d in failed_drives)
        alert_html = f"""
        <div style="background:#4b1113;border:1px solid #f85149;border-radius:6px;padding:14px 18px;margin-bottom:20px;color:#f85149;font-weight:500">
          ⚠ SMART health FAILED on {len(failed_drives)} drive(s): {names}
        </div>"""

    drive_rows = _drive_rows_html(drives)

    return f"""<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width,initial-scale=1">
  <title>TrueNAS Burn-In — Daily Report</title>
</head>
<body style="margin:0;padding:0;background:#0d1117;font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',system-ui,sans-serif;font-size:14px;color:#c9d1d9">
<table width="100%" cellpadding="0" cellspacing="0" style="background:#0d1117;min-height:100vh">
  <tr><td align="center" style="padding:32px 16px">

  <table width="700" cellpadding="0" cellspacing="0" style="max-width:700px;width:100%">

    <!-- Header -->
    <tr>
      <td style="background:#161b22;border:1px solid #30363d;border-radius:10px 10px 0 0;padding:20px 24px;border-bottom:none">
        <table width="100%" cellpadding="0" cellspacing="0">
          <tr>
            <td><span style="font-size:18px;font-weight:700;color:#f0f6fc">TrueNAS Burn-In</span>
                <span style="color:#8b949e;font-size:13px;margin-left:10px">Daily Status Report</span></td>
            <td align="right" style="color:#8b949e;font-size:12px">{generated_at}</td>
          </tr>
        </table>
      </td>
    </tr>

    <!-- Body -->
    <tr>
      <td style="background:#0d1117;border:1px solid #30363d;border-top:none;border-bottom:none;padding:24px">

        {alert_html}

        <!-- Summary chips -->
        <table cellpadding="0" cellspacing="0" style="margin-bottom:24px">
          <tr>
            <td style="padding-right:10px">
              <div style="background:#161b22;border:1px solid #30363d;border-radius:8px;padding:12px 18px;text-align:center;min-width:80px">
                <div style="font-size:24px;font-weight:700;color:#f0f6fc">{total}</div>
                <div style="font-size:11px;color:#8b949e;text-transform:uppercase;letter-spacing:.06em;margin-top:2px">Drives</div>
              </div>
            </td>
            <td style="padding-right:10px">
              <div style="background:#161b22;border:1px solid #30363d;border-radius:8px;padding:12px 18px;text-align:center;min-width:80px">
                <div style="font-size:24px;font-weight:700;color:#f85149">{len(failed_drives)}</div>
                <div style="font-size:11px;color:#8b949e;text-transform:uppercase;letter-spacing:.06em;margin-top:2px">Failed</div>
              </div>
            </td>
            <td style="padding-right:10px">
              <div style="background:#161b22;border:1px solid #30363d;border-radius:8px;padding:12px 18px;text-align:center;min-width:80px">
                <div style="font-size:24px;font-weight:700;color:#58a6ff">{len(running_burnin)}</div>
                <div style="font-size:11px;color:#8b949e;text-transform:uppercase;letter-spacing:.06em;margin-top:2px">Running</div>
              </div>
            </td>
            <td>
              <div style="background:#161b22;border:1px solid #30363d;border-radius:8px;padding:12px 18px;text-align:center;min-width:80px">
                <div style="font-size:24px;font-weight:700;color:#3fb950">{len(passed_burnin)}</div>
                <div style="font-size:11px;color:#8b949e;text-transform:uppercase;letter-spacing:.06em;margin-top:2px">Passed</div>
              </div>
            </td>
          </tr>
        </table>

        <!-- Drive table -->
        <table width="100%" cellpadding="0" cellspacing="0" style="border:1px solid #30363d;border-radius:8px;overflow:hidden">
          <thead>
            <tr style="background:#161b22">
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:left;border-bottom:1px solid #30363d">Drive</th>
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:left;border-bottom:1px solid #30363d">Model</th>
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:left;border-bottom:1px solid #30363d">Serial</th>
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:right;border-bottom:1px solid #30363d">Size</th>
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:right;border-bottom:1px solid #30363d">Temp</th>
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:left;border-bottom:1px solid #30363d">Health</th>
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:left;border-bottom:1px solid #30363d">Short</th>
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:left;border-bottom:1px solid #30363d">Long</th>
              <th style="padding:9px 12px;font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:.06em;color:#8b949e;text-align:left;border-bottom:1px solid #30363d">Burn-In</th>
            </tr>
          </thead>
          <tbody>
            {drive_rows}
          </tbody>
        </table>

      </td>
    </tr>

    <!-- Footer -->
    <tr>
      <td style="background:#161b22;border:1px solid #30363d;border-top:none;border-radius:0 0 10px 10px;padding:14px 24px;text-align:center">
        <span style="font-size:12px;color:#8b949e">Generated by TrueNAS Burn-In Dashboard · {generated_at}</span>
      </td>
    </tr>

  </table>
  </td></tr>
</table>
</body>
</html>"""

# ---------------------------------------------------------------------------
# Send
# ---------------------------------------------------------------------------

# Standard ports for each SSL mode — used when smtp_port is not overridden
_MODE_PORTS: dict[str, int] = {"starttls": 587, "ssl": 465, "plain": 25}


def _smtp_port() -> int:
    """Derive port from ssl_mode; fall back to settings.smtp_port if explicitly set."""
    mode = (settings.smtp_ssl_mode or "starttls").lower()
    # An explicitly configured SMTP_PORT (anything other than the 587 default)
    # takes precedence over the mode-derived standard port.
    if settings.smtp_port and settings.smtp_port != 587:
        return settings.smtp_port
    return _MODE_PORTS.get(mode, 587)


def _send_email(subject: str, html: str) -> None:
    recipients = [r.strip() for r in settings.smtp_to.split(",") if r.strip()]
    if not recipients:
        log.warning("SMTP_TO is empty — skipping send")
        return

    msg = MIMEMultipart("alternative")
    msg["Subject"] = subject
    msg["From"] = settings.smtp_from or settings.smtp_user
    msg["To"] = ", ".join(recipients)
    msg.attach(MIMEText(html, "html", "utf-8"))

    ctx = ssl.create_default_context()
    mode = (settings.smtp_ssl_mode or "starttls").lower()
    timeout = int(settings.smtp_timeout or 60)
    port = _smtp_port()

    if mode == "ssl":
        with smtplib.SMTP_SSL(settings.smtp_host, port, context=ctx, timeout=timeout) as server:
            server.ehlo()
            server.login(settings.smtp_user, settings.smtp_password)
            server.sendmail(msg["From"], recipients, msg.as_string())
    else:
        with smtplib.SMTP(settings.smtp_host, port, timeout=timeout) as server:
            server.ehlo()
            if mode == "starttls":
                server.starttls(context=ctx)
                server.ehlo()
            server.login(settings.smtp_user, settings.smtp_password)
            server.sendmail(msg["From"], recipients, msg.as_string())

    log.info("Email sent to %s", recipients)


# ---------------------------------------------------------------------------
# Data fetch
# ---------------------------------------------------------------------------

async def _fetch_report_data() -> list[dict]:
    """Pull drives + latest burnin state from DB."""
    from app.routes import _fetch_drives_for_template  # local import avoids circular
    async with aiosqlite.connect(settings.db_path) as db:
        db.row_factory = aiosqlite.Row
        await db.execute("PRAGMA journal_mode=WAL")
        return await _fetch_drives_for_template(db)


# ---------------------------------------------------------------------------
# Scheduler
# ---------------------------------------------------------------------------

def _build_alert_html(
    job_id: int,
    devname: str,
    serial: str | None,
    model: str | None,
    state: str,
    error_text: str | None,
    generated_at: str,
) -> str:
    is_fail = state == "failed"
    color = "#f85149" if is_fail else "#3fb950"
    bg = "#4b1113" if is_fail else "#1a4731"
    icon = "✕" if is_fail else "✓"

    error_section = ""
    if error_text:
        error_section = f"""
        <div style="background:#4b1113;border:1px solid #f85149;border-radius:6px;
                    padding:12px 16px;margin-top:16px;color:#f85149;font-size:13px">
          <strong>Error:</strong> {error_text}
        </div>"""

    return f"""<!DOCTYPE html>
<html lang="en">
<head><meta charset="UTF-8"><title>Burn-In {state.title()} Alert</title></head>
<body style="margin:0;padding:0;background:#0d1117;font-family:-apple-system,sans-serif;
             font-size:14px;color:#c9d1d9">
<table width="100%" cellpadding="0" cellspacing="0">
  <tr><td align="center" style="padding:32px 16px">
    <table width="480" cellpadding="0" cellspacing="0" style="max-width:480px;width:100%">
      <tr>
        <td style="background:{bg};border:2px solid {color};border-radius:10px;padding:24px">
          <div style="font-size:26px;font-weight:700;color:{color};margin-bottom:16px">
            {icon} Burn-In {state.upper()}
          </div>
          <table cellpadding="0" cellspacing="0" style="width:100%">
            <tr>
              <td style="color:#8b949e;font-size:12px;padding:5px 0">Device</td>
              <td style="font-weight:600;text-align:right;font-size:15px">{devname}</td>
            </tr>
            <tr>
              <td style="color:#8b949e;font-size:12px;padding:5px 0">Model</td>
              <td style="text-align:right">{model or '—'}</td>
            </tr>
            <tr>
              <td style="color:#8b949e;font-size:12px;padding:5px 0">Serial</td>
              <td style="font-family:monospace;text-align:right">{serial or '—'}</td>
            </tr>
            <tr>
              <td style="color:#8b949e;font-size:12px;padding:5px 0">Job #</td>
              <td style="font-family:monospace;text-align:right">{job_id}</td>
            </tr>
          </table>
          {error_section}
          <div style="margin-top:16px;font-size:11px;color:#8b949e">{generated_at}</div>
        </td>
      </tr>
    </table>
  </td></tr>
</table>
</body>
</html>"""

async def send_job_alert(
    job_id: int,
    devname: str,
    serial: str | None,
    model: str | None,
    state: str,
    error_text: str | None,
) -> None:
    """Send an immediate per-job alert email (pass or fail)."""
    icon = "✕" if state == "failed" else "✓"
    subject = f"{icon} Burn-In {state.upper()}: {devname} ({serial or 'no serial'})"
    now_str = datetime.now().strftime("%Y-%m-%d %H:%M")
    html = _build_alert_html(job_id, devname, serial, model, state, error_text, now_str)
    await asyncio.to_thread(_send_email, subject, html)


async def test_smtp_connection() -> dict:
    """
    Try to establish an SMTP connection using current settings.
    Returns {"ok": True/False, "error": str|None}.
    Does NOT send any email.
    """
    if not settings.smtp_host:
        return {"ok": False, "error": "SMTP_HOST is not configured"}

    def _test() -> dict:
        try:
            ctx = ssl.create_default_context()
            mode = (settings.smtp_ssl_mode or "starttls").lower()
            timeout = int(settings.smtp_timeout or 60)
            port = _smtp_port()

            if mode == "ssl":
                server = smtplib.SMTP_SSL(settings.smtp_host, port,
                                          context=ctx, timeout=timeout)
                server.ehlo()
            else:
                server = smtplib.SMTP(settings.smtp_host, port, timeout=timeout)
                server.ehlo()
                if mode == "starttls":
                    server.starttls(context=ctx)
                    server.ehlo()

            if settings.smtp_user:
                server.login(settings.smtp_user, settings.smtp_password)
            server.quit()
            return {"ok": True, "error": None}
        except Exception as exc:
            return {"ok": False, "error": str(exc)}

    return await asyncio.to_thread(_test)


async def send_report_now() -> None:
    """Send a report immediately (used by on-demand API endpoint)."""
    drives = await _fetch_report_data()
    now_str = datetime.now().strftime("%Y-%m-%d %H:%M")
    html = _build_html(drives, now_str)
    subject = f"Burn-In Report — {datetime.now().strftime('%Y-%m-%d')} ({len(drives)} drives)"
    await asyncio.to_thread(_send_email, subject, html)


async def run() -> None:
    """Background loop: send daily report at smtp_report_hour local time."""
    if not settings.smtp_host:
        log.info("SMTP not configured — daily email disabled")
        return

    log.info(
        "Mailer started — daily report at %02d:00 local time",
        settings.smtp_report_hour,
    )

    while True:
        now = datetime.now()
        target = now.replace(
            hour=settings.smtp_report_hour,
            minute=0, second=0, microsecond=0,
        )
        if target <= now:
            target += timedelta(days=1)

        wait = (target - now).total_seconds()
        log.info("Next report in %.0f seconds (%s)", wait, target.strftime("%Y-%m-%d %H:%M"))
        await asyncio.sleep(wait)

        if settings.smtp_daily_report_enabled:
            try:
                await send_report_now()
            except Exception as exc:
                log.error("Failed to send daily report: %s", exc)
        else:
            log.info("Daily report skipped — smtp_daily_report_enabled is False")

        # Sleep briefly past the hour to avoid drift from re-triggering immediately
        await asyncio.sleep(60)
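The next-fire-time computation in `run()` is easy to verify in isolation — a minimal sketch of the same replace-then-roll-forward logic, with fixed timestamps for determinism:

```python
from datetime import datetime, timedelta


def next_run(now: datetime, report_hour: int) -> datetime:
    # Same logic as run(): today at report_hour, or tomorrow if that time has passed.
    target = now.replace(hour=report_hour, minute=0, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)
    return target


now = datetime(2026, 2, 21, 9, 30)
print(next_run(now, 8))   # 2026-02-22 08:00:00 — 08:00 already passed today
print(next_run(now, 18))  # 2026-02-21 18:00:00 — still ahead today
```

Using `<=` rather than `<` means a wake-up exactly on the hour schedules the next day's run, which together with the trailing 60-second sleep prevents the loop from firing twice for the same slot.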
123
app/main.py
Normal file
@ -0,0 +1,123 @@
import asyncio
import ipaddress
import logging
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import PlainTextResponse

from app import burnin, mailer, poller, settings_store
from app.config import settings
from app.database import init_db
from app.logging_config import configure as configure_logging
from app.renderer import templates  # noqa: F401 — registers filters as side-effect
from app.routes import router
from app.truenas import TrueNASClient

# Configure structured JSON logging before anything else logs
configure_logging()
log = logging.getLogger(__name__)


# ---------------------------------------------------------------------------
# IP allowlist middleware
# ---------------------------------------------------------------------------

class _IPAllowlistMiddleware(BaseHTTPMiddleware):
    """
    Block requests from IPs not in ALLOWED_IPS.

    When ALLOWED_IPS is empty the middleware is a no-op.
    Checks X-Forwarded-For first (trusts the leftmost address), then the
    direct client IP.
    """

    def __init__(self, app, allowed_ips: str) -> None:
        super().__init__(app)
        self._networks: list[ipaddress.IPv4Network | ipaddress.IPv6Network] = []
        for entry in (s.strip() for s in allowed_ips.split(",") if s.strip()):
            try:
                self._networks.append(ipaddress.ip_network(entry, strict=False))
            except ValueError:
                log.warning("Invalid ALLOWED_IPS entry ignored: %r", entry)

    def _is_allowed(self, ip_str: str) -> bool:
        try:
            addr = ipaddress.ip_address(ip_str)
            return any(addr in net for net in self._networks)
        except ValueError:
            return False

    async def dispatch(self, request: Request, call_next):
        if not self._networks:
            return await call_next(request)

        # Prefer X-Forwarded-For (leftmost = original client)
        forwarded = request.headers.get("X-Forwarded-For", "").split(",")[0].strip()
        client_ip = forwarded or (request.client.host if request.client else "")

        if self._is_allowed(client_ip):
            return await call_next(request)

        log.warning("Request blocked by IP allowlist", extra={"client_ip": client_ip})
        return PlainTextResponse("Forbidden", status_code=403)


# ---------------------------------------------------------------------------
|
||||
# Poller supervisor — restarts run() if it ever exits unexpectedly
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
async def _supervised_poller(client: TrueNASClient) -> None:
|
||||
while True:
|
||||
try:
|
||||
await poller.run(client)
|
||||
except asyncio.CancelledError:
|
||||
raise # Propagate shutdown signal cleanly
|
||||
except Exception as exc:
|
||||
log.critical("Poller crashed unexpectedly — restarting in 5s: %s", exc)
|
||||
await asyncio.sleep(5)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Lifespan
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_client: TrueNASClient | None = None
|
||||
|
||||
|
||||
@asynccontextmanager
|
||||
async def lifespan(app: FastAPI):
|
||||
global _client
|
||||
log.info("Starting up")
|
||||
await init_db()
|
||||
settings_store.init()
|
||||
_client = TrueNASClient()
|
||||
await burnin.init(_client)
|
||||
poll_task = asyncio.create_task(_supervised_poller(_client))
|
||||
mailer_task = asyncio.create_task(mailer.run())
|
||||
yield
|
||||
log.info("Shutting down")
|
||||
poll_task.cancel()
|
||||
mailer_task.cancel()
|
||||
try:
|
||||
await asyncio.gather(poll_task, mailer_task, return_exceptions=True)
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
await _client.close()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# App
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
app = FastAPI(title="TrueNAS Burn-In Dashboard", lifespan=lifespan)
|
||||
|
||||
if settings.allowed_ips:
|
||||
app.add_middleware(_IPAllowlistMiddleware, allowed_ips=settings.allowed_ips)
|
||||
log.info("IP allowlist active: %s", settings.allowed_ips)
|
||||
|
||||
app.mount("/static", StaticFiles(directory="app/static"), name="static")
|
||||
app.include_router(router)
|
||||
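The allowlist check in `_IPAllowlistMiddleware` reduces to stdlib `ipaddress` membership tests. A minimal standalone sketch of the same logic (function names here are illustrative, not part of the app):

```python
import ipaddress

def parse_allowlist(raw: str) -> list:
    """Parse a comma-separated list of IPs/CIDRs, skipping invalid entries."""
    networks = []
    for entry in (s.strip() for s in raw.split(",") if s.strip()):
        try:
            # strict=False lets "10.0.0.1/24" mean "the /24 containing 10.0.0.1";
            # a bare IP becomes a /32 (or /128) network.
            networks.append(ipaddress.ip_network(entry, strict=False))
        except ValueError:
            pass  # invalid entries are ignored, as in the middleware
    return networks

def is_allowed(networks, ip_str: str) -> bool:
    try:
        return any(ipaddress.ip_address(ip_str) in net for net in networks)
    except ValueError:
        return False  # unparseable client IP -> blocked

nets = parse_allowlist("10.0.0.0/24, 127.0.0.1, bogus")
print(is_allowed(nets, "10.0.0.42"))    # True
print(is_allowed(nets, "192.168.1.5"))  # False
```

Note that an empty allowlist here blocks everything, whereas the middleware treats it as "allow all" by short-circuiting in `dispatch` before this check runs.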
104
app/models.py
Normal file
@@ -0,0 +1,104 @@
from pydantic import BaseModel, field_validator, model_validator


class SmartTestState(BaseModel):
    state: str = "idle"
    percent: int | None = None
    eta_seconds: int | None = None
    eta_timestamp: str | None = None
    started_at: str | None = None
    finished_at: str | None = None
    error_text: str | None = None


_VALID_STAGE_NAMES = frozenset({"surface_validate", "short_smart", "long_smart"})


class StartBurninRequest(BaseModel):
    drive_ids: list[int]
    operator: str
    run_surface: bool = True
    run_short: bool = True
    run_long: bool = True
    # Custom execution order, e.g. ["short_smart", "long_smart", "surface_validate"]
    stage_order: list[str] | None = None

    @field_validator("operator")
    @classmethod
    def validate_operator(cls, v: str) -> str:
        v = v.strip()
        if not v:
            raise ValueError("operator must not be empty")
        return v

    @model_validator(mode="after")
    def validate_stages(self) -> "StartBurninRequest":
        if not (self.run_surface or self.run_short or self.run_long):
            raise ValueError("At least one stage must be selected")
        if self.stage_order is not None:
            for s in self.stage_order:
                if s not in _VALID_STAGE_NAMES:
                    raise ValueError(f"Invalid stage name in stage_order: {s!r}")
        return self

    @property
    def profile(self) -> str:
        _MAP = {
            (True, True, True): "full",
            (True, True, False): "surface_short",
            (True, False, True): "surface_long",
            (True, False, False): "surface",
            (False, True, True): "short_long",
            (False, True, False): "short",
            (False, False, True): "long",
        }
        return _MAP[(self.run_surface, self.run_short, self.run_long)]


class CancelBurninRequest(BaseModel):
    operator: str = "unknown"


class BurninStageResponse(BaseModel):
    id: int
    stage_name: str
    state: str
    percent: int = 0
    started_at: str | None = None
    finished_at: str | None = None
    error_text: str | None = None


class BurninJobResponse(BaseModel):
    id: int
    drive_id: int
    profile: str
    state: str
    percent: int = 0
    stage_name: str | None = None
    operator: str
    created_at: str
    started_at: str | None = None
    finished_at: str | None = None
    error_text: str | None = None
    stages: list[BurninStageResponse] = []


class DriveResponse(BaseModel):
    id: int
    devname: str
    serial: str | None = None
    model: str | None = None
    size_bytes: int | None = None
    temperature_c: int | None = None
    smart_health: str = "UNKNOWN"
    last_polled_at: str
    is_stale: bool
    smart_short: SmartTestState
    smart_long: SmartTestState
    notes: str | None = None
    location: str | None = None


class UpdateDriveRequest(BaseModel):
    notes: str | None = None
    location: str | None = None
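The `profile` property maps the three stage flags onto a profile name; since `validate_stages` rejects the all-False case, the lookup can never miss. The same table works as a plain function, sketched here without pydantic (the function name is illustrative):

```python
# Map (run_surface, run_short, run_long) onto the profile names stored with jobs.
_PROFILE_MAP = {
    (True, True, True): "full",
    (True, True, False): "surface_short",
    (True, False, True): "surface_long",
    (True, False, False): "surface",
    (False, True, True): "short_long",
    (False, True, False): "short",
    (False, False, True): "long",
}

def profile_name(run_surface: bool, run_short: bool, run_long: bool) -> str:
    """Resolve a profile name; (False, False, False) is rejected, matching the model validator."""
    try:
        return _PROFILE_MAP[(run_surface, run_short, run_long)]
    except KeyError:
        raise ValueError("At least one stage must be selected") from None

print(profile_name(True, True, True))    # full
print(profile_name(False, True, False))  # short
```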
80
app/notifier.py
Normal file
@@ -0,0 +1,80 @@
"""
Notification dispatcher — webhooks and immediate email alerts.

Called from burnin.py when a job reaches a terminal state (passed/failed).
Webhook fires unconditionally when WEBHOOK_URL is set.
Email alerts fire based on smtp_alert_on_fail / smtp_alert_on_pass settings.
"""

import asyncio
import logging

from app.config import settings

log = logging.getLogger(__name__)


async def notify_job_complete(
    job_id: int,
    devname: str,
    serial: str | None,
    model: str | None,
    state: str,
    profile: str,
    operator: str,
    error_text: str | None,
) -> None:
    """Fire all configured notifications for a completed burn-in job."""
    tasks = []

    if settings.webhook_url:
        tasks.append(_send_webhook({
            "event": f"burnin_{state}",
            "job_id": job_id,
            "devname": devname,
            "serial": serial,
            "model": model,
            "state": state,
            "profile": profile,
            "operator": operator,
            "error_text": error_text,
        }))

    if settings.smtp_host:
        should_alert = (
            (state == "failed" and settings.smtp_alert_on_fail) or
            (state == "passed" and settings.smtp_alert_on_pass)
        )
        if should_alert:
            tasks.append(_send_alert_email(job_id, devname, serial, model, state, error_text))

    if not tasks:
        return

    results = await asyncio.gather(*tasks, return_exceptions=True)
    for r in results:
        if isinstance(r, Exception):
            log.error("Notification failed: %s", r, extra={"job_id": job_id, "devname": devname})


async def _send_webhook(payload: dict) -> None:
    import httpx  # deferred import — only needed once a webhook actually fires

    async with httpx.AsyncClient(timeout=10.0) as client:
        r = await client.post(settings.webhook_url, json=payload)
        r.raise_for_status()
        log.info(
            "Webhook sent",
            extra={"event": payload.get("event"), "job_id": payload.get("job_id"), "url": settings.webhook_url},
        )


async def _send_alert_email(
    job_id: int,
    devname: str,
    serial: str | None,
    model: str | None,
    state: str,
    error_text: str | None,
) -> None:
    from app import mailer

    await mailer.send_job_alert(job_id, devname, serial, model, state, error_text)
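The dispatcher fans out with `asyncio.gather(..., return_exceptions=True)` so one failing channel never masks the others. A minimal stdlib-only sketch of that pattern (channel names are illustrative):

```python
import asyncio

async def ok_channel() -> str:
    return "sent"

async def broken_channel() -> str:
    raise RuntimeError("webhook endpoint returned 500")

async def dispatch() -> list:
    # return_exceptions=True converts raised exceptions into result values,
    # so every notification task runs to completion independently.
    results = await asyncio.gather(ok_channel(), broken_channel(), return_exceptions=True)
    for r in results:
        if isinstance(r, Exception):
            print(f"notification failed: {r}")
    return results

results = asyncio.run(dispatch())
print(results[0])  # sent
```

Without `return_exceptions=True`, the first exception would propagate out of `gather` and the remaining results would be discarded.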
290
app/poller.py
Normal file
@@ -0,0 +1,290 @@
"""
Polling loop — fetches TrueNAS state every POLL_INTERVAL_SECONDS and
normalizes it into SQLite.

Design notes:
- Opens its own DB connection per cycle (WAL allows concurrent readers).
- Skips a cycle if TrueNAS is unreachable; marks the poller unhealthy.
- Never overwrites a 'running' state with stale history.
"""

import asyncio
import logging
from datetime import datetime, timedelta, timezone
from typing import Any

import aiosqlite

from app.config import settings
from app.truenas import TrueNASClient

log = logging.getLogger(__name__)

# Shared state read by the /health endpoint
_state: dict[str, Any] = {
    "last_poll_at": None,
    "last_error": None,
    "healthy": False,
    "drives_seen": 0,
    "consecutive_failures": 0,
}

# SSE subscriber queues — notified after each successful poll
_subscribers: list[asyncio.Queue] = []


def get_state() -> dict:
    return _state.copy()


def subscribe() -> asyncio.Queue:
    q: asyncio.Queue = asyncio.Queue(maxsize=1)
    _subscribers.append(q)
    return q


def unsubscribe(q: asyncio.Queue) -> None:
    try:
        _subscribers.remove(q)
    except ValueError:
        pass


def _notify_subscribers(alert: dict | None = None) -> None:
    payload = {"alert": alert}
    for q in list(_subscribers):
        try:
            q.put_nowait(payload)
        except asyncio.QueueFull:
            pass  # Client is behind; skip this update


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def _eta_from_progress(percent: int, started_iso: str | None) -> str | None:
    """Linear ETA extrapolation from elapsed time and percent complete."""
    if not started_iso or percent <= 0:
        return None
    try:
        start = datetime.fromisoformat(started_iso)
        if start.tzinfo is None:
            start = start.replace(tzinfo=timezone.utc)
        elapsed = (datetime.now(timezone.utc) - start).total_seconds()
        total_est = elapsed / (percent / 100)
        remaining = max(0.0, total_est - elapsed)
        return (datetime.now(timezone.utc) + timedelta(seconds=remaining)).isoformat()
    except Exception:
        return None


def _map_history_state(status: str) -> str:
    return "passed" if "without error" in status.lower() else "failed"


# ---------------------------------------------------------------------------
# DB helpers
# ---------------------------------------------------------------------------

async def _upsert_drive(db: aiosqlite.Connection, disk: dict, now: str) -> int:
    await db.execute(
        """
        INSERT INTO drives
            (truenas_disk_id, devname, serial, model, size_bytes,
             temperature_c, smart_health, last_seen_at, last_polled_at)
        VALUES (?,?,?,?,?,?,?,?,?)
        ON CONFLICT(truenas_disk_id) DO UPDATE SET
            temperature_c = excluded.temperature_c,
            smart_health = excluded.smart_health,
            last_seen_at = excluded.last_seen_at,
            last_polled_at = excluded.last_polled_at
        """,
        (
            disk["identifier"],
            disk["devname"],
            disk.get("serial"),
            disk.get("model"),
            disk.get("size"),
            disk.get("temperature"),
            disk.get("smart_health", "UNKNOWN"),
            now,
            now,
        ),
    )
    cur = await db.execute(
        "SELECT id FROM drives WHERE truenas_disk_id = ?", (disk["identifier"],)
    )
    row = await cur.fetchone()
    return row["id"]


async def _upsert_test(db: aiosqlite.Connection, drive_id: int, ttype: str, data: dict) -> None:
    await db.execute(
        """
        INSERT INTO smart_tests
            (drive_id, test_type, state, percent, truenas_job_id,
             started_at, eta_at, finished_at, error_text)
        VALUES (?,?,?,?,?,?,?,?,?)
        ON CONFLICT(drive_id, test_type) DO UPDATE SET
            state = excluded.state,
            percent = excluded.percent,
            truenas_job_id = excluded.truenas_job_id,
            started_at = COALESCE(excluded.started_at, smart_tests.started_at),
            eta_at = excluded.eta_at,
            finished_at = excluded.finished_at,
            error_text = excluded.error_text
        """,
        (
            drive_id,
            ttype,
            data["state"],
            data.get("percent", 0),
            data.get("truenas_job_id"),
            data.get("started_at"),
            data.get("eta_at"),
            data.get("finished_at"),
            data.get("error_text"),
        ),
    )


async def _apply_running_job(
    db: aiosqlite.Connection, drive_id: int, ttype: str, job: dict
) -> None:
    pct = job["progress"]["percent"]
    await _upsert_test(db, drive_id, ttype, {
        "state": "running",
        "percent": pct,
        "truenas_job_id": job["id"],
        "started_at": job.get("time_started"),
        "eta_at": _eta_from_progress(pct, job.get("time_started")),
        "finished_at": None,
        "error_text": None,
    })


async def _sync_history(
    db: aiosqlite.Connection,
    client: TrueNASClient,
    drive_id: int,
    devname: str,
    ttype: str,
) -> None:
    """Pull the most recent completed test from history.

    This is only called when the drive+type is NOT in the active running-jobs
    dict, so it's safe to overwrite any previous 'running' state — the job
    has finished (or was never started).
    """
    try:
        results = await client.get_smart_results(devname)
    except Exception:
        return  # History fetch failure is non-fatal

    if not results:
        return

    for test in results[0].get("tests", []):
        t_name = test.get("type", "").lower()
        is_short = "short" in t_name
        if (ttype == "short") != is_short:
            continue  # Wrong test type

        state = _map_history_state(test.get("status", ""))
        await _upsert_test(db, drive_id, ttype, {
            "state": state,
            "percent": 100 if state == "passed" else 0,
            "truenas_job_id": None,
            "started_at": None,
            "eta_at": None,
            "finished_at": None,
            "error_text": test.get("status_verbose") if state == "failed" else None,
        })
        break  # Most recent only


# ---------------------------------------------------------------------------
# Poll cycle
# ---------------------------------------------------------------------------

async def poll_cycle(client: TrueNASClient) -> int:
    """Run one full poll. Returns the number of drives seen."""
    now = _now()

    disks = await client.get_disks()
    running_jobs = await client.get_smart_jobs(state="RUNNING")

    # Index running jobs by (devname, test_type)
    active: dict[tuple[str, str], dict] = {}
    for job in running_jobs:
        try:
            args = job["arguments"][0]
            devname = args["disks"][0]
            ttype = args["type"].lower()
            active[(devname, ttype)] = job
        except (KeyError, IndexError, TypeError):
            pass

    async with aiosqlite.connect(settings.db_path) as db:
        db.row_factory = aiosqlite.Row
        await db.execute("PRAGMA journal_mode=WAL")
        await db.execute("PRAGMA foreign_keys=ON")

        for disk in disks:
            devname = disk["devname"]
            drive_id = await _upsert_drive(db, disk, now)

            for ttype in ("short", "long"):
                if (devname, ttype) in active:
                    await _apply_running_job(db, drive_id, ttype, active[(devname, ttype)])
                else:
                    await _sync_history(db, client, drive_id, devname, ttype)

        await db.commit()

    return len(disks)


# ---------------------------------------------------------------------------
# Background loop
# ---------------------------------------------------------------------------

async def run(client: TrueNASClient) -> None:
    log.info("Poller started", extra={"poll_interval": settings.poll_interval_seconds})
    cycle = 0
    while True:
        try:
            count = await poll_cycle(client)
            cycle += 1
            _state["last_poll_at"] = _now()
            _state["last_error"] = None
            _state["healthy"] = True
            _state["drives_seen"] = count
            _state["consecutive_failures"] = 0
            log.debug("Poll OK", extra={"drives": count})
            _notify_subscribers()

            # Check for stuck jobs every 5 cycles (~1 min at the default 12s interval)
            if cycle % 5 == 0:
                try:
                    from app import burnin as _burnin
                    await _burnin.check_stuck_jobs()
                except Exception as exc:
                    log.error("Stuck-job check failed: %s", exc)

        except Exception as exc:
            failures = _state["consecutive_failures"] + 1
            _state["consecutive_failures"] = failures
            _state["last_error"] = str(exc)
            _state["healthy"] = False
            if failures >= 5:
                log.critical(
                    "Poller has failed %d consecutive times: %s",
                    failures, exc,
                    extra={"consecutive_failures": failures},
                )
            else:
                log.error("Poll failed: %s", exc, extra={"consecutive_failures": failures})

        await asyncio.sleep(settings.poll_interval_seconds)
136
app/renderer.py
Normal file
@@ -0,0 +1,136 @@
"""
Jinja2 template engine + filter/helper registration.
Import `templates` from here — do not create additional Jinja2 instances.
"""

from datetime import datetime, timezone

from fastapi.templating import Jinja2Templates

templates = Jinja2Templates(directory="app/templates")


# ---------------------------------------------------------------------------
# Template filters
# ---------------------------------------------------------------------------

def _format_bytes(value: int | None) -> str:
    if value is None:
        return "—"
    tb = value / 1_000_000_000_000
    if tb >= 1:
        return f"{tb:.0f} TB"
    gb = value / 1_000_000_000
    return f"{gb:.0f} GB"


def _format_eta(seconds: int | None) -> str:
    if not seconds or seconds <= 0:
        return ""
    h = seconds // 3600
    m = (seconds % 3600) // 60
    if h > 0:
        return f"~{h}h {m}m" if m else f"~{h}h"
    return f"~{m}m" if m else "<1m"


def _temp_class(celsius: int | None) -> str:
    if celsius is None:
        return ""
    if celsius < 40:
        return "temp-cool"
    if celsius < 50:
        return "temp-warm"
    return "temp-hot"


def _format_dt(iso: str | None) -> str:
    if not iso:
        return "—"
    try:
        dt = datetime.fromisoformat(iso)
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        local = dt.astimezone()
        return local.strftime("%H:%M:%S")
    except Exception:
        return iso


def _format_dt_full(iso: str | None) -> str:
    """Date + time for history tables."""
    if not iso:
        return "—"
    try:
        dt = datetime.fromisoformat(iso)
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        local = dt.astimezone()
        return local.strftime("%Y-%m-%d %H:%M:%S")
    except Exception:
        return iso


def _format_duration(seconds: float | int | None) -> str:
    if seconds is None or seconds < 0:
        return "—"
    seconds = int(seconds)
    h = seconds // 3600
    m = (seconds % 3600) // 60
    s = seconds % 60
    if h > 0:
        return f"{h}h {m}m {s}s"
    if m > 0:
        return f"{m}m {s}s"
    return f"{s}s"


# ---------------------------------------------------------------------------
# Template globals
# ---------------------------------------------------------------------------

def _drive_status(drive: dict) -> str:
    short = (drive.get("smart_short") or {}).get("state", "idle")
    long_ = (drive.get("smart_long") or {}).get("state", "idle")
    health = drive.get("smart_health", "UNKNOWN")
    if "running" in (short, long_):
        return "running"
    if short == "failed" or long_ == "failed" or health == "FAILED":
        return "failed"
    if "passed" in (short, long_):
        return "passed"
    return "idle"


def _format_elapsed(iso: str | None) -> str:
    """Human-readable elapsed time since an ISO timestamp (e.g. '2h 34m')."""
    if not iso:
        return ""
    try:
        start = datetime.fromisoformat(iso)
        if start.tzinfo is None:
            start = start.replace(tzinfo=timezone.utc)
        elapsed = int((datetime.now(timezone.utc) - start).total_seconds())
        if elapsed < 0:
            return ""
        h = elapsed // 3600
        m = (elapsed % 3600) // 60
        s = elapsed % 60
        if h > 0:
            return f"{h}h {m}m"
        if m > 0:
            return f"{m}m {s}s"
        return f"{s}s"
    except Exception:
        return ""


# Register
templates.env.filters["format_bytes"] = _format_bytes
templates.env.filters["format_eta"] = _format_eta
templates.env.filters["temp_class"] = _temp_class
templates.env.filters["format_dt"] = _format_dt
templates.env.filters["format_dt_full"] = _format_dt_full
templates.env.filters["format_duration"] = _format_duration
templates.env.filters["format_elapsed"] = _format_elapsed
templates.env.globals["drive_status"] = _drive_status
862
app/routes.py
Normal file
862
app/routes.py
Normal file
|
|
@ -0,0 +1,862 @@
|
|||
import asyncio
|
||||
import csv
|
||||
import io
|
||||
import json
|
||||
from datetime import datetime, timezone
|
||||
|
||||
import aiosqlite
|
||||
from fastapi import APIRouter, Depends, HTTPException, Query, Request
|
||||
from fastapi.responses import HTMLResponse, StreamingResponse
|
||||
from sse_starlette.sse import EventSourceResponse
|
||||
|
||||
from app import burnin, mailer, poller, settings_store
|
||||
from app.config import settings
|
||||
from app.database import get_db
|
||||
from app.models import (
|
||||
BurninJobResponse, BurninStageResponse,
|
||||
CancelBurninRequest, DriveResponse,
|
||||
SmartTestState, StartBurninRequest, UpdateDriveRequest,
|
||||
)
|
||||
from app.renderer import templates
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Internal helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _eta_seconds(eta_at: str | None) -> int | None:
|
||||
if not eta_at:
|
||||
return None
|
||||
try:
|
||||
eta_ts = datetime.fromisoformat(eta_at)
|
||||
if eta_ts.tzinfo is None:
|
||||
eta_ts = eta_ts.replace(tzinfo=timezone.utc)
|
||||
remaining = (eta_ts - datetime.now(timezone.utc)).total_seconds()
|
||||
return max(0, int(remaining))
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
|
||||
def _is_stale(last_polled_at: str) -> bool:
|
||||
try:
|
||||
last = datetime.fromisoformat(last_polled_at)
|
||||
if last.tzinfo is None:
|
||||
last = last.replace(tzinfo=timezone.utc)
|
||||
return (datetime.now(timezone.utc) - last).total_seconds() > settings.stale_threshold_seconds
|
||||
except Exception:
|
||||
return True
|
||||
|
||||
|
||||
def _build_smart(row: aiosqlite.Row, prefix: str) -> SmartTestState:
|
||||
eta_at = row[f"{prefix}_eta_at"]
|
||||
return SmartTestState(
|
||||
state=row[f"{prefix}_state"] or "idle",
|
||||
percent=row[f"{prefix}_percent"],
|
||||
eta_seconds=_eta_seconds(eta_at),
|
||||
eta_timestamp=eta_at,
|
||||
started_at=row[f"{prefix}_started_at"],
|
||||
finished_at=row[f"{prefix}_finished_at"],
|
||||
error_text=row[f"{prefix}_error"],
|
||||
)
|
||||
|
||||
|
||||
def _row_to_drive(row: aiosqlite.Row) -> DriveResponse:
|
||||
return DriveResponse(
|
||||
id=row["id"],
|
||||
devname=row["devname"],
|
||||
serial=row["serial"],
|
||||
model=row["model"],
|
||||
size_bytes=row["size_bytes"],
|
||||
temperature_c=row["temperature_c"],
|
||||
smart_health=row["smart_health"] or "UNKNOWN",
|
||||
last_polled_at=row["last_polled_at"],
|
||||
is_stale=_is_stale(row["last_polled_at"]),
|
||||
smart_short=_build_smart(row, "short"),
|
||||
smart_long=_build_smart(row, "long"),
|
||||
notes=row["notes"],
|
||||
location=row["location"],
|
||||
)
|
||||
|
||||
|
||||
def _compute_status(drive: dict) -> str:
|
||||
short = (drive.get("smart_short") or {}).get("state", "idle")
|
||||
long_ = (drive.get("smart_long") or {}).get("state", "idle")
|
||||
health = drive.get("smart_health", "UNKNOWN")
|
||||
if "running" in (short, long_):
|
||||
return "running"
|
||||
if short == "failed" or long_ == "failed" or health == "FAILED":
|
||||
return "failed"
|
||||
if "passed" in (short, long_):
|
||||
return "passed"
|
||||
return "idle"
|
||||
|
||||
|
||||
_DRIVES_QUERY = """
|
||||
SELECT
|
||||
d.id, d.devname, d.serial, d.model, d.size_bytes,
|
||||
d.temperature_c, d.smart_health, d.last_polled_at,
|
||||
d.notes, d.location,
|
||||
s.state AS short_state,
|
||||
s.percent AS short_percent,
|
||||
s.started_at AS short_started_at,
|
||||
s.eta_at AS short_eta_at,
|
||||
s.finished_at AS short_finished_at,
|
||||
s.error_text AS short_error,
|
||||
l.state AS long_state,
|
||||
l.percent AS long_percent,
|
||||
l.started_at AS long_started_at,
|
||||
l.eta_at AS long_eta_at,
|
||||
l.finished_at AS long_finished_at,
|
||||
l.error_text AS long_error
|
||||
FROM drives d
|
||||
LEFT JOIN smart_tests s ON s.drive_id = d.id AND s.test_type = 'short'
|
||||
LEFT JOIN smart_tests l ON l.drive_id = d.id AND l.test_type = 'long'
|
||||
{where}
|
||||
ORDER BY d.devname
|
||||
"""
|
||||
|
||||
|
||||
async def _fetch_burnin_by_drive(db: aiosqlite.Connection) -> dict[int, dict]:
|
||||
"""Return latest burn-in job (any state) keyed by drive_id."""
|
||||
cur = await db.execute("""
|
||||
SELECT bj.*
|
||||
FROM burnin_jobs bj
|
||||
WHERE bj.id IN (SELECT MAX(id) FROM burnin_jobs GROUP BY drive_id)
|
||||
""")
|
||||
rows = await cur.fetchall()
|
||||
return {r["drive_id"]: dict(r) for r in rows}
|
||||
|
||||
|
||||
async def _fetch_drives_for_template(db: aiosqlite.Connection) -> list[dict]:
|
||||
cur = await db.execute(_DRIVES_QUERY.format(where=""))
|
||||
rows = await cur.fetchall()
|
||||
burnin_by_drive = await _fetch_burnin_by_drive(db)
|
||||
drives = []
|
||||
for row in rows:
|
||||
d = _row_to_drive(row).model_dump()
|
||||
d["status"] = _compute_status(d)
|
||||
d["burnin"] = burnin_by_drive.get(d["id"])
|
||||
drives.append(d)
|
||||
return drives
|
||||
|
||||
|
||||
def _stale_context(poller_state: dict) -> dict:
|
||||
last = poller_state.get("last_poll_at")
|
||||
if not last:
|
||||
return {"stale": False, "stale_seconds": 0}
|
||||
try:
|
||||
dt = datetime.fromisoformat(last)
|
||||
if dt.tzinfo is None:
|
||||
dt = dt.replace(tzinfo=timezone.utc)
|
||||
elapsed = int((datetime.now(timezone.utc) - dt).total_seconds())
|
||||
stale = elapsed > settings.stale_threshold_seconds
|
||||
return {"stale": stale, "stale_seconds": elapsed}
|
||||
except Exception:
|
||||
return {"stale": False, "stale_seconds": 0}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Dashboard
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@router.get("/", response_class=HTMLResponse)
|
||||
async def dashboard(request: Request, db: aiosqlite.Connection = Depends(get_db)):
|
||||
drives = await _fetch_drives_for_template(db)
|
||||
ps = poller.get_state()
|
||||
return templates.TemplateResponse("dashboard.html", {
|
||||
"request": request,
|
||||
"drives": drives,
|
||||
"poller": ps,
|
||||
**_stale_context(ps),
|
||||
})
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# SSE — live drive table updates
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@router.get("/sse/drives")
|
||||
async def sse_drives(request: Request):
|
||||
q = poller.subscribe()
|
||||
|
||||
async def generate():
|
||||
try:
|
||||
while True:
|
||||
# Wait for next poll notification or keepalive timeout
|
||||
try:
|
||||
payload = await asyncio.wait_for(q.get(), timeout=25.0)
|
||||
except asyncio.TimeoutError:
|
||||
if await request.is_disconnected():
|
||||
break
|
||||
yield {"event": "keepalive", "data": ""}
|
||||
continue
|
||||
|
||||
if await request.is_disconnected():
|
||||
break
|
||||
|
||||
# Extract alert from payload (may be None for regular polls)
|
||||
alert = None
|
||||
if isinstance(payload, dict):
|
||||
alert = payload.get("alert")
|
||||
|
||||
# Render fresh table HTML
|
||||
async with aiosqlite.connect(settings.db_path) as db:
|
||||
db.row_factory = aiosqlite.Row
|
||||
await db.execute("PRAGMA journal_mode=WAL")
|
||||
drives = await _fetch_drives_for_template(db)
|
||||
|
||||
html = templates.env.get_template(
|
||||
"components/drives_table.html"
|
||||
).render(drives=drives)
|
||||
|
||||
yield {"event": "drives-update", "data": html}
|
||||
|
||||
# Push browser notification event if this was a job completion
|
||||
if alert:
|
||||
yield {"event": "job-alert", "data": json.dumps(alert)}
|
||||
|
||||
finally:
|
||||
poller.unsubscribe(q)
|
||||
|
||||
return EventSourceResponse(generate())
|
||||
|
||||
|
# ---------------------------------------------------------------------------
# JSON API
# ---------------------------------------------------------------------------


@router.get("/health")
async def health(db: aiosqlite.Connection = Depends(get_db)):
    ps = poller.get_state()
    cur = await db.execute("SELECT COUNT(*) FROM drives")
    row = await cur.fetchone()
    drives_tracked = row[0] if row else 0
    return {
        "status": "ok" if ps["healthy"] else "degraded",
        "last_poll_at": ps["last_poll_at"],
        "last_error": ps["last_error"],
        "consecutive_failures": ps.get("consecutive_failures", 0),
        "poll_interval_seconds": settings.poll_interval_seconds,
        "drives_tracked": drives_tracked,
    }


@router.get("/api/v1/drives", response_model=list[DriveResponse])
async def list_drives(db: aiosqlite.Connection = Depends(get_db)):
    cur = await db.execute(_DRIVES_QUERY.format(where=""))
    rows = await cur.fetchall()
    return [_row_to_drive(r) for r in rows]


@router.get("/api/v1/drives/{drive_id}", response_model=DriveResponse)
async def get_drive(drive_id: int, db: aiosqlite.Connection = Depends(get_db)):
    cur = await db.execute(
        _DRIVES_QUERY.format(where="WHERE d.id = ?"), (drive_id,)
    )
    row = await cur.fetchone()
    if not row:
        raise HTTPException(status_code=404, detail="Drive not found")
    return _row_to_drive(row)


@router.post("/api/v1/drives/{drive_id}/smart/start")
async def smart_start(
    drive_id: int,
    body: dict,
    db: aiosqlite.Connection = Depends(get_db),
):
    """Start a standalone SHORT or LONG SMART test on a single drive."""
    from app import burnin as _burnin

    test_type = (body.get("type") or "").upper()
    if test_type not in ("SHORT", "LONG"):
        raise HTTPException(status_code=422, detail="type must be SHORT or LONG")

    cur = await db.execute("SELECT devname FROM drives WHERE id=?", (drive_id,))
    row = await cur.fetchone()
    if not row:
        raise HTTPException(status_code=404, detail="Drive not found")
    devname = row[0]

    # Use the shared TrueNAS client held by the burnin module
    client = _burnin._client
    if client is None:
        raise HTTPException(status_code=503, detail="TrueNAS client not ready")

    try:
        tn_job_id = await client.start_smart_test([devname], test_type)
    except Exception as exc:
        raise HTTPException(status_code=502, detail=f"TrueNAS error: {exc}") from exc

    return {"job_id": tn_job_id, "devname": devname, "type": test_type}


@router.post("/api/v1/drives/{drive_id}/smart/cancel")
async def smart_cancel(
    drive_id: int,
    body: dict,
    db: aiosqlite.Connection = Depends(get_db),
):
    """Cancel a running standalone SMART test on a drive."""
    from app import burnin as _burnin

    test_type = (body.get("type") or "").lower()
    if test_type not in ("short", "long"):
        raise HTTPException(status_code=422, detail="type must be 'short' or 'long'")

    cur = await db.execute("SELECT devname FROM drives WHERE id=?", (drive_id,))
    row = await cur.fetchone()
    if not row:
        raise HTTPException(status_code=404, detail="Drive not found")
    devname = row[0]

    client = _burnin._client
    if client is None:
        raise HTTPException(status_code=503, detail="TrueNAS client not ready")

    # Find the running TrueNAS job for this drive/test-type
    try:
        jobs = await client.get_smart_jobs()
        tn_job_id = None
        for j in jobs:
            if j.get("state") != "RUNNING":
                continue
            args = j.get("arguments", [])
            if not args or not isinstance(args[0], dict):
                continue
            if devname in args[0].get("disks", []):
                tn_job_id = j["id"]
                break

        if tn_job_id is None:
            raise HTTPException(status_code=404, detail="No running SMART test found for this drive")

        await client.abort_job(tn_job_id)
    except HTTPException:
        raise
    except Exception as exc:
        raise HTTPException(status_code=502, detail=f"TrueNAS error: {exc}") from exc

    # Update local DB state
    now = datetime.now(timezone.utc).isoformat()
    await db.execute(
        "UPDATE smart_tests SET state='aborted', finished_at=? WHERE drive_id=? AND test_type=? AND state='running'",
        (now, drive_id, test_type),
    )
    await db.commit()

    return {"cancelled": True, "devname": devname, "type": test_type}

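The job-matching loop inside `smart_cancel` is pure logic over the TrueNAS job list and can be factored into a standalone helper. A sketch (hypothetical helper name; the shape `[{"disks": [...]}]` for job arguments is taken from the route above):

```python
def find_running_smart_job(jobs: list[dict], devname: str):
    """Return the id of the first RUNNING SMART job covering devname, else None.

    Mirrors smart_cancel's matching: each TrueNAS job carries its disks in
    arguments[0]["disks"]; non-running or malformed jobs are skipped.
    """
    for j in jobs:
        if j.get("state") != "RUNNING":
            continue
        args = j.get("arguments", [])
        if not args or not isinstance(args[0], dict):
            continue
        if devname in args[0].get("disks", []):
            return j["id"]
    return None
```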
# ---------------------------------------------------------------------------
# Burn-in API
# ---------------------------------------------------------------------------


def _row_to_burnin(row: aiosqlite.Row, stages: list[aiosqlite.Row]) -> BurninJobResponse:
    return BurninJobResponse(
        id=row["id"],
        drive_id=row["drive_id"],
        profile=row["profile"],
        state=row["state"],
        percent=row["percent"] or 0,
        stage_name=row["stage_name"],
        operator=row["operator"],
        created_at=row["created_at"],
        started_at=row["started_at"],
        finished_at=row["finished_at"],
        error_text=row["error_text"],
        stages=[
            BurninStageResponse(
                id=s["id"],
                stage_name=s["stage_name"],
                state=s["state"],
                percent=s["percent"] or 0,
                started_at=s["started_at"],
                finished_at=s["finished_at"],
                error_text=s["error_text"],
            )
            for s in stages
        ],
    )


@router.post("/api/v1/burnin/start")
async def burnin_start(req: StartBurninRequest):
    results = []
    errors = []
    for drive_id in req.drive_ids:
        try:
            job_id = await burnin.start_job(
                drive_id, req.profile, req.operator, stage_order=req.stage_order
            )
            results.append({"drive_id": drive_id, "job_id": job_id})
        except ValueError as exc:
            errors.append({"drive_id": drive_id, "error": str(exc)})
    if errors and not results:
        raise HTTPException(status_code=409, detail=errors[0]["error"])
    return {"queued": results, "errors": errors}


@router.post("/api/v1/burnin/{job_id}/cancel")
async def burnin_cancel(job_id: int, req: CancelBurninRequest):
    ok = await burnin.cancel_job(job_id, req.operator)
    if not ok:
        raise HTTPException(status_code=409, detail="Job not found or not cancellable")
    return {"cancelled": True}


# ---------------------------------------------------------------------------
# History pages
# ---------------------------------------------------------------------------

_PAGE_SIZE = 50

_ALL_STATES = ("queued", "running", "passed", "failed", "cancelled", "unknown")

_HISTORY_QUERY = """
SELECT
    bj.id, bj.drive_id, bj.profile, bj.state, bj.operator,
    bj.created_at, bj.started_at, bj.finished_at, bj.error_text,
    d.devname, d.serial, d.model, d.size_bytes,
    CAST(
        (julianday(bj.finished_at) - julianday(bj.started_at)) * 86400
        AS INTEGER
    ) AS duration_seconds
FROM burnin_jobs bj
JOIN drives d ON d.id = bj.drive_id
{where}
ORDER BY bj.id DESC
"""


def _state_where(state: str) -> tuple[str, list]:
    if state == "all":
        return "", []
    return "WHERE bj.state = ?", [state]

@router.get("/history", response_class=HTMLResponse)
async def history_list(
    request: Request,
    state: str = Query(default="all"),
    page: int = Query(default=1, ge=1),
    db: aiosqlite.Connection = Depends(get_db),
):
    if state not in ("all",) + _ALL_STATES:
        state = "all"

    where_clause, params = _state_where(state)

    # Total count
    count_sql = f"SELECT COUNT(*) FROM burnin_jobs bj JOIN drives d ON d.id = bj.drive_id {where_clause}"
    cur = await db.execute(count_sql, params)
    total_count = (await cur.fetchone())[0]
    total_pages = max(1, (total_count + _PAGE_SIZE - 1) // _PAGE_SIZE)
    page = min(page, total_pages)
    offset = (page - 1) * _PAGE_SIZE

    # Per-state counts for badges
    cur = await db.execute(
        "SELECT state, COUNT(*) FROM burnin_jobs GROUP BY state"
    )
    counts = {"all": total_count if state == "all" else 0}
    for r in await cur.fetchall():
        counts[r[0]] = r[1]
    if state != "all":
        cur2 = await db.execute("SELECT COUNT(*) FROM burnin_jobs")
        counts["all"] = (await cur2.fetchone())[0]

    # Job rows
    sql = _HISTORY_QUERY.format(where=where_clause) + " LIMIT ? OFFSET ?"
    cur = await db.execute(sql, params + [_PAGE_SIZE, offset])
    rows = await cur.fetchall()
    jobs = [dict(r) for r in rows]

    ps = poller.get_state()
    return templates.TemplateResponse("history.html", {
        "request": request,
        "jobs": jobs,
        "active_state": state,
        "counts": counts,
        "page": page,
        "total_pages": total_pages,
        "total_count": total_count,
        "poller": ps,
        **_stale_context(ps),
    })

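The pagination arithmetic in `history_list` (ceiling division for page count, clamping the requested page, then deriving the `OFFSET`) can be isolated as a small pure function. A sketch under the same constants (hypothetical helper, not in the repo):

```python
def paginate(total_count: int, page: int, page_size: int = 50) -> tuple[int, int, int]:
    """Clamp a requested page and compute the query offset.

    total_pages = ceil(total_count / page_size) via integer arithmetic,
    with a minimum of one page so an empty table still renders page 1.
    """
    total_pages = max(1, (total_count + page_size - 1) // page_size)
    page = min(max(1, page), total_pages)
    offset = (page - 1) * page_size
    return page, total_pages, offset
```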
@router.get("/history/{job_id}", response_class=HTMLResponse)
async def history_detail(
    request: Request,
    job_id: int,
    db: aiosqlite.Connection = Depends(get_db),
):
    # Job + drive info
    cur = await db.execute("""
        SELECT
            bj.*, d.devname, d.serial, d.model, d.size_bytes,
            CAST(
                (julianday(bj.finished_at) - julianday(bj.started_at)) * 86400
                AS INTEGER
            ) AS duration_seconds
        FROM burnin_jobs bj
        JOIN drives d ON d.id = bj.drive_id
        WHERE bj.id = ?
    """, (job_id,))
    row = await cur.fetchone()
    if not row:
        raise HTTPException(status_code=404, detail="Burn-in job not found")
    job = dict(row)

    # Stages (with duration)
    cur = await db.execute("""
        SELECT *,
            CAST(
                (julianday(finished_at) - julianday(started_at)) * 86400
                AS INTEGER
            ) AS duration_seconds
        FROM burnin_stages
        WHERE burnin_job_id = ?
        ORDER BY id
    """, (job_id,))
    job["stages"] = [dict(r) for r in await cur.fetchall()]

    ps = poller.get_state()
    return templates.TemplateResponse("job_detail.html", {
        "request": request,
        "job": job,
        "poller": ps,
        **_stale_context(ps),
    })


# ---------------------------------------------------------------------------
# CSV export
# ---------------------------------------------------------------------------

@router.get("/api/v1/burnin/export.csv")
async def burnin_export_csv(db: aiosqlite.Connection = Depends(get_db)):
    cur = await db.execute("""
        SELECT
            bj.id AS job_id,
            bj.drive_id,
            d.devname,
            d.serial,
            d.model,
            bj.profile,
            bj.state,
            bj.operator,
            bj.created_at,
            bj.started_at,
            bj.finished_at,
            CAST(
                (julianday(bj.finished_at) - julianday(bj.started_at)) * 86400
                AS INTEGER
            ) AS duration_seconds,
            bj.error_text
        FROM burnin_jobs bj
        JOIN drives d ON d.id = bj.drive_id
        ORDER BY bj.id DESC
    """)
    rows = await cur.fetchall()

    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow([
        "job_id", "drive_id", "devname", "serial", "model",
        "profile", "state", "operator",
        "created_at", "started_at", "finished_at", "duration_seconds",
        "error_text",
    ])
    for r in rows:
        writer.writerow(list(r))

    buf.seek(0)
    return StreamingResponse(
        iter([buf.getvalue()]),
        media_type="text/csv",
        headers={"Content-Disposition": "attachment; filename=burnin_history.csv"},
    )

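The `duration_seconds` column computed throughout these queries uses SQLite's `julianday()` difference (days) multiplied by 86400 to get whole seconds. The same computation expressed in Python, assuming ISO-8601 timestamps as the app stores them:

```python
from datetime import datetime

def duration_seconds(started_at: str, finished_at: str) -> int:
    """Python equivalent of the SQL expression
    CAST((julianday(finished_at) - julianday(started_at)) * 86400 AS INTEGER).
    """
    start = datetime.fromisoformat(started_at)
    end = datetime.fromisoformat(finished_at)
    return int((end - start).total_seconds())
```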
# ---------------------------------------------------------------------------
# On-demand email report
# ---------------------------------------------------------------------------

@router.post("/api/v1/report/send")
async def send_report_now():
    """Trigger the daily status email immediately (for testing SMTP config)."""
    if not settings.smtp_host:
        raise HTTPException(status_code=503, detail="SMTP not configured (SMTP_HOST is empty)")
    try:
        await mailer.send_report_now()
    except Exception as exc:
        raise HTTPException(status_code=502, detail=f"Mail send failed: {exc}") from exc
    return {"sent": True, "to": settings.smtp_to}


# ---------------------------------------------------------------------------
# Drive notes / location update
# ---------------------------------------------------------------------------

@router.patch("/api/v1/drives/{drive_id}")
async def update_drive(
    drive_id: int,
    req: UpdateDriveRequest,
    db: aiosqlite.Connection = Depends(get_db),
):
    cur = await db.execute("SELECT id FROM drives WHERE id=?", (drive_id,))
    if not await cur.fetchone():
        raise HTTPException(status_code=404, detail="Drive not found")

    await db.execute(
        "UPDATE drives SET notes=?, location=? WHERE id=?",
        (req.notes, req.location, drive_id),
    )
    await db.commit()
    return {"updated": True}


# ---------------------------------------------------------------------------
# Audit log page
# ---------------------------------------------------------------------------

_AUDIT_QUERY = """
SELECT
    ae.id, ae.event_type, ae.operator, ae.message, ae.created_at,
    d.devname, d.serial
FROM audit_events ae
LEFT JOIN drives d ON d.id = ae.drive_id
ORDER BY ae.id DESC
LIMIT 200
"""

_AUDIT_EVENT_COLORS = {
    "burnin_queued": "yellow",
    "burnin_started": "blue",
    "burnin_passed": "passed",
    "burnin_failed": "failed",
    "burnin_cancelled": "cancelled",
    "burnin_stuck": "failed",
    "burnin_unknown": "unknown",
}


@router.get("/audit", response_class=HTMLResponse)
async def audit_log(
    request: Request,
    db: aiosqlite.Connection = Depends(get_db),
):
    cur = await db.execute(_AUDIT_QUERY)
    rows = [dict(r) for r in await cur.fetchall()]
    ps = poller.get_state()
    return templates.TemplateResponse("audit.html", {
        "request": request,
        "events": rows,
        "event_colors": _AUDIT_EVENT_COLORS,
        "poller": ps,
        **_stale_context(ps),
    })


# ---------------------------------------------------------------------------
# Stats / analytics page
# ---------------------------------------------------------------------------

@router.get("/stats", response_class=HTMLResponse)
async def stats_page(
    request: Request,
    db: aiosqlite.Connection = Depends(get_db),
):
    # Overall counts
    cur = await db.execute("""
        SELECT
            COUNT(*) AS total,
            SUM(CASE WHEN state='passed' THEN 1 ELSE 0 END) AS passed,
            SUM(CASE WHEN state='failed' THEN 1 ELSE 0 END) AS failed,
            SUM(CASE WHEN state='running' THEN 1 ELSE 0 END) AS running,
            SUM(CASE WHEN state='cancelled' THEN 1 ELSE 0 END) AS cancelled
        FROM burnin_jobs
    """)
    overall = dict(await cur.fetchone())

    # Pass/fail rate by drive model (only completed jobs)
    cur = await db.execute("""
        SELECT
            COALESCE(d.model, 'Unknown') AS model,
            COUNT(*) AS total,
            SUM(CASE WHEN bj.state='passed' THEN 1 ELSE 0 END) AS passed,
            SUM(CASE WHEN bj.state='failed' THEN 1 ELSE 0 END) AS failed,
            ROUND(100.0 * SUM(CASE WHEN bj.state='passed' THEN 1 ELSE 0 END) / COUNT(*), 1) AS pass_rate
        FROM burnin_jobs bj
        JOIN drives d ON d.id = bj.drive_id
        WHERE bj.state IN ('passed', 'failed')
        GROUP BY COALESCE(d.model, 'Unknown')
        ORDER BY total DESC
        LIMIT 20
    """)
    by_model = [dict(r) for r in await cur.fetchall()]

    # Activity over the last 14 days
    cur = await db.execute("""
        SELECT
            date(created_at) AS day,
            COUNT(*) AS total,
            SUM(CASE WHEN state='passed' THEN 1 ELSE 0 END) AS passed,
            SUM(CASE WHEN state='failed' THEN 1 ELSE 0 END) AS failed
        FROM burnin_jobs
        WHERE created_at >= date('now', '-14 days')
        GROUP BY date(created_at)
        ORDER BY day DESC
    """)
    by_day = [dict(r) for r in await cur.fetchall()]

    # Drives tracked
    cur = await db.execute("SELECT COUNT(*) FROM drives")
    drives_total = (await cur.fetchone())[0]

    ps = poller.get_state()
    return templates.TemplateResponse("stats.html", {
        "request": request,
        "overall": overall,
        "by_model": by_model,
        "by_day": by_day,
        "drives_total": drives_total,
        "poller": ps,
        **_stale_context(ps),
    })

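The per-model `pass_rate` in the stats query rounds `100.0 * passed / total` to one decimal place. The same arithmetic in Python, for reference (the `WHERE state IN ('passed','failed')` filter guarantees `total > 0` for every grouped row):

```python
def pass_rate(passed: int, total: int) -> float:
    """Mirror of the SQL ROUND(100.0 * passed / total, 1) on /stats.

    Caller must ensure total > 0, as the grouped query does.
    """
    return round(100.0 * passed / total, 1)
```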
# ---------------------------------------------------------------------------
# Settings page
# ---------------------------------------------------------------------------

@router.get("/settings", response_class=HTMLResponse)
async def settings_page(
    request: Request,
    db: aiosqlite.Connection = Depends(get_db),
):
    # Read-only display values (require container restart to change)
    readonly = {
        "truenas_base_url": settings.truenas_base_url,
        "truenas_verify_tls": settings.truenas_verify_tls,
        "poll_interval_seconds": settings.poll_interval_seconds,
        "stale_threshold_seconds": settings.stale_threshold_seconds,
        "allowed_ips": settings.allowed_ips or "(allow all)",
        "log_level": settings.log_level,
    }

    # Editable values — real values for form fields (password excluded)
    editable = {
        "smtp_host": settings.smtp_host,
        "smtp_port": settings.smtp_port,
        "smtp_ssl_mode": settings.smtp_ssl_mode or "starttls",
        "smtp_timeout": settings.smtp_timeout,
        "smtp_user": settings.smtp_user,
        "smtp_from": settings.smtp_from,
        "smtp_to": settings.smtp_to,
        "smtp_report_hour": settings.smtp_report_hour,
        "smtp_daily_report_enabled": settings.smtp_daily_report_enabled,
        "smtp_alert_on_fail": settings.smtp_alert_on_fail,
        "smtp_alert_on_pass": settings.smtp_alert_on_pass,
        "webhook_url": settings.webhook_url,
        "stuck_job_hours": settings.stuck_job_hours,
        "max_parallel_burnins": settings.max_parallel_burnins,
    }

    ps = poller.get_state()
    return templates.TemplateResponse("settings.html", {
        "request": request,
        "readonly": readonly,
        "editable": editable,
        "smtp_enabled": bool(settings.smtp_host),
        "poller": ps,
        **_stale_context(ps),
    })


@router.post("/api/v1/settings")
async def save_settings(body: dict):
    """Save editable runtime settings. Password is only updated if non-empty."""
    # Don't overwrite password if client sent empty string
    if "smtp_password" in body and body["smtp_password"] == "":
        del body["smtp_password"]

    try:
        saved = settings_store.save(body)
    except ValueError as exc:
        raise HTTPException(status_code=422, detail=str(exc)) from exc

    return {"saved": True, "keys": saved}


@router.post("/api/v1/settings/test-smtp")
async def test_smtp():
    """Test the current SMTP configuration without sending an email."""
    result = await mailer.test_smtp_connection()
    if not result["ok"]:
        raise HTTPException(status_code=502, detail=result["error"])
    return {"ok": True}


# ---------------------------------------------------------------------------
# Print view (must be BEFORE /{job_id} int route)
# ---------------------------------------------------------------------------

@router.get("/history/{job_id}/print", response_class=HTMLResponse)
async def history_print(
    request: Request,
    job_id: int,
    db: aiosqlite.Connection = Depends(get_db),
):
    cur = await db.execute("""
        SELECT
            bj.*, d.devname, d.serial, d.model, d.size_bytes,
            CAST(
                (julianday(bj.finished_at) - julianday(bj.started_at)) * 86400
                AS INTEGER
            ) AS duration_seconds
        FROM burnin_jobs bj
        JOIN drives d ON d.id = bj.drive_id
        WHERE bj.id = ?
    """, (job_id,))
    row = await cur.fetchone()
    if not row:
        raise HTTPException(status_code=404, detail="Job not found")
    job = dict(row)

    cur = await db.execute("""
        SELECT *,
            CAST(
                (julianday(finished_at) - julianday(started_at)) * 86400
                AS INTEGER
            ) AS duration_seconds
        FROM burnin_stages WHERE burnin_job_id=? ORDER BY id
    """, (job_id,))
    job["stages"] = [dict(r) for r in await cur.fetchall()]

    return templates.TemplateResponse("job_print.html", {
        "request": request,
        "job": job,
    })


# ---------------------------------------------------------------------------
# Burn-in job detail API (must be after export.csv to avoid int coercion)
# ---------------------------------------------------------------------------

@router.get("/api/v1/burnin/{job_id}", response_model=BurninJobResponse)
async def burnin_get(job_id: int, db: aiosqlite.Connection = Depends(get_db)):
    db.row_factory = aiosqlite.Row
    cur = await db.execute("SELECT * FROM burnin_jobs WHERE id=?", (job_id,))
    row = await cur.fetchone()
    if not row:
        raise HTTPException(status_code=404, detail="Burn-in job not found")
    cur = await db.execute(
        "SELECT * FROM burnin_stages WHERE burnin_job_id=? ORDER BY id", (job_id,)
    )
    stages = await cur.fetchall()
    return _row_to_burnin(row, stages)
104
app/settings_store.py
Normal file
"""
Runtime settings store — persists editable settings to /data/settings_overrides.json.

Changes take effect immediately (in-memory setattr on the global Settings object)
and survive restarts (the JSON file is loaded in main.py's lifespan).

Settings that require a container restart (TrueNAS URL, poll interval, allowed IPs, etc.)
are NOT included here and are display-only on the settings page.
"""

import json
import logging
from pathlib import Path

from app.config import settings

log = logging.getLogger(__name__)

# Field name → coercion type. Only fields listed here are accepted by save().
_EDITABLE: dict[str, type] = {
    "smtp_host": str,
    "smtp_port": int,
    "smtp_ssl_mode": str,
    "smtp_timeout": int,
    "smtp_user": str,
    "smtp_password": str,
    "smtp_from": str,
    "smtp_to": str,
    "smtp_daily_report_enabled": bool,
    "smtp_report_hour": int,
    "smtp_alert_on_fail": bool,
    "smtp_alert_on_pass": bool,
    "webhook_url": str,
    "stuck_job_hours": int,
    "max_parallel_burnins": int,
}

_VALID_SSL_MODES = {"starttls", "ssl", "plain"}


def _overrides_path() -> Path:
    return Path(settings.db_path).parent / "settings_overrides.json"


def _coerce(key: str, raw) -> object:
    coerce = _EDITABLE[key]
    if coerce is bool:
        if isinstance(raw, bool):
            return raw
        return str(raw).lower() in ("1", "true", "yes", "on")
    return coerce(raw)


def _apply(data: dict) -> None:
    """Apply a dict of updates to the live settings object."""
    for key, raw in data.items():
        if key not in _EDITABLE:
            continue
        try:
            val = _coerce(key, raw)
            if key == "smtp_ssl_mode" and val not in _VALID_SSL_MODES:
                log.warning("settings_store: invalid smtp_ssl_mode %r — ignoring", val)
                continue
            if key == "smtp_report_hour" and not (0 <= int(val) <= 23):
                log.warning("settings_store: smtp_report_hour out of range — ignoring")
                continue
            setattr(settings, key, val)
        except (ValueError, TypeError) as exc:
            log.warning("settings_store: invalid value for %s: %s", key, exc)


def init() -> None:
    """Load persisted overrides at startup. Call once from lifespan."""
    path = _overrides_path()
    if not path.exists():
        return
    try:
        data = json.loads(path.read_text())
        _apply(data)
        log.info("settings_store: loaded %d override(s) from %s", len(data), path)
    except Exception as exc:
        log.warning("settings_store: could not load overrides from %s: %s", path, exc)


def save(updates: dict) -> list[str]:
    """
    Validate, apply, and persist a dict of settings updates.
    Returns the list of keys that were actually saved.
    Raises ValueError for unknown or invalid fields.
    """
    accepted: dict = {}
    for key, raw in updates.items():
        if key not in _EDITABLE:
            raise ValueError(f"Unknown or non-editable setting: {key!r}")
        accepted[key] = raw

    _apply(accepted)

    # Persist ALL currently-applied editable values (not just the delta)
    snapshot = {k: getattr(settings, k) for k in _EDITABLE}
    path = _overrides_path()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(snapshot, indent=2))
    log.info("settings_store: saved %d key(s) — snapshot written to %s", len(accepted), path)
    return list(accepted.keys())
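The boolean branch of `_coerce` accepts JSON booleans as-is and otherwise string-matches against common truthy spellings, so form-posted values like `"on"` or `"1"` round-trip correctly. That logic in isolation:

```python
def coerce_bool(raw) -> bool:
    """Bool coercion as used by the settings store's _coerce():
    real booleans pass through; everything else is lowercased and
    matched against the truthy strings "1", "true", "yes", "on".
    """
    if isinstance(raw, bool):
        return raw
    return str(raw).lower() in ("1", "true", "yes", "on")
```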
1939
app/static/app.css
Normal file
File diff suppressed because it is too large
848
app/static/app.js
Normal file
(function () {
  'use strict';

  // -----------------------------------------------------------------------
  // Filter bar + stats bar
  // -----------------------------------------------------------------------

  var activeFilter = 'all';

  function getRows() {
    return Array.from(document.querySelectorAll('#drives-tbody tr[data-status]'));
  }

  function updateCounts() {
    var rows = getRows();
    var counts = { all: rows.length, running: 0, failed: 0, passed: 0, idle: 0 };
    rows.forEach(function (r) {
      var s = r.dataset.status;
      if (s && Object.prototype.hasOwnProperty.call(counts, s)) counts[s]++;
    });

    // Update filter bar badges
    document.querySelectorAll('.filter-btn[data-filter]').forEach(function (btn) {
      var badge = btn.querySelector('.badge');
      if (badge) badge.textContent = counts[btn.dataset.filter] != null ? counts[btn.dataset.filter] : 0;
    });

    // Update stats bar
    ['all', 'running', 'failed', 'passed', 'idle'].forEach(function (s) {
      var el = document.getElementById('stat-' + s);
      if (el) el.textContent = counts[s] != null ? counts[s] : 0;
    });

    // Show/hide failed banner
    var banner = document.getElementById('failed-banner');
    if (banner) {
      var failedCount = counts.failed || 0;
      banner.hidden = failedCount === 0;
      var fc = banner.querySelector('.failed-count');
      if (fc) fc.textContent = failedCount;
    }

    // Show/hide "Cancel All Burn-Ins" button based on whether any .btn-cancel exist
    var cancelAllBtn = document.getElementById('cancel-all-btn');
    if (cancelAllBtn) {
      var hasCancelable = document.querySelectorAll('.btn-cancel[data-job-id]').length > 0;
      cancelAllBtn.hidden = !hasCancelable;
    }
  }

  function applyFilter(filter) {
    activeFilter = filter;
    getRows().forEach(function (row) {
      row.style.display = (filter === 'all' || row.dataset.status === filter) ? '' : 'none';
    });
    document.querySelectorAll('.filter-btn[data-filter]').forEach(function (btn) {
      btn.classList.toggle('active', btn.dataset.filter === filter);
    });
    updateCounts();
  }

  document.addEventListener('click', function (e) {
    var btn = e.target.closest('.filter-btn[data-filter]');
    if (btn) applyFilter(btn.dataset.filter);
  });

  document.addEventListener('htmx:afterSwap', function () {
    applyFilter(activeFilter);
    restoreCheckboxes();
    initElapsedTimers();
    initLocationEdits();
  });

  updateCounts();

  // -----------------------------------------------------------------------
  // Toast notifications
  // -----------------------------------------------------------------------

  function showToast(msg, type) {
    type = type || 'info';
    var container = document.getElementById('toast-container');
    if (!container) return;
    var el = document.createElement('div');
    el.className = 'toast toast-' + type;
    el.textContent = msg;
    container.appendChild(el);
    setTimeout(function () { el.remove(); }, 5000);
  }

  // -----------------------------------------------------------------------
  // Browser push notifications
  // -----------------------------------------------------------------------

  function updateNotifBtn() {
    var btn = document.getElementById('notif-btn');
    if (!btn) return;
    var perm = Notification.permission;
    btn.classList.remove('notif-active', 'notif-denied');
    if (perm === 'granted') {
      btn.classList.add('notif-active');
      btn.title = 'Notifications enabled';
    } else if (perm === 'denied') {
      btn.classList.add('notif-denied');
      btn.title = 'Notifications blocked — allow in browser settings';
    } else {
      btn.title = 'Enable browser notifications';
    }
  }

  if ('Notification' in window) {
    updateNotifBtn();
    document.addEventListener('click', function (e) {
      if (!e.target.closest('#notif-btn')) return;
      if (Notification.permission === 'denied') {
        showToast('Notifications blocked — allow in browser settings', 'error');
        return;
      }
      Notification.requestPermission().then(function (perm) {
        updateNotifBtn();
        if (perm === 'granted') {
          showToast('Browser notifications enabled', 'success');
          new Notification('TrueNAS Burn-In', {
            body: 'You will be notified when burn-in jobs complete.',
          });
        }
      });
    });
  } else {
    var nb = document.getElementById('notif-btn');
    if (nb) nb.style.display = 'none';
  }

  // Handle job-alert SSE events for browser notifications
  document.addEventListener('htmx:sseMessage', function (e) {
    if (!e.detail || e.detail.type !== 'job-alert') return;
    try {
      handleJobAlert(JSON.parse(e.detail.data));
    } catch (_) {}
  });

  function handleJobAlert(data) {
    var isPass = data.state === 'passed';
    var icon = isPass ? '✓' : '✕';
    var title = icon + ' ' + (data.devname || 'Drive') + ' — Burn-In ' + (data.state || '').toUpperCase();
    var bodyText = (data.model || '') + (data.serial ? ' · ' + data.serial : '');
    if (!isPass && data.error_text) bodyText += '\n' + data.error_text;

    showToast(title + (data.error_text ? ' · ' + data.error_text : ''), isPass ? 'success' : 'error');

    if ('Notification' in window && Notification.permission === 'granted') {
      try {
        new Notification(title, { body: bodyText || undefined });
      } catch (_) {}
    }
  }

  // -----------------------------------------------------------------------
  // Elapsed time timers
  // -----------------------------------------------------------------------

  var _elapsedInterval = null;

  function formatElapsed(seconds) {
    if (seconds < 0) return '';
    var h = Math.floor(seconds / 3600);
    var m = Math.floor((seconds % 3600) / 60);
    var s = seconds % 60;
    if (h > 0) return h + 'h ' + m + 'm';
    if (m > 0) return m + 'm ' + s + 's';
    return s + 's';
  }

|
||||
function tickElapsedTimers() {
|
||||
var now = Date.now();
|
||||
document.querySelectorAll('.elapsed-timer[data-started]').forEach(function (el) {
|
||||
var started = new Date(el.dataset.started).getTime();
|
||||
if (isNaN(started)) return;
|
||||
var elapsed = Math.floor((now - started) / 1000);
|
||||
el.textContent = formatElapsed(elapsed);
|
||||
});
|
||||
}
|
||||
|
||||
function initElapsedTimers() {
|
||||
if (_elapsedInterval) return; // Already running
|
||||
var timers = document.querySelectorAll('.elapsed-timer[data-started]');
|
||||
if (timers.length === 0) return;
|
||||
_elapsedInterval = setInterval(function () {
|
||||
var remaining = document.querySelectorAll('.elapsed-timer[data-started]');
|
||||
if (remaining.length === 0) {
|
||||
clearInterval(_elapsedInterval);
|
||||
_elapsedInterval = null;
|
||||
return;
|
||||
}
|
||||
tickElapsedTimers();
|
||||
}, 1000);
|
||||
tickElapsedTimers();
|
||||
}
|
||||
|
||||
initElapsedTimers();
|
||||
|
||||
// -----------------------------------------------------------------------
// Inline location / notes edit
// -----------------------------------------------------------------------

function initLocationEdits() {
  document.querySelectorAll('.drive-location').forEach(function (el) {
    if (el._locationInited) return;
    el._locationInited = true;

    el.addEventListener('click', function (evt) {
      evt.stopPropagation();
      var driveId = el.dataset.driveId;
      var current = el.classList.contains('drive-location-empty') ? '' : el.textContent.trim();

      var input = document.createElement('input');
      input.type = 'text';
      input.className = 'drive-location-input';
      input.value = current;
      input.placeholder = 'e.g. Bay 3 Shelf 2';
      input.maxLength = 64;

      el.replaceWith(input);
      input.focus();
      input.select();

      async function save() {
        var newVal = input.value.trim();
        try {
          var resp = await fetch('/api/v1/drives/' + driveId, {
            method: 'PATCH',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ location: newVal || null }),
          });
          if (!resp.ok) throw new Error('save failed');
        } catch (_) {
          showToast('Failed to save location', 'error');
        }
        // The SSE update will replace the whole row; nothing more needed
      }

      function cancel() {
        var span = document.createElement('span');
        span.className = 'drive-location' + (current ? '' : ' drive-location-empty');
        span.dataset.driveId = driveId;
        span.dataset.field = 'location';
        span.title = current ? 'Click to edit location' : 'Click to set location';
        span.textContent = current || '+ location';
        input.replaceWith(span);
        initLocationEdits(); // re-attach listener
      }

      input.addEventListener('blur', function () { save(); });
      input.addEventListener('keydown', function (e) {
        if (e.key === 'Enter') { input.blur(); }
        if (e.key === 'Escape') { cancel(); }
      });
    });
  });
}

initLocationEdits();

// -----------------------------------------------------------------------
// Stage drag-and-drop reordering
// -----------------------------------------------------------------------

function initStageDrag(listId) {
  var list = document.getElementById(listId);
  if (!list || list._dragInited) return;
  list._dragInited = true;

  var draggingEl = null;

  list.addEventListener('dragstart', function (e) {
    draggingEl = e.target.closest('.stage-check');
    if (!draggingEl) return;
    draggingEl.classList.add('dragging');
    e.dataTransfer.effectAllowed = 'move';
  });

  list.addEventListener('dragend', function () {
    if (draggingEl) draggingEl.classList.remove('dragging');
    list.querySelectorAll('.stage-check.drag-over').forEach(function (el) {
      el.classList.remove('drag-over');
    });
    draggingEl = null;
  });

  list.addEventListener('dragover', function (e) {
    e.preventDefault();
    if (!draggingEl) return;
    var target = e.target.closest('.stage-check');
    if (!target || target === draggingEl) return;
    var rect = target.getBoundingClientRect();
    var midY = rect.top + rect.height / 2;
    if (e.clientY < midY) {
      list.insertBefore(draggingEl, target);
    } else {
      list.insertBefore(draggingEl, target.nextSibling);
    }
  });
}

// Map checkbox id → backend stage name
var _STAGE_ID_MAP = {
  'stage-surface': 'surface_validate',
  'stage-short': 'short_smart',
  'stage-long': 'long_smart',
  'batch-stage-surface': 'surface_validate',
  'batch-stage-short': 'short_smart',
  'batch-stage-long': 'long_smart',
};

// Read DOM order of checked stages from the given list element
function getStageOrder(listId) {
  var items = Array.from(document.querySelectorAll('#' + listId + ' .stage-check'));
  var order = [];
  items.forEach(function (item) {
    var cb = item.querySelector('input[type=checkbox]');
    if (cb && cb.checked && _STAGE_ID_MAP[cb.id]) {
      order.push(_STAGE_ID_MAP[cb.id]);
    }
  });
  return order;
}

// -----------------------------------------------------------------------
// Standalone SMART test
// -----------------------------------------------------------------------

async function startSmartTest(btn) {
  var driveId = btn.dataset.driveId;
  var testType = btn.dataset.testType;
  var operator = localStorage.getItem('burnin_operator') || 'unknown';

  btn.disabled = true;
  try {
    var resp = await fetch('/api/v1/drives/' + driveId + '/smart/start', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ type: testType, operator: operator }),
    });
    var data = await resp.json();
    if (!resp.ok) {
      showToast(data.detail || 'Failed to start test', 'error');
      btn.disabled = false;
    } else {
      var label = testType === 'SHORT' ? 'Short' : 'Long';
      showToast(label + ' SMART test started on ' + data.devname, 'success');
    }
  } catch (err) {
    showToast('Network error', 'error');
    btn.disabled = false;
  }
}

// -----------------------------------------------------------------------
// Cancel standalone SMART test
// -----------------------------------------------------------------------

async function cancelSmartTest(btn) {
  var driveId = btn.dataset.driveId;
  var testType = btn.dataset.testType; // 'short' or 'long'
  var label = testType === 'short' ? 'Short' : 'Long';

  if (!confirm('Cancel the ' + label + ' SMART test? This cannot be undone.')) return;

  btn.disabled = true;
  try {
    var resp = await fetch('/api/v1/drives/' + driveId + '/smart/cancel', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ type: testType }),
    });
    var data = await resp.json();
    if (!resp.ok) {
      showToast(data.detail || 'Failed to cancel test', 'error');
      btn.disabled = false;
    } else {
      showToast(label + ' SMART test cancelled on ' + (data.devname || ''), 'info');
    }
  } catch (err) {
    showToast('Network error', 'error');
    btn.disabled = false;
  }
}

// -----------------------------------------------------------------------
// Cancel ALL running/queued burn-in jobs
// -----------------------------------------------------------------------

async function cancelAllBurnins() {
  var cancelBtns = Array.from(document.querySelectorAll('.btn-cancel[data-job-id]'));
  if (cancelBtns.length === 0) {
    showToast('No active burn-in jobs to cancel', 'info');
    return;
  }
  if (!confirm('Cancel ALL ' + cancelBtns.length + ' active burn-in job(s)? This cannot be undone.')) return;
  var operator = localStorage.getItem('burnin_operator') || 'unknown';
  var count = 0;
  for (var i = 0; i < cancelBtns.length; i++) {
    var jobId = cancelBtns[i].dataset.jobId;
    try {
      var resp = await fetch('/api/v1/burnin/' + jobId + '/cancel', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ operator: operator }),
      });
      if (resp.ok) count++;
    } catch (_) {}
  }
  showToast(count + ' burn-in job(s) cancelled', 'info');
}

// -----------------------------------------------------------------------
// Single drive Burn-In modal
// -----------------------------------------------------------------------

var modalDriveId = null;
var modalSerial = null;

function _stageLabel() {
  // Read labels in DOM order from stage-order-list
  var items = Array.from(document.querySelectorAll('#stage-order-list .stage-check'));
  var labelMap = {
    'stage-surface': 'Surface',
    'stage-short': 'Short SMART',
    'stage-long': 'Long SMART',
  };
  var parts = [];
  items.forEach(function (item) {
    var cb = item.querySelector('input[type=checkbox]');
    if (cb && cb.checked && labelMap[cb.id]) parts.push(labelMap[cb.id]);
  });
  return parts.length ? parts.join(' + ') : 'No stages';
}

function handleStageChange() {
  var surfaceChecked = document.getElementById('stage-surface') && document.getElementById('stage-surface').checked;
  var warning = document.getElementById('surface-warning');
  var serialField = document.getElementById('serial-field');
  if (warning) warning.style.display = surfaceChecked ? '' : 'none';
  if (serialField) serialField.style.display = surfaceChecked ? '' : 'none';
  // Update title
  var title = document.getElementById('modal-title');
  if (title) title.textContent = 'Burn-In — ' + _stageLabel();
  validateModal();
}

function openModal(btn) {
  modalDriveId = btn.dataset.driveId;
  modalSerial = btn.dataset.serial || '';

  document.getElementById('modal-devname').textContent = btn.dataset.devname || '—';
  document.getElementById('modal-model').textContent = btn.dataset.model || '—';
  document.getElementById('modal-serial-display').textContent = btn.dataset.serial || '—';
  document.getElementById('modal-size').textContent = btn.dataset.size || '—';

  var healthEl = document.getElementById('modal-health');
  var health = btn.dataset.health || 'UNKNOWN';
  healthEl.textContent = health;
  healthEl.className = 'chip chip-' + health.toLowerCase();

  // Reset stage checkboxes to all-on (keep user's drag order)
  ['stage-surface', 'stage-short', 'stage-long'].forEach(function (id) {
    var el = document.getElementById(id);
    if (el) el.checked = true;
  });

  document.getElementById('confirm-serial').value = '';
  document.getElementById('confirm-hint').textContent = 'Expected: ' + modalSerial;

  var savedOp = localStorage.getItem('burnin_operator') || '';
  document.getElementById('operator-input').value = savedOp;

  // Init drag on first open (list is in static DOM)
  initStageDrag('stage-order-list');

  handleStageChange(); // sets warning visibility + title + validates

  document.getElementById('start-modal').removeAttribute('hidden');
  setTimeout(function () {
    document.getElementById('operator-input').focus();
  }, 50);
}

function closeModal() {
  document.getElementById('start-modal').setAttribute('hidden', '');
  modalDriveId = null;
  modalSerial = null;
}

function validateModal() {
  var operator = (document.getElementById('operator-input').value || '').trim();
  var surfaceChecked = document.getElementById('stage-surface') && document.getElementById('stage-surface').checked;
  var shortChecked = document.getElementById('stage-short') && document.getElementById('stage-short').checked;
  var longChecked = document.getElementById('stage-long') && document.getElementById('stage-long').checked;
  var anyStage = surfaceChecked || shortChecked || longChecked;

  var valid;
  if (surfaceChecked) {
    var typed = (document.getElementById('confirm-serial').value || '').trim();
    valid = operator.length > 0 && typed === modalSerial && modalSerial !== '' && anyStage;
  } else {
    valid = operator.length > 0 && anyStage;
  }
  document.getElementById('modal-start-btn').disabled = !valid;
}

async function submitStart() {
  var operator = (document.getElementById('operator-input').value || '').trim();
  localStorage.setItem('burnin_operator', operator);

  var runSurface = document.getElementById('stage-surface').checked;
  var runShort = document.getElementById('stage-short').checked;
  var runLong = document.getElementById('stage-long').checked;
  var stageOrder = getStageOrder('stage-order-list');

  try {
    var resp = await fetch('/api/v1/burnin/start', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        drive_ids: [parseInt(modalDriveId, 10)],
        operator: operator,
        run_surface: runSurface,
        run_short: runShort,
        run_long: runLong,
        stage_order: stageOrder,
      }),
    });

    var data = await resp.json();
    if (!resp.ok) {
      showToast(data.detail || 'Failed to start burn-in', 'error');
      return;
    }

    closeModal();
    showToast('Burn-in queued: ' + _stageLabel(), 'success');
  } catch (err) {
    showToast('Network error', 'error');
  }
}

// -----------------------------------------------------------------------
// Batch Burn-In
// -----------------------------------------------------------------------

var checkedDriveIds = new Set();

function updateBatchBar() {
  var bar = document.getElementById('batch-bar');
  if (!bar) return;
  var count = checkedDriveIds.size;
  bar.hidden = count === 0;
  var countEl = document.getElementById('batch-count');
  if (countEl) countEl.textContent = count;
}

function restoreCheckboxes() {
  // Re-check boxes that were checked before the SSE swap
  document.querySelectorAll('.drive-checkbox').forEach(function (cb) {
    cb.checked = checkedDriveIds.has(cb.dataset.driveId);
  });

  // Update select-all state
  var selectAll = document.getElementById('select-all-cb');
  if (selectAll) {
    var allBoxes = document.querySelectorAll('.drive-checkbox');
    selectAll.checked = allBoxes.length > 0 && Array.from(allBoxes).every(function (c) { return c.checked; });
    selectAll.indeterminate = checkedDriveIds.size > 0 && !selectAll.checked;
  }

  updateBatchBar();
}

// Toggle individual checkbox
document.addEventListener('change', function (e) {
  if (e.target.classList.contains('drive-checkbox')) {
    var id = e.target.dataset.driveId;
    if (e.target.checked) {
      checkedDriveIds.add(id);
    } else {
      checkedDriveIds.delete(id);
    }
    updateBatchBar();
    // Update select-all indeterminate state
    var selectAll = document.getElementById('select-all-cb');
    if (selectAll) {
      var allBoxes = Array.from(document.querySelectorAll('.drive-checkbox'));
      selectAll.checked = allBoxes.length > 0 && allBoxes.every(function (c) { return c.checked; });
      selectAll.indeterminate = checkedDriveIds.size > 0 && !selectAll.checked;
    }
    return;
  }

  // Select-all checkbox
  if (e.target.id === 'select-all-cb') {
    var boxes = document.querySelectorAll('.drive-checkbox');
    boxes.forEach(function (cb) {
      cb.checked = e.target.checked;
      if (e.target.checked) {
        checkedDriveIds.add(cb.dataset.driveId);
      } else {
        checkedDriveIds.delete(cb.dataset.driveId);
      }
    });
    updateBatchBar();
    return;
  }

  // Batch modal inputs validation
  if (['batch-confirm-cb', 'batch-stage-surface', 'batch-stage-short', 'batch-stage-long'].indexOf(e.target.id) !== -1) {
    validateBatchModal();
  }
});

// Batch bar buttons
document.addEventListener('click', function (e) {
  if (e.target.id === 'batch-start-btn' || e.target.closest('#batch-start-btn')) {
    openBatchModal();
    return;
  }
  if (e.target.id === 'batch-clear-btn') {
    checkedDriveIds.clear();
    document.querySelectorAll('.drive-checkbox').forEach(function (cb) { cb.checked = false; });
    var sa = document.getElementById('select-all-cb');
    if (sa) { sa.checked = false; sa.indeterminate = false; }
    updateBatchBar();
    return;
  }
});

function openBatchModal() {
  var modal = document.getElementById('batch-modal');
  if (!modal) return;
  var savedOp = localStorage.getItem('burnin_operator') || '';
  document.getElementById('batch-operator-input').value = savedOp;
  document.getElementById('batch-confirm-cb').checked = false;
  // Reset stages to all-on (keep user's drag order)
  ['batch-stage-surface', 'batch-stage-short', 'batch-stage-long'].forEach(function (id) {
    var el = document.getElementById(id);
    if (el) el.checked = true;
  });
  var countEls = document.querySelectorAll('#batch-modal-count, #batch-modal-count-btn');
  countEls.forEach(function (el) { el.textContent = checkedDriveIds.size; });
  // Init drag on first open
  initStageDrag('batch-stage-order-list');
  validateBatchModal();
  modal.removeAttribute('hidden');
  setTimeout(function () {
    document.getElementById('batch-operator-input').focus();
  }, 50);
}

function closeBatchModal() {
  var modal = document.getElementById('batch-modal');
  if (modal) modal.setAttribute('hidden', '');
}

function validateBatchModal() {
  var operator = (document.getElementById('batch-operator-input').value || '').trim();
  var surfaceEl = document.getElementById('batch-stage-surface');
  var surfaceChecked = surfaceEl && surfaceEl.checked;

  // Show/hide destructive warning and confirm checkbox based on surface selection
  var warning = document.getElementById('batch-surface-warning');
  var confirmWrap = document.getElementById('batch-confirm-wrap');
  if (warning) warning.style.display = surfaceChecked ? '' : 'none';
  if (confirmWrap) confirmWrap.style.display = surfaceChecked ? '' : 'none';

  var shortEl = document.getElementById('batch-stage-short');
  var longEl = document.getElementById('batch-stage-long');
  var anyStage = surfaceChecked ||
    (shortEl && shortEl.checked) ||
    (longEl && longEl.checked);

  var valid;
  if (surfaceChecked) {
    var confirmed = document.getElementById('batch-confirm-cb').checked;
    valid = operator.length > 0 && confirmed && anyStage;
  } else {
    valid = operator.length > 0 && anyStage;
  }

  var btn = document.getElementById('batch-modal-start-btn');
  if (btn) btn.disabled = !valid;
}

document.addEventListener('input', function (e) {
  if (e.target.id === 'operator-input' || e.target.id === 'confirm-serial') validateModal();
  if (e.target.id === 'batch-operator-input') validateBatchModal();
});

async function submitBatchStart() {
  var operator = (document.getElementById('batch-operator-input').value || '').trim();
  localStorage.setItem('burnin_operator', operator);

  var ids = Array.from(checkedDriveIds).map(function (id) { return parseInt(id, 10); });
  if (ids.length === 0) return;

  var btn = document.getElementById('batch-modal-start-btn');
  if (btn) btn.disabled = true;

  var runSurface = document.getElementById('batch-stage-surface').checked;
  var runShort = document.getElementById('batch-stage-short').checked;
  var runLong = document.getElementById('batch-stage-long').checked;
  var stageOrder = getStageOrder('batch-stage-order-list');

  try {
    var resp = await fetch('/api/v1/burnin/start', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        drive_ids: ids,
        operator: operator,
        run_surface: runSurface,
        run_short: runShort,
        run_long: runLong,
        stage_order: stageOrder,
      }),
    });
    var data = await resp.json();
    if (!resp.ok) {
      showToast(data.detail || 'Failed to queue batch', 'error');
      if (btn) btn.disabled = false;
      return;
    }

    closeBatchModal();
    checkedDriveIds.clear();
    updateBatchBar();
    var queued = (data.queued || []).length;
    var errors = (data.errors || []).length;
    var msg = queued + ' burn-in(s) queued';
    if (errors) msg += ', ' + errors + ' skipped (already active)';
    showToast(msg, errors && !queued ? 'error' : 'success');
  } catch (err) {
    showToast('Network error', 'error');
    if (btn) btn.disabled = false;
  }
}

// -----------------------------------------------------------------------
// Cancel burn-in (individual)
// -----------------------------------------------------------------------

async function cancelBurnin(btn) {
  var jobId = btn.dataset.jobId;
  var operator = localStorage.getItem('burnin_operator') || 'unknown';

  if (!confirm('Cancel this burn-in job? This cannot be undone.')) return;

  btn.disabled = true;
  try {
    var resp = await fetch('/api/v1/burnin/' + jobId + '/cancel', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ operator: operator }),
    });

    if (resp.ok) {
      showToast('Burn-in cancelled', 'info');
    } else {
      var data = await resp.json();
      showToast(data.detail || 'Failed to cancel', 'error');
      btn.disabled = false;
    }
  } catch (err) {
    showToast('Network error', 'error');
    btn.disabled = false;
  }
}

// -----------------------------------------------------------------------
// Delegated event handlers (work after SSE swaps)
// -----------------------------------------------------------------------

document.addEventListener('click', function (e) {
  // Short / Long SMART start buttons
  var smartBtn = e.target.closest('.btn-smart-short, .btn-smart-long');
  if (smartBtn && !smartBtn.disabled) { startSmartTest(smartBtn); return; }

  // Cancel SMART test buttons
  var cancelSmartBtn = e.target.closest('.btn-cancel-smart');
  if (cancelSmartBtn && !cancelSmartBtn.disabled) { cancelSmartTest(cancelSmartBtn); return; }

  // Burn-in start button (single drive)
  var startBtn = e.target.closest('.btn-start');
  if (startBtn && !startBtn.disabled) { openModal(startBtn); return; }

  // Cancel burn-in button (individual)
  var cancelBtn = e.target.closest('.btn-cancel');
  if (cancelBtn) { cancelBurnin(cancelBtn); return; }

  // Cancel ALL running burn-ins
  if (e.target.id === 'cancel-all-btn' || e.target.closest('#cancel-all-btn')) {
    cancelAllBurnins();
    return;
  }

  // Single-drive modal close
  if (e.target.closest('#modal-close-btn') || e.target.closest('#modal-cancel-btn')) {
    closeModal();
    return;
  }
  if (e.target.id === 'start-modal') {
    closeModal();
    return;
  }
  if (e.target.id === 'modal-start-btn') {
    submitStart();
    return;
  }

  // Batch modal close
  if (e.target.closest('#batch-modal-close-btn') || e.target.closest('#batch-modal-cancel-btn')) {
    closeBatchModal();
    return;
  }
  if (e.target.id === 'batch-modal') {
    closeBatchModal();
    return;
  }
  if (e.target.id === 'batch-modal-start-btn') {
    submitBatchStart();
    return;
  }
});

document.addEventListener('keydown', function (e) {
  if (e.key === 'Escape') {
    var modal = document.getElementById('start-modal');
    if (modal && !modal.hidden) { closeModal(); return; }
    var bModal = document.getElementById('batch-modal');
    if (bModal && !bModal.hidden) { closeBatchModal(); return; }
  }
});

}());
55
app/templates/audit.html
Normal file
@ -0,0 +1,55 @@
{% extends "layout.html" %}

{% block title %}TrueNAS Burn-In — Audit Log{% endblock %}

{% block content %}
<div class="page-toolbar">
  <h1 class="page-title">Audit Log</h1>
  <div class="toolbar-right">
    <span class="page-subtitle">Last 200 events</span>
  </div>
</div>

<div class="table-wrap">
  <table>
    <thead>
      <tr>
        <th class="col-job">#</th>
        <th>Time</th>
        <th>Event</th>
        <th class="col-drive">Drive</th>
        <th>Operator</th>
        <th>Message</th>
      </tr>
    </thead>
    <tbody>
      {% if events %}
        {% for e in events %}
          {% set color = event_colors.get(e.event_type, 'unknown') %}
          <tr>
            <td class="mono text-muted">{{ e.id }}</td>
            <td class="mono text-muted" style="white-space:nowrap">{{ e.created_at | format_dt_full }}</td>
            <td>
              <span class="chip chip-{{ color }}">{{ e.event_type | replace('_', ' ') }}</span>
            </td>
            <td class="col-drive">
              {% if e.devname %}
                <span class="drive-name">{{ e.devname }}</span>
                {% if e.serial %}<span class="drive-model mono">{{ e.serial }}</span>{% endif %}
              {% else %}
                <span class="cell-empty">—</span>
              {% endif %}
            </td>
            <td class="text-muted">{{ e.operator or '—' }}</td>
            <td class="text-muted" style="max-width:360px;white-space:normal">{{ e.message }}</td>
          </tr>
        {% endfor %}
      {% else %}
        <tr>
          <td colspan="6" class="empty-state">No audit events yet.</td>
        </tr>
      {% endif %}
    </tbody>
  </table>
</div>
{% endblock %}
174
app/templates/components/drives_table.html
Normal file
@ -0,0 +1,174 @@
{%- macro smart_cell(smart) -%}
<div class="smart-cell">
  {%- if smart.state == 'running' -%}
    <div class="progress-wrap">
      <div class="progress-bar">
        <div class="progress-fill" style="width:{{ smart.percent or 0 }}%"></div>
      </div>
      <span class="progress-pct">{{ smart.percent or 0 }}%</span>
    </div>
    {%- if smart.eta_seconds %}
      <div class="eta-text">{{ smart.eta_seconds | format_eta }}</div>
    {%- endif %}
  {%- elif smart.state == 'passed' -%}
    <span class="chip chip-passed">Passed</span>
  {%- elif smart.state == 'failed' -%}
    <span class="chip chip-failed" title="{{ smart.error_text or '' }}">Failed</span>
  {%- elif smart.state == 'aborted' -%}
    <span class="chip chip-aborted">Aborted</span>
  {%- else -%}
    <span class="cell-empty">—</span>
  {%- endif -%}
</div>
{%- endmacro -%}

{%- macro burnin_cell(bi) -%}
<div class="burnin-cell">
  {%- if bi is none -%}
    <span class="cell-empty">—</span>
  {%- elif bi.state == 'queued' -%}
    <span class="chip chip-queued">Queued</span>
  {%- elif bi.state == 'running' -%}
    <div class="progress-wrap">
      <div class="progress-bar">
        <div class="progress-fill progress-fill-green" style="width:{{ bi.percent or 0 }}%"></div>
      </div>
      <span class="progress-pct">{{ bi.percent or 0 }}%</span>
    </div>
    <div class="burnin-meta">
      {%- if bi.stage_name %}
        <span class="stage-name">{{ bi.stage_name | replace('_', ' ') | title }}</span>
      {%- endif %}
      {%- if bi.started_at %}
        <span class="elapsed-timer" data-started="{{ bi.started_at }}">{{ bi.started_at | format_elapsed }}</span>
      {%- endif %}
    </div>
  {%- elif bi.state == 'passed' -%}
    <span class="chip chip-passed">Passed</span>
  {%- elif bi.state == 'failed' -%}
    <span class="chip chip-failed">Failed{% if bi.stage_name %} ({{ bi.stage_name | replace('_',' ') }}){% endif %}</span>
  {%- elif bi.state == 'cancelled' -%}
    <span class="chip chip-aborted">Cancelled</span>
  {%- elif bi.state == 'unknown' -%}
    <span class="chip chip-unknown">Unknown</span>
  {%- else -%}
    <span class="cell-empty">—</span>
  {%- endif -%}
</div>
{%- endmacro -%}

<table>
  <thead>
    <tr>
      <th class="col-check">
        <input type="checkbox" id="select-all-cb" class="drive-cb" title="Select all idle drives">
      </th>
      <th class="col-drive">Drive</th>
      <th class="col-serial">Serial</th>
      <th class="col-size">Size</th>
      <th class="col-temp">Temp</th>
      <th class="col-health">Health</th>
      <th class="col-smart">Short SMART</th>
      <th class="col-smart">Long SMART</th>
      <th class="col-burnin">Burn-In</th>
      <th class="col-actions">Actions</th>
    </tr>
  </thead>
  <tbody id="drives-tbody">
    {%- if drives %}
    {%- for drive in drives %}
    {%- set bi_active = drive.burnin and drive.burnin.state in ('queued', 'running') %}
    {%- set short_busy = drive.smart_short and drive.smart_short.state == 'running' %}
    {%- set long_busy = drive.smart_long and drive.smart_long.state == 'running' %}
    {%- set selectable = not bi_active and not short_busy and not long_busy %}
    <tr data-status="{{ drive.status }}" id="drive-{{ drive.id }}">
      <td class="col-check">
        {%- if selectable %}
        <input type="checkbox" class="drive-checkbox" data-drive-id="{{ drive.id }}">
        {%- endif %}
      </td>
      <td class="col-drive">
        <span class="drive-name">{{ drive.devname }}</span>
        <span class="drive-model">{{ drive.model or "Unknown" }}</span>
        {%- if drive.location %}
        <span class="drive-location"
              data-drive-id="{{ drive.id }}"
              data-field="location"
              title="Click to edit location">{{ drive.location }}</span>
        {%- else %}
        <span class="drive-location drive-location-empty"
              data-drive-id="{{ drive.id }}"
              data-field="location"
              title="Click to set location">+ location</span>
        {%- endif %}
      </td>
      <td class="col-serial mono">{{ drive.serial or "—" }}</td>
      <td class="col-size">{{ drive.size_bytes | format_bytes }}</td>
      <td class="col-temp">
        {%- if drive.temperature_c is not none %}
        <span class="temp {{ drive.temperature_c | temp_class }}">{{ drive.temperature_c }}°C</span>
        {%- else %}
        <span class="cell-empty">—</span>
        {%- endif %}
      </td>
      <td class="col-health">
        <span class="chip chip-{{ drive.smart_health | lower }}">{{ drive.smart_health }}</span>
      </td>
      <td class="col-smart">{{ smart_cell(drive.smart_short) }}</td>
      <td class="col-smart">{{ smart_cell(drive.smart_long) }}</td>
      <td class="col-burnin">{{ burnin_cell(drive.burnin) }}</td>
      <td class="col-actions">
        <div class="action-group">
          {%- if bi_active %}
          <!-- Burn-in running/queued: only show cancel -->
          <button class="btn-action btn-cancel"
                  data-job-id="{{ drive.burnin.id }}">✕ Burn-In</button>
          {%- else %}
          <!-- Short SMART: show cancel if running, else start button -->
          {%- if short_busy %}
          <button class="btn-action btn-cancel-smart"
                  data-drive-id="{{ drive.id }}"
                  data-test-type="short"
                  title="Cancel Short SMART test">✕ Short</button>
          {%- else %}
          <button class="btn-action btn-smart-short{% if long_busy %} btn-disabled{% endif %}"
                  data-drive-id="{{ drive.id }}"
                  data-test-type="SHORT"
                  {% if long_busy %}disabled{% endif %}
                  title="Start Short SMART test">Short</button>
|
||||
{%- endif %}
|
||||
<!-- Long SMART: show cancel if running, else start button -->
|
||||
{%- if long_busy %}
|
||||
<button class="btn-action btn-cancel-smart"
|
||||
data-drive-id="{{ drive.id }}"
|
||||
data-test-type="long"
|
||||
title="Cancel Long SMART test">✕ Long</button>
|
||||
{%- else %}
|
||||
<button class="btn-action btn-smart-long{% if short_busy %} btn-disabled{% endif %}"
|
||||
data-drive-id="{{ drive.id }}"
|
||||
data-test-type="LONG"
|
||||
{% if short_busy %}disabled{% endif %}
|
||||
title="Start Long SMART test (~several hours)">Long</button>
|
||||
{%- endif %}
|
||||
<!-- Burn-In -->
|
||||
<button class="btn-action btn-start{% if short_busy or long_busy %} btn-disabled{% endif %}"
|
||||
data-drive-id="{{ drive.id }}"
|
||||
data-devname="{{ drive.devname }}"
|
||||
data-serial="{{ drive.serial or '' }}"
|
||||
data-model="{{ drive.model or 'Unknown' }}"
|
||||
data-size="{{ drive.size_bytes | format_bytes }}"
|
||||
data-health="{{ drive.smart_health }}"
|
||||
{% if short_busy or long_busy %}disabled{% endif %}
|
||||
title="Start Burn-In">Burn-In</button>
|
||||
{%- endif %}
|
||||
</div>
|
||||
</td>
|
||||
</tr>
|
||||
{%- endfor %}
|
||||
{%- else %}
|
||||
<tr>
|
||||
<td colspan="10" class="empty-state">No drives found. Waiting for first poll…</td>
|
||||
</tr>
|
||||
{%- endif %}
|
||||
</tbody>
|
||||
</table>
|
||||
73
app/templates/components/modal_batch.html
Normal file

@@ -0,0 +1,73 @@
<div id="batch-modal" class="modal-overlay" hidden aria-modal="true" role="dialog">
  <div class="modal">
    <div class="modal-header">
      <h2 class="modal-title">Batch Burn-In</h2>
      <button class="modal-close" id="batch-modal-close-btn" aria-label="Close">✕</button>
    </div>

    <div class="modal-body">

      <!-- Stage selection (drag to reorder) -->
      <div class="form-group">
        <div class="form-label">Stages to run <span class="stage-drag-hint">— drag to reorder</span></div>
        <div class="stage-checks" id="batch-stage-order-list">
          <label class="stage-check" draggable="true">
            <span class="drag-handle" title="Drag to reorder">⠿</span>
            <input type="checkbox" id="batch-stage-short" checked onchange="validateBatchModal()">
            <span>
              <strong>Short SMART</strong>
              <span class="stage-note-inline">— non-destructive, ~2 min</span>
            </span>
          </label>
          <label class="stage-check" draggable="true">
            <span class="drag-handle" title="Drag to reorder">⠿</span>
            <input type="checkbox" id="batch-stage-long" checked onchange="validateBatchModal()">
            <span>
              <strong>Long SMART</strong>
              <span class="stage-note-inline">— non-destructive, ~several hours</span>
            </span>
          </label>
          <label class="stage-check" draggable="true">
            <span class="drag-handle" title="Drag to reorder">⠿</span>
            <input type="checkbox" id="batch-stage-surface" checked onchange="validateBatchModal()">
            <span>
              <strong>Surface Validate</strong>
              <span class="stage-tag stage-tag-destructive">destructive</span>
              <span class="stage-note-inline">— full-surface write test (slow)</span>
            </span>
          </label>
        </div>
        <div class="stage-always-note">Precheck & final health check always run.</div>
      </div>

      <!-- Destructive warning — shown only when surface validate is selected -->
      <div id="batch-surface-warning" class="confirm-warning">
        ⚠ Surface Validate will <strong>permanently overwrite all data</strong> on
        <strong><span id="batch-modal-count">0</span> selected drive(s)</strong>.
      </div>

      <!-- Operator -->
      <div class="form-group">
        <label class="form-label" for="batch-operator-input">Operator</label>
        <input class="form-input" type="text" id="batch-operator-input"
               placeholder="Your name" autocomplete="name" maxlength="64">
      </div>

      <!-- Confirmation checkbox — shown only when surface validate is selected -->
      <div class="form-group" id="batch-confirm-wrap">
        <label class="confirm-check-label">
          <input type="checkbox" id="batch-confirm-cb">
          <span>I understand this will <strong>permanently destroy all data</strong> on the selected drives</span>
        </label>
      </div>

    </div>

    <div class="modal-footer">
      <button class="btn-secondary" id="batch-modal-cancel-btn">Cancel</button>
      <button class="btn-danger" id="batch-modal-start-btn" disabled>
        Start <span id="batch-modal-count-btn">0</span> Burn-In(s)
      </button>
    </div>
  </div>
</div>
87
app/templates/components/modal_start.html
Normal file

@@ -0,0 +1,87 @@
<div id="start-modal" class="modal-overlay" hidden aria-modal="true" role="dialog">
  <div class="modal">
    <div class="modal-header">
      <h2 class="modal-title" id="modal-title">Burn-In</h2>
      <button class="modal-close" id="modal-close-btn" aria-label="Close">✕</button>
    </div>

    <div class="modal-body">

      <!-- Drive info -->
      <div class="modal-drive-info">
        <div class="modal-drive-row">
          <span class="modal-devname" id="modal-devname">—</span>
          <span class="chip chip-unknown" id="modal-health">UNKNOWN</span>
        </div>
        <div class="modal-drive-sub">
          <span id="modal-model">—</span>
          ·
          <span id="modal-size">—</span>
          ·
          <span class="mono" id="modal-serial-display">—</span>
        </div>
      </div>

      <!-- Stage selection (drag to reorder) -->
      <div class="form-group">
        <div class="form-label">Stages to run <span class="stage-drag-hint">— drag to reorder</span></div>
        <div class="stage-checks" id="stage-order-list">
          <label class="stage-check" draggable="true">
            <span class="drag-handle" title="Drag to reorder">⠿</span>
            <input type="checkbox" id="stage-short" checked onchange="handleStageChange()">
            <span>
              <strong>Short SMART</strong>
              <span class="stage-note-inline">— non-destructive, ~2 min</span>
            </span>
          </label>
          <label class="stage-check" draggable="true">
            <span class="drag-handle" title="Drag to reorder">⠿</span>
            <input type="checkbox" id="stage-long" checked onchange="handleStageChange()">
            <span>
              <strong>Long SMART</strong>
              <span class="stage-note-inline">— non-destructive, ~several hours</span>
            </span>
          </label>
          <label class="stage-check" draggable="true">
            <span class="drag-handle" title="Drag to reorder">⠿</span>
            <input type="checkbox" id="stage-surface" checked onchange="handleStageChange()">
            <span>
              <strong>Surface Validate</strong>
              <span class="stage-tag stage-tag-destructive">destructive</span>
              <span class="stage-note-inline">— full-surface write test (slow)</span>
            </span>
          </label>
        </div>
        <div class="stage-always-note">Precheck & final health check always run.</div>
      </div>

      <!-- Destructive warning — shown only when surface validate is selected -->
      <div id="surface-warning" class="confirm-warning">
        ⚠ Surface Validate will <strong>permanently overwrite all data</strong> on this drive.
      </div>

      <!-- Operator name -->
      <div class="form-group">
        <label class="form-label" for="operator-input">Operator</label>
        <input class="form-input" type="text" id="operator-input"
               placeholder="Your name" autocomplete="name" maxlength="64">
      </div>

      <!-- Serial confirmation — shown only when surface validate is selected -->
      <div class="form-group" id="serial-field">
        <label class="form-label" for="confirm-serial">
          Type the serial number to confirm destructive test
        </label>
        <input class="form-input form-input-confirm" type="text" id="confirm-serial"
               placeholder="" autocomplete="off" spellcheck="false">
        <div class="confirm-hint" id="confirm-hint"></div>
      </div>

    </div>

    <div class="modal-footer">
      <button class="btn-secondary" id="modal-cancel-btn">Cancel</button>
      <button class="btn-danger" id="modal-start-btn" disabled>Start Burn-In</button>
    </div>
  </div>
</div>
74
app/templates/dashboard.html
Normal file

@@ -0,0 +1,74 @@
{% extends "layout.html" %}

{% block title %}TrueNAS Burn-In — Dashboard{% endblock %}

{% block content %}
{% include "components/modal_start.html" %}
{% include "components/modal_batch.html" %}

<!-- Stats bar — counts are updated live by app.js updateCounts() -->
<div class="stats-bar">
  <div class="stat-card" data-stat-filter="all">
    <span class="stat-value" id="stat-all">{{ drives | length }}</span>
    <span class="stat-label">Drives</span>
  </div>
  <div class="stat-card stat-running" data-stat-filter="running">
    <span class="stat-value" id="stat-running">0</span>
    <span class="stat-label">Running</span>
  </div>
  <a class="stat-card stat-failed" href="/history?state=failed" data-stat-filter="failed">
    <span class="stat-value" id="stat-failed">0</span>
    <span class="stat-label">Failed</span>
  </a>
  <div class="stat-card stat-passed" data-stat-filter="passed">
    <span class="stat-value" id="stat-passed">0</span>
    <span class="stat-label">Passed</span>
  </div>
  <div class="stat-card stat-idle" data-stat-filter="idle">
    <span class="stat-value" id="stat-idle">0</span>
    <span class="stat-label">Idle</span>
  </div>
</div>

<!-- Failed drive banner — shown/hidden by JS when failed count > 0 -->
<div id="failed-banner" class="banner banner-error" hidden>
  ✕ <strong><span class="failed-count">0</span> drive(s)</strong> have failed tests —
  <a href="/history?state=failed" style="color:inherit;text-decoration:underline">View history</a>
</div>

<div class="filter-bar" id="filter-bar">
  <button class="filter-btn active" data-filter="all">
    All <span class="badge">0</span>
  </button>
  <button class="filter-btn" data-filter="running">
    Running <span class="badge">0</span>
  </button>
  <button class="filter-btn" data-filter="failed">
    Failed <span class="badge">0</span>
  </button>
  <button class="filter-btn" data-filter="passed">
    Passed <span class="badge">0</span>
  </button>
  <button class="filter-btn" data-filter="idle">
    Idle <span class="badge">0</span>
  </button>

  <!-- Batch start bar (hidden until checkboxes are selected) -->
  <div id="batch-bar" class="batch-bar" hidden>
    <span class="batch-count-label"><span id="batch-count">0</span> selected</span>
    <button class="btn-batch-start" id="batch-start-btn">Start Burn-In</button>
    <button class="btn-batch-clear" id="batch-clear-btn">Clear</button>
  </div>

  <!-- Cancel all running burn-ins (shown by JS when any burn-in is active) -->
  <button class="btn-cancel-all" id="cancel-all-btn" hidden title="Cancel all running and queued burn-in jobs">✕ Cancel All Burn-Ins</button>
</div>

<div class="table-wrap">
  <div hx-ext="sse" sse-connect="/sse/drives">
    <div id="drives-table-wrap" sse-swap="drives-update" hx-swap="innerHTML">
      {% include "components/drives_table.html" %}
    </div>
  </div>
</div>
{% endblock %}
93
app/templates/history.html
Normal file

@@ -0,0 +1,93 @@
{% extends "layout.html" %}

{% block title %}TrueNAS Burn-In — History{% endblock %}

{% block content %}
<div class="page-toolbar">
  <h1 class="page-title">Burn-In History</h1>
  <div class="toolbar-right">
    <a class="btn-export" href="/api/v1/burnin/export.csv">Export CSV</a>
  </div>
</div>

<!-- State filter tabs -->
<div class="filter-bar" style="margin-bottom: 14px;">
  {% set states = [('all','All'), ('passed','Passed'), ('failed','Failed'), ('cancelled','Cancelled'), ('running','Running'), ('unknown','Unknown')] %}
  {% for val, label in states %}
  <a class="filter-btn{% if active_state == val %} active{% endif %}"
     href="/history?state={{ val }}&page=1">
    {{ label }}
    {% if val in counts %}<span class="badge">{{ counts[val] }}</span>{% endif %}
  </a>
  {% endfor %}
</div>

<div class="table-wrap">
  <table>
    <thead>
      <tr>
        <th class="col-job">#</th>
        <th class="col-drive">Drive</th>
        <th>Profile</th>
        <th>State</th>
        <th>Operator</th>
        <th>Started</th>
        <th>Duration</th>
        <th>Error</th>
        <th class="col-actions"></th>
      </tr>
    </thead>
    <tbody>
      {% if jobs %}
      {% for j in jobs %}
      <tr>
        <td class="mono text-muted">{{ j.id }}</td>
        <td class="col-drive">
          <span class="drive-name">{{ j.devname }}</span>
          <span class="drive-model">{{ j.serial }}</span>
        </td>
        <td>
          <span class="chip chip-{{ 'red' if j.profile == 'full' else 'gray' }}">{{ j.profile }}</span>
        </td>
        <td>
          <span class="chip chip-{{ j.state }}">{{ j.state }}</span>
        </td>
        <td class="text-muted">{{ j.operator or '—' }}</td>
        <td class="mono text-muted">{{ j.started_at | format_dt_full }}</td>
        <td class="mono text-muted">{{ j.duration_seconds | format_duration }}</td>
        <td class="error-cell">
          {% if j.error_text %}
          <span class="error-snippet" title="{{ j.error_text }}">{{ j.error_text[:60] }}{% if j.error_text | length > 60 %}…{% endif %}</span>
          {% else %}—{% endif %}
        </td>
        <td>
          <a class="btn-detail" href="/history/{{ j.id }}">Detail</a>
        </td>
      </tr>
      {% endfor %}
      {% else %}
      <tr>
        <td colspan="9" class="empty-state">No burn-in jobs found.</td>
      </tr>
      {% endif %}
    </tbody>
  </table>
</div>

<!-- Pagination -->
{% if total_pages > 1 %}
<div class="pagination">
  {% if page > 1 %}
  <a class="page-btn" href="/history?state={{ active_state }}&page={{ page - 1 }}">← Prev</a>
  {% endif %}
  <span class="page-info">Page {{ page }} of {{ total_pages }} · {{ total_count }} jobs</span>
  {% if page < total_pages %}
  <a class="page-btn" href="/history?state={{ active_state }}&page={{ page + 1 }}">Next →</a>
  {% endif %}
</div>
{% else %}
<div class="pagination">
  <span class="page-info">{{ total_count }} job{% if total_count != 1 %}s{% endif %}</span>
</div>
{% endif %}
{% endblock %}
122
app/templates/job_detail.html
Normal file

@@ -0,0 +1,122 @@
{% extends "layout.html" %}

{% block title %}TrueNAS Burn-In — Job #{{ job.id }}{% endblock %}

{% block content %}
<div class="page-toolbar">
  <div class="breadcrumb">
    <a href="/history">History</a>
    <span class="breadcrumb-sep">›</span>
    <span>Job #{{ job.id }}</span>
  </div>
  <div class="toolbar-right">
    <a class="btn-export" href="/history/{{ job.id }}/print" target="_blank" rel="noopener">
      Print / Export
    </a>
  </div>
</div>

<!-- Summary cards -->
<div class="detail-grid">

  <!-- Drive info -->
  <div class="detail-card">
    <div class="detail-card-title">Drive</div>
    <div class="detail-rows">
      <div class="detail-row">
        <span class="detail-label">Device</span>
        <span class="detail-value drive-name">{{ job.devname }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">Model</span>
        <span class="detail-value">{{ job.model or '—' }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">Serial</span>
        <span class="detail-value mono">{{ job.serial or '—' }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">Size</span>
        <span class="detail-value">{{ job.size_bytes | format_bytes }}</span>
      </div>
    </div>
  </div>

  <!-- Job summary -->
  <div class="detail-card">
    <div class="detail-card-title">Job</div>
    <div class="detail-rows">
      <div class="detail-row">
        <span class="detail-label">Profile</span>
        <span class="chip chip-{{ 'red' if job.profile == 'full' else 'gray' }}">{{ job.profile }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">State</span>
        <span class="chip chip-{{ job.state }}">{{ job.state }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">Operator</span>
        <span class="detail-value">{{ job.operator or '—' }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">Created</span>
        <span class="detail-value mono">{{ job.created_at | format_dt_full }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">Started</span>
        <span class="detail-value mono">{{ job.started_at | format_dt_full }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">Finished</span>
        <span class="detail-value mono">{{ job.finished_at | format_dt_full }}</span>
      </div>
      <div class="detail-row">
        <span class="detail-label">Duration</span>
        <span class="detail-value mono">{{ job.duration_seconds | format_duration }}</span>
      </div>
    </div>
  </div>

</div>

{% if job.error_text %}
<div class="banner banner-error" style="margin-bottom: 14px; border-radius: 6px; border: 1px solid var(--red-bd);">
  ✕ {{ job.error_text }}
</div>
{% endif %}

<!-- Stages table -->
<h2 class="section-title">Stages</h2>
<div class="table-wrap">
  <table>
    <thead>
      <tr>
        <th>Stage</th>
        <th>State</th>
        <th>Started</th>
        <th>Duration</th>
        <th>Error</th>
      </tr>
    </thead>
    <tbody>
      {% for s in job.stages %}
      <tr>
        <td>
          <span class="stage-label">{{ s.stage_name.replace('_', ' ').title() }}</span>
        </td>
        <td>
          <span class="chip chip-{{ s.state }}">{{ s.state }}</span>
        </td>
        <td class="mono text-muted">{{ s.started_at | format_dt_full }}</td>
        <td class="mono text-muted">{{ s.duration_seconds | format_duration }}</td>
        <td>
          {% if s.error_text %}
          <span class="error-full">{{ s.error_text }}</span>
          {% else %}—{% endif %}
        </td>
      </tr>
      {% endfor %}
    </tbody>
  </table>
</div>
{% endblock %}
304
app/templates/job_print.html
Normal file

@@ -0,0 +1,304 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Burn-In Report — Job #{{ job.id }}</title>
<style>
  * { box-sizing: border-box; margin: 0; padding: 0; }

  :root {
    --green: #1a7431;
    --red: #b91c1c;
    --gray: #374151;
    --border: #d1d5db;
    --muted: #6b7280;
  }

  body {
    background: #fff;
    color: #111827;
    font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif;
    font-size: 13px;
    padding: 32px;
    max-width: 800px;
    margin: 0 auto;
  }

  /* ---- Header ---- */
  .print-header {
    display: flex;
    align-items: flex-start;
    justify-content: space-between;
    margin-bottom: 24px;
    padding-bottom: 16px;
    border-bottom: 2px solid var(--border);
  }

  .print-brand {
    font-size: 13px;
    color: var(--muted);
  }

  .print-brand strong {
    display: block;
    font-size: 16px;
    color: #111827;
  }

  .result-badge {
    font-size: 28px;
    font-weight: 800;
    letter-spacing: -0.02em;
    padding: 6px 20px;
    border-radius: 8px;
    border: 3px solid;
  }

  .result-badge.passed {
    color: var(--green);
    border-color: var(--green);
    background: #f0fdf4;
  }

  .result-badge.failed {
    color: var(--red);
    border-color: var(--red);
    background: #fef2f2;
  }

  .result-badge.cancelled,
  .result-badge.unknown {
    color: var(--gray);
    border-color: var(--gray);
    background: #f9fafb;
  }

  /* ---- Info grid ---- */
  .info-grid {
    display: grid;
    grid-template-columns: 1fr 1fr;
    gap: 16px;
    margin-bottom: 20px;
  }

  .info-card {
    border: 1px solid var(--border);
    border-radius: 6px;
    overflow: hidden;
  }

  .info-card-title {
    background: #f9fafb;
    border-bottom: 1px solid var(--border);
    padding: 6px 12px;
    font-size: 11px;
    font-weight: 600;
    text-transform: uppercase;
    letter-spacing: 0.06em;
    color: var(--muted);
  }

  .info-row {
    display: flex;
    justify-content: space-between;
    padding: 6px 12px;
    border-bottom: 1px solid #f3f4f6;
  }

  .info-row:last-child { border-bottom: none; }

  .info-label { color: var(--muted); font-size: 12px; }

  .info-value {
    font-weight: 500;
    text-align: right;
    font-size: 12px;
  }

  /* ---- Error box ---- */
  .error-box {
    background: #fef2f2;
    border: 1px solid #fca5a5;
    border-radius: 6px;
    padding: 10px 14px;
    color: var(--red);
    font-size: 12px;
    margin-bottom: 16px;
  }

  /* ---- Stages table ---- */
  h2 {
    font-size: 12px;
    font-weight: 600;
    text-transform: uppercase;
    letter-spacing: 0.06em;
    color: var(--muted);
    margin-bottom: 8px;
  }

  table { width: 100%; border-collapse: collapse; }

  th {
    background: #f9fafb;
    border: 1px solid var(--border);
    padding: 6px 10px;
    font-size: 11px;
    font-weight: 600;
    text-transform: uppercase;
    letter-spacing: 0.05em;
    color: var(--muted);
    text-align: left;
  }

  td {
    border: 1px solid var(--border);
    padding: 7px 10px;
    font-size: 12px;
    vertical-align: middle;
  }

  .s-passed { color: var(--green); font-weight: 600; }
  .s-failed { color: var(--red); font-weight: 600; }
  .s-other { color: var(--muted); }

  /* ---- QR + footer ---- */
  .print-footer {
    display: flex;
    align-items: flex-end;
    justify-content: space-between;
    margin-top: 24px;
    padding-top: 16px;
    border-top: 1px solid var(--border);
  }

  .print-footer-note {
    font-size: 11px;
    color: var(--muted);
    line-height: 1.6;
  }

  #qrcode canvas, #qrcode img { display: block; }

  /* ---- Print styles ---- */
  @media print {
    body { padding: 16px; }
    a { color: inherit; }

    .result-badge.passed { -webkit-print-color-adjust: exact; print-color-adjust: exact; }
    .result-badge.failed { -webkit-print-color-adjust: exact; print-color-adjust: exact; }
    .info-card-title { -webkit-print-color-adjust: exact; print-color-adjust: exact; }
    .s-passed, .s-failed { -webkit-print-color-adjust: exact; print-color-adjust: exact; }
  }
</style>
</head>
<body>

<div class="print-header">
  <div class="print-brand">
    <strong>TrueNAS Burn-In Dashboard</strong>
    Job #{{ job.id }} · {{ job.created_at | format_dt_full }}
  </div>
  <div class="result-badge {{ job.state }}">
    {{ job.state | upper }}
  </div>
</div>

<div class="info-grid">

  <div class="info-card">
    <div class="info-card-title">Drive</div>
    <div class="info-row">
      <span class="info-label">Device</span>
      <span class="info-value" style="font-size:14px;font-weight:700">{{ job.devname }}</span>
    </div>
    <div class="info-row">
      <span class="info-label">Model</span>
      <span class="info-value">{{ job.model or '—' }}</span>
    </div>
    <div class="info-row">
      <span class="info-label">Serial</span>
      <span class="info-value" style="font-family:monospace">{{ job.serial or '—' }}</span>
    </div>
    <div class="info-row">
      <span class="info-label">Size</span>
      <span class="info-value">{{ job.size_bytes | format_bytes }}</span>
    </div>
  </div>

  <div class="info-card">
    <div class="info-card-title">Job</div>
    <div class="info-row">
      <span class="info-label">Profile</span>
      <span class="info-value">{{ job.profile | title }}</span>
    </div>
    <div class="info-row">
      <span class="info-label">Operator</span>
      <span class="info-value">{{ job.operator or '—' }}</span>
    </div>
    <div class="info-row">
      <span class="info-label">Started</span>
      <span class="info-value" style="font-family:monospace">{{ job.started_at | format_dt_full }}</span>
    </div>
    <div class="info-row">
      <span class="info-label">Finished</span>
      <span class="info-value" style="font-family:monospace">{{ job.finished_at | format_dt_full }}</span>
    </div>
    <div class="info-row">
      <span class="info-label">Duration</span>
      <span class="info-value" style="font-family:monospace">{{ job.duration_seconds | format_duration }}</span>
    </div>
  </div>

</div>

{% if job.error_text %}
<div class="error-box">✕ {{ job.error_text }}</div>
{% endif %}

<h2>Stages</h2>
<table>
  <thead>
    <tr>
      <th>Stage</th>
      <th>Result</th>
      <th>Duration</th>
      <th>Notes</th>
    </tr>
  </thead>
  <tbody>
    {% for s in job.stages %}
    <tr>
      <td style="font-weight:500">{{ s.stage_name.replace('_', ' ').title() }}</td>
      <td class="s-{{ s.state if s.state in ('passed','failed') else 'other' }}">
        {{ s.state | upper }}
      </td>
      <td style="font-family:monospace">{{ s.duration_seconds | format_duration }}</td>
      <td style="color:#6b7280">{{ s.error_text or '—' }}</td>
    </tr>
    {% endfor %}
  </tbody>
</table>

<div class="print-footer">
  <div class="print-footer-note">
    Generated by TrueNAS Burn-In Dashboard<br>
    {{ job.finished_at | format_dt_full }}<br>
    Scan QR code to view full job details online
  </div>
  <div id="qrcode"></div>
</div>

<script src="https://cdnjs.cloudflare.com/ajax/libs/qrcodejs/1.0.0/qrcode.min.js"
        integrity="sha512-CNgIRecGo7nphbeZ04Sc13ka07paqdeTu0WR1IM4kNcpmBAXAIn1G+hNMtOE4lK7QTqBgQRmB/gFgQiRp8iKg=="
        crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script>
  new QRCode(document.getElementById('qrcode'), {
    text: window.location.origin + '/history/{{ job.id }}',
    width: 96, height: 96,
    colorDark: '#111827', colorLight: '#ffffff',
    correctLevel: QRCode.CorrectLevel.M
  });
</script>
</body>
</html>
64
app/templates/layout.html
Normal file

@@ -0,0 +1,64 @@
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>{% block title %}TrueNAS Burn-In{% endblock %}</title>
  <link rel="stylesheet" href="/static/app.css">
</head>
<body>

<header>
  <a class="header-brand" href="/" aria-label="Dashboard">
    <svg width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.75" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
      <rect x="2" y="2" width="20" height="8" rx="2" ry="2"></rect>
      <rect x="2" y="14" width="20" height="8" rx="2" ry="2"></rect>
      <line x1="6" y1="6" x2="6.01" y2="6"></line>
      <line x1="6" y1="18" x2="6.01" y2="18"></line>
    </svg>
    <span class="header-title">TrueNAS Burn-In</span>
  </a>
  <div class="header-meta">
    <span class="live-indicator">
      <span class="live-dot{% if poller and not poller.healthy %} degraded{% endif %}"></span>
      {% if poller and poller.healthy %}Live{% else %}Polling error{% endif %}
    </span>
    {% if poller and poller.last_poll_at %}
    <span class="poll-time">Last poll {{ poller.last_poll_at | format_dt }}</span>
    {% endif %}
    <button class="notif-btn" id="notif-btn" title="Enable browser notifications" aria-label="Toggle notifications">
      <svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
        <path d="M18 8A6 6 0 0 0 6 8c0 7-3 9-3 9h18s-3-2-3-9"></path>
        <path d="M13.73 21a2 2 0 0 1-3.46 0"></path>
      </svg>
    </button>
    <a class="header-link" href="/history">History</a>
    <a class="header-link" href="/stats">Stats</a>
    <a class="header-link" href="/audit">Audit</a>
    <a class="header-link" href="/settings">Settings</a>
    <a class="header-link" href="/docs" target="_blank" rel="noopener">API</a>
  </div>
</header>

{% if stale %}
<div class="banner banner-warn">
  ⚠ Data may be stale — no successful poll in over {{ stale_seconds }}s
</div>
{% endif %}

{% if poller and poller.last_error %}
<div class="banner banner-error">
  ✕ Poll error: {{ poller.last_error }}
</div>
{% endif %}

<main>
  {% block content %}{% endblock %}
</main>

<div id="toast-container" aria-live="polite"></div>
<script src="https://unpkg.com/htmx.org@2.0.3/dist/htmx.min.js"></script>
<script src="https://unpkg.com/htmx-ext-sse@2.2.2/sse.js"></script>
<script src="/static/app.js"></script>
</body>
</html>
303 app/templates/settings.html Normal file
@@ -0,0 +1,303 @@
{% extends "layout.html" %}

{% block title %}TrueNAS Burn-In — Settings{% endblock %}

{% block content %}
<div class="page-toolbar">
  <h1 class="page-title">Settings</h1>
  <div class="toolbar-right">
    <a class="btn-export" href="/docs" target="_blank" rel="noopener">API Docs</a>
  </div>
</div>
<p class="page-subtitle">
  Changes take effect immediately. Settings marked
  <span class="badge-restart">restart required</span> must be changed in <code>.env</code>.
</p>

<form id="settings-form" autocomplete="off">
  <div class="settings-two-col">

    <!-- LEFT: Email / SMTP + Webhook stacked -->
    <div class="settings-left-col">

      <div class="settings-card">
        <div class="settings-card-header">
          <span class="settings-card-title">Email (SMTP)</span>
          {% if smtp_enabled %}
            <span class="chip chip-passed" style="font-size:10px">Enabled</span>
          {% else %}
            <span class="chip chip-unknown" style="font-size:10px">Disabled — set Host to enable</span>
          {% endif %}
        </div>

        <!-- Compact horizontal field grid -->
        <div class="sf-fields">

          <!-- Test connection row — full width -->
          <div class="sf-full sf-row-test" style="margin-bottom:4px">
            <button type="button" id="test-smtp-btn" class="btn-secondary">Test Connection</button>
            <span id="smtp-test-result" class="settings-test-result" style="display:none"></span>
          </div>

          <label for="smtp_host">Host</label>
          <input class="sf-input" id="smtp_host" name="smtp_host" type="text"
                 value="{{ editable.smtp_host }}" placeholder="smtp.example.com">

          <label for="smtp_ssl_mode">Mode</label>
          <div class="sf-inline-group">
            <select class="sf-select" id="smtp_ssl_mode" name="smtp_ssl_mode">
              <option value="starttls" {% if editable.smtp_ssl_mode == 'starttls' %}selected{% endif %}>STARTTLS (587)</option>
              <option value="ssl" {% if editable.smtp_ssl_mode == 'ssl' %}selected{% endif %}>SSL / TLS (465)</option>
              <option value="plain" {% if editable.smtp_ssl_mode == 'plain' %}selected{% endif %}>Plain (25)</option>
            </select>
            <span class="sf-label-sm">Timeout</span>
            <input class="sf-input sf-input-xs" id="smtp_timeout" name="smtp_timeout"
                   type="number" min="5" max="300" value="{{ editable.smtp_timeout }}" style="width:52px">
          </div>

          <label for="smtp_user">Username</label>
          <input class="sf-input" id="smtp_user" name="smtp_user" type="text"
                 value="{{ editable.smtp_user }}" autocomplete="off">

          <label for="smtp_password">Password</label>
          <input class="sf-input" id="smtp_password" name="smtp_password" type="password"
                 placeholder="leave blank to keep existing" autocomplete="new-password">

          <label for="smtp_from">From</label>
          <input class="sf-input" id="smtp_from" name="smtp_from" type="text"
                 value="{{ editable.smtp_from }}" placeholder="burnin@example.com">

          <label for="smtp_to">To</label>
          <input class="sf-input" id="smtp_to" name="smtp_to" type="text"
                 value="{{ editable.smtp_to }}" placeholder="you@example.com">

        </div>
      </div>

      <!-- Webhook -->
      <div class="settings-card">
        <div class="settings-card-header">
          <span class="settings-card-title">Webhook</span>
        </div>
        <div class="sf-fields">
          <label for="webhook_url">URL</label>
          <div>
            <input class="sf-input" id="webhook_url" name="webhook_url" type="text"
                   value="{{ editable.webhook_url }}" placeholder="https://ntfy.sh/your-topic">
            <span class="sf-hint" style="margin-top:3px">POST JSON on burnin_passed / burnin_failed. ntfy.sh, Slack, Discord, n8n. Leave blank to disable.</span>
          </div>
        </div>
      </div>

    </div><!-- /left col -->

    <!-- RIGHT column: Notifications + Behavior -->
    <div class="settings-right-col">

      <!-- Notifications -->
      <div class="settings-card">
        <div class="settings-card-header">
          <span class="settings-card-title">Notifications</span>
        </div>

        <div class="sf-toggle-row">
          <div>
            <div class="sf-label">Daily Report</div>
            <div class="sf-hint">Full drive status email each day</div>
          </div>
          <label class="toggle">
            <input type="checkbox" name="smtp_daily_report_enabled" id="smtp_daily_report_enabled"
                   {% if editable.smtp_daily_report_enabled %}checked{% endif %}>
            <span class="toggle-slider"></span>
          </label>
        </div>

        <div class="sf-row sf-row-inline" id="report-hour-wrap"
             {% if not editable.smtp_daily_report_enabled %}style="opacity:.4;pointer-events:none"{% endif %}>
          <div>
            <label class="sf-label" for="smtp_report_hour">Report Hour (0–23 local)</label>
            <input class="sf-input sf-input-xs" id="smtp_report_hour" name="smtp_report_hour"
                   type="number" min="0" max="23" value="{{ editable.smtp_report_hour }}">
          </div>
        </div>

        <div class="sf-divider"></div>

        <div class="sf-toggle-row">
          <div>
            <div class="sf-label">Alert on Failure</div>
            <div class="sf-hint">Immediate email when a burn-in fails</div>
          </div>
          <label class="toggle">
            <input type="checkbox" name="smtp_alert_on_fail" id="smtp_alert_on_fail"
                   {% if editable.smtp_alert_on_fail %}checked{% endif %}>
            <span class="toggle-slider"></span>
          </label>
        </div>

        <div class="sf-toggle-row">
          <div>
            <div class="sf-label">Alert on Pass</div>
            <div class="sf-hint">Immediate email when a burn-in passes</div>
          </div>
          <label class="toggle">
            <input type="checkbox" name="smtp_alert_on_pass" id="smtp_alert_on_pass"
                   {% if editable.smtp_alert_on_pass %}checked{% endif %}>
            <span class="toggle-slider"></span>
          </label>
        </div>
      </div>

      <!-- Behavior -->
      <div class="settings-card">
        <div class="settings-card-header">
          <span class="settings-card-title">Burn-In Behavior</span>
        </div>

        <div class="sf-row">
          <label class="sf-label" for="max_parallel_burnins">Max Parallel Burn-Ins</label>
          <input class="sf-input sf-input-xs" id="max_parallel_burnins" name="max_parallel_burnins"
                 type="number" min="1" max="16" value="{{ editable.max_parallel_burnins }}">
          <span class="sf-hint">How many jobs can run at the same time</span>
        </div>

        <div class="sf-row">
          <label class="sf-label" for="stuck_job_hours">Stuck Job Threshold (hours)</label>
          <input class="sf-input sf-input-xs" id="stuck_job_hours" name="stuck_job_hours"
                 type="number" min="1" max="168" value="{{ editable.stuck_job_hours }}">
          <span class="sf-hint">Jobs running longer than this → auto-marked unknown</span>
        </div>
      </div>

    </div><!-- /right col -->
  </div><!-- /two-col -->

  <!-- Save row -->
  <div class="settings-save-bar">
    <button type="submit" class="btn-primary" id="save-btn">Save Settings</button>
    <button type="button" class="btn-secondary" id="cancel-settings-btn">Cancel</button>
    <span id="save-result" class="settings-test-result" style="display:none"></span>
  </div>
</form>

<!-- System (read-only) -->
<div class="settings-card settings-card-readonly">
  <div class="settings-card-header">
    <span class="settings-card-title">System</span>
    <span class="badge-restart">restart required to change</span>
  </div>
  <div class="sf-readonly-grid">
    <div class="sf-ro-row">
      <span class="sf-ro-label">TrueNAS URL</span>
      <span class="sf-ro-value mono">{{ readonly.truenas_base_url }}</span>
    </div>
    <div class="sf-ro-row">
      <span class="sf-ro-label">Verify TLS</span>
      <span class="sf-ro-value">{{ 'Yes' if readonly.truenas_verify_tls else 'No' }}</span>
    </div>
    <div class="sf-ro-row">
      <span class="sf-ro-label">Poll Interval</span>
      <span class="sf-ro-value mono">{{ readonly.poll_interval_seconds }}s</span>
    </div>
    <div class="sf-ro-row">
      <span class="sf-ro-label">Stale Threshold</span>
      <span class="sf-ro-value mono">{{ readonly.stale_threshold_seconds }}s</span>
    </div>
    <div class="sf-ro-row">
      <span class="sf-ro-label">IP Allowlist</span>
      <span class="sf-ro-value mono">{{ readonly.allowed_ips }}</span>
    </div>
    <div class="sf-ro-row">
      <span class="sf-ro-label">Log Level</span>
      <span class="sf-ro-value mono">{{ readonly.log_level }}</span>
    </div>
  </div>
</div>

<script>
(function () {
  // Dim report-hour when daily report disabled
  var dailyCb = document.getElementById('smtp_daily_report_enabled');
  var hourWrap = document.getElementById('report-hour-wrap');
  if (dailyCb && hourWrap) {
    dailyCb.addEventListener('change', function () {
      hourWrap.style.opacity = dailyCb.checked ? '' : '0.4';
      hourWrap.style.pointerEvents = dailyCb.checked ? '' : 'none';
    });
  }

  function showResult(el, ok, msg) {
    el.style.display = 'inline-flex';
    el.className = 'settings-test-result ' + (ok ? 'result-ok' : 'result-err');
    el.textContent = (ok ? '✓ ' : '✕ ') + msg;
  }

  function collectForm() {
    var form = document.getElementById('settings-form');
    var data = {};
    for (var i = 0; i < form.elements.length; i++) {
      var el = form.elements[i];
      if (!el.name || el.type === 'submit' || el.type === 'button') continue;
      data[el.name] = el.type === 'checkbox' ? el.checked : el.value;
    }
    return data;
  }

  // Save
  var form = document.getElementById('settings-form');
  var saveBtn = document.getElementById('save-btn');
  var saveResult = document.getElementById('save-result');
  form.addEventListener('submit', async function (e) {
    e.preventDefault();
    saveBtn.disabled = true;
    saveBtn.textContent = 'Saving…';
    saveResult.style.display = 'none';
    try {
      var resp = await fetch('/api/v1/settings', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(collectForm()),
      });
      var data = await resp.json();
      if (resp.ok) {
        showResult(saveResult, true, 'Saved');
      } else {
        showResult(saveResult, false, data.detail || 'Save failed');
      }
    } catch (err) {
      showResult(saveResult, false, 'Network error');
    } finally {
      saveBtn.disabled = false;
      saveBtn.textContent = 'Save Settings';
    }
  });

  // Cancel — reload page to restore saved values
  var cancelBtn = document.getElementById('cancel-settings-btn');
  if (cancelBtn) {
    cancelBtn.addEventListener('click', function () {
      window.location.reload();
    });
  }

  // Test SMTP
  var testBtn = document.getElementById('test-smtp-btn');
  var testResult = document.getElementById('smtp-test-result');
  testBtn.addEventListener('click', async function () {
    testBtn.disabled = true;
    testBtn.textContent = 'Testing…';
    testResult.style.display = 'none';
    try {
      var resp = await fetch('/api/v1/settings/test-smtp', { method: 'POST' });
      var data = await resp.json();
      showResult(testResult, resp.ok, resp.ok ? 'Connection OK' : (data.detail || 'Failed'));
    } catch (err) {
      showResult(testResult, false, 'Network error');
    } finally {
      testBtn.disabled = false;
      testBtn.textContent = 'Test Connection';
    }
  });
}());
</script>
{% endblock %}
123 app/templates/stats.html Normal file
@@ -0,0 +1,123 @@
{% extends "layout.html" %}

{% block title %}TrueNAS Burn-In — Stats{% endblock %}

{% block content %}
<div class="page-toolbar">
  <h1 class="page-title">Analytics</h1>
  <div class="toolbar-right">
    <span class="page-subtitle">{{ drives_total }} drives tracked</span>
  </div>
</div>

<!-- Overall stat cards -->
<div class="stats-row" style="margin-bottom:24px">
  <div class="overview-card">
    <span class="ov-value">{{ overall.total or 0 }}</span>
    <span class="ov-label">Total Jobs</span>
  </div>
  <div class="overview-card ov-green">
    <span class="ov-value">{{ overall.passed or 0 }}</span>
    <span class="ov-label">Passed</span>
  </div>
  <div class="overview-card ov-red">
    <span class="ov-value">{{ overall.failed or 0 }}</span>
    <span class="ov-label">Failed</span>
  </div>
  <div class="overview-card ov-blue">
    <span class="ov-value">{{ overall.running or 0 }}</span>
    <span class="ov-label">Running</span>
  </div>
  <div class="overview-card ov-gray">
    <span class="ov-value">{{ overall.cancelled or 0 }}</span>
    <span class="ov-label">Cancelled</span>
  </div>
  {% if overall.total and overall.total > 0 %}
  <div class="overview-card ov-green">
    <span class="ov-value">{{ "%.0f" | format(100 * (overall.passed or 0) / overall.total) }}%</span>
    <span class="ov-label">Pass Rate</span>
  </div>
  {% endif %}
</div>

<div class="stats-grid">

  <!-- Failure rate by model -->
  <div class="stats-section">
    <h2 class="section-title">Results by Drive Model</h2>
    {% if by_model %}
    <div class="table-wrap" style="max-height:none">
      <table>
        <thead>
          <tr>
            <th>Model</th>
            <th style="text-align:right">Total</th>
            <th style="text-align:right">Passed</th>
            <th style="text-align:right">Failed</th>
            <th style="text-align:right">Pass Rate</th>
            <th style="min-width:120px">Rate Bar</th>
          </tr>
        </thead>
        <tbody>
          {% for m in by_model %}
          <tr>
            <td style="font-weight:500;color:var(--text-strong)">{{ m.model }}</td>
            <td class="mono text-muted" style="text-align:right">{{ m.total }}</td>
            <td class="mono" style="text-align:right;color:var(--green)">{{ m.passed }}</td>
            <td class="mono" style="text-align:right;color:{% if m.failed > 0 %}var(--red){% else %}var(--text-muted){% endif %}">{{ m.failed }}</td>
            <td class="mono" style="text-align:right;font-weight:600;color:{% if (m.pass_rate or 0) >= 90 %}var(--green){% elif (m.pass_rate or 0) >= 70 %}var(--yellow){% else %}var(--red){% endif %}">
              {{ m.pass_rate or 0 }}%
            </td>
            <td>
              <div class="rate-bar-wrap">
                <div class="rate-bar-fill rate-pass" style="width:{{ m.pass_rate or 0 }}%"></div>
                <div class="rate-bar-fill rate-fail" style="width:{{ 100 - (m.pass_rate or 0) }}%"></div>
              </div>
            </td>
          </tr>
          {% endfor %}
        </tbody>
      </table>
    </div>
    {% else %}
    <div class="empty-state" style="border:1px solid var(--border);border-radius:8px;padding:32px">
      No completed burn-in jobs yet.
    </div>
    {% endif %}
  </div>

  <!-- Activity last 14 days -->
  <div class="stats-section">
    <h2 class="section-title">Activity — Last 14 Days</h2>
    {% if by_day %}
    <div class="table-wrap" style="max-height:none">
      <table>
        <thead>
          <tr>
            <th>Date</th>
            <th style="text-align:right">Total</th>
            <th style="text-align:right">Passed</th>
            <th style="text-align:right">Failed</th>
          </tr>
        </thead>
        <tbody>
          {% for d in by_day %}
          <tr>
            <td class="mono text-muted">{{ d.day }}</td>
            <td class="mono" style="text-align:right;color:var(--text-strong)">{{ d.total }}</td>
            <td class="mono" style="text-align:right;color:var(--green)">{{ d.passed }}</td>
            <td class="mono" style="text-align:right;color:{% if d.failed > 0 %}var(--red){% else %}var(--text-muted){% endif %}">{{ d.failed }}</td>
          </tr>
          {% endfor %}
        </tbody>
      </table>
    </div>
    {% else %}
    <div class="empty-state" style="border:1px solid var(--border);border-radius:8px;padding:32px">
      No activity in the last 14 days.
    </div>
    {% endif %}
  </div>

</div>
{% endblock %}
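The Pass Rate card above leans on Jinja's `format` filter, which applies printf-style formatting; the equivalent computation can be sketched in plain Python (values here are made up for illustration):

```python
# Sketch of the pass-rate cell: "%.0f" rounds the percentage to a
# whole number, matching {{ "%.0f" | format(100 * passed / total) }}.
passed, total = 17, 20
pass_rate = "%.0f" % (100 * passed / total)
print(pass_rate + "%")  # 85%
```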
112 app/truenas.py Normal file
@@ -0,0 +1,112 @@
import asyncio
import logging
from collections.abc import Callable, Coroutine
from typing import Any, TypeVar

import httpx

from app.config import settings

log = logging.getLogger(__name__)

T = TypeVar("T")

# Exceptions that are safe to retry (transient network issues)
_RETRYABLE = (
    httpx.ConnectError,
    httpx.TimeoutException,
    httpx.RemoteProtocolError,
    httpx.ReadError,
)


async def _with_retry(
    factory: Callable[[], Coroutine[Any, Any, T]],
    label: str,
    max_attempts: int = 3,
) -> T:
    """
    Call factory() to get a fresh coroutine and await it, retrying with
    exponential backoff on transient failures.

    A factory (not a bare coroutine) is required so each attempt gets a
    new coroutine object — an already-awaited coroutine cannot be reused.
    """
    backoff = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            return await factory()
        except _RETRYABLE as exc:
            if attempt == max_attempts:
                raise
            log.warning(
                "TrueNAS %s transient error (attempt %d/%d): %s — retrying in %.0fs",
                label, attempt, max_attempts, exc, backoff,
            )
            await asyncio.sleep(backoff)
            backoff *= 2
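The factory-over-coroutine requirement in the docstring can be demonstrated with a minimal, self-contained sketch (the `flaky` function and its counter are illustrative, not part of the app):

```python
import asyncio

async def flaky(counter):
    # Fails on the first call, succeeds on the second — stands in for a
    # transient network error.
    counter["n"] += 1
    if counter["n"] < 2:
        raise ConnectionError("transient")
    return "ok"

async def with_retry(factory, max_attempts=3):
    # Same shape as _with_retry: each attempt calls the factory to get a
    # FRESH coroutine, because an already-awaited coroutine cannot be
    # awaited again.
    backoff = 0.01
    for attempt in range(1, max_attempts + 1):
        try:
            return await factory()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(backoff)
            backoff *= 2

counter = {"n": 0}
result = asyncio.run(with_retry(lambda: flaky(counter)))
print(result)  # ok
```

Passing `flaky(counter)` directly (a bare coroutine) would raise `RuntimeError: cannot reuse already awaited coroutine` on the second attempt; the lambda avoids that.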


class TrueNASClient:
    def __init__(self) -> None:
        self._client = httpx.AsyncClient(
            base_url=settings.truenas_base_url,
            headers={"Authorization": f"Bearer {settings.truenas_api_key}"},
            verify=settings.truenas_verify_tls,
            timeout=10.0,
        )

    async def close(self) -> None:
        await self._client.aclose()

    async def get_disks(self) -> list[dict]:
        r = await _with_retry(
            lambda: self._client.get("/api/v2.0/disk"),
            "get_disks",
        )
        r.raise_for_status()
        return r.json()

    async def get_smart_jobs(self, state: str | None = None) -> list[dict]:
        params: dict = {"method": "smart.test"}
        if state:
            params["state"] = state
        r = await _with_retry(
            lambda: self._client.get("/api/v2.0/core/get_jobs", params=params),
            "get_smart_jobs",
        )
        r.raise_for_status()
        return r.json()

    async def get_smart_results(self, devname: str) -> list[dict]:
        r = await _with_retry(
            lambda: self._client.get(f"/api/v2.0/smart/test/results/{devname}"),
            f"get_smart_results({devname})",
        )
        r.raise_for_status()
        return r.json()

    async def start_smart_test(self, disks: list[str], test_type: str) -> int:
        """Start a SMART test. Not retried — a duplicate start would launch a second job."""
        r = await self._client.post(
            "/api/v2.0/smart/test",
            json={"disks": disks, "type": test_type},
        )
        r.raise_for_status()
        return r.json()

    async def abort_job(self, job_id: int) -> None:
        """Abort a TrueNAS job. Not retried — best-effort cancel."""
        r = await self._client.post(
            "/api/v2.0/core/job_abort",
            json={"id": job_id},
        )
        r.raise_for_status()

    async def get_system_info(self) -> dict:
        r = await _with_retry(
            lambda: self._client.get("/api/v2.0/system/info"),
            "get_system_info",
        )
        r.raise_for_status()
        return r.json()
21 docker-compose.yml Normal file
@@ -0,0 +1,21 @@
services:
  mock-truenas:
    build: ./mock-truenas
    container_name: mock-truenas
    ports:
      - "8000:8000"
    restart: unless-stopped

  app:
    build: .
    container_name: truenas-burnin
    ports:
      - "8084:8084"
    env_file: .env
    volumes:
      - ./data:/data
      - ./app/templates:/opt/app/app/templates
      - ./app/static:/opt/app/app/static
    depends_on:
      - mock-truenas
    restart: unless-stopped
9 mock-truenas/Dockerfile Normal file
@@ -0,0 +1,9 @@
FROM python:3.12-slim

WORKDIR /app

RUN pip install --no-cache-dir fastapi uvicorn

COPY app.py .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
345 mock-truenas/app.py Normal file
@@ -0,0 +1,345 @@
"""
Mock TrueNAS CORE v2.0 API Server

Simulates the TrueNAS CORE REST API for development and testing.
All state is in-memory. Restart resets everything.

Simulation behavior:
- SHORT test completes in ~90 seconds real-time
- LONG test completes in ~8 minutes real-time
- Drive 'sdn' (serial FAIL001) always fails SMART at ~30%
- Temperatures drift slightly on each tick
- Debug endpoints at /debug/* for test control
"""

import asyncio
import random
import time
from datetime import datetime, timezone
from typing import Optional

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Mock TrueNAS CORE API", version="13.0-U6.1")

# ---------------------------------------------------------------------------
# Simulation constants
# ---------------------------------------------------------------------------
SHORT_DURATION_SECONDS = 90
LONG_DURATION_SECONDS = 480
TICK_SECONDS = 5

# ---------------------------------------------------------------------------
# Static drive inventory — 15 drives, sda-sdo, mixed capacities
# ---------------------------------------------------------------------------
_BASE_DRIVES = [
    # 12TB Seagate Exos — sda, sdb, sdc
    {"identifier": "3500151795937c001", "name": "sda", "devname": "sda", "serial": "WDZ1A001", "model": "ST12000NM0008", "size": 12000138625024, "rotationrate": 7200, "_base_temp": 36},
    {"identifier": "3500151795937c002", "name": "sdb", "devname": "sdb", "serial": "WDZ1A002", "model": "ST12000NM0008", "size": 12000138625024, "rotationrate": 7200, "_base_temp": 34},
    {"identifier": "3500151795937c003", "name": "sdc", "devname": "sdc", "serial": "WDZ1A003", "model": "ST12000NM0008", "size": 12000138625024, "rotationrate": 7200, "_base_temp": 37},
    # 8TB WD Red — sdd, sde, sdf
    {"identifier": "3500151795937c004", "name": "sdd", "devname": "sdd", "serial": "WDZ1A004", "model": "WD80EFAX", "size": 8001563222016, "rotationrate": 5400, "_base_temp": 32},
    {"identifier": "3500151795937c005", "name": "sde", "devname": "sde", "serial": "WDZ1A005", "model": "WD80EFAX", "size": 8001563222016, "rotationrate": 5400, "_base_temp": 33},
    {"identifier": "3500151795937c006", "name": "sdf", "devname": "sdf", "serial": "WDZ1A006", "model": "WD80EFAX", "size": 8001563222016, "rotationrate": 5400, "_base_temp": 31},
    # 16TB Seagate Exos — sdg, sdh
    {"identifier": "3500151795937c007", "name": "sdg", "devname": "sdg", "serial": "WDZ1A007", "model": "ST16000NM001G", "size": 16000900661248, "rotationrate": 7200, "_base_temp": 38},
    {"identifier": "3500151795937c008", "name": "sdh", "devname": "sdh", "serial": "WDZ1A008", "model": "ST16000NM001G", "size": 16000900661248, "rotationrate": 7200, "_base_temp": 39},
    # 4TB Seagate IronWolf — sdi, sdj
    {"identifier": "3500151795937c009", "name": "sdi", "devname": "sdi", "serial": "WDZ1A009", "model": "ST4000VN008", "size": 4000787030016, "rotationrate": 5900, "_base_temp": 30},
    {"identifier": "3500151795937c00a", "name": "sdj", "devname": "sdj", "serial": "WDZ1A010", "model": "ST4000VN008", "size": 4000787030016, "rotationrate": 5900, "_base_temp": 29},
    # 10TB Toshiba — sdk, sdl
    {"identifier": "3500151795937c00b", "name": "sdk", "devname": "sdk", "serial": "WDZ1A011", "model": "TOSHIBA MG06ACA10TE", "size": 10000831348736, "rotationrate": 7200, "_base_temp": 41},
    {"identifier": "3500151795937c00c", "name": "sdl", "devname": "sdl", "serial": "WDZ1A012", "model": "TOSHIBA MG06ACA10TE", "size": 10000831348736, "rotationrate": 7200, "_base_temp": 40},
    # 8TB HGST — sdm
    {"identifier": "3500151795937c00d", "name": "sdm", "devname": "sdm", "serial": "WDZ1A013", "model": "HGST HUH728080ALE604", "size": 8001563222016, "rotationrate": 7200, "_base_temp": 35},
    # Always-fail drive — sdn
    {"identifier": "3500151795937c00e", "name": "sdn", "devname": "sdn", "serial": "FAIL001", "model": "TOSHIBA MG06ACA10TE", "size": 10000831348736, "rotationrate": 7200, "_base_temp": 45, "_always_fail": True},
    # 6TB Seagate Archive — sdo
    {"identifier": "3500151795937c00f", "name": "sdo", "devname": "sdo", "serial": "WDZ1A015", "model": "ST6000DM003", "size": 6001175126016, "rotationrate": 5900, "_base_temp": 33},
]

# Shared fields for every drive
_DRIVE_DEFAULTS = {
    "type": "HDD",
    "bus": "SCSI",
    "togglesmart": True,
    "pool": None,
    "enclosure": None,
}

# ---------------------------------------------------------------------------
# Mutable in-memory state
# ---------------------------------------------------------------------------
_state: dict = {
    "drives": {},
    "jobs": {},
    "smart_history": {},
    "job_counter": 1000,
}


def _init_state() -> None:
    for d in _BASE_DRIVES:
        devname = d["devname"]
        _state["drives"][devname] = {
            **_DRIVE_DEFAULTS,
            **{k: v for k, v in d.items() if not k.startswith("_")},
            "zfs_guid": f"1234{int(d['identifier'], 16):016x}",
            "temperature": d["_base_temp"],
            "smart_health": "PASSED",
            "_base_temp": d["_base_temp"],
            "_always_fail": d.get("_always_fail", False),
        }
        _state["smart_history"][devname] = []


_init_state()


def _public_drive(d: dict) -> dict:
    return {k: v for k, v in d.items() if not k.startswith("_")}


def _public_job(j: dict) -> dict:
    return {k: v for k, v in j.items() if not k.startswith("_")}
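The defaults-merge and underscore-filtering pattern used by `_init_state` and the `_public_*` helpers reduces to two idioms: later keys win in a `**` merge, and a dict comprehension strips private keys. A standalone sketch with made-up values:

```python
# Later keys win in a dict merge; underscore-prefixed keys
# (simulation-internal state) are filtered out before exposure.
defaults = {"type": "HDD", "pool": None}
drive = {"devname": "sda", "serial": "X1", "_base_temp": 36}

public = {**defaults, **{k: v for k, v in drive.items() if not k.startswith("_")}}
print(public)  # {'type': 'HDD', 'pool': None, 'devname': 'sda', 'serial': 'X1'}
```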


# ---------------------------------------------------------------------------
# Simulation loop
# ---------------------------------------------------------------------------
async def _simulation_loop() -> None:
    while True:
        await asyncio.sleep(TICK_SECONDS)
        _tick()


def _tick() -> None:
    for drive in _state["drives"].values():
        drift = random.randint(-1, 2)
        drive["temperature"] = max(20, min(70, drive["_base_temp"] + drift))

    now_iso = datetime.now(timezone.utc).isoformat()
    for job_id, job in list(_state["jobs"].items()):
        if job["state"] != "RUNNING":
            continue

        elapsed = time.monotonic() - job["_started_mono"]
        duration = job["_duration_seconds"]

        if job["_always_fail"] and elapsed / duration >= 0.30:
            job["state"] = "FAILED"
            job["error"] = "SMART test aborted: uncorrectable read error at LBA 0x1234567"
            job["progress"]["percent"] = 30
            job["progress"]["description"] = "Test failed"
            job["time_finished"] = now_iso
            _record_smart_result(job, failed=True)
            continue

        pct = min(100, int(elapsed / duration * 100))
        job["progress"]["percent"] = pct
        job["progress"]["description"] = (
            f"Running SMART {job['_test_type'].lower()} test on {job['_disk']} ({pct}%)"
        )

        if pct >= 100:
            job["state"] = "SUCCESS"
            job["result"] = True
            job["time_finished"] = now_iso
            job["progress"]["percent"] = 100
            job["progress"]["description"] = "Completed without error"
            _record_smart_result(job, failed=False)
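The tick's progress arithmetic (and the ~30% failure point for the always-fail drive) can be checked in isolation; the numbers below assume the SHORT duration of 90 seconds from the constants above:

```python
SHORT_DURATION_SECONDS = 90

def percent(elapsed: float, duration: float) -> int:
    # Same clamp as _tick: never report more than 100%.
    return min(100, int(elapsed / duration * 100))

print(percent(45, SHORT_DURATION_SECONDS))   # 50
print(percent(200, SHORT_DURATION_SECONDS))  # 100

# The always-fail drive trips once elapsed/duration reaches 0.30,
# i.e. at about 27 seconds into a SHORT test.
assert 27 / SHORT_DURATION_SECONDS >= 0.30
```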
|
||||
|
||||
|
||||
def _record_smart_result(job: dict, failed: bool) -> None:
|
||||
devname = job["_disk"]
|
||||
test_type = job["_test_type"]
|
||||
history = _state["smart_history"].get(devname, [])
|
||||
num = len(history) + 1
|
||||
|
||||
history.insert(0, {
|
||||
"num": num,
|
||||
"type": "Short offline" if test_type == "SHORT" else "Extended offline",
|
||||
"status": "Read failure" if failed else "Completed without error",
|
||||
"status_verbose": (
|
||||
"Read failure - error in segment #1" if failed
|
||||
else "Completed without error"
|
||||
),
|
||||
"remaining": 0,
|
||||
"lifetime": random.randint(10000, 50000),
|
||||
"lba_of_first_error": "0x1234567" if failed else None,
|
||||
})
|
||||
|
||||
drive = _state["drives"].get(devname)
|
||||
if drive:
|
||||
drive["smart_health"] = "FAILED" if failed else "PASSED"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Request models
|
||||
# ---------------------------------------------------------------------------
|
||||
class SmartTestRequest(BaseModel):
|
||||
disks: list[str]
|
||||
type: str # SHORT | LONG
|
||||
|
||||
|
||||
class AbortRequest(BaseModel):
|
||||
id: int
|
||||
|
||||
|
||||


# ---------------------------------------------------------------------------
# API Routes — mirrors TrueNAS CORE v2.0
# ---------------------------------------------------------------------------

@app.get("/api/v2.0/disk")
async def list_disks():
    return [_public_drive(d) for d in _state["drives"].values()]


@app.get("/api/v2.0/disk/{identifier}")
async def get_disk(identifier: str):
    for d in _state["drives"].values():
        if d["identifier"] == identifier or d["devname"] == identifier:
            return _public_drive(d)
    raise HTTPException(status_code=404, detail="Disk not found")
@app.get("/api/v2.0/smart/test/results/{disk_name}")
|
||||
async def smart_test_results(disk_name: str):
|
||||
if disk_name not in _state["smart_history"]:
|
||||
raise HTTPException(status_code=404, detail="Disk not found")
|
||||
return [{"disk": disk_name, "tests": _state["smart_history"][disk_name]}]
|
||||
|
||||
|
||||
@app.post("/api/v2.0/smart/test")
|
||||
async def start_smart_test(req: SmartTestRequest):
|
||||
if req.type not in ("SHORT", "LONG"):
|
||||
raise HTTPException(status_code=422, detail="type must be SHORT or LONG")
|
||||
|
||||
job_ids = []
|
||||
for disk_name in req.disks:
|
||||
if disk_name not in _state["drives"]:
|
||||
raise HTTPException(status_code=404, detail=f"Disk {disk_name} not found")
|
||||
|
||||
_state["job_counter"] += 1
|
||||
job_id = _state["job_counter"]
|
||||
drive = _state["drives"][disk_name]
|
||||
duration = SHORT_DURATION_SECONDS if req.type == "SHORT" else LONG_DURATION_SECONDS
|
||||
|
||||
_state["jobs"][job_id] = {
|
||||
"id": job_id,
|
||||
"method": "smart.test",
|
||||
"arguments": [{"disks": [disk_name], "type": req.type}],
|
||||
"state": "RUNNING",
|
||||
"progress": {
|
||||
"percent": 0,
|
||||
"description": f"Running SMART {req.type.lower()} test on {disk_name}",
|
||||
"extra": None,
|
||||
},
|
||||
"result": None,
|
||||
"error": None,
|
||||
"exception": None,
|
||||
"time_started": datetime.now(timezone.utc).isoformat(),
|
||||
"time_finished": None,
|
||||
"_started_mono": time.monotonic(),
|
||||
"_duration_seconds": duration,
|
||||
"_disk": disk_name,
|
||||
"_test_type": req.type,
|
||||
"_always_fail": drive["_always_fail"],
|
||||
}
|
||||
job_ids.append(job_id)
|
||||
|
||||
return job_ids[0] if len(job_ids) == 1 else job_ids
|
||||
|
||||
|
||||
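The endpoint's return shape is worth calling out: a single-disk request yields a bare job id while a multi-disk request yields a list. A minimal sketch of that convention, with `job_response` as an illustrative name:

```python
def job_response(job_ids: list[int]):
    # One disk -> bare job id; several disks -> list of ids,
    # matching the return statement of start_smart_test above.
    return job_ids[0] if len(job_ids) == 1 else job_ids
```

Clients therefore have to branch on the response type (int vs list) when polling `/core/get_jobs/{job_id}` afterwards.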
@app.get("/api/v2.0/core/get_jobs")
|
||||
async def get_jobs(method: Optional[str] = None, state: Optional[str] = None):
|
||||
results = []
|
||||
for job in _state["jobs"].values():
|
||||
if method and job["method"] != method:
|
||||
continue
|
||||
if state and job["state"] != state:
|
||||
continue
|
||||
results.append(_public_job(job))
|
||||
return results
|
||||
|
||||
|
||||
@app.get("/api/v2.0/core/get_jobs/{job_id}")
|
||||
async def get_job(job_id: int):
|
||||
job = _state["jobs"].get(job_id)
|
||||
if not job:
|
||||
raise HTTPException(status_code=404, detail="Job not found")
|
||||
return _public_job(job)
|
||||
|
||||
|
||||
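The filter semantics of `/core/get_jobs` reduce to: a supplied query parameter must match exactly, an omitted one matches every job. A self-contained sketch (`filter_jobs` and the sample records are illustrative, not part of the source):

```python
from typing import Optional

def filter_jobs(jobs: list[dict], method: Optional[str] = None,
                state: Optional[str] = None) -> list[dict]:
    # None (parameter omitted) matches everything; a value must match exactly.
    return [
        j for j in jobs
        if (not method or j["method"] == method)
        and (not state or j["state"] == state)
    ]

jobs = [
    {"method": "smart.test", "state": "RUNNING"},
    {"method": "smart.test", "state": "SUCCESS"},
    {"method": "pool.scrub", "state": "RUNNING"},
]
```

With no arguments all three records come back; `method="smart.test"` keeps two; adding `state="RUNNING"` narrows it to one.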
@app.post("/api/v2.0/core/job_abort")
|
||||
async def abort_job(req: AbortRequest):
|
||||
job = _state["jobs"].get(req.id)
|
||||
if not job:
|
||||
raise HTTPException(status_code=404, detail="Job not found")
|
||||
if job["state"] != "RUNNING":
|
||||
raise HTTPException(status_code=400, detail=f"Job is not running (state={job['state']})")
|
||||
job["state"] = "ABORTED"
|
||||
job["time_finished"] = datetime.now(timezone.utc).isoformat()
|
||||
job["progress"]["description"] = "Aborted by user"
|
||||
return True
|
||||
|
||||
|
||||
@app.get("/api/v2.0/system/info")
|
||||
async def system_info():
|
||||
return {
|
||||
"version": "TrueNAS-13.0-U6.1",
|
||||
"hostname": "mock-truenas",
|
||||
"uptime_seconds": 86400,
|
||||
"system_serial": "MOCK-SN-001",
|
||||
"system_product": "MOCK SERVER",
|
||||
"cores": 4,
|
||||
"physmem": 17179869184,
|
||||
}
|
||||
|
||||
|
||||
@app.get("/health")
|
||||
async def health():
|
||||
return {"status": "ok", "mock": True, "drives": len(_state["drives"]), "jobs": len(_state["jobs"])}
|
||||
|
||||
|
||||


# ---------------------------------------------------------------------------
# Debug endpoints
# ---------------------------------------------------------------------------

@app.post("/debug/reset")
async def debug_reset():
    _state["drives"].clear()
    _state["jobs"].clear()
    _state["smart_history"].clear()
    _state["job_counter"] = 1000
    _init_state()
    return {"reset": True}


@app.get("/debug/state")
async def debug_state():
    return {
        "drives": {k: _public_drive(v) for k, v in _state["drives"].items()},
        "jobs": {str(k): _public_job(v) for k, v in _state["jobs"].items()},
        "smart_history": _state["smart_history"],
        "job_counter": _state["job_counter"],
    }


@app.post("/debug/complete-all-jobs")
async def debug_complete_all():
    completed = []
    for job_id, job in _state["jobs"].items():
        if job["state"] == "RUNNING":
            job["_started_mono"] -= job["_duration_seconds"]
            completed.append(job_id)
    return {"fast_forwarded": completed}
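`/debug/complete-all-jobs` fast-forwards jobs without touching their state directly: it rewinds the recorded monotonic start time by the full duration, so the next simulation tick computes elapsed >= duration and finishes the job through the normal code path. A minimal sketch of that trick:

```python
import time

# Rewind the start time by the full duration; any later percent
# calculation then sees elapsed >= duration and clamps to 100.
job = {"_started_mono": time.monotonic(), "_duration_seconds": 120.0}
job["_started_mono"] -= job["_duration_seconds"]

elapsed = time.monotonic() - job["_started_mono"]
pct = min(100, int(elapsed / job["_duration_seconds"] * 100))
```

Driving completion through the regular loop (rather than setting `state = "SUCCESS"` here) means the SUCCESS transition, `_record_smart_result`, and timestamps all run exactly as they would in real time.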


# ---------------------------------------------------------------------------
# Startup
# ---------------------------------------------------------------------------
@app.on_event("startup")
async def startup():
    asyncio.create_task(_simulation_loop())
7
requirements.txt
Normal file

@ -0,0 +1,7 @@
fastapi
uvicorn
aiosqlite
httpx
pydantic-settings
jinja2
sse-starlette