truenas-burnin/app/database.py
Brandon Walter 2dff58bd52 Stage 7: SSH architecture, SMART attribute monitoring, drive reset, and polish
SSH (app/ssh_client.py — new):
- asyncssh-based client (attribute fetch sketched below): start_smart_test,
  poll_smart_progress, abort_smart_test, get_smart_attributes, run_badblocks
  with streaming progress callbacks
- SMART attribute table: monitors attrs 5/10/188/197/198/199 for warn/fail thresholds
- Falls back to REST API / mock simulation when ssh_host is not configured
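
A minimal sketch of the SSH attribute path (asyncssh and the function name come
from the bullets above; the signature and the naive smartctl parsing are
illustrative, not the actual ssh_client.py code):

    import asyncssh

    MONITORED_ATTRS = {5, 10, 188, 197, 198, 199}

    async def get_smart_attributes(host: str, user: str, password: str,
                                   devname: str) -> dict[int, int]:
        # Run smartctl -A remotely and pull raw values for the monitored IDs.
        async with asyncssh.connect(host, username=user, password=password,
                                    known_hosts=None) as conn:
            result = await conn.run(f"smartctl -A /dev/{devname}")
        attrs: dict[int, int] = {}
        for line in result.stdout.splitlines():
            fields = line.split()
            # Attribute rows start with the numeric ID; RAW_VALUE is the last
            # column (not always a plain integer, e.g. temperature annotations).
            if len(fields) >= 2 and fields[0].isdigit() and fields[-1].isdigit():
                attrs[int(fields[0])] = int(fields[-1])
        return {aid: raw for aid, raw in attrs.items() if aid in MONITORED_ATTRS}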

Burn-in stages updated (burnin.py):
- _stage_smart_test: SSH path polls smartctl -a, stores raw output + parsed attributes
- _stage_surface_validate: SSH path streams badblocks, counts bad blocks against
  a configurable threshold (see the sketch after this list)
- _stage_final_check: SSH path checks smartctl attributes; DB fallback for mock mode
- New DB helpers: _append_stage_log, _update_stage_bad_blocks, _store_smart_attrs,
  _store_smart_raw_output
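
The streaming badblocks consumption referenced above might look roughly like
this (a sketch only: the flags and the chunked stderr parsing are assumptions):

    import asyncio
    import re
    import asyncssh

    async def run_badblocks(conn: asyncssh.SSHClientConnection, devname: str,
                            progress_cb) -> int:
        # Read-only pass: badblocks -sv prints "NN.NN% done" on stderr
        # (backspace-separated, so read chunks rather than lines) and one
        # bad-block number per line on stdout.
        async with conn.create_process(f"badblocks -sv /dev/{devname}") as proc:
            async def watch_progress() -> None:
                while True:
                    chunk = await proc.stderr.read(256)
                    if not chunk:
                        return
                    match = re.search(r"(\d+(?:\.\d+)?)% done", chunk)
                    if match:
                        progress_cb(float(match.group(1)))

            watcher = asyncio.create_task(watch_progress())
            bad_blocks = 0
            async for line in proc.stdout:
                if line.strip().isdigit():
                    bad_blocks += 1
            await watcher
            return bad_blocks

_stage_surface_validate would then compare the returned count against the
configurable threshold.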

Database (database.py):
- Migrations: burnin_stages.log_text, burnin_stages.bad_blocks,
  drives.smart_attrs (JSON), smart_tests.raw_output

Settings (config.py + settings_store.py):
- ssh_host, ssh_port, ssh_user, ssh_password, ssh_key (all runtime-editable;
  sketched below)
- SSH section in Settings UI with Test SSH Connection button
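
As a sketch, the new fields might land in config.py like this (assuming a
pydantic-settings class; the actual shape of config.py isn't shown here, and
the defaults are guesses):

    from pydantic_settings import BaseSettings

    class Settings(BaseSettings):
        ssh_host: str | None = None      # unset: REST API / mock fallback
        ssh_port: int = 22
        ssh_user: str = "root"           # default is an assumption
        ssh_password: str | None = None
        ssh_key: str | None = None       # private key text from the UI textarea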

Webhook (notifier.py):
- Added bad_blocks and timestamp fields to payload per SPEC
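
Illustrative payload builder (bad_blocks and timestamp are the SPEC-mandated
additions; the remaining keys are assumed):

    from datetime import datetime, timezone

    def build_webhook_payload(drive: dict, job: dict, stage: dict) -> dict:
        return {
            "drive_serial": drive.get("serial"),
            "state": job.get("state"),
            "bad_blocks": stage.get("bad_blocks", 0),
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }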

Drive reset (routes.py + drives_table.html):
- POST /api/v1/drives/{id}/reset: clears SMART state and smart_attrs; audit
  logged (handler sketched below)
- Reset button visible on drives with completed test state (no active burn-in)
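
A sketch of the handler (assumes FastAPI and the database.py schema below;
exactly which rows "clears SMART state" touches is a guess):

    from fastapi import APIRouter, Depends, HTTPException
    from app.database import get_db

    router = APIRouter()

    @router.post("/api/v1/drives/{drive_id}/reset")
    async def reset_drive(drive_id: int, db=Depends(get_db)):
        cur = await db.execute("SELECT id FROM drives WHERE id = ?", (drive_id,))
        if await cur.fetchone() is None:
            raise HTTPException(status_code=404, detail="drive not found")
        await db.execute(
            "UPDATE drives SET smart_health = 'UNKNOWN', smart_attrs = NULL "
            "WHERE id = ?",
            (drive_id,),
        )
        await db.execute("DELETE FROM smart_tests WHERE drive_id = ?", (drive_id,))
        await db.execute(
            # audit_events.created_at has a DEFAULT, so it can be omitted here
            "INSERT INTO audit_events (event_type, drive_id, message) "
            "VALUES ('drive_reset', ?, 'SMART state cleared')",
            (drive_id,),
        )
        await db.commit()
        return {"ok": True}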

Log drawer (app.js):
- Burn-In tab: shows raw stage log_text (SSH output) with bad block highlighting
- SMART tab: shows SMART attribute table with warn/fail colouring + raw smartctl output

Polish:
- Version badge (v1.0.0-6d) in header via Jinja2 global
- Parallel burn-in warning when max_parallel_burnins > 8 in Settings
- Stats page: avg duration by drive size + failure breakdown by stage (queries
  sketched below)
- settings.html: SSH section with key textarea, parallel warn div
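
The stats queries referenced above could be as simple as the following (the
'failed' state value is an assumption grounded only in the schema below):

    FAILURES_BY_STAGE_SQL = """
        SELECT stage_name, COUNT(*) AS failures
        FROM burnin_stages
        WHERE state = 'failed'
        GROUP BY stage_name
        ORDER BY failures DESC
    """

    AVG_DURATION_BY_SIZE_SQL = """
        SELECT d.size_bytes,
               AVG((julianday(j.finished_at) - julianday(j.started_at)) * 86400.0)
                   AS avg_seconds
        FROM burnin_jobs AS j
        JOIN drives AS d ON d.id = j.drive_id
        WHERE j.finished_at IS NOT NULL
        GROUP BY d.size_bytes
    """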

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 08:09:30 -05:00

import aiosqlite
from pathlib import Path
from app.config import settings
SCHEMA = """
CREATE TABLE IF NOT EXISTS drives (
id INTEGER PRIMARY KEY AUTOINCREMENT,
truenas_disk_id TEXT UNIQUE NOT NULL,
devname TEXT NOT NULL,
serial TEXT,
model TEXT,
size_bytes INTEGER,
temperature_c INTEGER,
smart_health TEXT DEFAULT 'UNKNOWN',
last_seen_at TEXT NOT NULL,
last_polled_at TEXT NOT NULL,
notes TEXT,
location TEXT
);
CREATE TABLE IF NOT EXISTS smart_tests (
id INTEGER PRIMARY KEY AUTOINCREMENT,
drive_id INTEGER NOT NULL REFERENCES drives(id) ON DELETE CASCADE,
test_type TEXT NOT NULL CHECK(test_type IN ('short', 'long')),
state TEXT NOT NULL DEFAULT 'idle',
percent INTEGER DEFAULT 0,
truenas_job_id INTEGER,
started_at TEXT,
eta_at TEXT,
finished_at TEXT,
error_text TEXT,
UNIQUE(drive_id, test_type)
);
CREATE TABLE IF NOT EXISTS burnin_jobs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
drive_id INTEGER NOT NULL REFERENCES drives(id),
profile TEXT NOT NULL,
state TEXT NOT NULL DEFAULT 'queued',
percent INTEGER DEFAULT 0,
stage_name TEXT,
operator TEXT NOT NULL,
created_at TEXT NOT NULL,
started_at TEXT,
finished_at TEXT,
error_text TEXT
);
CREATE TABLE IF NOT EXISTS burnin_stages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
burnin_job_id INTEGER NOT NULL REFERENCES burnin_jobs(id) ON DELETE CASCADE,
stage_name TEXT NOT NULL,
state TEXT NOT NULL DEFAULT 'pending',
percent INTEGER DEFAULT 0,
started_at TEXT,
finished_at TEXT,
duration_seconds REAL,
error_text TEXT
);
CREATE TABLE IF NOT EXISTS audit_events (
id INTEGER PRIMARY KEY AUTOINCREMENT,
event_type TEXT NOT NULL,
drive_id INTEGER REFERENCES drives(id),
burnin_job_id INTEGER REFERENCES burnin_jobs(id),
operator TEXT,
message TEXT NOT NULL,
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);
CREATE INDEX IF NOT EXISTS idx_smart_drive_type ON smart_tests(drive_id, test_type);
CREATE INDEX IF NOT EXISTS idx_burnin_jobs_drive ON burnin_jobs(drive_id, state);
CREATE INDEX IF NOT EXISTS idx_burnin_stages_job ON burnin_stages(burnin_job_id);
CREATE INDEX IF NOT EXISTS idx_audit_events_job ON audit_events(burnin_job_id);
"""

# Migrations for existing databases that predate schema additions.
# Each entry is tried under try/except: SQLite raises OperationalError
# ("duplicate column name") if the column already exists, which is safe to ignore.
_MIGRATIONS = [
    "ALTER TABLE drives ADD COLUMN notes TEXT",
    "ALTER TABLE drives ADD COLUMN location TEXT",
    # Stage 7: SSH command output + SMART attribute storage
    "ALTER TABLE burnin_stages ADD COLUMN log_text TEXT",
    "ALTER TABLE burnin_stages ADD COLUMN bad_blocks INTEGER DEFAULT 0",
    "ALTER TABLE drives ADD COLUMN smart_attrs TEXT",
    "ALTER TABLE smart_tests ADD COLUMN raw_output TEXT",
]


async def _run_migrations(db: aiosqlite.Connection) -> None:
    for sql in _MIGRATIONS:
        try:
            await db.execute(sql)
        except aiosqlite.OperationalError:
            # "duplicate column name": the column already exists, safe to skip.
            pass

    # Remove the old CHECK(profile IN ('quick','full')) constraint if present.
    # SQLite can't ALTER a CHECK constraint, so this requires a full table rebuild.
    cur = await db.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND name='burnin_jobs'"
    )
    row = await cur.fetchone()
    if row and "CHECK" in (row[0] or ""):
        await db.executescript("""
            PRAGMA foreign_keys=OFF;
            CREATE TABLE burnin_jobs_new (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                drive_id INTEGER NOT NULL REFERENCES drives(id),
                profile TEXT NOT NULL,
                state TEXT NOT NULL DEFAULT 'queued',
                percent INTEGER DEFAULT 0,
                stage_name TEXT,
                operator TEXT NOT NULL,
                created_at TEXT NOT NULL,
                started_at TEXT,
                finished_at TEXT,
                error_text TEXT
            );
            INSERT INTO burnin_jobs_new SELECT * FROM burnin_jobs;
            DROP TABLE burnin_jobs;
            ALTER TABLE burnin_jobs_new RENAME TO burnin_jobs;
            CREATE INDEX IF NOT EXISTS idx_burnin_jobs_drive ON burnin_jobs(drive_id, state);
            PRAGMA foreign_keys=ON;
        """)


async def init_db() -> None:
    Path(settings.db_path).parent.mkdir(parents=True, exist_ok=True)
    async with aiosqlite.connect(settings.db_path) as db:
        await db.execute("PRAGMA journal_mode=WAL")
        await db.execute("PRAGMA foreign_keys=ON")
        await db.executescript(SCHEMA)
        await _run_migrations(db)
        await db.commit()


async def get_db():
    # Async-generator dependency: yields one connection per request and
    # closes it afterwards.
    db = await aiosqlite.connect(settings.db_path)
    db.row_factory = aiosqlite.Row
    try:
        await db.execute("PRAGMA journal_mode=WAL")
        await db.execute("PRAGMA foreign_keys=ON")
        yield db
    finally:
        await db.close()
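
# Note: get_db being an async generator fits FastAPI-style dependency
# injection (an assumption about routes.py; usage sketch only):
#
#     from fastapi import Depends
#
#     @router.get("/api/v1/drives")
#     async def list_drives(db=Depends(get_db)):
#         cur = await db.execute("SELECT * FROM drives")
#         return [dict(row) for row in await cur.fetchall()]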