truenas-burnin/claude-sandbox/truenas-burnin/app/config.py
echoparkbaby 3e0000528f TrueNAS Burn-In Dashboard v0.9.0 — Live mode, thermal monitoring, adaptive concurrency
Go live against real TrueNAS SCALE 25.10:
- Remove mock-truenas dependency; mount SSH key as Docker secret
- Filter expired disk records from /api/v2.0/disk (expiretime field)
- Route all SMART operations through SSH (SCALE 25.10 removed REST smart/test endpoint)
- Poll drive temperatures via POST /api/v2.0/disk/temperatures (SCALE-specific)
- Store raw smartctl output in smart_tests.raw_output for proof of test execution
- Fix percent-remaining=0 false jump to 100% on test start
- Fix terminal WebSocket: add mounted key file fallback (/run/secrets/ssh_key)
- Fix WebSocket support: uvicorn → uvicorn[standard] (installs websockets)

HBA/system sensor temps on dashboard:
- SSH to TrueNAS and run sensors -j each poll cycle
- Parse coretemp (CPU package) and pch_* (PCH/chipset — storage I/O proxy)
- Render as compact chips in stats bar, color-coded green/yellow/red
- Live updates via new SSE system-sensors event every 12s
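
The sensor parsing described in the list above could look roughly like the sketch below. It assumes the usual lm-sensors JSON layout (chip name, then feature label, then `tempN_input` readings); the function name `parse_sensor_temps` and the sample payload are illustrative and are not taken from the repository.

```python
import json


def parse_sensor_temps(sensors_json: str) -> dict[str, float]:
    """Extract CPU package and PCH temperatures from `sensors -j` output.

    Hypothetical helper: keeps only coretemp "Package id N" features and
    every feature of pch_* chips, keyed as "<chip>/<feature label>".
    """
    data = json.loads(sensors_json)
    temps: dict[str, float] = {}
    for chip, features in data.items():
        is_cpu = chip.startswith("coretemp")
        is_pch = chip.startswith("pch_")
        if not (is_cpu or is_pch):
            continue
        for label, readings in features.items():
            if not isinstance(readings, dict):
                continue  # e.g. "Adapter": "ISA adapter" is a plain string
            if is_cpu and not label.startswith("Package id"):
                continue  # skip per-core readings, keep the package value
            for key, value in readings.items():
                if key.startswith("temp") and key.endswith("_input"):
                    temps[f"{chip}/{label}"] = float(value)
    return temps


# Illustrative payload only; real chip names vary by motherboard.
SAMPLE = """{
  "coretemp-isa-0000": {
    "Adapter": "ISA adapter",
    "Package id 0": {"temp1_input": 41.0, "temp1_max": 80.0},
    "Core 0": {"temp2_input": 39.0}
  },
  "pch_cannonlake-virtual-0": {
    "Adapter": "Virtual device",
    "temp1": {"temp1_input": 52.0}
  },
  "nvme-pci-0100": {
    "Adapter": "PCI adapter",
    "Composite": {"temp1_input": 35.85}
  }
}"""

cpu_and_pch = parse_sensor_temps(SAMPLE)
```

In this sketch the NVMe chip is ignored because only `coretemp` and `pch_*` chips are named in the commit message.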

Adaptive concurrency signal:
- Thermal pressure indicator in stats bar: hidden when OK, WARM/HOT when running
  burn-in drives hit temp_warn_c / temp_crit_c thresholds
- Thermal gate in burn-in queue: jobs wait up to 3 min before acquiring semaphore
  slot if running drives are already at warning temp; times out and proceeds
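
A minimal sketch of the thermal gate described above, assuming an asyncio-based queue. The names `wait_for_thermal_headroom`, `run_burnin`, and `get_running_temps` are hypothetical, and the 12-second re-check interval is an assumption borrowed from the poll interval; only the 3-minute timeout and the warning threshold come from this page.

```python
import asyncio
import time

TEMP_WARN_C = 46      # mirrors the temp_warn_c default in config.py
GATE_TIMEOUT_S = 180  # "up to 3 min" per the commit message
GATE_POLL_S = 12      # assumed: re-check once per poll cycle


async def wait_for_thermal_headroom(get_running_temps, *, warn_c=TEMP_WARN_C,
                                    timeout_s=GATE_TIMEOUT_S,
                                    poll_s=GATE_POLL_S) -> bool:
    """Wait while any running drive is at or above warn_c.

    Returns True if temperatures dropped below the threshold, False if the
    gate timed out. The caller proceeds in both cases.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        temps = get_running_temps()
        if not temps or max(temps) < warn_c:
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        await asyncio.sleep(min(poll_s, remaining))


async def run_burnin(job, semaphore, get_running_temps, **gate_kwargs):
    """Pass the thermal gate first, then take a concurrency slot and run."""
    await wait_for_thermal_headroom(get_running_temps, **gate_kwargs)
    async with semaphore:
        return await job()
```

Waiting before acquiring the semaphore, rather than after, means a gated job does not hold a slot while it waits, which matches the order given in the commit message.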

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 06:33:36 -05:00


from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
    )

    app_host: str = "0.0.0.0"
    app_port: int = 8080
    db_path: str = "/data/app.db"

    truenas_base_url: str = "http://localhost:8000"
    truenas_api_key: str = "mock-key"
    truenas_verify_tls: bool = False

    poll_interval_seconds: int = 12
    stale_threshold_seconds: int = 45
    max_parallel_burnins: int = 2
    surface_validate_seconds: int = 45  # mock simulation duration
    io_validate_seconds: int = 25  # mock simulation duration

    # Logging
    log_level: str = "INFO"

    # Security — comma-separated IPs or CIDRs, e.g. "10.0.0.0/24,127.0.0.1"
    # Empty string means allow all (default).
    allowed_ips: str = ""

    # SMTP — daily status email at 8am local time
    # Leave smtp_host empty to disable email.
    smtp_host: str = ""
    smtp_port: int = 587
    smtp_user: str = ""
    smtp_password: str = ""
    smtp_from: str = ""
    smtp_to: str = ""  # comma-separated recipients
    smtp_report_hour: int = 8  # local hour to send (0-23)
    smtp_daily_report_enabled: bool = True  # set False to skip daily report without disabling alerts
    smtp_alert_on_fail: bool = True  # immediate email when a job fails
    smtp_alert_on_pass: bool = False  # immediate email when a job passes
    smtp_ssl_mode: str = "starttls"  # "starttls" | "ssl" | "plain"
    smtp_timeout: int = 60  # connection + read timeout in seconds

    # Webhook — POST JSON payload on every job state change (pass/fail)
    # Leave empty to disable. Works with Slack, Discord, ntfy, n8n, etc.
    webhook_url: str = ""

    # Stuck-job detection: jobs running longer than this are marked 'unknown'
    stuck_job_hours: int = 24

    # Temperature thresholds (°C) — drives table colouring + precheck gate
    temp_warn_c: int = 46  # orange warning
    temp_crit_c: int = 55  # red critical (precheck refuses to start above this)

    # Bad-block tolerance — surface_validate fails if bad blocks exceed this
    bad_block_threshold: int = 0

    # SSH credentials for direct TrueNAS command execution (Stage 7)
    # When ssh_host is set, burn-in stages use SSH for smartctl/badblocks instead of REST API.
    # Leave ssh_host empty to use the mock/REST API (development mode).
    ssh_host: str = ""
    ssh_port: int = 22
    ssh_user: str = "root"  # TrueNAS CORE default is root
    ssh_password: str = ""  # Password auth (leave blank if using key)
    ssh_key: str = ""  # PEM private key content (paste full key including headers)

    # Application version — used by the /api/v1/updates/check endpoint
    app_version: str = "1.0.0-7"


settings = Settings()
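
The `allowed_ips` setting is stored as a single comma-separated string, so something has to turn it into networks before a request can be checked. The sketch below shows one way to do that with the standard-library `ipaddress` module. The helper names `parse_allowed_ips` and `is_allowed` are hypothetical; how the application actually enforces the list is not shown in this file.

```python
import ipaddress


def parse_allowed_ips(raw: str):
    """Split "10.0.0.0/24,127.0.0.1" into a list of ip_network objects.

    A bare address such as 127.0.0.1 becomes a single-host network (/32).
    """
    return [
        ipaddress.ip_network(part.strip(), strict=False)
        for part in raw.split(",")
        if part.strip()
    ]


def is_allowed(client_ip: str, raw_allowed: str) -> bool:
    """Return True if client_ip matches the list, or if the list is empty."""
    networks = parse_allowed_ips(raw_allowed)
    if not networks:
        return True  # empty string means allow all, per the config comment
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


# Example using the value from the config comment.
example = "10.0.0.0/24,127.0.0.1"
print(is_allowed("10.0.0.5", example))
print(is_allowed("192.168.1.1", example))
```

In a real application the parsed list would more likely be computed once at startup from `settings.allowed_ips` rather than on every request.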