Two layered changes shipped in this branch:

== 1.0.0-22: app-level authentication ==

The dashboard previously had only an IP allowlist. Adds username + bcrypt
password auth, signed-cookie sessions, and a "first user setup" flow.

* New app/auth.py: User dataclass, bcrypt hash/verify,
  get_user_by_id/username, create_user, touch_last_login, FastAPI
  `get_current_user` dependency. Session secret loaded from
  SESSION_SECRET env or persisted to /data/session_secret.
* New app/auth_cli.py: `python -m app.auth_cli list|reset|add` for
  out-of-band user management. Passwords always read from a TTY prompt.
* Schema: idempotent ALTER for `users` table (id, username unique,
  password_hash, full_name, is_admin, created_at, last_login_at).
* main.py: SessionMiddleware (HMAC-signed cookie, max-age 7 days,
  SameSite=strict; see hardening section) + _AuthGateMiddleware that
  populates request.state.current_user and bounces unauth'd HTML GETs
  to /login while returning 401 JSON for everything else.
* Routes: GET /login renders first-user-setup form when users table is
  empty, otherwise sign-in form; POST /login; POST /api/v1/auth/setup
  (only works while empty); GET|POST /logout.
* Bootstrap: env vars INITIAL_ADMIN_USERNAME + INITIAL_ADMIN_PASSWORD
  create the first admin on startup if both are set AND the users table
  is empty. Ignored thereafter; change passwords via UI or CLI.
* Layout: header shows current_user.full_name|username + Logout link.
  Modal operator field auto-fills from the logged-in user via
  <meta name="default-operator"> rendered in layout (replaces the
  previous localStorage-only behaviour).
* requirements.txt: pinned bcrypt>=4.0,<5.0, itsdangerous>=2.1,
  python-multipart>=0.0.7. First step toward addressing the
  unpinned-deps gotcha.
* New app/templates/login.html with first-user-setup variant.

== 1.0.0-23: hardening sweep ==

Closes the eight-item gap audit:

* DB retention + automated backup. New app/retention.py runs daily at
  03:00 local. Nulls burnin_stages.log_text on stages older than
  retention_log_days (default 35), VACUUMs to reclaim pages, then runs
  `sqlite3 .backup` to /data/backups/app-YYYY-MM-DD.db, keeping the
  retention_backup_keep most recent (default 14). Wired into the
  lifespan supervisor next to mailer/poller.
* CSRF mitigation. SessionMiddleware bumped to SameSite=strict so the
  browser refuses to send the session cookie on cross-site POSTs, which
  removes the actual CSRF vector. Trade-off: external links into the
  app require re-auth.
* Login rate limiting. In-memory per-username AND per-source-IP failure
  counters in auth.py. 10 failures within 10 min trips a 15-min lockout
  for both keys. Returns HTTP 429 with a clear "try again in N min"
  message. Cleared on successful login.
* Login audit events. New event types in audit_events: user_login,
  user_login_failed, user_login_locked_out, user_logout,
  user_password_changed. All include source IP. Recorded via
  auth.audit_auth_event().
* Password change UI. Header link "Change password" opens
  templates/components/modal_password.html (current/new/confirm).
  Submits to POST /api/v1/auth/change-password, which bcrypt-verifies
  the current password, requires a new password of >=8 chars, and
  writes an audit event.
* NVMe burn-in path. _stage_surface_validate now detects nvme* devnames
  and routes to _stage_surface_validate_nvme(), which runs
  `nvme format -s 1 --force` (in nvme-cli, -s 1 selects user-data
  erase; cryptographic erase is -s 2). Takes seconds versus hours of
  badblocks and exercises the controller's secure-erase path. Falls
  back to badblocks if nvme-cli isn't installed. Post-format SMART
  check.
* Mounted-FS detection. ssh_client.get_mounted_drives() runs
  `findmnt -no SOURCE` and parses non-ZFS sources back to base
  devnames. Poller treats them as pool_name='(mounted)',
  pool_role='mounted'. Confirm token DESTROY MOUNTED FILESYSTEM,
  distinct purple styling, audit event mounted_drive_unlocked,
  daily-report banner picks it up.
* Deeper /health. Real readiness check: DB write probe
  (PRAGMA journal_mode), poller freshness (age <= 3x stale_threshold),
  SSH test_connection() when configured. Returns 503 when any check
  fails so a proxy/orchestrator can take the container out of rotation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
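
The login rate-limiting rule from the 1.0.0-23 notes above (10 failures within 10 min trips a 15-min lockout for both the username and the source-IP key, cleared on successful login) can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of app/auth.py: the `LoginLimiter` class, its method names, and the injectable `clock` parameter are all hypothetical.

```python
import time
from collections import defaultdict, deque

MAX_FAILURES = 10          # failures tolerated inside the window
WINDOW_SECONDS = 10 * 60   # 10-minute sliding window
LOCKOUT_SECONDS = 15 * 60  # 15-minute lockout once tripped


class LoginLimiter:
    """Hypothetical in-memory limiter keyed by username and source IP."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock                  # injectable so tests can fake time
        self._failures = defaultdict(deque)  # key -> timestamps of failures
        self._locked_until = {}              # key -> time the lockout expires

    @staticmethod
    def _keys(username, source_ip):
        return (f"user:{username.lower()}", f"ip:{source_ip}")

    def retry_after(self, username, source_ip):
        """Seconds until another attempt is allowed; 0.0 when not locked."""
        now = self._clock()
        waits = [
            self._locked_until.get(key, 0.0) - now
            for key in self._keys(username, source_ip)
        ]
        return max(0.0, *waits)

    def record_failure(self, username, source_ip):
        """Count one failed login; return True if this trips a lockout."""
        now = self._clock()
        keys = self._keys(username, source_ip)
        tripped = False
        for key in keys:
            hits = self._failures[key]
            hits.append(now)
            # Drop failures that have aged out of the sliding window.
            while hits and now - hits[0] > WINDOW_SECONDS:
                hits.popleft()
            if len(hits) >= MAX_FAILURES:
                tripped = True
        if tripped:
            # Either key reaching the limit locks both keys.
            for key in keys:
                self._locked_until[key] = now + LOCKOUT_SECONDS
        return tripped

    def record_success(self, username, source_ip):
        """A successful login clears counters and lockouts for both keys."""
        for key in self._keys(username, source_ip):
            self._failures.pop(key, None)
            self._locked_until.pop(key, None)
```

A login handler built on this sketch would call `retry_after()` before verifying the password and answer HTTP 429 while it is non-zero, rounding the value up to whole minutes for the "try again in N min" message. State kept this way is lost on restart and is not shared between worker processes, which fits the "in-memory" wording above.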
185 lines
7 KiB
Python
import asyncio
import ipaddress
import logging
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.middleware.sessions import SessionMiddleware
from starlette.requests import Request
from starlette.responses import JSONResponse, PlainTextResponse

from app import auth, burnin, mailer, poller, retention, settings_store
from app.config import settings
from app.database import init_db
from app.logging_config import configure as configure_logging
from app.renderer import templates  # noqa: F401  # registers filters as a side effect
from app.routes import router
from app.truenas import TrueNASClient

# Configure structured JSON logging before anything else logs
configure_logging()
log = logging.getLogger(__name__)


# ---------------------------------------------------------------------------
# IP allowlist middleware
# ---------------------------------------------------------------------------

class _IPAllowlistMiddleware(BaseHTTPMiddleware):
    """
    Block requests from IPs not in ALLOWED_IPS.

    When ALLOWED_IPS is empty the middleware is a no-op.
    Checks X-Forwarded-For first (trusts the leftmost address), then the
    direct client IP.
    """

    def __init__(self, app, allowed_ips: str) -> None:
        super().__init__(app)
        self._networks: list[ipaddress.IPv4Network | ipaddress.IPv6Network] = []
        for entry in (s.strip() for s in allowed_ips.split(",") if s.strip()):
            try:
                self._networks.append(ipaddress.ip_network(entry, strict=False))
            except ValueError:
                log.warning("Invalid ALLOWED_IPS entry ignored: %r", entry)

    def _is_allowed(self, ip_str: str) -> bool:
        try:
            addr = ipaddress.ip_address(ip_str)
            return any(addr in net for net in self._networks)
        except ValueError:
            return False

    async def dispatch(self, request: Request, call_next):
        if not self._networks:
            return await call_next(request)

        # Prefer X-Forwarded-For (leftmost = original client)
        forwarded = request.headers.get("X-Forwarded-For", "").split(",")[0].strip()
        client_ip = forwarded or (request.client.host if request.client else "")

        if self._is_allowed(client_ip):
            return await call_next(request)

        log.warning("Request blocked by IP allowlist", extra={"client_ip": client_ip})
        return PlainTextResponse("Forbidden", status_code=403)


# ---------------------------------------------------------------------------
# Poller supervisor: restarts run() if it ever exits unexpectedly
# ---------------------------------------------------------------------------

async def _supervised_poller(client: TrueNASClient) -> None:
    while True:
        try:
            await poller.run(client)
        except asyncio.CancelledError:
            raise  # Propagate shutdown signal cleanly
        except Exception as exc:
            log.critical("Poller crashed unexpectedly; restarting in 5s: %s", exc)
            await asyncio.sleep(5)


# ---------------------------------------------------------------------------
# Lifespan
# ---------------------------------------------------------------------------

_client: TrueNASClient | None = None


@asynccontextmanager
async def lifespan(app: FastAPI):
    global _client
    log.info("Starting up")
    await init_db()
    settings_store.init()
    await auth.bootstrap_admin_if_empty()
    _client = TrueNASClient()
    await burnin.init(_client)
    poll_task = asyncio.create_task(_supervised_poller(_client))
    mailer_task = asyncio.create_task(mailer.run())
    retention_task = asyncio.create_task(retention.run())
    yield
    log.info("Shutting down")
    poll_task.cancel()
    mailer_task.cancel()
    retention_task.cancel()
    try:
        await asyncio.gather(poll_task, mailer_task, retention_task,
                             return_exceptions=True)
    except asyncio.CancelledError:
        pass
    await _client.close()


# ---------------------------------------------------------------------------
# App
# ---------------------------------------------------------------------------

app = FastAPI(title="TrueNAS Burn-In Dashboard", lifespan=lifespan)


# ---------------------------------------------------------------------------
# Auth gate: must be added BEFORE include_router so it runs first.
# Path-prefix allowlist below covers anything we want reachable without
# a session cookie. SSE streams + WebSockets fall through to the dependency
# in their handler so they 401 cleanly.
# ---------------------------------------------------------------------------

_PUBLIC_PATHS = {"/login", "/logout", "/health", "/auth/setup"}
_PUBLIC_PREFIXES = ("/static/", "/api/v1/auth/")


class _AuthGateMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        path = request.url.path
        # Always populate request.state.current_user from the session so
        # templates and route handlers can both rely on it. None when
        # unauthenticated.
        user_id = request.session.get("user_id")
        request.state.current_user = (
            await auth.get_user_by_id(int(user_id)) if user_id else None
        )

        if path in _PUBLIC_PATHS or path.startswith(_PUBLIC_PREFIXES):
            return await call_next(request)
        if request.state.current_user is not None:
            return await call_next(request)
        # Unauthenticated. HTML GETs bounce to /login with a `next` query
        # arg so the user lands back where they tried to go after logging
        # in. Anything else (API calls, SSE, POSTs) gets a 401.
        accept = request.headers.get("accept", "")
        if request.method == "GET" and "text/html" in accept:
            return auth.login_redirect(path)
        return JSONResponse(
            {"detail": "Authentication required"}, status_code=401
        )


app.add_middleware(_AuthGateMiddleware)
# SessionMiddleware must be added AFTER _AuthGateMiddleware: Starlette wraps
# the most recently added middleware outermost, so request.session is
# populated before AuthGate runs.
app.add_middleware(
    SessionMiddleware,
    secret_key=auth.get_session_secret(),
    session_cookie="burnin_session",
    max_age=settings.session_max_age_seconds,
    https_only=False,  # we sit behind nginx-proxy-manager; trust upstream
    # SameSite=strict is the primary CSRF mitigation: the browser never
    # sends the session cookie on cross-site requests, so an attacker
    # page can't trigger any state-changing endpoint even if it knows
    # the URL. Trade-off: an external link (email, chat) into the app
    # won't carry the session, so the user has to re-auth via /login.
    # For an internal-only tool that's the right default.
    same_site="strict",
)


if settings.allowed_ips:
    app.add_middleware(_IPAllowlistMiddleware, allowed_ips=settings.allowed_ips)
    log.info("IP allowlist active: %s", settings.allowed_ips)

app.mount("/static", StaticFiles(directory="app/static"), name="static")
app.include_router(router)