nas-burnin/app/routes/auth.py
Brandon Walter aa7822d6ce
feat: rate limiter + mypy + lifecycle tests + routes/ split (1.0.0-33/-34)
Closes the four remaining items from the post-Codex hardening list.

#1 Rate-limit unlock + change-password endpoints (1.0.0-33)
   * Generalised the existing login limiter into a reusable
     `_RateLimiter` class in app/auth.py. The check-then-increment is
     atomic because it runs in synchronous code with no `await` in
     between, so a parallel asyncio burst can't slip past the
     threshold.
   * `unlock_limiter` (5 attempts in 10 min → 10 min lockout) gates
     POST /api/v1/drives/{id}/unlock per-drive AND per-source-IP.
   * `pwchange_limiter` (5 in 10 min → 15 min lockout) gates
     POST /api/v1/auth/change-password per-user AND per-IP.
   * Both clear on successful operation. The login limiter keeps its
     existing `register_login_attempt` / `clear_login_failures`
     facade names so external callers don't change.
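
A minimal sketch of the pattern described in #1, not the actual `_RateLimiter` from app/auth.py. The class name `SketchRateLimiter`, the injectable `clock` parameter, the `"locked_out"` return value, and the choice to trip on the Nth attempt inside the window are all assumptions made for illustration; only `register(*keys)`, `clear(*keys)`, `"ok"` and `"now_locked_out"` mirror names that appear in this commit.

```python
from __future__ import annotations

import time
from collections import defaultdict, deque
from typing import Callable

Key = tuple[str, str]  # e.g. ("user", "alice") or ("ip", "10.0.0.5")


class SketchRateLimiter:
    """Hypothetical stand-in for `_RateLimiter`; not the project's code."""

    def __init__(self, max_attempts: int, window_s: float, lockout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_attempts = max_attempts
        self.window_s = window_s
        self.lockout_s = lockout_s
        self._clock = clock
        self._attempts: dict[Key, deque[float]] = defaultdict(deque)
        self._locked_until: dict[Key, float] = {}

    def register(self, *keys: Key) -> str:
        # There is no `await` in this method, so the event loop cannot run
        # another coroutine between the lockout check and the increment.
        now = self._clock()
        for key in keys:
            if self._locked_until.get(key, 0.0) > now:
                return "locked_out"
        result = "ok"
        for key in keys:
            hits = self._attempts[key]
            # Drop attempts that have aged out of the sliding window.
            while hits and hits[0] <= now - self.window_s:
                hits.popleft()
            hits.append(now)
            if len(hits) >= self.max_attempts:
                self._locked_until[key] = now + self.lockout_s
                result = "now_locked_out"
        return result

    def clear(self, *keys: Key) -> None:
        # Called after a successful operation: forget counter and lockout.
        for key in keys:
            self._attempts.pop(key, None)
            self._locked_until.pop(key, None)
```

Under these assumptions, a limiter built as `SketchRateLimiter(5, 600, 900)` would correspond to the "5 in 10 min → 15 min lockout" policy, and passing both a user key and an IP key to `register` gates on either one being locked.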

#3 mypy in security-scan (1.0.0-33)
   * Added a 4th tool to the daily scan + forge workflow. It runs in a
     throwaway python:3.12-slim container against the deploy dir; the
     exit code is informational only (NOT included in the
     `TOTAL_EXIT` failure sum). Findings land in
     ~/security-scans/scan-YYYY-MM-DD/mypy.txt for ratchet-down
     work over time.
   * Forge job uses `continue-on-error: true` so it doesn't fail the
     workflow until the type-debt baseline is annotated down.

#4 Lifecycle test coverage (1.0.0-33)
   * New tests/test_lifecycle.py with 15 cases:
     - TestCommonHelpers (7 tests): _start_stage, _finish_stage
       success/failure/error-preservation, _recalculate_progress
       weighted math, _is_cancelled, _append_stage_log.
     - TestStartCancelJob (4 tests): start_job inserts queued row +
       correct stage list, duplicate-active rejection, cancel marks
       state, cancel returns False on terminal-state jobs.
     - TestRateLimiter (4 tests): under-threshold ok, trips at
       threshold, clear removes both counter + lockout, separate
       keys don't interfere.
   * Total goes from 44 to 59 tests; closes the orchestration-path
     coverage gap Codex flagged.

#2 Partial routes.py split (1.0.0-34)
   * routes.py → routes/ package. Same staged-extraction pattern as
     the burnin.py split.
   * routes/auth.py — login/logout/setup/change-password (170 LoC).
   * routes/system.py — /health, /ws/terminal, /api/v1/updates/check
     (136 LoC).
   * routes/_helpers.py — shared utilities used by both extracted
     modules and the still-monolithic remainder: client_ip,
     operator_for, is_stale, stale_context, secret_status,
     SECRET_FIELDS (97 LoC).
   * routes/__init__.py shrank from 1568 LoC to 1261. Future slices
     can extract drives, burnin, history, settings the same way.
   * GOTCHA recorded in commit body: `from app import auth` at the
     top of __init__.py binds `auth` as an attribute on the package
     namespace, so `from . import auth as _auth_routes` finds the
     OUTER module and yields `app.auth` instead of the submodule.
     Fix is `import app.routes.auth as _auth_routes` (absolute).
     This bit me once at deploy time; container failed to start
     with `module 'app.auth' has no attribute 'router'`.
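
The import gotcha above can be reproduced in isolation. The sketch below builds a throwaway package on disk; the name `demo_app` and the variables `relative_result` / `absolute_result` are hypothetical and exist only for this demonstration. It relies on standard import behaviour: `from . import name` returns an existing package attribute when one is already bound, while `import pkg.sub as x` loads the submodule.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical miniature of the layout: demo_app/auth.py plays the "outer"
# module, demo_app/routes/auth.py is the submodule that defines `router`.
FILES = {
    "demo_app/__init__.py": "",
    "demo_app/auth.py": "KIND = 'outer auth module'\n",
    "demo_app/routes/auth.py": "router = 'routes submodule router'\n",
    "demo_app/routes/__init__.py": (
        "from demo_app import auth\n"                      # binds `auth` on the package
        "from . import auth as relative_result\n"          # finds that existing binding
        "import demo_app.routes.auth as absolute_result\n" # loads the real submodule
    ),
}


def load_demo_routes():
    """Write the demo package to a temp dir (left on disk) and import it."""
    root = Path(tempfile.mkdtemp())
    for rel_path, source in FILES.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    return importlib.import_module("demo_app.routes")


routes = load_demo_routes()
print(routes.relative_result.__name__)            # demo_app.auth
print(routes.absolute_result.__name__)            # demo_app.routes.auth
print(hasattr(routes.relative_result, "router"))  # False
```

One side effect worth checking in a real `__init__.py`: once the submodule has been imported, the import system rebinds the package attribute `auth` to the submodule, so the relative order of the two imports decides which module the bare name `auth` refers to afterwards.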

Verification: 59/59 tests pass (44 existing + 15 new); container
boots clean at 1.0.0-34; /health 200 with all checks green; security
scan still clean (mypy informational findings ignored from totals).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 09:29:53 -04:00


"""Login / logout / first-user setup / password change routes.

Public path mounting:
    GET  /login                        - render login or first-user setup form
    POST /login                        - credential check + session bootstrap
    POST /api/v1/auth/setup            - first-user creation (only when zero users)
    GET  /logout                       - clear session, redirect
    POST /logout                       - same, for explicit POST clients
    POST /api/v1/auth/change-password  - rotate password + audit
"""
from __future__ import annotations

import time as _time

from fastapi import APIRouter, HTTPException, Request
from fastapi.responses import HTMLResponse, RedirectResponse

from app import auth
from app.renderer import templates

from ._helpers import client_ip

router = APIRouter()


@router.get("/login", response_class=HTMLResponse)
async def login_page(request: Request, next: str = "/", error: str | None = None):
    needs_setup = (await auth.user_count()) == 0
    return templates.TemplateResponse(request, "login.html", {
        "request": request,
        "needs_setup": needs_setup,
        "error": error,
        "next": next if next.startswith("/") else "/",
    })


@router.post("/login")
async def login_submit(request: Request):
    form = await request.form()
    username = (form.get("username") or "").strip()
    password = form.get("password") or ""
    next_url = form.get("next") or "/"
    if not next_url.startswith("/"):
        next_url = "/"
    ip = client_ip(request)

    # Atomic register-and-check: increments the counter NOW (before any
    # await), so a parallel burst of guesses can't all slip past the
    # threshold. Cleared on successful auth via clear_login_failures.
    attempt = auth.register_login_attempt(username, ip)
    if attempt != "ok":
        if attempt == "now_locked_out":
            await auth.audit_auth_event(
                "user_login_locked_out", username,
                f"Failed login from {ip} — IP/user locked out for {auth.LOGIN_LOCKOUT_SECONDS // 60} min",
            )
        locked_until = auth.login_locked_until(username, ip)
        remaining = int((locked_until or _time.time()) - _time.time())
        return templates.TemplateResponse(request, "login.html", {
            "request": request,
            "needs_setup": False,
            "error": f"Too many failed attempts. Try again in {remaining // 60 + 1} min.",
            "next": next_url,
        }, status_code=429)

    found = await auth.get_user_by_username(username)
    if not found or not auth.verify_password(password, found[1]):
        # Constant-ish-time: still call verify on a junk hash if user missing
        # so the timing of "user not found" matches "wrong password."
        if not found:
            auth.verify_password(password, "$2b$12$" + "x" * 53)
        await auth.audit_auth_event(
            "user_login_failed", username, f"Failed login from {ip}",
        )
        return templates.TemplateResponse(request, "login.html", {
            "request": request,
            "needs_setup": False,
            "error": "Invalid username or password.",
            "next": next_url,
        }, status_code=401)

    user = found[0]
    auth.clear_login_failures(username, ip)
    # Clear any pre-login session keys before populating the new identity.
    # Closes session-fixation: if an attacker had somehow seeded the
    # browser with a session cookie, this discards everything in it
    # before issuing the new authenticated payload.
    request.session.clear()
    request.session["user_id"] = user.id
    request.session["username"] = user.username
    await auth.touch_last_login(user.id)
    await auth.audit_auth_event(
        "user_login", user.username, f"Signed in from {ip}",
    )
    return RedirectResponse(url=next_url, status_code=303)


@router.post("/api/v1/auth/setup")
async def auth_first_user_setup(request: Request):
    """Create the first admin from the login page when the users table is
    empty. Public endpoint, but it only does anything when zero users exist."""
    if (await auth.user_count()) > 0:
        raise HTTPException(status_code=409, detail="Users already exist.")
    form = await request.form()
    username = (form.get("username") or "").strip()
    password = form.get("password") or ""
    full_name = (form.get("full_name") or "").strip() or None
    try:
        # bootstrap_only=True wraps the existence check + insert in an
        # IMMEDIATE transaction so two concurrent setup requests can't
        # both create admin accounts during the bootstrap window.
        user = await auth.create_user(
            username, password, full_name, is_admin=True, bootstrap_only=True
        )
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    # Same fixation defense as the login flow: discard any pre-existing
    # session payload before issuing the authenticated identity.
    request.session.clear()
    request.session["user_id"] = user.id
    request.session["username"] = user.username
    await auth.touch_last_login(user.id)
    return RedirectResponse(url="/", status_code=303)


@router.get("/logout")
@router.post("/logout")
async def logout(request: Request):
    user = request.state.current_user if hasattr(request.state, "current_user") else None
    if user:
        await auth.audit_auth_event(
            "user_logout", user.username, f"Signed out from {client_ip(request)}",
        )
    request.session.clear()
    return RedirectResponse(url="/login", status_code=303)


@router.post("/api/v1/auth/change-password")
async def change_password(request: Request):
    user = request.state.current_user if hasattr(request.state, "current_user") else None
    if not user:
        raise HTTPException(status_code=401, detail="Authentication required")
    ip = client_ip(request)

    # Rate-limit before bcrypt to keep an attacker-controlled session
    # from burning CPU brute-forcing the current_password field.
    keys = (("user", user.username.lower()), ("ip", ip))
    attempt = auth.pwchange_limiter.register(*keys)
    if attempt != "ok":
        raise HTTPException(
            status_code=429,
            detail="Too many password-change attempts. Try again later.",
        )

    form = await request.form()
    current = form.get("current_password") or ""
    new_pw = form.get("new_password") or ""
    confirm = form.get("confirm_password") or ""
    if new_pw != confirm:
        raise HTTPException(status_code=400, detail="New passwords do not match.")
    try:
        await auth.change_password(user.id, current, new_pw)
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    auth.pwchange_limiter.clear(*keys)
    await auth.audit_auth_event(
        "user_password_changed", user.username,
        f"Password changed from {ip}",
    )
    return {"ok": True}