# Backend Operational Contracts

## Billing endpoint behavior

- `GET /api/v1/billing/` returns a structured readiness payload.
- If Mollie is not configured, the API returns `status=configuration_required` with `code=payment_provider_not_configured` and an `action` field.
- If configured, the API returns `status=ready` and the active `restaurant_id`.
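
A minimal sketch of the two payload shapes described above, for readers wiring up a client or a test double. The function name, the `mollie_api_key` parameter, and the `action` value are hypothetical; the source only states that an `action` field exists.

```python
from typing import Optional


def billing_readiness(mollie_api_key: Optional[str], restaurant_id: str) -> dict:
    """Build an illustrative readiness payload for GET /api/v1/billing/."""
    if not mollie_api_key:
        return {
            "status": "configuration_required",
            "code": "payment_provider_not_configured",
            # Hypothetical value: the real contents of `action` are not documented here.
            "action": "configure_payment_provider",
        }
    return {"status": "ready", "restaurant_id": restaurant_id}


not_ready = billing_readiness(None, "rest-42")
ready = billing_readiness("test_key", "rest-42")
```

Clients can branch on `status` alone and treat `code` as the stable machine-readable reason.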

## Knowledge lifecycle endpoints

- `GET /api/v1/knowledge/` lists ingested knowledge documents scoped to the authenticated restaurant.
- `DELETE /api/v1/knowledge/{document_id}` deletes a scoped knowledge document.
- Missing documents return `404`.
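
The scoping rule can be illustrated with an in-memory stand-in for the knowledge table. This is a sketch, not the handler code: the `204` success status and the choice to report another restaurant's document as `404` are assumptions.

```python
from typing import Dict, List, Tuple

# Stand-in for the knowledge table: (restaurant_id, document_id) -> title.
Store = Dict[Tuple[str, str], str]


def list_documents(store: Store, restaurant_id: str) -> List[str]:
    """Return only the document ids owned by the caller's restaurant."""
    return sorted(doc_id for (rid, doc_id) in store if rid == restaurant_id)


def delete_document(store: Store, restaurant_id: str, document_id: str) -> int:
    """Delete a scoped document and return the HTTP status to send."""
    key = (restaurant_id, document_id)
    if key not in store:
        # Assumption: a document owned by another restaurant looks like a missing one.
        return 404
    del store[key]
    return 204  # Assumption: the actual success status is not stated in this document.


store: Store = {("rest-1", "doc-a"): "Menu", ("rest-2", "doc-b"): "Wine list"}
foreign_delete = delete_document(store, "rest-1", "doc-b")
own_delete = delete_document(store, "rest-1", "doc-a")
```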

## Time-slot management endpoints

- `GET /api/v1/time-slots/` lists scoped slots.
- `POST /api/v1/time-slots/` creates a scoped slot.
- `PUT /api/v1/time-slots/{slot_id}` updates a scoped slot.
- `DELETE /api/v1/time-slots/{slot_id}` deletes a scoped slot.

## Better Auth operational runbook

### Verify auth is alive

- Session check from the app origin:
  ```bash
  curl -i http://localhost:3000/api/auth/get-session
  ```
- Expect `200` with a JSON session payload when the request carries the session cookie (curl does not send browser cookies; pass one with `-b`), or `200` with `null`/empty session when unauthenticated.
- For backend JWT verification issues, check that `BETTER_AUTH_URL/api/auth/jwks` is reachable. The verifier keeps an in-process JWKS cache: a decode failure on a previously cached key triggers one forced refresh before the request is rejected.
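
The cache-then-refresh-once behavior can be sketched as follows. This is an illustration under assumptions, not the backend's verifier: `JwksCache`, `fetch_keys`, and `fake_decode` are hypothetical names, key material is reduced to plain strings, and refreshing on an unknown `kid` is an added assumption.

```python
from typing import Callable, Dict


class JwksCache:
    """In-process key cache with a single forced refresh on decode failure."""

    def __init__(self, fetch_keys: Callable[[], Dict[str, str]]) -> None:
        self._fetch_keys = fetch_keys  # stand-in for an HTTP GET of the JWKS endpoint
        self._keys: Dict[str, str] = {}

    def refresh(self) -> None:
        self._keys = dict(self._fetch_keys())

    def verify(self, kid: str, decode: Callable[[str], dict]) -> dict:
        """Decode with the cached key; on failure refresh once, then reject."""
        if not self._keys:
            self.refresh()
        for attempt in range(2):
            key = self._keys.get(kid)
            if key is not None:
                try:
                    return decode(key)
                except ValueError:
                    pass  # stale or rotated key: fall through to the refresh
            if attempt == 0:
                self.refresh()
        raise PermissionError("token rejected after one forced JWKS refresh")


# Usage with stand-in callables (no network, no real signature check).
_key_sets = iter([{"k1": "old"}, {"k1": "new"}])
cache = JwksCache(lambda: next(_key_sets))


def fake_decode(key: str) -> dict:
    if key != "new":
        raise ValueError("signature mismatch")
    return {"sub": "user-1"}


claims = cache.verify("k1", fake_decode)
```

Because the cache lives in the process, each worker refreshes independently after a key rotation; a restart also clears it.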

### Issue a platform invitation token

- Create one token for a known email:
  ```bash
  cd backend && uv run python scripts/issue_invite.py --email user@example.com
  ```
- Override expiry/count as needed with `--expires 3d` and `--count 5`.
- Send the printed sign-up URL to the operator or customer; the token gates platform sign-up before Better Auth account creation.

### Manual email verification for support

- Use this only for support recovery when the user controls the mailbox but the verification link flow is blocked.
- Update the Better Auth-managed `user` row directly:
  ```sql
  -- Quote camelCase columns: Postgres folds unquoted identifiers to lowercase.
  UPDATE "user"
     SET "emailVerified" = TRUE,
         "updatedAt" = NOW()
   WHERE email = lower('user@example.com');
  ```
- Ask the user to sign out and sign back in after the change so a fresh session/JWT reflects `emailVerified=true`.

### Reset-password fallback

- Operators do not mutate password hashes manually. Instead, generate a fresh invitation/reset path with `issue_invite.py`, deliver it out-of-band, and let Better Auth handle the credential change through its normal token flow.

### Safari ITP note

- Production runs Better Auth on the app subdomain and the API on a sibling subdomain. Keep cross-subdomain cookies enabled for the shared parent domain so Safari treats the session as first-party for app-driven navigation.
- If Safari drops the session unexpectedly, verify the deployed cookie domain matches the shared registrable domain and that auth requests originate from the app subdomain rather than the API host directly.

### Audit log retention + alerting (P11/P12)

Better Auth lifecycle hooks emit structured JSON events to stdout, which the Logfire pipeline ingests under the configured token. Pin retention and alerts in the Logfire dashboard against these event names:

| Event name | Source | Action |
| --- | --- | --- |
| `auth.user.create` | databaseHooks.user.create.after | Retain ≥ 1 year (account creation audit). |
| `auth.user.update` | databaseHooks.user.update.after | Retain ≥ 90 days. |
| `auth.user.delete` | user.deleteUser.afterDelete | Retain ≥ 1 year (account deletion audit). |
| `auth.session.create` | databaseHooks.session.create.after | Retain ≥ 90 days. Alert on velocity spikes. |
| `auth.account.update` | databaseHooks.account.update.after | Retain ≥ 90 days (password / linked-account changes). |
| `auth.member.role_change` | organizationHooks.afterUpdateMemberRole | Retain ≥ 1 year (privilege escalation audit). |
| `internal_auth_email_sent` | /internal/auth/email handler | Retain ≥ 30 days. Alert on `internal_auth_email_dispatch_failed` rate > baseline. |
| `orphan_team_cleanup_complete` | restate cleanup_orphan_teams cron | Retain ≥ 30 days. Alert on `deleted` > 50 per run (unusual). |
| `auth_verification_started` | AuthVerificationWorkflow.main | Retain ≥ 30 days. |
| `auth_verified_fast` / `auth_verified_after_reminder` / `auth_verified_recheck_*` | AuthVerificationWorkflow.main | Retain ≥ 30 days. |
| `auth_verification_reminder_sent` | AuthVerificationWorkflow.main | Retain ≥ 30 days. |
| `auth_verification_abandoned` | AuthVerificationWorkflow.main | Retain ≥ 1 year. Alert on > 5 per day (unusual). |
| `invitation_workflow_started` | InvitationLifecycleWorkflow.main | Retain ≥ 30 days. |
| `invitation_accepted_*` / `invitation_reminder_sent` / `invitation_expired` | InvitationLifecycleWorkflow.main | Retain ≥ 30 days. |
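
For backend-side events in the table, a structured stdout emitter might look like the sketch below. The record layout (`event_name`, `timestamp`, `attributes`) is an assumption chosen to line up with the columns the alert queries filter on; `emit_event` is a hypothetical helper, not the project's logger.

```python
import json
import sys
from datetime import datetime, timezone
from typing import Any


def emit_event(event_name: str, **attributes: Any) -> str:
    """Write one JSON object per line to stdout and return the line."""
    record = {
        "event_name": event_name,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "attributes": attributes,
    }
    line = json.dumps(record, sort_keys=True)
    sys.stdout.write(line + "\n")
    return line


line = emit_event("orphan_team_cleanup_complete", deleted=3)
```

One object per line keeps each event parseable on its own, which is what a log shipper reading stdout generally expects.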

Recommended Logfire SQL alerts:

```sql
-- A: Sign-in velocity spike (potential credential stuffing).
SELECT COUNT(*) AS attempts
  FROM records
 WHERE event_name = 'auth.session.create'
   AND start_timestamp > NOW() - INTERVAL '5 minutes';
-- Alert when attempts > 200 over 5 minutes.

-- B: Rate-limit fire rate (potential brute force).
SELECT COUNT(*) AS rate_limited
  FROM records
 WHERE http_status = 429
   AND attributes->>'path' LIKE '/api/auth/%'
   AND start_timestamp > NOW() - INTERVAL '5 minutes';
-- Alert when rate_limited > 50 over 5 minutes.

-- C: Email dispatch failure rate.
SELECT COUNT(*) AS failures
  FROM records
 WHERE event_name = 'internal_auth_email_dispatch_failed'
   AND start_timestamp > NOW() - INTERVAL '15 minutes';
-- Alert when failures > 0 — Scaleway TEM outage or HMAC misconfig.

-- D: Privilege escalation audit (role changes upward).
SELECT *
  FROM records
 WHERE event_name = 'auth.member.role_change'
   AND attributes->>'newRole' IN ('owner', 'admin')
   AND start_timestamp > NOW() - INTERVAL '1 hour';
-- Review daily; ANY result is auditable.
```

Configure these in Logfire → Settings → Alerts. Retention is set per-project at Logfire → Settings → Data → Retention (project tier dictates max).

## Neon RLS (per-branch one-time setup)

Dineo enforces Postgres Row-Level Security via the `pg_session_jwt`
extension and Neon's predefined `authenticated` role (NOBYPASSRLS). The
backend connects with a single pool as `neondb_owner` (BYPASSRLS) and,
on every transaction for a user-facing route, drops the effective role
to `authenticated` and injects the request's already-validated Better
Auth JWT claims into the `request.jwt.claims` GUC. RLS policies on
tenant tables key on `auth.session() ->> 'activeRestaurantId'`. A
missing claim fails closed: the policies match zero rows.

Server-internal traffic (Restate handlers, cron sweeps, public/webhook
slug lookups) leaves the role on `neondb_owner`, which intentionally
bypasses RLS, and filters by `restaurant_id` at the application layer.

Why this matters: every Neon-provisioned role *except* `authenticated`
has BYPASSRLS, and only a true superuser can strip that attribute; Neon
does not expose superuser access. The `SET LOCAL ROLE` pattern lets us
reach `authenticated` from a regular login pool. The alternative, which
is how the Neon HTTP gateway authenticates that role, would be a
per-request JWT in the connection string.
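
The per-transaction role drop described above can be sketched as a small
function that issues two statements at the start of a transaction. This
is an illustration, not the backend's `after_begin` hook: `execute` is a
hypothetical callable standing in for a driver cursor, the `%s`
placeholder style is an assumption, and setting the claims before the
role switch is a choice made for this sketch.

```python
import json
from typing import Any, Callable, List, Mapping, Optional, Sequence, Tuple

Execute = Callable[[str, Sequence[Any]], None]


def apply_rls_context(execute: Execute, claims: Optional[Mapping[str, Any]]) -> bool:
    """Run at transaction start. Returns True if the role was dropped."""
    if claims is None:
        # Server-internal traffic: stay on neondb_owner and filter by
        # restaurant_id in application code.
        return False
    # set_config(..., true) is the transaction-local form and takes a bound parameter.
    execute(
        "SELECT set_config('request.jwt.claims', %s, true)",
        (json.dumps(dict(claims)),),
    )
    # SET LOCAL reverts at COMMIT/ROLLBACK, so the pooled connection is clean afterwards.
    execute("SET LOCAL ROLE authenticated", ())
    return True


statements: List[Tuple[str, tuple]] = []
dropped = apply_rls_context(
    lambda sql, params: statements.append((sql, tuple(params))),
    {"sub": "user-1", "activeRestaurantId": "rest-42"},
)
```

Both settings are transaction-scoped, so nothing leaks to the next
request that borrows the same pooled connection.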

You must perform this setup **once per Neon branch** (production, dev,
preview branches). Skipping a branch silently loses isolation.

### 1. Enable the Data API on the branch

Neon bundles the `authenticated` role + `pg_session_jwt` extension into
the **Data API** feature. We don't use the HTTP gateway — only the
roles and extension it provisions — but enabling it is the supported
way to get those primitives on a branch.

1. Neon Console → your project → confirm the branch selector is on the
   target branch (start with dev; repeat for production later).
2. Left sidebar → **Data API** → **Enable Data API**.
3. On the enable screen:
   - ☐ **Use Neon Auth** — **leave UNCHECKED**. We use Better Auth.
   - ☑ **Grant public schema access** — **check it**. Neon then applies
     `GRANT USAGE / SELECT / INSERT / UPDATE / DELETE … TO authenticated`
     on every table in `public`, plus future-table default privileges.
     Without this checkbox the `authenticated` role would hit
     `permission denied` errors before RLS even ran.
   - Database → `neondb`.
4. Click **Enable Data API**.

Neon now provisions the `authenticated` (NOBYPASSRLS), `anonymous`, and
`authenticator` roles, installs `pg_session_jwt` if missing, and grants
`neondb_owner` membership in `authenticated` + `anonymous` so the
backend can `SET LOCAL ROLE authenticated` without further configuration.

There is **no JWKS URL to register**. We use `pg_session_jwt` in its
PostgREST-compatible fallback mode: the backend validates the Better
Auth (BA) JWT against BA's JWKS via PyJWT (already in the boot path)
and then injects the validated claims into the `request.jwt.claims`
GUC. The extension reads from that GUC when no JWK is configured at the
cluster level.

### 2. Manual grants (only if you skipped "Grant public schema access")

If the checkbox above was missed, run this once as `neondb_owner`:

```bash
psql "$NEON_DATABASE_URL_DIRECT" <<'SQL'
GRANT USAGE ON SCHEMA public TO authenticated;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO authenticated;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO authenticated;
ALTER DEFAULT PRIVILEGES FOR ROLE neondb_owner IN SCHEMA public
  GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO authenticated;
ALTER DEFAULT PRIVILEGES FOR ROLE neondb_owner IN SCHEMA public
  GRANT USAGE, SELECT ON SEQUENCES TO authenticated;
SQL
```

Idempotent. Future tables created by `neondb_owner` inherit the grant.

### 3. Capture the env vars

Set in `.env` (dev) and the production secret store:

| Variable | Source | Used by |
| --- | --- | --- |
| `NEON_DATABASE_URL` | `neondb_owner` pooled URL | Every backend session. The `after_begin` hook drops to `authenticated` per transaction for tenant routes. |
| `NEON_DATABASE_URL_DIRECT` | `neondb_owner` direct URL | Alembic migrations only. |

Both must point at the **same branch**.

### 4. Verify enforcement

```bash
# 4a. Role exists and is NOBYPASSRLS:
psql "$NEON_DATABASE_URL_DIRECT" -c "
  SELECT rolname, rolbypassrls
    FROM pg_roles
   WHERE rolname IN ('authenticated','neondb_owner')
   ORDER BY rolname"
# Expect:
#   authenticated | f
#   neondb_owner  | t

# 4b. neondb_owner is a member of authenticated (so SET LOCAL ROLE works):
psql "$NEON_DATABASE_URL_DIRECT" -c "
  SELECT r.rolname AS member_of
    FROM pg_auth_members am
    JOIN pg_roles r ON r.oid = am.roleid
    JOIN pg_roles m ON m.oid = am.member
   WHERE m.rolname = 'neondb_owner'
     AND r.rolname IN ('authenticated','anonymous')"
# Expect two rows: authenticated, anonymous.

# 4c. Fail-closed probe: same connection, SET LOCAL ROLE inside a
#     transaction, no claims set. Expect 0 from every tenant table.
#     Fed via stdin because psql before version 15 prints only the last
#     result of a multi-statement -c string (here, ROLLBACK).
psql "$NEON_DATABASE_URL_DIRECT" <<'SQL'
BEGIN;
SET LOCAL ROLE authenticated;
SELECT COUNT(*) FROM reservation;
ROLLBACK;
SQL
# Expect: 0  (no claim → auth.session() is NULL → policy filters all rows)
```

If 4a shows `authenticated | t`, Neon mis-provisioned the role; open a
support ticket. If 4b returns fewer than 2 rows, the Data API enable
didn't complete; redo step 1. If 4c returns a count above 0, the
JWT-based RLS policies aren't in place; re-run `alembic upgrade head`
to apply `0030_rls_via_pg_session_jwt`.
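
If the checks run from a script, the interpretation above can be encoded
as a small function. This is a sketch with hypothetical names
(`diagnose` and its parameters); it only maps already-collected query
results to the remediation steps in this section.

```python
from typing import Dict, List, Set


def diagnose(bypass_rls: Dict[str, bool], member_of: Set[str], probe_count: int) -> List[str]:
    """Map the 4a-4c results to the remediation steps in this section."""
    findings: List[str] = []
    if "authenticated" not in bypass_rls:
        findings.append("4a: role authenticated is absent; complete step 1 on this branch")
    elif bypass_rls["authenticated"]:
        findings.append("4a: authenticated has BYPASSRLS; open a Neon support ticket")
    if not {"authenticated", "anonymous"} <= member_of:
        findings.append("4b: membership incomplete; redo step 1")
    if probe_count > 0:
        findings.append("4c: rows visible without claims; run alembic upgrade head")
    return findings


healthy = diagnose(
    {"authenticated": False, "neondb_owner": True},
    {"authenticated", "anonymous"},
    0,
)
broken = diagnose({"authenticated": True}, {"authenticated"}, 3)
```

An empty list means the branch passed all three checks.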

### 5. After every Alembic migration that adds a tenant table

If you checked "Grant public schema access" in step 1, the
`ALTER DEFAULT PRIVILEGES` clause Neon set up auto-grants new tables.
Confirm with:

```sql
SELECT table_name, privilege_type
  FROM information_schema.table_privileges
 WHERE grantee = 'authenticated'
   AND table_name = '<new_table>';
```

If the query returns no rows, run the grants block in §2; it is idempotent.
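
The query returns one row per privilege, so a partial grant is also
possible. A sketch of a post-migration check, assuming the four table
privileges from §2 are the required set (`missing_privileges` is a
hypothetical helper):

```python
from typing import Iterable, Set

# The table privileges granted to `authenticated` in section 2.
REQUIRED = frozenset({"SELECT", "INSERT", "UPDATE", "DELETE"})


def missing_privileges(granted: Iterable[str]) -> Set[str]:
    """Return the table privileges `authenticated` still lacks."""
    return set(REQUIRED) - {privilege.upper() for privilege in granted}


none_granted = missing_privileges([])
partial = missing_privileges(["select"])
```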

### Failure modes

- **`permission denied for table reservation` on a tenant route** → §2
  grants weren't applied. Re-run them.
- **All tenant queries return zero rows with a valid BA JWT** → the JWT
  doesn't carry `activeRestaurantId` (it should, after BA's
  `definePayload` change in this repo; decode at jwt.io to verify), or
  the `after_begin` hook isn't firing (look for `SET LOCAL ROLE
  authenticated` in the Postgres logs for the failing transaction).
- **`role "authenticated" does not exist`** → step 1 wasn't completed
  on this branch. The Data API enable provisions the role.
- **`InsufficientPrivilegeError: new row violates row-level security
  policy`** on a write → INTENDED. RLS WITH CHECK rejected an insert
  whose `restaurant_id` doesn't match the request's claim. Surface as
  a 403 to the caller.
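
The failure modes above can be told apart by their Postgres error
message. The sketch below is one way to route them; `classify_db_error`
is a hypothetical helper, and returning `500` for the two configuration
errors is an assumption. Only the `403` for the RLS rejection comes from
this document.

```python
from typing import Tuple


def classify_db_error(message: str) -> Tuple[int, str]:
    """Return (http_status, operator_hint) for the failure modes listed above."""
    text = message.lower()
    if "violates row-level security policy" in text:
        # Intended rejection: the row's restaurant_id does not match the claim.
        return 403, "RLS WITH CHECK rejected the write"
    if "permission denied for table" in text:
        return 500, "re-run the grants in section 2"
    if 'role "authenticated" does not exist' in text:
        return 500, "enable the Data API on this branch (step 1)"
    return 500, "unclassified database error"


rls_status, rls_hint = classify_db_error(
    'new row violates row-level security policy for table "reservation"'
)
```

Matching on message text is brittle; it is used here because the RLS
rejection and the missing-grant error share the same
insufficient-privilege error class.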

## Integration failure contracts

- Email delivery degrades gracefully when Scaleway credentials are missing and emits `email_disabled` telemetry.
- Payment adapter raises a structured `payment_provider_not_configured` failure when provider credentials are absent.
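
A sketch of both contracts side by side, to show the difference between degrading and failing loudly. The environment variable names, the `emit` callback, and the exception class are hypothetical; only the `email_disabled` and `payment_provider_not_configured` identifiers come from this document.

```python
from typing import Callable, List, Mapping, Optional


class PaymentProviderNotConfigured(Exception):
    """Structured failure carrying the documented error code."""

    code = "payment_provider_not_configured"


def send_email(env: Mapping[str, str], to: str, emit: Callable[[str], None]) -> bool:
    """Return True if delivery was attempted, False if email is disabled."""
    # Hypothetical variable names; the source only says "Scaleway credentials".
    if not env.get("SCALEWAY_SECRET_KEY") or not env.get("SCALEWAY_PROJECT_ID"):
        emit("email_disabled")  # telemetry instead of an exception
        return False
    # The provider call that delivers to `to` would go here; omitted in this sketch.
    return True


def require_payment_provider(api_key: Optional[str]) -> str:
    """Raise the structured failure when provider credentials are absent."""
    if not api_key:
        raise PaymentProviderNotConfigured("payment provider credentials are absent")
    return api_key


events: List[str] = []
attempted = send_email({}, "user@example.com", events.append)
```

Email is optional to the request that triggers it, so it degrades; a payment call cannot proceed without a provider, so it raises.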
