feat: add harness engineering layer for agent workflows (#24791)

This PR adds an opinionated harness-engineering layer for agent-driven workflows: a small set of agent-readable docs, mechanical structure checks, structured CI failure summaries, an architecture-lint umbrella, and per-worktree dev-server isolation. The goal is to make local dev, tests, and CI mechanically inspectable by agents without changing app runtime behavior. ## What landed **Agent docs and navigation** - `.claude/docs/OBSERVABILITY.md`, `.claude/docs/DEV_ISOLATION.md`, `.claude/docs/AGENT_FAILURES.md`: task-oriented guides for logs, tracing, Prometheus, dev-server isolation, and a seeded failure catalog. - `AGENTS.md`: added an `Agent navigation` block, then trimmed the file from 375 to 229 lines by migrating duplicated detail into `WORKFLOWS.md`, `GO.md`, `TESTING.md`, and `DATABASE.md`. The user-managed custom-instructions block is preserved. - `.agents/docs`: symlink mirror of `.claude/docs` for agent runtimes that look under `.agents`. **Mechanical checks** - `scripts/check_agents_structure.sh`: validates `@...` references in tracked `AGENTS.md` files and warns when root grows past 600 lines. Wired as `make lint/agents` and into `make lint`. - `scripts/audit-agent-readiness.sh`: report-first audit of harness readiness. Currently `10 ok, 0 warn, 0 fail`. - `scripts/check_architecture.sh` / `make lint/architecture`: umbrella architecture-lint target. Consolidates the existing `check_enterprise_imports.sh` and `check_codersdk_imports.sh` so they run exactly once via the umbrella. Slot is open for new high-confidence rules. **Structured CI failure summaries** - `scripts/playwright-failure-summary.sh`: parses `site/test-results/results.json` and writes Markdown to `$GITHUB_STEP_SUMMARY` on failure. Wired into the `test-e2e` matrix job. - `scripts/go-test-failure-summary.sh`: parses `go test -json` line-delimited output the same way. Wired into `test-go-pg`, `test-go-pg-17`, and `test-go-race-pg` by injecting `gotestsum --jsonfile` in the workflow without touching `Makefile`. JSON also uploaded as a CI artifact on failure. - `site/e2e/playwright.config.ts`: enables `screenshot: only-on-failure`, `trace: retain-on-failure`, JSON reporter, and HTML reporter alongside existing reporters. - `.github/workflows/ci.yaml`: failure artifact uploads for Playwright now use `if: failure()` and predictable names (`playwright-artifacts-<variant>-<sha>`). **Per-worktree dev-server isolation** (`scripts/develop/main.go`) - Deterministic FNV-64a hash of the worktree path produces a port offset in `[0, 1000)` (50 buckets, step 20 to avoid API/proxy overlap across adjacent buckets). - Offset is applied only to defaults; both env vars (`CODER_DEV_PORT`, `CODER_DEV_WEB_PORT`, `CODER_DEV_PROXY_PORT`, `CODER_DEV_PROMETHEUS_PORT`) and CLI flags retain priority. - Hardcoded ports `9090` (embedded Prometheus UI) and `12345` (Delve) are unchanged by design. - Startup banner shows each port's source: `default`, `offset`, or `explicit`. - Unit tests in `scripts/develop/main_test.go` cover determinism, bounds, no-overlap across the four ports, and explicit-skip behavior. - State (`.coderv2/`) was already worktree-isolated via `os.Getwd()`, so no state-dir changes were needed. ## Validation `make lint/agents`, `make lint/architecture`, `make lint/emdash`, `bash scripts/audit-agent-readiness.sh` (10 ok, 0 warn, 0 fail), `shellcheck` on all 5 new scripts, `go test ./scripts/develop/...`, and `js-yaml` parse of `ci.yaml` all pass. Synthetic fixtures verify both failure-summary scripts handle empty/missing input (silent exit 0), ANSI-stripped output, and parent/subtest formatting. ## Known follow-ups (deferred) - Frontend Storybook/Vitest failure summary: lowest-leverage slice of the failure-summary work. Skipping until observed pain. - Architecture lint currently only delegates to existing import checks; new rules (`InTx` outer-store detection, swagger-annotation lint) plug in as needed. - 50 port-offset buckets means two worktree paths can occasionally collide. The DEV_ISOLATION doc tells users to set the relevant env var when this happens. > Mux opened this PR on Mike's behalf.
2026-06-02 20:48:20 +00:00 · 2026-05-11 17:27:29 +02:00
parent 915956460a
commit 85792d08bc
20 changed files with 1561 additions and 237 deletions
@@ -0,0 +1 @@
+../.claude/docs
@@ -0,0 +1,125 @@
+# Agent Failure Catalog
+
+Use this catalog for repeatable agent failures. Keep each entry short,
+actionable, and tied to existing docs or tools. Use the exact entry format
+shown below when adding new failures.
+
+```markdown
+## Symptom: <short description>
+
+- Likely cause:
+- How to reproduce:
+- How to diagnose:
+- Existing docs or tools:
+- Missing harness piece:
+- Proposed prevention:
+```
+
+## Symptom: Stale generated DB code after SQL changes
+
+- Likely cause: A query or migration changed without running `make gen`.
+- How to reproduce: Modify `coderd/database/queries/*.sql` and run tests or
+  builds without regenerating `coderd/database/queries.sql.go` and related
+  generated files.
+- How to diagnose: Check `git diff` for SQL changes without generated Go
+  changes. Run `make gen` and inspect the resulting diff.
+- Existing docs or tools: `AGENTS.md`, [Database Development Patterns](DATABASE.md),
+  and the `make gen` target.
+- Missing harness piece: No preflight doc checklist currently points agents at
+  generated DB drift before they run unrelated checks.
+- Proposed prevention: Always run `make gen` after database query or migration
+  edits, then include the generated diff in the same commit.
+
+## Symptom: Missing audit table updates
+
+- Likely cause: A database schema change affects audited data but
+  `enterprise/audit/table.go` was not updated.
+- How to reproduce: Add or change a table that audit logging expects, run
+  `make gen`, and observe audit-related generation or test failures.
+- How to diagnose: Inspect the `make gen` failure, then compare the changed
+  database tables with `enterprise/audit/table.go`.
+- Existing docs or tools: `AGENTS.md`, [Database Development Patterns](DATABASE.md),
+  and `make gen`.
+- Missing harness piece: Agents need a failure catalog entry that connects
+  generation failures to audit table maintenance.
+- Proposed prevention: After database changes, run `make gen`, update
+  `enterprise/audit/table.go` when generation reports audit drift, and rerun
+  `make gen`.
+
+## Symptom: Playwright failure without artifacts
+
+- Likely cause: The failing run did not preserve screenshots, traces, videos,
+  browser console output, or the Playwright report path.
+- How to reproduce: Run a Playwright test from `site` with
+  `pnpm playwright:test`, let it fail, and discard the generated output before
+  reporting the failure.
+- How to diagnose: Check `site/e2e/playwright.config.ts`, `site/e2e/README.md`,
+  and the terminal output for the report or `test-results` location.
+- Existing docs or tools: [Frontend Development Guidelines](../../site/AGENTS.md),
+  `site/e2e/README.md`, and `pnpm playwright:test`.
+- Missing harness piece: No central checklist tells agents which browser
+  artifacts must be attached to a failure report.
+- Proposed prevention: Capture the Playwright report path, screenshot, trace,
+  video, browser console output, and command output before retrying or cleaning
+  the workspace.
+
+## Symptom: Port collision across worktrees
+
+- Likely cause: Multiple worktrees use the same default develop ports.
+- How to reproduce: Start `./scripts/develop.sh` in one worktree, then start it
+  in another worktree without overriding ports.
+- How to diagnose: Look for `port <n> is already in use` or conflict errors in
+  the develop output. Check listeners with `lsof -iTCP:<port> -sTCP:LISTEN`.
+- Existing docs or tools: [Development Isolation Guide for Agents](DEV_ISOLATION.md)
+  and `scripts/develop/main.go`.
+- Missing harness piece: There is no automatic per-worktree port allocator.
+- Proposed prevention: Assign each worktree a unique `CODER_DEV_PORT`,
+  `CODER_DEV_WEB_PORT`, `CODER_DEV_PROXY_PORT`, and
+  `CODER_DEV_PROMETHEUS_PORT` before starting the app.
+
+## Symptom: Test using `time.Sleep`
+
+- Likely cause: A test waits for time to pass instead of synchronizing on a
+  deterministic condition or using the quartz clock.
+- How to reproduce: Add a test that depends on `time.Sleep`, then run it under
+  load or with the race detector until it flakes.
+- How to diagnose: Search the test diff for `time.Sleep`. Inspect whether the
+  code under test can use `quartz` or another explicit synchronization point.
+- Existing docs or tools: `AGENTS.md`, [Testing Patterns and Best Practices](TESTING.md),
+  and the quartz README referenced from `AGENTS.md`.
+- Missing harness piece: Agents need a failure entry that labels sleep-based
+  waiting as a flake risk before review.
+- Proposed prevention: Replace `time.Sleep` with a fake clock, trapped ticker,
+  channel, poll with timeout, or another deterministic signal.
+
+## Symptom: DB work inside `InTx` uses the outer store
+
+- Likely cause: Code inside a transaction closure calls `api.Database`, `p.db`,
+  or a helper that uses the outer store instead of the `tx` handle.
+- How to reproduce: Add DB work inside `db.InTx(...)` that calls back into the
+  outer store, then exercise it under concurrent load.
+- How to diagnose: Inspect the closure and helper call graph for database calls
+  that do not use the transaction handle. Look for pool waits, idle in
+  transaction symptoms, or deadlocks under load.
+- Existing docs or tools: `AGENTS.md`, [Database Development Patterns](DATABASE.md),
+  and code review of `InTx` closures.
+- Missing harness piece: No automated check currently proves every helper used
+  inside `InTx` stays on the transaction handle.
+- Proposed prevention: Fetch read-only inputs before opening the transaction,
+  pass `tx` into helpers that need DB access, and avoid receiver helpers that
+  hide outer-store usage.
+
+## Symptom: New API endpoint missing swagger annotations
+
+- Likely cause: A handler or route was added without matching swagger comments.
+- How to reproduce: Add a stable HTTP endpoint and skip `@Summary`, `@Router`,
+  or related annotations.
+- How to diagnose: Compare the new handler with nearby handlers and inspect
+  generated API docs for the route.
+- Existing docs or tools: `AGENTS.md`, [Documentation Style Guide](DOCS_STYLE_GUIDE.md),
+  and API generation checks.
+- Missing harness piece: Agents need a doc reminder that endpoint work includes
+  docs unless the route is intentionally experimental.
+- Proposed prevention: Add swagger annotations in the same change as stable
+  endpoints. For experimental or unstable API paths, add
+  `// @x-apidocgen {"skip": true}` after `@Router`.
@@ -34,6 +34,13 @@
 - **MUST DO**: Queries are grouped in files relating to context - e.g. `prebuilds.sql`, `users.sql`, `oauth2.sql`
 - After making changes to any `coderd/database/queries/*.sql` files you must run `make gen` to generate respective ORM changes

+### Query Naming
+
+- Use `ByX` when `X` is the lookup or filter column.
+- Use `PerX` or `GroupedByX` when `X` is the aggregation or grouping
+  dimension.
+- Avoid `ByX` names for grouped queries.
+
 ## Handling Nullable Fields

 Use `sql.NullString`, `sql.NullBool`, etc. for optional database fields:
@@ -47,6 +54,13 @@ CodeChallenge: sql.NullString{

 Set `.Valid = true` when providing values.

+## Database-to-SDK Conversions
+
+- Extract explicit db-to-SDK conversion helpers instead of inlining large
+  conversion blocks inside handlers.
+- Keep nullable-field handling, type coercion, and response shaping in the
+  converter so handlers stay focused on request flow and authorization.
+
 ## Audit Table Updates

 If adding fields to auditable types:
@@ -129,6 +143,19 @@ func TestDatabaseFunction(t *testing.T) {
 3. **Use transactions**: For related operations that must succeed together
 4. **Optimize queries**: Use EXPLAIN to understand query performance

+### Transaction Safety with `InTx`
+
+- Inside `db.InTx(...)` closures, do not use the outer store
+  (`api.Database`, `p.db`, etc.) directly or indirectly. Use the `tx`
+  handle for DB work inside the closure, or fetch read-only inputs before
+  opening the transaction.
+- Watch for helper methods on a receiver that hide outer-store access. A
+  call like `p.someHelper(ctx)` is still unsafe inside `InTx` if that
+  helper uses `p.db` internally.
+- Using the outer store while a transaction is open can hold one
+  connection and then block on another pool checkout, which can cause
+  pool starvation and `idle in transaction` incidents under load.
+
 ### Migration Writing

 1. **Make migrations reversible**: Always include down migration
@@ -0,0 +1,131 @@
+# Development Isolation Guide for Agents
+
+This guide documents the local resources that the existing harness uses. It is
+for avoiding collisions across worktrees and cleaning up after failed runs. Do
+not add new readiness or debug endpoints for these workflows.
+
+## Default local ports
+
+`scripts/develop/main.go` defines these base defaults:
+
+| Resource | Base default | Override |
+|----------|--------------|----------|
+| API server | `3000` | `--port`, `CODER_DEV_PORT` |
+| Frontend dev server | `8080` | `--web-port`, `CODER_DEV_WEB_PORT` |
+| Workspace proxy | `3010` | `--proxy-port`, `CODER_DEV_PROXY_PORT` |
+| Coder Prometheus metrics | `2114` | `--prometheus-port`, `CODER_DEV_PROMETHEUS_PORT` |
+| Embedded Prometheus UI | `9090` | Fixed in `scripts/develop/main.go` |
+| Delve debugger | `12345` | Fixed when `--debug` is used |
+
+By default, plain `./scripts/develop.sh` uses the base defaults exactly:
+`3000`, `8080`, `3010`, and `2114` for Coder Prometheus metrics. Set
+`--port-offset` or `CODER_DEV_PORT_OFFSET=true` to opt in to a deterministic
+per-worktree offset for API, frontend, workspace proxy, and Coder Prometheus
+metrics ports.
+
+When enabled, the develop script hashes the project root with FNV-64a, maps it
+into one of 50 buckets, multiplies by 20, and adds that value to each unset base
+default. The same worktree path always gets the same effective ports. A flag or
+environment variable overrides only that port. Other unset ports still receive
+the opt-in offset. The workspace proxy is only started when `--use-proxy` is
+set. The embedded Prometheus UI is only started when `--prometheus-server` or
+`CODER_DEV_PROMETHEUS_SERVER` is set, Docker is available, and the host is
+Linux. The Prometheus UI port `9090` and Delve port `12345` remain hardcoded.
+
+## Other useful develop flags and environment variables
+
+The develop script also supports these existing flags and environment
+variables:
+
+| Purpose | Flag | Environment variable |
+|---------|------|----------------------|
+| Per-worktree port offset | `--port-offset` | `CODER_DEV_PORT_OFFSET` |
+| Access URL | `--access-url` | `CODER_DEV_ACCESS_URL` |
+| Admin password | `--password` | `CODER_DEV_ADMIN_PASSWORD` |
+| Starter template | `--starter-template` | `CODER_DEV_STARTER_TEMPLATE` |
+| Roll back missing migrations | `--db-rollback` | `CODER_DEV_DB_ROLLBACK` |
+| Reset the development database | `--db-reset` | `CODER_DEV_DB_RESET` |
+| Accept changed migration tracking | `--db-continue` | `CODER_DEV_DB_CONTINUE` |
+
+Extra `coder server` flags can be passed after `--`. For example,
+`./scripts/develop.sh -- --trace` passes `--trace` to the API server.
+
+## Multi-worktree guidance
+
+Each worktree gets its own `.coderv2` directory because `scripts/develop.sh`
+sets the global config directory to `<project-root>/.coderv2`. This isolates
+built-in Postgres data, local session data, and Prometheus container storage on
+disk.
+
+The configurable develop ports use canonical defaults unless you opt in with
+`--port-offset` or `CODER_DEV_PORT_OFFSET=true`. Enable the offset when running
+multiple worktrees in parallel and you want most concurrent runs to avoid manual
+port selection. When the offset is enabled, the startup banner prints the
+effective API, web, proxy, and Coder metrics ports with their offset status.
+
+Use overrides when you need fixed ports or when two worktree paths hash to the
+same offset. For example:
+
+```sh
+CODER_DEV_PORT=3100 \
+CODER_DEV_WEB_PORT=8180 \
+CODER_DEV_PROXY_PORT=3110 \
+CODER_DEV_PROMETHEUS_PORT=2214 \
+./scripts/develop.sh --use-proxy
+```
+
+If you also need the embedded Prometheus UI in more than one worktree, use only
+one at a time. The UI port is fixed at `9090`, and the Docker container name is
+fixed to `coder-prometheus`. Delve is fixed at `127.0.0.1:12345` when `--debug`
+is used.
+
+## Known collision risks
+
+- Two worktree paths can hash to the same opt-in offset. If preflight reports a
+  busy effective port, set the relevant `CODER_DEV_*` environment variables or
+  flags for one worktree.
+- The embedded Prometheus UI always uses port `9090`.
+- The embedded Prometheus Docker container name is always `coder-prometheus`.
+- The Delve debugger always listens on `127.0.0.1:12345` when `--debug` is
+  used.
+- The develop script only checks the proxy port when `--use-proxy` is set, so
+  a stale process on the effective proxy port can go unnoticed until the proxy
+  is enabled.
+- External databases configured through `CODER_PG_CONNECTION_URL` are shared if
+  multiple worktrees point at the same database.
+
+## Readiness without new probes
+
+Do not invent a new readiness probe. The develop script already waits for the
+API server to answer `GET /healthz` for up to 60 seconds, then logs `server is
+ready to accept connections`. After setup completes, it prints a banner with
+`Coder is now running in development mode`, the effective port list, and the API
+and Web UI URLs.
+
+For agent-driven runs, treat the banner as the ready signal for browser work.
+If the banner does not appear, inspect the preceding `api`, `site`, database
+recovery, and port conflict logs.
+
+## Cleanup
+
+Use the least destructive cleanup that fixes the problem:
+
+1. Stop `./scripts/develop.sh` with `Ctrl+C` so child processes receive the
+   orchestrator shutdown signal.
+2. If a child process remains, identify it with `lsof -iTCP:<port> -sTCP:LISTEN`
+   or `ps`, then terminate only that stale process.
+3. To reset the built-in development database for the current worktree, rerun
+   with `./scripts/develop.sh --db-reset` or remove `.coderv2/postgres` after
+   stopping the app.
+4. To clear local Coder session and generated state for the current worktree,
+   remove the specific files under `.coderv2` that are relevant to the failure.
+5. To clean the embedded Prometheus container, stop the develop script first,
+   then remove the `coder-prometheus` container if it remains.
+6. To clean test databases, prefer the owning test harness cleanup. If tests
+   were interrupted, inspect the local PostgreSQL instance used by the test
+   suite before dropping any database.
+
+For database migration mismatches, prefer the develop script's recovery flags
+before deleting state. Use `--db-rollback` when a migration disappeared from the
+current branch, `--db-continue` after you manually reconcile changed migration
+tracking, and `--db-reset` only when data loss is acceptable.
@@ -1,10 +1,59 @@
-# Modern Go (1.18–1.26)
+# Modern Go (1.18-1.26)

 Reference for writing idiomatic Go. Covers what changed, what it
 replaced, and what to reach for. Respect the project's `go.mod` `go`
 line: don't emit features from a version newer than what the module
 declares. Check `go.mod` before writing code.

+## Go LSP Navigation
+
+Use Go LSP tools first for backend code navigation:
+
+- **Find definitions**: `mcp__go-language-server__definition symbolName`
+- **Find references**: `mcp__go-language-server__references symbolName`
+- **Get type info**: `mcp__go-language-server__hover filePath line column`
+- **Rename symbol**: `mcp__go-language-server__rename_symbol filePath line column newName`
+
+## Code Comments
+
+Code comments should be clear, well-formatted, and add meaningful context.
+
+- Comments are sentences and should end with periods or other appropriate
+  punctuation.
+- Explain why, not what. The code itself should be self-documenting
+  through clear naming and structure. Focus comments on non-obvious
+  decisions, edge cases, or business logic.
+- Keep comment lines to 80 characters wide, including the comment prefix
+  like `//` or `#`. When a comment spans multiple lines, wrap it
+  naturally at word boundaries.
+
+```go
+// Good: Explains the rationale with proper sentence structure.
+// We need a custom timeout here because workspace builds can take several
+// minutes on slow networks, and the default 30s timeout causes false
+// failures during initial template imports.
+ctx, cancel := context.WithTimeout(ctx, 5*time.Minute)
+
+// Bad: Describes what the code does without punctuation or wrapping.
+// Set a custom timeout
+// Workspace builds can take a long time
+// Default timeout is too short
+ctx, cancel := context.WithTimeout(ctx, 5*time.Minute)
+```
+
+## Avoid Unnecessary Changes
+
+When fixing a bug or adding a feature, don't modify code unrelated to your
+task. Unnecessary changes make PRs harder to review and can introduce
+regressions.
+
+- Don't reword existing comments or code unless the change is directly
+  motivated by your task.
+- Don't delete existing comments that explain non-obvious behavior.
+- When adding tests for new behavior, read existing tests first to
+  understand what's covered. Add new cases for uncovered behavior. Edit
+  existing tests as needed, but don't change what they verify.
+
 ## How modern Go thinks differently

 **Generics** (1.18): Design reusable code with type parameters instead
@@ -24,7 +73,7 @@ etc., they replace ad-hoc "loop and append" code with composable,
 lazy pipelines. When a sequence is consumed only once, prefer an
 iterator over materializing a slice.

-**Error trees** (1.20–1.26): Errors compose as trees, not chains.
+**Error trees** (1.20-1.26): Errors compose as trees, not chains.
 `errors.Join` aggregates multiple errors. `fmt.Errorf` accepts multiple
 `%w` verbs. `errors.Is`/`As` traverse the full tree. Custom error
 types that wrap multiple causes must implement `Unwrap() []error` (the
@@ -0,0 +1,148 @@
+# Observability Guide for Agents
+
+This guide maps the observability surfaces that already exist in local
+Coder development. Do not add new endpoints for agent debugging. Prefer the
+existing logs, tracing, Prometheus metrics, browser artifacts, and command
+output described here.
+
+## Start the app
+
+Use `./scripts/develop.sh` for local development. See
+[Development Workflows and Guidelines](WORKFLOWS.md) for the full workflow.
+The script builds the dev orchestrator, starts the API server and frontend,
+waits for the API server to answer `/healthz`, creates the first user if
+needed, and prints a banner with the local URLs.
+
+Useful defaults from `scripts/develop/main.go` are:
+
+- API server: `http://localhost:3000`.
+- Frontend dev server: `http://localhost:8080`.
+- Workspace proxy, when `--use-proxy` is set: `http://localhost:3010`.
+- Coder Prometheus metrics: `http://localhost:2114/`.
+- Embedded Prometheus UI, when `--prometheus-server` is set and Docker is
+  available on Linux: `http://localhost:9090`.
+
+## Local logs
+
+`./scripts/develop.sh` writes orchestrator and child process logs to the
+terminal. The orchestrator uses `sloghuman`, and each child process is logged
+under a named logger such as `api`, `site`, `proxy`, `ext-provisioner`, or
+`prometheus`.
+
+HTTP request logging is implemented in `coderd/httpmw/loggermw`. Request log
+fields include `user_agent`, `host`, `path`, `proto`, `remote_addr`, `start`,
+`status_code`, `latency_ms`, route params, and selected safe query params.
+Responses with status codes of 500 or higher include the response body in the
+request log. Successful `GET /api/v2` requests are skipped.
+
+When investigating failures, keep the full terminal output from
+`./scripts/develop.sh`. If you ran a command through Mux or another harness,
+record the command, exit code, and artifact path for the captured output.
+
+## Tracing
+
+HTTP tracing lives in `coderd/tracing`. The middleware covers `/api`,
+`/api/**`, workspace app routes, and external auth callback routes. When an
+active trace span exists, responses include `X-Trace-ID`, `X-Span-ID`, and a
+W3C `traceparent` header.
+
+Tracing export is controlled by existing server flags and environment
+variables, not by the develop orchestrator itself:
+
+- `--trace` or `CODER_TRACE_ENABLE` enables application tracing.
+- `--trace-logs` or `CODER_TRACE_LOGS` adds log events to traces.
+- `--trace-honeycomb-api-key` or `CODER_TRACE_HONEYCOMB_API_KEY` enables the
+  Honeycomb exporter.
+- `--trace-datadog` or `CODER_TRACE_DATADOG` enables sending Go runtime
+  traces to the local DataDog agent.
+
+To pass server flags through the develop script, put them after `--`. For
+example, use `./scripts/develop.sh -- --trace` when you already have an OTLP
+backend configured through the standard OpenTelemetry environment variables.
+
+## Prometheus metrics
+
+`./scripts/develop.sh` enables Coder Prometheus metrics by default on
+`0.0.0.0:2114`, served at `http://localhost:2114/`. The port is controlled by
+`--prometheus-port` or `CODER_DEV_PROMETHEUS_PORT`. Set it to `0` to disable
+metrics. The develop script passes these existing server flags when metrics are
+enabled: `--prometheus-enable`, `--prometheus-address`,
+`--prometheus-collect-agent-stats`, and `--prometheus-collect-db-metrics`.
+
+If `--prometheus-server` or `CODER_DEV_PROMETHEUS_SERVER` is set, the develop
+script attempts to start a Docker container named `coder-prometheus` on Linux.
+The Prometheus UI listens on `http://localhost:9090`. If a previous container
+is reused, confirm the scrape target because it may point at an older metrics
+port.
+
+Relevant metric implementations include:
+
+- `coderd/httpmw/prometheus.go` for HTTP request counters, concurrency gauges,
+  websocket gauges, and latency histograms.
+- `coderd/prometheusmetrics/` for active users, workspaces, agents, build
+  info, experiments, insights, and agent stats collectors.
+- `coderd/database/dbmetrics/` for database query and transaction metrics.
+- `docs/admin/integrations/prometheus.md` for the user-facing Prometheus
+  integration guide and metric reference.
+
+## Correlating a failed action
+
+Use this sequence when a browser or API action fails:
+
+1. Record the local clock time, browser action, URL, HTTP method, and response
+   status from the browser network panel or test output.
+2. If the response includes `X-Trace-ID` or `X-Span-ID`, copy both values. If
+   not, copy the `traceparent` header if present.
+3. Search the `./scripts/develop.sh` terminal output for the route, method,
+   status code, response body, or timestamp. Match fields such as `path`,
+   `status_code`, and `latency_ms`.
+4. Check `http://localhost:2114/` for metrics that match the route or subsystem.
+   Start with `coderd_api_requests_processed_total`,
+   `coderd_api_request_latencies_seconds`, and database metrics under the
+   `coderd_db_` prefix.
+5. Attach the browser screenshot, trace, video, or command output artifact to
+   the failure report when the harness produced one.
+
+## If an API request fails
+
+- Capture method, URL, status code, response body, and response headers.
+- Check the API log line for matching `path`, `status_code`, and `latency_ms`.
+- If the status is 500 or higher, include the logged response body.
+- Check `coderd_api_requests_processed_total` and
+  `coderd_api_request_latencies_seconds` for the matching route.
+- If database work is involved, check `coderd_db_query_counts_total`,
+  `coderd_db_query_latencies_seconds`, and transaction metrics.
+
+## If the frontend hangs
+
+- Confirm that the develop banner printed both the API and Web UI URLs.
+- Check the `site` logger output for Vite errors and dependency failures.
+- Use the browser network panel to separate frontend asset failures from API
+  failures.
+- If API calls are pending or failing, follow the API request checklist above.
+- Capture browser console output and screenshots before retrying.
+
+## If a workspace provision fails
+
+- Capture the workspace build ID, template name, workspace name, user, and
+  action that triggered the build.
+- Search logs for `provisioner`, `workspace`, `build`, and the workspace build
+  ID.
+- Check whether `ext-provisioner` is running in the develop output.
+- Review metrics for API request failures, database latency, and agent stats if
+  the failure reaches agent startup.
+- Preserve provisioner logs, template files, command output, and any browser
+  artifacts from the failed flow.
+
+## Failure report checklist
+
+Include these details in every observability failure report:
+
+- Absolute timestamp with timezone and the local command that was running.
+- Git branch, commit SHA, and whether generated files were fresh.
+- Browser action, API method, URL, route, status code, and response body.
+- `X-Trace-ID`, `X-Span-ID`, or `traceparent` when present.
+- Relevant log lines with nearby context.
+- Prometheus metrics checked and the observed values or absence of values.
+- Artifact paths for screenshots, traces, videos, logs, and command output.
+- Any cleanup performed before reproducing the failure again.
@@ -21,6 +21,13 @@
 - Test both positive and negative cases
 - Use `testutil.WaitLong` for timeouts in tests

+### Timing Issues
+
+NEVER use `time.Sleep` to mitigate timing issues. If an issue seems like
+it should use `time.Sleep`, read through https://github.com/coder/quartz
+and specifically the README to better understand how to handle timing
+issues.
+
 ### Test Package Naming

 - **Test packages**: Use `package_test` naming (e.g., `identityprovider_test`) for black-box testing
@@ -89,6 +96,11 @@ coderd/
 1. **PKCE tests failing** - Verify both authorization code storage and token exchange handle PKCE fields
 2. **Resource indicator validation failing** - Ensure database stores and retrieves resource parameters correctly

+### OAuth2 Test Scripts
+
+- Full suite: `./scripts/oauth2/test-mcp-oauth2.sh`
+- Manual testing: `./scripts/oauth2/test-manual-flow.sh`
+
 ### General Issues

 1. **Missing newlines** - Ensure files end with newline character
@@ -103,6 +103,17 @@
 4. **Add tests** in `coderd/*_test.go` files
 5. **Update OpenAPI** by running `make gen`

+### API Design Guardrails
+
+- Add swagger annotations when introducing new HTTP endpoints. Do this in
+  the same change as the handler so the docs do not get missed before
+  release.
+- For user-scoped or resource-scoped routes, prefer path parameters over
+  query parameters when that matches existing route patterns.
+- For experimental or unstable API paths, skip public doc generation with
+  `// @x-apidocgen {"skip": true}` after the `@Router` annotation. This
+  keeps them out of the published API reference until they stabilize.
+
 ## Testing Workflows

 ### Test Execution
@@ -122,6 +133,46 @@

 ## Git Workflow

+### Git Hooks
+
+**You MUST install and use the git hooks. NEVER bypass them with
+`--no-verify`. Skipping hooks wastes CI cycles and is unacceptable.**
+
+The first run will be slow as caches warm up. Consecutive runs are
+**significantly faster** (often 10x) thanks to Go build cache,
+generated file timestamps, and warm node_modules. This is NOT a
+reason to skip them. Wait for hooks to complete before proceeding,
+no matter how long they take.
+
+```sh
+git config core.hooksPath scripts/githooks
+```
+
+Two hooks run automatically:
+
+- **pre-commit**: Classifies staged files by type and runs either
+  the full `make pre-commit` or the lightweight `make pre-commit-light`
+  depending on whether Go, TypeScript, SQL, proto, or Makefile
+  changes are present. Falls back to the full target when
+  `CODER_HOOK_RUN_ALL=1` is set. A markdown-only commit takes
+  seconds; a Go change takes several minutes.
+- **pre-push**: Classifies changed files (vs remote branch or
+  merge-base) and runs `make pre-push` when Go, TypeScript, SQL,
+  proto, or Makefile changes are detected. Skips tests entirely
+  for lightweight changes. Allowlisted in
+  `scripts/githooks/pre-push`. Runs only for developers who opt
+  in. Falls back to `make pre-push` when the diff range can't
+  be determined or `CODER_HOOK_RUN_ALL=1` is set. Allow at least
+  15 minutes for a full run.
+
+`git commit` and `git push` will appear to hang while hooks run.
+This is normal. Do not interrupt, retry, or reduce the timeout.
+
+NEVER run `git config core.hooksPath` to change or disable hooks.
+
+If a hook fails, fix the issue and retry. Do not work around the
+failure by skipping the hook.
+
 ### Working on PR branches

 When working on an existing PR branch:
@@ -464,6 +464,24 @@ jobs:
          source scripts/normalize_path.sh
          normalize_path_with_symlinks "$RUNNER_TEMP/sym" "$(dirname "$(which terraform)")"

+      - name: Configure Go test JSON capture
+        if: runner.os == 'Linux'
+        shell: bash
+        run: |
+          set -euo pipefail
+          bin_dir="${RUNNER_TEMP}/go-test-json-bin"
+          mkdir -p "$bin_dir"
+
+          real_gotestsum="$(command -v gotestsum)"
+          real_gotestsum_quoted="$(printf '%q' "$real_gotestsum")"
+          printf '%s\n' \
+            '#!/usr/bin/env bash' \
+            'set -euo pipefail' \
+            "exec ${real_gotestsum_quoted} --jsonfile \"\${RUNNER_TEMP}/go-test.json\" \"\$@\"" \
+            > "${bin_dir}/gotestsum"
+          chmod +x "${bin_dir}/gotestsum"
+          echo "$bin_dir" >> "$GITHUB_PATH"
+
      - name: Setup RAM disk for Embedded Postgres (Windows)
        if: runner.os == 'Windows'
        shell: bash
@@ -540,6 +558,18 @@ jobs:
          embedded-pg-path: "R:/temp/embedded-pg"
          embedded-pg-cache: ${{ steps.embedded-pg-cache.outputs.embedded-pg-cache }}

+      - name: Publish Go test failure summary
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        run: bash scripts/go-test-failure-summary.sh "${RUNNER_TEMP}/go-test.json" >> "$GITHUB_STEP_SUMMARY"
+
+      - name: Upload Go test JSON
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
+        with:
+          name: go-test-json-${{ github.job }}-${{ github.sha }}
+          path: ${{ runner.temp }}/go-test.json
+          retention-days: 7
+
      - name: Upload failed test db dumps
        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
        with:
@@ -610,6 +640,24 @@ jobs:
          source scripts/normalize_path.sh
          normalize_path_with_symlinks "$RUNNER_TEMP/sym" "$(dirname "$(which terraform)")"

+      - name: Configure Go test JSON capture
+        if: runner.os == 'Linux'
+        shell: bash
+        run: |
+          set -euo pipefail
+          bin_dir="${RUNNER_TEMP}/go-test-json-bin"
+          mkdir -p "$bin_dir"
+
+          real_gotestsum="$(command -v gotestsum)"
+          real_gotestsum_quoted="$(printf '%q' "$real_gotestsum")"
+          printf '%s\n' \
+            '#!/usr/bin/env bash' \
+            'set -euo pipefail' \
+            "exec ${real_gotestsum_quoted} --jsonfile \"\${RUNNER_TEMP}/go-test.json\" \"\$@\"" \
+            > "${bin_dir}/gotestsum"
+          chmod +x "${bin_dir}/gotestsum"
+          echo "$bin_dir" >> "$GITHUB_PATH"
+
      - name: Test with PostgreSQL Database
        uses: ./.github/actions/test-go-pg
        with:
@@ -621,6 +669,18 @@ jobs:
          # On main, run tests without cache for the inverse.
          test-count: ${{ github.ref == 'refs/heads/main' && '1' || '' }}

+      - name: Publish Go test failure summary
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        run: bash scripts/go-test-failure-summary.sh "${RUNNER_TEMP}/go-test.json" >> "$GITHUB_STEP_SUMMARY"
+
+      - name: Upload Go test JSON
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
+        with:
+          name: go-test-json-${{ github.job }}-${{ github.sha }}
+          path: ${{ runner.temp }}/go-test.json
+          retention-days: 7
+
      - name: Upload Test Cache
        uses: ./.github/actions/test-cache/upload
        with:
@@ -678,6 +738,24 @@ jobs:
      # c.f. discussion on https://github.com/coder/coder/pull/15106
      # Our Linux runners have 16 cores, but we reduce parallelism since race detection adds a lot of overhead.
      # We aim to have parallelism match CPU count (4*4=16) to avoid making flakes worse.
+      - name: Configure Go test JSON capture
+        if: runner.os == 'Linux'
+        shell: bash
+        run: |
+          set -euo pipefail
+          bin_dir="${RUNNER_TEMP}/go-test-json-bin"
+          mkdir -p "$bin_dir"
+
+          real_gotestsum="$(command -v gotestsum)"
+          real_gotestsum_quoted="$(printf '%q' "$real_gotestsum")"
+          printf '%s\n' \
+            '#!/usr/bin/env bash' \
+            'set -euo pipefail' \
+            "exec ${real_gotestsum_quoted} --jsonfile \"\${RUNNER_TEMP}/go-test.json\" \"\$@\"" \
+            > "${bin_dir}/gotestsum"
+          chmod +x "${bin_dir}/gotestsum"
+          echo "$bin_dir" >> "$GITHUB_PATH"
+
      - name: Run Tests
        uses: ./.github/actions/test-go-pg
        with:
@@ -686,6 +764,18 @@ jobs:
          test-parallelism-tests: "4"
          race-detection: "true"

+      - name: Publish Go test failure summary
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        run: bash scripts/go-test-failure-summary.sh "${RUNNER_TEMP}/go-test.json" >> "$GITHUB_STEP_SUMMARY"
+
+      - name: Upload Go test JSON
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
+        with:
+          name: go-test-json-${{ github.job }}-${{ github.sha }}
+          path: ${{ runner.temp }}/go-test.json
+          retention-days: 7
+
      - name: Upload Test Cache
        uses: ./.github/actions/test-cache/upload
        with:
@@ -820,27 +910,36 @@ jobs:
          CODER_E2E_REQUIRE_PREMIUM_TESTS: "1"
        working-directory: site

-      - name: Upload Playwright Failed Tests
-        if: always() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+      - name: Upload Playwright failure artifacts
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
        with:
-          name: failed-test-videos${{ matrix.variant.premium && '-premium' || '' }}
-          path: ./site/test-results/**/*.webm
+          name: playwright-artifacts-${{ matrix.variant.name }}-${{ github.sha }}
+          path: |
+            ./site/test-results/**
+            ./site/playwright-report/**
          retention-days: 7

+      - name: Publish Playwright failure summary
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        env:
+          MATRIX_VARIANT: ${{ matrix.variant.name }}
+          GITHUB_SHA_SHORT: ${{ github.sha }}
+        run: bash scripts/playwright-failure-summary.sh site/test-results/results.json >> "$GITHUB_STEP_SUMMARY"
+
      - name: Upload debug log
-        if: always() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
        with:
-          name: coderd-debug-logs${{ matrix.variant.premium && '-premium' || ''  }}
+          name: coderd-debug-logs-${{ matrix.variant.name }}-${{ github.sha }}
          path: ./site/e2e/test-results/debug.log
          retention-days: 7

      - name: Upload pprof dumps
-        if: always() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
+        if: failure() && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && !github.event.pull_request.head.repo.fork
        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
        with:
-          name: debug-pprof-dumps${{ matrix.variant.premium && '-premium' || ''  }}
+          name: debug-pprof-dumps-${{ matrix.variant.name }}-${{ github.sha }}
          path: ./site/test-results/**/debug-pprof-*.txt
          retention-days: 7

@@ -3,6 +3,15 @@
 You are an experienced, pragmatic software engineer. You don't over-engineer a solution when a simple one is possible.
 Rule #1: If you want exception to ANY rule, YOU MUST STOP and get explicit permission first. BREAKING THE LETTER OR SPIRIT OF THE RULES IS FAILURE.

+## Agent navigation
+
+- Day-to-day: Start with [Development Workflows and Guidelines](.claude/docs/WORKFLOWS.md) for dev servers, git workflow, hooks, and routine checks.
+- Observability and isolation: Use [Observability Guide for Agents](.claude/docs/OBSERVABILITY.md) for logs, tracing, and metrics, and [Development Isolation Guide for Agents](.claude/docs/DEV_ISOLATION.md) for ports, state, readiness, and cleanup.
+- Failures: Use [Agent Failure Catalog](.claude/docs/AGENT_FAILURES.md) for repeatable failure formats and seeded diagnostics.
+- Language and area docs: Use [Modern Go](.claude/docs/GO.md), [Testing Patterns and Best Practices](.claude/docs/TESTING.md), [Database Development Patterns](.claude/docs/DATABASE.md), [OAuth2 Development Guide](.claude/docs/OAUTH2.md), [Coder Architecture](.claude/docs/ARCHITECTURE.md), [Troubleshooting Guide](.claude/docs/TROUBLESHOOTING.md), [Documentation Style Guide](.claude/docs/DOCS_STYLE_GUIDE.md), and [Pull Request Description Style Guide](.claude/docs/PR_STYLE_GUIDE.md) when that area is in scope.
+- Compatibility: `.agents/docs` symlinks to `.claude/docs` for agent runtimes that look there.
+- Frontend: Read [Frontend Development Guidelines](site/AGENTS.md) before changing anything under `site/`.
+
 ## Foundational rules

 - Doing it right is better than doing it fast. You are not in a rush. NEVER skip steps or take shortcuts.
@@ -60,82 +69,33 @@ Only pause to ask for confirmation when:

 ## Critical Patterns

-### Database Changes (ALWAYS FOLLOW)
+Detailed workflow and topic guidance lives in the imported docs. Keep root
+instructions focused on guardrails that agents should see immediately.

-1. Modify `coderd/database/queries/*.sql` files
-2. Run `make gen`
-3. If audit errors: update `enterprise/audit/table.go`
-4. Run `make gen` again
-
-### LSP Navigation (USE FIRST)
-
-#### Go LSP (for backend code)
-
- **Find definitions**: `mcp__go-language-server__definition symbolName`
- **Find references**: `mcp__go-language-server__references symbolName`
- **Get type info**: `mcp__go-language-server__hover filePath line column`
- **Rename symbol**: `mcp__go-language-server__rename_symbol filePath line column newName`
-
-#### TypeScript LSP (for frontend code in site/)
-
- **Find definitions**: `mcp__typescript-language-server__definition symbolName`
- **Find references**: `mcp__typescript-language-server__references symbolName`
- **Get type info**: `mcp__typescript-language-server__hover filePath line column`
- **Rename symbol**: `mcp__typescript-language-server__rename_symbol filePath line column newName`
-
-### OAuth2 Error Handling
-
-```go
-// OAuth2-compliant error responses
-writeOAuth2Error(ctx, rw, http.StatusBadRequest, "invalid_grant", "description")
-```
-
-### Authorization Context
-
-```go
-// Public endpoints needing system access
-app, err := api.Database.GetOAuth2ProviderAppByClientID(dbauthz.AsSystemRestricted(ctx), clientID)
-
-// Authenticated endpoints with user context
-app, err := api.Database.GetOAuth2ProviderAppByClientID(ctx, clientID)
-```
-
-### API Design
-
- Add swagger annotations when introducing new HTTP endpoints. Do this in
-  the same change as the handler so the docs do not get missed before
-  release.
- For user-scoped or resource-scoped routes, prefer path parameters over
-  query parameters when that matches existing route patterns.
- For experimental or unstable API paths, skip public doc generation with
-  `// @x-apidocgen {"skip": true}` after the `@Router` annotation. This
-  keeps them out of the published API reference until they stabilize.
-
-### Database Query Naming
-
- Use `ByX` when `X` is the lookup or filter column.
- Use `PerX` or `GroupedByX` when `X` is the aggregation or grouping
-  dimension.
- Avoid `ByX` names for grouped queries.
-
-### Database-to-SDK Conversions
-
- Extract explicit db-to-SDK conversion helpers instead of inlining large
-  conversion blocks inside handlers.
- Keep nullable-field handling, type coercion, and response shaping in the
-  converter so handlers stay focused on request flow and authorization.
-
-### Transactions and `InTx`
-
- Inside `db.InTx(...)` closures, do not use the outer store (`api.Database`,
-  `p.db`, etc.) directly or indirectly. Use the `tx` handle for DB work inside
-  the closure, or fetch read-only inputs before opening the transaction.
- Watch for helper methods on a receiver that hide outer-store access. A call
-  like `p.someHelper(ctx)` is still unsafe inside `InTx` if that helper uses
-  `p.db` internally.
- Using the outer store while a transaction is open can hold one connection and
-  then block on another pool checkout, which can cause pool starvation and
-  `idle in transaction` incidents under load.
+- **Database changes**: Follow
+  [Database Development Patterns](.claude/docs/DATABASE.md). Modify
+  `coderd/database/queries/*.sql`, run `make gen`, update
+  `enterprise/audit/table.go` for audit errors, then run `make gen` again.
+- **LSP navigation**: Use LSP tools first. See
+  [Modern Go](.claude/docs/GO.md) for Go LSP and
+  [Frontend Development Guidelines](site/AGENTS.md) for TypeScript LSP.
+- **OAuth2 and authorization**: Follow
+  [OAuth2 Development Guide](.claude/docs/OAUTH2.md). OAuth2 endpoints must
+  use RFC-compliant errors such as `writeOAuth2Error(...)`, and public
+  endpoints that need system access should use `dbauthz.AsSystemRestricted`.
+- **API design**: Follow the API guardrails in
+  [Development Workflows and Guidelines](.claude/docs/WORKFLOWS.md),
+  including swagger annotations for new public HTTP endpoints.
+- **Transactions and conversions**: Keep `InTx` work on the transaction
+  handle, and prefer explicit db-to-SDK converters. See
+  [Database Development Patterns](.claude/docs/DATABASE.md).
+- **Testing**: Follow
+  [Testing Patterns and Best Practices](.claude/docs/TESTING.md). Use unique
+  identifiers in concurrent tests and do not use `time.Sleep` to mitigate
+  timing issues.
+- **Frontend**: Read [Frontend Development Guidelines](site/AGENTS.md)
+  before changing anything under `site/`. Reuse shared UI primitives when
+  possible and prefer Storybook stories for component and page testing.

 ## Quick Reference

@@ -143,61 +103,26 @@ app, err := api.Database.GetOAuth2ProviderAppByClientID(ctx, clientID)

 ### Git Hooks (MANDATORY - DO NOT SKIP)

-**You MUST install and use the git hooks. NEVER bypass them with
-`--no-verify`. Skipping hooks wastes CI cycles and is unacceptable.**
+You MUST install and use the git hooks. NEVER bypass them with
+`--no-verify`. Skipping hooks wastes CI cycles and is unacceptable.

-The first run will be slow as caches warm up. Consecutive runs are
-**significantly faster** (often 10x) thanks to Go build cache,
-generated file timestamps, and warm node_modules. This is NOT a
-reason to skip them. Wait for hooks to complete before proceeding,
-no matter how long they take.
+The first run can be slow while caches warm up. Wait for hooks to complete,
+even when `git commit` or `git push` appears to hang.

-```sh
-git config core.hooksPath scripts/githooks
-```
-
-Two hooks run automatically:
-
- **pre-commit**: Classifies staged files by type and runs either
-  the full `make pre-commit` or the lightweight `make pre-commit-light`
-  depending on whether Go, TypeScript, SQL, proto, or Makefile
-  changes are present. Falls back to the full target when
-  `CODER_HOOK_RUN_ALL=1` is set. A markdown-only commit takes
-  seconds; a Go change takes several minutes.
- **pre-push**: Classifies changed files (vs remote branch or
-  merge-base) and runs `make pre-push` when Go, TypeScript, SQL,
-  proto, or Makefile changes are detected. Skips tests entirely
-  for lightweight changes. Allowlisted in
-  `scripts/githooks/pre-push`. Runs only for developers who opt
-  in. Falls back to `make pre-push` when the diff range can't
-  be determined or `CODER_HOOK_RUN_ALL=1` is set. Allow at least
-  15 minutes for a full run.
-
-`git commit` and `git push` will appear to hang while hooks run.
-This is normal. Do not interrupt, retry, or reduce the timeout.
-
-NEVER run `git config core.hooksPath` to change or disable hooks.
-
-If a hook fails, fix the issue and retry. Do not work around the
-failure by skipping the hook.
+See [Development Workflows and Guidelines](.claude/docs/WORKFLOWS.md) for
+hook setup, pre-commit behavior, pre-push behavior, and failure handling.

 ### Git Workflow

-When working on existing PRs, check out the branch first:
-
-```sh
-git fetch origin
-git checkout branch-name
-git pull origin branch-name
-```
-
-Don't use `git push --force` unless explicitly requested.
+When working on existing PRs, check out the branch first. See
+[Development Workflows and Guidelines](.claude/docs/WORKFLOWS.md) for the
+full workflow. Don't use `git push --force` unless explicitly requested.

 ### New Feature Checklist

- [ ] Run `git pull` to ensure latest code
- [ ] Check if feature touches database - you'll need migrations
- [ ] Check if feature touches audit logs - update `enterprise/audit/table.go`
+See [Development Workflows and Guidelines](.claude/docs/WORKFLOWS.md) for
+the new feature checklist, including `git pull`, database migration checks,
+and audit table checks.

 ## Architecture

@@ -206,23 +131,6 @@ Don't use `git push --force` unless explicitly requested.
 - **Agents**: Workspace services (SSH, port forwarding)
 - **Database**: PostgreSQL with `dbauthz` authorization

-## Testing
-
-### Race Condition Prevention
-
- Use unique identifiers: `fmt.Sprintf("test-client-%s-%d", t.Name(), time.Now().UnixNano())`
- Never use hardcoded names in concurrent tests
-
-### OAuth2 Testing
-
- Full suite: `./scripts/oauth2/test-mcp-oauth2.sh`
- Manual testing: `./scripts/oauth2/test-manual-flow.sh`
-
-### Timing Issues
-
-NEVER use `time.Sleep` to mitigate timing issues. If an issue
-seems like it should use `time.Sleep`, read through https://github.com/coder/quartz and specifically the [README](https://github.com/coder/quartz/blob/main/README.md) to better understand how to handle timing issues.
-
 ## Code Style

 ### Detailed guidelines in imported WORKFLOWS.md
@@ -250,38 +158,11 @@ seems like it should use `time.Sleep`, read through https://github.com/coder/qua
  `renderHook()` that do not require DOM assertions, and query/cache
  operations with no rendered output.

-### Writing Comments
+### Writing Comments and Avoiding Unnecessary Changes

-Code comments should be clear, well-formatted, and add meaningful context.
-
-**Proper sentence structure**: Comments are sentences and should end with
-periods or other appropriate punctuation. This improves readability and
-maintains professional code standards.
-
-**Explain why, not what**: Good comments explain the reasoning behind code
-rather than describing what the code does. The code itself should be
-self-documenting through clear naming and structure. Focus your comments on
-non-obvious decisions, edge cases, or business logic that isn't immediately
-apparent from reading the implementation.
-
-**Line length and wrapping**: Keep comment lines to 80 characters wide
-(including the comment prefix like `//` or `#`). When a comment spans multiple
-lines, wrap it naturally at word boundaries rather than writing one sentence
-per line. This creates more readable, paragraph-like blocks of documentation.
-
-```go
-// Good: Explains the rationale with proper sentence structure.
-// We need a custom timeout here because workspace builds can take several
-// minutes on slow networks, and the default 30s timeout causes false
-// failures during initial template imports.
-ctx, cancel := context.WithTimeout(ctx, 5*time.Minute)
-
-// Bad: Describes what the code does without punctuation or wrapping
-// Set a custom timeout
-// Workspace builds can take a long time
-// Default timeout is too short
-ctx, cancel := context.WithTimeout(ctx, 5*time.Minute)
-```
+See [Modern Go](.claude/docs/GO.md) for comment formatting and the rule to
+avoid unrelated edits. Preserve existing comments that explain non-obvious
+behavior unless the task directly requires changing them.

 ### No Emdash or Endash

@@ -299,21 +180,6 @@ caught by `make lint/emdash`.
 // This is slow, so we should cache it.
 ```

-### Avoid Unnecessary Changes
-
-When fixing a bug or adding a feature, don't modify code unrelated to your
-task. Unnecessary changes make PRs harder to review and can introduce
-regressions.
-
-**Don't reword existing comments or code** unless the change is directly
-motivated by your task. Rewording comments to be shorter or "cleaner" wastes
-reviewer time and clutters the diff.
-
-**Don't delete existing comments** that explain non-obvious behavior. These
-comments preserve important context about why code works a certain way.
-
-**When adding tests for new behavior**, read existing tests first to understand what's covered. Add new cases for uncovered behavior. Edit existing tests as needed, but don't change what they verify.
-
 ## Detailed Development Guides

@.claude/docs/ARCHITECTURE.md
@@ -330,18 +196,18 @@ manually before starting work:

 **Always read:**

- `.claude/docs/WORKFLOWS.md` — dev server, git workflow, hooks
+- `.claude/docs/WORKFLOWS.md` - dev server, git workflow, hooks

 **Read when relevant to your task:**

- `.claude/docs/GO.md` — Go patterns and modern Go usage (any Go changes)
- `.claude/docs/TESTING.md` — testing patterns, race conditions (any test changes)
- `.claude/docs/DATABASE.md` — migrations, SQLC, audit table (any DB changes)
- `.claude/docs/ARCHITECTURE.md` — system overview (orientation or architecture work)
- `.claude/docs/PR_STYLE_GUIDE.md` — PR description format (when writing PRs)
- `.claude/docs/OAUTH2.md` — OAuth2 and RFC compliance (when touching auth)
- `.claude/docs/TROUBLESHOOTING.md` — common failures and fixes (when stuck)
- `.claude/docs/DOCS_STYLE_GUIDE.md` — docs conventions (when writing `docs/`)
+- `.claude/docs/GO.md` - Go patterns and modern Go usage (any Go changes)
+- `.claude/docs/TESTING.md` - testing patterns, race conditions (any test changes)
+- `.claude/docs/DATABASE.md` - migrations, SQLC, audit table (any DB changes)
+- `.claude/docs/ARCHITECTURE.md` - system overview (orientation or architecture work)
+- `.claude/docs/PR_STYLE_GUIDE.md` - PR description format (when writing PRs)
+- `.claude/docs/OAUTH2.md` - OAuth2 and RFC compliance (when touching auth)
+- `.claude/docs/TROUBLESHOOTING.md` - common failures and fixes (when stuck)
+- `.claude/docs/DOCS_STYLE_GUIDE.md` - docs conventions (when writing `docs/`)

 **For frontend work**, also read `site/AGENTS.md` before making any changes
 in `site/`.
@@ -728,7 +728,7 @@ endif
 # GitHub Actions linters are run in a separate CI job (lint-actions) that only
 # triggers when workflow files change, so we skip them here when CI=true.
 LINT_ACTIONS_TARGETS := $(if $(CI),,lint/actions/actionlint)
-lint: lint/shellcheck lint/go lint/ts lint/examples lint/helm lint/site-icons lint/markdown lint/check-scopes lint/migrations lint/bootstrap lint/emdash $(LINT_ACTIONS_TARGETS)
+lint: lint/shellcheck lint/go lint/ts lint/examples lint/helm lint/site-icons lint/markdown lint/check-scopes lint/migrations lint/bootstrap lint/architecture lint/emdash lint/agents $(LINT_ACTIONS_TARGETS)
 .PHONY: lint

 # Subset of lint that does not require Go or Node toolchains.
@@ -745,8 +745,6 @@ lint/ts: site/node_modules/.installed
 .PHONY: lint/ts

 lint/go:
-	./scripts/check_enterprise_imports.sh
-	./scripts/check_codersdk_imports.sh
 	linter_ver=$$(grep -oE 'GOLANGCI_LINT_VERSION=\S+' dogfood/coder/ubuntu-26.04/Dockerfile | cut -d '=' -f 2)
 	go run github.com/golangci/golangci-lint/cmd/golangci-lint@v$$linter_ver run
 	go tool github.com/coder/paralleltestctx/cmd/paralleltestctx -custom-funcs="testutil.Context,chatdTestContext" ./...
@@ -771,6 +769,13 @@ lint/emdash:
 	bash scripts/check_emdash.sh
 .PHONY: lint/emdash

+lint/architecture:
+	./scripts/check_architecture.sh
+.PHONY: lint/architecture
+
+lint/agents:
+	./scripts/check_agents_structure.sh
+.PHONY: lint/agents

 lint/helm:
 	cd helm/
@@ -0,0 +1,130 @@
+#!/usr/bin/env bash
+set -euo pipefail
+# shellcheck source=scripts/lib.sh
+# shellcheck disable=SC1091
+source "$(dirname "${BASH_SOURCE[0]}")/lib.sh"
+cdroot
+
+usage() {
+	cat <<'USAGE'
+Usage: scripts/audit-agent-readiness.sh [--help]
+
+Print a report-first audit of agent harness readiness. Warnings identify
+aspirational checks and do not fail the script. Missing required harness docs
+fail the script. Run manually with:
+
+  bash scripts/audit-agent-readiness.sh
+USAGE
+}
+
+if [[ "${1:-}" == "--help" ]]; then
+	usage
+	exit 0
+fi
+
+ok_count=0
+warn_count=0
+fail_count=0
+
+ok() {
+	printf '[ok] %s\n' "$1"
+	((ok_count++)) || true
+}
+
+warn() {
+	printf '[warn] %s\n' "$1"
+	((warn_count++)) || true
+}
+
+fail() {
+	printf '[fail] %s\n' "$1"
+	((fail_count++)) || true
+}
+
+contains() {
+	local file="$1"
+	local pattern="$2"
+	grep -qiE "$pattern" "$file"
+}
+
+echo "Agent harness readiness audit"
+echo
+echo "Required harness docs"
+
+for doc in \
+	".claude/docs/OBSERVABILITY.md" \
+	".claude/docs/DEV_ISOLATION.md" \
+	".claude/docs/AGENT_FAILURES.md"; do
+	if [[ -f "$doc" ]]; then
+		ok "$doc exists."
+	else
+		fail "$doc is missing."
+	fi
+done
+
+if [[ -L ".agents/docs" ]]; then
+	agents_docs_target="$(readlink ".agents/docs")"
+	if [[ "$agents_docs_target" == "../.claude/docs" ]]; then
+		ok ".agents/docs points to .claude/docs."
+	else
+		fail ".agents/docs points to $agents_docs_target, expected ../.claude/docs."
+	fi
+else
+	fail ".agents/docs compatibility symlink is missing."
+fi
+
+echo
+echo "Navigation and report-first checks"
+
+if contains AGENTS.md '^##[[:space:]].*(Agent navigation|Where to look)' ||
+	{ grep -qF ".claude/docs/OBSERVABILITY.md" AGENTS.md &&
+		grep -qF ".claude/docs/DEV_ISOLATION.md" AGENTS.md &&
+		grep -qF ".claude/docs/AGENT_FAILURES.md" AGENTS.md; }; then
+	ok "Root AGENTS.md appears to include agent navigation."
+else
+	warn "Root AGENTS.md may be missing agent navigation."
+fi
+
+if contains site/e2e/playwright.config.ts 'screenshot' &&
+	contains site/e2e/playwright.config.ts 'video' &&
+	contains site/e2e/playwright.config.ts 'trace' &&
+	contains site/e2e/playwright.config.ts 'failure'; then
+	ok "Playwright failure artifact settings appear configured."
+else
+	warn "Playwright failure artifact settings were not all detected."
+fi
+
+if grep -qi "playwright" .github/workflows/ci.yaml &&
+	grep -q "upload-artifact" .github/workflows/ci.yaml &&
+	grep -qF "failure()" .github/workflows/ci.yaml; then
+	ok "E2E CI failure artifact upload appears configured."
+else
+	warn "E2E CI failure artifact upload was not detected."
+fi
+
+if contains .claude/docs/OBSERVABILITY.md 'Prometheus' &&
+	contains .claude/docs/OBSERVABILITY.md 'log'; then
+	ok "Observability doc mentions logs and Prometheus."
+else
+	warn "Observability doc may be missing logs or Prometheus coverage."
+fi
+
+if contains .claude/docs/DEV_ISOLATION.md 'port' &&
+	contains .claude/docs/DEV_ISOLATION.md 'CODER_DEV|override'; then
+	ok "Development isolation doc mentions ports and overrides."
+else
+	warn "Development isolation doc may be missing ports or override coverage."
+fi
+
+if grep -q 'lint/architecture' Makefile; then
+	ok "Architecture lint target exists."
+else
+	warn "Architecture lint target is not present yet."
+fi
+
+echo
+printf 'Summary: %d ok, %d warn, %d fail.\n' "$ok_count" "$warn_count" "$fail_count"
+
+if ((fail_count > 0)); then
+	exit 1
+fi
@@ -0,0 +1,96 @@
+#!/usr/bin/env bash
+set -euo pipefail
+# shellcheck source=scripts/lib.sh
+# shellcheck disable=SC1091
+source "$(dirname "${BASH_SOURCE[0]}")/lib.sh"
+cdroot
+
+echo "--- check agent docs structure"
+
+required_docs=(
+	".claude/docs/OBSERVABILITY.md"
+	".claude/docs/DEV_ISOLATION.md"
+	".claude/docs/AGENT_FAILURES.md"
+)
+
+fail=0
+
+for doc in "${required_docs[@]}"; do
+	if [[ ! -f "$doc" ]]; then
+		echo "error: required harness doc is missing: $doc"
+		fail=1
+	fi
+done
+
+if [[ ! -L ".agents/docs" ]]; then
+	echo "error: agent docs compatibility symlink is missing: .agents/docs -> ../.claude/docs"
+	fail=1
+elif [[ "$(readlink ".agents/docs")" != "../.claude/docs" ]]; then
+	echo "error: agent docs compatibility symlink points to $(readlink ".agents/docs"), expected ../.claude/docs"
+	fail=1
+fi
+
+is_reference_path() {
+	local ref="$1"
+	case "$ref" in
+	*/* | package.json | AGENTS.local.md)
+		return 0
+		;;
+	*)
+		return 1
+		;;
+	esac
+}
+
+# TODO: Add circular AGENTS.md include detection if nested agent docs begin
+# referencing each other. Current checks validate file existence only.
+mapfile -t agent_files < <(git ls-files '*AGENTS.md' | sort)
+
+for agent_file in "${agent_files[@]}"; do
+	agent_dir="$(dirname "$agent_file")"
+	while IFS=$'\t' read -r line_number ref; do
+		if [[ -z "${line_number:-}" || -z "${ref:-}" ]]; then
+			continue
+		fi
+		if ! is_reference_path "$ref"; then
+			continue
+		fi
+
+		candidate="$agent_dir/$ref"
+		candidate="${candidate#./}"
+		if [[ -e "$candidate" ]]; then
+			continue
+		fi
+
+		if [[ "$(basename "$ref")" == "AGENTS.local.md" ]]; then
+			echo "warning: $agent_file:$line_number: optional local agent file is not present: $ref"
+			continue
+		fi
+
+		echo "error: $agent_file:$line_number: referenced file does not exist: $ref"
+		fail=1
+	done < <(
+		awk '
+			/^[[:space:]]*(-[[:space:]]+)?@/ {
+				ref = $0
+				sub(/^[[:space:]]*(-[[:space:]]+)?@/, "", ref)
+				sub(/[[:space:]`)>].*$/, "", ref)
+				sub(/[,:;)]+$/, "", ref)
+				print FNR "\t" ref
+			}
+		' "$agent_file"
+	)
+done
+
+if [[ -f AGENTS.md ]]; then
+	root_agent_lines=$(wc -l <AGENTS.md)
+	if ((root_agent_lines > 600)); then
+		echo "warning: AGENTS.md is $root_agent_lines lines, consider keeping the root guide concise."
+	fi
+fi
+
+if [[ "$fail" -ne 0 ]]; then
+	exit 1
+fi
+
+echo "OK: agent docs structure looks valid."
@@ -0,0 +1,15 @@
+#!/usr/bin/env bash
+# Umbrella architecture-boundary check.
+#
+# Delegates to existing import-boundary scripts. New architecture rules can be
+# added here as needed.
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+echo "--- check architecture (import boundaries)"
+
+"$SCRIPT_DIR/check_enterprise_imports.sh"
+"$SCRIPT_DIR/check_codersdk_imports.sh"
+
+echo "OK: architecture checks passed."
@@ -10,6 +10,7 @@ import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"hash/fnv"
 	"net"
 	"net/http"
 	"net/url"
@@ -52,7 +53,13 @@ const (
 	prometheusContainerName = "coder-prometheus"
 	// defaultPrometheusPort avoids 2112 (agent prometheus) and
 	// 2113 (agent debug) already bound inside Coder workspaces.
-	defaultPrometheusPort  = "2114"
+	defaultPrometheusPort = "2114"
+	// portOffsetBuckets keeps the offset below 1000 while leaving
+	// enough hash buckets for common multi-worktree use.
+	portOffsetBuckets = 50
+	// portOffsetStep avoids overlap between the default API and proxy
+	// ports when two worktrees land in adjacent buckets.
+	portOffsetStep         = 20
 	prometheusImage        = "prom/prometheus:v3.11.2"
 	defaultAccessURL       = "http://127.0.0.1:%d"
 	defaultPassword        = "SomeSecurePassword!"
@@ -96,6 +103,13 @@ func main() {
 				Description: "Prometheus metrics port. Set to 0 to disable.",
 				Value:       serpent.Int64Of(&cfg.coderMetricsPort),
 			},
+			{
+				Flag:        "port-offset",
+				Env:         "CODER_DEV_PORT_OFFSET",
+				Default:     "false",
+				Description: "Apply a deterministic per-worktree offset to default API, web, proxy, and Coder metrics ports. Useful when running multiple worktrees in parallel.",
+				Value:       serpent.BoolOf(&cfg.portOffsetEnabled),
+			},
 			{
 				Flag:        "prometheus-server",
 				Env:         "CODER_DEV_PROMETHEUS_SERVER",
@@ -171,12 +185,13 @@ func main() {
 		},
 		Handler: func(inv *serpent.Invocation) error {
 			cfg.serverExtraArgs = inv.Args
+			cfg.portExplicit = portExplicitFromInvocation(inv)

 			logger := slog.Make(sloghuman.Sink(inv.Stderr))
-			if err := cfg.validate(); err != nil {
+			if err := cfg.resolveEnv(); err != nil {
 				return err
 			}
-			if err := cfg.resolveEnv(); err != nil {
+			if err := cfg.validate(); err != nil {
 				return err
 			}
 			return develop(inv.Context(), logger, &cfg)
@@ -191,30 +206,123 @@ func main() {
 }

 type devConfig struct {
-	apiPort          int64
-	webPort          int64
-	proxyPort        int64
-	coderMetricsPort int64
-	prometheusServer bool
-	agpl             bool
-	accessURL        string
-	password         string
-	useProxy         bool
-	debug            bool
-	skipSetup        bool
-	multiOrg         bool
-	starterTemplate  string
-	dbRollback       bool
-	dbReset          bool
-	dbContinue       bool
-	projectRoot      string
-	binaryPath       string
-	configDir        string
-	childEnv         []string
+	apiPort           int64
+	webPort           int64
+	proxyPort         int64
+	coderMetricsPort  int64
+	portOffsetEnabled bool
+	prometheusServer  bool
+	agpl              bool
+	accessURL         string
+	password          string
+	useProxy          bool
+	debug             bool
+	skipSetup         bool
+	multiOrg          bool
+	starterTemplate   string
+	dbRollback        bool
+	dbReset           bool
+	dbContinue        bool
+	projectRoot       string
+	binaryPath        string
+	configDir         string
+	childEnv          []string
+	portExplicit      portExplicit
+	portOffset        int
+	apiPortSource     portSource
+	webPortSource     portSource
+	proxyPortSource   portSource
+	metricsPortSource portSource
 	// Extra args after flags forwarded to "coder server".
 	serverExtraArgs []string
 }

+type portExplicit struct {
+	api     bool
+	web     bool
+	proxy   bool
+	metrics bool
+}
+
+type portSource string
+
+const (
+	portSourceDefault  portSource = "default"
+	portSourceExplicit portSource = "explicit"
+	portSourceOffset   portSource = "offset"
+)
+
+func portExplicitFromInvocation(inv *serpent.Invocation) portExplicit {
+	return portExplicit{
+		api:     isPortExplicit(inv, "port", "CODER_DEV_PORT"),
+		web:     isPortExplicit(inv, "web-port", "CODER_DEV_WEB_PORT"),
+		proxy:   isPortExplicit(inv, "proxy-port", "CODER_DEV_PROXY_PORT"),
+		metrics: isPortExplicit(inv, "prometheus-port", "CODER_DEV_PROMETHEUS_PORT"),
+	}
+}
+
+func isPortExplicit(inv *serpent.Invocation, flagName, envName string) bool {
+	if flag := inv.ParsedFlags().Lookup(flagName); flag != nil && flag.Changed {
+		return true
+	}
+	if val, ok := inv.Environ.Lookup(envName); ok && val != "" {
+		return true
+	}
+	for _, opt := range inv.Command.Options {
+		if opt.Flag == flagName {
+			return opt.ValueSource == serpent.ValueSourceFlag ||
+				opt.ValueSource == serpent.ValueSourceEnv
+		}
+	}
+	return false
+}
+
+// portOffset returns a deterministic offset in [0, 1000) derived from the
+// worktree path. Successive callers with the same projectRoot get the same
+// offset; different projectRoots get different offsets with high probability.
+func portOffset(projectRoot string) int {
+	h := fnv.New64a()
+	_, _ = h.Write([]byte(projectRoot))
+	bucket := h.Sum64() % uint64(portOffsetBuckets)
+	return int(bucket) * portOffsetStep //nolint:gosec // Bucket is less than portOffsetBuckets.
+}
+
+func (c *devConfig) applyPortOffset() {
+	c.portOffset = 0
+	if !c.portOffsetEnabled {
+		return
+	}
+	c.portOffset = portOffset(c.projectRoot)
+	if c.portExplicit.api {
+		c.apiPortSource = portSourceExplicit
+	} else {
+		c.apiPortSource = c.applyDefaultPortOffset(&c.apiPort)
+	}
+	if c.portExplicit.web {
+		c.webPortSource = portSourceExplicit
+	} else {
+		c.webPortSource = c.applyDefaultPortOffset(&c.webPort)
+	}
+	if c.portExplicit.proxy {
+		c.proxyPortSource = portSourceExplicit
+	} else {
+		c.proxyPortSource = c.applyDefaultPortOffset(&c.proxyPort)
+	}
+	if c.portExplicit.metrics {
+		c.metricsPortSource = portSourceExplicit
+	} else {
+		c.metricsPortSource = c.applyDefaultPortOffset(&c.coderMetricsPort)
+	}
+}
+
+func (c *devConfig) applyDefaultPortOffset(port *int64) portSource {
+	if c.portOffset == 0 {
+		return portSourceDefault
+	}
+	*port += int64(c.portOffset)
+	return portSourceOffset
+}
+
 func (c *devConfig) validate() error {
 	if c.agpl && c.useProxy {
 		return xerrors.New("cannot use both --agpl and --use-proxy")
@@ -293,10 +401,6 @@ func (c *devConfig) validate() error {
 // resolveEnv sets defaults, unsets leaked credentials, resolves
 // filesystem paths, and computes the child process environment.
 func (c *devConfig) resolveEnv() error {
-	if strings.Contains(c.accessURL, "%d") {
-		c.accessURL = fmt.Sprintf(c.accessURL, c.apiPort)
-	}
-
 	// Prevent inherited credentials from leaking into child
 	// processes or being picked up by config reads.
 	_ = os.Unsetenv("CODER_SESSION_TOKEN")
@@ -311,6 +415,11 @@ func (c *devConfig) resolveEnv() error {
 		fmt.Sprintf("coder_%s_%s", runtime.GOOS, runtime.GOARCH))
 	c.configDir = filepath.Join(c.projectRoot, ".coderv2")

+	c.applyPortOffset()
+	if strings.Contains(c.accessURL, "%d") {
+		c.accessURL = fmt.Sprintf(c.accessURL, c.apiPort)
+	}
+
 	// Compute once, reused by cmd().
 	c.childEnv = filterEnv(os.Environ(), "CODER_SESSION_TOKEN", "CODER_URL")

@@ -1120,6 +1229,28 @@ func prometheusBannerEntry(cfg *devConfig, prometheusServerStarted bool) (label
 	}
 }

+func portBannerLine(label string, port int64, source portSource, offset int) string {
+	portValue := strconv.FormatInt(port, 10)
+	if port == 0 {
+		portValue = "disabled"
+	}
+	if source == "" {
+		return fmt.Sprintf("%s: %s", label, portValue)
+	}
+	return fmt.Sprintf("%s: %s (%s)", label, portValue, portSourceLabel(source, offset))
+}
+
+func portSourceLabel(source portSource, offset int) string {
+	switch source {
+	case portSourceExplicit:
+		return fmt.Sprintf("explicit, offset +%d skipped", offset)
+	case portSourceOffset:
+		return fmt.Sprintf("offset +%d", offset)
+	default:
+		return fmt.Sprintf("default, offset +%d", offset)
+	}
+}
+
 func printBanner(ctx context.Context, logger slog.Logger, cfg *devConfig, prometheusServerStarted bool) {
 	ifaces := []string{"localhost"}
 	if addrs, err := net.InterfaceAddrs(); err == nil {
@@ -1153,6 +1284,12 @@ func printBanner(ctx context.Context, logger slog.Logger, cfg *devConfig, promet
 		"",
 		indent("Coder is now running in development mode."),
 		"",
+		"Effective ports:",
+		indent(portBannerLine("API", cfg.apiPort, cfg.apiPortSource, cfg.portOffset)),
+		indent(portBannerLine("Web UI", cfg.webPort, cfg.webPortSource, cfg.portOffset)),
+		indent(portBannerLine("Proxy", cfg.proxyPort, cfg.proxyPortSource, cfg.portOffset)),
+		indent(portBannerLine("Coder metrics", cfg.coderMetricsPort, cfg.metricsPortSource, cfg.portOffset)),
+		"",
 		"API:",
 	)

@@ -146,6 +146,127 @@ func TestShellBool(t *testing.T) {
 	assert.Equal(t, "0", shellBool(false))
 }

+func TestPortOffset(t *testing.T) {
+	t.Parallel()
+
+	root := "/tmp/coder/worktree-a"
+	offset := portOffset(root)
+	assert.Equal(t, offset, portOffset(root))
+	assert.GreaterOrEqual(t, offset, 0)
+	assert.Less(t, offset, 1000)
+	assert.Equal(t, 0, offset%10)
+
+	var foundDifferent bool
+	for _, otherRoot := range []string{
+		"/tmp/coder/worktree-b",
+		"/tmp/coder/worktree-c",
+		"/tmp/coder/worktree-d",
+	} {
+		if portOffset(otherRoot) != offset {
+			foundDifferent = true
+			break
+		}
+	}
+	assert.True(t, foundDifferent, "expected typical worktree paths to use different offsets")
+}
+
+func TestApplyPortOffsetSkipsExplicitPorts(t *testing.T) {
+	t.Parallel()
+
+	projectRoot := "/tmp/coder/worktree-offset"
+	for i := range 100 {
+		candidate := fmt.Sprintf("/tmp/coder/worktree-offset-%d", i)
+		if portOffset(candidate) != 0 {
+			projectRoot = candidate
+			break
+		}
+	}
+	offset := portOffset(projectRoot)
+	require.NotZero(t, offset)
+
+	cfg := &devConfig{
+		apiPort:           3000,
+		webPort:           8080,
+		proxyPort:         3010,
+		coderMetricsPort:  2114,
+		portOffsetEnabled: true,
+		projectRoot:       projectRoot,
+		portExplicit: portExplicit{
+			web:     true,
+			metrics: true,
+		},
+	}
+	cfg.applyPortOffset()
+
+	assert.Equal(t, int64(3000+offset), cfg.apiPort)
+	assert.Equal(t, int64(8080), cfg.webPort)
+	assert.Equal(t, int64(3010+offset), cfg.proxyPort)
+	assert.Equal(t, int64(2114), cfg.coderMetricsPort)
+	assert.Equal(t, portSourceOffset, cfg.apiPortSource)
+	assert.Equal(t, portSourceExplicit, cfg.webPortSource)
+	assert.Equal(t, portSourceOffset, cfg.proxyPortSource)
+	assert.Equal(t, portSourceExplicit, cfg.metricsPortSource)
+}
+
+func TestApplyPortOffsetDisabledUsesDefaultPorts(t *testing.T) {
+	t.Parallel()
+
+	projectRoot := "/tmp/coder/worktree-offset"
+	for i := range 100 {
+		candidate := fmt.Sprintf("/tmp/coder/worktree-offset-disabled-%d", i)
+		if portOffset(candidate) != 0 {
+			projectRoot = candidate
+			break
+		}
+	}
+	require.NotZero(t, portOffset(projectRoot))
+
+	cfg := &devConfig{
+		apiPort:          3000,
+		webPort:          8080,
+		proxyPort:        3010,
+		coderMetricsPort: 2114,
+		projectRoot:      projectRoot,
+	}
+	cfg.applyPortOffset()
+
+	assert.Equal(t, int64(3000), cfg.apiPort)
+	assert.Equal(t, int64(8080), cfg.webPort)
+	assert.Equal(t, int64(3010), cfg.proxyPort)
+	assert.Equal(t, int64(2114), cfg.coderMetricsPort)
+	assert.Zero(t, cfg.portOffset)
+	assert.Empty(t, cfg.apiPortSource)
+	assert.Empty(t, cfg.webPortSource)
+	assert.Empty(t, cfg.proxyPortSource)
+	assert.Empty(t, cfg.metricsPortSource)
+	assert.Equal(t, "API: 3000", portBannerLine("API", cfg.apiPort, cfg.apiPortSource, cfg.portOffset))
+}
+
+func TestPortOffsetDefaultPortsDoNotOverlap(t *testing.T) {
+	t.Parallel()
+
+	ports := []struct {
+		name string
+		base int
+	}{
+		{name: "API", base: 3000},
+		{name: "Web UI", base: 8080},
+		{name: "Proxy", base: 3010},
+		{name: "Coder metrics", base: 2114},
+	}
+	seen := make(map[int]string)
+	for bucket := range portOffsetBuckets {
+		offset := bucket * portOffsetStep
+		for _, port := range ports {
+			effective := port.base + offset
+			if other, ok := seen[effective]; ok {
+				t.Fatalf("%s collides with %s on port %d", port.name, other, effective)
+			}
+			seen[effective] = fmt.Sprintf("%s with offset %d", port.name, offset)
+		}
+	}
+}
+
 func TestDevelopInCoder(t *testing.T) {
 	t.Run("DEVELOP_IN_CODER", func(t *testing.T) {
 		t.Setenv("DEVELOP_IN_CODER", "1")
@@ -431,15 +552,17 @@ func TestDevConfigResolveEnv(t *testing.T) {
 	t.Setenv("CODER_SESSION_TOKEN", "leaked")
 	t.Setenv("CODER_URL", "https://leaked.example.com")

+	wd, _ := os.Getwd()
 	cfg := &devConfig{apiPort: 3000, accessURL: defaultAccessURL}
 	require.NoError(t, cfg.resolveEnv())

-	wd, _ := os.Getwd()
 	assert.Equal(t, wd, cfg.projectRoot)
 	assert.Equal(t, filepath.Join(wd, "build",
 		fmt.Sprintf("coder_%s_%s", runtime.GOOS, runtime.GOARCH)), cfg.binaryPath)
 	assert.Equal(t, filepath.Join(wd, ".coderv2"), cfg.configDir)
 	assert.Equal(t, "http://127.0.0.1:3000", cfg.accessURL)
+	assert.Equal(t, int64(3000), cfg.apiPort)
+	assert.Zero(t, cfg.portOffset)

 	// Should have unset leaked env vars.
 	assert.Empty(t, os.Getenv("CODER_SESSION_TOKEN"))
@@ -454,11 +577,92 @@ func TestDevConfigResolveEnv(t *testing.T) {
 	}
 }

+func TestDevConfigResolveEnvUsesDefaultPortsWithoutPortOffset(t *testing.T) {
+	t.Setenv("CODER_SESSION_TOKEN", "")
+	t.Setenv("CODER_URL", "")
+
+	baseRoot := t.TempDir()
+	projectRoot := filepath.Join(baseRoot, "worktree")
+	for i := range 100 {
+		candidate := filepath.Join(baseRoot, fmt.Sprintf("worktree-default-%d", i))
+		if portOffset(candidate) != 0 {
+			projectRoot = candidate
+			break
+		}
+	}
+	require.NotZero(t, portOffset(projectRoot))
+	require.NoError(t, os.MkdirAll(projectRoot, 0o755))
+	t.Chdir(projectRoot)
+
+	cfg := &devConfig{
+		apiPort:          3000,
+		webPort:          8080,
+		proxyPort:        3010,
+		coderMetricsPort: 2114,
+		accessURL:        defaultAccessURL,
+	}
+	require.NoError(t, cfg.resolveEnv())
+
+	assert.Equal(t, projectRoot, cfg.projectRoot)
+	assert.Equal(t, int64(3000), cfg.apiPort)
+	assert.Equal(t, int64(8080), cfg.webPort)
+	assert.Equal(t, int64(3010), cfg.proxyPort)
+	assert.Equal(t, int64(2114), cfg.coderMetricsPort)
+	assert.Zero(t, cfg.portOffset)
+	assert.Empty(t, cfg.apiPortSource)
+	assert.Empty(t, cfg.webPortSource)
+	assert.Empty(t, cfg.proxyPortSource)
+	assert.Empty(t, cfg.metricsPortSource)
+	assert.Equal(t, "http://127.0.0.1:3000", cfg.accessURL)
+}
+
+func TestDevConfigResolveEnvAppliesPortOffsetWhenEnabled(t *testing.T) {
+	t.Setenv("CODER_SESSION_TOKEN", "")
+	t.Setenv("CODER_URL", "")
+
+	baseRoot := t.TempDir()
+	projectRoot := filepath.Join(baseRoot, "worktree")
+	for i := range 100 {
+		candidate := filepath.Join(baseRoot, fmt.Sprintf("worktree-%d", i))
+		if portOffset(candidate) != 0 {
+			projectRoot = candidate
+			break
+		}
+	}
+	require.NotZero(t, portOffset(projectRoot))
+	require.NoError(t, os.MkdirAll(projectRoot, 0o755))
+	t.Chdir(projectRoot)
+
+	cfg := &devConfig{
+		apiPort:           3000,
+		webPort:           8080,
+		proxyPort:         3010,
+		coderMetricsPort:  2114,
+		portOffsetEnabled: true,
+		accessURL:         defaultAccessURL,
+	}
+	require.NoError(t, cfg.resolveEnv())
+
+	offset := portOffset(projectRoot)
+	assert.Equal(t, projectRoot, cfg.projectRoot)
+	assert.Equal(t, int64(3000+offset), cfg.apiPort)
+	assert.Equal(t, int64(8080+offset), cfg.webPort)
+	assert.Equal(t, int64(3010+offset), cfg.proxyPort)
+	assert.Equal(t, int64(2114+offset), cfg.coderMetricsPort)
+	assert.Equal(t, offset, cfg.portOffset)
+	assert.Equal(t, portSourceOffset, cfg.apiPortSource)
+	assert.Equal(t, fmt.Sprintf("http://127.0.0.1:%d", 3000+offset), cfg.accessURL)
+}
+
 func TestDevConfigResolveEnvExplicitAccessURL(t *testing.T) {
 	t.Setenv("CODER_SESSION_TOKEN", "")
 	t.Setenv("CODER_URL", "")

-	cfg := &devConfig{apiPort: 5000, accessURL: "http://myhost:5000"}
+	cfg := &devConfig{
+		apiPort:      5000,
+		accessURL:    "http://myhost:5000",
+		portExplicit: portExplicit{api: true},
+	}
 	require.NoError(t, cfg.resolveEnv())
 	assert.Equal(t, "http://myhost:5000", cfg.accessURL)
 }
@@ -0,0 +1,100 @@
+#!/usr/bin/env bash
+
+# Summarize failed Go tests from go test JSON output.
+
+set -euo pipefail
+# shellcheck source=scripts/lib.sh
+# shellcheck disable=SC1091
+source "$(dirname "${BASH_SOURCE[0]}")/lib.sh"
+cdroot
+
+if [[ $# -ne 1 ]]; then
+	error "Usage: go-test-failure-summary.sh <go-test.json>"
+fi
+
+results_file=$1
+if [[ ! -s "$results_file" ]]; then
+	exit 0
+fi
+
+if ! command -v jq >/dev/null; then
+	error "jq is required to summarize Go test failures."
+fi
+
+jq -sr '
+	def clean_block:
+		tostring
+		| gsub("\u001b\\[[0-9;?]*[ -/]*[@-~]"; "")
+		| gsub("```"; "``");
+	def clean_inline:
+		tostring | gsub("`"; "") | gsub("[\r\n]"; " ");
+	def truncate($max):
+		if length > $max then .[0:$max] + "..." else . end;
+	def terminal_action:
+		.Action == "pass" or .Action == "fail" or .Action == "skip";
+	def test_key:
+		(.Package // "") + "\u0000" + (.Test // "");
+	def output_for($events; $package; $test):
+		[
+			$events[]
+			| select(.Action == "output")
+			| select((.Package // "") == $package)
+			| select((.Test // "") == $test)
+			| .Output // ""
+		]
+		| join("")
+		| clean_block
+		| if . == "" then "No output recorded." else . end
+		| truncate(600);
+
+	map(select(type == "object")) as $events
+	| [
+		$events
+		| to_entries[]
+		| .value + {idx: .key}
+		| select((.Test // "") != "")
+		| select(terminal_action)
+	] as $terminal_tests
+	| [
+		$terminal_tests
+		| group_by(test_key)
+		| .[]
+		| max_by(.idx)
+		| select(.Action == "fail")
+		| {
+			package: ((.Package // "unknown") | clean_inline),
+			test: ((.Test // "unknown") | clean_inline),
+			elapsed: (.Elapsed // 0),
+			output: output_for($events; (.Package // ""); (.Test // ""))
+		}
+	] as $failures
+	| if ($failures | length) == 0 then
+		empty
+	else
+		($failures | length) as $failed
+		| ($failures | map(.package) | unique | length) as $packages
+		| ([
+			$events[]
+			| select((.Test // "") == "")
+			| select(.Action == "pass" or .Action == "fail")
+			| .Elapsed // 0
+		] | add // 0) as $duration
+		| ([
+			$events[]
+			| select((.Test // "") == "")
+			| select(.Action == "fail")
+			| .Package // empty
+		] | unique | length) as $package_failures
+		| [
+			"## Go test failures (\($failed) in \($packages))",
+			"- Duration: \($duration)s",
+			"- Package failures: \($package_failures)",
+			"",
+			($failures[]
+				| "### \(.package) :: \(.test)\n"
+				+ "- Elapsed: \(.elapsed)s\n\n"
+				+ "```\n\(.output)\n```\n")
+		]
+		| join("\n")
+	end
+' "$results_file"
@@ -0,0 +1,104 @@
+#!/usr/bin/env bash
+
+# Summarize failed Playwright tests from the JSON reporter output.
+
+set -euo pipefail
+# shellcheck source=scripts/lib.sh
+# shellcheck disable=SC1091
+source "$(dirname "${BASH_SOURCE[0]}")/lib.sh"
+cdroot
+
+if [[ $# -ne 1 ]]; then
+	error "Usage: playwright-failure-summary.sh <results.json>"
+fi
+
+results_file=$1
+if [[ ! -f "$results_file" ]]; then
+	exit 0
+fi
+
+if ! command -v jq >/dev/null; then
+	error "jq is required to summarize Playwright failures."
+fi
+
+artifact="playwright-artifacts-${MATRIX_VARIANT:-unknown}-${GITHUB_SHA_SHORT:-unknown}"
+
+jq -r --arg artifact "$artifact" --arg root "$PROJECT_ROOT" '
+	def clean_block:
+		tostring
+		| gsub("\u001b\\[[0-9;]*[A-Za-z]"; "")
+		| gsub("```"; "``");
+	def clean_inline:
+		tostring | gsub("`"; "");
+	def truncate($max):
+		if length > $max then .[0:$max] + "..." else . end;
+	def failure_status:
+		. == "failed" or . == "timedOut" or . == "interrupted";
+	def relpath($root):
+		if startswith($root + "/") then .[($root | length) + 1:]
+		elif startswith("site/") then .
+		elif startswith("e2e/") then "site/" + .
+		else "site/e2e/" + .
+		end;
+	def all_specs($titles):
+		([$titles[], (.title // empty)] | map(select(. != ""))) as $next_titles
+		| (
+			.specs[]?
+			| . + {
+				titlePath: ($next_titles + ([.title // ""] | map(select(. != ""))))
+			}
+		),
+		(.suites[]? | all_specs($next_titles));
+	def failure_entries:
+		[
+			.suites[]?
+			| all_specs([]) as $spec
+			| $spec.tests[]? as $test
+			| select(($test.status // "") != "flaky")
+			| select(
+				(($test.status // "") == "unexpected")
+				or any($test.results[]?; .status | failure_status)
+			)
+			| ([ $test.results[]? | select(.status | failure_status) ][0]
+				// ($test.results[0] // {})) as $result
+			| ((($result.error.message // "") | clean_block) as $message
+				| (($result.error.stack // "") | clean_block) as $stack
+				| {
+					file: (($spec.file // "") | relpath($root)),
+					line: ($spec.line // 0),
+					title: (($spec.titlePath // [$spec.title // ""]) | join(" > ") | clean_inline),
+					project: (($test.projectName // "unknown") | clean_inline),
+					message: (if $message != "" then $message else $stack end | if . != "" then . else "No error message recorded." end | truncate(600)),
+					attachments: ([ $result.attachments[]? | .name // empty | clean_inline ] | unique)
+				})
+		];
+	failure_entries as $entries
+	| if ($entries | length) == 0 then
+		empty
+	else
+		(.stats // {}) as $stats
+		| ($stats.unexpected // 0) as $stats_failed
+		| ([($stats_failed | tonumber), ($entries | length)] | max) as $failed
+		| (($stats.expected // 0) + ($stats.unexpected // 0) + ($stats.flaky // 0) + ($stats.skipped // 0)) as $computed_total
+		| ($stats.total // $computed_total) as $total
+		| [
+			"## Playwright failures (\($failed) of \($total))",
+			"- Duration: \($stats.duration // 0)ms",
+			"- Skipped: \($stats.skipped // 0), Flaky: \($stats.flaky // 0)",
+			"- Artifact: `\($artifact)` (download from the run summary)",
+			"",
+			($entries[]
+				| "### \(.file):\(.line)\n"
+				+ "- Test: `\(.title)`\n"
+				+ "- Project: `\(.project)`\n"
+				+ "- Attachments:\n"
+				+ (if (.attachments | length) == 0 then
+					"  - None recorded in artifact `\($artifact)`"
+				else
+					(.attachments | map("  - `\(.)` in artifact `\($artifact)`") | join("\n"))
+				end)
+				+ "\n\n```\n\(.message)\n```\n")
+		]
+		| join("\n")
+	end
+' "$results_file" | sed -E $'s/\x1b\[[0-9;]*m//g'
@@ -26,6 +26,19 @@ When investigating or editing TypeScript/React code, always use the TypeScript l
 - `pnpm playwright:test` - Run playwright e2e tests. When running e2e tests, remind the user that a license is required to run all the tests
 - `pnpm format` - Format frontend code. Always run before creating a PR

+## Failure artifacts
+
+Playwright writes per-test failure artifacts to `site/test-results/` when
+running `pnpm playwright:test` from `site/`. Failed tests keep screenshots,
+videos, and traces through the Playwright config. The HTML report is written
+to `site/playwright-report/`, and the coderd debug log is written to
+`site/e2e/test-results/debug.log`.
+
+In CI, the `test-e2e` job uploads failure artifacts to the workflow run's
+Artifacts section. Look for artifact names prefixed with
+`playwright-artifacts-`, followed by the matrix job name and commit SHA.
+Debug logs and pprof dumps use the same job name and commit SHA convention.
+
 ## Components

 - MUI components are deprecated - migrate away from these when encountered
@@ -35,6 +35,7 @@ const localURL = (port: number, path: string): string => {
 export default defineConfig({
 	retries,
 	globalSetup: require.resolve("./setup/preflight"),
+	outputDir: "../test-results",
 	projects: [
 		{
 			name: "testsSetup",
@@ -47,10 +48,20 @@ export default defineConfig({
 			timeout: 30_000,
 		},
 	],
-	reporter: [["list"], ["./reporter.ts"]],
+	reporter: [
+		["list"],
+		["html", { open: "never" }],
+		[
+			"json",
+			{ outputFile: path.join(__dirname, "../test-results/results.json") },
+		],
+		["./reporter.ts"],
+	],
 	use: {
 		actionTimeout: 5000,
 		baseURL: `http://localhost:${coderPort}`,
+		screenshot: "only-on-failure",
+		trace: "retain-on-failure",
 		video: "retain-on-failure",
 		...(wsEndpoint
 			? {