mirror of
https://github.com/coder/coder.git
synced 2026-06-02 20:48:20 +00:00
bb8c40e764
This follows up on https://github.com/coder/coder/actions/runs/25684936801/job/75406131184?pr=25139 by replacing the large raw Go test JSON artifact with inline structured summaries and a compact failures-only artifact.

## What changed

- Added `scripts/gotestsummary`, a streaming Go tool that reads gotestsum JSON and renders failed tests as Markdown.
- Updated the three Go test jobs to publish per-test `<details>` sections in the job summary.
- Removed upload of the raw `go-test.json` artifact.
- Added upload of `go-test-failures-*.ndjson` with compact failure records for deeper inspection.
- Deleted the old bash and `jq` summary script.

## Why

- The previous raw artifact was about 35 MB compressed and 445 MB raw in the linked run.
- Passing-test output made the artifact noisy and slow to inspect.
- The old summary truncated output to 600 characters.
- The new path keeps streaming, bounded output and writes structured diagnostics for only final failed tests.

## Validation

- `gofmt -w scripts/gotestsummary`
- `gofmt -l scripts/gotestsummary`
- `go test ./scripts/gotestsummary/...`
- `go vet ./scripts/gotestsummary/...`
- `grep -rn 'go-test-failure-summary.sh' . || true`
- `grep -rn 'go-test-failure-summary.sh\|go-test.json\|go-test-json-' .claude .agents docs AGENTS.md || true`
- `make lint/agents`
- `make lint/emdash`
- `make lint/markdown`
- `make lint/shellcheck`
- `git diff --check origin/main..HEAD`

> This PR was prepared by Mux working on Mike's behalf.
Agent Failure Catalog
Use this catalog for repeatable agent failures. Keep each entry short, actionable, and tied to existing docs or tools. Use the exact entry format shown below when adding new failures.
## Symptom: <short description>
- Likely cause:
- How to reproduce:
- How to diagnose:
- Existing docs or tools:
- Missing harness piece:
- Proposed prevention:
Symptom: Stale generated DB code after SQL changes
- Likely cause: A query or migration changed without running `make gen`.
- How to reproduce: Modify `coderd/database/queries/*.sql` and run tests or builds without regenerating `coderd/database/queries.sql.go` and related generated files.
- How to diagnose: Check `git diff` for SQL changes without generated Go changes. Run `make gen` and inspect the resulting diff.
- Existing docs or tools: `AGENTS.md`, Database Development Patterns, and the `make gen` target.
- Missing harness piece: No preflight doc checklist currently points agents at generated DB drift before they run unrelated checks.
- Proposed prevention: Always run `make gen` after database query or migration edits, then include the generated diff in the same commit.
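The diagnosis step above (SQL changes without generated Go changes) can be sketched as a small check over a list of changed paths. This is only an illustration: `needsGen` is a hypothetical helper, `users.sql` is a made-up file name, and the check looks only at the two paths named in this entry rather than every generated file.

```go
package main

import (
	"fmt"
	"strings"
)

// needsGen reports whether the changed-file list touches SQL query inputs
// without touching the generated Go output. It is a hypothetical helper;
// the paths come from this catalog entry and do not cover every generated file.
func needsGen(changed []string) bool {
	sqlChanged, genChanged := false, false
	for _, p := range changed {
		switch {
		case strings.HasPrefix(p, "coderd/database/queries/") && strings.HasSuffix(p, ".sql"):
			sqlChanged = true
		case p == "coderd/database/queries.sql.go":
			genChanged = true
		}
	}
	return sqlChanged && !genChanged
}

func main() {
	// Example input, as might be collected from `git diff --name-only`.
	changed := []string{"coderd/database/queries/users.sql"}
	if needsGen(changed) {
		fmt.Println("SQL changed without generated Go changes; run `make gen`")
	}
}
```

A check like this only flags likely drift. Running `make gen` and inspecting the diff remains the authoritative step.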
Symptom: Missing audit table updates
- Likely cause: A database schema change affects audited data but `enterprise/audit/table.go` was not updated.
- How to reproduce: Add or change a table that audit logging expects, run `make gen`, and observe audit-related generation or test failures.
- How to diagnose: Inspect the `make gen` failure, then compare the changed database tables with `enterprise/audit/table.go`.
- Existing docs or tools: `AGENTS.md`, Database Development Patterns, and `make gen`.
- Missing harness piece: Agents need a failure catalog entry that connects generation failures to audit table maintenance.
- Proposed prevention: After database changes, run `make gen`, update `enterprise/audit/table.go` when generation reports audit drift, and rerun `make gen`.
Symptom: Playwright failure without artifacts
- Likely cause: The failing run did not preserve screenshots, traces, videos, browser console output, or the Playwright report path.
- How to reproduce: Run a Playwright test from `site` with `pnpm playwright:test`, let it fail, and discard the generated output before reporting the failure.
- How to diagnose: Check `site/e2e/playwright.config.ts`, `site/e2e/README.md`, and the terminal output for the report or `test-results` location.
- Existing docs or tools: Frontend Development Guidelines, `site/e2e/README.md`, and `pnpm playwright:test`.
- Missing harness piece: No central checklist tells agents which browser artifacts must be attached to a failure report.
- Proposed prevention: Capture the Playwright report path, screenshot, trace, video, browser console output, and command output before retrying or cleaning the workspace.
Symptom: Go test failure without preserved diagnostics
- Likely cause: The failing CI job summary or compact failures artifact was discarded before reporting or retrying the failure.
- How to reproduce: Let a Go test job fail in CI, then report the failure using only the final job status instead of the job summary and artifacts.
- How to diagnose: Open the failed Go test job summary for the inline failure table and per-test details. Download `go-test-failures-*.ndjson` for deeper inspection of the compact failures-only records.
- Existing docs or tools: `.github/workflows/ci.yaml` Go test jobs and `scripts/gotestsummary`.
- Missing harness piece: Agents need a central reminder to preserve the small Go test diagnostics artifact instead of the old raw test log.
- Proposed prevention: Attach or summarize the inline job summary and preserve `go-test-failures-*.ndjson` when reporting CI Go test failures.
Symptom: Port collision across worktrees
- Likely cause: Multiple worktrees use the same default develop ports.
- How to reproduce: Start `./scripts/develop.sh` in one worktree, then start it in another worktree without overriding ports.
- How to diagnose: Look for `port <n> is already in use` or conflict errors in the develop output. Check listeners with `lsof -iTCP:<port> -sTCP:LISTEN`.
- Existing docs or tools: Development Isolation Guide for Agents and `scripts/develop/main.go`.
- Missing harness piece: There is no automatic per-worktree port allocator.
- Proposed prevention: Assign each worktree a unique `CODER_DEV_PORT`, `CODER_DEV_WEB_PORT`, `CODER_DEV_PROXY_PORT`, and `CODER_DEV_PROMETHEUS_PORT` before starting the app.
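One possible way to pick unique values is to derive them from the worktree path. The sketch below is not an existing tool: `portEnv`, the base port, and the stride are all assumptions chosen for illustration, and two paths can still hash to the same slot, so the `lsof` check above remains useful.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// These constants are illustrative assumptions, not project defaults.
const (
	basePort = 20000 // hypothetical start of the range
	slots    = 1000  // number of distinct worktree slots
	stride   = 10    // ports reserved per slot
)

// portEnv maps a worktree path to a deterministic block of four ports.
// Hash collisions between different paths are possible.
func portEnv(worktree string) map[string]int {
	h := fnv.New32a()
	_, _ = h.Write([]byte(worktree))
	base := basePort + int(h.Sum32()%slots)*stride
	return map[string]int{
		"CODER_DEV_PORT":            base,
		"CODER_DEV_WEB_PORT":        base + 1,
		"CODER_DEV_PROXY_PORT":      base + 2,
		"CODER_DEV_PROMETHEUS_PORT": base + 3,
	}
}

func main() {
	env := portEnv("/home/dev/worktrees/feature-a") // hypothetical path
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("export %s=%d\n", k, env[k])
	}
}
```

The printed `export` lines could be evaluated in the shell before starting `./scripts/develop.sh`.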
Symptom: Test using time.Sleep
- Likely cause: A test waits for time to pass instead of synchronizing on a deterministic condition or using the quartz clock.
- How to reproduce: Add a test that depends on `time.Sleep`, then run it under load or with the race detector until it flakes.
- How to diagnose: Search the test diff for `time.Sleep`. Inspect whether the code under test can use `quartz` or another explicit synchronization point.
- Existing docs or tools: `AGENTS.md`, Testing Patterns and Best Practices, and the quartz README referenced from `AGENTS.md`.
- Missing harness piece: Agents need a failure entry that labels sleep-based waiting as a flake risk before review.
- Proposed prevention: Replace `time.Sleep` with a fake clock, trapped ticker, channel, poll with timeout, or another deterministic signal.
Symptom: DB work inside InTx uses the outer store
- Likely cause: Code inside a transaction closure calls `api.Database`, `p.db`, or a helper that uses the outer store instead of the `tx` handle.
- How to reproduce: Add DB work inside `db.InTx(...)` that calls back into the outer store, then exercise it under concurrent load.
- How to diagnose: Inspect the closure and helper call graph for database calls that do not use the transaction handle. Look for pool waits, idle in transaction symptoms, or deadlocks under load.
- Existing docs or tools: `AGENTS.md`, Database Development Patterns, and code review of `InTx` closures.
- Missing harness piece: No automated check currently proves every helper used inside `InTx` stays on the transaction handle.
- Proposed prevention: Fetch read-only inputs before opening the transaction, pass `tx` into helpers that need DB access, and avoid receiver helpers that hide outer-store usage.
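The prevention advice can be illustrated with a deliberately simplified model. The `Store` interface, `memStore`, and `recordEvent` below are hypothetical stand-ins, not the project's real database types. The point is only the shape: helpers take the store as a parameter so the closure can hand them `tx`.

```go
package main

import "fmt"

// Store is a hypothetical, simplified stand-in for a database interface.
type Store interface {
	InsertEvent(name string) error
	InTx(fn func(tx Store) error) error
}

// memStore records which handle performed each write.
type memStore struct {
	label string
	log   *[]string
}

func (s *memStore) InsertEvent(name string) error {
	*s.log = append(*s.log, s.label+":"+name)
	return nil
}

// InTx passes a distinct handle to the closure, mimicking a transaction.
func (s *memStore) InTx(fn func(tx Store) error) error {
	return fn(&memStore{label: "tx", log: s.log})
}

// recordEvent receives the store explicitly, so it cannot silently fall
// back to an outer store held on a receiver.
func recordEvent(db Store, name string) error {
	return db.InsertEvent(name)
}

func run(outer Store) error {
	return outer.InTx(func(tx Store) error {
		// Pass tx, not outer. Using outer here would bypass the transaction.
		return recordEvent(tx, "created")
	})
}

func main() {
	var log []string
	if err := run(&memStore{label: "outer", log: &log}); err != nil {
		panic(err)
	}
	fmt.Println(log) // prints [tx:created]
}
```

Because the helper has no receiver holding a store, a reviewer can see from the call site alone which handle performs the write.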
Symptom: New API endpoint missing swagger annotations
- Likely cause: A handler or route was added without matching swagger comments.
- How to reproduce: Add a stable HTTP endpoint and skip `@Summary`, `@Router`, or related annotations.
- How to diagnose: Compare the new handler with nearby handlers and inspect generated API docs for the route.
- Existing docs or tools: `AGENTS.md`, Documentation Style Guide, and API generation checks.
- Missing harness piece: Agents need a doc reminder that endpoint work includes docs unless the route is intentionally experimental.
- Proposed prevention: Add swagger annotations in the same change as stable endpoints. For experimental or unstable API paths, add `// @x-apidocgen {"skip": true}` after `@Router`.