mirror of
https://github.com/coder/coder.git
synced 2026-06-02 20:48:20 +00:00
85792d08bc
This PR adds an opinionated harness-engineering layer for agent-driven workflows: a small set of agent-readable docs, mechanical structure checks, structured CI failure summaries, an architecture-lint umbrella, and per-worktree dev-server isolation. The goal is to make local dev, tests, and CI mechanically inspectable by agents without changing app runtime behavior.

## What landed

**Agent docs and navigation**
- `.claude/docs/OBSERVABILITY.md`, `.claude/docs/DEV_ISOLATION.md`, `.claude/docs/AGENT_FAILURES.md`: task-oriented guides for logs, tracing, Prometheus, dev-server isolation, and a seeded failure catalog.
- `AGENTS.md`: added an `Agent navigation` block, then trimmed the file from 375 to 229 lines by migrating duplicated detail into `WORKFLOWS.md`, `GO.md`, `TESTING.md`, and `DATABASE.md`. The user-managed custom-instructions block is preserved.
- `.agents/docs`: symlink mirror of `.claude/docs` for agent runtimes that look under `.agents`.

**Mechanical checks**
- `scripts/check_agents_structure.sh`: validates `@...` references in tracked `AGENTS.md` files and warns when root grows past 600 lines. Wired as `make lint/agents` and into `make lint`.
- `scripts/audit-agent-readiness.sh`: report-first audit of harness readiness. Currently `10 ok, 0 warn, 0 fail`.
- `scripts/check_architecture.sh` / `make lint/architecture`: umbrella architecture-lint target. Consolidates the existing `check_enterprise_imports.sh` and `check_codersdk_imports.sh` so they run exactly once via the umbrella. Slot is open for new high-confidence rules.

**Structured CI failure summaries**
- `scripts/playwright-failure-summary.sh`: parses `site/test-results/results.json` and writes Markdown to `$GITHUB_STEP_SUMMARY` on failure. Wired into the `test-e2e` matrix job.
- `scripts/go-test-failure-summary.sh`: parses `go test -json` line-delimited output the same way. Wired into `test-go-pg`, `test-go-pg-17`, and `test-go-race-pg` by injecting `gotestsum --jsonfile` in the workflow without touching `Makefile`. JSON also uploaded as a CI artifact on failure.
- `site/e2e/playwright.config.ts`: enables `screenshot: only-on-failure`, `trace: retain-on-failure`, JSON reporter, and HTML reporter alongside existing reporters.
- `.github/workflows/ci.yaml`: failure artifact uploads for Playwright now use `if: failure()` and predictable names (`playwright-artifacts-<variant>-<sha>`).

**Per-worktree dev-server isolation** (`scripts/develop/main.go`)
- Deterministic FNV-64a hash of the worktree path produces a port offset in `[0, 1000)` (50 buckets, step 20 to avoid API/proxy overlap across adjacent buckets).
- Offset is applied only to defaults; both env vars (`CODER_DEV_PORT`, `CODER_DEV_WEB_PORT`, `CODER_DEV_PROXY_PORT`, `CODER_DEV_PROMETHEUS_PORT`) and CLI flags retain priority.
- Hardcoded ports `9090` (embedded Prometheus UI) and `12345` (Delve) are unchanged by design.
- Startup banner shows each port's source: `default`, `offset`, or `explicit`.
- Unit tests in `scripts/develop/main_test.go` cover determinism, bounds, no-overlap across the four ports, and explicit-skip behavior.
- State (`.coderv2/`) was already worktree-isolated via `os.Getwd()`, so no state-dir changes were needed.

## Validation

`make lint/agents`, `make lint/architecture`, `make lint/emdash`, `bash scripts/audit-agent-readiness.sh` (10 ok, 0 warn, 0 fail), `shellcheck` on all 5 new scripts, `go test ./scripts/develop/...`, and `js-yaml` parse of `ci.yaml` all pass. Synthetic fixtures verify both failure-summary scripts handle empty/missing input (silent exit 0), ANSI-stripped output, and parent/subtest formatting.

## Known follow-ups (deferred)

- Frontend Storybook/Vitest failure summary: lowest-leverage slice of the failure-summary work. Skipping until observed pain.
- Architecture lint currently only delegates to existing import checks; new rules (`InTx` outer-store detection, swagger-annotation lint) plug in as needed.
- 50 port-offset buckets means two worktree paths can occasionally collide. The DEV_ISOLATION doc tells users to set the relevant env var when this happens.

> Mux opened this PR on Mike's behalf.
7.3 KiB
Database Development Patterns
Database Work Overview
Database Generation Process
- Modify SQL files in coderd/database/queries/
- Run make gen
- If errors about audit table, update enterprise/audit/table.go
- Run make gen again
- Run make lint to catch any remaining issues
Migration Guidelines
Creating Migration Files
Location: coderd/database/migrations/
Format: {number}_{description}.{up|down}.sql
- Number must be unique and sequential
- Always include both up and down migrations
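The naming format and the up/down pairing rule above can be checked mechanically. The following is a minimal sketch, not a script from the repository: the regular expression (digit width, allowed description characters), the function name missingPairs, and the sample file names are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// migrationName matches {number}_{description}.{up|down}.sql.
// Assumption: numbers are plain digits and descriptions are lowercase
// snake_case; the real repository may be stricter or looser.
var migrationName = regexp.MustCompile(`^(\d+)_([a-z0-9_]+)\.(up|down)\.sql$`)

// missingPairs returns "{number}_{description}" for every migration that
// has only one of its up/down files. Names that do not match the format
// are skipped.
func missingPairs(files []string) []string {
	seen := map[string]map[string]bool{} // migration -> direction -> present
	var order []string                   // keeps first-seen order for stable output
	for _, f := range files {
		m := migrationName.FindStringSubmatch(f)
		if m == nil {
			continue
		}
		key := m[1] + "_" + m[2]
		if seen[key] == nil {
			seen[key] = map[string]bool{}
			order = append(order, key)
		}
		seen[key][m[3]] = true
	}
	var missing []string
	for _, key := range order {
		if !seen[key]["up"] || !seen[key]["down"] {
			missing = append(missing, key)
		}
	}
	return missing
}

func main() {
	files := []string{
		"000001_add_widgets.up.sql",
		"000001_add_widgets.down.sql",
		"000002_add_gadgets.up.sql", // no matching down file
	}
	fmt.Println(missingPairs(files)) // prints [000002_add_gadgets]
}
```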
Helper Scripts
| Script | Purpose |
|---|---|
| ./coderd/database/migrations/create_migration.sh "migration name" | Creates new migration files |
| ./coderd/database/migrations/fix_migration_numbers.sh | Renumbers migrations to avoid conflicts |
| ./coderd/database/migrations/create_fixture.sh "fixture name" | Creates test fixtures for migrations |
Database Query Organization
- MUST DO: Make any database changes (adding or modifying queries) in the coderd/database/queries/*.sql files
- MUST DO: Group queries in files by context, e.g. prebuilds.sql, users.sql, oauth2.sql
- After changing any coderd/database/queries/*.sql file, you must run make gen to regenerate the corresponding ORM code
Query Naming
- Use ByX when X is the lookup or filter column.
- Use PerX or GroupedByX when X is the aggregation or grouping dimension.
- Avoid ByX names for grouped queries.
Handling Nullable Fields
Use sql.NullString, sql.NullBool, etc. for optional database fields:
CodeChallenge: sql.NullString{
    String: params.codeChallenge,
    Valid:  params.codeChallenge != "",
}
Set .Valid = true when providing values.
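When the same "valid only if non-empty" rule applies to many fields, a small helper keeps it consistent. This is a sketch under the assumption that an empty string should map to SQL NULL; the helper name nullString is hypothetical and is not claimed to exist in the codebase.

```go
package main

import (
	"database/sql"
	"fmt"
)

// nullString is a hypothetical helper: it treats the empty string as
// "no value" and marks every other string as valid.
func nullString(s string) sql.NullString {
	return sql.NullString{String: s, Valid: s != ""}
}

func main() {
	fmt.Println(nullString("abc").Valid) // true
	fmt.Println(nullString("").Valid)    // false
}
```

If an empty string is a meaningful value for a column, this helper is the wrong tool and Valid should be set explicitly.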
Database-to-SDK Conversions
- Extract explicit db-to-SDK conversion helpers instead of inlining large conversion blocks inside handlers.
- Keep nullable-field handling, type coercion, and response shaping in the converter so handlers stay focused on request flow and authorization.
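The two points above could look roughly like the following. All type and function names here (dbApp, sdkApp, convertApp, convertApps) are hypothetical stand-ins, not the project's generated or SDK types.

```go
package main

import (
	"database/sql"
	"fmt"
)

// dbApp stands in for a generated database row type (hypothetical).
type dbApp struct {
	ID   string
	Name string
	Icon sql.NullString
}

// sdkApp stands in for an SDK response type (hypothetical).
type sdkApp struct {
	ID   string
	Name string
	Icon string
}

// convertApp owns nullable-field handling and response shaping, so a
// handler only has to call it.
func convertApp(a dbApp) sdkApp {
	icon := ""
	if a.Icon.Valid {
		icon = a.Icon.String
	}
	return sdkApp{ID: a.ID, Name: a.Name, Icon: icon}
}

// convertApps converts a slice and always returns a non-nil slice.
func convertApps(apps []dbApp) []sdkApp {
	out := make([]sdkApp, 0, len(apps))
	for _, a := range apps {
		out = append(out, convertApp(a))
	}
	return out
}

func main() {
	row := dbApp{ID: "a1", Name: "demo", Icon: sql.NullString{String: "/icon.png", Valid: true}}
	fmt.Printf("%+v\n", convertApp(row))
}
```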
Audit Table Updates
If adding fields to auditable types:
- Update enterprise/audit/table.go
- Add each new field with appropriate action:
  - ActionTrack: Field should be tracked in audit logs
  - ActionIgnore: Field should be ignored in audit logs
  - ActionSecret: Field contains sensitive data
- Run make gen to verify no audit errors
Database Architecture
Core Components
- PostgreSQL 13+ recommended for production
- Migrations managed with migrate
- Database authorization through dbauthz package
Authorization Patterns
// Public endpoints needing system access (OAuth2 registration)
app, err := api.Database.GetOAuth2ProviderAppByClientID(dbauthz.AsSystemRestricted(ctx), clientID)
// Authenticated endpoints with user context
app, err := api.Database.GetOAuth2ProviderAppByClientID(ctx, clientID)
// System operations in middleware
roles, err := db.GetAuthorizationUserRoles(dbauthz.AsSystemRestricted(ctx), userID)
Common Database Issues
Migration Issues
- Migration conflicts: Use fix_migration_numbers.sh to renumber
- Missing down migration: Always create both up and down files
- Schema inconsistencies: Verify against existing schema
Field Handling Issues
- Nullable field errors: Use sql.Null* types consistently
- Missing audit entries: Update enterprise/audit/table.go
Query Issues
- Query organization: Group related queries in appropriate files
- Generated code errors: Run make gen after query changes
- Performance issues: Add appropriate indexes in migrations
Database Testing
Test Database Setup
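func TestDatabaseFunction(t *testing.T) {
    // NewDB returns the store and a pubsub; the pubsub is unused here.
    db, _ := dbtestutil.NewDB(t)
    ctx := testutil.Context(t, testutil.WaitShort)
    // Test with real database. GetSomething, param, and expected are placeholders.
    result, err := db.GetSomething(ctx, param)
    require.NoError(t, err)
    require.Equal(t, expected, result)
}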
Best Practices
Schema Design
- Use appropriate data types: VARCHAR for strings, TIMESTAMP for times
- Add constraints: NOT NULL, UNIQUE, FOREIGN KEY as appropriate
- Create indexes: For frequently queried columns
- Consider performance: Normalize appropriately but avoid over-normalization
Query Writing
- Use parameterized queries: Prevent SQL injection
- Handle errors appropriately: Check for specific error types
- Use transactions: For related operations that must succeed together
- Optimize queries: Use EXPLAIN to understand query performance
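As one illustration of checking for specific error types, the sketch below separates "no rows" from other failures using errors.Is and sql.ErrNoRows from the standard library. The function names classify and wrappedNoRows and the returned labels are assumptions for the example, not project conventions.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// classify maps a query error to a coarse outcome. errors.Is sees through
// wrapping added with %w, so callers may add context freely.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, sql.ErrNoRows):
		return "not found"
	default:
		return "internal error"
	}
}

// wrappedNoRows simulates a query layer that wraps sql.ErrNoRows with context.
func wrappedNoRows() error {
	return fmt.Errorf("get user: %w", sql.ErrNoRows)
}

func main() {
	fmt.Println(classify(nil))                // ok
	fmt.Println(classify(wrappedNoRows()))    // not found
	fmt.Println(classify(errors.New("boom"))) // internal error
}
```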
Transaction Safety with InTx
- Inside db.InTx(...) closures, do not use the outer store (api.Database, p.db, etc.) directly or indirectly. Use the tx handle for DB work inside the closure, or fetch read-only inputs before opening the transaction.
- Watch for helper methods on a receiver that hide outer-store access. A call like p.someHelper(ctx) is still unsafe inside InTx if that helper uses p.db internally.
- Using the outer store while a transaction is open can hold one connection and then block on another pool checkout, which can cause pool starvation and idle in transaction incidents under load.
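One way to keep helpers safe inside a transaction is to make them take the store as a parameter, so the closure can pass tx. The sketch below uses a deliberately simplified, hypothetical Store interface and an in-memory fake; the real InTx signature and store types differ, and every name here is illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical, simplified stand-in for a database store.
type Store interface {
	GetQuota(user string) (int, error)
	SetQuota(user string, quota int) error
	InTx(fn func(tx Store) error) error
}

// memStore is an in-memory fake that only exists to make the sketch runnable.
type memStore struct{ quotas map[string]int }

func (m *memStore) GetQuota(user string) (int, error) {
	q, ok := m.quotas[user]
	if !ok {
		return 0, errors.New("user not found")
	}
	return q, nil
}

func (m *memStore) SetQuota(user string, quota int) error {
	m.quotas[user] = quota
	return nil
}

// InTx runs fn against a copy and commits the copy only if fn returns nil.
func (m *memStore) InTx(fn func(tx Store) error) error {
	tx := &memStore{quotas: make(map[string]int, len(m.quotas))}
	for k, v := range m.quotas {
		tx.quotas[k] = v
	}
	if err := fn(tx); err != nil {
		return err // rollback: the copy is discarded
	}
	m.quotas = tx.quotas
	return nil
}

type provisioner struct{ db Store }

// bumpQuota takes the store explicitly instead of reading p.db, so it is
// safe to call with either the outer store or a tx handle.
func bumpQuota(db Store, user string, delta int) error {
	q, err := db.GetQuota(user)
	if err != nil {
		return err
	}
	return db.SetQuota(user, q+delta)
}

func (p *provisioner) grant(user string, delta int) error {
	return p.db.InTx(func(tx Store) error {
		// All DB work goes through tx; p.db is never touched in here.
		return bumpQuota(tx, user, delta)
	})
}

func main() {
	p := &provisioner{db: &memStore{quotas: map[string]int{"alice": 1}}}
	if err := p.grant("alice", 2); err != nil {
		panic(err)
	}
	q, _ := p.db.GetQuota("alice")
	fmt.Println(q) // 3
}
```

The unsafe variant would be a method such as p.bumpQuota(user, delta) that reads p.db internally; calling it inside the closure would bypass tx even though the call site looks harmless.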
Migration Writing
- Make migrations reversible: Always include down migration
- Test migrations: On copy of production data if possible
- Keep migrations small: One logical change per migration
- Document complex changes: Add comments explaining rationale
Advanced Patterns
Complex Queries
-- Example: Complex join with aggregation
SELECT
    u.id,
    u.username,
    COUNT(w.id) AS workspace_count
FROM users u
LEFT JOIN workspaces w ON u.id = w.owner_id
WHERE u.created_at > $1
GROUP BY u.id, u.username
ORDER BY workspace_count DESC;
Conditional Queries
-- Example: Dynamic filtering
SELECT * FROM oauth2_provider_apps
WHERE
($1::text IS NULL OR name ILIKE '%' || $1 || '%')
AND ($2::uuid IS NULL OR organization_id = $2)
ORDER BY created_at DESC;
Audit Patterns
// Example: Auditable database operation
func (q *sqlQuerier) UpdateUser(ctx context.Context, arg UpdateUserParams) (User, error) {
// Implementation here
// Audit the change
if auditor := audit.FromContext(ctx); auditor != nil {
auditor.Record(audit.UserUpdate{
UserID: arg.ID,
Old: oldUser,
New: newUser,
})
}
return newUser, nil
}
Debugging Database Issues
Common Debug Commands
# Run tests (starts Postgres automatically if needed)
make test
# Run specific database tests
go test ./coderd/database/... -run TestSpecificFunction
# Check query generation
make gen
# Verify audit table
make lint
Debug Techniques
- Enable query logging: Set appropriate log levels
- Use database tools: pgAdmin, psql for direct inspection
- Check constraints: UNIQUE, FOREIGN KEY violations
- Analyze performance: Use EXPLAIN ANALYZE for slow queries
Troubleshooting Checklist
- Migration files exist (both up and down)
- make gen run after query changes
- Audit table updated for new fields
- Nullable fields use sql.Null* types
- Authorization context appropriate for endpoint type