coder/.claude/docs/DATABASE.md
Michael Suchacz 85792d08bc feat: add harness engineering layer for agent workflows (#24791)
This PR adds an opinionated harness-engineering layer for agent-driven
workflows: a small set of agent-readable docs, mechanical structure
checks, structured CI failure summaries, an architecture-lint umbrella,
and per-worktree dev-server isolation. The goal is to make local dev,
tests, and CI mechanically inspectable by agents without changing app
runtime behavior.

## What landed

**Agent docs and navigation**
- `.claude/docs/OBSERVABILITY.md`, `.claude/docs/DEV_ISOLATION.md`,
`.claude/docs/AGENT_FAILURES.md`: task-oriented guides for logs,
tracing, Prometheus, dev-server isolation, and a seeded failure catalog.
- `AGENTS.md`: added an `Agent navigation` block, then trimmed the file
from 375 to 229 lines by migrating duplicated detail into
`WORKFLOWS.md`, `GO.md`, `TESTING.md`, and `DATABASE.md`. The
user-managed custom-instructions block is preserved.
- `.agents/docs`: symlink mirror of `.claude/docs` for agent runtimes
that look under `.agents`.

**Mechanical checks**
- `scripts/check_agents_structure.sh`: validates `@...` references in
tracked `AGENTS.md` files and warns when root grows past 600 lines.
Wired as `make lint/agents` and into `make lint`.
- `scripts/audit-agent-readiness.sh`: report-first audit of harness
readiness. Currently `10 ok, 0 warn, 0 fail`.
- `scripts/check_architecture.sh` / `make lint/architecture`: umbrella
architecture-lint target. Consolidates the existing
`check_enterprise_imports.sh` and `check_codersdk_imports.sh` so they
run exactly once via the umbrella. Slot is open for new high-confidence
rules.

**Structured CI failure summaries**
- `scripts/playwright-failure-summary.sh`: parses
`site/test-results/results.json` and writes Markdown to
`$GITHUB_STEP_SUMMARY` on failure. Wired into the `test-e2e` matrix job.
- `scripts/go-test-failure-summary.sh`: parses `go test -json`
line-delimited output the same way. Wired into `test-go-pg`,
`test-go-pg-17`, and `test-go-race-pg` by injecting `gotestsum
--jsonfile` in the workflow without touching `Makefile`. JSON also
uploaded as a CI artifact on failure.
- `site/e2e/playwright.config.ts`: enables `screenshot:
only-on-failure`, `trace: retain-on-failure`, JSON reporter, and HTML
reporter alongside existing reporters.
- `.github/workflows/ci.yaml`: failure artifact uploads for Playwright
now use `if: failure()` and predictable names
(`playwright-artifacts-<variant>-<sha>`).

**Per-worktree dev-server isolation** (`scripts/develop/main.go`)
- Deterministic FNV-64a hash of the worktree path produces a port offset
in `[0, 1000)` (50 buckets, step 20 to avoid API/proxy overlap across
adjacent buckets).
- Offset is applied only to defaults; both env vars (`CODER_DEV_PORT`,
`CODER_DEV_WEB_PORT`, `CODER_DEV_PROXY_PORT`,
`CODER_DEV_PROMETHEUS_PORT`) and CLI flags retain priority.
- Hardcoded ports `9090` (embedded Prometheus UI) and `12345` (Delve)
are unchanged by design.
- Startup banner shows each port's source: `default`, `offset`, or
`explicit`.
- Unit tests in `scripts/develop/main_test.go` cover determinism,
bounds, no-overlap across the four ports, and explicit-skip behavior.
- State (`.coderv2/`) was already worktree-isolated via `os.Getwd()`, so
no state-dir changes were needed.

## Validation

`make lint/agents`, `make lint/architecture`, `make lint/emdash`, `bash
scripts/audit-agent-readiness.sh` (10 ok, 0 warn, 0 fail), `shellcheck`
on all 5 new scripts, `go test ./scripts/develop/...`, and `js-yaml`
parse of `ci.yaml` all pass. Synthetic fixtures verify both
failure-summary scripts handle empty/missing input (silent exit 0),
ANSI-stripped output, and parent/subtest formatting.

## Known follow-ups (deferred)

- Frontend Storybook/Vitest failure summary: lowest-leverage slice of
the failure-summary work. Skipping until observed pain.
- Architecture lint currently only delegates to existing import checks;
new rules (`InTx` outer-store detection, swagger-annotation lint) plug
in as needed.
- 50 port-offset buckets means two worktree paths can occasionally
collide. The DEV_ISOLATION doc tells users to set the relevant env var
when this happens.

> Mux opened this PR on Mike's behalf.
2026-05-11 17:27:29 +02:00


Database Development Patterns

Database Work Overview

Database Generation Process

  1. Modify SQL files in coderd/database/queries/
  2. Run make gen
  3. If make gen reports errors about the audit table, update enterprise/audit/table.go
  4. Run make gen again
  5. Run make lint to catch any remaining issues

Migration Guidelines

Creating Migration Files

Location: coderd/database/migrations/

Format: {number}_{description}.{up|down}.sql

  • Number must be unique and sequential
  • Always include both up and down migrations
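
The pairing rule above can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: the regular expression is an assumption about what `{number}` and `{description}` look like (digits, then lowercase letters, digits, and underscores), and `missingPairs` and the sample file names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// migrationName matches "{number}_{description}.{up|down}.sql".
// Assumption: the number is a run of digits and the description uses
// lowercase letters, digits, and underscores.
var migrationName = regexp.MustCompile(`^(\d+)_([a-z0-9_]+)\.(up|down)\.sql$`)

// missingPairs returns the "{number}_{description}" stems that have only
// one of the two required files (up or down), sorted for stable output.
func missingPairs(files []string) []string {
	seen := map[string]map[string]bool{}
	for _, f := range files {
		m := migrationName.FindStringSubmatch(f)
		if m == nil {
			continue // not a migration file under the assumed pattern
		}
		stem := m[1] + "_" + m[2]
		if seen[stem] == nil {
			seen[stem] = map[string]bool{}
		}
		seen[stem][m[3]] = true
	}
	var missing []string
	for stem, dirs := range seen {
		if !dirs["up"] || !dirs["down"] {
			missing = append(missing, stem)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	files := []string{
		"000001_add_widgets.up.sql",
		"000001_add_widgets.down.sql",
		"000002_add_gadgets.up.sql", // the matching down file is missing
	}
	fmt.Println(missingPairs(files)) // prints [000002_add_gadgets]
}
```

A check like this only covers the "both up and down" rule; uniqueness and ordering of numbers are what fix_migration_numbers.sh is for.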

Helper Scripts

| Script | Purpose |
| --- | --- |
| ./coderd/database/migrations/create_migration.sh "migration name" | Creates new migration files |
| ./coderd/database/migrations/fix_migration_numbers.sh | Renumbers migrations to avoid conflicts |
| ./coderd/database/migrations/create_fixture.sh "fixture name" | Creates test fixtures for migrations |

Database Query Organization

  • MUST DO: Make every database change, whether adding or modifying a query, in the coderd/database/queries/*.sql files
  • MUST DO: Group queries in files by context, e.g. prebuilds.sql, users.sql, oauth2.sql
  • After changing any coderd/database/queries/*.sql file, you must run make gen to regenerate the corresponding database code

Query Naming

  • Use ByX when X is the lookup or filter column.
  • Use PerX or GroupedByX when X is the aggregation or grouping dimension.
  • Avoid ByX names for grouped queries.

Handling Nullable Fields

Use sql.NullString, sql.NullBool, etc. for optional database fields:

CodeChallenge: sql.NullString{
    String: params.codeChallenge,
    Valid:  params.codeChallenge != "",
}

Set .Valid = true only when a value is present; a field with Valid set to false is stored as NULL.
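
When the same "empty means NULL" rule applies to several fields, it can be wrapped in a small helper. This is a minimal sketch using only the standard library; `nullString` and `stringOr` are hypothetical names, not helpers from the codebase.

```go
package main

import (
	"database/sql"
	"fmt"
)

// nullString is a hypothetical helper: an empty string is treated as
// "no value" and maps to NULL (Valid == false).
func nullString(s string) sql.NullString {
	return sql.NullString{String: s, Valid: s != ""}
}

// stringOr returns the stored value, or fallback when the column is NULL.
func stringOr(ns sql.NullString, fallback string) string {
	if !ns.Valid {
		return fallback
	}
	return ns.String
}

func main() {
	fmt.Println(nullString("abc").Valid, nullString("").Valid) // prints true false
	fmt.Println(stringOr(nullString(""), "none"))              // prints none
}
```

This convention cannot distinguish an intentionally empty string from a missing one; where that distinction matters, set Valid explicitly instead.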

Database-to-SDK Conversions

  • Extract explicit db-to-SDK conversion helpers instead of inlining large conversion blocks inside handlers.
  • Keep nullable-field handling, type coercion, and response shaping in the converter so handlers stay focused on request flow and authorization.
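
As a rough illustration of the converter shape described above, the sketch below uses hypothetical types `dbApp` and `sdkApp` standing in for a generated database row and its SDK response type. The field names and the RFC 3339 timestamp format are assumptions made for the example.

```go
package main

import (
	"database/sql"
	"fmt"
	"time"
)

// dbApp is a hypothetical database row type.
type dbApp struct {
	ID        string
	Name      string
	Icon      sql.NullString
	CreatedAt time.Time
}

// sdkApp is a hypothetical SDK response type.
type sdkApp struct {
	ID        string `json:"id"`
	Name      string `json:"name"`
	Icon      string `json:"icon,omitempty"`
	CreatedAt string `json:"created_at"`
}

// convertApp keeps NULL handling and type coercion in one place so that
// handlers only deal with request flow and authorization.
func convertApp(a dbApp) sdkApp {
	out := sdkApp{
		ID:        a.ID,
		Name:      a.Name,
		CreatedAt: a.CreatedAt.UTC().Format(time.RFC3339),
	}
	if a.Icon.Valid {
		out.Icon = a.Icon.String
	}
	return out
}

// convertApps converts a slice of rows, preserving order.
func convertApps(apps []dbApp) []sdkApp {
	out := make([]sdkApp, 0, len(apps))
	for _, a := range apps {
		out = append(out, convertApp(a))
	}
	return out
}

// sampleApp returns an example row with a non-NULL icon.
func sampleApp() dbApp {
	return dbApp{
		ID:        "app-1",
		Name:      "widget",
		Icon:      sql.NullString{String: "/icon/widget.svg", Valid: true},
		CreatedAt: time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC),
	}
}

func main() {
	fmt.Printf("%+v\n", convertApps([]dbApp{sampleApp(), {ID: "app-2", Name: "no-icon"}}))
}
```

A handler would then call `convertApps(rows)` once after its authorization checks, rather than building the response field by field inline.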

Audit Table Updates

If adding fields to auditable types:

  1. Update enterprise/audit/table.go
  2. Add each new field with appropriate action:
    • ActionTrack: Field should be tracked in audit logs
    • ActionIgnore: Field should be ignored in audit logs
    • ActionSecret: Field contains sensitive data
  3. Run make gen to verify no audit errors
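
To make the "each new field needs an action" rule concrete, here is a self-contained sketch. The `Action` type and its values are local stand-ins that only mirror the three action names listed above, and the field names are hypothetical; the real definitions live in enterprise/audit/table.go and may be shaped differently.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is a local stand-in for the audit action type.
type Action string

const (
	ActionTrack  Action = "track"  // field is tracked in audit logs
	ActionIgnore Action = "ignore" // field is ignored in audit logs
	ActionSecret Action = "secret" // field contains sensitive data
)

// unlistedFields returns the fields that have no entry in table, sorted.
// A non-empty result corresponds to the kind of audit error the steps
// above are meant to resolve.
func unlistedFields(fields []string, table map[string]Action) []string {
	var missing []string
	for _, f := range fields {
		if _, ok := table[f]; !ok {
			missing = append(missing, f)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical auditable type with a newly added "api_token" field.
	fields := []string{"id", "name", "api_token", "updated_at"}
	table := map[string]Action{
		"id":         ActionTrack,
		"name":       ActionTrack,
		"updated_at": ActionIgnore,
	}
	fmt.Println(unlistedFields(fields, table)) // prints [api_token]

	table["api_token"] = ActionSecret
	fmt.Println(len(unlistedFields(fields, table))) // prints 0
}
```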

Database Architecture

Core Components

  • PostgreSQL 13+ recommended for production
  • Migrations managed with migrate
  • Database authorization through dbauthz package

Authorization Patterns

// Public endpoints needing system access (OAuth2 registration)
app, err := api.Database.GetOAuth2ProviderAppByClientID(dbauthz.AsSystemRestricted(ctx), clientID)

// Authenticated endpoints with user context
app, err := api.Database.GetOAuth2ProviderAppByClientID(ctx, clientID)

// System operations in middleware
roles, err := db.GetAuthorizationUserRoles(dbauthz.AsSystemRestricted(ctx), userID)

Common Database Issues

Migration Issues

  1. Migration conflicts: Use fix_migration_numbers.sh to renumber
  2. Missing down migration: Always create both up and down files
  3. Schema inconsistencies: Verify against existing schema

Field Handling Issues

  1. Nullable field errors: Use sql.Null* types consistently
  2. Missing audit entries: Update enterprise/audit/table.go

Query Issues

  1. Query organization: Group related queries in appropriate files
  2. Generated code errors: Run make gen after query changes
  3. Performance issues: Add appropriate indexes in migrations

Database Testing

Test Database Setup

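func TestDatabaseFunction(t *testing.T) {
    // NewDB returns the store and a pubsub; only the store is needed here.
    db, _ := dbtestutil.NewDB(t)
    ctx := context.Background()

    // Test with real database. GetSomething, param, and expected are
    // placeholders for the query under test and its inputs.
    result, err := db.GetSomething(ctx, param)
    require.NoError(t, err)
    require.Equal(t, expected, result)
}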

Best Practices

Schema Design

  1. Use appropriate data types: VARCHAR for strings, TIMESTAMP for times
  2. Add constraints: NOT NULL, UNIQUE, FOREIGN KEY as appropriate
  3. Create indexes: For frequently queried columns
  4. Consider performance: Normalize appropriately but avoid over-normalization

Query Writing

  1. Use parameterized queries: Prevent SQL injection
  2. Handle errors appropriately: Check for specific error types
  3. Use transactions: For related operations that must succeed together
  4. Optimize queries: Use EXPLAIN to understand query performance
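
For the "check for specific error types" point, the usual standard-library approach is `errors.Is` against `sql.ErrNoRows`, which also matches wrapped errors. The sketch below simulates the query with an in-memory map; `lookup`, `getUser`, and `errNotFound` are hypothetical names.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// errNotFound is a hypothetical caller-facing error.
var errNotFound = errors.New("user not found")

// lookup simulates a query: unknown IDs return a wrapped sql.ErrNoRows,
// the way a single-row query reports "no matching row".
func lookup(id string) (string, error) {
	users := map[string]string{"1": "alice"}
	name, ok := users[id]
	if !ok {
		return "", fmt.Errorf("lookup %q: %w", id, sql.ErrNoRows)
	}
	return name, nil
}

// getUser distinguishes "no such row" from every other failure.
func getUser(id string) (string, error) {
	name, err := lookup(id)
	switch {
	case err == nil:
		return name, nil
	case errors.Is(err, sql.ErrNoRows):
		return "", errNotFound
	default:
		return "", fmt.Errorf("get user %q: %w", id, err)
	}
}

func main() {
	for _, id := range []string{"1", "2"} {
		name, err := getUser(id)
		fmt.Printf("id=%s name=%q err=%v\n", id, name, err)
	}
}
```

Comparing with `==` instead of `errors.Is` would miss the wrapped error returned by `lookup`, which is why the switch uses `errors.Is`.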

Transaction Safety with InTx

  • Inside db.InTx(...) closures, do not use the outer store (api.Database, p.db, etc.) directly or indirectly. Use the tx handle for DB work inside the closure, or fetch read-only inputs before opening the transaction.
  • Watch for helper methods on a receiver that hide outer-store access. A call like p.someHelper(ctx) is still unsafe inside InTx if that helper uses p.db internally.
  • Using the outer store while a transaction is open can hold one connection and then block on another pool checkout, which can cause pool starvation and "idle in transaction" incidents under load.
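
The rule above is easier to see with a toy store. In this sketch, `Store` is a deliberately simplified, hypothetical interface (the real InTx signature may take additional arguments), and `fakeStore` records which handle served each call so the example can show that all work inside the closure went through the tx handle.

```go
package main

import "fmt"

// Store is a hypothetical, minimal stand-in for a database store.
type Store interface {
	GetCount() (int, error)
	SetCount(n int) error
	InTx(fn func(tx Store) error) error
}

// fakeStore logs every call prefixed with the handle name ("db" or "tx").
type fakeStore struct {
	name  string
	count *int
	calls *[]string
}

func (s fakeStore) GetCount() (int, error) {
	*s.calls = append(*s.calls, s.name+".GetCount")
	return *s.count, nil
}

func (s fakeStore) SetCount(n int) error {
	*s.calls = append(*s.calls, s.name+".SetCount")
	*s.count = n
	return nil
}

// InTx hands the closure a separate "tx" handle over the same data.
func (s fakeStore) InTx(fn func(tx Store) error) error {
	return fn(fakeStore{name: "tx", count: s.count, calls: s.calls})
}

// newFakeStore returns an outer store and a pointer to its call log.
func newFakeStore() (Store, *[]string) {
	count, calls := 0, []string{}
	return fakeStore{name: "db", count: &count, calls: &calls}, &calls
}

type provisioner struct{ db Store }

// increment does all DB work through tx. Calling p.db.GetCount() or a
// helper that uses p.db inside the closure would be the unsafe pattern.
func (p provisioner) increment() error {
	return p.db.InTx(func(tx Store) error {
		n, err := tx.GetCount()
		if err != nil {
			return err
		}
		return tx.SetCount(n + 1)
	})
}

func main() {
	db, calls := newFakeStore()
	p := provisioner{db: db}
	if err := p.increment(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(*calls) // prints [tx.GetCount tx.SetCount]
}
```

If a helper needs store access inside the closure, one option is to pass the tx handle to it as a parameter instead of letting it reach for the receiver's store.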

Migration Writing

  1. Make migrations reversible: Always include down migration
  2. Test migrations: On copy of production data if possible
  3. Keep migrations small: One logical change per migration
  4. Document complex changes: Add comments explaining rationale

Advanced Patterns

Complex Queries

-- Example: Complex join with aggregation
SELECT
    u.id,
    u.username,
    COUNT(w.id) as workspace_count
FROM users u
LEFT JOIN workspaces w ON u.id = w.owner_id
WHERE u.created_at > $1
GROUP BY u.id, u.username
ORDER BY workspace_count DESC;

Conditional Queries

-- Example: Dynamic filtering
SELECT * FROM oauth2_provider_apps
WHERE
    ($1::text IS NULL OR name ILIKE '%' || $1 || '%')
    AND ($2::uuid IS NULL OR organization_id = $2)
ORDER BY created_at DESC;

Audit Patterns

// Example: Auditable database operation
func (q *sqlQuerier) UpdateUser(ctx context.Context, arg UpdateUserParams) (User, error) {
    // Implementation here (illustrative): load oldUser, apply the update, and obtain newUser

    // Audit the change
    if auditor := audit.FromContext(ctx); auditor != nil {
        auditor.Record(audit.UserUpdate{
            UserID: arg.ID,
            Old:    oldUser,
            New:    newUser,
        })
    }

    return newUser, nil
}

Debugging Database Issues

Common Debug Commands

# Run tests (starts Postgres automatically if needed)
make test

# Run specific database tests
go test ./coderd/database/... -run TestSpecificFunction

# Check query generation
make gen

# Verify audit table
make lint

Debug Techniques

  1. Enable query logging: Set appropriate log levels
  2. Use database tools: pgAdmin, psql for direct inspection
  3. Check constraints: UNIQUE, FOREIGN KEY violations
  4. Analyze performance: Use EXPLAIN ANALYZE for slow queries

Troubleshooting Checklist

  • Migration files exist (both up and down)
  • make gen run after query changes
  • Audit table updated for new fields
  • Nullable fields use sql.Null* types
  • Authorization context appropriate for endpoint type