## Summary
This fixes the stale helper-binary class of generator bugs in the
Makefile by adding the repo packages and embedded files that are
compiled into each affected `_gen/bin/*` helper as real prerequisites of
the helper binary target.
The concrete issue that prompted this was an audit docs regeneration
after a rebase. `docs/admin/security/audit-logs.md` depends on
`enterprise/audit/table.go`, so the docs target reran, but
`_gen/bin/auditdocgen` was only an order-only prerequisite and its own
rule only depended on `scripts/auditdocgen/*.go`. Because the stale
local `auditdocgen` binary had been compiled before `UserSecret` was
added to `enterprise/audit/table.go`, it regenerated the audit docs
without the `UserSecret` row even though the source table still
contained it.
This is the same failure mode I recently fixed for `_gen/bin/clidocgen`
in #24302 and `_gen/bin/modeloptionsgen` in #24543. Those fixes made the
binaries depend on the package sources and embedded template files whose
compile-time data they read at runtime, rather than relying on output
targets to mention those files. This PR applies that pattern to the
other high-value helper binaries with the same risk.
## Changes
- Rebuild `_gen/bin/auditdocgen` when `enterprise/audit/*.go` changes,
so audit docs are generated from the current `AuditableResources` and
`AuditActionMap` data.
- Rebuild `_gen/bin/apitypings` when `codersdk/*.go` changes, and make
`typesGenerated.ts` rerun when the health packages it emits change.
- Rebuild `_gen/bin/check-scopes` and `_gen/bin/apikeyscopesgen` when
RBAC or policy sources change.
- Rebuild `_gen/bin/dbdump` when migration Go or SQL files change, since
the migrations package embeds SQL into the binary.
- Rebuild `_gen/bin/typegen` when its Go sources, embedded templates,
RBAC/policy inputs, string helper, or country data change. Generated
RBAC files are deliberately excluded from the typegen binary input set
to avoid cycles with typegen outputs.
## Why this covers the class
Most generated output targets keep helper binaries as order-only
prerequisites. That is fine for avoiding unnecessary output churn, but
it means the helper binary target must be the cache boundary and must
list everything baked into the compiled binary. The affected helpers
import repo packages that expose maps, constants, struct tags, embedded
templates, or embedded SQL. Without those files on the binary rule, Make
can rerun an output target with an old executable and write semantically
stale generated content.
The fix keeps the existing order-only output structure and instead makes
each binary rule track its compile-time inputs directly. That matches
the previous clidocgen and modeloptionsgen fixes while avoiding a broad
`$(GO_SRC_FILES)` dependency for helpers that only need a small set of
packages.
> Written by Mux, reviewed by a human
macOS ARM reports arm64 via uname -m, but typos GitHub release assets
use aarch64 in their filenames. The mismatch produces a 404, so the
build/typos-$(VERSION) target fails silently and Apple Silicon users
fall back to whatever typos binary their environment provides, such as
the one from nix. That binary may be a different version than the one
pinned in CI, creating a skew where local lint/typos rejects strings
that CI accepts.
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
Rolldown's tokio workers stall when competing with Go compilation
and the production site build for CPU, causing Vite transform
requests to hang. Vitest browser mode has no import-phase timeout,
so a stalled browser import() blocks the run indefinitely.
`_gen/bin/modeloptionsgen` reflects over `codersdk` struct tags, but
reflection doesn't read source — Go struct tags are compile-time string
literals folded into the type descriptor and emitted into the binary's
`.rodata` section. `reflect.TypeOf(...).Field(i).Tag.Get("enum")` reads
from that baked-in table; the generator cannot consult
`codersdk/chats.go` on disk even if it wanted to.
That means the binary has to be rebuilt whenever those tags change. The
existing Makefile rule only depends on `scripts/modeloptionsgen/*.go`,
and the JSON target lists the binary as an order-only prereq, so `make
gen` happily runs the stale binary after edits to `codersdk/chats.go`
and writes outdated enum values.
Fix: add `$(wildcard codersdk/*.go)` to the binary's prereqs, matching
`clidocgen`. The whole-package wildcard is deliberate — a narrower
prereq would break if someone splits `chats.go`. Cost is negligible:
Go's per-package build cache means unrelated edits recompile one package
and re-link, and the JSON target's own prereqs are unchanged so it
doesn't regenerate.
Add a lint check that prevents introduction of Unicode emdash (U+2014)
and endash (U+2013) characters. These are almost exclusively introduced
by AI agents and conflict with the project writing style.
The lint script (scripts/check_emdash.sh) checks only added lines in
the current diff by default, so existing violations do not block CI.
Pass --all to scan the entire repo for auditing.
Agent instructions in AGENTS.md, site/AGENTS.md, and the docs style
guide now explicitly ban emdash, endash, and " -- " as punctuation,
with guidance to use commas, semicolons, or periods instead.
Replace the old `InTx` ruleguard rule in `scripts/rules.go` with a
custom in-tree `go/analysis` analyzer under `scripts/intxcheck/`. The
new analyzer catches the same direct and pass-through misuse classes as
before, plus two new classes the pattern-matcher couldn't reach:
- **Indirect same-package helper misuse** — flags `p.someHelper(ctx)`
inside `InTx` when the helper body uses the outer store (the PR #24369
bug class).
- **Nested dangerous closures** — descends into `go func() { ... }()`,
`defer func() { ... }()`, and immediately-invoked function literals.
The analyzer uses semantic `types.Object` identity instead of raw
expression string comparison, which avoids false positives from
closure-local shadowing and catches simple aliases like `outer := s.db`
and `alias := s`.
This PR also fixes three real outer-store-inside-transaction bugs the
new analyzer surfaced:
- `coderd/wsbuilder/wsbuilder.go`: `FindMatchingPresetID` and
`getWorkspaceTask` now use the inner transaction store instead of
`b.store`.
- `enterprise/dbcrypt/dbcrypt.go`: `ensureEncrypted` now calls
`s.InsertDBCryptKey` (the tx-wrapped store) instead of
`db.InsertDBCryptKey`. The `dbCrypt.InTx` method wraps the raw tx in a
new `*dbCrypt`, so `s.InsertDBCryptKey` still dispatches through the
encryption layer.
Two call sites need `// intxcheck:ignore` suppressions. Both are one-off
patterns that only look like misuse because the analyzer doesn't track
assignments — proving them safe would require full dataflow analysis,
which is well beyond what a targeted lint like this should attempt:
- `coderd/database/dbfake/dbfake.go` — `b.db` is reassigned to `tx` on
the preceding line, so `b.doInTX()` actually uses the transaction. The
analyzer sees the original `b.db` identity and flags it.
- `coderd/database/db_test.go` — test intentionally passes the outer
store to `require.Equal` to assert that nested `InTx` returns the same
handle.
Suppressions use `// intxcheck:ignore` instead of `//nolint:intxcheck`
because `intxcheck` runs as a standalone `go/analysis` tool outside
golangci-lint. golangci-lint's `nolintlint` checker flags `//nolint`
directives for linters it doesn't control, so we use a custom comment
prefix to avoid that conflict.
The `_gen/bin/clidocgen` binary only declared `scripts/clidocgen/*.go`
as prerequisites. Since it reflects over the full CLI tree (227
transitive internal packages via `enterprise/cli` → `cli/` → `codersdk/`
→ …), any change to CLI flags, SDK structs, or command definitions could
alter its output — but Make would keep serving the stale binary until it
was manually deleted (or `-B` was passed).
This caused a recurring developer-facing bug: after merging main (or
rebasing onto new CLI/SDK changes), the pre-commit hook would use the
stale binary, commit wrong docs, `make gen` would see no diff (same
stale binary), and CI would fail because it builds fresh.
Add `$(GO_SRC_FILES)` and the embedded `command.tpl` to the prerequisite
list so Make invalidates the binary whenever its inputs change. Move
`FIND_EXCLUSIONS` and `GO_SRC_FILES` above the helper-binary block so
the variable is defined before first use.
The vitest process hung after all 2132 story tests passed because
leftover refetchInterval polls kept the Node.js event loop alive.
Components that set per-query refetchInterval override the
QueryClient default, causing HTTP requests through vite's proxy
to localhost:3000 (no backend) that never resolve cleanly.
Three fixes:
- preview.tsx: disable all automatic refetching defaults and cancel
in-flight queries on story unmount via useEffect cleanup
- storybook.tsx: save/restore the original window.WebSocket in the
withWebSocket decorator, clear pending timers in close()
- vite.config.mts: add explicit testTimeout, hookTimeout, bail, and
retry settings to the storybook vitest project
Also fix 5 story files that imported from @testing-library/react
instead of storybook/test.
## Problem
`go run` caches the final linked executable in `~/.cache/go-build`.
Every
helper invocation via `go run ./scripts/<tool>` stores a copy, and
because
the cache key includes build metadata, the same tool accumulates
multiple
cached executables over time. With 12+ helper binaries invoked during
`make gen` and `make pre-commit`, this is a meaningful contributor to
GOCACHE growth.
## Fix
Replace `go run` with `go build -o _gen/bin/<tool>` for 12 repo-local
helper packages (16 Makefile callsites). Each helper is an explicit Make
file target with `$(wildcard *.go)` prerequisites, so `make -j`
serializes
builds correctly instead of racing on shared output paths.
Helpers converted: `apitypings`, `auditdocgen`, `check-scopes`,
`clidocgen`, `dbdump`, `examplegen`, `gensite`, `apikeyscopesgen`,
`metricsdocgen`, `metricsdocgen-scanner`, `modeloptionsgen`, `typegen`.
Left on `go run` (intentionally): `migrate-ci` and `migrate-test`
(CI/test-only, not on common developer paths).
`_gen/` is already in `.gitignore`. The `clean` target removes
`_gen/bin`.
## GOCACHE growth (isolated cache, single `make gen`)
| | Old (`go run`) | New (`go build -o`) |
|--|----------------|---------------------|
| Total cache size | 2.9 GB | 2.6 GB |
| Cached executables | 11 | 4 |
| Executable bytes | 401 MB | 25 MB |
The 4 remaining executables come from tools outside this change
(`dbgen` and `goimports` from `generate.sh`, plus two `main` binaries
from deferred helpers). Helper binaries now live in `_gen/bin/`
(581 MB, gitignored, cleaned by `make clean`).
## Build time benchmarks
**Source changed** (content hash invalidated, forces recompile):
| Helper | `go run` | `go build -o` + run | Overhead |
|--------|---------|---------------------|----------|
| typegen | 1.50s | 2.03s | +0.52s |
| examplegen | 1.37s | 1.67s | +0.30s |
| apikeyscopesgen | 1.21s | 1.71s | +0.50s |
| modeloptionsgen | 1.23s | 1.64s | +0.41s |
**Repeat invocation** (no source change, the common `make gen` / `make
pre-commit` path):
| Helper | `go run` (cache lookup) | Cached binary | Speedup |
|--------|------------------------|---------------|---------|
| typegen | 0.346s | 0.037s | 9.4x |
| examplegen | 0.368s | 0.037s | 9.9x |
| modeloptionsgen | 0.342s | 0.021s | 16.3x |
| apikeyscopesgen | 0.298s | 0.030s | 9.9x |
When source changes, `go build -o` is 0.3-0.5s slower per helper (it
writes a local binary instead of caching in GOCACHE). On repeat runs
(the common path), the pre-built binary is 10-16x faster because
`go run` still does a staleness check while the binary just executes.
> This PR was authored by Mux on behalf of Mike.
Replace hardcoded paths for instruction files, skills, and MCP config
with
values read from `CODER_AGENT_EXP_*` environment variables. Template
authors
configure paths via the existing `coder_agent` `env` block. The agent
resolves `~`, relative, and absolute paths locally, then serves the
resolved config over `GET /api/v0/context-config`. `chatd` fetches this
once per workspace attach and falls back to today's defaults for older
agents.
All path env vars are comma-separated, allowing multiple directories:
| Env Var | Default | Controls |
|---|---|---|
| `CODER_AGENT_EXP_INSTRUCTIONS_DIRS` | `~/.coder` | Dirs containing the
instruction file |
| `CODER_AGENT_EXP_INSTRUCTIONS_FILE` | `AGENTS.md` | Instruction file
name |
| `CODER_AGENT_EXP_SKILLS_DIRS` | `.agents/skills` | Skills directories
|
| `CODER_AGENT_EXP_SKILL_META_FILE` | `SKILL.md` | Skill metadata file
name |
| `CODER_AGENT_EXP_MCP_CONFIG_FILES` | `.mcp.json` | MCP config files |
### Example
```hcl
resource "coder_agent" "main" {
os = "linux"
arch = "amd64"
env = {
CODER_AGENT_EXP_INSTRUCTIONS_DIRS = "/opt/company/agent-config,~/.coder"
CODER_AGENT_EXP_INSTRUCTIONS_FILE = "CLAUDE.md"
CODER_AGENT_EXP_SKILLS_DIRS = "/opt/company/ai-skills,.agents/skills"
CODER_AGENT_EXP_MCP_CONFIG_FILES = "/opt/company/mcp.json,.mcp.json"
}
}
```
<details>
<summary>Implementation Details</summary>
### Architecture
Follows the same pattern as MCP tool discovery:
agent resolves locally → exposes via HTTP → chatd consumes.
**Agent-side** (`agent/agentcontextconfig/`):
- `ResolvePath` / `ResolvePaths` handle `~`, relative, and absolute path
forms; returns `""` for relative paths when baseDir is empty
- `Config` reads env vars, falls back to defaults, resolves all paths
- `GET /api/v0/context-config` serves the resolved config as JSON
**chatd-side** (`coderd/x/chatd/`):
- Calls `conn.ContextConfig()` once on first workspace attach
- Falls back to hardcoded defaults on 404 (older agents)
- Iterates instruction dirs, skills dirs using resolved absolute paths
- `LSRelativityRoot` everywhere — no more home/root juggling
### Key design decisions
- **`EXP_` prefix**: env vars use `CODER_AGENT_EXP_*` to indicate
experimental status
- **Plural names**: comma-separated vars use plural names (`DIRS`,
`FILES`); single-value vars use singular (`FILE`)
- **Defaults in `workspacesdk`**: default constants live in
`codersdk/workspacesdk/` so both agent and server reference them without
cross-layer imports
- **`skillMetaFile` persistence**: stored on context-file parts via
`ContextFileSkillMetaFile` and restored on subsequent chat turns so
custom values survive across turns
- **Working dir dedup**: `slices.Contains` guard prevents reading the
same instruction file from both `InstructionsDirs` and the working
directory
- **MCP server dedup**: first-occurrence-wins dedup prevents leaking
duplicate connections from overlapping config files
- **ResolvePath safety**: returns `""` for relative paths when `baseDir`
is empty, so `ResolvePaths` filters them out
### Files changed
| File | Change |
|---|---|
| `agent/agentcontextconfig/` | New package — path resolution + HTTP
endpoint |
| `codersdk/workspacesdk/agentconn.go` | `ContextConfigResponse` type,
default constants, client method |
| `agent/agent.go` + `agent/api.go` | Wire up endpoint, pass config to
MCP |
| `agent/x/agentmcp/manager.go` | Accept `[]string` MCP config paths,
dedup by name |
| `coderd/x/chatd/chatd.go` | Fetch config, thread through, named
returns |
| `coderd/x/chatd/instruction.go` | Accept configurable dir + file name,
`skillMetaFileFromParts` |
| `coderd/x/chatd/chattool/skill.go` | Accept configurable dirs + meta
file |
| `codersdk/chats.go` | `ContextFileSkillMetaFile` field for persistence
|
### Test coverage
- `TestConfig` (4 cases): defaults, custom env vars, whitespace
trimming, comma-separated dirs
- `TestResolvePath` / `TestResolvePaths`: including empty baseDir edge
case
- `TestPersistInstructionFilesFallbackOnOlderAgent`: backward-compat
path when `ContextConfig` returns 404
- `TestChatMessagePartVariantTags`: updated exclusion list for new
internal field
### Backward compatibility
Older agents return 404 for the new endpoint. `chatd` catches this and
falls back to today's defaults via `readHomeInstructionFile` (using
`LSRelativityHome`). Existing workspaces work with no changes.
</details>
The terraform testdata fixtures silently drift when the coder provider
releases a new version. The .terraform.lock.hcl files are gitignored,
.tf files use loose constraints (>= 2.0.0), and generate.sh always
runs terraform init -upgrade. The Makefile only re-runs generate.sh
when the terraform CLI version changes, not the provider version.
Track a canonical lockfile and provider-version.txt in git. Change
generate.sh to respect the lockfile by default (terraform init without
-upgrade). Add --upgrade flag for intentional provider bumps, --check
for cheap staleness detection in the Makefile, and a new
update-terraform-testdata make target.
All map iterations in ConvertState now use sorted helpers instead of
ranging over Go maps directly. Previously only coder_env and
coder_script were sorted (via sortedResourcesByType). This extends
the pattern to coder_agent, coder_devcontainer, coder_agent_instance,
coder_app, coder_metadata, coder_external_auth, and the main
resource output list.
Also fixes generate.sh writing version.txt to the wrong directory
(resources/ instead of testdata/), which caused the Makefile version
check to silently desync and trigger unnecessary regeneration.
Adds TestConvertStateDeterministic that calls ConvertState 10 times
per fixture and asserts byte-identical JSON output without any
post-hoc sorting.
The test-storybook target uses @vitest/browser-playwright with
Chromium but never installs the browser binaries. pnpm install
only fetches the npm package; the actual browser must be
downloaded separately via playwright install. This mirrors what
test-e2e already does.
Pre-commit classifies staged files and runs make pre-commit-light
when no Go, TypeScript, or Makefile changes are present. This
skips gen, lint/go, lint/ts, fmt/go, fmt/ts, and the binary
build. A markdown-only commit takes seconds instead of minutes.
Pre-push uses the same heuristic: if only light files changed
(docs, shell, terraform, etc.), tests are skipped entirely.
Falls back to the full make targets when Go/TS/Makefile changes
are detected, CODER_HOOK_RUN_ALL=1 is set, or the diff range
can't be determined.
Also adds test-storybook to make pre-push (vitest with the
storybook project in Playwright browser mode).
- Add `--starter-template` option and properly create starter template
with name and icon
- Add Coder Desktop URLs to listening banner
- Makefile tweak to avoid rebuilding `scripts/develop` every time Go
code changes
Replace the ~370-line bash develop.sh with a Go program using
serpent for CLI flags, errgroup for process lifecycle, and
codersdk for setup. develop.sh becomes a thin make + exec wrapper.
- Process groups for clean shutdown of child trees
- Docker template auto-creation via SDK ExampleID
- Idempotent setup (users, orgs, templates)
- Configurable --port, --web-port, --proxy-port
- Preflight runs lib.sh dependency checks
- TCP dial for port-busy checks
- Make target (build/.bin/develop) for build caching
The pre-push hook was removed in #22956. This restores it with a
reduced scope (tests + site build) and an allowlist so it only runs
for developers who opt in.
Two opt-in mechanisms:
- git config coder.pre-push true (local, not committed)
- CODER_WORKSPACE_OWNER_NAME allowlist in the hook script
git config takes priority and also supports explicit opt-out for
allowlisted users (git config coder.pre-push false).
Refs #22956
---------
Co-authored-by: Cian Johnston <cian@coder.com>
pre-commit was noisy: every sub-target dumped full stdout/stderr to the
terminal, burying failures in pages of compiler output and lint details.
Teach timed-shell.sh a quiet mode via MAKE_LOGDIR: when set, recipe
output is redirected to per-target log files and a one-line status is
printed instead. When unset, behavior is unchanged (with a refreshed
output format).
Makefile changes:
- pre-commit creates a tmpdir, passes MAKE_LOGDIR to sub-makes
- Drop --output-sync=target (log files eliminate interleaving)
- Add --no-print-directory to suppress Entering/Leaving noise
- Split check-unstaged and check-untracked into separate defines
- Restyle both with colored indicators and clearer instructions
- Clean up tmpdir on success, preserve on failure for debugging
## Summary
- remove the `pre-push` git hook script from the repository
- remove the `make pre-push` target and related Makefile documentation
- update contributor and agent docs so they only describe the remaining
`pre-commit` hook
## Validation
- `make pre-commit`
- `git diff --check`
---
_Generated with [`mux`](https://github.com/coder/mux) • Model:
`openai:gpt-5.4` • Thinking: `high`_
## Problem
The `build` job on `main` takes ~7m28s for the Build step alone (~13m
total). Analysis of 10 recent CI runs on `main` shows the zstd
compression of the slim binary archive is the second largest bottleneck:
| Phase | Avg Duration | % of Build Step |
|-------|-------------|----------------|
| Fat Go builds (7 binaries w/ embed) | ~205s | 45.8% |
| **zstd compression (`-22 --ultra`)** | **~123s** | **27.4%** |
| Parallel block (vite + slim Go builds) | ~65s | 14.5% |
| Packaging + signing | ~55s | 12.3% |
The `zstd -22 --ultra` setting compresses a ~350 MB tar to ~71 MB, but
it is **single-threaded** and takes ~102s on 8-core CI runners. Adding
`-T8` does not help at level 22 — it remains CPU-bound on a single
thread.
## Solution
Use `zstd -6 -T0` (multithreaded, auto-detect cores) for non-release CI
builds. Release builds (`CODER_RELEASE=true`) continue using `-22
--ultra`.
### Benchmarks (349 MB slim binary tar, 8 cores)
| Setting | Wall Time | Output Size | Use Case |
|---------|----------|------------|----------|
| `-22 --ultra` | **102.4s** | 71 MB | Release builds |
| `-6 -T0` | **0.8s** | 94 MB | CI builds (new) |
| `-6` | 2.4s | 94 MB | Local dev (unchanged) |
The 23 MB size increase is negligible for the main branch preview images
(`ghcr.io/coder/coder-preview:main`). The archive is embedded in fat
binaries and extracted once by the agent at startup — decompression time
is identical regardless of compression ratio.
### Expected impact
~120s savings on the Build step, bringing it from ~7m28s to ~5m30s.
## Verification
All three code paths confirmed:
- `CODER_RELEASE=true CI=true` → `-22 --ultra` ✅
- `CI=true` (no `CODER_RELEASE`) → `-6 -T0` ✅
- Local (no `CI`) → `-6` ✅
- `CODER_RELEASE=false CI=true` (dry run) → `-6 -T0` ✅
The `lint/go` recipe used `$(shell)` inside a recipe to extract the
golangci-lint version. When `MAKE_TIMED=1` (set by pre-commit/pre-push),
make expands `.SHELLFLAGS = $@ -ceu` for `$(shell)` calls, passing the
target name as the first argument to `timed-shell.sh`. Since the target
name doesn't start with `-`, the timing code path runs and its banner
output contaminates the captured value, causing intermittent failures:
```
bash: line 3: lint/go: No such file or directory
```
Replace with bash command substitution (`$$()`), which is the correct
approach under `.ONESHELL` and avoids the `SHELL`/`.SHELLFLAGS`
interaction entirely. Also replaces deprecated `egrep` with `grep -oE`.
pre-commit and pre-push only reported total elapsed time at the end,
making it hard to identify which jobs are slow.
Add a `MAKE_TIMED=1` mode that replaces `SHELL` with a wrapper
(`scripts/lib/timed-shell.sh`) to print wall-clock time for each
recipe. pre-commit and pre-push enable this on their sub-makes.
Ad-hoc use: `make MAKE_TIMED=1 test`
- Fix dead docker pull retry loop (Make ate bash expansions)
- Make test-postgres-docker idempotent so Phase 2 stops restarting it
mid-test
- Run migrate-ci at recipe time, not parse time
- Install Playwright browsers before e2e tests
- Set test timeout to 20m, 5m shy of CI's 25m job limit
- Cap parallelism at nproc/4 via PARALLEL_JOBS
- Add phase banners and elapsed time
The `test-postgres` Makefile rule was redundant — CI never used it (it
runs `test-postgres-docker` + `make test` via the `test-go-pg` action),
and `make test` auto-starts a Postgres Docker container when needed via
`dbtestutil`.
- Remove the `test-postgres` rule from Makefile
- Update `pre-push` to run `test-postgres-docker` in the first phase
(alongside gen/fmt) and `make test` in the second phase
- Fix stale comments in CI workflows referencing `make test-postgres`
- Remove redundant "Test Postgres" entries from docs since `make test`
handles Postgres automatically
Follow-up to #22705 (pre-commit/pre-push hooks).
Unifies `test` and `test-race` into the same structure and lets CI call
`make test-race` instead of reproducing the gotestsum command.
**Parallelism**: Extracted from `GOTEST_FLAGS` into
`TEST_PARALLEL_PACKAGES`
/ `TEST_PARALLEL_TESTS` (default 8x8). `test-race` overrides to 4x4 via
target-specific Make variables. `TEST_NUM_PARALLEL_PACKAGES` and
`TEST_NUM_PARALLEL_TESTS` env vars continue to work for both targets.
**GOTEST_FLAGS**: Changed from simply-expanded (`:=`) to
recursively-expanded
(`=`) so target-specific overrides take effect at recipe time.
**CI**: `.github/actions/test-go-pg/action.yaml` now calls `make
test-race`
/ `make test` instead of hand-rolling the gotestsum command, eliminating
drift between local and CI configurations.
Refs #22705
This change adds git hooks and Makefile targets that mirror CI required
checks locally, catching issues before they reach CI.
This is for use by AI agents (documented in AGENTS.md).
- **pre-commit** (every commit): gen, fmt, lint, typos, slim binary
build. Fast checks without Docker or Playwright.
- **pre-push** (before push): full CI suite including site build, tests,
sqlc-vet, offlinedocs.
To use:
```sh
git config core.hooksPath scripts/githooks
```
Works in worktrees (where `.git` is a file). Bypass with `--no-verify`.
## Problem
Bootstrap scripts under `provisionersdk/scripts/` are inlined into
templates via `sh -c '${init_script}'`. Any single quote (apostrophe) in
these `.sh` files silently breaks the shell quoting, causing the agent
to never start — with near-invisible error output.
## Changes
- **`scripts/check_bootstrap_quotes.sh`** — new lint script that scans
all `.sh` files under `provisionersdk/scripts/` for single quotes and
fails with a clear error if any are found. Only checks shell scripts
(not `.ps1`, which legitimately uses single quotes).
- **`Makefile`** — added `lint/bootstrap` target wired into the `lint`
dependency list.
Fixes#22062
Follow-up to #22612. Running `git status --short` in a loop during `make
-B -j gen` still showed intermediate states for several files. This PR
fixes the remaining ones.
The main issues:
- `generate.sh` ran `gofmt` and `goimports` in-place after moving files
into the source tree. Now it formats in a workdir first and only `mv`s
the final result.
- `protoc` targets wrote directly to the source tree. Wrapped with
`scripts/atomic_protoc.sh` which redirects output to a tmpdir.
- Several generators used hardcoded `/tmp/` paths. On systems where
`/tmp` is tmpfs, `mv` degrades to copy+delete. Switched to a
project-local `_gen/` directory (gitignored, same filesystem).
- `apidoc/.gen` and `cli/index.md` used `cp` for final output. Replaced
with `mv`.
- `manifest.json` was written twice (unformatted, then formatted). Now
`.gen` writes to a staging file and the manifest target does one
formatted atomic write.
- `biome_format.sh` silently skipped files in gitignored dirs. Added
`--vcs-enabled=false`.
Two helpers reduce the Makefile boilerplate: `scripts/atomic_protoc.sh`
(wraps protoc) and an `atomic_write` Make define
(stdout-to-temp-to-target pattern). `.PRECIOUS` now also covers `.pb.go`
and mock files.
Verification: `make -B -j gen` x3 with `git status` polling, no changes.
Refs #22612
`make gen` could not run with `-j` because inter-target dependency edges
were missing. Multiple recipes compile `coderd/rbac` (which includes
generated files like `object_gen.go`), and without explicit ordering,
parallel runs produced syntax errors from mid-write reads.
Three main changes:
**Dependency graph fixes** declare the compile-time chain through
`coderd/rbac` so that `object_gen.go` is written before anything that
imports it is compiled. The DB generation targets use a GNU Make 4.3+
grouped target (`&:`) so Make knows `generate.sh` co-produces
`querier.go`, `unique_constraint.go`, `dbmetrics`, and `dbauthz` in a
single invocation. `SKIP_DUMP_SQL=1` avoids re-entrant `make` inside
`generate.sh` when the Makefile already guarantees `dump.sql` is fresh.
**`scripts/atomicwrite` package** replaces `os.WriteFile` in all gen
scripts with a temp-file-in-same-dir + rename pattern, preventing
interrupted runs from leaving partial files.
**`.PRECIOUS` and shell atomic writes** protect git-tracked generated
files from Make's default delete-on-error behavior. Since these files
are committed, deletion is worse than staleness -- `git restore` is the
recovery path.
CI now runs `make -j --output-sync -B gen` (~32s, down from ~85s
serial).
| Scenario | Before | After |
|-----------------------------------|--------------------|----------|
| `make gen` (serial) | 95s | 95s |
| `make -j gen` (parallel) | race error | **22s** |
| CI `make -j --output-sync -B gen` | forced serial ~85s | **~32s** |
relates to #21335
Adds UpdateAppStatus on the agentsocket, wired up to forward to Coderd over the dRPC connection the agent maintains.
Disclosure: I used AI to generate significant portions of this PR, but hand-reviewed and tweaked the code. I consider it approximately indistinguishable from what I would have done by hand.
## Summary
The macOS `.dylib` is only used by Coder Desktop macOS v0.7.2 or older.
v0.7.2 was released in August 2025. v0.8.0 of Coder Desktop macOS, also
released in August 2025, uses a signed Coder slim binary from the
deployment instead.
It's unlikely customers will be using Coder Desktop macOS v0.7.2 and the
next release of Coder simultaneously, so I think we can safely remove
this process, given it slows down CI & release processes.
## Changes
- **Makefile**: Remove `DYLIB_ARCHES`, `CODER_DYLIBS` variables and
`build/coder-dylib` target
- **scripts/build_go.sh**: Remove `--dylib` flag and all dylib-specific
logic (c-shared buildmode, CGO, plist embedding, vpn/dylib entrypoint)
- **scripts/sign_darwin.sh**: Remove dylib-specific comment
- **CI (ci.yaml)**: Remove `build-dylib` job, artifact download/insert
steps, and `build-dylib` dependency from `build` job
- **Release (release.yaml)**: Remove `build-dylib` job, artifact
download/insert steps, and `build-dylib` dependency from `release` job
- **vpn/dylib/**: Delete entire directory (`lib.go` + `info.plist.tmpl`)
- **vpn/router.go, vpn/dns.go**: Clean up comments referencing dylib
The slim and fat binary builds are completely unaffected — the dylib was
an independent build target with its own CI job.
_Generated by mux but reviewed by a human_
Extend the wire protocol for the boundary <-> agent unix socket with
a message envelope.
The envelope creates a boundary <-> agent data path that is separate
from the agent <-> coderd path. This lets boundary send operational
metadata (drop counts, configuration like jail type, capabilities)
that the agent can act on locally (e.g. Prometheus metrics) or use
to enrich outbound requests, without polluting the coderd-facing proto
with fields coderd never consumes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Description
This PR wires up the metrics scanner in the Makefile to automatically regenerate metrics documentation when source files change.
## Changes
* Add Makefile target `scripts/metricsdocgen/generated_metrics` to run the AST scanner to generate the metrics file
* Update `docs/admin/integrations/prometheus.md` Makefile target to depend on `scripts/metricsdocgen/generated_metrics`
* Add `scripts/metricsdocgen/README.md` documenting the metrics generation process
Closes: https://github.com/coder/coder/issues/13223
Adds a Go wrapper (`scripts/apidocgen/swaginit/main.go`) that calls
swag's Go API with `Strict: true`. The `--strict` flag isn't available
in swag's CLI in any version, so the wrapper is the only way to enable
it.
Also upgrades swag from v1.16.2 to v1.16.6 (better generics support,
precise numeric formats, `x-enum-descriptions`, CVE-2024-45338 fix).
## Summary
The `lint/actions/zizmor` target flakes in CI due to network
connectivity issues when running on depot runners
(https://github.com/coder/internal/issues/1233). The zizmor tool needs
to reach GitHub's API but intermittently fails with "Connection refused"
errors.
## Changes
- Creates a new `lint-actions` CI job that only runs when `.github/**`
files are touched (using existing `ci` filter)
- Removes zizmor from the main `lint` job
- Uses a Makefile conditional to include actionlint in `make lint`
locally but skip it in CI (where `lint-actions` handles it)
This reduces unnecessary flake exposure for PRs that don't modify GitHub
Actions files.
## Testing
- `actionlint` passes on the modified ci.yaml
- Verified Makefile conditional works: actionlint included locally,
skipped when `CI=true`
Fixes https://github.com/coder/internal/issues/1233
## Problem
Migration 000401 introduced a hardcoded `public.` schema qualifier which
broke deployments using non-public schemas (see #21493). We need to
prevent this from happening again.
## Solution
Adds a new `lint/migrations` Make target that validates database
migrations do not hardcode the `public` schema qualifier. Migrations
should rely on `search_path` instead to support deployments using
non-public schemas.
## Changes
- Added `scripts/check_migrations_schema.sh` - a linter script that
checks for `public.` references in migration files (excluding test
fixtures)
- Added `lint/migrations` target to the Makefile
- Added `lint/migrations` to the main `lint` target so it runs in CI
## Testing
- Verified the linter **fails** on current `main` (which has the
hardcoded `public.` in migration 000401)
- Verified the linter **passes** after applying the fix from #21493
```bash
# On main (fails)
$ make lint/migrations
ERROR: Migrations must not hardcode the 'public' schema. Use unqualified table names instead.
# After fix (passes)
$ make lint/migrations
Migration schema references OK
```
## Depends on
- #21493 must be merged first (or this PR will fail CI until it is)
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Danny Kopping <danny@coder.com>
Usage example:
```bash
$ make test TEST_CPUPROFILE=cpu.prof TEST_MEMPROFILE=mem.prof TEST_PACKAGES=./coderd
```
Note that `TEST_PACKAGES` has to be specified, otherwise you get `cannot
use -{cpu,memory}profile flag with multiple packages`.
Signed-off-by: Danny Kopping <danny@coder.com>
Go 1.24 adds [tool
dependencies](https://go.dev/doc/modules/managing-dependencies#tools).
This allows us to track versions of tools in our `go.mod` instead of
sprinkling various `go run` commands throughout our codebase.
NOTE: there are still various hard-coded `go install` commands in our
dogfood Dockerfile. As that list is likely severely outdated, will leave
that for a separate PR.
Modifies `make fmt/go` to also use https://github.com/daixiang0/gci to format our imports to a standard format.
It introduces a new shell script to do the formatting so that our formatting tools are in one place.
relates to: https://github.com/coder/internal/issues/1094
This is number 2 of 5 pull requests in an effort to add agent script
ordering. It adds a drpc API that is exposed via a local socket. This
API serves access to a lightweight DAG based dependency manager that was
inspired by systemd.
In follow-up PRs:
* This unit manager will be plumbed into the workspace agent struct.
* CLI commands will use this agentsocket api to express dependencies
between coder scripts
I used an LLM to produce some of these changes, but I have conducted
thorough self review and consider this contribution to be ready for an
external reviewer.
Adds columns to track package and test name to test_databases table, and populates them as databases are created using the Broker.
In order to seamlessly work with existing `coder_database` databases with the old schema, the SQL that creates the table and columns is additive and idempotent, so we run it every time we initialize the Broker (once per test binary execution).
We include a transaction level advisorly lock to prevent deadlocks before attempting to alter the schema. I was seeing deadlocks without this.
Relates to https://github.com/coder/internal/issues/1093
This is the first of N pull requests to allow coder script ordering.
It introduces what is for now dead code, but paves the way for various
interfaces that allow coder scripts and other processes to depend on one
another via CLI commands and terraform configurations.
The next step is to add reactivity to the graph, such that changes in
the status of one vertex will propagate and allow other vertices to
change their own statuses.
Concurrency and stress testing yield the following:
CPU Profile:
<img width="1512" height="862" alt="Screenshot 2025-10-17 at 10 38 52"
src="https://github.com/user-attachments/assets/f46cf1a2-a0b2-4c02-81a0-069798108ee5"
/>
Mem Profile:
<img width="1512" height="862" alt="Screenshot 2025-10-17 at 10 38 01"
src="https://github.com/user-attachments/assets/45be1235-fff6-45ba-a50d-db9880377bd0"
/>
Predictably, lock contention and memory allocation are the largest
components of this system under stress. Nothing seems untoward.
fixes https://github.com/coder/internal/issues/1026
Thru a (perhaps too-) clever hack of `init()` functions, I've managed to remove the need to separately compile the cleaner binary. This should fix the flakes we are seeing were the binary compilation takes 10s of seconds on macOS. The cleaner is encorporated directly into the test binary and we self-exec as the subprocess.
We're seeing some timeouts from starting the db cleaner, e.g. https://github.com/coder/internal/issues/1026
My suspicion is that in CI the go build cache might not be warm, and so it can take a while to compile and run the dbcleaner subprocess.
This fix builds the cleaner once prior to starting a test run via `make test` to ensure we have a warm cache.
# Canonicalize API Key Scopes
This PR introduces canonical API key scopes with a `coder:` namespace prefix to avoid collisions with low-level resource:action names. It:
1. Renames special API key scopes in the database:
- `all` → `coder:all`
- `application_connect` → `coder:application_connect`
2. Adds support for a new `scopes` field in the API key creation request, allowing multiple scopes to be specified while maintaining backward compatibility with the singular `scope` field.
3. Updates the API documentation to reflect these changes, including the new endpoint for listing public API key scopes.
4. Ensures backward compatibility by mapping between legacy and canonical scope names in relevant code paths.