Adds production-observability metrics to coderd/x/chatd/ for
model-level correlation and a chatStreams memory-leak investigation.
- Label per-request chatd metrics (steps_total, message_count,
prompt_size_bytes, tool_result_size_bytes, ttft_seconds,
compaction_total) with `model` and enrich the per-turn logger
with provider/model.
- Add `coderd_chatd_stream_retries_total{provider, model, kind}`
counter incremented in chatloop before OnRetry.
- Register a prometheus.Collector exposing `streams_active`,
`stream_buffer_size_max`, `stream_buffer_events`,
`stream_subscribers` from p.chatStreams.
- Add `coderd_chatd_stream_buffer_dropped_total` counter,
incremented per publishToStream drop independently of the
existing log-rate-limited bufferDropCount.
- Snapshot logger/model before the title-generation goroutine to
avoid a data race with the logger/model rebind below it.
> 🤖
- Adds `--prometheus-port` flag to develop.sh (default: 2114). Passes `--prometheus-enable=true` and `--prometheus-address=0.0.0.0:2114` to dev coderd.
- Show Prometheus URL in dev banner + group by service
> 🤖
Replace the old `InTx` ruleguard rule in `scripts/rules.go` with a
custom in-tree `go/analysis` analyzer under `scripts/intxcheck/`. The
new analyzer catches the same direct and pass-through misuse classes as
before, plus two new classes the pattern-matcher couldn't reach:
- **Indirect same-package helper misuse** — flags `p.someHelper(ctx)`
inside `InTx` when the helper body uses the outer store (the PR #24369
bug class).
- **Nested dangerous closures** — descends into `go func() { ... }()`,
`defer func() { ... }()`, and immediately-invoked function literals.
The analyzer uses semantic `types.Object` identity instead of raw
expression string comparison, which avoids false positives from
closure-local shadowing and catches simple aliases like `outer := s.db`
and `alias := s`.
This PR also fixes three real outer-store-inside-transaction bugs the
new analyzer surfaced:
- `coderd/wsbuilder/wsbuilder.go`: `FindMatchingPresetID` and
`getWorkspaceTask` now use the inner transaction store instead of
`b.store`.
- `enterprise/dbcrypt/dbcrypt.go`: `ensureEncrypted` now calls
`s.InsertDBCryptKey` (the tx-wrapped store) instead of
`db.InsertDBCryptKey`. The `dbCrypt.InTx` method wraps the raw tx in a
new `*dbCrypt`, so `s.InsertDBCryptKey` still dispatches through the
encryption layer.
Two call sites need `// intxcheck:ignore` suppressions. Both are one-off
patterns that only look like misuse because the analyzer doesn't track
assignments — proving them safe would require full dataflow analysis,
which is well beyond what a targeted lint like this should attempt:
- `coderd/database/dbfake/dbfake.go` — `b.db` is reassigned to `tx` on
the preceding line, so `b.doInTX()` actually uses the transaction. The
analyzer sees the original `b.db` identity and flags it.
- `coderd/database/db_test.go` — test intentionally passes the outer
store to `require.Equal` to assert that nested `InTx` returns the same
handle.
Suppressions use `// intxcheck:ignore` instead of `//nolint:intxcheck`
because `intxcheck` runs as a standalone `go/analysis` tool outside
golangci-lint. golangci-lint's `nolintlint` checker flags `//nolint`
directives for linters it doesn't control, so we use a custom comment
prefix to avoid that conflict.
_Disclaimer: produced by Claude Opus 4.6_
Adds a `coder_build_info` metric which allows operators to see which
versions of Coder are currently running.
---------
Signed-off-by: Danny Kopping <danny@coder.com>
## Summary
Add `coderd_agents_first_connection_seconds` histogram metric that
records the
duration from workspace agent creation to first connection. This fills
an
observability gap — provisioner job timings and startup script metrics
exist,
but the agent connection phase (which can take several minutes) was not
exposed
to Prometheus.
Closes https://github.com/coder/coder/issues/21282
## Changes
- **`coderd/prometheusmetrics/prometheusmetrics.go`** — Define and
register a
`HistogramVec` in the existing `Agents()` polling loop. Observe
`first_connected_at - created_at` exactly once per agent via a
deduplication
map, pruned each tick to prevent unbounded memory growth.
- **`coderd/prometheusmetrics/prometheusmetrics_test.go`** — Update
`TestAgents`
to set `first_connected_at` on the test agent and assert the histogram
is
collected with correct labels, sample count, and sample sum.
- **`docs/admin/integrations/prometheus.md`**,
**`scripts/metricsdocgen/generated_metrics`** —
Auto-generated documentation updates from `make gen`.
## Metric details
| Property | Value |
|---|---|
| Name | `coderd_agents_first_connection_seconds` |
| Type | histogram |
| Labels | `template_name`, `agent_name`, `username`, `workspace_name` |
| Buckets | 1s, 10s, 30s, 1m, 2m, 5m, 10m, 30m, 1h |
## Example PromQL
```promql
# P95 agent connection time by template
histogram_quantile(0.95,
sum(rate(coderd_agents_first_connection_seconds_bucket[1h])) by (le, template_name)
)
```
<details>
<summary>Implementation notes</summary>
### Design decisions
- **Histogram over gauge**: Enables `histogram_quantile()` for
percentile queries.
- **Observe in `Agents()` polling loop**: All required data is already
fetched by
`GetWorkspaceAgentsForMetrics()` — no new DB queries.
- **Dedup via `map[uuid.UUID]struct{}`**: Prevents re-observing the same
agent
across polling ticks. Pruned each cycle to bound memory.
- **Buckets**: Aligned with
`coderd_provisionerd_workspace_build_timings_seconds`
range (1s–1h).
### Overhead at scale (100k active workspaces)
The deduplication map (`observedFirstConnection`) and per-tick pruning
map
(`currentAgentIDs`) are both `map[[16]byte]struct{}`. At 100k agents:
- **Memory**: ~2.25 MB persistent + ~2.25 MB transient per tick = **~4.5
MB peak**.
- **CPU**: ~25 ms of map operations per tick (one tick per minute) =
**<0.05% of one core**.
Both are negligible relative to the existing cost of the `Agents()` loop
(the DB
query, per-agent `GetWorkspaceAppsByAgentID` calls, and coordinator node
lookups
dominate).
</details>
> 🤖 Generated by Coder Agents
## Problem
In linked worktrees, Git hooks inherit multiple repo-local environment
variables: `GIT_DIR`, `GIT_COMMON_DIR`, `GIT_INDEX_FILE`, and others.
The
pre-commit and pre-push hooks only unset `GIT_DIR`, leaving the rest in
place.
When `make pre-commit` runs `go build`, Go tries to stamp VCS info by
shelling
out to `git`. With the leftover partial Git environment, `git` exits 128
and
the build fails:
```
error obtaining VCS status: exit status 128
Use -buildvcs=false to disable VCS stamping.
```
This only happens inside hooks in a linked worktree — running `make
pre-commit`
directly from the terminal works fine because the repo-local vars are
not set.
## Fix
Replace the bare `unset GIT_DIR` in both hooks with a loop that clears
every
variable reported by `git rev-parse --local-env-vars`:
```sh
while IFS= read -r var; do
unset "$var"
done < <(git rev-parse --local-env-vars)
```
This covers all 15 repo-local variables Git may inject (`GIT_DIR`,
`GIT_COMMON_DIR`, `GIT_INDEX_FILE`, `GIT_OBJECT_DIRECTORY`, etc.) and is
forward-compatible — if Git adds new local vars in the future, the loop
picks
them up automatically.
Adds a GitHub Actions workflow that cherry-picks merged PRs to the
latest release branch when the `cherry-pick` label is applied.
## How it works
1. Add the `cherry-pick` label to any PR targeting `main` (before or
after merge).
2. On merge (or on label if already merged), the workflow detects the
latest `release/*` branch.
3. It cherry-picks the merge commit (`-x -m1`) and opens a PR.
This complements the `backport` label (see #24025) which targets the
latest **3** release branches. `cherry-pick` targets only the **latest**
one — useful for getting fixes into the current release.
Created PRs follow existing repo conventions:
- **Branch:** `backport/<pr>-to-<version>`
- **Title:** `<original PR title> (#<pr>)` — e.g. `fix(site): correct
button alignment (#12345)`
- **Body:** links back to the original PR and merge commit
If the cherry-pick encounters conflicts, the workflow aborts the
cherry-pick, creates an empty commit with resolution instructions, and
opens the PR with a `[CONFLICT]` prefix so the author can resolve
manually.
Also:
- Removes `scripts/backport-pr.sh` (replaced by this workflow)
- Removes `.github/cherry-pick-bot.yml` (old bot config)
- Adds a section to the contributing docs explaining the `cherry-pick`
label
> [!NOTE]
> Generated with [Coder Agents](https://coder.com/agents)
RC tags are now created directly on `main`. The `release/X.Y` branch is
only cut when the actual release is ready. This eliminates the need to
cherry-pick hundreds of commits from main onto the release branch
between the first RC and the release.
## Workflow
```
main: ──●──●──●──●──●──●──●──●──●──
↑ ↑ ↑
rc.0 rc.1 cut release/2.34, tag v2.34.0
\
release/2.34: ──●── v2.34.1 (patch)
```
1. **RC:** On `main`, run `./scripts/release.sh`. The tool detects main
(or a detached HEAD reachable from main), prompts for the commit SHA to
tag, suggests the next RC version, and tags it.
2. **Release:** When the RC is blessed, create `release/X.Y` from `main`
(or the specific RC commit). Switch to that branch and run
`./scripts/release.sh`, which suggests `vX.Y.0`.
3. **Patch:** Cherry-pick fixes onto `release/X.Y` and run
`./scripts/release.sh` from that branch.
## Changes
### `scripts/releaser/release.go`
- Two modes based on branch:
- **`main` (or detached HEAD from main)** — RC tagging. Prompts for the
commit SHA to tag (defaults to HEAD). Always checks out the target
commit so the flow operates in detached HEAD. Suggests the next RC based
on existing RC tags.
- **`release/X.Y`** — Release/patch mode. Suggests `vX.Y.0` if the
latest tag is an RC, or the next patch otherwise.
- Detached HEAD support: if `git branch --show-current` is empty, checks
whether HEAD is an ancestor of `origin/main` and enters RC mode
automatically.
- Commit selection prompt in RC mode: shows current commit, lets the
user confirm or provide a different SHA.
- Warns if you try to tag a non-RC on main, or an RC on a release
branch.
- Skips open-PR check and branch sync check in RC mode (not useful on
main).
### `scripts/releaser/main.go`
- Updated help text.
### `.github/workflows/release.yaml`
- RC tags (`*-rc.*`): skip the release-branch validation (they live on
main).
- Non-RC tags: still require the corresponding `release/X.Y` branch.
### `docs/about/contributing/CONTRIBUTING.md`
- Rewrote the Releases section with the new workflow, release types
table, and ASCII diagram.
- Replaced the old "Creating a release" / "Creating a release (via
workflow dispatch)" subsections.
<details><summary>Decision log</summary>
### Why this approach?
Previously, cutting a release branch early for an RC meant
cherry-picking all of main's progress onto that branch before the actual
release — often hundreds of commits. This approach avoids that entirely:
RCs are just tagged snapshots of main, and the release branch only
exists once you need it for stabilization and backports.
### Files NOT changed
- **`scripts/release/publish.sh`** — `--rc` flag controls GitHub
prerelease marking (tag-level, not branch-level). `target_commitish`
already defaults to `main` when the tag isn't on a release branch.
- **`scripts/release/tag_version.sh`** — No RC-specific branch logic.
- **`scripts/releaser/version.go`** — Version parsing/comparison
unchanged.
- **`docs/install/releases/index.md`** — Public-facing docs describe RC
as a release channel with no branch-level detail.
</details>
> Generated by Coder Agents
Two fixes for the release script:
**1. Branch regex cleanup** — Simplified to only match `release/X.Y`.
Removed
support for `release/X.Y.Z` and `release/X.Y-rc.N` branch formats. RCs
are
now tagged from main (not from release branches), and the three-segment
`release/X.Y.Z` format will not be used going forward.
**2. Changelog range for first release on a new minor** — When no tags
match
the branch's major.minor, the commit range fell back to `HEAD` (entire
git
history, ~13k lines of changelog). Now computes `git merge-base` with
the
previous minor's release branch (e.g. `origin/release/2.32`) as the
changelog
starting point. This works even when that branch has no tags pushed yet.
Falls
back to the latest reachable tag from a previous minor if the branch
doesn't
exist.
Polishes the AI model configuration form (add/edit model) with tighter
layout and better input affordances.
**Frontend changes:**
- Replace "Unset" with "Default" in select dropdowns to communicate
system fallback
- Show pricing fields inline instead of behind a collapsible toggle
- Use flat section dividers (`border-t`) instead of bordered fieldsets
- Move field descriptions into info-icon tooltips to fix input
misalignment
- Add InputGroup adornments: `$` prefix + `/1M` suffix on pricing,
`tokens` suffix on token fields, `%` suffix on compression threshold,
range placeholders on temperature/penalty fields
- Shorter pricing labels (Input, Output, Cache Read, Cache Write)
- Compact JSON textareas (1-row height, resizable)
- Smart grid layouts by field type (3-col provider, 4-col pricing, 3-col
advanced)
- Boolean fields render as a segmented control (Default · On · Off)
instead of a dropdown
**Backend changes:**
- Add `enum` tags to OpenAI `service_tier`
(`auto,default,flex,scale,priority`) and `reasoning_summary`
(`auto,concise,detailed`) so they render as select dropdowns instead of
free-text inputs
> 🤖 Generated by Coder Agents
Add `-x` to backport script `git cherry-pick` command to include a
commit message reference to the original commit. This makes it easier to
trace where a cherry picked commit actually came from.
_Disclaimer: created using Claude Opus 4.6._
```
# Examples:
# ./scripts/backport-pr.sh 2.30 23969
# ./scripts/backport-pr.sh --dry-run 2.30 23969
```
Here's one I created: https://github.com/coder/coder/pull/23972
Signed-off-by: Danny Kopping <danny@coder.com>
This adds full RC release support to the release scripts and GitHub
Actions workflow. Previously, the tooling only supported stable and
mainline releases with strict vMAJOR.MINOR.PATCH semver tags.
Changes:
- scripts/releaser/version.go: Add Pre field to version struct for
prerelease suffixes (e.g. "rc.0"), update regex, parsing, String(),
comparison methods, and add IsRC()/rcNumber() helpers.
- scripts/releaser/release.go: Detect RC branches (release/X.Y-rc.N),
suggest RC version numbers, auto-set "rc" channel (skipping
stable/mainline prompt), add RC advisory to release notes, skip docs
update for RC releases.
- .github/workflows/release.yaml: Add "rc" channel option, fix branch
derivation for RC tags (v2.32.0-rc.0 -> release/2.32-rc.0 instead of
broken release/2.32.0-rc), skip homebrew/winget/package publishing for
RC releases.
- scripts/release/publish.sh: Add --rc flag, pass --prerelease to gh
release create for RC releases.
- scripts/releaser/version_test.go: Add comprehensive unit tests for
version parsing, string formatting, IsRC, rcNumber, GreaterThan, and
Equal with RC versions.
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
Adds the experimental `docker-chat-sandbox` example template under
`examples/templates/x/`. It provisions a regular dev agent plus a
chat-designated agent that runs inside bubblewrap with a read-only root,
writable `/home/coder`, and outbound TCP restricted to the Coder
control-plane endpoint via `iptables`.
The chat agent still appears in dashboard and API responses, but the
template reserves it for chatd-managed sessions rather than normal user
interaction. `lint/examples` now walks nested template directories, so
experimental templates can live under `examples/templates/x/` without
treating `x/` itself as a template.
Closes#15129
Adds a generated **Beta features** table on the feature stages doc,
using the same mainline/stable sparse-checkout approach as the
early-access experiments list.
- Walk `docs/manifest.json` for routes with `state: ["beta"]` and render
a table (title, description, mainline vs stable).
- Inject output between `<!-- BEGIN: available-beta-features -->` /
`END` in `docs/install/releases/feature-stages.md`.
- Rename `scripts/release/docs_update_experiments.sh` →
`docs_update_feature_stages.sh`, refresh the script header, and use
`build/docs/feature-stages` for clone output.
<img width="1624" height="1061" alt="image"
src="https://github.com/user-attachments/assets/5fa811dd-9b80-446b-ae65-ec6e6cfedd6a"
/>
- Suppress informational `log.Printf` messages from the metrics scanner
when stdout is not a TTY (i.e. piped via `atomic_write` in `make gen` or
CI)
- Genuine warnings (`warnf`) still print unconditionally so real
problems remain visible
- `log.Fatalf` for fatal errors is unchanged
> 🤖 Created by Coder Agents and reviewed by a human
GPG emits an "untrusted key" warning when signing with a key that hasn't
been assigned a trust level, which can cause verification steps to fail
or produce noisy output.
Example:
```sh
gpg: Signature made Tue Mar 24 20:56:59 2026 UTC
gpg: using RSA key 21C96B1CB950718874F64DBD6A5A671B5E40A3B9
gpg: Good signature from "Coder Release Signing Key <security@coder.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 21C9 6B1C B950 7188 74F6 4DBD 6A5A 671B 5E40 A3B9
```
After importing the release key, derive its fingerprint from the keyring
and mark it as ultimately trusted via `--import-ownertrust`.
The fingerprint is extracted dynamically rather than hard-coded, so this
works for any key supplied via `CODER_GPG_RELEASE_KEY_BASE64`.
## Summary
The `build` CI job on `main` is failing with:
```
ERROR: Computed invalid windows version format: 2.32.0-rc.0.1
```
This started when the `v2.32.0-rc.0` tag was created, making `git
describe` produce versions like `2.32.0-rc.0-devel+4f571f8ff`.
## Root cause
`scripts/build_go.sh` converts the version to a Windows-compatible
`X.Y.Z.{0,1}` format by stripping pre-release segments. It uses
`${var%-*}` (shortest suffix match), which only removes the last
`-segment`. For RC versions this leaves `-rc.0` intact:
```
2.32.0-rc.0-devel → strip %-* → 2.32.0-rc.0 → + .1 → 2.32.0-rc.0.1 ✗
```
## Fix
Switch to `${var%%-*}` (longest suffix match) so all pre-release
segments are stripped from the first hyphen onward:
```
2.32.0-rc.0-devel → strip %%-* → 2.32.0 → + .1 → 2.32.0.1 ✓
```
Verified all version patterns produce valid output:
| Input | Output |
|---|---|
| `2.32.0` | `2.32.0.0` |
| `2.32.0-devel` | `2.32.0.1` |
| `2.32.0-rc.0-devel` | `2.32.0.1` |
| `2.32.0-rc.0` | `2.32.0.1` |
Fixes
https://github.com/coder/coder/actions/runs/23511163474/job/68434008241
When developers switch branches, the database may have migrations
from the other branch that don't exist in the current binary.
This causes coder server to fail at startup, leaving developers
stuck.
The develop script now detects this before starting the server:
1. Connects to postgres (starts temp embedded instance for
built-in postgres, or uses CODER_PG_CONNECTION_URL).
2. Compares DB version against the source's latest migration.
3. If DB is ahead, searches git history for the missing down
SQL files and applies them in a transaction.
4. If git recovery fails (ambiguous versions across branches,
missing files), falls back to resetting the public schema.
Also adds --reset-db and --skip-db-recovery flags.
The slices package provides type-safe generic replacements for the
old typed sort convenience functions. The codebase already uses
slices.Sort in 43 call sites; this finishes the migration for the
remaining 29.
- sort.Strings(x) -> slices.Sort(x)
- sort.Float64s(x) -> slices.Sort(x)
- sort.StringsAreSorted(x) -> slices.IsSorted(x)
Audited exported helpers in `coderd/util/*`, `testutil`, `cryptorand`,
and friends, then replaced duplicated implementations with canonical
versions.
- **fix: `maps.SortedKeys` generic signature** — value type was
hardcoded to `any`, making it impossible to actually call. Added second
type parameter `V any`. Added table-driven tests with `cmp.Diff`.
- **refactor: replace ad-hoc ptr helpers with `ptr.Ref`** — removed
`int64Ptr`, `stringPtr`, `boolPtr`, `i64ptr`, `strPtr`, `PtrInt32`
across 6 files.
- **refactor: replace local `sortedKeys`/`sortKeys` with
`maps.SortedKeys`** — now that the signature is fixed, scripts can use
it.
- **refactor: replace hand-rolled `capitalize` with
`strings.Capitalize`** — the typegen version was also not UTF-8 safe.
> 🤖 This PR was created with the help of Coder Agents, and was reviewed
by my human. 🧑💻
Pre-commit classifies staged files and runs make pre-commit-light
when no Go, TypeScript, or Makefile changes are present. This
skips gen, lint/go, lint/ts, fmt/go, fmt/ts, and the binary
build. A markdown-only commit takes seconds instead of minutes.
Pre-push uses the same heuristic: if only light files changed
(docs, shell, terraform, etc.), tests are skipped entirely.
Falls back to the full make targets when Go/TS/Makefile changes
are detected, CODER_HOOK_RUN_ALL=1 is set, or the diff range
can't be determined.
Also adds test-storybook to make pre-push (vitest with the
storybook project in Playwright browser mode).
Always deploy from main for now. We want to keep testing commits to main
as soon as they're merged, so we're going to disable the release
freezing behavior. We will test cut releases on a separate deployment,
upgraded manually.
The flat ChatMessagePart interface had 20+ optional fields, preventing
TypeScript from narrowing types on switch(part.type). Each consumer
needed runtime validation, type assertions, or defensive ?. chains.
Add `variants` struct tags to ChatMessagePart fields declaring which
union variants include each field. A codegen mutation in apitypings
reads these tags via reflect and generates per-variant sub-interfaces
(ChatTextPart, ChatReasoningPart, etc.) plus a union type alias.
A test validates every field has a variants tag or is explicitly
excluded, and every part type is covered.
Remove dead frontend code: normalizeBlockType, alias case branches
("thinking", "toolcall", "toolresult"), legacy field fallbacks
(line_number, typedBlock.name/id/input/output), and result_delta
handling. Add test coverage for args_delta streaming, provider_executed
skip logic, and source part parsing.
Previously main.go used syscall.SysProcAttr{Setpgid: true} and
syscall.Kill, both undefined on Windows. This broke GOOS=windows
cross-compilation.
Add a //go:build !windows constraint to the package since it is
a dev-only tool that requires Unix utilities (bash, make, etc.)
and is not intended to run on Windows.
Refs #23054Fixescoder/internal#1407
- Add `--starter-template` option and properly create starter template
with name and icon
- Add Coder Desktop URLs to listening banner
- Makefile tweak to avoid rebuilding `scripts/develop` every time Go
code changes
Replace the ~370-line bash develop.sh with a Go program using
serpent for CLI flags, errgroup for process lifecycle, and
codersdk for setup. develop.sh becomes a thin make + exec wrapper.
- Process groups for clean shutdown of child trees
- Docker template auto-creation via SDK ExampleID
- Idempotent setup (users, orgs, templates)
- Configurable --port, --web-port, --proxy-port
- Preflight runs lib.sh dependency checks
- TCP dial for port-busy checks
- Make target (build/.bin/develop) for build caching
Restore PR title validation that was removed in 828f33a when
cdr-bot was expected to handle it. That bot has since been disabled.
The new title job in contrib.yaml validates:
- Conventional commit format (type(scope): description)
- Type from the same set used by release notes generation
- Scope validity derived from the changed files in the PR diff
- All changed files fall under the declared scope
Uses actions/github-script (no third-party marketplace actions).
Also fixes feat(api) examples across docs (no api folder exists)
and consolidates commit rules into CONTRIBUTING.md as the single
source of truth.
Add cost tracking for LLM chat interactions with microdollar precision.
## Changes
- Add `chatcost` package for per-message cost calculation using
`shopspring/decimal` for intermediate arithmetic
- **Ceil rounding policy**: fractional micros round UP to next whole
micro (applied once after summing all components)
- Database migration: `total_cost_micros` BIGINT column with historical
backfill and `created_at` index
- API endpoints: per-user cost summary and admin rollup under
`/api/experimental/chats/cost/`
- SDK types: `ChatCostSummary`, `ChatCostModelBreakdown`,
`ChatCostUserRollup`
- Fix `modeloptionsgen` to handle `decimal.Decimal` as opaque numeric
type
- Update frontend pricing test fixtures for string decimal types
## Design decisions
- `NULL` = unpriced (no matching model config), `0` = free
- Reasoning tokens included in output tokens (no double-counting)
- Integer microdollars (BIGINT) for storage and API responses
- Price config uses `decimal.Decimal` for exact parsing; totals use
`int64`
Frontend: #23037
When SSH disconnects, the output reader subshells fail writing to the
dead terminal (EIO) and exit due to set -e. This breaks the pipe to
the server, which receives SIGPIPE and dies before graceful shutdown
can stop embedded PostgreSQL.
Disable errexit in readers so they survive terminal death, add HUP
traps so the script handles SSH disconnect, and shield children from
direct SIGHUP via SIG_IGN before exec (Go resets this to caught once
signal.Notify registers). Also make fatal() exit immediately instead
of relying on async signal delivery, and remove the broken process
group kill (ppid was never a PGID).
The pre-push hook was removed in #22956. This restores it with a
reduced scope (tests + site build) and an allowlist so it only runs
for developers who opt in.
Two opt-in mechanisms:
- git config coder.pre-push true (local, not committed)
- CODER_WORKSPACE_OWNER_NAME allowlist in the hook script
git config takes priority and also supports explicit opt-out for
allowlisted users (git config coder.pre-push false).
Refs #22956
---------
Co-authored-by: Cian Johnston <cian@coder.com>
pre-commit was noisy: every sub-target dumped full stdout/stderr to the
terminal, burying failures in pages of compiler output and lint details.
Teach timed-shell.sh a quiet mode via MAKE_LOGDIR: when set, recipe
output is redirected to per-target log files and a one-line status is
printed instead. When unset, behavior is unchanged (with a refreshed
output format).
Makefile changes:
- pre-commit creates a tmpdir, passes MAKE_LOGDIR to sub-makes
- Drop --output-sync=target (log files eliminate interleaving)
- Add --no-print-directory to suppress Entering/Leaving noise
- Split check-unstaged and check-untracked into separate defines
- Restyle both with colored indicators and clearer instructions
- Clean up tmpdir on success, preserve on failure for debugging
## Summary
- add a `--port` flag to `scripts/develop.sh` so the API can run on a
non-default port
- make the script use the selected API port for access URL defaults,
readiness checks, login, proxy wiring, and printed URLs
- reject oversized `--port` values early and treat an existing dev
frontend on port 8080 as a conflict when the requested API is not
already running
- pin the frontend dev server to port 8080 so inherited `PORT`
environment variables do not move it to a different port
## Testing
- `bash -n scripts/develop.sh`
- `shellcheck -x scripts/develop.sh`
- `bash scripts/develop.sh --port abc`
- `bash scripts/develop.sh --port 8080`
- `bash scripts/develop.sh --port 999999999999999999999`
- started `./scripts/develop.sh --port 3001` and verified:
- `http://127.0.0.1:3001/healthz`
- `http://127.0.0.1:3001/api/v2/buildinfo`
- `http://127.0.0.1:8080/healthz`
- `http://127.0.0.1:8080/api/v2/buildinfo`
- simulated an existing dev frontend on `127.0.0.1:8080` and verified
`./scripts/develop.sh --port 3001` exits with a conflict error
## Summary
- remove the `pre-push` git hook script from the repository
- remove the `make pre-push` target and related Makefile documentation
- update contributor and agent docs so they only describe the remaining
`pre-commit` hook
## Validation
- `make pre-commit`
- `git diff --check`
---
_Generated with [`mux`](https://github.com/coder/mux) • Model:
`openai:gpt-5.4` • Thinking: `high`_
pre-commit and pre-push only reported total elapsed time at the end,
making it hard to identify which jobs are slow.
Add a `MAKE_TIMED=1` mode that replaces `SHELL` with a wrapper
(`scripts/lib/timed-shell.sh`) to print wall-clock time for each
recipe. pre-commit and pre-push enable this on their sub-makes.
Ad-hoc use: `make MAKE_TIMED=1 test`
Agents hit short shell timeouts on `git commit` (~13s) before
`make pre-commit` finishes (~20s warm), then disable hooks via
`git config core.hooksPath /dev/null`. This bypasses all local checks
and, because it writes to shared `.git/config`, silently disables hooks
for every other worktree too.
Add explicit timing guidance to AGENTS.md, and write worktree-scoped
`core.hooksPath` in post-checkout, pre-commit, and pre-push hooks to
make the bypass ineffective.
This change adds git hooks and Makefile targets that mirror CI required
checks locally, catching issues before they reach CI.
This is for use by AI agents (documented in AGENTS.md).
- **pre-commit** (every commit): gen, fmt, lint, typos, slim binary
build. Fast checks without Docker or Playwright.
- **pre-push** (before push): full CI suite including site build, tests,
sqlc-vet, offlinedocs.
To use:
```sh
git config core.hooksPath scripts/githooks
```
Works in worktrees (where `.git` is a file). Bypass with `--no-verify`.
## Problem
Bootstrap scripts under `provisionersdk/scripts/` are inlined into
templates via `sh -c '${init_script}'`. Any single quote (apostrophe) in
these `.sh` files silently breaks the shell quoting, causing the agent
to never start — with near-invisible error output.
## Changes
- **`scripts/check_bootstrap_quotes.sh`** — new lint script that scans
all `.sh` files under `provisionersdk/scripts/` for single quotes and
fails with a clear error if any are found. Only checks shell scripts
(not `.ps1`, which legitimately uses single quotes).
- **`Makefile`** — added `lint/bootstrap` target wired into the `lint`
dependency list.
Fixes#22062
This PR does three things:
- Exports derp expvars to the pprof endpoint
- Exports the expvar metrics as prometheus metrics in both coderd and
wsproxy
- Updates our tailscale to a fix I also had to make to avoid a data race
condition
I generated this with mux but I also manually tested that the metrics
were getting properly emitted
Follow-up to #22612. Running `git status --short` in a loop during `make
-B -j gen` still showed intermediate states for several files. This PR
fixes the remaining ones.
The main issues:
- `generate.sh` ran `gofmt` and `goimports` in-place after moving files
into the source tree. Now it formats in a workdir first and only `mv`s
the final result.
- `protoc` targets wrote directly to the source tree. Wrapped with
`scripts/atomic_protoc.sh` which redirects output to a tmpdir.
- Several generators used hardcoded `/tmp/` paths. On systems where
`/tmp` is tmpfs, `mv` degrades to copy+delete. Switched to a
project-local `_gen/` directory (gitignored, same filesystem).
- `apidoc/.gen` and `cli/index.md` used `cp` for final output. Replaced
with `mv`.
- `manifest.json` was written twice (unformatted, then formatted). Now
`.gen` writes to a staging file and the manifest target does one
formatted atomic write.
- `biome_format.sh` silently skipped files in gitignored dirs. Added
`--vcs-enabled=false`.
Two helpers reduce the Makefile boilerplate: `scripts/atomic_protoc.sh`
(wraps protoc) and an `atomic_write` Make define
(stdout-to-temp-to-target pattern). `.PRECIOUS` now also covers `.pb.go`
and mock files.
Verification: `make -B -j gen` x3 with `git status` polling, no changes.
Refs #22612
`make gen` could not run with `-j` because inter-target dependency edges
were missing. Multiple recipes compile `coderd/rbac` (which includes
generated files like `object_gen.go`), and without explicit ordering,
parallel runs produced syntax errors from mid-write reads.
Three main changes:
**Dependency graph fixes** declare the compile-time chain through
`coderd/rbac` so that `object_gen.go` is written before anything that
imports it is compiled. The DB generation targets use a GNU Make 4.3+
grouped target (`&:`) so Make knows `generate.sh` co-produces
`querier.go`, `unique_constraint.go`, `dbmetrics`, and `dbauthz` in a
single invocation. `SKIP_DUMP_SQL=1` avoids re-entrant `make` inside
`generate.sh` when the Makefile already guarantees `dump.sql` is fresh.
**`scripts/atomicwrite` package** replaces `os.WriteFile` in all gen
scripts with a temp-file-in-same-dir + rename pattern, preventing
interrupted runs from leaving partial files.
**`.PRECIOUS` and shell atomic writes** protect git-tracked generated
files from Make's default delete-on-error behavior. Since these files
are committed, deletion is worse than staleness -- `git restore` is the
recovery path.
CI now runs `make -j --output-sync -B gen` (~32s, down from ~85s
serial).
| Scenario | Before | After |
|-----------------------------------|--------------------|----------|
| `make gen` (serial) | 95s | 95s |
| `make -j gen` (parallel) | race error | **22s** |
| CI `make -j --output-sync -B gen` | forced serial ~85s | **~32s** |