Commit Graph

602 Commits

Author SHA1 Message Date
Cian Johnston d99949df43 feat(scripts/develop): add --prometheus-server flag to run Prometheus UI (#24408)
Builds on #24389. Adds `--prometheus-server` flag that starts a
`prom/prometheus:v3.11.2` container alongside the dev environment.

> 🤖
2026-04-20 14:08:13 +00:00
Cian Johnston 4b585465b8 feat: label chatd metrics by model, add stream-state diagnostics (#24475)
Adds production-observability metrics to coderd/x/chatd/ for
model-level correlation and a chatStreams memory-leak investigation.

- Label per-request chatd metrics (steps_total, message_count,
  prompt_size_bytes, tool_result_size_bytes, ttft_seconds,
  compaction_total) with `model` and enrich the per-turn logger
  with provider/model.
- Add `coderd_chatd_stream_retries_total{provider, model, kind}`
  counter incremented in chatloop before OnRetry.
- Register a prometheus.Collector exposing `streams_active`,
  `stream_buffer_size_max`, `stream_buffer_events`,
  `stream_subscribers` from p.chatStreams.
- Add `coderd_chatd_stream_buffer_dropped_total` counter,
  incremented per publishToStream drop independently of the
  existing log-rate-limited bufferDropCount.
- Snapshot logger/model before the title-generation goroutine to
  avoid a data race with the logger/model rebind below it.

> 🤖
2026-04-17 16:16:30 +01:00
Cian Johnston a23a38c1f3 feat(scripts/develop): enable prometheus metrics by default (#24389)
- Adds `--prometheus-port` flag to develop.sh (default: 2114). Passes `--prometheus-enable=true` and `--prometheus-address=0.0.0.0:2114` to dev coderd.
- Show Prometheus URL in dev banner + group by service

> 🤖
2026-04-17 13:06:10 +01:00
Ethan 55e525fc28 ci: add InTx linter replacing ruleguard rule (#24422)
Replace the old `InTx` ruleguard rule in `scripts/rules.go` with a
custom in-tree `go/analysis` analyzer under `scripts/intxcheck/`. The
new analyzer catches the same direct and pass-through misuse classes as
before, plus two new classes the pattern-matcher couldn't reach:

- **Indirect same-package helper misuse** — flags `p.someHelper(ctx)`
inside `InTx` when the helper body uses the outer store (the PR #24369
bug class).
- **Nested dangerous closures** — descends into `go func() { ... }()`,
`defer func() { ... }()`, and immediately-invoked function literals.

The analyzer uses semantic `types.Object` identity instead of raw
expression string comparison, which avoids false positives from
closure-local shadowing and catches simple aliases like `outer := s.db`
and `alias := s`.

This PR also fixes three real outer-store-inside-transaction bugs the
new analyzer surfaced:

- `coderd/wsbuilder/wsbuilder.go`: `FindMatchingPresetID` and
`getWorkspaceTask` now use the inner transaction store instead of
`b.store`.
- `enterprise/dbcrypt/dbcrypt.go`: `ensureEncrypted` now calls
`s.InsertDBCryptKey` (the tx-wrapped store) instead of
`db.InsertDBCryptKey`. The `dbCrypt.InTx` method wraps the raw tx in a
new `*dbCrypt`, so `s.InsertDBCryptKey` still dispatches through the
encryption layer.

Two call sites need `// intxcheck:ignore` suppressions. Both are one-off
patterns that only look like misuse because the analyzer doesn't track
assignments — proving them safe would require full dataflow analysis,
which is well beyond what a targeted lint like this should attempt:

- `coderd/database/dbfake/dbfake.go` — `b.db` is reassigned to `tx` on
the preceding line, so `b.doInTX()` actually uses the transaction. The
analyzer sees the original `b.db` identity and flags it.
- `coderd/database/db_test.go` — test intentionally passes the outer
store to `require.Equal` to assert that nested `InTx` returns the same
handle.

Suppressions use `// intxcheck:ignore` instead of `//nolint:intxcheck`
because `intxcheck` runs as a standalone `go/analysis` tool outside
golangci-lint. golangci-lint's `nolintlint` checker flags `//nolint`
directives for linters it doesn't control, so we use a custom comment
prefix to avoid that conflict.
2026-04-17 00:07:30 +10:00
Kayla はな d23a6959fc chore: upgrade to ubuntu 26.04 (#24267) 2026-04-15 15:02:47 -06:00
Cian Johnston d7439a9de0 feat: add Prometheus metrics for chatd subsystem (#24371)
Adds 7 Prometheus metrics to the chatd subsystem and introduces typed
`ActivityBumpReason` for deadline bump attribution.

| Metric | Type | Labels |
|--------|------|--------|
| `coderd_chatd_chats` | Gauge | `state` (streaming, waiting) |
| `coderd_chatd_message_count` | Histogram | `provider` |
| `coderd_chatd_prompt_size_bytes` | Histogram | `provider` |
| `coderd_chatd_tool_result_size_bytes` | Histogram | `provider`,
`tool_name` |
| `coderd_chatd_ttft_seconds` | Histogram | `provider` |
| `coderd_chatd_compaction_total` | Counter | `provider`, `result` |
| `coderd_chatd_steps_total` | Counter | `provider` |

> 🤖
2026-04-15 19:53:10 +01:00
Danny Kopping 48b90f8cc8 feat: add coder_build_info metric (#24365)
_Disclaimer: produced by Claude Opus 4.6_

Adds a `coder_build_info` metric which allows operators to see which
versions of Coder are currently running.

---------

Signed-off-by: Danny Kopping <danny@coder.com>
2026-04-15 12:48:38 +00:00
J. Scott Miller 20b953a99d feat: add Prometheus metric for agent first connection duration (#24179)
## Summary

Add `coderd_agents_first_connection_seconds` histogram metric that
records the
duration from workspace agent creation to first connection. This fills
an
observability gap — provisioner job timings and startup script metrics
exist,
but the agent connection phase (which can take several minutes) was not
exposed
to Prometheus.

Closes https://github.com/coder/coder/issues/21282

## Changes

- **`coderd/prometheusmetrics/prometheusmetrics.go`** — Define and
register a
  `HistogramVec` in the existing `Agents()` polling loop. Observe
`first_connected_at - created_at` exactly once per agent via a
deduplication
  map, pruned each tick to prevent unbounded memory growth.
- **`coderd/prometheusmetrics/prometheusmetrics_test.go`** — Update
`TestAgents`
to set `first_connected_at` on the test agent and assert the histogram
is
  collected with correct labels, sample count, and sample sum.
- **`docs/admin/integrations/prometheus.md`**,
**`scripts/metricsdocgen/generated_metrics`** —
  Auto-generated documentation updates from `make gen`.

## Metric details

| Property | Value |
|---|---|
| Name | `coderd_agents_first_connection_seconds` |
| Type | histogram |
| Labels | `template_name`, `agent_name`, `username`, `workspace_name` |
| Buckets | 1s, 10s, 30s, 1m, 2m, 5m, 10m, 30m, 1h |

## Example PromQL

```promql
# P95 agent connection time by template
histogram_quantile(0.95,
  sum(rate(coderd_agents_first_connection_seconds_bucket[1h])) by (le, template_name)
)
```

<details>
<summary>Implementation notes</summary>

### Design decisions

- **Histogram over gauge**: Enables `histogram_quantile()` for
percentile queries.
- **Observe in `Agents()` polling loop**: All required data is already
fetched by
  `GetWorkspaceAgentsForMetrics()` — no new DB queries.
- **Dedup via `map[uuid.UUID]struct{}`**: Prevents re-observing the same
agent
  across polling ticks. Pruned each cycle to bound memory.
- **Buckets**: Aligned with
`coderd_provisionerd_workspace_build_timings_seconds`
  range (1s–1h).

### Overhead at scale (100k active workspaces)

The deduplication map (`observedFirstConnection`) and per-tick pruning
map
(`currentAgentIDs`) are both `map[[16]byte]struct{}`. At 100k agents:

- **Memory**: ~2.25 MB persistent + ~2.25 MB transient per tick = **~4.5
MB peak**.
- **CPU**: ~25 ms of map operations per tick (one tick per minute) =
**<0.05% of one core**.

Both are negligible relative to the existing cost of the `Agents()` loop
(the DB
query, per-agent `GetWorkspaceAppsByAgentID` calls, and coordinator node
lookups
dominate).

</details>

> 🤖 Generated by Coder Agents
2026-04-14 12:00:46 -05:00
Cian Johnston 497f637f58 chore: revert force deploying main (#23290) (#24072)
⚠️ DO NOT MERGE UNTIL @f0ssel SAYS SO ⚠️ 

This reverts commit 8f78c5145f
(https://github.com/coder/coder/pull/23290).
2026-04-08 11:19:14 -04:00
Ethan be686a8d0d fix(scripts/githooks): clear all repo-local Git env vars in hooks (#24138)
## Problem

In linked worktrees, Git hooks inherit multiple repo-local environment
variables: `GIT_DIR`, `GIT_COMMON_DIR`, `GIT_INDEX_FILE`, and others.
The
pre-commit and pre-push hooks only unset `GIT_DIR`, leaving the rest in
place.

When `make pre-commit` runs `go build`, Go tries to stamp VCS info by
shelling
out to `git`. With the leftover partial Git environment, `git` exits 128
and
the build fails:

```
error obtaining VCS status: exit status 128
    Use -buildvcs=false to disable VCS stamping.
```

This only happens inside hooks in a linked worktree — running `make
pre-commit`
directly from the terminal works fine because the repo-local vars are
not set.

## Fix

Replace the bare `unset GIT_DIR` in both hooks with a loop that clears
every
variable reported by `git rev-parse --local-env-vars`:

```sh
while IFS= read -r var; do
    unset "$var"
done < <(git rev-parse --local-env-vars)
```

This covers all 15 repo-local variables Git may inject (`GIT_DIR`,
`GIT_COMMON_DIR`, `GIT_INDEX_FILE`, `GIT_OBJECT_DIRECTORY`, etc.) and is
forward-compatible — if Git adds new local vars in the future, the loop
picks
them up automatically.
2026-04-09 01:06:12 +10:00
Garrett Delfosse ab77154975 ci: add cherry-pick to latest release workflow (#24051)
Adds a GitHub Actions workflow that cherry-picks merged PRs to the
latest release branch when the `cherry-pick` label is applied.

## How it works

1. Add the `cherry-pick` label to any PR targeting `main` (before or
after merge).
2. On merge (or on label if already merged), the workflow detects the
latest `release/*` branch.
3. It cherry-picks the merge commit (`-x -m1`) and opens a PR.

This complements the `backport` label (see #24025) which targets the
latest **3** release branches. `cherry-pick` targets only the **latest**
one — useful for getting fixes into the current release.

Created PRs follow existing repo conventions:
- **Branch:** `backport/<pr>-to-<version>`
- **Title:** `<original PR title> (#<pr>)` — e.g. `fix(site): correct
button alignment (#12345)`
- **Body:** links back to the original PR and merge commit

If the cherry-pick encounters conflicts, the workflow aborts the
cherry-pick, creates an empty commit with resolution instructions, and
opens the PR with a `[CONFLICT]` prefix so the author can resolve
manually.

Also:
- Removes `scripts/backport-pr.sh` (replaced by this workflow)
- Removes `.github/cherry-pick-bot.yml` (old bot config)
- Adds a section to the contributing docs explaining the `cherry-pick`
label

> [!NOTE]
> Generated with [Coder Agents](https://coder.com/agents)
2026-04-08 10:22:33 -04:00
Garrett Delfosse 48bc215f20 chore: tag RCs on main, cut release branch only for releases (#24001)
RC tags are now created directly on `main`. The `release/X.Y` branch is
only cut when the actual release is ready. This eliminates the need to
cherry-pick hundreds of commits from main onto the release branch
between the first RC and the release.

## Workflow

```
main:  ──●──●──●──●──●──●──●──●──●──
              ↑           ↑     ↑
           rc.0        rc.1    cut release/2.34, tag v2.34.0
                                     \
                               release/2.34:  ──●── v2.34.1 (patch)
```

1. **RC:** On `main`, run `./scripts/release.sh`. The tool detects main
(or a detached HEAD reachable from main), prompts for the commit SHA to
tag, suggests the next RC version, and tags it.
2. **Release:** When the RC is blessed, create `release/X.Y` from `main`
(or the specific RC commit). Switch to that branch and run
`./scripts/release.sh`, which suggests `vX.Y.0`.
3. **Patch:** Cherry-pick fixes onto `release/X.Y` and run
`./scripts/release.sh` from that branch.

## Changes

### `scripts/releaser/release.go`
- Two modes based on branch:
- **`main` (or detached HEAD from main)** — RC tagging. Prompts for the
commit SHA to tag (defaults to HEAD). Always checks out the target
commit so the flow operates in detached HEAD. Suggests the next RC based
on existing RC tags.
- **`release/X.Y`** — Release/patch mode. Suggests `vX.Y.0` if the
latest tag is an RC, or the next patch otherwise.
- Detached HEAD support: if `git branch --show-current` is empty, checks
whether HEAD is an ancestor of `origin/main` and enters RC mode
automatically.
- Commit selection prompt in RC mode: shows current commit, lets the
user confirm or provide a different SHA.
- Warns if you try to tag a non-RC on main, or an RC on a release
branch.
- Skips open-PR check and branch sync check in RC mode (not useful on
main).

### `scripts/releaser/main.go`
- Updated help text.

### `.github/workflows/release.yaml`
- RC tags (`*-rc.*`): skip the release-branch validation (they live on
main).
- Non-RC tags: still require the corresponding `release/X.Y` branch.

### `docs/about/contributing/CONTRIBUTING.md`
- Rewrote the Releases section with the new workflow, release types
table, and ASCII diagram.
- Replaced the old "Creating a release" / "Creating a release (via
workflow dispatch)" subsections.

<details><summary>Decision log</summary>

### Why this approach?

Previously, cutting a release branch early for an RC meant
cherry-picking all of main's progress onto that branch before the actual
release — often hundreds of commits. This approach avoids that entirely:
RCs are just tagged snapshots of main, and the release branch only
exists once you need it for stabilization and backports.

### Files NOT changed

- **`scripts/release/publish.sh`** — `--rc` flag controls GitHub
prerelease marking (tag-level, not branch-level). `target_commitish`
already defaults to `main` when the tag isn't on a release branch.
- **`scripts/release/tag_version.sh`** — No RC-specific branch logic.
- **`scripts/releaser/version.go`** — Version parsing/comparison
unchanged.
- **`docs/install/releases/index.md`** — Public-facing docs describe RC
as a release channel with no branch-level detail.

</details>

> Generated by Coder Agents
2026-04-07 15:21:22 -04:00
Garrett Delfosse 5453a6c6d6 fix(scripts/releaser): simplify branch regex and fix changelog range (#23947)
Two fixes for the release script:

**1. Branch regex cleanup** — Simplified to only match `release/X.Y`.
Removed
support for `release/X.Y.Z` and `release/X.Y-rc.N` branch formats. RCs
are
now tagged from main (not from release branches), and the three-segment
`release/X.Y.Z` format will not be used going forward.

**2. Changelog range for first release on a new minor** — When no tags
match
the branch's major.minor, the commit range fell back to `HEAD` (entire
git
history, ~13k lines of changelog). Now computes `git merge-base` with
the
previous minor's release branch (e.g. `origin/release/2.32`) as the
changelog
starting point. This works even when that branch has no tags pushed yet.
Falls
back to the latest reachable tag from a previous minor if the branch
doesn't
exist.
2026-04-07 17:07:21 +00:00
Kyle Carberry 500fc5e2a4 feat: polish model config form UI (#24047)
Polishes the AI model configuration form (add/edit model) with tighter
layout and better input affordances.

**Frontend changes:**
- Replace "Unset" with "Default" in select dropdowns to communicate
system fallback
- Show pricing fields inline instead of behind a collapsible toggle
- Use flat section dividers (`border-t`) instead of bordered fieldsets
- Move field descriptions into info-icon tooltips to fix input
misalignment
- Add InputGroup adornments: `$` prefix + `/1M` suffix on pricing,
`tokens` suffix on token fields, `%` suffix on compression threshold,
range placeholders on temperature/penalty fields
- Shorter pricing labels (Input, Output, Cache Read, Cache Write)
- Compact JSON textareas (1-row height, resizable)
- Smart grid layouts by field type (3-col provider, 4-col pricing, 3-col
advanced)
- Boolean fields render as a segmented control (Default · On · Off)
instead of a dropdown

**Backend changes:**
- Add `enum` tags to OpenAI `service_tier`
(`auto,default,flex,scale,priority`) and `reasoning_summary`
(`auto,concise,detailed`) so they render as select dropdowns instead of
free-text inputs

> 🤖 Generated by Coder Agents
2026-04-06 10:49:42 -04:00
Zach 796e8e4e18 feat: use -x option in backport script to improve tracability (#23984)
Add `-x` to backport script `git cherry-pick` command to include a
commit message reference to the original commit. This makes it easier to
trace where a cherry picked commit actually came from.
2026-04-02 10:23:31 -06:00
Danny Kopping fce05d0428 feat: add backport PR script (#23973)
_Disclaimer: created using Claude Opus 4.6._

```
# Examples:
#   ./scripts/backport-pr.sh 2.30 23969
#   ./scripts/backport-pr.sh --dry-run 2.30 23969
```

Here's one I created: https://github.com/coder/coder/pull/23972

Signed-off-by: Danny Kopping <danny@coder.com>
2026-04-02 15:19:06 +02:00
Garrett Delfosse be2e641162 feat: add release candidate (RC) support to release tooling (#23600)
This adds full RC release support to the release scripts and GitHub
Actions workflow. Previously, the tooling only supported stable and
mainline releases with strict vMAJOR.MINOR.PATCH semver tags.

Changes:
- scripts/releaser/version.go: Add Pre field to version struct for
prerelease suffixes (e.g. "rc.0"), update regex, parsing, String(),
comparison methods, and add IsRC()/rcNumber() helpers.
- scripts/releaser/release.go: Detect RC branches (release/X.Y-rc.N),
suggest RC version numbers, auto-set "rc" channel (skipping
stable/mainline prompt), add RC advisory to release notes, skip docs
update for RC releases.
- .github/workflows/release.yaml: Add "rc" channel option, fix branch
derivation for RC tags (v2.32.0-rc.0 -> release/2.32-rc.0 instead of
broken release/2.32.0-rc), skip homebrew/winget/package publishing for
RC releases.
- scripts/release/publish.sh: Add --rc flag, pass --prerelease to gh
release create for RC releases.
- scripts/releaser/version_test.go: Add comprehensive unit tests for
version parsing, string formatting, IsRC, rcNumber, GreaterThan, and
Equal with RC versions.

<!--

If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.

-->
2026-04-01 16:00:49 -04:00
Michael Suchacz cf500b95b9 chore: move docker-chat-sandbox under templates/x (#23777)
Adds the experimental `docker-chat-sandbox` example template under
`examples/templates/x/`. It provisions a regular dev agent plus a
chat-designated agent that runs inside bubblewrap with a read-only root,
writable `/home/coder`, and outbound TCP restricted to the Coder
control-plane endpoint via `iptables`.

The chat agent still appears in dashboard and API responses, but the
template reserves it for chatd-managed sessions rather than normal user
interaction. `lint/examples` now walks nested template directories, so
experimental templates can live under `examples/templates/x/` without
treating `x/` itself as a template.
2026-03-30 15:17:55 +02:00
Jake Howell 8f73e46c2f feat: automatically generate beta features (#23549)
Closes #15129

Adds a generated **Beta features** table on the feature stages doc,
using the same mainline/stable sparse-checkout approach as the
early-access experiments list.

- Walk `docs/manifest.json` for routes with `state: ["beta"]` and render
a table (title, description, mainline vs stable).
- Inject output between `<!-- BEGIN: available-beta-features -->` /
`END` in `docs/install/releases/feature-stages.md`.
- Rename `scripts/release/docs_update_experiments.sh` →
`docs_update_feature_stages.sh`, refresh the script header, and use
`build/docs/feature-stages` for clone output.

<img width="1624" height="1061" alt="image"
src="https://github.com/user-attachments/assets/5fa811dd-9b80-446b-ae65-ec6e6cfedd6a"
/>
2026-03-30 21:14:52 +11:00
Cian Johnston f164463c6a fix(scripts/metricsdocgen): shush the prometheus scanner in CI (#23642)
- Suppress informational `log.Printf` messages from the metrics scanner
when stdout is not a TTY (i.e. piped via `atomic_write` in `make gen` or
CI)
- Genuine warnings (`warnf`) still print unconditionally so real
problems remain visible
- `log.Fatalf` for fatal errors is unchanged

> 🤖 Created by Coder Agents and reviewed by a human
2026-03-26 12:58:02 +00:00
Jakub Domeracki 6bc6e2baa6 fix: explicitly trust our own GPG key (#23556)
GPG emits an "untrusted key" warning when signing with a key that hasn't
been assigned a trust level, which can cause verification steps to fail
or produce noisy output.

Example:
```sh
gpg: Signature made Tue Mar 24 20:56:59 2026 UTC
gpg:                using RSA key 21C96B1CB950718874F64DBD6A5A671B5E40A3B9
gpg: Good signature from "Coder Release Signing Key <security@coder.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 21C9 6B1C B950 7188 74F6  4DBD 6A5A 671B 5E40 A3B9
```

After importing the release key, derive its fingerprint from the keyring
and mark it as ultimately trusted via `--import-ownertrust`.
The fingerprint is extracted dynamically rather than hard-coded, so this
works for any key supplied via `CODER_GPG_RELEASE_KEY_BASE64`.
2026-03-25 10:24:31 +01:00
Garrett Delfosse 7f75670f8d fix(scripts): fix Windows version format for RC builds (#23542)
## Summary

The `build` CI job on `main` is failing with:

```
ERROR: Computed invalid windows version format: 2.32.0-rc.0.1
```

This started when the `v2.32.0-rc.0` tag was created, making `git
describe` produce versions like `2.32.0-rc.0-devel+4f571f8ff`.

## Root cause

`scripts/build_go.sh` converts the version to a Windows-compatible
`X.Y.Z.{0,1}` format by stripping pre-release segments. It uses
`${var%-*}` (shortest suffix match), which only removes the last
`-segment`. For RC versions this leaves `-rc.0` intact:

```
2.32.0-rc.0-devel  →  strip %-*  →  2.32.0-rc.0  →  + .1  →  2.32.0-rc.0.1  ✗
```

## Fix

Switch to `${var%%-*}` (longest suffix match) so all pre-release
segments are stripped from the first hyphen onward:

```
2.32.0-rc.0-devel  →  strip %%-*  →  2.32.0  →  + .1  →  2.32.0.1  ✓
```

Verified all version patterns produce valid output:

| Input | Output |
|---|---|
| `2.32.0` | `2.32.0.0` |
| `2.32.0-devel` | `2.32.0.1` |
| `2.32.0-rc.0-devel` | `2.32.0.1` |
| `2.32.0-rc.0` | `2.32.0.1` |

Fixes
https://github.com/coder/coder/actions/runs/23511163474/job/68434008241
2026-03-24 18:59:44 -04:00
Mathias Fredriksson 78b18e72bf feat: add automatic database migration recovery to scripts/develop (#23466)
When developers switch branches, the database may have migrations
from the other branch that don't exist in the current binary.
This causes coder server to fail at startup, leaving developers
stuck.

The develop script now detects this before starting the server:

1. Connects to postgres (starts temp embedded instance for
   built-in postgres, or uses CODER_PG_CONNECTION_URL).
2. Compares DB version against the source's latest migration.
3. If DB is ahead, searches git history for the missing down
   SQL files and applies them in a transaction.
4. If git recovery fails (ambiguous versions across branches,
   missing files), falls back to resetting the public schema.

Also adds --reset-db and --skip-db-recovery flags.
2026-03-24 22:04:56 +02:00
Mathias Fredriksson 147df5c971 refactor: replace sort.Strings with slices.Sort (#23457)
The slices package provides type-safe generic replacements for the
old typed sort convenience functions. The codebase already uses
slices.Sort in 43 call sites; this finishes the migration for the
remaining 29.

- sort.Strings(x)          -> slices.Sort(x)
- sort.Float64s(x)         -> slices.Sort(x)
- sort.StringsAreSorted(x) -> slices.IsSorted(x)
2026-03-23 23:19:23 +02:00
Cian Johnston f1d333f0e6 refactor: deduplicate utility helpers across the codebase (#23338)
Audited exported helpers in `coderd/util/*`, `testutil`, `cryptorand`,
and friends, then replaced duplicated implementations with canonical
versions.

- **fix: `maps.SortedKeys` generic signature** — value type was
hardcoded to `any`, making it impossible to actually call. Added second
type parameter `V any`. Added table-driven tests with `cmp.Diff`.
- **refactor: replace ad-hoc ptr helpers with `ptr.Ref`** — removed
`int64Ptr`, `stringPtr`, `boolPtr`, `i64ptr`, `strPtr`, `PtrInt32`
across 6 files.
- **refactor: replace local `sortedKeys`/`sortKeys` with
`maps.SortedKeys`** — now that the signature is fixed, scripts can use
it.
- **refactor: replace hand-rolled `capitalize` with
`strings.Capitalize`** — the typegen version was also not UTF-8 safe.

> 🤖 This PR was created with the help of Coder Agents, and was reviewed
by my human. 🧑‍💻
2026-03-20 15:12:41 +00:00
Mathias Fredriksson 23542cb6af feat: smart file-based target selection for scripts/githooks (#23358)
Pre-commit classifies staged files and runs make pre-commit-light
when no Go, TypeScript, or Makefile changes are present. This
skips gen, lint/go, lint/ts, fmt/go, fmt/ts, and the binary
build. A markdown-only commit takes seconds instead of minutes.

Pre-push uses the same heuristic: if only light files changed
(docs, shell, terraform, etc.), tests are skipped entirely.
Falls back to the full make targets when Go/TS/Makefile changes
are detected, CODER_HOOK_RUN_ALL=1 is set, or the diff range
can't be determined.

Also adds test-storybook to make pre-push (vitest with the
storybook project in Playwright browser mode).
2026-03-20 17:05:44 +02:00
Dean Sheather 8f78c5145f chore: force deploying to dogfood from main (#23290)
Always deploy from main for now. We want to keep testing commits to main
as soon as they're merged, so we're going to disable the release
freezing behavior. We will test cut releases on a separate deployment,
upgraded manually.
2026-03-19 13:52:30 +00:00
Garrett Delfosse 1f5f6c9ccb chore: add new interactive release package (#22624) 2026-03-18 12:19:54 -04:00
Mathias Fredriksson 66f809388e refactor: make ChatMessagePart a discriminated union in TypeScript (#23168)
The flat ChatMessagePart interface had 20+ optional fields, preventing
TypeScript from narrowing types on switch(part.type). Each consumer
needed runtime validation, type assertions, or defensive ?. chains.

Add `variants` struct tags to ChatMessagePart fields declaring which
union variants include each field. A codegen mutation in apitypings
reads these tags via reflect and generates per-variant sub-interfaces
(ChatTextPart, ChatReasoningPart, etc.) plus a union type alias.
A test validates every field has a variants tag or is explicitly
excluded, and every part type is covered.

Remove dead frontend code: normalizeBlockType, alias case branches
("thinking", "toolcall", "toolresult"), legacy field fallbacks
(line_number, typedBlock.name/id/input/output), and result_delta
handling. Add test coverage for args_delta streaming, provider_executed
skip logic, and source part parsing.
2026-03-18 09:27:51 +00:00
Mathias Fredriksson c2243addce fix(scripts/develop): allow empty access-url for devtunnel (#23166) 2026-03-17 18:06:55 +00:00
Mathias Fredriksson 144b32a4b6 fix(scripts/develop): skip build on Windows via build tag (#23118)
Previously main.go used syscall.SysProcAttr{Setpgid: true} and
syscall.Kill, both undefined on Windows. This broke GOOS=windows
cross-compilation.

Add a //go:build !windows constraint to the package since it is
a dev-only tool that requires Unix utilities (bash, make, etc.)
and is not intended to run on Windows.

Refs #23054
Fixes coder/internal#1407
2026-03-18 02:02:06 +11:00
Mathias Fredriksson a797a494ef feat: add starter template option and Coder Desktop URLs to scripts/develop (#23149)
- Add `--starter-template` option and properly create starter template
  with name and icon
- Add Coder Desktop URLs to listening banner
- Makefile tweak to avoid rebuilding `scripts/develop` every time Go
  code changes
2026-03-17 15:34:03 +02:00
Mathias Fredriksson 3a3537a642 refactor: rewrite develop.sh orchestrator in Go (#23054)
Replace the ~370-line bash develop.sh with a Go program using
serpent for CLI flags, errgroup for process lifecycle, and
codersdk for setup. develop.sh becomes a thin make + exec wrapper.

- Process groups for clean shutdown of child trees
- Docker template auto-creation via SDK ExampleID
- Idempotent setup (users, orgs, templates)
- Configurable --port, --web-port, --proxy-port
- Preflight runs lib.sh dependency checks
- TCP dial for port-busy checks
- Make target (build/.bin/develop) for build caching
2026-03-16 16:13:57 +02:00
Mathias Fredriksson aa6f301305 ci: add conventional commit PR title linting (#23096)
Restore PR title validation that was removed in 828f33a when
cdr-bot was expected to handle it. That bot has since been disabled.

The new title job in contrib.yaml validates:
- Conventional commit format (type(scope): description)
- Type from the same set used by release notes generation
- Scope validity derived from the changed files in the PR diff
- All changed files fall under the declared scope

Uses actions/github-script (no third-party marketplace actions).

Also fixes feat(api) examples across docs (no api folder exists)
and consolidates commit rules into CONTRIBUTING.md as the single
source of truth.
2026-03-16 12:24:59 +02:00
Mathias Fredriksson 3bd840fe27 build(scripts/lib): allow MAKE_TIMED=1 to show last 20 lines of failed jobs (#22985)
Refs #22978
2026-03-13 17:41:33 +00:00
Michael Suchacz c3b6284955 feat: add chat cost analytics backend (#23036)
Add cost tracking for LLM chat interactions with microdollar precision.

## Changes
- Add `chatcost` package for per-message cost calculation using
`shopspring/decimal` for intermediate arithmetic
- **Ceil rounding policy**: fractional micros round UP to next whole
micro (applied once after summing all components)
- Database migration: `total_cost_micros` BIGINT column with historical
backfill and `created_at` index
- API endpoints: per-user cost summary and admin rollup under
`/api/experimental/chats/cost/`
- SDK types: `ChatCostSummary`, `ChatCostModelBreakdown`,
`ChatCostUserRollup`
- Fix `modeloptionsgen` to handle `decimal.Decimal` as opaque numeric
type
- Update frontend pricing test fixtures for string decimal types

## Design decisions
- `NULL` = unpriced (no matching model config), `0` = free
- Reasoning tokens included in output tokens (no double-counting)
- Integer microdollars (BIGINT) for storage and API responses
- Price config uses `decimal.Decimal` for exact parsing; totals use
`int64`

Frontend: #23037
2026-03-13 18:30:49 +01:00
Mathias Fredriksson cc6716c730 fix(scripts): allow graceful shutdown when --debug/dlv is used in develop.sh (#23023) 2026-03-13 15:40:53 +02:00
Mathias Fredriksson 2bb483b425 fix(scripts/develop.sh): handle SIGHUP and prevent SIGPIPE during shutdown (#22994)
When SSH disconnects, the output reader subshells fail writing to the
dead terminal (EIO) and exit due to set -e. This breaks the pipe to
the server, which receives SIGPIPE and dies before graceful shutdown
can stop embedded PostgreSQL.

Disable errexit in readers so they survive terminal death, add HUP
traps so the script handles SSH disconnect, and shield children from
direct SIGHUP via SIG_IGN before exec (Go resets this to caught once
signal.Notify registers). Also make fatal() exit immediately instead
of relying on async signal delivery, and remove the broken process
group kill (ppid was never a PGID).
2026-03-12 16:09:30 +02:00
Mathias Fredriksson 660a3dad21 feat(scripts/githooks): restore pre-push hook with allowlist (#22980)
The pre-push hook was removed in #22956. This restores it with a
reduced scope (tests + site build) and an allowlist so it only runs
for developers who opt in.

Two opt-in mechanisms:

- git config coder.pre-push true (local, not committed)
- CODER_WORKSPACE_OWNER_NAME allowlist in the hook script

git config takes priority and also supports explicit opt-out for
allowlisted users (git config coder.pre-push false).

Refs #22956

---------

Co-authored-by: Cian Johnston <cian@coder.com>
2026-03-12 12:13:55 +02:00
Mathias Fredriksson e7e2de99ba build(Makefile): capture pre-commit output to log files (#22978)
pre-commit was noisy: every sub-target dumped full stdout/stderr to the
terminal, burying failures in pages of compiler output and lint details.

Teach timed-shell.sh a quiet mode via MAKE_LOGDIR: when set, recipe
output is redirected to per-target log files and a one-line status is
printed instead. When unset, behavior is unchanged (with a refreshed
output format).

Makefile changes:

- pre-commit creates a tmpdir, passes MAKE_LOGDIR to sub-makes
- Drop --output-sync=target (log files eliminate interleaving)
- Add --no-print-directory to suppress Entering/Leaving noise
- Split check-unstaged and check-untracked into separate defines
- Restyle both with colored indicators and clearer instructions
- Clean up tmpdir on success, preserve on failure for debugging
2026-03-12 11:16:31 +02:00
Michael Suchacz 51198744ff feat(scripts): add develop.sh port flag (#22961)
## Summary
- add a `--port` flag to `scripts/develop.sh` so the API can run on a
non-default port
- make the script use the selected API port for access URL defaults,
readiness checks, login, proxy wiring, and printed URLs
- reject oversized `--port` values early and treat an existing dev
frontend on port 8080 as a conflict when the requested API is not
already running
- pin the frontend dev server to port 8080 so inherited `PORT`
environment variables do not move it to a different port

## Testing
- `bash -n scripts/develop.sh`
- `shellcheck -x scripts/develop.sh`
- `bash scripts/develop.sh --port abc`
- `bash scripts/develop.sh --port 8080`
- `bash scripts/develop.sh --port 999999999999999999999`
- started `./scripts/develop.sh --port 3001` and verified:
  - `http://127.0.0.1:3001/healthz`
  - `http://127.0.0.1:3001/api/v2/buildinfo`
  - `http://127.0.0.1:8080/healthz`
  - `http://127.0.0.1:8080/api/v2/buildinfo`
- simulated an existing dev frontend on `127.0.0.1:8080` and verified
`./scripts/develop.sh --port 3001` exits with a conflict error
2026-03-11 20:11:03 +01:00
Thomas Kosiewski e96cd5cbb2 chore(githooks): remove pre-push hook (#22956)
## Summary
- remove the `pre-push` git hook script from the repository
- remove the `make pre-push` target and related Makefile documentation
- update contributor and agent docs so they only describe the remaining
`pre-commit` hook

## Validation
- `make pre-commit`
- `git diff --check`

---
_Generated with [`mux`](https://github.com/coder/mux) • Model:
`openai:gpt-5.4` • Thinking: `high`_
2026-03-11 17:44:19 +01:00
Mathias Fredriksson 56960585af build(Makefile): add per-target timing via SHELL wrapper (#22862)
pre-commit and pre-push only reported total elapsed time at the end,
making it hard to identify which jobs are slow.

Add a `MAKE_TIMED=1` mode that replaces `SHELL` with a wrapper
(`scripts/lib/timed-shell.sh`) to print wall-clock time for each
recipe. pre-commit and pre-push enable this on their sub-makes.

Ad-hoc use: `make MAKE_TIMED=1 test`
2026-03-09 23:07:33 +02:00
Mathias Fredriksson 9e7125f852 fix(scripts): handle ignored enc.Encode error in telemetry server (#22855)
Check the `json.Encoder.Encode` error and print to stderr.

Part of the effort to enable `errcheck.check-blank` in golangci-lint.
2026-03-09 19:03:06 +02:00
Mathias Fredriksson dd34e3d3c2 fix(scripts/githooks): prevent agents from bypassing git hooks (#22825)
Agents hit short shell timeouts on `git commit` (~13s) before
`make pre-commit` finishes (~20s warm), then disable hooks via
`git config core.hooksPath /dev/null`. This bypasses all local checks
and, because it writes to shared `.git/config`, silently disables hooks
for every other worktree too.

Add explicit timing guidance to AGENTS.md, and write worktree-scoped
`core.hooksPath` in post-checkout, pre-commit, and pre-push hooks to
make the bypass ineffective.
2026-03-09 12:51:44 +02:00
Mathias Fredriksson 752e6ecc16 build: add pre-commit/push hooks mirroring CI checks (#22705)
This change adds git hooks and Makefile targets that mirror CI required
checks locally, catching issues before they reach CI.

This is for use by AI agents (documented in AGENTS.md).

- **pre-commit** (every commit): gen, fmt, lint, typos, slim binary
  build. Fast checks without Docker or Playwright.
- **pre-push** (before push): full CI suite including site build, tests,
  sqlc-vet, offlinedocs.
  
To use:

```sh
git config core.hooksPath scripts/githooks
```

Works in worktrees (where `.git` is a file). Bypass with `--no-verify`.
2026-03-06 16:56:11 +02:00
Kacper Sawicki ba05188934 ci: add lint check to prevent single quotes in bootstrap scripts (#22664)
## Problem

Bootstrap scripts under `provisionersdk/scripts/` are inlined into
templates via `sh -c '${init_script}'`. Any single quote (apostrophe) in
these `.sh` files silently breaks the shell quoting, causing the agent
to never start — with near-invisible error output.

## Changes

- **`scripts/check_bootstrap_quotes.sh`** — new lint script that scans
all `.sh` files under `provisionersdk/scripts/` for single quotes and
fails with a clear error if any are found. Only checks shell scripts
(not `.ps1`, which legitimately uses single quotes).
- **`Makefile`** — added `lint/bootstrap` target wired into the `lint`
dependency list.

Fixes #22062
2026-03-06 13:09:56 +01:00
Jon Ayers 6c44de951d feat: add Prometheus collector for DERP server expvar metrics (#22583)
This PR does three things:
- Exports derp expvars to the pprof endpoint
- Exports the expvar metrics as prometheus metrics in both coderd and
wsproxy
- Updates our tailscale to a fix I also had to make to avoid a data race
condition

I generated this with mux but I also manually tested that the metrics
were getting properly emitted
2026-03-06 01:57:58 -06:00
Mathias Fredriksson 719c24829a build(Makefile): use atomic writes for remaining gen targets (#22670)
Follow-up to #22612. Running `git status --short` in a loop during `make
-B -j gen` still showed intermediate states for several files. This PR
fixes the remaining ones.

The main issues:

- `generate.sh` ran `gofmt` and `goimports` in-place after moving files
  into the source tree. Now it formats in a workdir first and only `mv`s 
  the final result.
- `protoc` targets wrote directly to the source tree. Wrapped with
  `scripts/atomic_protoc.sh` which redirects output to a tmpdir.
- Several generators used hardcoded `/tmp/` paths. On systems where
  `/tmp` is tmpfs, `mv` degrades to copy+delete. Switched to a
  project-local `_gen/` directory (gitignored, same filesystem).
- `apidoc/.gen` and `cli/index.md` used `cp` for final output. Replaced
  with `mv`.
- `manifest.json` was written twice (unformatted, then formatted). Now
  `.gen` writes to a staging file and the manifest target does one
  formatted atomic write.
- `biome_format.sh` silently skipped files in gitignored dirs. Added
  `--vcs-enabled=false`.

Two helpers reduce the Makefile boilerplate: `scripts/atomic_protoc.sh`
(wraps protoc) and an `atomic_write` Make define
(stdout-to-temp-to-target pattern). `.PRECIOUS` now also covers `.pb.go`
and mock files.

Verification: `make -B -j gen` x3 with `git status` polling, no changes.

Refs #22612
2026-03-05 22:32:18 +02:00
Mathias Fredriksson a6a8fd94d7 build(Makefile): enable parallel make -j gen with correct dependency graph (#22612)
`make gen` could not run with `-j` because inter-target dependency edges
were missing. Multiple recipes compile `coderd/rbac` (which includes
generated files like `object_gen.go`), and without explicit ordering,
parallel runs produced syntax errors from mid-write reads.

Three main changes:

**Dependency graph fixes** declare the compile-time chain through
`coderd/rbac` so that `object_gen.go` is written before anything that
imports it is compiled. The DB generation targets use a GNU Make 4.3+
grouped target (`&:`) so Make knows `generate.sh` co-produces
`querier.go`, `unique_constraint.go`, `dbmetrics`, and `dbauthz` in a
single invocation. `SKIP_DUMP_SQL=1` avoids re-entrant `make` inside
`generate.sh` when the Makefile already guarantees `dump.sql` is fresh.

**`scripts/atomicwrite` package** replaces `os.WriteFile` in all gen
scripts with a temp-file-in-same-dir + rename pattern, preventing
interrupted runs from leaving partial files.

**`.PRECIOUS` and shell atomic writes** protect git-tracked generated
files from Make's default delete-on-error behavior. Since these files
are committed, deletion is worse than staleness -- `git restore` is the
recovery path.

CI now runs `make -j --output-sync -B gen` (~32s, down from ~85s
serial).

| Scenario                          | Before             | After    |
|-----------------------------------|--------------------|----------|
| `make gen` (serial)               | 95s                | 95s      |
| `make -j gen` (parallel)          | race error         | **22s**  |
| CI `make -j --output-sync -B gen` | forced serial ~85s | **~32s** |
2026-03-05 11:58:10 +00:00