Commit Graph

13680 Commits

Author SHA1 Message Date
Garrett Delfosse 3462c31f43 fix: update directory for terraform-managed subagents (#24220)
When a devcontainer subagent is terraform-managed, the provisioner sets
its directory to the host-side `workspace_folder` path at build time. At
runtime, the agent injection code determines the correct
container-internal
path from `devcontainer read-configuration` and sends it via
`CreateSubAgent`.

However, the `CreateSubAgent` handler only updated `display_apps` for
pre-existing agents, ignoring the `Directory` field. This caused
SSH/terminal
sessions to land in `~` instead of the workspace folder (e.g.
`/workspaces/foo`).

Add `UpdateWorkspaceAgentDirectoryByID` query and call it in the
terraform-managed subagent update path to also persist the directory.

Fixes PLAT-118

<details><summary>Root cause analysis</summary>

Two code paths set the subagent `Directory` field:

1. **Provisioner (build time):** `insertDevcontainerSubagent` in
`provisionerdserver.go`
   stores `dc.GetWorkspaceFolder()` — the **host-side** path from the
   `coder_devcontainer` Terraform resource (e.g. `/home/coder/project`).

2. **Agent injection (runtime):**
`maybeInjectSubAgentIntoContainerLocked` in
`api.go` reads the devcontainer config and gets the correct
**container-internal**
path (e.g. `/workspaces/project`), then calls `client.Create(ctx,
subAgentConfig)`.

For terraform-managed subagents (those with `req.Id != nil`),
`CreateSubAgent`
in `coderd/agentapi/subagent.go` recognized the pre-existing agent and
entered
the update path — but only called `UpdateWorkspaceAgentDisplayAppsByID`,
discarding the `Directory` field from the request. The agent kept the
stale
host-side path, which doesn't exist inside the container, causing
`expandPathToAbs` to fall back to `~`.

</details>

> [!NOTE]
> Generated by Coder Agents
2026-04-10 10:11:22 -04:00
Ethan a0ea71b74c perf(site/src): optimistically edit chat messages (#23976)
Previously, editing a past user message in Agents chat waited for the
PATCH round-trip and cache reconciliation before the conversation
visibly settled. The edited bubble and truncated tail could briefly fall
back to older fetched state, and a failed edit did not restore the full
local editing context cleanly.

Keep history editing optimistic end-to-end: update the edited user
bubble and truncate the tail immediately, preserve that visible
conversation until the authoritative replacement message and cache catch
up, and restore the draft/editor/attachment state on failure. The route
already scopes each `agentId` to a keyed `AgentChatPage` instance with
its own store/cache-writing closures, so navigating between chats does
not need an extra post-await active-chat guard to keep one chat's edit
response out of another chat.
2026-04-10 23:40:49 +10:00
Cian Johnston 0a14bb529e refactor(site): convert OrganizationAutocomplete to fully controlled component (#24211)
Fixes https://github.com/coder/internal/issues/1440

- Convert `OrganizationAutocomplete` to a purely presentational, fully
controlled component
- Accept `value`, `onChange`, `options` from parent; remove internal
state, data fetching, and permission filtering
- Update `CreateTemplateForm` and `CreateUserForm` to own org fetching,
permission checks, auto-select, and invalid-value clearing inline
- Memoize `orgOptions` in callers for stable `useEffect` deps
- Rewrite Storybook stories for the new controlled API


> 🤖 Written by a Coder Agent. Reviewed by a human.
2026-04-10 13:56:43 +01:00
Danielle Maywood 2c32d84f12 fix: remove double bottom border on build logs table (#24000) 2026-04-10 13:50:36 +01:00
Jaayden Halko 76d89f59af fix(site): add bottom spacing for sources-only assistant messages (#24202)
Closes CODAGT-123

Assistant messages containing only source parts (no markdown or
reasoning)
were missing the bottom spacer that normally fills the gap left by the
hidden
action bar, causing them to sit flush against the next user bubble.

The existing fallback spacer guarded on `Boolean(parsed.reasoning)`, so
it
only fired for thinking-only replies. Replace that guard with the
broader
`hasRenderableContent` flag (which covers blocks, tools, and sources)
and
extract a named `needsAssistantBottomSpacer` boolean so future content
types
inherit consistent spacing without re-reading compound conditions.

Adds a `SourcesOnlyAssistantSpacing` Storybook story mirroring the
existing
`ThinkingOnlyAssistantSpacing` pattern for regression coverage.
2026-04-10 13:09:23 +01:00
Jaayden Halko 1a3a92bd1b fix: fix 4px layout shift on streaming commit in chat (#24203)
Closes CODAGT-124

When a streaming assistant response finishes and moves from the live
stream
tail into the conversation timeline, the message jumps 4px upward. This
happens because the outer layout wrapper and live-stream section both
used
`gap-3` (12px), while the committed-message list used `gap-2` (8px).

Unify all three containers to `gap-2` so the gap between messages stays
at 8px regardless of whether they're streaming or committed, eliminating
the layout shift.

A Storybook story with play-function assertions locks the invariant: it
renders both committed messages and an active stream, then verifies both
the outer and inner containers report `rowGap === "8px"`.
2026-04-10 13:09:03 +01:00
Jake Howell 4018320614 fix: resolve <WorkspaceTimings /> size (#24235) 2026-04-10 21:31:43 +10:00
Susana Ferreira d9700baa8d docs(docs/ai-coder): document AI Gateway Proxy private IP restrictions (#24209)
Documents the private/reserved IP range restrictions added to AI Gateway
Proxy:

- **Restricting proxy access**: Updated to reflect that private/reserved
IP ranges are now blocked by default, with atomic IP validation to
prevent DNS rebinding. Documents the Coder access URL exemption and the
`CODER_AIBRIDGE_PROXY_ALLOWED_PRIVATE_CIDRS` option.
- **Upstream proxy**: Added a note on the DNS rebinding limitation when
an upstream proxy is configured, and that upstream proxies should
enforce their own restrictions.

> [!NOTE]
> Initially generated by Coder Agents, modified and reviewed by
@ssncferreira

Follow-up: #23109
2026-04-10 12:09:14 +01:00
Jake Howell 82456ff62e feat: resolve useTime() thunk() error (#24234)
Fixes a regression introduced in #24060 that could crash the frontend.

`thunk` is created by `useEffectEvent()`, and React 19.2 enforces that
effect-event functions are not invoked during render. The previous code
called `thunk()` inside a `setState` updater function, and React
executes updater
functions during render, so this became an illegal render-phase call.

The fix computes `next` in the interval callback (`const next =
thunk()`) and then stores it via `setComputedValue(() => next)`. This
keeps the `useEffectEvent` call outside render and also preserves
correct behavior when `func` returns a function value, because React
stores `next` instead of treating it as a functional updater.
2026-04-10 10:45:18 +00:00
Faur Ioan-Aurel 83fd4cf5c2 fix: OAuth2 cancel button in the authorization page not working (#24058)
Go's html/template has a built-in security filter (urlFilter) that only
allows http, https, and mailto URL schemes. Any other scheme gets
replaced with #ZgotmplZ.

The OAuth2 app's callback URL uses custom URI scheme which the filter
considers unsafe. For example the Coder JetBrains plugin exposes a
callback URI with the scheme jetbrains:// - which was effectively
changed by the template engine into #ZgotmplZ. Of course this is not an
actual callback. When users clicked the cancel button nothing happened.

The fix was simple - we now wrap the apps registered callback URI into
htmltemplate.URL. Usually this needs some validation otherwise the
linter will complain about it. The callback URI used by the Cancel logic
is actually validated by our backend when the client app
programmatically registered via the dynamic OAuth2 registration
endpoints, so we refactored the validation around that code and re-used
some of it in the Cancel handling to make sure we don't allow URIs like
`javascript` and `data`, even though in theory these URIs were already
validated.

In addition, while testing this PR with
https://github.com/coder/coder-jetbrains-toolbox/pull/209 I discovered
that we are also not compliant with
https://www.rfc-editor.org/rfc/rfc6749#section-4.1.2.1 which requires
the server to attach the local state if it was provided by the client in
the original request. Also it is optional but generally a good practice
to include `error_description` in the error responses. In fact we follow
this pattern for the other types of error responses. So this is not a
one off.

- resolves #20323
<img width="1485" height="771" alt="Cancel_page_with_invalid_uri"
src="https://github.com/user-attachments/assets/5539d234-9ce3-4dda-b421-d023fc9aa99e"
/>
<img width="486" height="746" alt="Coder Toolbox handling the Cancel
button"
src="https://github.com/user-attachments/assets/acab71a6-d29c-4fa9-80ba-3c0095bbdc8f"
/>

<!--

If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.

-->
2026-04-10 12:49:22 +03:00
Danielle Maywood 38d4da82b9 refactor: send raw typed payloads over chat WebSockets (#24148) 2026-04-10 10:47:30 +01:00
Jaayden Halko 19e0e0e8e6 perf(site): split InlineMarkdown out of Markdown to avoid loading PrismJS in initial bundle (#24192)
\`InlineMarkdown\` and \`MemoizedInlineMarkdown\` lived in
\`Markdown.tsx\`
alongside a static \`import { Prism as SyntaxHighlighter } from
"react-syntax-highlighter"\` — the full PrismJS build with ~300 language
grammars. Because \`DashboardLayout\` eagerly imports
\`AnnouncementBannerView → InlineMarkdown\`, every authenticated page
loaded and evaluated the entire Prism/refractor bundle on startup even
though syntax highlighting is only used in secondary views.

This PR moves \`InlineMarkdown\` and \`MemoizedInlineMarkdown\` into
their
own \`InlineMarkdown.tsx\` file that depends only on \`react-markdown\`
and
updates all six consumers to import from the new module.
\`Markdown.tsx\`
keeps the PrismJS import for the full \`Markdown\` component, which is
only reached through lazy-loaded routes.

> 🤖 Generated by Coder Agents
2026-04-10 07:34:31 +01:00
Ehab Younes 1d0653cdab fix(cli): retry dial timeouts in SSH connection setup (#24199)
Reorder error checks in isRetryableError so IsConnectionError is evaluated before context.DeadlineExceeded. Dial timeouts (*net.OpError wrapping DeadlineExceeded) were incorrectly treated as non-retryable, causing Coder Connect to fail immediately on broken tunnels with valid DNS despite existing retry logic.

Fixes #24201
2026-04-10 00:55:16 +03:00
Zach 95cff8c5fb feat: add REST API handlers and client methods for user secrets (#24107)
Add the five REST endpoints for managing user secrets, SDK client
methods, and handler tests.

Endpoints:
- `POST /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets/{name}`
- `PATCH /api/v2/users/{user}/secrets/{name}`
- `DELETE /api/v2/users/{user}/secrets/{name}`

Routes are registered under the existing `/{user}` group with
`ExtractUserParam`. The delete query was changed from `:exec` to
`:execrows` so the handler can distinguish "not found" from success
(DELETE with `:exec` silently returns nil for zero affected rows).
2026-04-09 12:12:55 -06:00
Ethan ad2415ede7 fix: bump coder/tailscale to pick up RTM_MISS fix (#24187)
## What

Bumps `coder/tailscale` to
[`e956a95`](https://github.com/coder/tailscale/commit/e956a950740bd737c55451f56e77038f7430a919)
([PR #113](https://github.com/coder/tailscale/pull/113)) to pick up the
`RTM_MISS` fix for the Darwin network monitor.

Already released on `release/2.31` as v2.31.8. (#24185) to unblock a
customer. This PR is to update `main`.

## Why

On Darwin, `RTM_MISS` route-socket messages (fired on every failed route
lookup) were not filtered by `netmon`, causing each one to be treated as
a `LinkChange`. When netcheck sends STUN probes to an IPv6 address with
no route, this creates a self-sustaining feedback loop: `RTM_MISS` →
`LinkChange` → `ReSTUN` → netcheck → v6 STUN probe → `RTM_MISS` → …

The loop drives DERP home-region flapping at ~70× baseline, which at
fleet scale saturates PostgreSQL's `NOTIFY` lock and causes coordinator
health-check timeouts.

The upstream fix adds a single `if msg.Type == unix.RTM_MISS { return
true }` check to `skipRouteMessage`. This is safe because `RTM_MISS` is
a lookup-path signal, not a table-mutation signal — route withdrawals
always emit `RTM_DELETE` before any subsequent lookup can miss.

Of note is that this issue has only been reported recently, since users
updated to macOS 26.4.

Relates to ENG-2394
2026-04-09 13:22:56 -04:00
Cian Johnston 1e40cea199 feat: warn in CLI when server runs dev or RC builds (#24158)
Adds warning on stderr when the server version contains `-devel` or
`-rc.N`

> 🤖 Written by a Coder Agent. Will be reviewed by a human.
2026-04-09 12:48:35 -04:00
Kayla はな 9d6557d173 refactor(site): migrate some components from emotion to tailwind (#24182) 2026-04-09 10:33:01 -06:00
Kayla はな 224db483d7 refactor(site): remove mui from a few components (#24125) 2026-04-09 10:02:26 -06:00
Yevhenii Shcherbina 8237822441 feat: byok observability api (#24207)
## Summary
Exposes `credential_kind` and `credential_hint` on AI Bridge session
threads, making credential metadata visible in the session detail API.
   
Each thread in the `/api/v2/aibridge/sessions/{session_id}` response now
includes:
- `credential_kind`: `centralized` or `byok`
- `credential_hint`: masked credential (e.g. `sk-a...pgAA`)
Values are taken from the thread's root interception.
## Changes

- `codersdk/aibridge.go`: Added `CredentialKind` and `CredentialHint`
fields to `AIBridgeThread`
- `coderd/database/db2sdk/db2sdk.go`: Populated from root interception
in `buildAIBridgeThread`
  - `SessionTimeline.stories.tsx`: Added fields to mock thread data
2026-04-09 11:41:17 -04:00
Ethan 65bf7c3b18 fix(coderd/x/chatd/chatloop): stabilize startup-timeout tests with quartz (#24193)
The startup-timeout integration tests in `chatloop` used a 5ms real-time
budget and relied on wall-clock scheduling to fire the startup guard
timer before the first stream part arrived. On loaded CI runners the
timer sometimes lost the race, producing `attempts == 2` instead of
`attempts == 1` and flaking `TestRun_FirstPartDisarmsStartupTimeout`.

Replace the real `time.Timer` in `startupGuard` with a `quartz.Timer` so
tests can control time deterministically. Production behavior is
unchanged: `RunOptions.Clock` defaults to `quartz.NewReal()` when nil,
and the startup timeout still covers both opening the provider stream
and waiting for the first stream part.

- Add `RunOptions.Clock quartz.Clock` with nil-safe default.
- Tag the startup guard timer as `"startupGuard"` for quartz trap
targeting.
- Rewrite the four startup-timeout integration tests to use
`quartz.NewMock(t)` with trap/advance/release sequences instead of
wall-clock sleeps.
- Add `awaitRunResult` helper so tests fail with a clear message instead
of hanging when `Run` does not complete.

Closes https://github.com/coder/internal/issues/1460
2026-04-10 00:40:09 +10:00
Garrett Delfosse 76cbc580f0 ci: add cherry-pick PR check for release branches (#24121)
Adds a GitHub Actions workflow that runs on PRs targeting `release/*`
branches to flag non-bug-fix cherry-picks.

## What it does

- Triggers on `pull_request_target` (opened, reopened, edited) for
`release/*` branches
- Checks if the PR title starts with `fix:` or `fix(scope):`
(conventional commit format)
- If not a bug fix, comments on the PR informing the author and emits a
warning (via `core.warning`), but does **not** fail the check
- Deduplicates comments on title edits by updating an existing comment
(identified by a hidden HTML marker) instead of creating a new one

> [!NOTE]
> Generated by Coder Agents
2026-04-09 10:37:56 -04:00
Kyle Carberry 391b22aef7 feat: add CLI commands for managing chat context from workspaces (#24105)
Adds `coder exp chat context add` and `coder exp chat context clear`
commands that run inside a workspace to manage chat context files via
the agent token.

`add` reads instruction and skill files from a directory (defaulting to
cwd) and inserts them as context-file messages into an active chat.
Multiple calls are additive — `instructionFromContextFiles` already
accumulates all context-file parts across messages.

`clear` soft-deletes all context-file messages, causing
`contextFileAgentID()` to return `!found` on the next turn, which
triggers `needsInstructionPersist=true` and re-fetches defaults from the
agent.

Both commands auto-detect the target chat via `CODER_CHAT_ID` (already
set by `agentproc` on chat-spawned processes), or fall back to
single-active-chat resolution for the agent. The `--chat` flag overrides
both.

Also adds sub-agent context inheritance: `createChildSubagentChat` now
copies parent context-file messages to child chats at spawn time, so
delegated sub-agents share the same instruction context without
independently re-fetching from the workspace agent.

<details><summary>Implementation details</summary>

**New files:**
- `cli/exp_chat.go` — CLI command tree under `coder exp chat context`

**Modified files:**
- `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context
from an arbitrary directory without env vars
- `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext`
SDK methods
- `coderd/workspaceagents.go` — POST/DELETE handlers on
`/workspaceagents/me/chat-context`
- `coderd/coderd.go` — Route registration
- `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`,
`SoftDeleteContextFileMessages`
- `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new
queries
- `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent
inheritance
- `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()`

**Auth pattern:** Uses `AgentAuth` (same as `coder external-auth`) —
agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars.

</details>

> 🤖 Generated by Coder Agents

---------

Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>
2026-04-09 16:33:00 +02:00
Michael Suchacz f8e8f979a2 chore(Makefile): use go build -o for helper binaries to reduce GOCACHE growth (#24197)
## Problem

`go run` caches the final linked executable in `~/.cache/go-build`.
Every
helper invocation via `go run ./scripts/<tool>` stores a copy, and
because
the cache key includes build metadata, the same tool accumulates
multiple
cached executables over time. With 12+ helper binaries invoked during
`make gen` and `make pre-commit`, this is a meaningful contributor to
GOCACHE growth.

## Fix

Replace `go run` with `go build -o _gen/bin/<tool>` for 12 repo-local
helper packages (16 Makefile callsites). Each helper is an explicit Make
file target with `$(wildcard *.go)` prerequisites, so `make -j`
serializes
builds correctly instead of racing on shared output paths.

Helpers converted: `apitypings`, `auditdocgen`, `check-scopes`,
`clidocgen`, `dbdump`, `examplegen`, `gensite`, `apikeyscopesgen`,
`metricsdocgen`, `metricsdocgen-scanner`, `modeloptionsgen`, `typegen`.

Left on `go run` (intentionally): `migrate-ci` and `migrate-test`
(CI/test-only, not on common developer paths).

`_gen/` is already in `.gitignore`. The `clean` target removes
`_gen/bin`.

## GOCACHE growth (isolated cache, single `make gen`)

|  | Old (`go run`) | New (`go build -o`) |
|--|----------------|---------------------|
| Total cache size | 2.9 GB | 2.6 GB |
| Cached executables | 11 | 4 |
| Executable bytes | 401 MB | 25 MB |

The 4 remaining executables come from tools outside this change
(`dbgen` and `goimports` from `generate.sh`, plus two `main` binaries
from deferred helpers). Helper binaries now live in `_gen/bin/`
(581 MB, gitignored, cleaned by `make clean`).

## Build time benchmarks

**Source changed** (content hash invalidated, forces recompile):

| Helper | `go run` | `go build -o` + run | Overhead |
|--------|---------|---------------------|----------|
| typegen | 1.50s | 2.03s | +0.52s |
| examplegen | 1.37s | 1.67s | +0.30s |
| apikeyscopesgen | 1.21s | 1.71s | +0.50s |
| modeloptionsgen | 1.23s | 1.64s | +0.41s |

**Repeat invocation** (no source change, the common `make gen` / `make
pre-commit` path):

| Helper | `go run` (cache lookup) | Cached binary | Speedup |
|--------|------------------------|---------------|---------|
| typegen | 0.346s | 0.037s | 9.4x |
| examplegen | 0.368s | 0.037s | 9.9x |
| modeloptionsgen | 0.342s | 0.021s | 16.3x |
| apikeyscopesgen | 0.298s | 0.030s | 9.9x |

When source changes, `go build -o` is 0.3-0.5s slower per helper (it
writes a local binary instead of caching in GOCACHE). On repeat runs
(the common path), the pre-built binary is 10-16x faster because
`go run` still does a staleness check while the binary just executes.

> This PR was authored by Mux on behalf of Mike.
2026-04-09 16:04:06 +02:00
Jeremy Ruppel fb0ed1162b fix(site): replace expandable agentic loop section with cool design (#24171)
the current page has an "Agentic loop completed" block that doesn't
really contain any valuable info that isn't available elsewhere. replace
this with a status indicator
 
<img width="507" height="300" alt="Screenshot 2026-04-08 at 2 47 40 PM"
src="https://github.com/user-attachments/assets/09cf3772-a52d-485d-a15e-b2257b2d9003"
/>
2026-04-09 09:18:19 -04:00
Jeremy Ruppel 3f519744aa fix(site): use locale string for token usage tooltip (#24177)
quality of life improvement

<img width="353" height="291" alt="Screenshot 2026-04-08 at 5 04 55 PM"
src="https://github.com/user-attachments/assets/f1165b03-c82d-4135-97a5-ce04ec7c41c0"
/>
2026-04-09 08:59:09 -04:00
Jeremy Ruppel 2505f6245f fix(site): request logs and sessions page UI consistency (#24163)
couple of little design tweaks to make the UI of the Request Logs page
and Sessions pages more consistent:

- decrease size of Request Logs page chevron
- copy Request Logs page chevron animation for Sessions expandable
sections
- use TokenBadges component in RequestLogsRow 
- wrap tool call counts in badges

<img width="1393" height="210" alt="Screenshot 2026-04-08 at 1 56 10 PM"
src="https://github.com/user-attachments/assets/97e7acb6-71c7-48d6-b0df-a102c7602cc0"
/>
2026-04-09 08:52:32 -04:00
Danielle Maywood 29ad2c6201 feat: merge Limits + Usage into unified Spend page (#24093) 2026-04-09 13:17:03 +01:00
Cian Johnston 27e5ff0a8e chore: update to our fork of charm.land/fantasy with appendCompact perf improvement (#24142)
Fixes CODAGT-117

Updates go.mod to reference our forks of the following dependencies:
* charmbracelet/anthropic-sdk-go =>
https://github.com/coder/anthropic-sdk-go/tree/coder_2_33
* charm.land/fantasy => https://github.com/coder/fantasy/tree/coder_2_33
2026-04-09 13:08:19 +01:00
Hugo Dutka 128a7c23e6 feat(site): agents desktop recording thumbnail frontend (#24023)
Frontend for https://github.com/coder/coder/pull/24022.

From that PR's description: 

> The agents chat interface displays thumbnails for videos recorded by
the computer use agent. Currently, to display a thumbnail, the frontend
downloads the entire video and shows the first frame.

#24022 adds a thumbnail file id to `wait_agent` tool results, and this
PR displays it instead of fetching the entire video.
2026-04-09 11:55:40 +00:00
Hugo Dutka efb19eb748 feat: agents desktop recording thumbnail backend (#24022)
The agents chat interface displays thumbnails for videos recorded by the
computer use agent. Currently, to display a thumbnail, the frontend
downloads the entire video and shows the first frame. This PR starts
storing a new thumbnail file in the database for every recorded video,
and exposes the file id in the `wait_agent` tool result alongside the
recording file id, so the frontend can fetch just the thumbnail.
2026-04-09 13:47:54 +02:00
Garrett Delfosse 2c499484b7 ci: attribute cherry-pick/backport PRs to the requesting user (#24195)
The cherry-pick and backport workflows create PRs under
`github-actions[bot]`. Since GitHub doesn't support creating PRs on
behalf of another user, this adds attribution to the user who added the
label (`github.event.sender.login`):

- **Assignee**: the labeler is assigned to the backport PR
- **Reviewer**: the labeler is added as a reviewer
- **PR body**: includes "Requested by: @user"

Applied to both `cherry-pick.yaml` and `backport.yaml`.

---

> Generated by Coder Agents
2026-04-09 07:44:58 -04:00
Hugo Dutka 33d9d0d875 feat(site): hide agents desktop tab when workspace is stopped (#24191)
Hide the agents desktop tab when the workspace tab is stopped. This
matches the terminal tab's behavior.
2026-04-09 10:51:26 +00:00
Ethan f219834f5c perf(site): add reconnect jitter to reconnectingWebsocket (#24096)
## Motivation

During the April 2 dogfood incident, a pod OOM-kill triggered a
reconnection storm: hundreds of chat-stream and agent-RPC websockets all
attempted to reconnect at the same deterministic backoff intervals (1 s,
2 s, 4 s, …). Because every browser tab computed the same delay, the
surviving replicas received a synchronized wall of new connections at
each retry tick, amplifying the overload that caused the first OOM in
the first place.

The root cause of the memory blowup (chatd serialization cost) is a
separate issue. This change addresses the secondary blast-radius
problem: when N clients reconnect in lockstep, the retry storm itself
becomes a capacity threat.

## Change

The shared `createReconnectingWebSocket` utility now applies symmetric
jitter (default ±30%) to the capped exponential-backoff delay before
scheduling the reconnect timer. With 100 clients and a 1 s base delay,
reconnects spread over the 700 ms–1300 ms window instead of all landing
at exactly 1000 ms, and once retries hit `maxMs` the scheduler still
preserves downward spread instead of collapsing back to a single tick.

Two new options are accepted by callers:

- **`jitter`** (0–1 fraction, default `0.3`) — controls the jitter
window. Values are clamped to `[0, 1]`; `0` preserves exact legacy
timing.
- **`random`** (`() => number`, default `Math.random`) — injectable RNG,
primarily a deterministic test seam. Non-finite output falls back to the
midpoint (`0.5`).

The `retryingAt` timestamp surfaced to `ChatStatusCallout` is computed
from the jittered delay, so the countdown shown to users reflects the
actual retry time. The scheduler also keeps `maxMs` as a hard ceiling on
the final delay and saturates exponential overflow at that cap instead
of dropping to `0ms` retries.

No production callers need changes — the default jitter activates
automatically for all four call sites (`AgentsPage` chat-list watcher,
`AgentChatPage` workspace watcher, `useChatStore` per-chat stream,
`useGitWatcher`). The two downstream tests that asserted exact reconnect
timing now pin `Math.random()` to `0.5` so those expectations stay
deterministic.
2026-04-09 20:31:37 +10:00
Danny Kopping 7a94a683c4 docs: rename AI Bridge to AI Gateway and Agent Boundaries to Agent Firewall (#24094)
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6*

## Summary

Renames product references across documentation:

| Old Name | New Name |
|----------|----------|
| AI Bridge | AI Gateway |
| AI Bridge Proxy | AI Gateway Proxy |
| Agent Boundaries | Agent Firewall |

## What changed

- Prose text, headings, titles, and descriptions updated across all docs
- Directories renamed:
  - `docs/ai-coder/ai-bridge/` → `docs/ai-coder/ai-gateway/`
- `docs/ai-coder/ai-bridge/ai-bridge-proxy/` →
`docs/ai-coder/ai-gateway/ai-gateway-proxy/`
  - `docs/ai-coder/agent-boundaries/` → `docs/ai-coder/agent-firewall/`
- All internal markdown links updated to new paths
- `manifest.json` route paths updated
- Rename notice added to AI Gateway and Agent Firewall entrypoint pages

## Companion PR

URL redirects (old paths → new paths):
[coder/coder.com#700](https://github.com/coder/coder.com/pull/700)

## What is intentionally NOT changed

- **Env vars**: `CODER_AIBRIDGE_*`
- **CLI flags**: `--aibridge-*`
- **API paths**: `/api/v2/aibridge/*`
- **Config keys**: `aibridge:` YAML blocks
- **Terraform variables**: `enable_aibridge`, `boundary_version`,
`use_boundary_directly`
- **Process names**: `aibridged`, `aibridgeproxyd`
- **Prometheus metrics**: `coder_aibridged_*`, `coder_aibridgeproxyd_*`
- **SDK types**: `codersdk.AIBridge*`
- **GitHub URLs**: `github.com/coder/aibridge`
- **Image paths**: `images/aibridge/`
- **Auto-generated reference docs**: `docs/reference/cli/aibridge*.md`,
`docs/reference/api/aibridge.md`, `docs/reference/api/schemas.md`
- **Frontend code**: `site/src/` references (separate PR)

Code-level renames (env vars, configs, frontend) are planned for a
follow-up PR.
2026-04-09 10:07:50 +00:00
Jake Howell 2e6fdf2344 fix: resolve <Badge /> incorrect sizes (#22539)
This pull-request makes a few changes to our `<Badge />` component to
bring it inline with Figma.

* Added all variants to the stories of Figma (they can vary per
badge-type, so its better we track everything).
* Removed the `border` variant of the component, border variants should
be on all `sm` and `md`.
* Added a hover effect to the `default` variant (per-design).
* Resolved issue with sizings of `xs` and `sm` plus resolved
iconography.
* Resolved issue with icons not showing at all on `xs` variants.
2026-04-09 19:55:59 +10:00
Danielle Maywood 3d139c1a24 refactor(site): replace !! with Boolean() for boolean coercion (#24180) 2026-04-09 10:48:54 +01:00
Jaayden Halko f957981c8b fix(site): add padding below thinking-only assistant messages (#24140)
closes CODAGT-122 

Add a spacer div that renders only when an assistant message lacks the
action bar, matching the height the action bar would provide.

> 🤖 Generated by Coder Agents
2026-04-09 10:17:51 +01:00
Atif Ali 584c61acb5 fix: mark connecting agents as unhealthy instead of healthy (#24044)
## Problem

Workspaces showed as "Healthy" immediately after creation while the
agent was still downloading, starting, or connecting. If the agent never
connected, the workspace stayed "Healthy" for the entire connection
timeout (~120s), then abruptly flipped to "Unhealthy".

## Root cause

In `db2sdk.WorkspaceAgent`, the health switch had no case for
`WorkspaceAgentConnecting`. Agents in `connecting` status with a
non-`off` lifecycle (e.g. `created` after a fresh build) fell through to
the `default` case and were marked `Healthy = true`.

## Fix

Add an explicit case for `WorkspaceAgentConnecting` that sets `Healthy =
false` with reason `"agent has not yet connected"`. The case is placed
after the existing `!connected + off` case (which correctly catches
stopped agents as "not running") and before the `timeout`/`disconnected`
cases.

```
Status        + Lifecycle       → Health reason
──────────────────────────────────────────────────────
any !connected + off           → "agent is not running"
connecting    + created/starting → "agent has not yet connected"  ← NEW
timeout       + any            → "agent is taking too long to connect"
disconnected  + any            → "agent has lost connection"
connected     + start_error    → "agent startup script exited with an error"
connected     + shutting_down  → "agent is shutting down"
connected     + ready/starting → healthy
```

The frontend already handles this case — `getAgentHealthIssue()` returns
"Workspace agent is still connecting" with `severity: "info"` for
unhealthy workspaces with connecting agents.

## Test changes

- **Healthy test**: now actually connects the agent via `agenttest.New`
before asserting health (previously passed due to the bug).
- **New Connecting test**: verifies a never-connected agent is correctly
marked unhealthy.
- **Mixed health test**: connects a1 and waits for the mixed state
(`a1.Healthy && !workspace.Healthy`) to avoid a race where both agents
are initially connecting.
- **Sub-agent excluded test**: connects the parent agent and waits for
it to be healthy before creating the sub-agent.
- **TestWorkspaceAgent/Connect**: flipped assertion to `Health.Healthy
== false` for a `dbfake` agent that never connects.

<details>
<summary>Review notes</summary>

### Known follow-up

The `healthy:false` workspace search filter maps to `[disconnected,
timeout]` and does not include `connecting`. This is a pre-existing gap
that is now more consequential — a workspace unhealthy solely due to a
connecting agent won't appear in `healthy:false` results. Worth a
follow-up issue.

### Deep review findings addressed

| Finding | Severity | Status |
|---------|----------|--------|
| Mixed health test race (all 3 reviewers) | P2 | Fixed — tightened
`Eventually` condition |
| `TestWorkspaceAgent/Connect` assertion break | P1 | Fixed — flipped
assertion |
| CLI renders red for connecting agents | Obs | Acknowledged — design
trade-off, accurate but visually strong for transient state |
| Switch case ordering overlap | Obs | Documented with inline comment |

</details>

> 🤖 This PR was created with the help of Coder Agents, and needs a human
review. 🧑💻
2026-04-09 13:21:28 +05:00
code-qtzl f95a5202bf fix(site/src/pages/CreateWorkspacePage): replace Tooltip with HelpPop… (#24057)
Replace Tooltip with `HelpPopover` in the "New workspace" page header.
`HelpPopover` supports interactive content like links and provides
better layout control, making it a better fit for this use case.
2026-04-09 15:34:29 +10:00
Matt Vollmer d954460380 docs: rename "Security implications" to "Security posture" (#24181)
Renames the "Security implications" section to "Security posture" and
reframes the intro paragraph. "Implications" reads as a caveat or
warning; the section actually describes built-in structural guarantees
of the control plane architecture.

> PR generated with Coder Agents
2026-04-08 19:55:56 -04:00
dylanhuff-at-coder f4240bb8c1 fix: sanitize workspace agent logs before insert (#24028)
Workspace agent logs could still fail after the earlier invalid UTF-8
fix because NUL bytes are valid Go/protobuf strings but are rejected by
Postgres text columns. The legacy HTTP log upload path also bypassed the
old sanitization entirely, and both server insert paths computed
logs_length from the unsanitized input.

Add a shared log-output sanitizer in agentsdk, use it in the protobuf
conversion path and both server-side insert paths, and compute
OutputLength from the sanitized string so overflow accounting matches
what is actually stored. This keeps the old invalid UTF-8 behavior while
also handling embedded NUL bytes consistently across DRPC and HTTP log
ingestion.

Refs [#23292 ](https://github.com/coder/coder/issues/23292)
Refs [#13433 ](https://github.com/coder/coder/issues/13433)
2026-04-08 16:29:38 -07:00
Zach 7caef4987f feat: add input validation for user secret env names and file paths (#24103)
Adds backend validation for user secret environment variable names and file paths.

Env name validation enforces POSIX naming rules and blocks a deliberately aggressive denylist of reserved names and prefixes. The denylist errs on the side of blocking too much since it's easier to remove entries later than to add them after users have created conflicting secrets.

File path validation requires paths to start with ~/ or /.
2026-04-08 17:02:33 -06:00
Zach 9b91af8ab7 feat: add user secrets SDK types and db2sdk converters (#24102)
Adds the SDK types and database-to-SDK conversion helpers for the user secrets feature.
2026-04-08 16:48:41 -06:00
Matt Vollmer 506fba9ebf docs: add BYOK docs, fix tool tables, add platform controls (#24178)
Fixes several documentation gaps and inaccuracies in the Coder Agents
docs identified during a deep review against the current product state.

## BYOK (User API Keys)

`models.md` stated *"Developers cannot add their own providers, models,
or API keys"* — this has been incorrect since the provider key policy
system shipped (Apr 2, #23751/#23781).

- Added **Key policy** section documenting the three admin toggles
(`central_api_key_enabled`, `allow_user_api_key`,
`allow_central_api_key_fallback`) with a truth table showing all
resolution outcomes
- Added **User API keys (BYOK)** section covering the developer-facing
key management page, status indicators, selection priority, and key
removal
- Updated `platform-controls/index.md` to reference BYOK instead of
claiming keys are admin-only

## Reasoning effort enum fixes

- **OpenAI**: removed `none` — code accepts `minimal, low, medium, high,
xhigh`
- **OpenRouter**: narrowed to `low, medium, high` per
`ReasoningEffortFromChat` in `chatprovider.go`

## Tool table completeness

- Added `spawn_computer_use_agent`, `read_skill`, `read_skill_file` to
`index.md` tool table
- Added "Workspace extension tools" section to `architecture.md` for
`read_skill`/`read_skill_file`
- Fixed orchestration restriction note to list all 5 gated tools instead
of just `spawn_agent`
- Added conditional availability notes for desktop and skills tools

## Platform controls

Three admin-only settings existed in the Behavior tab with no
documentation:

- **Virtual desktop** — admin toggle, Anthropic + portabledesktop
requirements
- **Workspace autostop fallback** — default TTL for agent workspaces
without template-defined autostop
- **Data retention** — moved `chat-retention.md` into
`platform-controls/` since it's admin-only, fixed nav path

---

> PR generated with Coder Agents
2026-04-08 18:24:12 -04:00
Cian Johnston 461a31e5d8 feat(site): add under-construction navbar stripes for pre-release builds (#24157)
Dev and RC builds now show diagonal warning stripes in the navbar plus a
centered version badge, making it impossible to miss which build you're
running.

**Devel build:** amber "warning" from theme

**RC build:** sky "pending" from theme

> 🤖 Written by a Coder Agent. Will be reviewed by a human.
2026-04-08 20:10:03 +00:00
Carlo Field e3a0dcd6fc feat: add httproute for K8s Gateway API (#23501)
<!--

If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.

-->
No AI was used to generate this PR.

Adds support for [Gateway API
HTTPRoutes](https://gateway-api.sigs.k8s.io/api-types/httproute/) as an
alternative to Ingress.

---------

Signed-off-by: Carlo Field <carlo@swiss.dev>
Co-authored-by: bpmct <bpmct@users.noreply.github.com>
Co-authored-by: Ben Potter <ben@coder.com>
2026-04-08 14:59:17 -05:00
Danielle Maywood 12ada0115f fix(site): move pagination test from vitest to storybook story (#24165) 2026-04-08 20:56:53 +01:00
Cian Johnston 7b0421d8c6 fix: revert auto-assign agents-access role enabled (#24170)
This reverts commit d4a9c63e91 (#23968).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-08 20:56:17 +01:00
Hugo Dutka 477d6d0cde fix(site): fix agents right panel layout on small landscape viewports (#24161)
Currently, when you're using Agents on mobile with a vertical viewport
and you open the sidebar, the sidebar takes up the entire screen. That's
great, since there isn't enough space to show the other tabs. But when
you tilt your phone to horizontal mode, all 3 tabs show up, and none of
them are very legible:


https://github.com/user-attachments/assets/50a54791-fe53-4a5d-ba7b-85e82f970851

This PR makes it so that the right sidebar takes up the entire screen on
small viewports (<1024px) in horizontal mode too.



https://github.com/user-attachments/assets/a06069df-9f2f-42bd-8072-a237434434e5
2026-04-08 20:01:59 +02:00
Jeremy Ruppel de61ac529d fix(site): scroll when request logs tool call is huge (#24162)
**Disclaimer: I've never encountered this on dogfood, only on my local
where Claude likes to do really long tool calls**

On the Request Logs page, if a tool call has super long lines, it will
break the row layout:


https://github.com/user-attachments/assets/fd1a8be0-7912-4611-a1c3-0c7943b1ea52

This adds stories to demonstrate the behavior, and then a lil overflow x
auto action for the fix


https://github.com/user-attachments/assets/f0fd94da-8254-4330-a718-08599909e8ec
2026-04-08 13:53:43 -04:00