coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Garrett Delfosse	3462c31f43	fix: update directory for terraform-managed subagents (#24220 ) When a devcontainer subagent is terraform-managed, the provisioner sets its directory to the host-side `workspace_folder` path at build time. At runtime, the agent injection code determines the correct container-internal path from `devcontainer read-configuration` and sends it via `CreateSubAgent`. However, the `CreateSubAgent` handler only updated `display_apps` for pre-existing agents, ignoring the `Directory` field. This caused SSH/terminal sessions to land in `~` instead of the workspace folder (e.g. `/workspaces/foo`). Add `UpdateWorkspaceAgentDirectoryByID` query and call it in the terraform-managed subagent update path to also persist the directory. Fixes PLAT-118 <details><summary>Root cause analysis</summary> Two code paths set the subagent `Directory` field: 1. Provisioner (build time): `insertDevcontainerSubagent` in `provisionerdserver.go` stores `dc.GetWorkspaceFolder()` — the host-side path from the `coder_devcontainer` Terraform resource (e.g. `/home/coder/project`). 2. Agent injection (runtime): `maybeInjectSubAgentIntoContainerLocked` in `api.go` reads the devcontainer config and gets the correct container-internal path (e.g. `/workspaces/project`), then calls `client.Create(ctx, subAgentConfig)`. For terraform-managed subagents (those with `req.Id != nil`), `CreateSubAgent` in `coderd/agentapi/subagent.go` recognized the pre-existing agent and entered the update path — but only called `UpdateWorkspaceAgentDisplayAppsByID`, discarding the `Directory` field from the request. The agent kept the stale host-side path, which doesn't exist inside the container, causing `expandPathToAbs` to fall back to `~`. </details> > [!NOTE] > Generated by Coder Agents	2026-04-10 10:11:22 -04:00
Ethan	a0ea71b74c	perf(site/src): optimistically edit chat messages (#23976 ) Previously, editing a past user message in Agents chat waited for the PATCH round-trip and cache reconciliation before the conversation visibly settled. The edited bubble and truncated tail could briefly fall back to older fetched state, and a failed edit did not restore the full local editing context cleanly. Keep history editing optimistic end-to-end: update the edited user bubble and truncate the tail immediately, preserve that visible conversation until the authoritative replacement message and cache catch up, and restore the draft/editor/attachment state on failure. The route already scopes each `agentId` to a keyed `AgentChatPage` instance with its own store/cache-writing closures, so navigating between chats does not need an extra post-await active-chat guard to keep one chat's edit response out of another chat.	2026-04-10 23:40:49 +10:00
Cian Johnston	0a14bb529e	refactor(site): convert OrganizationAutocomplete to fully controlled component (#24211 ) Fixes https://github.com/coder/internal/issues/1440 - Convert `OrganizationAutocomplete` to a purely presentational, fully controlled component - Accept `value`, `onChange`, `options` from parent; remove internal state, data fetching, and permission filtering - Update `CreateTemplateForm` and `CreateUserForm` to own org fetching, permission checks, auto-select, and invalid-value clearing inline - Memoize `orgOptions` in callers for stable `useEffect` deps - Rewrite Storybook stories for the new controlled API > 🤖 Written by a Coder Agent. Reviewed by a human.	2026-04-10 13:56:43 +01:00
Danielle Maywood	2c32d84f12	fix: remove double bottom border on build logs table (#24000 )	2026-04-10 13:50:36 +01:00
Jaayden Halko	76d89f59af	fix(site): add bottom spacing for sources-only assistant messages (#24202 ) Closes CODAGT-123 Assistant messages containing only source parts (no markdown or reasoning) were missing the bottom spacer that normally fills the gap left by the hidden action bar, causing them to sit flush against the next user bubble. The existing fallback spacer guarded on `Boolean(parsed.reasoning)`, so it only fired for thinking-only replies. Replace that guard with the broader `hasRenderableContent` flag (which covers blocks, tools, and sources) and extract a named `needsAssistantBottomSpacer` boolean so future content types inherit consistent spacing without re-reading compound conditions. Adds a `SourcesOnlyAssistantSpacing` Storybook story mirroring the existing `ThinkingOnlyAssistantSpacing` pattern for regression coverage.	2026-04-10 13:09:23 +01:00
Jaayden Halko	1a3a92bd1b	fix: fix 4px layout shift on streaming commit in chat (#24203 ) Closes CODAGT-124 When a streaming assistant response finishes and moves from the live stream tail into the conversation timeline, the message jumps 4px upward. This happens because the outer layout wrapper and live-stream section both used `gap-3` (12px), while the committed-message list used `gap-2` (8px). Unify all three containers to `gap-2` so the gap between messages stays at 8px regardless of whether they're streaming or committed, eliminating the layout shift. A Storybook story with play-function assertions locks the invariant: it renders both committed messages and an active stream, then verifies both the outer and inner containers report `rowGap === "8px"`.	2026-04-10 13:09:03 +01:00
Jake Howell	4018320614	fix: resolve `<WorkspaceTimings />` size (#24235 )	2026-04-10 21:31:43 +10:00
Susana Ferreira	d9700baa8d	docs(docs/ai-coder): document AI Gateway Proxy private IP restrictions (#24209 ) Documents the private/reserved IP range restrictions added to AI Gateway Proxy: - Restricting proxy access: Updated to reflect that private/reserved IP ranges are now blocked by default, with atomic IP validation to prevent DNS rebinding. Documents the Coder access URL exemption and the `CODER_AIBRIDGE_PROXY_ALLOWED_PRIVATE_CIDRS` option. - Upstream proxy: Added a note on the DNS rebinding limitation when an upstream proxy is configured, and that upstream proxies should enforce their own restrictions. > [!NOTE] > Initially generated by Coder Agents, modified and reviewed by @ssncferreira Follow-up: #23109	2026-04-10 12:09:14 +01:00
Jake Howell	82456ff62e	feat: resolve `useTime()` `thunk()` error (#24234 ) Fixes a regression introduced in #24060 that could crash the frontend. `thunk` is created by `useEffectEvent()`, and React 19.2 enforces that effect-event functions are not invoked during render. The previous code called `thunk()` inside a `setState` updater function, and React executes updater functions during render, so this became an illegal render-phase call. The fix computes `next` in the interval callback (`const next = thunk()`) and then stores it via `setComputedValue(() => next)`. This keeps the `useEffectEvent` call outside render and also preserves correct behavior when `func` returns a function value, because React stores `next` instead of treating it as a functional updater.	2026-04-10 10:45:18 +00:00
Faur Ioan-Aurel	83fd4cf5c2	fix: OAuth2 cancel button in the authorization page not working (#24058 ) Go's html/template has a built-in security filter (urlFilter) that only allows the http, https, and mailto URL schemes. Any other scheme gets replaced with #ZgotmplZ. The OAuth2 app's callback URL uses a custom URI scheme, which the filter considers unsafe. For example, the Coder JetBrains plugin exposes a callback URI with the scheme jetbrains://, which the template engine rewrote to #ZgotmplZ. That is not a usable callback, so nothing happened when users clicked the cancel button. The fix is simple: we now wrap the app's registered callback URI in htmltemplate.URL. This usually needs some validation, otherwise the linter complains. The callback URI used by the Cancel logic is already validated by our backend when the client app registers programmatically via the dynamic OAuth2 registration endpoints, so we refactored the validation around that code and reused part of it in the Cancel handling to make sure we don't allow URIs like `javascript` and `data`, even though in theory these URIs were already validated. In addition, while testing this PR with https://github.com/coder/coder-jetbrains-toolbox/pull/209 I discovered that we are also not compliant with https://www.rfc-editor.org/rfc/rfc6749#section-4.1.2.1, which requires the server to include the `state` parameter in the error response if the client provided it in the original request. Including `error_description` in error responses is optional but generally good practice, and we already follow this pattern for the other types of error responses, so this is not a one-off. 
- resolves #20323 <img width="1485" height="771" alt="Cancel_page_with_invalid_uri" src="https://github.com/user-attachments/assets/5539d234-9ce3-4dda-b421-d023fc9aa99e" /> <img width="486" height="746" alt="Coder Toolbox handling the Cancel button" src="https://github.com/user-attachments/assets/acab71a6-d29c-4fa9-80ba-3c0095bbdc8f" />	2026-04-10 12:49:22 +03:00
Danielle Maywood	38d4da82b9	refactor: send raw typed payloads over chat WebSockets (#24148 )	2026-04-10 10:47:30 +01:00
Jaayden Halko	19e0e0e8e6	perf(site): split InlineMarkdown out of Markdown to avoid loading PrismJS in initial bundle (#24192 ) `InlineMarkdown` and `MemoizedInlineMarkdown` lived in `Markdown.tsx` alongside a static `import { Prism as SyntaxHighlighter } from "react-syntax-highlighter"`, the full PrismJS build with ~300 language grammars. Because `DashboardLayout` eagerly imports `AnnouncementBannerView → InlineMarkdown`, every authenticated page loaded and evaluated the entire Prism/refractor bundle on startup even though syntax highlighting is only used in secondary views. This PR moves `InlineMarkdown` and `MemoizedInlineMarkdown` into their own `InlineMarkdown.tsx` file that depends only on `react-markdown` and updates all six consumers to import from the new module. `Markdown.tsx` keeps the PrismJS import for the full `Markdown` component, which is only reached through lazy-loaded routes. > 🤖 Generated by Coder Agents	2026-04-10 07:34:31 +01:00
Ehab Younes	1d0653cdab	fix(cli): retry dial timeouts in SSH connection setup (#24199 ) Reorder error checks in isRetryableError so IsConnectionError is evaluated before context.DeadlineExceeded. Dial timeouts (*net.OpError wrapping DeadlineExceeded) were incorrectly treated as non-retryable, causing Coder Connect to fail immediately on broken tunnels with valid DNS despite existing retry logic. Fixes #24201	2026-04-10 00:55:16 +03:00
Zach	95cff8c5fb	feat: add REST API handlers and client methods for user secrets (#24107 ) Add the five REST endpoints for managing user secrets, SDK client methods, and handler tests. Endpoints: - `POST /api/v2/users/{user}/secrets` - `GET /api/v2/users/{user}/secrets` - `GET /api/v2/users/{user}/secrets/{name}` - `PATCH /api/v2/users/{user}/secrets/{name}` - `DELETE /api/v2/users/{user}/secrets/{name}` Routes are registered under the existing `/{user}` group with `ExtractUserParam`. The delete query was changed from `:exec` to `:execrows` so the handler can distinguish "not found" from success (DELETE with `:exec` silently returns nil for zero affected rows).	2026-04-09 12:12:55 -06:00
Ethan	ad2415ede7	fix: bump coder/tailscale to pick up RTM_MISS fix (#24187 ) ## What Bumps `coder/tailscale` to [`e956a95`](https://github.com/coder/tailscale/commit/e956a950740bd737c55451f56e77038f7430a919) ([PR #113](https://github.com/coder/tailscale/pull/113)) to pick up the `RTM_MISS` fix for the Darwin network monitor. Already released on `release/2.31` as v2.31.8. (#24185) to unblock a customer. This PR is to update `main`. ## Why On Darwin, `RTM_MISS` route-socket messages (fired on every failed route lookup) were not filtered by `netmon`, causing each one to be treated as a `LinkChange`. When netcheck sends STUN probes to an IPv6 address with no route, this creates a self-sustaining feedback loop: `RTM_MISS` → `LinkChange` → `ReSTUN` → netcheck → v6 STUN probe → `RTM_MISS` → … The loop drives DERP home-region flapping at ~70× baseline, which at fleet scale saturates PostgreSQL's `NOTIFY` lock and causes coordinator health-check timeouts. The upstream fix adds a single `if msg.Type == unix.RTM_MISS { return true }` check to `skipRouteMessage`. This is safe because `RTM_MISS` is a lookup-path signal, not a table-mutation signal — route withdrawals always emit `RTM_DELETE` before any subsequent lookup can miss. Of note is that this issue has only been reported recently, since users updated to macOS 26.4. Relates to ENG-2394	2026-04-09 13:22:56 -04:00
Cian Johnston	1e40cea199	feat: warn in CLI when server runs dev or RC builds (#24158 ) Adds warning on stderr when the server version contains `-devel` or `-rc.N` > 🤖 Written by a Coder Agent. Will be reviewed by a human.	2026-04-09 12:48:35 -04:00
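A check along the lines described above might look like the sketch below. It assumes only what the commit message states (the version contains `-devel` or `-rc.N`); the function name and the regular expression are hypothetical and may differ from the CLI's actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rcSuffix matches a release-candidate marker such as "-rc.1".
var rcSuffix = regexp.MustCompile(`-rc\.\d+`)

// isPreRelease reports whether a server version string looks like a
// development or release-candidate build.
func isPreRelease(version string) bool {
	return strings.Contains(version, "-devel") || rcSuffix.MatchString(version)
}

func main() {
	// The version strings below are made-up examples.
	for _, v := range []string{"v2.32.0-devel+abc123", "v2.32.0-rc.1", "v2.31.8"} {
		fmt.Printf("%s pre-release=%v\n", v, isPreRelease(v))
	}
}
```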
Kayla はな	9d6557d173	refactor(site): migrate some components from emotion to tailwind (#24182 )	2026-04-09 10:33:01 -06:00
Kayla はな	224db483d7	refactor(site): remove mui from a few components (#24125 )	2026-04-09 10:02:26 -06:00
Yevhenii Shcherbina	8237822441	feat: byok observability api (#24207 ) ## Summary Exposes `credential_kind` and `credential_hint` on AI Bridge session threads, making credential metadata visible in the session detail API. Each thread in the `/api/v2/aibridge/sessions/{session_id}` response now includes: - `credential_kind`: `centralized` or `byok` - `credential_hint`: masked credential (e.g. `sk-a...pgAA`) Values are taken from the thread's root interception. ## Changes - `codersdk/aibridge.go`: Added `CredentialKind` and `CredentialHint` fields to `AIBridgeThread` - `coderd/database/db2sdk/db2sdk.go`: Populated from root interception in `buildAIBridgeThread` - `SessionTimeline.stories.tsx`: Added fields to mock thread data	2026-04-09 11:41:17 -04:00
Ethan	65bf7c3b18	fix(coderd/x/chatd/chatloop): stabilize startup-timeout tests with quartz (#24193 ) The startup-timeout integration tests in `chatloop` used a 5ms real-time budget and relied on wall-clock scheduling to fire the startup guard timer before the first stream part arrived. On loaded CI runners the timer sometimes lost the race, producing `attempts == 2` instead of `attempts == 1` and flaking `TestRun_FirstPartDisarmsStartupTimeout`. Replace the real `time.Timer` in `startupGuard` with a `quartz.Timer` so tests can control time deterministically. Production behavior is unchanged: `RunOptions.Clock` defaults to `quartz.NewReal()` when nil, and the startup timeout still covers both opening the provider stream and waiting for the first stream part. - Add `RunOptions.Clock quartz.Clock` with nil-safe default. - Tag the startup guard timer as `"startupGuard"` for quartz trap targeting. - Rewrite the four startup-timeout integration tests to use `quartz.NewMock(t)` with trap/advance/release sequences instead of wall-clock sleeps. - Add `awaitRunResult` helper so tests fail with a clear message instead of hanging when `Run` does not complete. Closes https://github.com/coder/internal/issues/1460	2026-04-10 00:40:09 +10:00
Garrett Delfosse	76cbc580f0	ci: add cherry-pick PR check for release branches (#24121 ) Adds a GitHub Actions workflow that runs on PRs targeting `release/` branches to flag non-bug-fix cherry-picks. ## What it does - Triggers on `pull_request_target` (opened, reopened, edited) for `release/` branches - Checks if the PR title starts with `fix:` or `fix(scope):` (conventional commit format) - If not a bug fix, comments on the PR informing the author and emits a warning (via `core.warning`), but does not fail the check - Deduplicates comments on title edits by updating an existing comment (identified by a hidden HTML marker) instead of creating a new one > [!NOTE] > Generated by Coder Agents	2026-04-09 10:37:56 -04:00
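The title rule in the commit above (`fix:` or `fix(scope):`) can be expressed as a single pattern. The sketch below is written in Go for illustration only; the workflow itself runs in GitHub Actions, and both the regular expression and the function name are assumptions rather than the workflow's actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// bugFixTitle matches conventional-commit titles that start with
// "fix:" or "fix(scope):".
var bugFixTitle = regexp.MustCompile(`^fix(\([^)]+\))?:`)

// isBugFix reports whether a PR title is a bug fix under that rule.
func isBugFix(title string) bool {
	return bugFixTitle.MatchString(title)
}

func main() {
	for _, t := range []string{
		"fix: retry dial timeouts",
		"fix(cli): retry dial timeouts",
		"feat: add user secrets",
	} {
		fmt.Printf("%q bug fix=%v\n", t, isBugFix(t))
	}
}
```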
Kyle Carberry	391b22aef7	feat: add CLI commands for managing chat context from workspaces (#24105 ) Adds `coder exp chat context add` and `coder exp chat context clear` commands that run inside a workspace to manage chat context files via the agent token. `add` reads instruction and skill files from a directory (defaulting to cwd) and inserts them as context-file messages into an active chat. Multiple calls are additive — `instructionFromContextFiles` already accumulates all context-file parts across messages. `clear` soft-deletes all context-file messages, causing `contextFileAgentID()` to return `!found` on the next turn, which triggers `needsInstructionPersist=true` and re-fetches defaults from the agent. Both commands auto-detect the target chat via `CODER_CHAT_ID` (already set by `agentproc` on chat-spawned processes), or fall back to single-active-chat resolution for the agent. The `--chat` flag overrides both. Also adds sub-agent context inheritance: `createChildSubagentChat` now copies parent context-file messages to child chats at spawn time, so delegated sub-agents share the same instruction context without independently re-fetching from the workspace agent. 
<details><summary>Implementation details</summary> New files: - `cli/exp_chat.go` — CLI command tree under `coder exp chat context` Modified files: - `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context from an arbitrary directory without env vars - `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext` SDK methods - `coderd/workspaceagents.go` — POST/DELETE handlers on `/workspaceagents/me/chat-context` - `coderd/coderd.go` — Route registration - `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`, `SoftDeleteContextFileMessages` - `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new queries - `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent inheritance - `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()` Auth pattern: Uses `AgentAuth` (same as `coder external-auth`) — agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars. </details> > 🤖 Generated by Coder Agents --------- Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>	2026-04-09 16:33:00 +02:00
Michael Suchacz	f8e8f979a2	chore(Makefile): use go build -o for helper binaries to reduce GOCACHE growth (#24197 )

## Problem

`go run` caches the final linked executable in `~/.cache/go-build`. Every helper invocation via `go run ./scripts/<tool>` stores a copy, and because the cache key includes build metadata, the same tool accumulates multiple cached executables over time. With 12+ helper binaries invoked during `make gen` and `make pre-commit`, this is a meaningful contributor to GOCACHE growth.

## Fix

Replace `go run` with `go build -o _gen/bin/<tool>` for 12 repo-local helper packages (16 Makefile callsites). Each helper is an explicit Make file target with `$(wildcard .go)` prerequisites, so `make -j` serializes builds correctly instead of racing on shared output paths.

Helpers converted: `apitypings`, `auditdocgen`, `check-scopes`, `clidocgen`, `dbdump`, `examplegen`, `gensite`, `apikeyscopesgen`, `metricsdocgen`, `metricsdocgen-scanner`, `modeloptionsgen`, `typegen`.

Left on `go run` (intentionally): `migrate-ci` and `migrate-test` (CI/test-only, not on common developer paths).

`_gen/` is already in `.gitignore`. The `clean` target removes `_gen/bin`.

## GOCACHE growth (isolated cache, single `make gen`)

| | Old (`go run`) | New (`go build -o`) |
|--|----------------|---------------------|
| Total cache size | 2.9 GB | 2.6 GB |
| Cached executables | 11 | 4 |
| Executable bytes | 401 MB | 25 MB |

The 4 remaining executables come from tools outside this change (`dbgen` and `goimports` from `generate.sh`, plus two `main` binaries from deferred helpers). Helper binaries now live in `_gen/bin/` (581 MB, gitignored, cleaned by `make clean`).
## Build time benchmarks

Source changed (content hash invalidated, forces recompile):

| Helper | `go run` | `go build -o` + run | Overhead |
|--------|---------|---------------------|----------|
| typegen | 1.50s | 2.03s | +0.52s |
| examplegen | 1.37s | 1.67s | +0.30s |
| apikeyscopesgen | 1.21s | 1.71s | +0.50s |
| modeloptionsgen | 1.23s | 1.64s | +0.41s |

Repeat invocation (no source change, the common `make gen` / `make pre-commit` path):

| Helper | `go run` (cache lookup) | Cached binary | Speedup |
|--------|------------------------|---------------|---------|
| typegen | 0.346s | 0.037s | 9.4x |
| examplegen | 0.368s | 0.037s | 9.9x |
| modeloptionsgen | 0.342s | 0.021s | 16.3x |
| apikeyscopesgen | 0.298s | 0.030s | 9.9x |

When source changes, `go build -o` is 0.3-0.5s slower per helper (it writes a local binary instead of caching in GOCACHE). On repeat runs (the common path), the pre-built binary is 10-16x faster because `go run` still does a staleness check while the binary just executes. > This PR was authored by Mux on behalf of Mike.	2026-04-09 16:04:06 +02:00
Jeremy Ruppel	fb0ed1162b	fix(site): replace expandable agentic loop section with cool design (#24171 ) The current page has an "Agentic loop completed" block that doesn't contain any valuable info that isn't available elsewhere. Replace it with a status indicator. <img width="507" height="300" alt="Screenshot 2026-04-08 at 2 47 40 PM" src="https://github.com/user-attachments/assets/09cf3772-a52d-485d-a15e-b2257b2d9003" />	2026-04-09 09:18:19 -04:00
Jeremy Ruppel	3f519744aa	fix(site): use locale string for token usage tooltip (#24177 ) quality of life improvement <img width="353" height="291" alt="Screenshot 2026-04-08 at 5 04 55 PM" src="https://github.com/user-attachments/assets/f1165b03-c82d-4135-97a5-ce04ec7c41c0" />	2026-04-09 08:59:09 -04:00
Jeremy Ruppel	2505f6245f	fix(site): request logs and sessions page UI consistency (#24163 ) couple of little design tweaks to make the UI of the Request Logs page and Sessions pages more consistent: - decrease size of Request Logs page chevron - copy Request Logs page chevron animation for Sessions expandable sections - use TokenBadges component in RequestLogsRow - wrap tool call counts in badges <img width="1393" height="210" alt="Screenshot 2026-04-08 at 1 56 10 PM" src="https://github.com/user-attachments/assets/97e7acb6-71c7-48d6-b0df-a102c7602cc0" />	2026-04-09 08:52:32 -04:00
Danielle Maywood	29ad2c6201	feat: merge Limits + Usage into unified Spend page (#24093 )	2026-04-09 13:17:03 +01:00
Cian Johnston	27e5ff0a8e	chore: update to our fork of charm.land/fantasy with appendCompact perf improvement (#24142 ) Fixes CODAGT-117 Updates go.mod to reference our forks of the following dependencies: * charmbracelet/anthropic-sdk-go => https://github.com/coder/anthropic-sdk-go/tree/coder_2_33 * charm.land/fantasy => https://github.com/coder/fantasy/tree/coder_2_33	2026-04-09 13:08:19 +01:00
Hugo Dutka	128a7c23e6	feat(site): agents desktop recording thumbnail frontend (#24023 ) Frontend for https://github.com/coder/coder/pull/24022. From that PR's description: > The agents chat interface displays thumbnails for videos recorded by the computer use agent. Currently, to display a thumbnail, the frontend downloads the entire video and shows the first frame. #24022 adds a thumbnail file id to `wait_agent` tool results, and this PR displays it instead of fetching the entire video.	2026-04-09 11:55:40 +00:00
Hugo Dutka	efb19eb748	feat: agents desktop recording thumbnail backend (#24022 ) The agents chat interface displays thumbnails for videos recorded by the computer use agent. Currently, to display a thumbnail, the frontend downloads the entire video and shows the first frame. This PR starts storing a new thumbnail file in the database for every recorded video, and exposes the file id in the `wait_agent` tool result alongside the recording file id, so the frontend can fetch just the thumbnail.	2026-04-09 13:47:54 +02:00
Garrett Delfosse	2c499484b7	ci: attribute cherry-pick/backport PRs to the requesting user (#24195 ) The cherry-pick and backport workflows create PRs under `github-actions[bot]`. Since GitHub doesn't support creating PRs on behalf of another user, this adds attribution to the user who added the label (`github.event.sender.login`): - Assignee: the labeler is assigned to the backport PR - Reviewer: the labeler is added as a reviewer - PR body: includes "Requested by: @user" Applied to both `cherry-pick.yaml` and `backport.yaml`. --- > Generated by Coder Agents	2026-04-09 07:44:58 -04:00
Hugo Dutka	33d9d0d875	feat(site): hide agents desktop tab when workspace is stopped (#24191 ) Hide the agents desktop tab when the workspace is stopped. This matches the terminal tab's behavior.	2026-04-09 10:51:26 +00:00
Ethan	f219834f5c	perf(site): add reconnect jitter to reconnectingWebsocket (#24096 ) ## Motivation During the April 2 dogfood incident, a pod OOM-kill triggered a reconnection storm: hundreds of chat-stream and agent-RPC websockets all attempted to reconnect at the same deterministic backoff intervals (1 s, 2 s, 4 s, …). Because every browser tab computed the same delay, the surviving replicas received a synchronized wall of new connections at each retry tick, amplifying the overload that caused the first OOM in the first place. The root cause of the memory blowup (chatd serialization cost) is a separate issue. This change addresses the secondary blast-radius problem: when N clients reconnect in lockstep, the retry storm itself becomes a capacity threat. ## Change The shared `createReconnectingWebSocket` utility now applies symmetric jitter (default ±30%) to the capped exponential-backoff delay before scheduling the reconnect timer. With 100 clients and a 1 s base delay, reconnects spread over the 700 ms–1300 ms window instead of all landing at exactly 1000 ms, and once retries hit `maxMs` the scheduler still preserves downward spread instead of collapsing back to a single tick. Two new options are accepted by callers: - `jitter` (0–1 fraction, default `0.3`) — controls the jitter window. Values are clamped to `[0, 1]`; `0` preserves exact legacy timing. - `random` (`() => number`, default `Math.random`) — injectable RNG, primarily a deterministic test seam. Non-finite output falls back to the midpoint (`0.5`). The `retryingAt` timestamp surfaced to `ChatStatusCallout` is computed from the jittered delay, so the countdown shown to users reflects the actual retry time. The scheduler also keeps `maxMs` as a hard ceiling on the final delay and saturates exponential overflow at that cap instead of dropping to `0ms` retries. 
No production callers need changes — the default jitter activates automatically for all four call sites (`AgentsPage` chat-list watcher, `AgentChatPage` workspace watcher, `useChatStore` per-chat stream, `useGitWatcher`). The two downstream tests that asserted exact reconnect timing now pin `Math.random()` to `0.5` so those expectations stay deterministic.	2026-04-09 20:31:37 +10:00
Danny Kopping	7a94a683c4	docs: rename AI Bridge to AI Gateway and Agent Boundaries to Agent Firewall (#24094 ) Disclaimer: implemented by a Coder Agent using Claude Opus 4.6 ## Summary Renames product references across documentation: \| Old Name \| New Name \| \|----------\|----------\| \| AI Bridge \| AI Gateway \| \| AI Bridge Proxy \| AI Gateway Proxy \| \| Agent Boundaries \| Agent Firewall \| ## What changed - Prose text, headings, titles, and descriptions updated across all docs - Directories renamed: - `docs/ai-coder/ai-bridge/` → `docs/ai-coder/ai-gateway/` - `docs/ai-coder/ai-bridge/ai-bridge-proxy/` → `docs/ai-coder/ai-gateway/ai-gateway-proxy/` - `docs/ai-coder/agent-boundaries/` → `docs/ai-coder/agent-firewall/` - All internal markdown links updated to new paths - `manifest.json` route paths updated - Rename notice added to AI Gateway and Agent Firewall entrypoint pages ## Companion PR URL redirects (old paths → new paths): [coder/coder.com#700](https://github.com/coder/coder.com/pull/700) ## What is intentionally NOT changed - Env vars: `CODER_AIBRIDGE_` - CLI flags: `--aibridge-` - API paths: `/api/v2/aibridge/` - Config keys: `aibridge:` YAML blocks - Terraform variables: `enable_aibridge`, `boundary_version`, `use_boundary_directly` - Process names: `aibridged`, `aibridgeproxyd` - Prometheus metrics: `coder_aibridged_`, `coder_aibridgeproxyd_` - SDK types: `codersdk.AIBridge` - GitHub URLs: `github.com/coder/aibridge` - Image paths: `images/aibridge/` - Auto-generated reference docs: `docs/reference/cli/aibridge.md`, `docs/reference/api/aibridge.md`, `docs/reference/api/schemas.md` - Frontend code*: `site/src/` references (separate PR) Code-level renames (env vars, configs, frontend) are planned for a follow-up PR.	2026-04-09 10:07:50 +00:00
Jake Howell	2e6fdf2344	fix: resolve `<Badge />` incorrect sizes (#22539 ) This pull request makes a few changes to our `<Badge />` component to bring it in line with Figma. * Added all Figma variants to the stories (they can vary per badge type, so it's better that we track everything). * Removed the `border` variant of the component; border variants should be on all `sm` and `md`. * Added a hover effect to the `default` variant (per design). * Resolved an issue with the sizing of `xs` and `sm`, and resolved iconography. * Resolved an issue with icons not showing at all on `xs` variants.	2026-04-09 19:55:59 +10:00
Danielle Maywood	3d139c1a24	refactor(site): replace `!!` with `Boolean()` for boolean coercion (#24180 )	2026-04-09 10:48:54 +01:00
Jaayden Halko	f957981c8b	fix(site): add padding below thinking-only assistant messages (#24140 ) closes CODAGT-122 Add a spacer div that renders only when an assistant message lacks the action bar, matching the height the action bar would provide. > 🤖 Generated by Coder Agents	2026-04-09 10:17:51 +01:00
Atif Ali	584c61acb5	fix: mark connecting agents as unhealthy instead of healthy (#24044 )

## Problem

Workspaces showed as "Healthy" immediately after creation while the agent was still downloading, starting, or connecting. If the agent never connected, the workspace stayed "Healthy" for the entire connection timeout (~120s), then abruptly flipped to "Unhealthy".

## Root cause

In `db2sdk.WorkspaceAgent`, the health switch had no case for `WorkspaceAgentConnecting`. Agents in `connecting` status with a non-`off` lifecycle (e.g. `created` after a fresh build) fell through to the `default` case and were marked `Healthy = true`.

## Fix

Add an explicit case for `WorkspaceAgentConnecting` that sets `Healthy = false` with reason `"agent has not yet connected"`. The case is placed after the existing `!connected + off` case (which correctly catches stopped agents as "not running") and before the `timeout`/`disconnected` cases.

```
Status + Lifecycle             → Health reason
──────────────────────────────────────────────────────
any !connected + off           → "agent is not running"
connecting + created/starting  → "agent has not yet connected"  ← NEW
timeout + any                  → "agent is taking too long to connect"
disconnected + any             → "agent has lost connection"
connected + start_error        → "agent startup script exited with an error"
connected + shutting_down      → "agent is shutting down"
connected + ready/starting     → healthy
```

The frontend already handles this case: `getAgentHealthIssue()` returns "Workspace agent is still connecting" with `severity: "info"` for unhealthy workspaces with connecting agents.

## Test changes

- Healthy test: now actually connects the agent via `agenttest.New` before asserting health (previously passed due to the bug).
- New Connecting test: verifies a never-connected agent is correctly marked unhealthy.
- Mixed health test: connects a1 and waits for the mixed state (`a1.Healthy && !workspace.Healthy`) to avoid a race where both agents are initially connecting.
- Sub-agent excluded test: connects the parent agent and waits for it to be healthy before creating the sub-agent.
- TestWorkspaceAgent/Connect: flipped assertion to `Health.Healthy == false` for a `dbfake` agent that never connects.

<details>
<summary>Review notes</summary>

### Known follow-up

The `healthy:false` workspace search filter maps to `[disconnected, timeout]` and does not include `connecting`. This is a pre-existing gap that is now more consequential: a workspace unhealthy solely due to a connecting agent won't appear in `healthy:false` results. Worth a follow-up issue.

### Deep review findings addressed

| Finding | Severity | Status |
|---------|----------|--------|
| Mixed health test race (all 3 reviewers) | P2 | Fixed: tightened `Eventually` condition |
| `TestWorkspaceAgent/Connect` assertion break | P1 | Fixed: flipped assertion |
| CLI renders red for connecting agents | Obs | Acknowledged: design trade-off, accurate but visually strong for transient state |
| Switch case ordering overlap | Obs | Documented with inline comment |

</details>

> 🤖 This PR was created with the help of Coder Agents, and needs a human review. 🧑‍💻	2026-04-09 13:21:28 +05:00
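The status-to-health mapping described in commit 584c61acb5 can be sketched roughly as below. This is a minimal illustration, not the actual `db2sdk.WorkspaceAgent` code: the `Status`, `Lifecycle`, and `Health` types and the `agentHealth` function are hypothetical names made up for this example, and only the case ordering and reason strings are taken from the commit message.

```go
package main

import "fmt"

// Status and Lifecycle are hypothetical stand-ins for the agent states named
// in the commit message; they are not the real coder types.
type (
	Status    string
	Lifecycle string
)

const (
	StatusConnecting   Status = "connecting"
	StatusConnected    Status = "connected"
	StatusTimeout      Status = "timeout"
	StatusDisconnected Status = "disconnected"

	LifecycleCreated      Lifecycle = "created"
	LifecycleReady        Lifecycle = "ready"
	LifecycleOff          Lifecycle = "off"
	LifecycleStartError   Lifecycle = "start_error"
	LifecycleShuttingDown Lifecycle = "shutting_down"
)

// Health is a hypothetical result type: a flag plus a reason when unhealthy.
type Health struct {
	Healthy bool
	Reason  string
}

// agentHealth evaluates the cases in the order given in the commit message.
// The "not connected + off" case comes first so stopped agents are reported
// as not running, even if their status is still "connecting".
func agentHealth(s Status, l Lifecycle) Health {
	switch {
	case s != StatusConnected && l == LifecycleOff:
		return Health{Reason: "agent is not running"}
	case s == StatusConnecting: // the case the fix adds
		return Health{Reason: "agent has not yet connected"}
	case s == StatusTimeout:
		return Health{Reason: "agent is taking too long to connect"}
	case s == StatusDisconnected:
		return Health{Reason: "agent has lost connection"}
	case l == LifecycleStartError:
		return Health{Reason: "agent startup script exited with an error"}
	case l == LifecycleShuttingDown:
		return Health{Reason: "agent is shutting down"}
	default:
		return Health{Healthy: true}
	}
}

func main() {
	fmt.Println(agentHealth(StatusConnecting, LifecycleCreated))
	fmt.Println(agentHealth(StatusConnected, LifecycleReady))
}
```

Without the `StatusConnecting` case, a freshly created agent would fall through to `default` and be reported healthy, which is the bug the commit describes.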
code-qtzl	f95a5202bf	fix(site/src/pages/CreateWorkspacePage): replace Tooltip with HelpPop… (#24057 ) Replace Tooltip with `HelpPopover` in the "New workspace" page header. `HelpPopover` supports interactive content like links and provides better layout control, making it a better fit for this use case.	2026-04-09 15:34:29 +10:00
Matt Vollmer	d954460380	docs: rename "Security implications" to "Security posture" (#24181 ) Renames the "Security implications" section to "Security posture" and reframes the intro paragraph. "Implications" reads as a caveat or warning; the section actually describes built-in structural guarantees of the control plane architecture. > PR generated with Coder Agents	2026-04-08 19:55:56 -04:00
dylanhuff-at-coder	f4240bb8c1	fix: sanitize workspace agent logs before insert (#24028 ) Workspace agent logs could still fail after the earlier invalid UTF-8 fix because NUL bytes are valid Go/protobuf strings but are rejected by Postgres text columns. The legacy HTTP log upload path also bypassed the old sanitization entirely, and both server insert paths computed logs_length from the unsanitized input. Add a shared log-output sanitizer in agentsdk, use it in the protobuf conversion path and both server-side insert paths, and compute OutputLength from the sanitized string so overflow accounting matches what is actually stored. This keeps the old invalid UTF-8 behavior while also handling embedded NUL bytes consistently across DRPC and HTTP log ingestion. Refs [#23292 ](https://github.com/coder/coder/issues/23292) Refs [#13433 ](https://github.com/coder/coder/issues/13433)	2026-04-08 16:29:38 -07:00
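A rough sketch of the sanitization idea in commit f4240bb8c1 follows. The function name `sanitizeLogOutput` is hypothetical, and the exact replacement policy is an assumption: here invalid UTF-8 is replaced with U+FFFD and NUL bytes are dropped, which may differ from what the real agentsdk helper does. The point it illustrates is computing the stored length from the sanitized string rather than the raw input.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeLogOutput is a hypothetical helper. It makes a log line safe for a
// Postgres text column by (assumption) replacing invalid UTF-8 sequences with
// U+FFFD and removing NUL bytes, which are valid in Go strings but rejected
// by Postgres.
func sanitizeLogOutput(s string) string {
	s = strings.ToValidUTF8(s, "\uFFFD")
	return strings.ReplaceAll(s, "\x00", "")
}

func main() {
	raw := "build\x00 ok\xff"
	clean := sanitizeLogOutput(raw)
	// The length used for overflow accounting is taken from the sanitized
	// string so it matches what would actually be stored.
	outputLength := len(clean)
	fmt.Printf("%q (raw %d bytes, stored %d bytes)\n", clean, len(raw), outputLength)
}
```

Applying the same helper on every ingestion path is what keeps the accounting consistent, since no path can then insert a string the length calculation never saw.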
Zach	7caef4987f	feat: add input validation for user secret env names and file paths (#24103 ) Adds backend validation for user secret environment variable names and file paths. Env name validation enforces POSIX naming rules and blocks a deliberately aggressive denylist of reserved names and prefixes. The denylist errs on the side of blocking too much since it's easier to remove entries later than to add them after users have created conflicting secrets. File path validation requires paths to start with `~/` or `/`.	2026-04-08 17:02:33 -06:00
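The validation rules in commit 7caef4987f might look something like the sketch below. Everything named here is illustrative: `validateEnvName`, `validateFilePath`, and especially the denylist entries (`PATH`, `HOME`, the `CODER_` prefix) are assumptions chosen for the example, since the commit message does not list the real reserved names or prefixes.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// posixEnvName matches POSIX-style names: letters, digits, and underscores,
// not starting with a digit.
var posixEnvName = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// Hypothetical denylist entries; the real list is not given in the commit.
var (
	reservedNames    = map[string]bool{"PATH": true, "HOME": true}
	reservedPrefixes = []string{"CODER_"}
)

// validateEnvName rejects names that break POSIX rules or hit the denylist.
func validateEnvName(name string) error {
	if !posixEnvName.MatchString(name) {
		return fmt.Errorf("env name %q is not a valid POSIX name", name)
	}
	if reservedNames[name] {
		return fmt.Errorf("env name %q is reserved", name)
	}
	for _, prefix := range reservedPrefixes {
		if strings.HasPrefix(name, prefix) {
			return fmt.Errorf("env name %q uses reserved prefix %q", name, prefix)
		}
	}
	return nil
}

// validateFilePath requires the path to start with "~/" or "/".
func validateFilePath(p string) error {
	if strings.HasPrefix(p, "~/") || strings.HasPrefix(p, "/") {
		return nil
	}
	return fmt.Errorf("file path %q must start with ~/ or /", p)
}

func main() {
	fmt.Println(validateEnvName("MY_TOKEN"), validateEnvName("CODER_TOKEN"))
	fmt.Println(validateFilePath("~/.netrc"), validateFilePath("secrets/key"))
}
```

Keeping the denylist as data rather than logic fits the stated rationale: entries can be removed later without touching the validation code.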
Zach	9b91af8ab7	feat: add user secrets SDK types and db2sdk converters (#24102 ) Adds the SDK types and database-to-SDK conversion helpers for the user secrets feature.	2026-04-08 16:48:41 -06:00
Matt Vollmer	506fba9ebf	docs: add BYOK docs, fix tool tables, add platform controls (#24178 ) Fixes several documentation gaps and inaccuracies in the Coder Agents docs identified during a deep review against the current product state. ## BYOK (User API Keys) `models.md` stated "Developers cannot add their own providers, models, or API keys" — this has been incorrect since the provider key policy system shipped (Apr 2, #23751/#23781). - Added Key policy section documenting the three admin toggles (`central_api_key_enabled`, `allow_user_api_key`, `allow_central_api_key_fallback`) with a truth table showing all resolution outcomes - Added User API keys (BYOK) section covering the developer-facing key management page, status indicators, selection priority, and key removal - Updated `platform-controls/index.md` to reference BYOK instead of claiming keys are admin-only ## Reasoning effort enum fixes - OpenAI: removed `none` — code accepts `minimal, low, medium, high, xhigh` - OpenRouter: narrowed to `low, medium, high` per `ReasoningEffortFromChat` in `chatprovider.go` ## Tool table completeness - Added `spawn_computer_use_agent`, `read_skill`, `read_skill_file` to `index.md` tool table - Added "Workspace extension tools" section to `architecture.md` for `read_skill`/`read_skill_file` - Fixed orchestration restriction note to list all 5 gated tools instead of just `spawn_agent` - Added conditional availability notes for desktop and skills tools ## Platform controls Three admin-only settings existed in the Behavior tab with no documentation: - Virtual desktop — admin toggle, Anthropic + portabledesktop requirements - Workspace autostop fallback — default TTL for agent workspaces without template-defined autostop - Data retention — moved `chat-retention.md` into `platform-controls/` since it's admin-only, fixed nav path --- > PR generated with Coder Agents	2026-04-08 18:24:12 -04:00
Cian Johnston	461a31e5d8	feat(site): add under-construction navbar stripes for pre-release builds (#24157 ) Dev and RC builds now show diagonal warning stripes in the navbar plus a centered version badge, making it impossible to miss which build you're running. Devel build: amber "warning" from theme. RC build: sky "pending" from theme. > 🤖 Written by a Coder Agent. Will be reviewed by a human.	2026-04-08 20:10:03 +00:00
Carlo Field	e3a0dcd6fc	feat: add httproute for K8s Gateway API (#23501 ) <!-- If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting. --> No AI was used to generate this PR. Adds support for [Gateway API HTTPRoutes](https://gateway-api.sigs.k8s.io/api-types/httproute/) as an alternative to Ingress. --------- Signed-off-by: Carlo Field <carlo@swiss.dev> Co-authored-by: bpmct <bpmct@users.noreply.github.com> Co-authored-by: Ben Potter <ben@coder.com>	2026-04-08 14:59:17 -05:00
Danielle Maywood	12ada0115f	fix(site): move pagination test from vitest to storybook story (#24165 )	2026-04-08 20:56:53 +01:00
Cian Johnston	7b0421d8c6	fix: revert auto-assign agents-access role enabled (#24170 ) This reverts commit `d4a9c63e91` (#23968). --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-04-08 20:56:17 +01:00
Hugo Dutka	477d6d0cde	fix(site): fix agents right panel layout on small landscape viewports (#24161 ) Currently, when you're using Agents on mobile with a vertical viewport and you open the sidebar, the sidebar takes up the entire screen. That's great, since there isn't enough space to show the other tabs. But when you tilt your phone to horizontal mode, all 3 tabs show up, and none of them are very legible: https://github.com/user-attachments/assets/50a54791-fe53-4a5d-ba7b-85e82f970851 This PR makes it so that the right sidebar takes up the entire screen on small viewports (<1024px) in horizontal mode too. https://github.com/user-attachments/assets/a06069df-9f2f-42bd-8072-a237434434e5	2026-04-08 20:01:59 +02:00
Jeremy Ruppel	de61ac529d	fix(site): scroll when request logs tool call is huge (#24162 ) Disclaimer: I've never encountered this on dogfood, only on my local where Claude likes to do really long tool calls. On the Request Logs page, if a tool call has super long lines, it will break the row layout: https://github.com/user-attachments/assets/fd1a8be0-7912-4611-a1c3-0c7943b1ea52 This adds stories to demonstrate the behavior, and then a lil overflow x auto action for the fix: https://github.com/user-attachments/assets/f0fd94da-8254-4330-a718-08599909e8ec	2026-04-08 13:53:43 -04:00


13680 Commits