coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Kyle Carberry	0f86c4237e	feat: add workspace MCP tool discovery and proxying for chat (#23680 ) Coder's chat (chatd) can now discover and use MCP servers configured in a workspace's `.mcp.json` file. This brings project-specific tooling (GitHub, databases, docs servers, etc.) into the chat without any manual configuration. ## How it works The workspace agent reads `.mcp.json` from the workspace directory (same format Claude Code uses), connects to the declared MCP servers — spawning child processes for stdio servers and connecting over the network for HTTP/SSE — and caches their tool lists. Two new agent HTTP endpoints expose this: - `GET /api/v0/mcp/tools` returns the cached tool list (supports `?refresh=true`) - `POST /api/v0/mcp/call-tool` proxies calls to the correct server On each chat turn, chatd calls `ListMCPTools` through the existing `AgentConn` tailnet connection, wraps each tool as a `fantasy.AgentTool`, and adds them to the LLM's tool set alongside built-in and admin-configured MCP tools. Tool names are prefixed with the server name (`github__create_issue`) to avoid collisions. Failed server connections are logged and skipped — they never block the agent or break the chat. Child stdio processes are terminated on agent shutdown.	2026-03-26 19:57:02 +00:00
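The server-name prefix described in this commit can be sketched in Go as follows. This is a minimal illustration rather than the code from the PR: the helper names `prefixedToolName` and `splitToolName` are hypothetical, and only the `__` separator and the `github__create_issue` example come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// toolNameSeparator joins a server name and a tool name, as in the
// "github__create_issue" example from the commit message.
const toolNameSeparator = "__"

// prefixedToolName (hypothetical helper) namespaces a tool by its server so
// that two servers exposing the same tool name do not collide.
func prefixedToolName(server, tool string) string {
	return server + toolNameSeparator + tool
}

// splitToolName (hypothetical helper) recovers the server and tool name from
// a prefixed name. It splits on the first separator, so it assumes server
// names do not themselves contain "__".
func splitToolName(name string) (server, tool string, ok bool) {
	return strings.Cut(name, toolNameSeparator)
}

func main() {
	name := prefixedToolName("github", "create_issue")
	fmt.Println(name)

	server, tool, ok := splitToolName(name)
	fmt.Println(server, tool, ok)
}
```

Splitting on the first separator keeps tool names that contain `__` intact, at the cost of disallowing the separator inside server names.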
Jeremy Ruppel	02b58534a0	fix: use TokenBadges for session list row (#23619 ) The sessions list row uses a bespoke token badge instead of the `<TokenBadge />` component, so this fixes that.	2026-03-26 15:50:32 -04:00
Atif Ali	e35fa8b9ee	fix(site): use restartWorkspace instead of startWorkspace in schedule dialog (#23658 )	2026-03-27 00:45:16 +05:00
Jeremy Ruppel	1358233c83	fix(site): decrease chevron size in AI Bridge session row (#23688 ) The Sessions table row chevron icon is too big. Make it smaller 🤏	2026-03-26 15:10:41 -04:00
Asher	ea4070c0ce	feat: add multi-user dialog select for adding group members (#23396 ) Instead of the single-user dropdown we had before.	2026-03-26 10:42:04 -08:00
Hugo Dutka	1b2fab8306	feat(site): enable copy and paste in agents desktop (#23686 )	2026-03-26 19:32:48 +01:00
Mathias Fredriksson	94e5de22f7	perf(site): fix compiler memoization gap in AgentDetailInput (#23683 ) The React Compiler failed to memoize the messages derivation chain because a useDashboard() hook call sat between the messages computation and its consumer (getLatestContextUsage). An IIFE around the context usage logic also fragmented the dependency chain. Replacing the IIFE with a ternary and reordering the non-hook computation before the hook call lets the compiler group messages + getLatestContextUsage into a single cache guard keyed on messagesByID and orderedMessageIDs.	2026-03-26 18:30:06 +00:00
Mathias Fredriksson	6cbb7c6da7	fix(provisioner/terraform): regenerate fixtures with current provider (#23685 ) Two test fixtures (devcontainer-resources, multiple-agents-multiple-envs) were generated before terraform-provider-coder v2.15.0 added the merge_strategy attribute to coder_env. Running generate.sh with the current provider adds merge_strategy: "replace" (the default) to all coder_env resources, causing unstable diffs on every regeneration.	2026-03-26 18:22:45 +00:00
Jeremy Ruppel	fc60a6bf9b	feat(site): add AIBridge Sessions to deployment menu (#23679 ) Adds AI Bridge Sessions link to the deployment menu.	2026-03-26 13:56:51 -04:00
Atif Ali	a52153968d	fix(site): use Anthropic icon instead of Claude icon for provider (#23661 )	2026-03-26 22:25:55 +05:00
Danielle Maywood	d18e700699	fix(site): collapse chat toolbar badges fluidly on overflow (#23663 )	2026-03-26 17:21:01 +00:00
Mathias Fredriksson	0234e8fffd	perf(site): narrow buildStreamTools compiler cache guard dependencies (#23677) The React Compiler guarded buildStreamTools on the whole streamState ref, which changes on every text chunk. Refactoring the function to accept toolCalls and toolResults directly lets the compiler guard on those sub-fields, which are stable during text-only streaming. Before: $[0] !== streamState (misses every text chunk) After: $[0] !== toolCalls || $[1] !== toolResults (passes when only blocks change) Verified: 181 functions compile, 0 diagnostics. Reference stability tests confirm toolCalls/toolResults retain identity across text-part updates and change when tool data updates.	2026-03-26 17:18:35 +00:00
david-fraley	fea4560a64	fix(site): use docs() helper for hardcoded documentation URLs (#23606 )	2026-03-26 17:10:47 +00:00
Danielle Maywood	6dee7cf11d	perf(site/src/pages/AgentsPage): convert renderBlockList to BlockList component (#23673 )	2026-03-26 17:07:36 +00:00
Danielle Maywood	d4fc4e0837	fix(site): fix StreamingCodeFence storybook flake (#23681 )	2026-03-26 17:00:47 +00:00
david-fraley	8da45c14bc	fix(site): fix grammar in batch update description (#23605 )	2026-03-26 11:59:46 -05:00
Cian Johnston	bfee7e6245	fix: populate all chat fields in pubsub events (#23664 ) Problem: `publishChatPubsubEvent` was constructing a partial `codersdk.Chat` that omitted `LastModelConfigID` and other fields. Go's zero-value UUID caused the sidebar to show "Default model" for chats received via SSE. Solution: - Extracted `convertChat`/`convertChats` from `exp_chats.go` into `db2sdk.Chat`/`db2sdk.Chats`, alongside existing `ChatMessage`, `ChatQueuedMessage`, and `ChatDiffStatus` converters. `publishChatPubsubEvent` now calls `db2sdk.Chat(chat, nil)` instead of maintaining its own copy of the conversion logic - Added backend integration test `TestWatchChats/CreatedEventIncludesAllChatFields` - Added frontend regression tests for nil-UUID and valid model config ID cases > 🤖 Created by Coder Agents, reviewed by this human.	2026-03-26 16:49:26 +00:00
Danielle Maywood	52b5d5fdc6	fix(site): match date range picker button height on template insights page (#23667 )	2026-03-26 16:36:51 +00:00
Mathias Fredriksson	cc4cca90fd	perf(site): memo-wrap SmoothedResponse and ReasoningDisclosure to skip completed blocks during streaming (#23674 ) During streaming, StreamingOutput's compiler cache guard misses every chunk because streamState.blocks and streamTools are new references. This causes renderBlockList to recreate all child JSX elements, and React calls every child function even for blocks that finished streaming. Wrapping SmoothedResponse and ReasoningDisclosure in React.memo lets React skip the function call entirely when props are stable. For N completed response blocks and M completed thinking blocks, this reduces per-chunk function calls from N+M+1 to 1. The compiler still compiles both inner functions cleanly (6 and 12 cache slots respectively, zero diagnostics).	2026-03-26 16:35:12 +00:00
Hugo Dutka	081d91982a	fix(site): fix desktop visibility glitch (#23678 ) If the desktop viewer component was hidden, for example after collapsing the sidebar, the next time it was shown the viewer would be blank. This PR fixes that.	2026-03-26 17:20:16 +01:00
Danielle Maywood	00cd7b7346	fix: match workspace picker font size to plus menu dropdown (#23670 )	2026-03-26 16:12:24 +00:00
Danny Kopping	801e57d430	feat: session detail API (#23203 )	2026-03-26 18:09:53 +02:00
Michael Suchacz	e937f89081	feat: add enabled toggle to chat model admin panel (#23665 ) Adds an `enabled` toggle to the chat model admin create/edit form so admins can disable a model without soft-deleting it. Disabled models stay visible in admin settings but stop appearing in user-facing model selectors. The backend already supported this (`chat_model_configs.enabled` column, filtered queries, and SDK fields). This change wires it into the admin UI and adds coverage on both sides. Backend: three new subtests in `coderd/exp_chats_test.go` verifying the visibility contract (admin sees disabled models, non-admin doesn't, update-to-disabled preserves the record). Frontend: `enabled` field added to form logic and seeded from the existing model (defaults to `true` for new models). A Switch+Tooltip control renders in the form header, matching the MCP Server panel pattern. Two interaction stories cover the create-disabled and toggle-existing flows.	2026-03-26 17:07:20 +01:00
Danielle Maywood	5c7057a67f	fix(site): enable streaming-mode Streamdown for live chat output (#23676 )	2026-03-26 15:22:54 +00:00
Ehab Younes	249ef7c567	feat(site): harden Agents embed frame communication and add theme sync (#23574 ) Add theme synchronization, navigation blocking, scroll-to-bottom handling, and chat-ready signaling to the agent embed page. The parent frame can now set light/dark theme via postMessage or query param, and ThemeProvider skips its own class manipulation when the embed marker is present. Navigation attempts that leave the embed route are intercepted and forwarded to the parent frame. The scroll container ref is lifted to the layout so the parent can request scroll-to-bottom.	2026-03-26 18:03:30 +03:00
Cian Johnston	81fe7543b4	chore: set tls.VersionTLS12 MinVersion in cli/server.go to address gosec warning (#23646 ) I was investigating `//nolint` comments and this one popped up. It raised my eyebrows enough to warrant its own PR.	2026-03-26 14:53:47 +00:00
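The setting named in this commit title looks roughly like the sketch below. It assumes a standalone `tls.Config`. The helper name `newTLSConfig` is hypothetical and is not taken from `cli/server.go`.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// newTLSConfig (hypothetical helper) returns a TLS configuration that refuses
// to negotiate anything older than TLS 1.2.
func newTLSConfig() *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
	}
}

func main() {
	cfg := newTLSConfig()
	// Print the configured floor and confirm it is at least TLS 1.2.
	fmt.Printf("%#x %v\n", cfg.MinVersion, cfg.MinVersion >= tls.VersionTLS12)
}
```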
Kyle Carberry	61d2a4a9b8	fix(site): preserve streaming output when queued message is sent (#23595 ) ## Problem When the user sends a message while the agent is actively streaming a response, `handleSend` called `store.clearStreamState()` unconditionally before the POST request. If the server queues the message (`response.queued = true` because the agent is busy), the in-progress stream output is immediately wiped from the UI. The full text only reappears once the agent finishes and the durable message arrives via WebSocket — causing a visible cutoff mid-stream. ## Fix Move `clearStreamState()` from before the POST to after the response, gated behind `!response.queued`: - Queued sends (`response.queued === true`): `clearStreamState()` is never called. The stream continues uninterrupted. The WebSocket `status` handler already clears stream state when the chat transitions to `"pending"` / `"waiting"` after the queued message is dequeued. - Non-queued sends (`response.queued === false`): `clearStreamState()` + `upsertDurableMessage()` fire immediately after the POST, same net behavior as before. - Edit and promote paths: Unchanged — those are intentional interruptions where eager clearing is correct. ### Additional behavior changes (both improvements) 1. Failed sends no longer wipe stream state. Previously `clearStreamState()` ran before the `try` block, so a network error still wiped the agent's in-progress output. Now the `catch` re-throws before reaching `clearStreamState()`, preserving the stream on failure. 2. `clearStreamState()` fires for all non-queued responses, not just those with a `message` body. The original guard was `!response.queued && response.message`; now `clearStreamState()` is under `!response.queued` while `upsertDurableMessage` retains the `response.message` check. The server always sets `message` for non-queued responses, so this is a no-op in practice but is semantically correct. 
## Testing AgentDetail.stories.tsx: New `StreamingSurvivesQueuedSend` story exercises the full flow — mocks `createChatMessage` to return `{ queued: true }`, delivers streaming text via WebSocket, sends a message through the UI, and asserts the streaming text remains visible.	2026-03-26 10:35:31 -04:00
Mathias Fredriksson	b23c07cf23	perf(site): use lazy iteration in sliceAtGraphemeBoundary (#23671 ) Array.from(graphemeSegmenter.segment(text)) materializes the entire text into an array before iterating, even though the loop breaks early at the visible prefix length. During streaming at 60fps, this makes each frame O(full text) instead of O(prefix). Benchmark on 5000-char text with 200-char prefix: 22.6x faster (1.44ms to 0.06ms per call, saving 8.3% of the frame budget). The fallback codepoint path had the same issue with Array.from.	2026-03-26 16:33:48 +02:00
Ethan	87aafd4ae2	fix(site): stabilize date-dependent storybook snapshots (#23657 ) _Generated by mux but reviewed by a human_ Several stories computed dates relative to `dayjs()` / `new Date()` at render time, causing snapshot text to shift daily. I ran into this on my PRs. This adds an optional `now` prop to `DateRangePicker`, `TemplateInsightsControls`, and `CreateTokenForm` so stories can inject a deterministic clock without global mocking. License stories replace the misleadingly-named `FIXED_NOW = dayjs().startOf("day")` with absolute timestamps. All fixed timestamps use noon UTC to avoid timezone boundary issues. Affected stories: - `AgentSettingsPageView`: Usage Date Filter, Usage Date Filter Refetch Overlay - `LicenseCard`: Expired/future AI Governance variants, Not Yet Valid - `LicensesSettingsPage`: Shows Addon Ui For Future License Before Nbf - `TemplateInsightsControls`: Day - `CreateTokenPage`: Default	2026-03-27 01:21:52 +11:00
Ethan	4d74603045	fix(coderd/x/chatd): respect provider Retry-After headers in chat retry loop (#23351 ) > PR Stack > 1. #23351 ← `#23282` (you are here) > 2. #23282 ← `#23275` > 3. #23275 ← `#23349` > 4. #23349 ← `main` --- ## Summary `chatretry.Retry()` used pure exponential backoff (1 s, 2 s, 4 s, …) and never consulted provider `Retry-After` headers. Fantasy's `ProviderError` carries `ResponseHeaders` including `Retry-After`, but `chaterror.Classify()` only parsed error text and silently dropped the structured transport metadata. This makes `Retry-After` a first-class signal in the classification → retry pipeline. <img width="853" height="346" alt="image" src="https://github.com/user-attachments/assets/65f012b6-8173-43d2-957e-ab9faddea525" /> ## Changes ### `coderd/chatd/chaterror/classify.go` - Added `RetryAfter time.Duration` field to `ClassifiedError` — a normalized minimum retry delay derived from provider response metadata. - `Classify()` now calls `extractProviderErrorDetails()` before falling back to text heuristics. Structured `ProviderError.StatusCode` takes priority over regex extraction. - `normalizeClassification()` preserves and clamps `RetryAfter`. ### `coderd/chatd/chaterror/provider_error.go` (new) Provider-specific extraction, isolated from the text-based classification logic: - `extractProviderErrorDetails()` unwraps `fantasy.ProviderError` from the error chain via `errors.As`. - `retryAfterFromHeaders()` parses headers in priority order: 1. `retry-after-ms` (OpenAI-specific, millisecond precision) 2. `retry-after` (standard HTTP — integer seconds or HTTP-date) - Case-insensitive header key lookup. ### `coderd/chatd/chatretry/chatretry.go` - `effectiveDelay(attempt, classified)` computes `max(Delay(attempt), classified.RetryAfter)` — the provider hint acts as a floor without weakening the local exponential backoff. 
- `Retry()` now uses `effectiveDelay` and passes the effective delay to both `onRetry(...)` and the sleep timer, so downstream payloads, logs, and the frontend countdown stay aligned automatically. ### Tests - `classify_test.go`: Structured provider status + `Retry-After` extraction, `retry-after-ms` priority, HTTP-date parsing, invalid header fallback, `WithProvider` preservation. - `chatretry_test.go`: Retry-after-as-floor semantics: longer hint wins, shorter hint keeps base delay. ## Design notes - No SDK/API/frontend changes needed. `codersdk.ChatStreamRetry` already carries `DelayMs` and `RetryingAt`, and the frontend already consumes them. The fix is purely in the server-side delay computation. - Existing retryability rules unchanged. This fixes when we sleep, not whether an error is retryable. - Provider hint is a floor: `max(baseDelay, RetryAfter)` ensures we never retry earlier than the provider asks, and never weaken our own backoff curve.	2026-03-27 01:20:46 +11:00
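The header priority and the floor semantics described in this commit can be sketched as below. This is an illustration under stated assumptions, not the code from `chaterror` or `chatretry`. The function names `retryAfterFromHeaders` and `effectiveDelay` follow the commit message, but their signatures here are guesses. `headerValue` and `baseDelay` are hypothetical helpers, headers are modelled as a plain `map[string]string`, and attempt numbering is assumed to start at 0.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// headerValue (hypothetical helper) looks up a header ignoring key case.
func headerValue(headers map[string]string, key string) (string, bool) {
	for k, v := range headers {
		if strings.EqualFold(k, key) {
			return strings.TrimSpace(v), true
		}
	}
	return "", false
}

// retryAfterFromHeaders returns the provider's retry hint, or 0 if none can
// be parsed. retry-after-ms is checked first, then the standard retry-after
// header in either integer-seconds or HTTP-date form.
func retryAfterFromHeaders(headers map[string]string) time.Duration {
	if v, ok := headerValue(headers, "retry-after-ms"); ok {
		if ms, err := strconv.ParseInt(v, 10, 64); err == nil && ms > 0 {
			return time.Duration(ms) * time.Millisecond
		}
	}
	if v, ok := headerValue(headers, "retry-after"); ok {
		if secs, err := strconv.ParseInt(v, 10, 64); err == nil && secs > 0 {
			return time.Duration(secs) * time.Second
		}
		if t, err := http.ParseTime(v); err == nil {
			if d := time.Until(t); d > 0 {
				return d
			}
		}
	}
	return 0
}

// baseDelay (hypothetical helper) is plain exponential backoff: 1s, 2s, 4s, ...
func baseDelay(attempt int) time.Duration {
	return time.Second << attempt
}

// effectiveDelay treats the provider hint as a floor: the result is never
// shorter than the hint and never shorter than the local backoff.
func effectiveDelay(attempt int, retryAfter time.Duration) time.Duration {
	if d := baseDelay(attempt); d > retryAfter {
		return d
	}
	return retryAfter
}

func main() {
	hint := retryAfterFromHeaders(map[string]string{"Retry-After": "5"})
	for attempt := 0; attempt < 4; attempt++ {
		fmt.Println(attempt, effectiveDelay(attempt, hint))
	}
}
```

With a 5-second hint, the first three attempts wait 5 seconds and the fourth waits 8 seconds, because the local backoff has by then grown past the hint.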
Cian Johnston	847a88c6ca	chore: clean up stale and dangerous //nolint comments (#23643 ) ## Changes - Commit 1: Remove 17 unnecessary `//nolint` directives: - `//nolint:varnamelen` — linter not active - `//nolint:unused` on exported `SlimUnsupported` - `//nolint:govet` in `coderd/httpmw/csrf` — no longer fires - `//nolint:revive` on functions refactored since the nolint was added - `//nolint:paralleltest` citing Go 1.22 loop variable capture (obsolete) - Bare `//nolint` narrowed to specific `//nolint:gocritic` with justification - Commit 2: Fix root causes behind 5 dangerous nolint suppressions: - Add `MinVersion: tls.VersionTLS12` to TLS client config (removes `gosec` G402) - Delete trivial unexported wrappers `apiKey()`/`normalizeProvider()` in chatprovider (removes `revive` confusing-naming) - Add doc comments to `StartWithAssert` and `Router` (removes `revive` exported) - Rename unused parameters to `_` in integration test helpers > 🤖 This PR was created using Coder Agents and reviewed by me.	2026-03-26 14:13:53 +00:00
Jeremy Ruppel	a0283ff775	fix(site): use `toLocaleString` for pagination offsets (#23669) The Pagination widget localizes the number format of the total results but not the page offsets.	2026-03-26 09:50:49 -04:00
Cian Johnston	f164463c6a	fix(scripts/metricsdocgen): shush the prometheus scanner in CI (#23642 ) - Suppress informational `log.Printf` messages from the metrics scanner when stdout is not a TTY (i.e. piped via `atomic_write` in `make gen` or CI) - Genuine warnings (`warnf`) still print unconditionally so real problems remain visible - `log.Fatalf` for fatal errors is unchanged > 🤖 Created by Coder Agents and reviewed by a human	2026-03-26 12:58:02 +00:00
Michael Suchacz	4f063cdc47	feat: separate default and additional Coder Agents system prompts (#23616 ) Admins can now control whether the built-in Coder Agents default system prompt is prepended to their custom instructions, rather than having the custom prompt silently replace the default. Changes: - New `include_default_system_prompt` boolean toggle (defaults to `true` for existing deployments) stored as a site config key — no migration needed. - GET `/api/experimental/chats/config/system-prompt` returns the toggle state, the custom prompt, and a preview of the built-in default. - PUT persists both the toggle and custom prompt atomically in a single transaction. - `resolvedChatSystemPrompt()` composes `[default?, custom?]` joined by `\n\n`, falling back to the built-in default on DB errors. - Settings UI adds a Switch toggle with conditional helper text and a "Preview" button that shows the built-in default prompt via the existing `TextPreviewDialog`. - Comprehensive test coverage: 15 subtests covering toggle behavior, prompt composition matrix, auth boundaries, and integration with chat creation.	2026-03-26 13:32:41 +01:00
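The `[default?, custom?]` composition described in this commit can be sketched as follows. This is an assumption-laden illustration, not the body of `resolvedChatSystemPrompt()`. The function name `composeSystemPrompt` and its parameters are hypothetical, the handling of blank prompts is a guess, and the fallback on database errors is not modelled.

```go
package main

import (
	"fmt"
	"strings"
)

// composeSystemPrompt (hypothetical helper) joins the built-in default prompt
// and the admin's custom prompt with a blank line between them. The default
// is included only when the toggle is on. Blank parts are skipped, which is
// an assumption of this sketch.
func composeSystemPrompt(includeDefault bool, defaultPrompt, customPrompt string) string {
	var parts []string
	if includeDefault && strings.TrimSpace(defaultPrompt) != "" {
		parts = append(parts, defaultPrompt)
	}
	if strings.TrimSpace(customPrompt) != "" {
		parts = append(parts, customPrompt)
	}
	return strings.Join(parts, "\n\n")
}

func main() {
	fmt.Printf("%q\n", composeSystemPrompt(true, "You are a coding agent.", "Prefer Go."))
	fmt.Printf("%q\n", composeSystemPrompt(false, "You are a coding agent.", "Prefer Go."))
}
```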
Cian Johnston	d175e799da	feat: show agent badge on workspace list (#23453) - Adds `GET /api/experimental/chats/by-workspace` endpoint that returns a workspace_id → latest chat_id mapping - Modifies FE to fetch this alongside the workspace list, gated on the `agents` experiment, and render an "Agent" badge similar to the existing "Task" badge in `WorkspacesTable` - Badge links to the "latest chat" linked to the given workspace. Notes: - Intentionally uses `fetchWithPostFilter` for RBAC to decouple from the workspaces API; will migrate to the `workspaces_expanded` view later. - If users have multiple chats linked to the same workspace, the badge will link to the most recently updated one. > 🤖 This PR was created with the help of Coder Agents, and has been reviewed by my human. 🧑‍💻	2026-03-26 11:30:12 +00:00
Jaayden Halko	3fb7c6264f	feat: display the AI add-on column in the UI on the Users and Organization Members tables (#23291 ) ## Summary Adds an entitlement-gated AI add-on column to both the Users table and the Organization Members table. When `ai_governance_user_limit` is entitled, each row shows whether the user is consuming an AI seat. ## Background The AI governance add-on tracks which users are consuming AI seats. Admins need visibility into per-user seat consumption directly from the user management tables. This change surfaces that information through both the site-wide Users table and the per-organization Members table, gated behind the `ai_governance_user_limit` entitlement so the column only appears when the feature is licensed. ## Implementation ### Backend - New SQL query `GetUserAISeatStates` (`coderd/database/queries/aiseatstate.sql`) — returns user IDs consuming an AI seat, derived from: - Users with entries in `aibridge_interceptions` (AI Bridge usage) - Users who own workspaces with `has_ai_task = true` builds (AI Tasks usage) - SDK types — added `has_ai_seat: boolean` to `codersdk.User` and `codersdk.OrganizationMemberWithUserData` - Handler wiring — both the Users list endpoint (`coderd/users.go`) and all Members endpoints (`coderd/members.go`) query AI seat state per page of user IDs and populate the response field - dbauthz — per-user `ActionRead` checks on `ResourceUserObject` ### Frontend - Shared `AISeatCell` component (`site/src/modules/users/AISeatCell.tsx`) — green `CircleCheck` for consuming, gray `X` for non-consuming - `TableColumnHelpTooltip` — extended with `ai_addon` variant with tooltip: "Users with access to AI features like AI Bridge, Boundary, or Tasks who are actively consuming a seat." 
- Column visibility gated behind `useFeatureVisibility().ai_governance_user_limit` ## Validation - Backend: dbauthz full method suite (`TestMethodTestSuite`) passes including new `GetUserAISeatStates` test - Backend: `TestGetUsers`, `TestUsersFilter`, CLI golden file tests pass - Frontend: 7/7 tests pass across `UsersPage.test.tsx` and `OrganizationMembersPage.test.tsx` (column visibility gating both directions) - `go build ./coderd/...` compiles clean - `pnpm --dir site run lint:types` passes - `make gen` clean ## Risks - Pagination performance: The AI seat query is scoped to the current page's user IDs (not a full table scan), keeping it efficient for paginated views. - Semantic scope: The workspace-side AI seat derivation uses "any build with `has_ai_task = true`" rather than "latest build only". If the product intent is latest-build-only, this can be tightened in a follow-up. --- _Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking: `xhigh` • Cost: `$27.25`_ <!-- mux-attribution: model=anthropic:claude-opus-4-6 thinking=xhigh costs=27.25 -->	2026-03-26 10:36:40 +00:00
Danny Kopping	09d2588e2a	docs: AI session auditing (#23660 ) _Disclaimer: produced with the help of Claude Opus 4.6, heavily modified by me._ Closes https://github.com/coder/internal/issues/1341 --------- Signed-off-by: Danny Kopping <danny@coder.com>	2026-03-26 09:49:53 +00:00
Danny Kopping	8eade29e68	chore: update AI Bridge warning to require AI Governance Add-On (#23662 ) Disclaimer: implemented by a Coder Agent using Claude Opus 4.6, reviewed by me. Replace the transitional soft warning message: > AI Bridge is now Generally Available in v2.30. In a future Coder version, your deployment will require the AI Governance Add-On to continue using this feature. Please reach out to your account team or sales@coder.com to learn more. with the definitive requirement message: > The AI Governance Add-On is required to use AI Bridge. Please reach out to your account team or sales@coder.com to learn more. Updated in: - `enterprise/coderd/license/license.go` - `enterprise/coderd/license/license_test.go` (2 occurrences)	2026-03-26 11:10:53 +02:00
Ethan	15f2fa55c6	perf(coderd/x/chatd): add process-wide config cache for hot DB queries (#23272) ## Summary Adds a process-wide cache for three hot database queries in `chatd` that were hitting Postgres on every chat turn despite returning rarely-changing configuration data. Call counts over 50k turns, before → after: - `GetEnabledChatProviders`: ~98.6k calls → ~500-1000 (~99% reduction) - `GetChatModelConfigByID`: ~49.2k calls → ~500-1000 (~98% reduction) - `GetUserChatCustomPrompt`: ~46.7k calls → ~1000-2000 (~97% reduction) These were identified via `coder exp scaletest chat` (5000 concurrent chats × 10 turns) as the dominant source of Postgres load during chat processing. ## Design Follows the established webpush subscription cache pattern (`coderd/webpush/webpush.go`): - `sync.RWMutex` + `tailscale.com/util/singleflight` (generic) + generation-based stale prevention + TTL - 10s TTL for provider/model config, 5s TTL for user prompts - Negative caching for `sql.ErrNoRows` on user prompts (the common case, since most users don't set custom prompts) - Deep-clones `ChatModelConfig.Options` (`json.RawMessage` = `[]byte`) on both store and read paths ### Invalidation Single pubsub channel (`chat:config_change`) with kind discriminator for cross-replica cache invalidation. Seven publish points in `coderd/chats.go` cover all admin mutation endpoints (create/update/delete for providers and model configs, put for user prompts). _This PR was generated with mux and was reviewed by a human_	2026-03-26 18:04:53 +11:00
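The combination of `sync.RWMutex`, a TTL, and generation-based stale prevention named in this commit can be sketched as below. It is a simplified illustration, not the cache from the PR. The `ttlCache` type and its methods are hypothetical, singleflight de-duplication and negative caching are left out, and `Invalidate` stands in for whatever the pubsub handler does on a `chat:config_change` message. Only the 10s provider/model TTL comes from the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// providerTTL mirrors the 10s TTL the commit message gives for
// provider/model config.
const providerTTL = 10 * time.Second

type entry[V any] struct {
	value   V
	expires time.Time
}

// ttlCache (hypothetical type) caches loaded values for a fixed TTL.
type ttlCache[K comparable, V any] struct {
	mu         sync.RWMutex
	ttl        time.Duration
	generation uint64 // bumped on every invalidation
	entries    map[K]entry[V]
}

func newTTLCache[K comparable, V any](ttl time.Duration) *ttlCache[K, V] {
	return &ttlCache[K, V]{ttl: ttl, entries: make(map[K]entry[V])}
}

// Get returns a fresh cached value or calls load to fetch one.
func (c *ttlCache[K, V]) Get(key K, load func() (V, error)) (V, error) {
	c.mu.RLock()
	e, ok := c.entries[key]
	gen := c.generation
	c.mu.RUnlock()
	if ok && time.Now().Before(e.expires) {
		return e.value, nil
	}

	v, err := load()
	if err != nil {
		var zero V
		return zero, err
	}

	c.mu.Lock()
	// Store only if no invalidation happened while loading. Otherwise the
	// value may predate the change that triggered the invalidation.
	if c.generation == gen {
		c.entries[key] = entry[V]{value: v, expires: time.Now().Add(c.ttl)}
	}
	c.mu.Unlock()
	return v, nil
}

// Invalidate drops all entries and advances the generation.
func (c *ttlCache[K, V]) Invalidate() {
	c.mu.Lock()
	c.generation++
	c.entries = make(map[K]entry[V])
	c.mu.Unlock()
}

func main() {
	queries := 0
	load := func() (string, error) { queries++; return "model-config", nil }

	cache := newTTLCache[string, string](providerTTL)
	cache.Get("default", load)
	cache.Get("default", load) // served from cache
	cache.Invalidate()
	cache.Get("default", load) // reloaded after invalidation
	fmt.Println("database queries:", queries)
}
```

The generation check covers the case where an admin mutation lands while a load is in flight: the caller still gets its result, but the possibly stale value is not written back into the cache.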
Danny Kopping	2ff329b68a	feat(site): add banner on request-logs page directing users to sessions (#23629 ) Disclaimer: implemented by a Coder Agent using Claude Opus 4.6 Adds an info banner on the `/aibridge/request-logs` page encouraging users to visit `/aibridge/sessions` for an improved audit experience. This allows us to validate whether customers still find the raw request logs view useful before removing it in a future release. Fixes #23563	2026-03-26 11:57:50 +05:00
Ethan	ad3d934290	fix(site/src/pages/AgentsPage): clear retry banner on stream forward progress (#23653) When a provider request fails and retries, the "Retrying request" banner lingered in the UI after the retry succeeded. This happened because `retryState` was only cleared on explicit `status` events (`running`, `pending`, `waiting`), not when the stream resumed with `message_part` or `message` events. Since the backend does not publish a dedicated "retry resolved" event, the banner stayed visible for the entire duration of the successful response. Add `store.clearRetryState()` calls to the `message_part`, `message`, and `status` event handlers so the banner disappears as soon as content flows again. Closes https://github.com/coder/coder/issues/23624	2026-03-26 17:41:13 +11:00
Ethan	21c2acbad5	fix: refine chat retry status UX (#23651 ) Follow-up to #23282. The retry and terminal error callouts had a few UX oddities: - Auto-retrying states reused backend error text that said "Please try again" even while the UI was already retrying on behalf of the user. - Terminal error states also said "Please try again" with no action the user could take. - `startup_timeout` had no specific title or retry copy — it fell through to the generic "Retrying request" heading. - The kind pill showed raw enum values like `startup_timeout` and `rate_limit`. - Terminal error metadata showed a "Retryable" / "Not retryable" label that does not help users. - A separate "Provider anthropic" metadata row duplicated information already present in the message body. - The `usage-limit` error kind used a hyphen while every backend kind uses underscores. Changes: Backend (`chaterror/message.go`) - Split message generation into `terminalMessage()` and `retryMessage()`, replacing the old `userFacingMessage()`. - Terminal messages include HTTP status codes and actionable guidance (e.g. "Check the API key, permissions, and billing settings."). - Retry messages are clean factual statements without status codes or remediation, suitable for the retry countdown UI (e.g. "Anthropic is temporarily overloaded."). - Removed "Please try again" / "Please try again later" from all paths. - `StreamRetryPayload` calls `retryMessage()` instead of forwarding `classified.Message`. Frontend - Removed the parallel frontend message-generation system: `getRetryMessage()`, `getProviderDisplayName()`, `getRetryProviderSubject()`, and the `PROVIDER_DISPLAY_NAMES` map are all deleted from `chatStatusHelpers.ts`. - `liveStatusModel.ts` passes `retryState.error` through directly — the backend owns the copy. - Added specific title and retry copy for `startup_timeout`, and extended the title mapping to cover `auth` and `config`. 
- Kind pills now show humanized labels ("Startup timeout", "Rate limit", etc.) instead of raw enum strings. - Removed the redundant "Provider anthropic" metadata row. - Removed the terminal "Retryable" / "Not retryable" badge. - Normalized `"usage-limit"` → `"usage_limit"` and added it to `ChatProviderFailureKind` so all error kinds follow the same underscore convention and live in one enum. Refs #23282.	2026-03-26 17:37:27 +11:00
Ethan	411714cd73	fix(dogfood/coder): tolerate stale gh auth state (#23588 ) ## Problem The dogfood startup script uses `gh auth status` to decide whether to re-authenticate the GitHub CLI. That command exits non-zero when any stored credential is invalid—even if Coder external auth already injects a working `GITHUB_TOKEN` into the environment and `gh` commands work fine. On workspaces with a persistent home volume, `~/.config/gh/hosts.yml` retains OAuth tokens written by previous `gh auth login --with-token` calls. These tokens are issued by Coder's external auth integration and can be rotated or revoked between workspace starts, but the copy in `hosts.yml` persists on the volume. When the stored token goes stale, `gh auth status` reports two accounts: ``` ✓ Logged in to github.com account user (GITHUB_TOKEN) ← works fine ✗ Failed to log in to github.com account user (hosts.yml) ← stale token ``` It exits 1 because of the stale entry, even though `gh` API calls succeed via `GITHUB_TOKEN`. This makes the auth state indeterminate from `gh auth status` alone—you can't tell whether `gh` actually works or not. When the script enters the login branch: 1. `gh auth login --with-token` refuses to accept piped input when `GITHUB_TOKEN` is already set in the environment, and exits 1. 2. `set -e` kills the script before it reaches `sudo service docker start`. The result: Docker never starts, devcontainer health checks fail, and the workspace reports a startup error—all because of a stale GitHub CLI credential that has no bearing on workspace functionality. ## Fix - Switch the auth guard from `gh auth status` to `gh api user --jq .login`, which tests whether GitHub API access actually works regardless of which credential provides it. - Wrap the fallback `gh auth login` so a failure logs the indeterminate state but does not abort the script.	2026-03-26 17:25:42 +11:00
Ethan	61e31ec5cc	perf(coderd/x/chatd): persist workspace agent binding across chat turns (#23274 )

## Summary

This change removes the steady-state "resolve the latest workspace agent" query from chat execution. Instead of asking the database for the latest build's agent on every turn, a chat now persists the workspace/build/agent binding it actually uses and reuses that binding across subsequent turns.

The common path becomes "load the bound agent by ID and dial it", with fallback paths to repair the binding when it is missing, stale, or intentionally changed.

## What changes

- add `workspace_id`, `build_id`, and `agent_id` binding fields to `chats`
- expose those fields through the chat API / SDK so the execution context is explicit
- load the persisted binding first in chatd, instead of always resolving the latest build's agent
- persist a refreshed binding when chatd has to re-resolve the workspace agent
- keep child / subagent chats on the same bound workspace context by inheriting the parent binding
- leave `build_id` / `agent_id` unset for flows like `create_workspace`, then bind them lazily on the next agent-backed turn

## Runtime behavior

The binding is treated as an optimistic cache of the agent a chat should use:

- if the bound agent still exists and dials successfully, we use it without a latest-build lookup
- if the bound agent is missing or no longer reachable, chatd re-resolves against the latest build and persists the new binding
- if a workspace mutation changes the chat's target workspace, the binding is updated as part of that mutation

To avoid reintroducing a hot-path query, dialing uses lazy validation:

- start dialing the cached agent immediately
- only validate against the latest build if the dial is still pending after a short delay
- if validation finds a different agent, cancel the stale dial, switch to the current agent, and persist the repaired binding

## Result

The hot path stops issuing `GetWorkspaceAgentsInLatestBuildByWorkspaceID` for every user message, which is the source of the DB pressure this PR is addressing. At the same time, chats still converge to the correct workspace agent when the binding becomes stale due to rebuilds or explicit workspace changes.	2026-03-26 17:22:38 +11:00
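The lazy-validation dial described in this commit message can be sketched roughly as follows. This is a minimal illustration and not the chatd implementation: `dialFunc`, `latestFunc`, `dialWithLazyValidation`, and `simulate` are hypothetical names, the 50 ms delay is an arbitrary stand-in for the "short delay", and persisting the repaired binding is left to the caller.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// dialFunc and latestFunc are hypothetical stand-ins for "dial this agent"
// and "look up the agent in the workspace's latest build".
type dialFunc func(ctx context.Context, agentID string) error
type latestFunc func(ctx context.Context) (string, error)

// dialWithLazyValidation dials the cached agent immediately and consults
// latest only if that dial fails or is still pending after delay. It returns
// the ID of the agent that was dialed successfully.
func dialWithLazyValidation(ctx context.Context, cached string, delay time.Duration, dial dialFunc, latest latestFunc) (string, error) {
	dialCtx, cancel := context.WithCancel(ctx)
	defer cancel()

	done := make(chan error, 1) // buffered so an abandoned dial never blocks
	go func() { done <- dial(dialCtx, cached) }()

	timer := time.NewTimer(delay)
	defer timer.Stop()

	var dialErr error
	finished := false
	select {
	case dialErr = <-done:
		finished = true
		if dialErr == nil {
			return cached, nil // fast path: no latest-build lookup
		}
	case <-timer.C:
		// The dial is slow; check whether the binding is stale.
	}

	current, err := latest(ctx)
	if err != nil {
		return "", fmt.Errorf("resolve latest agent: %w", err)
	}
	if current != cached {
		cancel() // abandon the stale dial
		if err := dial(ctx, current); err != nil {
			return "", err
		}
		return current, nil // the caller would persist this repaired binding
	}
	// The binding is still current, so the original dial is the right one.
	if !finished {
		dialErr = <-done
	}
	if dialErr != nil {
		return "", dialErr
	}
	return cached, nil
}

// simulate runs one dial against fake agents. Agents listed in reachable dial
// instantly; any other agent blocks until its context is cancelled.
func simulate(cached, latestID string, reachable ...string) (string, int, error) {
	up := map[string]bool{}
	for _, id := range reachable {
		up[id] = true
	}
	lookups := 0
	dial := func(ctx context.Context, id string) error {
		if up[id] {
			return nil
		}
		<-ctx.Done()
		return ctx.Err()
	}
	latest := func(context.Context) (string, error) {
		lookups++
		return latestID, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	agent, err := dialWithLazyValidation(ctx, cached, 50*time.Millisecond, dial, latest)
	return agent, lookups, err
}

func main() {
	agent, lookups, err := simulate("agent-1", "agent-1", "agent-1")
	fmt.Println(agent, lookups, err)
	agent, lookups, err = simulate("agent-old", "agent-new", "agent-new")
	fmt.Println(agent, lookups, err)
}
```

In the first call the cached agent answers before the delay elapses, so the lookup counter stays at zero; in the second the cached agent never answers, the timer fires, and the dial switches to the agent reported by the latest build.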
Ethan	17aea0b19c	feat(site): make long execute tool commands expandable (#23562 ) Previously, long bash commands in the execute tool were truncated with an ellipsis and could not be viewed in full. The only way to see the full command was to copy it via the copy button. Adds overflow detection and an inline expand/collapse chevron next to the copy button. Clicking the command text or the chevron toggles between truncated and wrapped views. Short commands that fit on one line are visually unchanged. https://github.com/user-attachments/assets/88ec6cd4-5212-4608-9a90-9ce217d5dce7 EDIT: couldn't be bothered re-recording the video but the chevron is hidden until hovered now, like the copy button.	2026-03-26 15:49:23 +11:00
Ethan	5112ab7da9	fix(site/e2e): fix flaky updateTemplate test expecting transient URL (#23655 )

_PR generated by Mux but reviewed by a human_

## Problem

The e2e test `template update with new name redirects on successful submit` is flaky. After saving template settings, the app navigates to `/templates/<name>`, which immediately redirects to `/templates/<name>/docs` via the router's index route (`<Navigate to="docs" replace />`).

The assertion used `expect.poll()` with `toHavePathNameEndingWith(`/${name}`)`, which matches only the transient intermediate URL: it only exists while `TemplateLayout`'s async data fetch is pending. Once the fetch resolves and the `<Outlet />` renders, the index route fires the `/docs` redirect and the URL no longer matches.

## Why it's flaky (not deterministic)

The flakiness depends on whether the template query cache is warm:

- Cache miss → PASSES: The mutation's `onSuccess` handler invalidates the query cache. If `TemplateLayout` needs to re-fetch, it shows a `<Loader />`, which delays rendering the `<Outlet />` that contains the `<Navigate to="docs">`. This gives `expect.poll()` time to see the transient `/new-name` URL → pass.
- Cache hit → FAILS: If the template data is still in the query client, `TemplateLayout` renders immediately and the `<Navigate to="docs" replace />` fires nearly instantly. By the time the first poll runs, the URL is already `/new-name/docs` → fail.

## Fix

Assert the final stable URL (`/${name}/docs`) instead of the transient one. This is safe because `expect.poll()` is retry-based: it keeps sampling until a match is found (or timeout). Seeing the transient `/new-name` URL just causes harmless retries; once the redirect completes and the URL settles on `/new-name/docs`, the poll matches and the test passes.

| Poll | URL | Ends with `/new-name/docs`? | Action |
|---|---|---|---|
| 1st | `/templates/new-name` | No | Retry |
| 2nd | `/templates/new-name` | No | Retry |
| 3rd | `/templates/new-name/docs` | Yes | Pass ✅ |

Closes https://github.com/coder/internal/issues/1403	2026-03-26 04:32:44 +00:00
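The retry argument above can be modelled outside Playwright. The sketch below is not the e2e test and does not use the Playwright API: `pollEndsWith`, `cacheMiss`, and `cacheHit` are hypothetical names, and a fixed attempt count stands in for the poll timeout. Each slice lists the URL seen on successive samples for one of the two cache scenarios.

```go
package main

import (
	"fmt"
	"strings"
)

// URL observed on each successive sample; the last entry is the settled URL.
var cacheMiss = []string{"/templates/new-name", "/templates/new-name", "/templates/new-name/docs"}
var cacheHit = []string{"/templates/new-name/docs"}

// pollEndsWith samples the URL once per attempt and reports whether any
// sample ends with suffix. After the sequence is exhausted the URL stays at
// its final value, which models a page that has finished redirecting.
func pollEndsWith(urls []string, suffix string, attempts int) bool {
	if len(urls) == 0 {
		return false
	}
	for i := 0; i < attempts; i++ {
		idx := i
		if idx >= len(urls) {
			idx = len(urls) - 1
		}
		if strings.HasSuffix(urls[idx], suffix) {
			return true
		}
	}
	return false
}

func main() {
	// Asserting the transient URL depends on timing.
	fmt.Println(pollEndsWith(cacheMiss, "/new-name", 5))
	fmt.Println(pollEndsWith(cacheHit, "/new-name", 5))
	// Asserting the settled URL matches in both scenarios.
	fmt.Println(pollEndsWith(cacheMiss, "/new-name/docs", 5))
	fmt.Println(pollEndsWith(cacheHit, "/new-name/docs", 5))
}
```

Only the settled suffix matches in both scenarios, which is why the assertion on `/docs` no longer depends on whether the cache was warm.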
Cian Johnston	7a9d57cd87	fix(coderd): actually wire the chat template allowlist into tools (#23626 )

Problem: previously, the deployment-wide chat template allowlist was never actually wired in from `chatd.go`

- Extracts `parseChatTemplateAllowlist` into shared `coderd/util/xjson.ParseUUIDList`
- Adds `Server.chatTemplateAllowlist()` method that reads the allowlist from DB
- Passes `AllowedTemplateIDs` callback to `ListTemplates`, `ReadTemplate`, and `CreateWorkspace` tool constructors

> 🤖 Created by Coder Agents and reviewed by a human.	2026-03-25 22:15:27 +00:00
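A rough sketch of the callback wiring this commit describes, under several assumptions the message does not confirm: that the allowlist is stored as a JSON array of UUID strings, and that an empty allowlist means "no restriction". `parseUUIDList`, `listTemplates`, and `visibleTemplates` are hypothetical and do not reproduce the signatures of `xjson.ParseUUIDList` or the real tool constructors.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

var uuidRe = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// parseUUIDList is a hypothetical stand-in for a shared parser. It assumes
// the stored value is a JSON array of UUID strings.
func parseUUIDList(raw string) ([]string, error) {
	if strings.TrimSpace(raw) == "" {
		return nil, nil
	}
	var ids []string
	if err := json.Unmarshal([]byte(raw), &ids); err != nil {
		return nil, fmt.Errorf("parse UUID list: %w", err)
	}
	for i, id := range ids {
		if !uuidRe.MatchString(id) {
			return nil, fmt.Errorf("entry %d is not a UUID: %q", i, id)
		}
		ids[i] = strings.ToLower(id)
	}
	return ids, nil
}

// listTemplates models a tool that receives the allowlist as a callback, so
// it reads the current allowlist on every call instead of a startup snapshot.
func listTemplates(all []string, allowedTemplateIDs func() ([]string, error)) ([]string, error) {
	allowed, err := allowedTemplateIDs()
	if err != nil {
		return nil, err
	}
	if len(allowed) == 0 {
		return all, nil // ASSUMPTION: an empty allowlist means unrestricted
	}
	set := make(map[string]bool, len(allowed))
	for _, id := range allowed {
		set[id] = true
	}
	var out []string
	for _, id := range all {
		if set[strings.ToLower(id)] {
			out = append(out, id)
		}
	}
	return out, nil
}

// visibleTemplates wires the two pieces together for a raw stored value.
func visibleTemplates(rawAllowlist string, all []string) ([]string, error) {
	return listTemplates(all, func() ([]string, error) {
		return parseUUIDList(rawAllowlist)
	})
}

func main() {
	all := []string{
		"11111111-1111-1111-1111-111111111111",
		"22222222-2222-2222-2222-222222222222",
	}
	fmt.Println(visibleTemplates(`["22222222-2222-2222-2222-222222222222"]`, all))
	fmt.Println(visibleTemplates(`["not-a-uuid"]`, all))
}
```

Passing a callback rather than a parsed slice is what lets a tool pick up allowlist changes without being reconstructed; the bug described above is the case where no such callback was passed at all.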
david-fraley	dab4e6f0a4	fix(site): use standard dismiss label for cancel confirmation dialogs (#23599 )	2026-03-25 21:24:53 +00:00
Kayla はな	0e69e0eaca	chore: modernize typescript api client/types imports (#23637 )	2026-03-25 15:21:19 -06:00
Kyle Carberry	09bcd0b260	fix: revert "refactor(site/src/pages/AgentsPage): normalize transcript scrolling" (#23638 ) Reverts coder/coder#23576	2026-03-25 20:24:42 +00:00


13335 Commits