coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 04:58:23 +00:00

Author	SHA1	Message	Date
Michael Suchacz	99a83a2702	fix: clean Bedrock headers (#24718 ) Bedrock chat provider requests can inherit Anthropic public API headers from the process environment, which causes mixed Anthropic and Bedrock auth headers on signed requests. Update the Anthropic SDK fork so its Bedrock middleware strips Anthropic-only headers before signing requests, and keep a chatprovider regression test for the production request shape. > Mux is acting on Mike's behalf.	2026-04-26 21:50:29 +02:00
Michael Suchacz	62e9752acd	fix: prevent malformed OpenAI Responses continuations (#24725 ) > Worked on by Mux on Mike's behalf. ## Summary - Disable OpenAI Responses `previous_response_id` chain mode when the prior assistant response has unresolved local tool calls, so the next request can include paired tool outputs instead of sending an incomplete continuation. - Update the fantasy pin to a Responses replay fix that preserves stored reasoning references, only replays web search references when paired with reasoning, and validates local function-call output pairing before send. - Add fake OpenAI Responses input validation for the two production 400 shapes and integration coverage for full-history reasoning plus web search replay. - Add sanitized diagnostics for the OpenAI Responses continuity errors. ## Tests - `go test ./providers/openai -run 'TestResponsesToPrompt_(ReasoningWithStore\|ReasoningWithWebSearchCombined\|WebSearchRequiresReasoningReference\|ReasoningWithFunctionCallCombined\|WebSearchProviderExecutedToolResults)\|TestPrepareParams_(SkipsProviderExecutedToolReferences\|ValidatesFunctionCallOutputPairing)\|TestValidateResponsesInput_WebSearchReferenceRequiresReasoning' -count=1` - `go test ./providers/openai -count=1` - `GOWORK=off go test ./coderd/x/chatd/chattest -run TestValidateResponsesAPIInput -count=1` - `GOWORK=off go test ./coderd/x/chatd -run 'TestOpenAIResponses(NoStaleWebSearchReplay\|FullReplayPairsReasoningAndWebSearch\|ChainModeSkipsWhenLocalCallPending\|ChainModeStillFiresForProviderExecutedOnly)$\|TestResolveChainMode_' -count=1` - `GOWORK=off go test ./coderd/x/chatd/chatprompt -run 'TestInjectMissingToolResults_' -count=1` - `GOWORK=off go test ./coderd/x/chatd/chaterror -run TestClassify_OpenAIResponsesAPIDiagnostics -count=1` - `GOWORK=off go test ./coderd/x/chatd/... -count=1` - `git diff --check` - `git commit` pre-commit hook	2026-04-26 21:23:06 +02:00
Michael Suchacz	ed33e28b13	fix(coderd/x/chatd): wake after auto-promoting queued message (#24714 ) `tryAutoPromoteQueuedMessage` in `processChat`'s deferred cleanup could set a chat back to `pending` without waking the processor. The processor only noticed on the next 10ms poll, so under load tests like `TestAutoPromoteQueuedMessageFallsBackForInvalidQueuedModelConfigID` could time out waiting for the second streaming request (#1500). Call `p.signalWake()` after the promoted-message publishes when `promotedMessage != nil`, matching the pattern used by `CreateChat`, `SendMessage`, `EditMessage`, `PromoteQueued`, and `InterruptChat`. Make the regression helper `testAutoPromoteQueuedMessageFallback` deterministic by setting `PendingChatAcquireInterval = time.Hour` and synchronizing on a `secondRunStarted` channel instead of polling `requestCount`, so the test fails without the wake instead of relying on the 10ms ticker. Closes https://github.com/coder/internal/issues/1500 > Mux is acting on Mike's behalf.	2026-04-26 11:08:32 +02:00
Michael Suchacz	0211448d09	fix(coderd): sanitize Anthropic provider tool history (#24706 ) Anthropic can reject replayed chat histories when a provider-executed tool call, such as `web_search`, is present without its matching provider result block. This sanitizes unpaired Anthropic provider-executed tool calls during prompt reconstruction, before Anthropic requests, and before persistence so existing poisoned histories can continue and new malformed turns are not stored. Resolves: CODAGT-259 > Mux is acting on Mike's behalf.	2026-04-24 23:57:30 +02:00
Cian Johnston	0ccfd575d0	fix(coderd/database/migrations): rename duplicate migration 477 (#24707)	2026-04-24 14:49:11 +00:00
Michael Suchacz	c7cac9debe	fix: persist per-turn model on chats and queued messages (#24688 ) Previously, `chats.last_model_config_id` was not updated when a user sent a mid-chat message with a different model, and queued messages did not store their own per-turn model, so promotion ran against whatever the chat row said at promote time. Chat watch events also did not merge `last_model_config_id` into the site's root, child, and per-chat caches, so sidebar labels stayed stale after direct sends and queued promotions. - Add nullable `chat_queued_messages.model_config_id`, backfilled from `chats.last_model_config_id`. Queued inserts round-trip the effective model id at enqueue time. - In `coderd/x/chatd`, direct sends update `chats.last_model_config_id` inside the same transaction that inserts the admitted user message. Manual promotion and auto-promotion use the queued row's stored `model_config_id`, with a fallback to `chats.last_model_config_id` for legacy NULL rows during rollout. `PromoteQueuedOptions.ModelConfigID` is now ignored. - On the site, extract `mergeWatchedChatSummary` and `mergeWatchedChatIntoCaches` in `site/src/api/queries/chats.ts` so status-change watch events merge `last_model_config_id` into the root infinite chat list, the parent-embedded child entry, and the per-chat `chatKey(chatId)` cache. `updated_at` guards against stale watch payloads clobbering newer cached state, while diff status events still merge their PR metadata because they are timestamped outside the chat row. Watch timestamps are compared as instants so variable fractional precision does not make fresh events look stale. - Queued promotion validates stored model config IDs before admission. Invalid legacy queued IDs fall back to the chat's current model config instead of dropping the queued message during auto-promotion. 
- Backend and frontend regression coverage added for admission, queue promotion (including FIFO across mixed models, legacy NULL fallback, and invalid queued model IDs), and chat watch cache merging. > Mux is acting on Mike's behalf.	2026-04-24 15:36:08 +02:00
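The point about comparing watch timestamps as instants can be illustrated in Go, although the change itself lives in the TypeScript site code. This is a minimal sketch, assuming RFC 3339 timestamps and a hypothetical `isStaleEvent` helper: comparing the raw strings misorders values whose fractional precision differs, while comparing parsed instants does not.

```go
package main

import (
	"fmt"
	"time"
)

// isStaleEvent reports whether a watch event is older than the cached row.
// Both arguments are RFC 3339 timestamps. The name and signature are
// hypothetical and only illustrate instant-based comparison.
func isStaleEvent(cachedUpdatedAt, eventUpdatedAt string) (bool, error) {
	cached, err := time.Parse(time.RFC3339Nano, cachedUpdatedAt)
	if err != nil {
		return false, fmt.Errorf("parse cached timestamp: %w", err)
	}
	event, err := time.Parse(time.RFC3339Nano, eventUpdatedAt)
	if err != nil {
		return false, fmt.Errorf("parse event timestamp: %w", err)
	}
	return event.Before(cached), nil
}

func main() {
	cached := "2026-04-24T13:36:08Z"
	event := "2026-04-24T13:36:08.123Z" // 123ms newer, with more precision

	// Lexicographic comparison gets this wrong because '.' sorts before 'Z'.
	fmt.Println(event < cached) // true

	stale, err := isStaleEvent(cached, event)
	if err != nil {
		panic(err)
	}
	fmt.Println(stale) // false
}
```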
Cian Johnston	a876287d36	feat: auto-archive inactive chats with audit trail (#24642 ) Adds a background job in `dbpurge` that periodically archives chats inactive beyond a configurable threshold. Each archived root chat gets a background audit entry tagged `chat_auto_archive`. Disabled by default. * New `AutoArchiveInactiveChats` SQL query with LATERAL last-activity subquery and partial index on archive candidates * `site_configs`-backed `auto_archive_days` setting with admin-only PUT, any-authenticated-user GET * Cascade archive via `root_chat_id`; pinned chats and active threads exempt * Root-only audit dispatch on detached context, matching manual archive (`patchChat`) behavior * 11 subtests covering disabled no-op, boundary, deleted messages, child activity, pinned exemption, multi-owner, idempotency, and batch pagination PR #24643 adds per-owner digest notifications. PR #24704 adds the requisite UI controls. > 🤖	2026-04-24 14:18:28 +01:00
Danielle Maywood	3a9a60dff8	feat: add collapsible thinking blocks with configurable display mode (#24635)	2026-04-24 11:29:08 +00:00
Michael Suchacz	3d90546aae	feat: add general subagent model override (#24610 ) Adds a deployment-wide admin override for general delegated subagents. ## What changed - store the general override in `site_configs` and expose it through the shared `agent-model-override/{context}` API - apply the general override when spawning delegated general subagents, while preserving the existing Explore override behavior - reuse a shared Agents settings form for the general and Explore override sections ## Validation - `make gen` - `go test ./coderd -run 'TestChatModelOverrides'` - `go test ./coderd/x/chatd -run 'TestSpawnAgent_(GeneralUsesConfiguredModelOverride\|GeneralOverrideLogsAndFallsBackWhenCredentialsUnavailable\|GeneralOverrideLogsAndFallsBackWhenProviderDisabled)'` - `pnpm -C site lint:types` - `pnpm -C site test:storybook -- AgentSettingsAgentsPageView.stories.tsx` - `make lint` - `make pre-commit` > Mux is acting on Mike's behalf.	2026-04-24 12:37:20 +02:00
Cian Johnston	a02339c66a	fix(coderd/x/chatd): prevent invalid tool results from poisoning chat history (#24663 ) - computeruse.go: Decode base64 screenshot data before storing in `ToolResponse.Data` (was casting base64 string to bytes without decoding) - chatloop.go: Re-encode `ToolResponse.Data` to base64 via `base64.StdEncoding.EncodeToString` instead of `string()` cast - mcpclient.go: UTF-8 validate all text from MCP responses in `convertCallResult()` using `strings.ToValidUTF8` - chatprompt.go (persist): Defense-in-depth UTF-8 sanitization of text and media Text fields before database storage - chatprompt.go (replay): Antivenom layer that validates base64 and UTF-8 at read time, auto-healing already-poisoned chats without requiring a migration - `TestToolResultAntivenom`: 4 subtests covering poisoned text, poisoned media, valid media round-trip, and media with invalid UTF-8 text - Adds `TestConvertCallResult_UTF8Sanitization`: 4 subtests covering invalid UTF-8 in TextContent, EmbeddedResource, valid passthrough, and multi-part - Adds `TestComputerUseTool_Run_ScreenshotDataIsDecodedBinary`: Verifies no double-encode in the computer-use path - Updated existing computer-use tests for the new decoded-binary contract > 🤖	2026-04-23 19:58:38 +01:00
Cian Johnston	c602a31856	fix(coderd): reject pinning child chats in patchChat handler (#24669 ) The UI already prevents child (delegated/subagent) chats from being pinned, but the `PATCH /api/experimental/chats/{chat}` endpoint did not enforce this. A direct API call could pin a child chat. - Add a `400 Bad Request` guard in `patchChat` when `pinOrder > 0` and the chat has a `ParentChatID` - Add `TestChatPinOrder/RejectsChildChat` test > 🤖	2026-04-23 18:36:20 +01:00
Michael Suchacz	dbcc654d28	feat: snapshot explore subagent tool entitlements (#24638 ) Explore sub-agents previously could not use `web_search` or external MCP tools. `runChat` hard-skipped both for Explore. Lifting those guards naively would over-grant tools, because a child chat could outlive the spawning turn's plan-mode filter. This change persists the spawning parent turn's filtered external MCP server IDs onto the child Explore chat, and simplifies the Explore provider-tool filter in `runChat`: - New `resolveExploreToolSnapshot` helper: computes the child's inherited external MCP subset by running the parent's configs through `filterExternalMCPConfigsForTurn` (plan-mode policy) and, if the parent is itself an Explore child, further narrowing to the parent's own persisted `MCPServerIDs`. The result is written to the child's `MCPServerIDs` column at spawn time. - The existing `mcp_server_ids` column is the sole durable snapshot. No new chat column is added. - `runChat` for Explore children: loads MCP tools from the persisted snapshot, and keeps only `web_search` from provider-native tools (to block computer-use and other write-style tools, since Explore is read-only). Whether `web_search` is actually available is a per-model decision, determined by the current model config, just like a main chat. - Built-in Explore allowlist is unchanged. Workspace-local MCP remains excluded for Explore. Verification: `go build ./...`, `go test ./coderd/x/chatd/... -count=1`, `make gen` (clean tree), `make lint/emdash`, `go vet`. Deep-review ran 12 reviewers on the feature and 5 on the clarity refactor; CAR reviewed and approved; a subsequent scope reduction dropped a temporary `allow_web_search` column in favor of per-model handling. > Mux is acting on Mike's behalf.	2026-04-23 19:07:38 +02:00
Cian Johnston	b5a625549e	feat: migrate agents-access to org-scoped system role for proper chat RBAC (#24438 ) The agents-access role previously granted chat permissions at user scope, but chats are org-scoped objects. Rego skips user-level perms when org_owner is set, making the grants invisible. Handler-level band-aids used synthetic non-org-scoped objects as a workaround. - Migrates agents-access from users.rbac_roles (site-level) to organization_members.roles (org-scoped) via DB migration - Redefines agents-access as a predefined org-scoped builtin role alongside organization-admin, organization-auditor, etc., with Member permissions granting chat create/read/update - Excludes ResourceChat from OrgMemberPermissions so org membership alone no longer grants chat access - Fixes handler Authorize checks to use org-scoped objects with semantically correct actions (ActionUpdate for message/tool operations) - Grants org admins the ability to assign agents-access Closes #24250 Fixes CODAGT-174 Note: this does not update the "Usage" endpoints. Tracked by CODAGT-161. > 🤖	2026-04-23 17:59:42 +01:00
Mathias Fredriksson	f8fe5d680b	fix(coderd): reject API operations on archived chats (#24633 ) Archived chats accept mutations (messages, edits, queued-message promotions, tool-result submissions) via the API, causing them to re-enter the processing pipeline. This violates the hard-stop design intent from PR #23758. Add archived checks at three layers: - HTTP handlers (postChatMessages, patchChatMessage, promoteChatQueuedMessage, postChatToolResults): return 400 after auth so callers get a clear error. - Daemon functions (SendMessage, EditMessage, PromoteQueued, SubmitToolResults): return ErrChatArchived after row lock, guarding against future callers that bypass the handler. - AcquireChats SQL: filter out archived chats so they are never acquired for processing. Fixes CODAGT-245	2026-04-23 19:03:33 +03:00
Danny Kopping	a8613b2209	chore: deprecate /api/v2/aibridge/interceptions endpoint (#24670 ) Disclaimer: implemented by a Coder Agent using Claude Opus 4.6 Marks the `GET /api/v2/aibridge/interceptions` endpoint as deprecated in favor of `/aibridge/sessions`, which provides richer session-level aggregation including threads and agentic actions. Changes: - Add `@Deprecated` Swagger annotation to the endpoint handler - Add deprecation notice to the `codersdk.Client.AIBridgeListInterceptions` method - Regenerated OpenAPI spec with `"deprecated": true` flag The endpoint remains fully functional. Fixes https://github.com/coder/internal/issues/1339	2026-04-23 15:33:40 +02:00
Cian Johnston	2e5c7d99c2	fix(coderd/x/chatd): fix flaky TestSpawnComputerUseAgentInheritsContext (#24666 ) Fixes flaky `TestSpawnComputerUseAgentInheritsContext`. - The test inserts an Anthropic provider directly into the DB after `CreateChat` has already been called - The server's background goroutine may have already cached the provider list (OpenAI only) via `configCache.EnabledProviders()` with a 10s TTL - The direct DB insert bypasses the pubsub event that production uses to invalidate the cache - `isAnthropicConfigured()` returns the stale cached result, making `computer_use` appear unavailable - Fix: call `server.configCache.InvalidateProviders()` after the insert, mirroring what production does via pubsub CI failure: https://github.com/coder/coder/actions/runs/24829197096/job/72673070101?pr=24648 > 🤖	2026-04-23 13:18:18 +01:00
Jake Howell	4caa52844d	chore!: remove `api.ts` unnecessary calls (#22168) > [!WARNING] > The change of the status code from `404` to `204` could break people's code downstream. Adding this as a breaking change just in case. There's a whole ton of noise around failed requests; these are all unrelated to the actual thing that is broken at hand (and are confusing). * Change `/api/v2/organizations/.../templates/.../versions/.../previous` to return `204` instead of `404` (this actually makes more sense because the content doesn't exist, but the route is found). * Remove unnecessary calls to `/api/v2/users/me/appearance` when the user isn't logged in. * Remove unnecessary calls to `/api/v2/deployment/stats` when the deployment stats aren't allowed to be seen. * Various changes to `workspace-sharing` so we don't make unnecessary calls. What's left: * `/api/v2/users/me` still `401`s on the login page. This persists because a user who is logged in but tries to reach the sign-in page should be redirected to the app, not asked to sign in again. * `monaco-editor` is still upset... we theoretically could inject an environment that can serve workers... but eh. #### Old ```sh % pnpm playwright:test -g "create workspace with default and required parameters" > coder-v2@ playwright:test /home/coder/coder/site > playwright test --config=e2e/playwright.config.ts -g 'create workspace with default and required parameters' ... Running 2 tests using 1 worker ✓ 1 …e/setup/addUsersAndLicense.spec.ts:7:5 › setup deployment (8.2s) 2 ….ts:79:5 › create workspace with default and required parameters [console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized) [console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized) [response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. 
Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."} [response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."} [console][error] Failed to load resource: the server responded with a status of 403 (Forbidden) [response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."} [console][error] Failed to load resource: the server responded with a status of 403 (Forbidden) [response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."} [console][error] Failed to load resource: the server responded with a status of 404 (Not Found) [response] url=http://localhost:3111/api/v2/organizations//provisionerdaemons status=404 body={"message":"Resource not found or you do not have access to this resource"} [console][error] Failed to load resource: the server responded with a status of 404 (Not Found) [response] url=http://localhost:3111/api/v2/organizations/default/templates/a4e8096d/versions/agreeable_glenn33/previous status=404 body={"message":"No previous template version found for \"agreeable_glenn33\"."} [console][warning] Could not create web worker(s). Falling back to loading web worker code in main thread, which might cause UI freezes. 
Please see https://github.com/microsoft/monaco-editor#faq [console][warning] You must define a function MonacoEnvironment.getWorkerUrl or MonacoEnvironment.getWorker [console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized) [console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized) [response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."} [response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."} [console][error] Failed to load resource: the server responded with a status of 403 (Forbidden) [response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."} ✓ 2 …5 › create workspace with default and required parameters (7.0s)atus of 403 (Forbidden) [response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."} [console][error] Failed to load resource: the server responded with a status of 403 (Forbidden) [response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. 
If you believe this is a mistake, please contact your administrator or try signing in with different credentials."} 2 passed (56.1s) ``` `23 LOL` (Lines of logs) #### New ```sh % pnpm playwright:test -g "create workspace with default and required parameters" > coder-v2@ playwright:test /home/coder/coder/site > playwright test --config=e2e/playwright.config.ts -g 'create workspace with default and required parameters' ... Running 2 tests using 1 worker ✓ 1 …e/setup/addUsersAndLicense.spec.ts:7:5 › setup deployment (8.7s) 2 ….ts:79:5 › create workspace with default and required parameters [console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized) [console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized) [response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."} [response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."} [console][warning] Could not create web worker(s). Falling back to loading web worker code in main thread, which might cause UI freezes. Please see https://github.com/microsoft/monaco-editor#faq [console][warning] You must define a function MonacoEnvironment.getWorkerUrl or MonacoEnvironment.getWorker ✓ 2 …5 › create workspace with default and required parameters (7.1s)atus of 401 (Unauthorized) [console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized) [response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. 
Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."} [response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."} 2 passed (32.0s) ``` `9 LOL` (Lines of logs)	2026-04-23 06:20:35 +10:00
Cian Johnston	be1256c418	fix(coderd): fix TestListChats/PinnedOnFirstPage race timeout (#24641 ) - Insert filler chats directly into the database with `completed` status instead of creating them via the API - Removes the `testutil.Eventually` polling loop that waited for all 52 chats to reach terminal status - Avoids spawning 52 background chat processors that each time out on title generation under `-race`, exceeding the 25s `WaitLong` timeout - Test now completes in ~1s instead of timing out at 30s+ Flake: https://github.com/coder/coder/actions/runs/24789695935/job/72543519963?pr=24438 > 🤖	2026-04-22 20:37:06 +01:00
Mathias Fredriksson	1ace519c6e	fix(coderd/x/chatd): remove cache-miss check blocking agent recovery (#24634 ) The cache-miss isAgentUnreachable check added in #24336 runs before dialWithLazyValidation, preventing the existing switch mechanism from discovering the new agent after a workspace rebuild. The chat's stale agent binding is never repaired, causing an infinite loop of 'agent is disconnected' errors. Remove the cache-miss check. The cache-hit check remains (it verifies the agent behind an established connection). The dial timeout and dialWithLazyValidation already bound the cache-miss failure path. Closes CODAGT-248	2026-04-22 21:49:10 +03:00
Cian Johnston	72e3ae9c5f	feat: add chatd tool call error metrics and logging (#24559 ) - Add `coderd_chatd_tool_errors_total` prometheus counter (labels: provider, model, tool_name) - Log tool call errors at warn level with correlation fields: chat_id, owner_id, organization_id, workspace_id, agent_id, parent_chat_id, trigger_message_id, tool_name, tool_call_id, provider, model - Thread enriched logger from chatd.go into chatloop via `RunOptions.Logger` - Remove squashing of all MCP tool calls to the `mcp` bucket > 🤖	2026-04-22 16:19:56 +00:00
Michael Suchacz	7904bed947	fix: fall back to local git watcher for chat diff drawer (#24512 ) The Ctrl+D diff drawer in `coder exp agents` only rendered PR-backed diffs returned by `/api/experimental/chats/{id}/diff`. Local working tree changes in a chat's workspace returned an empty diff, so the drawer showed "No diff contents" with no file summary. Centralise diff loading behind a single `fetchChatDiffContents` helper that first hits `/diff`, then falls back to the chat git watcher WebSocket (`/stream/git`) when the remote diff is empty. Aggregate the agent's `WorkspaceAgentRepoChanges` into a `ChatDiffContents` value so the drawer can derive the file summary and styled body from the local unified diff. Missing workspaces, missing agents, and watcher timeouts are treated as graceful fallbacks that render the empty-diff placeholder instead of a hard error. > Mux is opening this PR on Mike's behalf.	2026-04-22 18:08:02 +02:00
Jeremy Ruppel	c23abc691f	feat: sort AI sessions by last prompt time (#24440 ) Previously, the sessions list sorted by `MIN(started_at)` across interceptions, so sessions with old start times but recent activity would sink to the bottom of the list regardless of how recently they were used. `ListAIBridgeSessions` now sorts by `COALESCE(MAX(prompt.created_at), MIN(started_at)) DESC`, exposed as the non-nullable `last_active_at` field. Sessions with prompts surface by last activity; sessions with no prompts fall back to their start time. The original implementation used two separate columns (`last_active_at` as a nullable prompt timestamp and `sort_at` as the non-nullable cursor key). This revision collapses them into a single `last_active_at` that is always set — simplifying the SQL, the Go conversion, the API type, and the frontend. 🤖 Generated with [Claude Code](https://claude.ai/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 12:06:49 -04:00
Marcin Tojek	ec91ac5427	fix: grant AsAIBridged ResourceSystem.ActionCreate for UpsertAISeatState (#24603) Related coder/internal#1444	2026-04-22 16:38:57 +02:00
Michael Suchacz	9b5d09ebdc	test(coderd/x/chatd): seed anthropic provider for computer_use tests (#24611 ) `TestSubagentLifecycleToolsIncludePersistedSubagentTypeAcrossVariants/ComputerUse` and two adjacent positive tests passed a static Anthropic key into `newInternalTestServer`, but `seedInternalChatDeps` only inserts an OpenAI provider. At runtime, `Server.resolveUserProviderAPIKeys` calls `chatprovider.PruneDisabledProviderKeys`, which clears `keys.Anthropic` because Anthropic is not in the enabled DB provider set, so the `computer_use` execution path loses its key. Add a focused test helper `seedEnabledAnthropicProvider` and use it only in the positive tests that actually drive a `computer_use` spawn through the runtime key-resolution path (the `computer_use` branch of `TestSubagentLifecycleToolsIncludePersistedSubagentTypeAcrossVariants`, `TestSpawnAgent_ComputerUseUsesComputerUseModelNotParent`, and `TestSpawnAgent_ComputerUseInheritsMCPServerIDs`). `seedInternalChatDeps` stays unchanged, so the negative availability tests continue to model the "Anthropic unavailable" fixture. No production code is modified. Closes https://github.com/coder/internal/issues/1486 > This PR was opened by Mux working on Mike's behalf.	2026-04-22 15:54:17 +02:00
Thomas Kosiewski	b7c2c59931	fix(coderd/x/chatd/chatdebug): allow Anthropic per-modality ratelimit headers (#24592 ) Previously, Anthropic's per-modality, Priority Tier, and fast-mode rate-limit headers (`Anthropic-Ratelimit-Input-Tokens-`, `Anthropic-Ratelimit-Output-Tokens-`, `Anthropic-Priority-Input-Tokens-`, `Anthropic-Priority-Output-Tokens-`, `Anthropic-Fast-Input-Tokens-`, and `Anthropic-Fast-Output-Tokens-`) were shown as `[REDACTED]` in the Debug panel because they contain `"token"` in the name and fell through the generic credential filter. Add them to the allowlist in `coderd/x/chatd/chatdebug/redaction.go` alongside the existing `Anthropic-Ratelimit-Tokens-*` entries so the limits/remaining/reset values surface in the raw response view.	2026-04-22 15:14:31 +02:00
Thomas Kosiewski	26b64fa523	fix(coderd/x/chatd/chatdebug): record SSE attempts on EOF (#24565 ) `chat_turn` debug steps persist with `attempts: []` even when the streaming call to Anthropic completes successfully. Fantasy's Anthropic SSE adapter iterates the response to EOF via `for stream.Next()` and abandons the body without calling `Close()`, so `RecordingTransport`'s Close-only recording path never fires and the attempt is lost. Non-streaming runs (`quickgen`, `title_generation`) go through `model.Generate(...)` and are unaffected. Record on `io.EOF` for `text/event-stream` bodies specifically. Non-SSE responses stay on the Close-only path so JSON integrity, content-length validation, and inner-`Close()` error semantics are preserved. `record()` is already `sync.Once`-guarded, so a later `Close()` is a no-op for recording.	2026-04-22 15:02:02 +02:00
Michael Suchacz	9634739aed	fix: support Bedrock ambient AWS credentials for Agents providers (#24397) > This PR was authored by Mux on behalf of Mike. Adds AWS Bedrock ambient credential support to the Agents provider path. Bedrock providers can now be saved without a stored API key and authenticated via the standard AWS SDK credential chain on the Coder server (IAM roles, `AWS_ACCESS_KEY_ID`, etc.). Also fixes missing `Base URL` forwarding for Bedrock. ## Changes Backend runtime (`coderd/x/chatd/chatprovider/chatprovider.go`): - New `ProviderAllowsAmbientCredentials(provider)` helper. Currently returns true only for Bedrock. - `ModelFromConfig` no longer errors on an empty API key when the provider is in the ambient-allowed set AND was explicitly resolved via `ByProvider`. This preserves the policy gate: unresolvable providers (disabled central key, user-key-required without a user key) still error. - `setResolvedProviderAPIKey` internalizes the ambient-credentials contract via `ProviderAllowsAmbientCredentials`, so a resolved-but-keyless Bedrock provider is represented as an empty `ByProvider` entry rather than a post-hoc sentinel patch in the caller. - `WithAPIKey` is only appended when a token is present. - `WithBaseURL(baseURL)` is now forwarded for Bedrock (was previously missing). Backend admin API (`coderd/exp_chats.go`): - `validateChatProviderCentralAPIKey` exempts Bedrock from requiring a stored API key when central credentials are enabled. - AI Gateway separation (`ChatProviderAPIKeysFromDeploymentValues`) is unchanged. No silent reuse of `CODER_AIBRIDGE_BEDROCK_*` flags. Frontend (`site/src/pages/AgentsPage/components/ChatModelAdminPanel/`): - API Key field is optional for Bedrock when central credentials are enabled. - Bedrock-specific descriptions on API Key and Base URL fields (bearer-token vs ambient modes, `AWS_REGION` guidance). - Right-aligned "Clear stored token" action switches an existing Bedrock provider back to ambient mode. 
- `hasEffectiveAPIKey` treats Bedrock with central credentials enabled as configured, so the provider list shows the correct status icon. - Three new stories: `ProviderFormBedrockAmbientCredentials`, `ProviderFormBedrockBearerToken`, `ProviderFormBedrockClearBearerToken`. Docs (`docs/ai-coder/agents/models.md`, `docs/ai-coder/ai-gateway/setup.md`): - New "Configuring AWS Bedrock" section covering both credential modes, region resolution, and the Base URL override. - Explicit note that the `us-east-1` region fallback only applies to bearer-token mode; ambient credentials require a region from the standard AWS SDK chain. - Cross-reference in AI Gateway docs clarifying that `CODER_AIBRIDGE_BEDROCK_*` flags are a separate configuration path from Agents. ## Not in scope - Reusing AI Gateway Bedrock flags as an implicit Agents fallback. - Per-provider AWS access key, secret, or region fields (would need a migration and audit-table review). - IMDS or network-backed credential probes in admin/listing request paths. ## Related Dogfood deployment integration: https://github.com/coder/dogfood/pull/324	2026-04-22 14:20:23 +02:00
Mathias Fredriksson	78d9a220cf	fix(coderd/x/chatd): detect disconnected agents in getWorkspaceConn (#24336 ) Add agent status check and dial timeout to getWorkspaceConn to prevent tool calls from hanging when a workspace agent disconnects. Status check: call isAgentUnreachable on every getWorkspaceConn call. On cache miss, check the freshly fetched agent row. On cache hit, re-fetch the agent row by PK for a fresh heartbeat timestamp. Disconnected and timed-out agents return a sentinel immediately; connecting agents proceed to dial. Dial timeout: wrap dialWithLazyValidation in a 30s context.WithTimeoutCause (matching 8 other server-side AgentConn callers). Parent context cancellation propagates unchanged so the chatloop can detect ErrInterrupted. Both sentinels tell the LLM the agent is unreachable and the workspace may need restarting from the dashboard. Closes CODAGT-149	2026-04-22 12:10:32 +00:00
Cian Johnston	38f5d3f0b2	test: add regression guard for chat title masking (#24584 ) Follow-up to #24564 addressing unresolved review findings. - DEREM-1: Add `Test_diff/Chat/TitleMasked` to `enterprise/audit/diff_internal_test.go` so flipping `title` back to `ActionTrack` fails loudly. Verified: the case passes today, fails with a clear diff after flipping to `ActionTrack`, passes again after reverting. - DEREM-4: Inline comment at `coderd/audit/request.go:138` explaining why `ResourceTarget` for `database.Chat` returns a UUID prefix instead of the title. - DEREM-5: Trailing comment on `enterprise/audit/table.go` `title` entry, matching the surrounding `ActionSecret` comment style. Won't-fix, with rationale (per user): - DEREM-2 (8-char prefix collision risk): `resource_target` is a display hint, not an identifier; the full UUID lives in `resource_id`. - DEREM-3 (named constant for `[:8]`): single call site; extracting would be ceremony. - DEREM-6 (PR title misleading): merged PR title is immutable. - DEREM-7 (historical log redaction): the offending version only shipped to dogfood for a couple of hours and not to customers. > 🤖	2026-04-22 10:52:52 +00:00
Jakub Domeracki	86b2db60b2	fix(coderd): enforce ActionSSH in MCP HTTP agent connection path (#24607 )	2026-04-22 12:34:17 +02:00
Ethan	cc4e04afde	feat(site): display file attachments in chat UI (#24281 ) Renders the durable file attachments introduced in #24280 in the chat interface. Without this, attachments were stored and served correctly but the UI showed raw file parts with no previews or download UX. Every attachment gets a download affordance, split into three rendering tiers: - Images — thumbnail with a hover/focus overlay containing a download link. `onFocusCapture`/`onBlurCapture` with `contains(relatedTarget)` keeps the overlay open while tabbing between the image and its download link. - Text-like files (`text/`, `application/json`) — expandable preview button with loading + error-with-retry states and the same download overlay. Preview fetches throw a typed `FetchTextAttachmentError` with a `.status` field instead of a stringly-typed error. - Everything else — compact `FileCard` with extension badge, filename, and download link. User-side and assistant-side rendering now share `AttachmentBlocks.tsx` (`AttachmentPreviewFrame`, `TextAttachmentButton`, `ImageAttachmentButton`, `FileCard`, plus `getAttachmentHref`/`getAttachmentName`) instead of two near-duplicate implementations. The text-attachment overlay anchors to the preview surface so the download button stays pinned even when a loading/error status line widens the row below. `ComputerRenderer` detects when a screenshot was stored as a durable attachment (`attachment_file_id`) and suppresses the stale base64 rendering — the screenshot appears as a proper file part instead. `ToolLabel` shows the attached filename for `attach_file` tool calls. Storybook coverage in `ConversationTimeline.stories.tsx` was expanded to cover every tier (single/multiple images, inline + file-id text, JSON, download-only files, fetch-failure retry, mixed attachments + file references) with play-function assertions. 
<img width="811" height="150" alt="image" src="https://github.com/user-attachments/assets/27c71081-3502-4e80-92a7-d8adf1ff9323" /> ## Cleanup Per Mathias' post-merge suggestion on #24280, this PR also relocates `coderd/chatfiles` → `coderd/x/chatfiles` so the durable-attachment helpers live beside the rest of the `chatd` experimental surface. Closes CODAGT-91	2026-04-22 20:11:53 +10:00
Ethan	ad1906589d	fix(coderd): allow deleting chat providers used in historical chats (#24568 ) Drop the `chat_model_configs.provider -> chat_providers.provider` foreign key and soft-delete model configs when their provider is removed. The provider row is now hard-deleted inside a transaction that also tombstones its model configs and promotes a replacement default when needed. Historical chats and messages keep pointing at the soft-deleted model config rows, which are hidden from live/admin queries but still resolve for read. The runtime chat path already falls back to the default model config when a soft-deleted config is looked up. Replaces the lost FK validation in the create/update model-config handlers with an explicit provider lookup that returns the existing `Chat provider is not configured.` 400. ## UX Admin deleting a chat provider that has historical usage - Before: blocked with 400 `Provider models are still referenced by existing chats.` Admins had no in-product way to remove a provider that had ever been used. - After: delete succeeds (204). Any model configs under that provider are soft-deleted. If the removed provider owned the default model config, one of the remaining live configs is auto-promoted to the new default. The promotion is deterministic (`ensureDefaultChatModelConfig` picks the first live config by `provider ASC, model ASC, updated_at DESC, id DESC`); there is no picker, and no toast or response detail names which config became the new default. End users with chats that used a deleted provider's model - Old chats still open and their history still renders unchanged. - Sending a new turn in such a chat silently falls back to the current default model. No banner or warning tells the user the original model is gone. - The model picker no longer lists the deleted model. - If no default model config exists at all after the delete, sending a new turn fails with `no default chat model config is available`. 
Admin creating or updating a model config against a provider that is not configured - Same as before: 400 `Chat provider is not configured.` Only the detection mechanism changed (explicit `FOR UPDATE` lookup inside the transaction, which also serializes against a concurrent provider delete). Admin updating a model config whose row disappears mid-transaction - Now returns the standard 404 `Resource not found or you do not have access to this resource` instead of the previous 500 that leaked `sql: no rows in result set` in the detail. Unrelated internal races (for example a race on the promoted default candidate) are still reported as 500 so they are not misclassified as "your target is gone". Closes CODAGT-23	2026-04-22 19:34:34 +10:00
Cian Johnston	360e119b43	fix(coderd): use waitChatSettled in remaining title tests (#24585 ) - Replace inline `require.Eventually` blocks in `PreservesUpdatedAt` and `NoOpWhenTitleUnchanged` with the shared `waitChatSettled` helper - These were the last two title subtests still using direct DB polling instead of the API-based helper > 🤖	2026-04-22 09:14:25 +01:00
Ethan	353e522614	fix: handle expired chat file attachments in replay and UI (#24518 ) Closes CODAGT-216 ## Problem `dbpurge` deletes `chat_files` rows after the deployment's configured retention window, but `chat_messages.content` can still contain `file_id` references to those files. On replay, that left the Anthropic provider with an empty file payload and a `400 image cannot be empty` error. In the UI, the same missing file showed up as a broken image. ## Fix - Backend: when replay hits a `file_id` whose bytes are gone, replace it with a short text placeholder instead of emitting an empty file part. We could also drop the missing attachment entirely, but that would silently remove context from the replay and make the conversation harder for the model to interpret. The placeholder keeps the request valid while still telling the model that a file used to be there and is no longer available. - Frontend: classify chat image failures instead of treating every broken image the same. - `404` file fetches render `Image expired`, with a tooltip explaining that chat attachments are deleted after the retention window set for the deployment. - Other remote failures render `Image failed to load`, with a tooltip that surfaces server/network detail when available. - Invalid inline image data still renders `Image failed to load` without a probe.	2026-04-22 14:10:51 +10:00
blinkagent[bot]	79a9f437d7	feat(coderd/x/chatd/chattool): add description tags to tool parameter structs (#24394 )	2026-04-21 11:37:29 -07:00
Jaayden Halko	148e56b5d9	fix(coderd): fix TestPatchChat/Title flake by waiting for chat to settle (#24572 ) ## Problem `TestPatchChat/Title/Rename` and `TestPatchChat/Title/TrimsWhitespace` fail intermittently on `test-go-pg` with: ``` PATCH .../api/experimental/chats/<id>: unexpected status code 409: Title regeneration already in progress for this chat. ``` `createChat` persists a chat with `ChatStatusPending` and signals the daemon wake loop. If the `UpdateChat` PATCH arrives before the daemon transitions the chat past `Pending`/`Running`, the handler's `acquireManualTitleLock` returns a 409. Whether the PATCH wins the race is timing-dependent under PG + `-parallel` load. Sibling subtests `PreservesUpdatedAt` and `NoOpWhenTitleUnchanged` already wait for the chat to leave `Pending`/`Running` before renaming, which is why they do not flake. ## Fix Add a `waitChatSettled` helper closure in `TestPatchChat` that polls `client.GetChat` until the chat status leaves `Pending`/`Running`. Call it in the 4 subtests that issue a valid rename immediately after `createChat`: - `Title/Rename` (originally reported flake) - `Title/TrimsWhitespace` (originally reported flake) - `Title/LengthBoundaries` (latent flake in valid-rename cases) - `Title/PublishesWatchEvent` (latent flake, goroutine silently 409s) No handler, daemon, or SDK changes. The 409 is intentional production behavior; this is a pure test-side timing fix. Refs coder/internal#1480	2026-04-21 17:10:00 +01:00
Ethan	c1421b4ead	test(coderd/x/chatd): deflake stale control notification test (#24545 ) Previously, `TestProcessChat_IgnoresStaleControlNotification` could return as soon as `UpdateChatStatus` ran, even though `processChat` still re-read chat state and finished deferred cleanup afterward. That let gomock and quartz teardown race the tail of cleanup and intermittently fail the test. Wait for `processChat` itself to return before asserting the final status, while keeping the existing strict mock expectations intact. Closes https://github.com/coder/internal/issues/1479	2026-04-22 00:08:34 +10:00
Ethan	2295e9d5be	feat: surface upstream provider error details in chat callout (#24546 ) Anthropic HTTP 400 responses (e.g. "image exceeds 5 MB maximum") were collapsed in the chat UI to the generic headline "Anthropic returned an unexpected error (HTTP 400)." with no actionable detail — the upstream message survived to the processor log but was dropped before reaching the client. Add a new optional `Detail` field on `codersdk.ChatStreamError` that carries the upstream provider message alongside the existing normalized headline. The backend extracts `error.message` from `fantasy.ProviderError.ResponseBody` (the JSON envelope shared by Anthropic and OpenAI), falls back to the trimmed provider message when the body is absent or unparseable, and caps the result at 500 runes. The frontend threads `Detail` through `useChatStore`, `liveStatusModel`, and `ChatStatusCallout`, rendering it as a muted secondary line inside the existing `AlertDescription`. Before: <img width="1552" height="185" alt="image" src="https://github.com/user-attachments/assets/524b588e-3cee-4fad-bc15-6bf3aec0899d" /> After: <img width="814" height="173" alt="image" src="https://github.com/user-attachments/assets/eae82a89-3ac1-4a33-8d18-ef9f77263d89" /> ## Persistence `Detail` is not persisted — it disappears on refresh. Persisting it would require a DB change (today `chats.last_error` is a single nullable `TEXT` column), and the shape of persisted chat errors is worth a more deliberate rethink — e.g. promoting `last_error` to `JSONB` so we can also retain structured fields like `kind`, `statusCode`, `provider`, and `retryable` instead of only the normalized headline string. That's a bigger design discussion than this PR should carry. In the meantime, seeing the upstream error reason immediately on failure is already a large UX improvement over the status quo, and this PR gets us there without prejudicing the eventual persistence design. Tracking persistence in CODAGT-239. 
Closes CODAGT-235	2026-04-22 00:05:27 +10:00
Cian Johnston	4d45b69b03	fix: stop tracking chat title in audit logs (#24564 ) Chat titles can contain sensitive information (secrets, internal project names, etc.) and should not be visible in audit logs. - Use truncated chat UUID (first 8 chars) as `resource_target` instead of the title - Mark the `title` field as `ActionSecret` so diffs render as `••••••••` <details><summary>Implementation notes</summary> Two changes: 1. `coderd/audit/request.go`: `ResourceTarget` for Chat returns `typed.ID.String()[:8]` instead of `typed.Title` 2. `enterprise/audit/table.go`: Chat `title` field tracking changed from `ActionTrack` to `ActionSecret` No frontend changes needed. The frontend already handles `secret: true` fields. </details> > 🤖	2026-04-21 14:26:22 +01:00
Michael Suchacz	f073323c89	refactor: unify subagent spawn behind spawn_subagent (#24535 ) Unify the three subagent spawn tools (`spawn_agent`, `spawn_explore_agent`, `spawn_computer_use_agent`) behind a single `spawn_subagent` tool keyed by a `subagent_type` discriminant (`general`, `explore`, `computer_use`). Mirrors the single-entry-point pattern already used by `task` in mux while keeping `wait_agent`, `message_agent`, and `close_agent` as separate lifecycle tools. A new backend subagent definition catalog (`coderd/x/chatd/subagent_catalog.go`) is the source of truth for tool description, prompt guidance, availability rules (plan mode, desktop/Anthropic gating), and child-chat option building. `spawn_subagent` advertises only the types available in the current context and validates `subagent_type` server-side; context inheritance still flows through the existing `createChildSubagentChatWithOptions` path. `wait_agent`, `message_agent`, and `close_agent` responses now include a server-derived `subagent_type` so the UI stops inferring lifecycle state from tool names. The frontend gets a shared normalization helper (`site/src/pages/AgentsPage/components/ChatElements/tools/subagentDescriptor.ts`) that maps either legacy tool names or new `spawn_subagent` args into a common descriptor (action, variant, icon, fallback copy). Legacy transcripts still render identically; `Tool.tsx`, `SubagentTool.tsx`, `ToolLabel.tsx`, `ToolIcon.tsx`, and `messageParsing.ts` now key off the descriptor instead of hard-coded names. Existing UI copy is preserved (`Spawning Explore agent...`, `Using the computer...`, computer-use monitor icon and Open Desktop affordance). > This PR was opened by Mux working on Mike's behalf.	2026-04-21 14:01:32 +02:00
Michael Suchacz	cb67e71835	fix(coderd/database): renumber duplicate MCP migration (#24552 ) ## Summary - rename the `allow_in_plan_mode` migration pair from `000472` to `000473` - rename the matching fixture file and update its comment - remove the duplicate migration version that broke containerized database startup ## Testing - `go test ./coderd/database/migrations -run '^TestMigrate$' -count=1 -timeout 15m` - validated `iofs.New` for `coderd/database/migrations` and `coderd/database/migrations/testdata/fixtures` Closes coder/internal#1483 > Mux opened this PR on Mike's behalf.	2026-04-21 11:10:17 +00:00
Michael Suchacz	9d0469fc4c	feat: allow approved external MCP tools in root plan mode (#24509 ) ## Summary Allow root plan-mode chats to use MCP tools from external servers that an admin has explicitly approved for plan mode. Workspace MCP and plan-mode subagents remain blocked. ## Problem `chatd.go` excluded every MCP tool when `isPlanModeTurn` was true, so planning had no access to tools like docs search, ticketing, etc. Lifting that guard wholesale was unsafe: `mcp_server_configs` already has centralized admin governance, but workspace-local MCP (discovered from agent `.mcp.json`) does not, and subagents use a narrower trust boundary. ## Fix Add an admin-controlled per-server `allow_in_plan_mode` flag (default `false`) and gate plan-mode MCP access on it. ### Backend / schema - New migration `000472_mcp_server_allow_in_plan_mode.{up,down}.sql` and matching fixture update. - `mcpserverconfigs.sql` + generated code: persist and read the new column. - `codersdk/mcp.go`: thread the field through `MCPServerConfig`, `Create`, and `Update` request types. - `coderd/mcp.go`: validate, persist, and return the flag in get/list/create/update handlers. ### chatd - `coderd/x/chatd/chatd.go`: pre-filter selected external MCP configs by `AllowInPlanMode` before calling `mcpclient.ConnectAll` on plan-mode root turns. Workspace MCP discovery is skipped entirely on plan-mode turns. - Single helper decides whether a tool is available in plan mode, used both at construction and for active-tool filtering (defense in depth). Plan-mode subagents, dynamic tools, provider-native tools, computer-use, and workspace MCP stay unchanged. - `coderd/x/chatd/prompt.go`: update the root plan-mode overlay text to match the new boundary. ### UI - `MCPServerAdminPanel.tsx`: add an explicit toggle ("Allow all tools from this MCP server in root plan mode") next to the existing governance controls. - Regenerated `site/src/api/typesGenerated.ts`. 
### Docs - `docs/ai-coder/agents/architecture.md`: replace the blanket "MCP is unavailable in plan mode" note with the new root-only, external-only, admin-approved policy. Explicitly call out that workspace MCP and plan-mode subagents are still excluded. ### Tests - Plan-mode visibility (approved vs non-approved external server). - Plan-mode invocation of an approved external MCP tool. - End-to-end plan-mode workflow that uses an approved MCP tool and then reaches `propose_plan`. - Regressions: workspace MCP still excluded in plan mode; plan-mode subagents still on the restricted tool boundary; existing tool allow/deny list filtering still applies. ## Policy precedence `allow_in_plan_mode` is an additional requirement on top of existing `enabled`, availability, chat-selected / forced server IDs, and tool allow/deny lists. It approves all tools on that server for root plan mode; a per-tool plan allowlist is deliberately deferred. ## Follow-ups (explicitly out of scope) - Whether plan-mode subagents should inherit approved external MCP tools. - Workspace-local MCP safety model (agent-side `.mcp.json` schema vs. a coderd-managed workspace MCP config). ## Validation - `go vet ./coderd/x/chatd/...` - `go test ./coderd/x/chatd -run 'TestPlan.\|TestMCP.' -count=1` - `go test ./coderd/x/chatd -count=1 -timeout 5m` (full chatd suite) - `make fmt` (no diff) > Mux opened this PR on Mike's behalf.	2026-04-21 12:26:12 +02:00
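The plan-mode gating described in the commit above can be pictured with a small filter. This is a simplified sketch under stated assumptions: the `mcpServerConfig` type and `filterForTurn` function are hypothetical, and the real implementation also applies availability rules, selected or forced server IDs, and tool allow/deny lists that are omitted here.

```go
package main

import "fmt"

// mcpServerConfig is a hypothetical, reduced view of an external MCP server config.
type mcpServerConfig struct {
	Name            string
	Enabled         bool
	AllowInPlanMode bool // admin-controlled flag, assumed to default to false
}

// filterForTurn returns the names of external MCP servers usable on a turn.
// On plan-mode turns only root chats may use servers approved for plan mode;
// plan-mode subagents get none.
func filterForTurn(configs []mcpServerConfig, planMode, subagent bool) []string {
	var names []string
	for _, c := range configs {
		if !c.Enabled {
			continue
		}
		if planMode && (subagent || !c.AllowInPlanMode) {
			continue
		}
		names = append(names, c.Name)
	}
	return names
}

func main() {
	configs := []mcpServerConfig{
		{Name: "docs-search", Enabled: true, AllowInPlanMode: true},
		{Name: "ticketing", Enabled: true, AllowInPlanMode: false},
		{Name: "legacy", Enabled: false, AllowInPlanMode: true},
	}
	fmt.Println("normal turn:", filterForTurn(configs, false, false))
	fmt.Println("root plan mode:", filterForTurn(configs, true, false))
	fmt.Println("plan-mode subagent:", filterForTurn(configs, true, true))
}
```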
Cian Johnston	c968a1f3a3	feat: make database.Chat auditable (#24485 ) Wire database.Chat into the audit system so chat lifecycle events (creation, patches, etc.) produce audit log entries. Part of CODAGT-200. > 🤖	2026-04-21 11:11:56 +01:00
Cian Johnston	5f3effd839	fix(coderd/x/chatd): add chattest.OpenAI() default fake server (#24540 ) - Add `chattest.OpenAI(t)` convenience wrapper around `NewOpenAI` with sensible defaults (JSON title response for non-streaming, text chunk for streaming) - Update `seedChatDependencies` to use it instead of an empty base URL, preventing title generation from hitting real `api.openai.com` with a fake key: ``` t.go:111: 2026-04-20 19:23:31.885 [debu] coderd.chatd.processor: title model candidate failed chat_id=edb43454-f23d-4163-9974-d101b8091de6 chat_id=edb43454-f23d-4163-9974-d101b8091de6 ... error= generate structured title: github.com/coder/coder/v2/coderd/x/chatd.generateStructuredTitleWithUsage /home/coder/src/coder/coder/coderd/x/chatd/quickgen.go:443 - unauthorized: Incorrect API key provided: test-api-key. You can find your API key at https://platform.openai.com/account/api-keys. ``` > 🤖	2026-04-21 10:26:20 +01:00
Ethan	181e103201	fix: reuse shared tailnet for coderd-hosted MCP workspace tools (#24460 ) ## Problem Coderd can expose an MCP server at `/api/experimental/mcp/http` (we have this enabled on dogfood). Its workspace tools dialed agents through a per-call client-side tailnet stack. Every tool call re-created a WireGuard device, netstack, magicsock + UDP sockets, DERP connection, coordinator websocket, and their goroutines — in a process that already runs a long-lived shared tailnet. The duplicate stacks drove up resource usage under load. ## Fix Route this server's tool calls through the existing shared tailnet, so none of those transports are reconstructed per call. Closing an `AgentConn` now releases a tunnel reference instead of tearing down a transport. ## Potential follow-up `coder exp mcp server` still builds a fresh tailnet per call. It pays per-call latency and causes coordinator/DERP churn. A shared CLI tailnet is more involved — unlike coderd, the CLI has no existing shared tailnet to reuse, so it would need a new long-lived client-side tailnet with reconnect, sleep/wake, and idle-destination handling. There's less motivation to optimize this, given the client-side MCP does not compete for resources with coderd. Closes CODAGT-199 > Generated by mux, but reviewed by a human	2026-04-21 11:37:10 +10:00
Ethan	1203f625b7	feat(coderd): accept parameters in start_workspace tool (#24434 ) When the chat `start_workspace` tool triggers an active-version upgrade that introduces new required parameters, the build fails with a parameter validation error. Previously this returned a message telling the user to update from the UI — a dead end for the model. This PR lets the model recover inside the chat by: 1. Accepting an optional `parameters` map on `start_workspace` (same schema as `create_workspace`), forwarded as `RichParameterValues`. 2. Returning structured JSON error responses that preserve validation details and the workspace's `template_id`, so the model can call `read_template` to discover what changed. 3. Replacing the UI-only guidance in `exp_chats.go` with model-actionable retry instructions. The expected model flow on an active-version parameter failure is now: ``` start_workspace → fails (structured error with template_id + validations) read_template → discovers new required parameters start_workspace → retries with parameters map → workspace starts ``` <img width="846" height="511" alt="image" src="https://github.com/user-attachments/assets/d18b6864-5970-4225-8da0-0f2ab134ccb4" />	2026-04-21 11:36:20 +10:00
Jakub Domeracki	411ed21059	fix(coderd): omit frame-ancestors CSP for embed routes (#24529 )	2026-04-20 15:38:52 +02:00
Jaayden Halko	410f9a5e19	feat: allow renaming of agent chat title (#24489 ) Co-authored-by: Coder Agents <noreply@coder.com>	2026-04-20 14:00:46 +01:00
Thomas Kosiewski	18a30a7a10	feat: add chat debug HTTP handlers and API docs (#23918 )	2026-04-20 13:34:41 +02:00
Dean Sheather	ea00d2d396	fix(coderd): enforce workspace authz on watchChatGit (#24477 ) `watchChatGit` proxies a live websocket to the workspace agent's git watcher (`/api/v0/git/watch`), streaming repository diffs back through the chat stream. Before this change it only enforced `chat:read` (via `ExtractChatParam`) plus an implicit `workspace:read` from the dbauthz wrapper on `GetWorkspaceAgentsInLatestBuildByWorkspaceID`. The sibling `watchChatDesktop` handler already fetches the workspace and requires `policy.ActionApplicationConnect` or `policy.ActionSSH` before dialing. Built-in roles like Template Admin and Org Admin grant `workspace:read` without SSH/ApplicationConnect, and Owner also loses both under `DisableOwnerWorkspaceExec`. A chat owner whose exec-level workspace access was revoked after the chat was bound could therefore keep streaming repository content from the workspace agent through the chat's git-watch endpoint. Mirror `watchChatDesktop`: fetch the workspace and require `ApplicationConnect \|\| SSH` before any agent-tunnel activity. Adds one real-coderdtest regression test (`TestWatchChatGitAuthz`) that demotes the chat's owner to template-admin after binding and asserts the git-watch endpoint returns 403; the mock-based `TestWatchChatGit` in `coderd/workspaceagents_internal_test.go` continues to cover the no-workspace / disconnected-agent / websocket-proxy paths. Fixes CODAGT-184.	2026-04-20 21:33:35 +10:00

3702 Commits