3723 Commits

Author SHA1 Message Date
Garrett Delfosse f009c17217 fix(coderd): cut DB fan-out on agent instance-identity auth (backport #24973) (#24982)
Backport of #24973 to `release/2.33`.

## Summary

Restores `v2.33.0-rc.2`-equivalent query cost for agent
instance-identity auth, which currently saturates the pgx pool when
multiple agents share an instance ID. Customer report against rc.3
traced 233x `Internal error fetching provisioner job resource` 500s
during a 50-minute incident window to this path.

## Changes

1. **System fast-path on `authorizeProvisionerJob`**
(`coderd/database/dbauthz/dbauthz.go`): Short-circuits the per-job RBAC
fan-out through `GetWorkspaceBuildByJobID` -> `GetWorkspaceByID` for
`AsSystemRestricted` callers.
2. **Drop survivor re-fetch in `handleAuthInstanceID`**
(`coderd/workspaceresourceauth.go`): Captures the provisioner job
alongside each candidate during the filter loop so the post-selection
code reads it directly instead of re-querying.

## Conflict resolution

One conflict in `coderd/database/dbauthz/dbauthz_test.go`: the
`TestAsAutostart` test function (from an unrelated commit on `main`) was
brought in as surrounding context during the cherry-pick. It was removed
since it tests functionality (`ResourceUserSecret.Read` for the
Autostart role) not present on the release branch.

## Tests

- `TestAuthorizeProvisionerJob_SystemFastPath` (3 sub-tests): all pass
- `TestPostWorkspaceAuthAWSInstanceIdentity/Ambiguous/*` (7 sub-tests):
all pass

> Generated by Coder Agents

Co-authored-by: Dean Sheather <dean@deansheather.com>
2026-05-05 21:54:04 +02:00
Jon Ayers 17635dde5c chore: include pgcoordinator schema changes in 2.33 (#24931)
Includes https://github.com/coder/coder/pull/24613 since it landed prior
to the pgcoordinator migration

---------

Co-authored-by: Marcin Tojek <mtojek@users.noreply.github.com>
2026-05-04 15:42:34 -05:00
github-actions[bot] e67d027786 fix(coderd/externalauth): detect concurrent refresh race to prevent cache poisoning (#24228) (#24938)
Cherry-pick of https://github.com/coder/coder/pull/24228

Original PR: #24228 — fix(coderd/externalauth): detect concurrent
refresh race to prevent cache poisoning
Merge commit: da6e708bd2
Requested by: @f0ssel

Co-authored-by: Jason Barnett <J@sonBarnett.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Garrett Delfosse <garrett@coder.com>
2026-05-04 14:03:39 -04:00
Cian Johnston eabb68d89e fix: add preset support to MCP tools (#24694) (#24889)
The chat tools (`read_template`, `create_workspace`) did not surface or
respect template version presets. Presets were invisible to the LLM and
preset parameter defaults were never applied at workspace creation. The
`toolsdk` MCP surface had the same gap (ref #24695, now subsumed here).

## What this changes

- **`read_template`** returns presets with `id`, `name`, `default`,
`description`, `icon`, `parameters`, and `desired_prebuild_instances`
(when set), so the LLM can pick the right preset and prefer
prebuilt-backed ones.
- **`create_workspace`** accepts a `preset_id`. The wsbuilder applies
preset parameter defaults and may claim a prebuilt workspace.
- **`start_workspace`** does *not* accept a preset. Presets are a
creation-time choice; subsequent starts use the workspace's existing
version and parameters. Users who need a specific preset or version on
an existing chat can create the workspace out-of-band (CLI / UI / API)
with the desired configuration and attach the chat to it.
- **`toolsdk`** gains `GetTemplate` (with presets including
`desired_prebuild_instances`), preset support on `CreateWorkspace`, and
preset + `rich_parameters` support on `CreateWorkspaceBuild`. The
`template_version_preset_id` description warns about preset/version
affinity.

> 🤖 Generated with [Coder Agents](https://coder.com/agents) and reviewed
by a human.



(cherry picked from commit 04cc983833)

Co-authored-by: Max schwenk <maschwenk@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:26:47 +01:00
Cian Johnston df1bfe6479 feat: audit user secret create, update, and delete (#24756) (#24849)
Emit user secret audit log entries for create/update/delete operations.
Reads stay un-audited, matching every other resource.

Audit log entries record changes in user secret name, environment
variable name, file path, and value. The secret value column is marked
`ActionSecret` so the diff records the change without showing the
ciphertext or plaintext.

Closes a TOCTOU window on delete to ensure no phantom audit logs for a
delete of a non-existent secret. Secret update accepts a small TOCTOU
window matching the other audited resources (templates, workspaces,
chats). The two-query pattern is wrapped in a transaction so audit state
can't leak from a failed mutation.

(cherry picked from commit 1c30d52b2b)

Co-authored-by: Zach <3724288+zedkipp@users.noreply.github.com>
2026-04-30 21:01:27 +01:00
George K 9538390107 fix(coderd/healthcheck/derphealth): avoid data races in DERP report (#24795)
Fixes two data races, one introduced in #24544 and one pre-existing.

Related to: https://github.com/coder/internal/issues/1505
2026-04-28 13:06:45 -07:00
Michael Suchacz 1d8e29815e fix(coderd/x/chatd/chatdebug): restore request body after capture (#24784)
> Mux working on behalf of Mike.

Debug recording could consume request bodies when a provider SDK
returned the active body from `GetBody`, which left the upstream request
with an empty body after capture.

Reset the request body after debug capture and add coverage for shared
`GetBody` readers so debug logging does not alter the bytes sent
upstream.
2026-04-28 19:09:27 +02:00
Mathias Fredriksson 881df9a5b0 feat: reload MCP config on change via lazy stat-on-request (#24700)
The MCP manager previously read .mcp.json exactly once at agent startup.
Editing the file had no effect until workspace rebuild or agent restart.

handleListTools now stats config file mtimes on every tool-list request
and triggers a differential reload when any file changed. Unchanged
servers keep their client pointer so in-flight tool calls survive.
Concurrent reload requests coalesce via singleflight.

MCP stdio subprocesses use the agent's execer for resource limits and
receive the same enriched environment as SSH sessions via updateEnv.

On the chatd side, WorkspaceMCPTool.Run detects 404 responses from
CallMCPTool (indicating the server was removed) and drops the chat's
cached tool list so the next turn refetches from the agent.
2026-04-28 19:47:14 +03:00
George K 3f0e015fe5 fix: allow coderd to start with an empty DERP map when built-in DERP is disabled (#24544)
Allow coderd to start with an empty base DERP map when built-in DERP
is disabled and no static DERP map is configured, so DERP can come from
workspace proxies after startup.

Also add a DERP healthcheck warning when no DERP servers are currently
available at runtime.

Related to: https://linear.app/codercom/issue/PLAT-43/bug-coderd-unable-to-be-started-if-built-in-derp-server-disabled-and
Related to: https://github.com/coder/coder/issues/22324
2026-04-28 09:17:08 -07:00
Mathias Fredriksson 1926b7e658 fix(coderd/externalauth): detect rate-limit 403/429 and narrow isFailedRefresh (#24334)
ValidateToken treated all 403 responses as "token invalid," including
GitHub rate limits. isFailedRefresh included 403 in the status code
fallthrough, destroying tokens on rate-limited refresh attempts.

Split the combined 401/403 check in ValidateToken into a switch on
status code. On 403, inspect X-RateLimit-Remaining and Retry-After
headers; if either indicates a rate limit, return optimistically valid.
Handle 429 the same way. Plain 403 without rate-limit headers preserves
the existing invalid-token behavior.

Add incorrect_client_credentials and invalid_client to isFailedRefresh
error code switch. Remove 403 from the status code fallthrough since no
known provider returns 403 from the token endpoint.
2026-04-28 18:03:35 +03:00
Mathias Fredriksson 3c450899ea fix: pass agent context config explicitly instead of reading env (#24759)
The CODER_AGENT_EXP_* env vars are agent-internal options. When set
in the workspace environment they leak to MCP subprocesses and user
shells.

ReadEnvConfig() captures the values and ClearEnvVars() strips them
before the reinit loop, so config survives agent restarts. NewAPI
and ReadEnvConfig both use applyDefaults() to fill zero fields.
The chatd test passes config via agenttest.WithContextConfigFromEnv().
2026-04-28 17:58:28 +03:00
Cian Johnston 1666bff1f9 fix(coderd/x/chatd): block chain mode when provider missing tool results (#24782)
When `StopAfterTool` fires (e.g., `propose_plan`), the LLM response
containing a `function_call` is stored at OpenAI via `store=true`, but
the tool result is only persisted locally. On the next user message,
`resolveChainMode` sees the tool result in the local DB and concludes
all calls are resolved. Chain mode activates with
`previous_response_id`, but OpenAI rejects because its stored chain has
an unresolved `function_call`.

This adds a `providerMissingToolResults` check to `resolveChainMode`
that detects the `assistant(tool-call) → tool(result) → user` pattern
with no follow-up assistant message. The absence of a follow-up
assistant proves the tool results were never round-tripped to the
provider. When detected, chain mode is blocked and the system falls back
to full history replay, which includes both the tool call and its
result.

Deploying this fix un-bricks existing affected chats with no DB
migration needed.
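
One way to express the detection is sketched below. The `message` type and its fields are hypothetical, and the real `providerMissingToolResults` check may inspect more state; the sketch only encodes the `assistant(tool-call) → tool(result) → user` pattern with no later assistant message.

```go
package main

import "fmt"

type role string

const (
	roleUser      role = "user"
	roleAssistant role = "assistant"
	roleTool      role = "tool"
)

// message is a simplified, hypothetical view of a stored chat message.
type message struct {
	Role        role
	HasToolCall bool
}

// providerMissingToolResults reports whether the last assistant message made
// a tool call whose result was stored locally but never followed by another
// assistant message, meaning the result was never sent back to the provider.
func providerMissingToolResults(history []message) bool {
	last := -1
	for i, m := range history {
		if m.Role == roleAssistant {
			last = i
		}
	}
	if last < 0 || !history[last].HasToolCall {
		return false
	}
	for _, m := range history[last+1:] {
		if m.Role == roleTool {
			return true
		}
	}
	return false
}

func main() {
	stuck := []message{
		{Role: roleUser},
		{Role: roleAssistant, HasToolCall: true},
		{Role: roleTool},
		{Role: roleUser},
	}
	fmt.Println(providerMissingToolResults(stuck))
}
```

When the function reports true, the caller would skip `previous_response_id` and replay the full history instead.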

> Generated by Coder Agents.
2026-04-28 15:30:04 +01:00
david-fraley 5222db86c7 feat: add after_id pagination for chat messages (#24531) 2026-04-28 08:31:33 -05:00
Michael Suchacz 8fe11e9b14 fix: match Bedrock streaming accept headers (#24781)
> Mux is working on behalf of Mike.

## Summary
- Bump `github.com/coder/anthropic-sdk-go` to the corrected Bedrock
streaming header fix from coder/anthropic-sdk-go#14.
- Match botocore's `InvokeModelWithResponseStream` request shape by
using `X-Amzn-Bedrock-Accept` and omitting the HTTP `Accept` header.
- Update chatd regression coverage for the corrected header shape.

## Context
The previous fix set `Accept: application/vnd.amazon.eventstream`. Real
boto3/botocore streaming requests do not send that header. They send
`X-Amzn-Bedrock-Accept: application/json`, which is the modeled Bedrock
request header for the desired model response MIME type.

## Validation
- `go test ./coderd/x/chatd/chatprovider -run
'TestModelFromConfig_Bedrock(StreamingHeaders|StripsAnthropicHeaders)?$'
-count=1`
- `go mod tidy -diff`
- `git diff --check`
- pre-commit hook during `git commit`
2026-04-28 14:39:10 +02:00
Michael Suchacz dec3e98e54 fix: set Bedrock streaming accept headers (#24776)
> Mux is working on behalf of Mike.

## Summary
- Bump `github.com/coder/anthropic-sdk-go` to the clean Bedrock
streaming header fix from coder/anthropic-sdk-go#10.
- Add chatd regression coverage that verifies Bedrock streaming requests
use AWS event stream headers and include `X-Amzn-Bedrock-Accept` in the
SigV4 signed headers.

## SDK follow-up
- Reverted the bad coder/anthropic-sdk-go#8 merge with
coder/anthropic-sdk-go#9.
- Re-applied only the intended Bedrock streaming header change in
coder/anthropic-sdk-go#10.

## Validation
- `go test ./coderd/x/chatd/chatprovider -run
'TestModelFromConfig_Bedrock(StreamingHeaders|StripsAnthropicHeaders)?$'
-count=1`
- `go test ./coderd/x/chatd/chatprovider -count=1`
- `go mod tidy -diff`
- `make lint`
- pre-commit hook during `git commit`
2026-04-28 11:28:20 +00:00
Michael Suchacz 99eb46dac1 fix(coderd/x/chatd): repair Anthropic provider tool history (#24744)
## Problem

Anthropic returns HTTP 400 when an assistant message contains a
`web_search_tool_result` block whose `tool_use_id` has no matching
earlier `server_tool_use` block in the same assistant message. A
previous fix (#24706) sanitized provider-executed tool calls without
matching results, but the opposite direction, orphaned or misordered
provider-executed results, could still slip through both the prompt
sanitizer and the persistence path.

## Fix

Tighten Anthropic provider-executed tool history handling while
preserving the useful result payload as normal assistant text when the
provider-tool metadata is unsafe.

1. Extract Anthropic provider-tool sanitization into
`coderd/x/chatd/chatsanitize` so provider-specific repair logic is no
longer spread through `chatprompt` and `chatloop`.

2. `chatsanitize.SanitizeAnthropicProviderToolHistory` removes invalid
provider-executed tool structure for Anthropic prompts: orphans in
either direction, result-before-call, duplicate IDs, invalid JSON
inputs, empty IDs and tool names, unsupported tool names, mismatched
`ProviderExecuted` flags, provider-executed blocks outside assistant
messages, and web-search results without serializable Anthropic result
metadata. Provider-executed result payloads are textified instead of
being discarded when there is text to preserve.

3. `chatsanitize.SanitizeAnthropicProviderToolContent` mirrors the same
rule at the streamed step content level. Persisted history no longer
carries invalid provider-tool blocks forward, but it keeps the result
text for future turns.

4. `chatsanitize.ApplyAnthropicProviderToolGuard` only repairs
structurally invalid Anthropic provider-tool history. It no longer
strips otherwise-valid historical `web_search` blocks just because web
search is disabled for the current request. The fail-closed fallback
also textifies provider results before removing provider-tool metadata.

Tests cover prompt sanitization, validation reason strings, result
payload textification, content-level persistence sanitization, disabled
web-search history preservation, direct pre-request guard behavior, and
the fallback strip path.

> Mux is acting on Mike's behalf.
2026-04-28 12:45:23 +02:00
Cian Johnston 70d6efa311 feat: chat auto-archive owner digest notifications (#24643)
Depends on #24642

Adds per-owner digest notifications onto the chat auto-archive
subsystem.

Each tick's archived rows are grouped by owner, the top 25 titles per
owner are rendered into a new `Chats Auto-Archived` notification
template, and any remainder surfaces as `and N more`. Each digest is
per-tick, so users with large amounts of purgeable data may get multiple
notifications in sequence (one per user per tick).

The template body branches on `retention_days`: when retention is
disabled (`retention_days=0`), users are told archived chats are kept
indefinitely rather than falsely claiming imminent deletion.
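
The per-owner grouping with a 25-title cap can be sketched as follows. The types and the `buildDigests` name are hypothetical, and the sketch assumes "top 25" means the first 25 rows in input order, which the commit does not specify.

```go
package main

import "fmt"

const maxTitles = 25

// archivedChat and digest are hypothetical shapes for illustration.
type archivedChat struct {
	OwnerID string
	Title   string
}

type digest struct {
	Titles    []string // at most maxTitles entries
	Remainder int      // rendered as "and N more" when non-zero
}

// buildDigests groups one tick's archived rows by owner, keeping the first
// maxTitles titles per owner and counting the rest.
func buildDigests(rows []archivedChat) map[string]digest {
	out := make(map[string]digest)
	for _, r := range rows {
		d := out[r.OwnerID]
		if len(d.Titles) < maxTitles {
			d.Titles = append(d.Titles, r.Title)
		} else {
			d.Remainder++
		}
		out[r.OwnerID] = d
	}
	return out
}

func main() {
	var rows []archivedChat
	for i := 0; i < 27; i++ {
		rows = append(rows, archivedChat{OwnerID: "alice", Title: fmt.Sprintf("chat %d", i)})
	}
	rows = append(rows, archivedChat{OwnerID: "bob", Title: "only chat"})

	for owner, d := range buildDigests(rows) {
		fmt.Println(owner, len(d.Titles), d.Remainder)
	}
}
```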

### Changes
- migration `000XXX_chat_auto_archive_notification_template` adds new
notification template
- `dbpurge`: threads `notifications.Enqueuer` through `New` and
enqueues the notification message.
- `cli/server.go`: passes `options.NotificationsEnqueuer` into
`dbpurge.New`.
- `coderd/notifications/events.go`: new `TemplateChatAutoArchiveDigest`
UUID.
- `coderd/inboxnotifications.go`: inbox registration.
- Docs: adds a `Notifications` section to `chat-auto-archive.md`.

> 🤖
2026-04-28 08:56:36 +01:00
Faur Ioan-Aurel a8e7f329ac fix: redirect OAuth2 authorization page to dashboard (#24499)
Currently, when a user clicks either the Cancel or Allow button on the
authorization page, the client app URI is executed but the page does not
return to the main dashboard page, leaving the two buttons available for
repeated clicks. Aside from the problems that invoking the callback URI
multiple times might cause, this is also poor UX because users usually
expect the authorization tab to return to the dashboard.

The consent page now executes the OAuth2 callback (auth code on Allow,
`access_denied` on Cancel), hides the two buttons, and updates the
existing description with an instruction to close the window.
The initial implementation relied on a pop-up window executing the
callback while the main window was redirected to the dashboard main page.
- resolves https://github.com/coder/coder/issues/20323

2026-04-27 23:26:17 +03:00
Zach 79735f2d45 feat: plumb user secrets through provisioner chain to terraform (#24542)
This change passes user secrets from coderd to the Terraform process at
workspace build time so the `data.coder_secret` data source in
terraform-provider-coder can resolve values at plan time.

Secrets traverse two proto hops: `provisionerdserver` fetches them
via `ListUserSecretsWithValues` and attaches them to
`AcquiredJob.WorkspaceBuild.user_secrets` on `provisionerd.proto`;
`runner.go` forwards into `PlanRequest.user_secrets` on
`provisioner.proto`; the Terraform provisioner encodes each as
`CODER_SECRET_ENV_<name>` or `CODER_SECRET_FILE_<hex(path)>` before
invoking `terraform plan`. Only plan requests carry secrets; apply runs
with `nil` because values are baked into plan state.

Fetch is gated on a workspace transitioning to start. Stop and delete
transitions never carry secrets, so revoking or deleting a stored secret
cannot make a workspace unstoppable. DB errors on the fetch fail the job
outright rather than silently continuing with an empty secret set.

Note that user secrets will be stored in the workspace_builds table in
provisioner_state with other Terraform state (including other sensitive data).
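
The naming scheme for the encoded secrets can be sketched as below. The `userSecret` type and `secretEnv` function are hypothetical, and the sketch assumes lowercase hex of the raw path bytes and that a secret with both an env name and a file path yields both entries; the provisioner's actual encoding may differ.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// userSecret is a hypothetical shape: a secret may be exposed as an
// environment variable, a file, or both.
type userSecret struct {
	EnvName  string
	FilePath string
	Value    string
}

// secretEnv renders secrets as KEY=VALUE entries for the terraform process.
func secretEnv(secrets []userSecret) []string {
	var env []string
	for _, s := range secrets {
		if s.EnvName != "" {
			env = append(env, "CODER_SECRET_ENV_"+s.EnvName+"="+s.Value)
		}
		if s.FilePath != "" {
			// Hex keeps arbitrary path characters out of the variable name.
			key := "CODER_SECRET_FILE_" + hex.EncodeToString([]byte(s.FilePath))
			env = append(env, key+"="+s.Value)
		}
	}
	return env
}

func main() {
	for _, kv := range secretEnv([]userSecret{
		{EnvName: "GITHUB_TOKEN", Value: "example"},
		{FilePath: "/a", Value: "example"},
	}) {
		fmt.Println(kv)
	}
}
```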
2026-04-27 08:26:07 -06:00
Cian Johnston 2f26903af9 feat: add admin UI control for chat auto-archive days (#24704)
Relates to #24642 

Adds admin UI controls for managing chat auto-archive (days) under
"Lifecycle".
Also adds a "Days" label to the right of the pre-existing unitless
numeric input for consistency.

Exemplary screenshot below. More screens available in Storybook.

<img width="847" height="585" alt="Screenshot 2026-04-24 at 16 48 59"
src="https://github.com/user-attachments/assets/d38de5f8-d379-4b06-b175-ac399f31e578"
/>
2026-04-27 09:54:22 +01:00
Kyle Carberry 069223ae26 fix: recover web push subscriptions after PWA reinstall (#24720) 2026-04-26 14:49:10 -07:00
Michael Suchacz 99a83a2702 fix: clean Bedrock headers (#24718)
Bedrock chat provider requests can inherit Anthropic public API headers
from the process environment, which causes mixed Anthropic and Bedrock
auth headers on signed requests.

Update the Anthropic SDK fork so its Bedrock middleware strips
Anthropic-only headers before signing requests, and keep a chatprovider
regression test for the production request shape.

> Mux is acting on Mike's behalf.
2026-04-26 21:50:29 +02:00
Michael Suchacz 62e9752acd fix: prevent malformed OpenAI Responses continuations (#24725)
> Worked on by Mux on Mike's behalf.

## Summary

- Disable OpenAI Responses `previous_response_id` chain mode when the
prior assistant response has unresolved local tool calls, so the next
request can include paired tool outputs instead of sending an incomplete
continuation.
- Update the fantasy pin to a Responses replay fix that preserves stored
reasoning references, only replays web search references when paired
with reasoning, and validates local function-call output pairing before
send.
- Add fake OpenAI Responses input validation for the two production 400
shapes and integration coverage for full-history reasoning plus web
search replay.
- Add sanitized diagnostics for the OpenAI Responses continuity errors.

## Tests

- `go test ./providers/openai -run
'TestResponsesToPrompt_(ReasoningWithStore|ReasoningWithWebSearchCombined|WebSearchRequiresReasoningReference|ReasoningWithFunctionCallCombined|WebSearchProviderExecutedToolResults)|TestPrepareParams_(SkipsProviderExecutedToolReferences|ValidatesFunctionCallOutputPairing)|TestValidateResponsesInput_WebSearchReferenceRequiresReasoning'
-count=1`
- `go test ./providers/openai -count=1`
- `GOWORK=off go test ./coderd/x/chatd/chattest -run
TestValidateResponsesAPIInput -count=1`
- `GOWORK=off go test ./coderd/x/chatd -run
'TestOpenAIResponses(NoStaleWebSearchReplay|FullReplayPairsReasoningAndWebSearch|ChainModeSkipsWhenLocalCallPending|ChainModeStillFiresForProviderExecutedOnly)$|TestResolveChainMode_'
-count=1`
- `GOWORK=off go test ./coderd/x/chatd/chatprompt -run
'TestInjectMissingToolResults_' -count=1`
- `GOWORK=off go test ./coderd/x/chatd/chaterror -run
TestClassify_OpenAIResponsesAPIDiagnostics -count=1`
- `GOWORK=off go test ./coderd/x/chatd/... -count=1`
- `git diff --check`
- `git commit` pre-commit hook
2026-04-26 21:23:06 +02:00
Michael Suchacz ed33e28b13 fix(coderd/x/chatd): wake after auto-promoting queued message (#24714)
`tryAutoPromoteQueuedMessage` in `processChat`'s deferred cleanup could
set a chat back to `pending` without waking the processor. The processor
only noticed on the next 10ms poll, so under load, tests like
`TestAutoPromoteQueuedMessageFallsBackForInvalidQueuedModelConfigID`
could time out waiting for the second streaming request (#1500).

Call `p.signalWake()` after the promoted-message publishes when
`promotedMessage != nil`, matching the pattern used by `CreateChat`,
`SendMessage`, `EditMessage`, `PromoteQueued`, and `InterruptChat`. Make
the regression helper `testAutoPromoteQueuedMessageFallback`
deterministic by setting `PendingChatAcquireInterval = time.Hour` and
synchronizing on a `secondRunStarted` channel instead of polling
`requestCount`, so the test fails without the wake instead of relying on
the 10ms ticker.
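
The commit does not show how `signalWake` is implemented. A common way to build such a wake signal in Go, offered here purely as an assumption, is a non-blocking send on a one-slot buffered channel, so repeated signals coalesce and the caller never blocks:

```go
package main

import "fmt"

// processor is a hypothetical stand-in for the chat processor.
type processor struct {
	wake chan struct{}
}

func newProcessor() *processor {
	return &processor{wake: make(chan struct{}, 1)}
}

// signalWake requests an immediate processing pass. If a wake-up is already
// pending, the send is skipped instead of blocking.
func (p *processor) signalWake() {
	select {
	case p.wake <- struct{}{}:
	default:
	}
}

func main() {
	p := newProcessor()
	p.signalWake()
	p.signalWake() // coalesces with the pending signal
	fmt.Println(len(p.wake))
}
```

A processing loop would then select on both the wake channel and its poll ticker, so the ticker becomes a fallback and not the only trigger.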

Closes https://github.com/coder/internal/issues/1500

> Mux is acting on Mike's behalf.
2026-04-26 11:08:32 +02:00
Michael Suchacz 0211448d09 fix(coderd): sanitize Anthropic provider tool history (#24706)
Anthropic can reject replayed chat histories when a provider-executed
tool call, such as `web_search`, is present without its matching
provider result block.

This sanitizes unpaired Anthropic provider-executed tool calls during
prompt reconstruction, before Anthropic requests, and before persistence
so existing poisoned histories can continue and new malformed turns are
not stored.

Resolves: CODAGT-259

> Mux is acting on Mike's behalf.
2026-04-24 23:57:30 +02:00
Cian Johnston 0ccfd575d0 fix(coderd/database/migrations): rename duplicate migration 477 (#24707) 2026-04-24 14:49:11 +00:00
Michael Suchacz c7cac9debe fix: persist per-turn model on chats and queued messages (#24688)
Previously, `chats.last_model_config_id` was not updated when a user
sent a mid-chat message with a different model, and queued messages did
not store their own per-turn model, so promotion ran against whatever
the chat row said at promote time. Chat watch events also did not merge
`last_model_config_id` into the site's root, child, and per-chat
caches, so sidebar labels stayed stale after direct sends and queued
promotions.

- Add nullable `chat_queued_messages.model_config_id`, backfilled from
  `chats.last_model_config_id`. Queued inserts round-trip the effective
  model id at enqueue time.
- In `coderd/x/chatd`, direct sends update `chats.last_model_config_id`
  inside the same transaction that inserts the admitted user message.
  Manual promotion and auto-promotion use the queued row's stored
  `model_config_id`, with a fallback to `chats.last_model_config_id`
  for legacy NULL rows during rollout.
  `PromoteQueuedOptions.ModelConfigID` is now ignored.
- On the site, extract `mergeWatchedChatSummary` and
  `mergeWatchedChatIntoCaches` in `site/src/api/queries/chats.ts` so
  status-change watch events merge `last_model_config_id` into the
  root infinite chat list, the parent-embedded child entry, and the
  per-chat `chatKey(chatId)` cache. `updated_at` guards against stale
  watch payloads clobbering newer cached state, while diff status
  events still merge their PR metadata because they are timestamped
  outside the chat row. Watch timestamps are compared as instants so
  variable fractional precision does not make fresh events look stale.
- Queued promotion validates stored model config IDs before admission.
  Invalid legacy queued IDs fall back to the chat's current model config
  instead of dropping the queued message during auto-promotion.
- Backend and frontend regression coverage added for admission, queue
  promotion (including FIFO across mixed models, legacy NULL fallback,
  and invalid queued model IDs), and chat watch cache merging.
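
The point about comparing watch timestamps as instants can be illustrated as follows. The site code itself is TypeScript; this Go sketch, with a hypothetical `isNewer` helper and made-up timestamps, only shows why comparing the raw strings goes wrong when fractional precision varies.

```go
package main

import (
	"fmt"
	"time"
)

// isNewer reports whether the incoming timestamp is strictly later than the
// cached one, comparing parsed instants and not the raw strings.
func isNewer(incoming, cached string) (bool, error) {
	in, err := time.Parse(time.RFC3339Nano, incoming)
	if err != nil {
		return false, err
	}
	c, err := time.Parse(time.RFC3339Nano, cached)
	if err != nil {
		return false, err
	}
	return in.After(c), nil
}

func main() {
	cached := "2026-04-24T13:36:08Z"
	incoming := "2026-04-24T13:36:08.5Z" // half a second later

	newer, err := isNewer(incoming, cached)
	if err != nil {
		panic(err)
	}
	// The string comparison disagrees because '.' sorts before 'Z'.
	fmt.Println("as instants:", newer, "as strings:", incoming > cached)
}
```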

> Mux is acting on Mike's behalf.
2026-04-24 15:36:08 +02:00
Cian Johnston a876287d36 feat: auto-archive inactive chats with audit trail (#24642)
Adds a background job in `dbpurge` that periodically archives chats
inactive beyond a configurable threshold. Each archived root chat gets a
background audit entry tagged `chat_auto_archive`. Disabled by default.

* New `AutoArchiveInactiveChats` SQL query with LATERAL last-activity
subquery and partial index on archive candidates
* `site_configs`-backed `auto_archive_days` setting with admin-only PUT,
any-authenticated-user GET
* Cascade archive via `root_chat_id`; pinned chats and active threads
exempt
* Root-only audit dispatch on detached context, matching manual archive
(`patchChat`) behavior
* 11 subtests covering disabled no-op, boundary, deleted messages, child
activity, pinned exemption, multi-owner, idempotency, and batch
pagination

PR #24643 adds per-owner digest notifications.
PR #24704 adds the requisite UI controls.

> 🤖
2026-04-24 14:18:28 +01:00
Danielle Maywood 3a9a60dff8 feat: add collapsible thinking blocks with configurable display mode (#24635) 2026-04-24 11:29:08 +00:00
Michael Suchacz 3d90546aae feat: add general subagent model override (#24610)
Adds a deployment-wide admin override for general delegated subagents.

## What changed
- store the general override in `site_configs` and expose it through the
shared `agent-model-override/{context}` API
- apply the general override when spawning delegated general subagents,
while preserving the existing Explore override behavior
- reuse a shared Agents settings form for the general and Explore
override sections

## Validation
- `make gen`
- `go test ./coderd -run 'TestChatModelOverrides'`
- `go test ./coderd/x/chatd -run
'TestSpawnAgent_(GeneralUsesConfiguredModelOverride|GeneralOverrideLogsAndFallsBackWhenCredentialsUnavailable|GeneralOverrideLogsAndFallsBackWhenProviderDisabled)'`
- `pnpm -C site lint:types`
- `pnpm -C site test:storybook --
AgentSettingsAgentsPageView.stories.tsx`
- `make lint`
- `make pre-commit`

> Mux is acting on Mike's behalf.
2026-04-24 12:37:20 +02:00
Cian Johnston a02339c66a fix(coderd/x/chatd): prevent invalid tool results from poisoning chat history (#24663)
- **computeruse.go**: Decode base64 screenshot data before storing in
`ToolResponse.Data` (was casting base64 string to bytes without
decoding)
- **chatloop.go**: Re-encode `ToolResponse.Data` to base64 via
`base64.StdEncoding.EncodeToString` instead of `string()` cast
- **mcpclient.go**: UTF-8 validate all text from MCP responses in
`convertCallResult()` using `strings.ToValidUTF8`
- **chatprompt.go (persist)**: Defense-in-depth UTF-8 sanitization of
text and media Text fields before database storage
- **chatprompt.go (replay)**: Antivenom layer that validates base64 and
UTF-8 at read time, auto-healing already-poisoned chats without
requiring a migration
- Adds `TestToolResultAntivenom`: 4 subtests covering poisoned text, poisoned
media, valid media round-trip, and media with invalid UTF-8 text
- Adds `TestConvertCallResult_UTF8Sanitization`: 4 subtests covering invalid
UTF-8 in TextContent, EmbeddedResource, valid passthrough, and
multi-part
- Adds `TestComputerUseTool_Run_ScreenshotDataIsDecodedBinary`: verifies no
double-encode in the computer-use path
- Updates existing computer-use tests for the new decoded-binary
contract

> 🤖
2026-04-23 19:58:38 +01:00
Cian Johnston c602a31856 fix(coderd): reject pinning child chats in patchChat handler (#24669)
The UI already prevents child (delegated/subagent) chats from being
pinned, but the `PATCH /api/experimental/chats/{chat}` endpoint did not
enforce this. A direct API call could pin a child chat.

- Add a `400 Bad Request` guard in `patchChat` when `pinOrder > 0` and
the chat has a `ParentChatID`
- Add `TestChatPinOrder/RejectsChildChat` test

> 🤖
2026-04-23 18:36:20 +01:00
Michael Suchacz dbcc654d28 feat: snapshot explore subagent tool entitlements (#24638)
Explore sub-agents previously could not use `web_search` or external MCP
tools. `runChat` hard-skipped both for Explore. Lifting those guards
naively would over-grant tools, because a child chat could outlive the
spawning turn's plan-mode filter.

This change persists the spawning parent turn's filtered external MCP
server IDs onto the child Explore chat, and simplifies the Explore
provider-tool filter in `runChat`:

- New `resolveExploreToolSnapshot` helper: computes the child's
inherited external MCP subset by running the parent's configs through
`filterExternalMCPConfigsForTurn` (plan-mode policy) and, if the parent
is itself an Explore child, further narrowing to the parent's own
persisted `MCPServerIDs`. The result is written to the child's
`MCPServerIDs` column at spawn time.
- The existing `mcp_server_ids` column is the sole durable snapshot. No
new chat column is added.
- `runChat` for Explore children: loads MCP tools from the persisted
snapshot, and keeps only `web_search` from provider-native tools (to
block computer-use and other write-style tools, since Explore is
read-only). Whether `web_search` is actually available is a per-model
decision, determined by the current model config, just like a main chat.
- Built-in Explore allowlist is unchanged. Workspace-local MCP remains
excluded for Explore.

Verification: `go build ./...`, `go test ./coderd/x/chatd/... -count=1`,
`make gen` (clean tree), `make lint/emdash`, `go vet`. Deep-review ran
12 reviewers on the feature and 5 on the clarity refactor; CAR reviewed
and approved; a subsequent scope reduction dropped a temporary
`allow_web_search` column in favor of per-model handling.

> Mux is acting on Mike's behalf.
2026-04-23 19:07:38 +02:00
Cian Johnston b5a625549e feat: migrate agents-access to org-scoped system role for proper chat RBAC (#24438)
The agents-access role previously granted chat permissions at user
scope, but chats are org-scoped objects. Rego skips user-level perms
when org_owner is set, making the grants invisible. Handler-level
band-aids used synthetic non-org-scoped objects as a workaround.

  - Migrates agents-access from users.rbac_roles (site-level) to
    organization_members.roles (org-scoped) via DB migration
  - Redefines agents-access as a predefined org-scoped builtin role
    alongside organization-admin, organization-auditor, etc., with
    Member permissions granting chat create/read/update
  - Excludes ResourceChat from OrgMemberPermissions so org membership
    alone no longer grants chat access
  - Fixes handler Authorize checks to use org-scoped objects with
    semantically correct actions (ActionUpdate for message/tool operations)
  - Grants org admins the ability to assign agents-access

Closes #24250
Fixes CODAGT-174

Note: this does not update the "Usage" endpoints. Tracked by CODAGT-161.
> 🤖
2026-04-23 17:59:42 +01:00
Mathias Fredriksson f8fe5d680b fix(coderd): reject API operations on archived chats (#24633)
Archived chats accept mutations (messages, edits, queued-message
promotions, tool-result submissions) via the API, causing them to
re-enter the processing pipeline. This violates the hard-stop
design intent from PR #23758.

Add archived checks at three layers:

- HTTP handlers (postChatMessages, patchChatMessage,
  promoteChatQueuedMessage, postChatToolResults): return 400
  after auth so callers get a clear error.
- Daemon functions (SendMessage, EditMessage, PromoteQueued,
  SubmitToolResults): return ErrChatArchived after row lock,
  guarding against future callers that bypass the handler.
- AcquireChats SQL: filter out archived chats so they are never
  acquired for processing.

Fixes CODAGT-245
2026-04-23 19:03:33 +03:00
Danny Kopping a8613b2209 chore: deprecate /api/v2/aibridge/interceptions endpoint (#24670)
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6*

Marks the `GET /api/v2/aibridge/interceptions` endpoint as deprecated in
favor of `/aibridge/sessions`, which provides richer session-level
aggregation including threads and agentic actions.

Changes:
- Add `@Deprecated` Swagger annotation to the endpoint handler
- Add deprecation notice to the
`codersdk.Client.AIBridgeListInterceptions` method
- Regenerated OpenAPI spec with `"deprecated": true` flag

The endpoint remains fully functional.

Fixes https://github.com/coder/internal/issues/1339
2026-04-23 15:33:40 +02:00
Cian Johnston 2e5c7d99c2 fix(coderd/x/chatd): fix flaky TestSpawnComputerUseAgentInheritsContext (#24666)
Fixes flaky `TestSpawnComputerUseAgentInheritsContext`.

- The test inserts an Anthropic provider directly into the DB after
`CreateChat` has already been called
- The server's background goroutine may have already cached the provider
list (OpenAI only) via `configCache.EnabledProviders()` with a 10s TTL
- The direct DB insert bypasses the pubsub event that production uses to
invalidate the cache
- `isAnthropicConfigured()` returns the stale cached result, making
`computer_use` appear unavailable
- Fix: call `server.configCache.InvalidateProviders()` after the insert,
mirroring what production does via pubsub

CI failure:
https://github.com/coder/coder/actions/runs/24829197096/job/72673070101?pr=24648

> 🤖
2026-04-23 13:18:18 +01:00
Jake Howell 4caa52844d chore!: remove api.ts unnecessary calls (#22168)
> [!WARNING]  
> The change of the status code from `404` to `204` could break people's
> code downstream. Adding this as a breaking change just in case.

There's a whole ton of noise around failed requests; these are all
unrelated to the actual thing that is broken at hand (and are
confusing).

* Change `/api/v2/organizations/.../templates/.../versions/.../previous`
to return `204` instead of `404` (which makes more sense because the
content doesn't exist, but the route is found).
* Remove unnecessary calls to `/api/v2/users/me/appearance` when the
user isn't logged in.
* Remove unnecessary calls to `/api/v2/deployment/stats` when the
deployment stats aren't allowed to be seen.
* Various changes to `workspace-sharing` so we don't make unnecessary
calls.

What's left:

* `/api/v2/users/me` still `401`s on the login page. This persists
because a logged-in user who tries to reach the sign-in page should be
redirected to the app, not asked to sign in again.
* `monaco-editor` is still upset... we theoretically could inject an
environment that can serve workers... but eh.

#### Old

```sh
% pnpm playwright:test -g "create workspace with default and required parameters"

> coder-v2@ playwright:test /home/coder/coder/site
> playwright test --config=e2e/playwright.config.ts -g 'create workspace with default and required parameters'

...

Running 2 tests using 1 worker

  ✓  1 …e/setup/addUsersAndLicense.spec.ts:7:5 › setup deployment (8.2s)
     2 ….ts:79:5 › create workspace with default and required parameters
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[console][error] Failed to load resource: the server responded with a status of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
[console][error] Failed to load resource: the server responded with a status of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
[console][error] Failed to load resource: the server responded with a status of 404 (Not Found)
[response] url=http://localhost:3111/api/v2/organizations//provisionerdaemons status=404 body={"message":"Resource not found or you do not have access to this resource"}
[console][error] Failed to load resource: the server responded with a status of 404 (Not Found)
[response] url=http://localhost:3111/api/v2/organizations/default/templates/a4e8096d/versions/agreeable_glenn33/previous status=404 body={"message":"No previous template version found for \"agreeable_glenn33\"."}
[console][warning] Could not create web worker(s). Falling back to loading web worker code in main thread, which might cause UI freezes. Please see https://github.com/microsoft/monaco-editor#faq
[console][warning] You must define a function MonacoEnvironment.getWorkerUrl or MonacoEnvironment.getWorker
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[console][error] Failed to load resource: the server responded with a status of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
  ✓  2 …5 › create workspace with default and required parameters (7.0s)atus of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
[console][error] Failed to load resource: the server responded with a status of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}

  2 passed (56.1s)
```

`23 LOL` (Lines of logs)

#### New

```sh
% pnpm playwright:test -g "create workspace with default and required parameters"

> coder-v2@ playwright:test /home/coder/coder/site
> playwright test --config=e2e/playwright.config.ts -g 'create workspace with default and required parameters'

...

Running 2 tests using 1 worker

  ✓  1 …e/setup/addUsersAndLicense.spec.ts:7:5 › setup deployment (8.7s)
     2 ….ts:79:5 › create workspace with default and required parameters
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[console][warning] Could not create web worker(s). Falling back to loading web worker code in main thread, which might cause UI freezes. Please see https://github.com/microsoft/monaco-editor#faq
[console][warning] You must define a function MonacoEnvironment.getWorkerUrl or MonacoEnvironment.getWorker
  ✓  2 …5 › create workspace with default and required parameters (7.1s)atus of 401 (Unauthorized)
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}

  2 passed (32.0s)
```

`9 LOL` (Lines of logs)
2026-04-23 06:20:35 +10:00
Cian Johnston be1256c418 fix(coderd): fix TestListChats/PinnedOnFirstPage race timeout (#24641)
- Insert filler chats directly into the database with `completed` status
instead of creating them via the API
- Removes the `testutil.Eventually` polling loop that waited for all 52
chats to reach terminal status
- Avoids spawning 52 background chat processors that each time out on
title generation under `-race`, exceeding the 25s `WaitLong` timeout
- Test now completes in ~1s instead of timing out at 30s+

Flake:
https://github.com/coder/coder/actions/runs/24789695935/job/72543519963?pr=24438

> 🤖
2026-04-22 20:37:06 +01:00
Mathias Fredriksson 1ace519c6e fix(coderd/x/chatd): remove cache-miss check blocking agent recovery (#24634)
The cache-miss isAgentUnreachable check added in #24336 runs before
dialWithLazyValidation, preventing the existing switch mechanism from
discovering the new agent after a workspace rebuild. The chat's stale
agent binding is never repaired, causing an infinite loop of
'agent is disconnected' errors.

Remove the cache-miss check. The cache-hit check remains (it verifies
the agent behind an established connection). The dial timeout and
dialWithLazyValidation already bound the cache-miss failure path.

Closes CODAGT-248
2026-04-22 21:49:10 +03:00
Cian Johnston 72e3ae9c5f feat: add chatd tool call error metrics and logging (#24559)
- Add `coderd_chatd_tool_errors_total` prometheus counter (labels:
provider, model, tool_name)
- Log tool call errors at warn level with correlation fields: chat_id,
owner_id, organization_id, workspace_id, agent_id, parent_chat_id,
trigger_message_id, tool_name, tool_call_id, provider, model
- Thread enriched logger from chatd.go into chatloop via
`RunOptions.Logger`
- Remove squashing of all MCP tool calls to the `mcp` bucket

> 🤖
2026-04-22 16:19:56 +00:00
Michael Suchacz 7904bed947 fix: fall back to local git watcher for chat diff drawer (#24512)
The Ctrl+D diff drawer in `coder exp agents` only rendered PR-backed
diffs returned by `/api/experimental/chats/{id}/diff`. Local working
tree changes in a chat's workspace returned an empty diff, so the
drawer showed "No diff contents" with no file summary.

Centralise diff loading behind a single `fetchChatDiffContents` helper
that first hits `/diff`, then falls back to the chat git watcher
WebSocket (`/stream/git`) when the remote diff is empty. Aggregate the
agent's `WorkspaceAgentRepoChanges` into a `ChatDiffContents` value so
the drawer can derive the file summary and styled body from the local
unified diff. Missing workspaces, missing agents, and watcher timeouts
are treated as graceful fallbacks that render the empty-diff
placeholder instead of a hard error.

> Mux is opening this PR on Mike's behalf.
2026-04-22 18:08:02 +02:00
Jeremy Ruppel c23abc691f feat: sort AI sessions by last prompt time (#24440)
Previously, the sessions list sorted by `MIN(started_at)` across
interceptions, so sessions with old start times but recent activity
would sink to the bottom of the list regardless of how recently they
were used.

`ListAIBridgeSessions` now sorts by `COALESCE(MAX(prompt.created_at),
MIN(started_at)) DESC`, exposed as the non-nullable `last_active_at`
field. Sessions with prompts surface by last activity; sessions with no
prompts fall back to their start time.

The original implementation used two separate columns (`last_active_at`
as a nullable prompt timestamp and `sort_at` as the non-nullable cursor
key). This revision collapses them into a single `last_active_at` that
is always set — simplifying the SQL, the Go conversion, the API type,
and the frontend.

🤖 Generated with [Claude Code](https://claude.ai/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 12:06:49 -04:00
Marcin Tojek ec91ac5427 fix: grant AsAIBridged ResourceSystem.ActionCreate for UpsertAISeatState (#24603)
Related coder/internal#1444
2026-04-22 16:38:57 +02:00
Michael Suchacz 9b5d09ebdc test(coderd/x/chatd): seed anthropic provider for computer_use tests (#24611)
`TestSubagentLifecycleToolsIncludePersistedSubagentTypeAcrossVariants/ComputerUse`
and two adjacent positive tests passed a static Anthropic key into
`newInternalTestServer`, but `seedInternalChatDeps` only inserts an
OpenAI provider. At runtime, `Server.resolveUserProviderAPIKeys` calls
`chatprovider.PruneDisabledProviderKeys`, which clears `keys.Anthropic`
because Anthropic is not in the enabled DB provider set, so the
`computer_use` execution path loses its key.

Add a focused test helper `seedEnabledAnthropicProvider` and use it only
in the positive tests that actually drive a `computer_use` spawn through
the runtime key-resolution path (the `computer_use` branch of
`TestSubagentLifecycleToolsIncludePersistedSubagentTypeAcrossVariants`,
`TestSpawnAgent_ComputerUseUsesComputerUseModelNotParent`, and
`TestSpawnAgent_ComputerUseInheritsMCPServerIDs`).
`seedInternalChatDeps` stays unchanged, so the negative availability
tests continue to model the "Anthropic unavailable" fixture. No
production code is modified.

Closes https://github.com/coder/internal/issues/1486

> This PR was opened by Mux working on Mike's behalf.
2026-04-22 15:54:17 +02:00
Thomas Kosiewski b7c2c59931 fix(coderd/x/chatd/chatdebug): allow Anthropic per-modality ratelimit headers (#24592)
Previously, Anthropic's per-modality, Priority Tier, and fast-mode rate-limit headers (`Anthropic-Ratelimit-Input-Tokens-*`, `Anthropic-Ratelimit-Output-Tokens-*`, `Anthropic-Priority-Input-Tokens-*`, `Anthropic-Priority-Output-Tokens-*`, `Anthropic-Fast-Input-Tokens-*`, and `Anthropic-Fast-Output-Tokens-*`) were shown as `[REDACTED]` in the Debug panel because they contain `"token"` in the name and fell through the generic credential filter.

Add them to the allowlist in `coderd/x/chatd/chatdebug/redaction.go` alongside the existing `Anthropic-Ratelimit-Tokens-*` entries so the limits/remaining/reset values surface in the raw response view.
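
A rough sketch of how such an allowlist can sit in front of a generic "contains token" filter is shown below. The header-name prefixes are taken from the description above; the prefix-matching strategy, the `redactHeader` function, and the variable names are assumptions and may differ from `redaction.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedPrefixes lists lower-cased header-name prefixes that are safe to
// display even though they contain "token". Matching by prefix is an
// assumption made for this sketch.
var allowedPrefixes = []string{
	"anthropic-ratelimit-tokens-",
	"anthropic-ratelimit-input-tokens-",
	"anthropic-ratelimit-output-tokens-",
	"anthropic-priority-input-tokens-",
	"anthropic-priority-output-tokens-",
	"anthropic-fast-input-tokens-",
	"anthropic-fast-output-tokens-",
}

// redactHeader returns the value to display for a response header.
func redactHeader(name, value string) string {
	lower := strings.ToLower(name)
	for _, p := range allowedPrefixes {
		if strings.HasPrefix(lower, p) {
			return value // known rate-limit counter, not a credential
		}
	}
	if strings.Contains(lower, "token") {
		return "[REDACTED]" // generic credential filter
	}
	return value
}

func main() {
	fmt.Println(redactHeader("Anthropic-Ratelimit-Input-Tokens-Remaining", "4000"))
	fmt.Println(redactHeader("X-Session-Token", "secret"))
	fmt.Println(redactHeader("Content-Type", "application/json"))
}
```

Checking the allowlist before the generic filter is what keeps the rate-limit counters visible without loosening the rule for real credentials.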
2026-04-22 15:14:31 +02:00
Thomas Kosiewski 26b64fa523 fix(coderd/x/chatd/chatdebug): record SSE attempts on EOF (#24565)
`chat_turn` debug steps persist with `attempts: []` even when the
streaming call to Anthropic completes successfully. Fantasy's
Anthropic SSE adapter iterates the response to EOF via
`for stream.Next()` and abandons the body without calling `Close()`,
so `RecordingTransport`'s Close-only recording path never fires and
the attempt is lost. Non-streaming runs (`quickgen`,
`title_generation`) go through `model.Generate(...)` and are
unaffected.

Record on `io.EOF` for `text/event-stream` bodies specifically.
Non-SSE responses stay on the Close-only path so JSON integrity,
content-length validation, and inner-`Close()` error semantics are
preserved. `record()` is already `sync.Once`-guarded, so a later
`Close()` is a no-op for recording.
2026-04-22 15:02:02 +02:00
Michael Suchacz 9634739aed fix: support Bedrock ambient AWS credentials for Agents providers (#24397)
> This PR was authored by Mux on behalf of Mike.

Adds AWS Bedrock ambient credential support to the Agents provider path.
Bedrock providers can now be saved without a stored API key and
authenticated via the standard AWS SDK credential chain on the Coder
server (IAM roles, `AWS_ACCESS_KEY_ID`, etc.). Also fixes missing `Base
URL` forwarding for Bedrock.

## Changes

**Backend runtime** (`coderd/x/chatd/chatprovider/chatprovider.go`):
- New `ProviderAllowsAmbientCredentials(provider)` helper. Currently
returns true only for Bedrock.
- `ModelFromConfig` no longer errors on an empty API key when the
provider is in the ambient-allowed set AND was explicitly resolved via
`ByProvider`. This preserves the policy gate: unresolvable providers
(disabled central key, user-key-required without a user key) still
error.
- `setResolvedProviderAPIKey` internalizes the ambient-credentials
contract via `ProviderAllowsAmbientCredentials`, so a
resolved-but-keyless Bedrock provider is represented as an empty
`ByProvider` entry rather than a post-hoc sentinel patch in the caller.
- `WithAPIKey` is only appended when a token is present.
- `WithBaseURL(baseURL)` is now forwarded for Bedrock (was previously
missing).
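
As an illustration of the policy gate described above, the following sketch models client options as plain strings. `ProviderAllowsAmbientCredentials` is named in the description; the `"bedrock"` identifier, the `clientOptions` function, the error messages, and the choice to skip an empty base URL are all assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ProviderAllowsAmbientCredentials follows the helper named above. The
// provider identifier "bedrock" is an assumption.
func ProviderAllowsAmbientCredentials(provider string) bool {
	return provider == "bedrock"
}

// clientOptions is a hypothetical stand-in for the option-building part of
// ModelFromConfig. byProvider holds only providers that passed resolution.
func clientOptions(provider string, byProvider map[string]string, baseURL string) ([]string, error) {
	key, resolved := byProvider[provider]
	if !resolved {
		// Unresolvable providers still error, even for Bedrock.
		return nil, errors.New("provider not resolved")
	}
	if key == "" && !ProviderAllowsAmbientCredentials(provider) {
		return nil, errors.New("API key required")
	}
	var opts []string
	if key != "" {
		opts = append(opts, "WithAPIKey")
	}
	if baseURL != "" {
		opts = append(opts, "WithBaseURL")
	}
	return opts, nil
}

func main() {
	resolved := map[string]string{"bedrock": "", "openai": ""}

	fmt.Println(clientOptions("bedrock", resolved, "https://example.invalid"))
	fmt.Println(clientOptions("openai", resolved, ""))
	fmt.Println(clientOptions("bedrock", map[string]string{}, ""))
}
```

The empty map entry is what separates "resolved but keyless" from "not resolved at all", which is how the gate stays intact for disabled or user-key-required providers.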

**Backend admin API** (`coderd/exp_chats.go`):
- `validateChatProviderCentralAPIKey` exempts Bedrock from requiring a
stored API key when central credentials are enabled.
- AI Gateway separation (`ChatProviderAPIKeysFromDeploymentValues`) is
unchanged. No silent reuse of `CODER_AIBRIDGE_BEDROCK_*` flags.

**Frontend**
(`site/src/pages/AgentsPage/components/ChatModelAdminPanel/*`):
- API Key field is optional for Bedrock when central credentials are
enabled.
- Bedrock-specific descriptions on API Key and Base URL fields
(bearer-token vs ambient modes, `AWS_REGION` guidance).
- Right-aligned "Clear stored token" action switches an existing Bedrock
provider back to ambient mode.
- `hasEffectiveAPIKey` treats Bedrock with central credentials enabled
as configured, so the provider list shows the correct status icon.
- Three new stories: `ProviderFormBedrockAmbientCredentials`,
`ProviderFormBedrockBearerToken`, `ProviderFormBedrockClearBearerToken`.

**Docs** (`docs/ai-coder/agents/models.md`,
`docs/ai-coder/ai-gateway/setup.md`):
- New "Configuring AWS Bedrock" section covering both credential modes,
region resolution, and the Base URL override.
- Explicit note that the `us-east-1` region fallback only applies to
bearer-token mode; ambient credentials require a region from the
standard AWS SDK chain.
- Cross-reference in AI Gateway docs clarifying that
`CODER_AIBRIDGE_BEDROCK_*` flags are a separate configuration path from
Agents.

## Not in scope

- Reusing AI Gateway Bedrock flags as an implicit Agents fallback.
- Per-provider AWS access key, secret, or region fields (would need a
migration and audit-table review).
- IMDS or network-backed credential probes in admin/listing request
paths.

## Related

Dogfood deployment integration:
https://github.com/coder/dogfood/pull/324
2026-04-22 14:20:23 +02:00
Mathias Fredriksson 78d9a220cf fix(coderd/x/chatd): detect disconnected agents in getWorkspaceConn (#24336)
Add agent status check and dial timeout to getWorkspaceConn to
prevent tool calls from hanging when a workspace agent disconnects.

Status check: call isAgentUnreachable on every getWorkspaceConn
call. On cache miss, check the freshly fetched agent row. On
cache hit, re-fetch the agent row by PK for a fresh heartbeat
timestamp. Disconnected and timed-out agents return a sentinel
immediately; connecting agents proceed to dial.

Dial timeout: wrap dialWithLazyValidation in a 30s
context.WithTimeoutCause (matching 8 other server-side AgentConn
callers). Parent context cancellation propagates unchanged so
the chatloop can detect ErrInterrupted.

Both sentinels tell the LLM the agent is unreachable and the
workspace may need restarting from the dashboard.

Closes CODAGT-149
2026-04-22 12:10:32 +00:00
Cian Johnston 38f5d3f0b2 test: add regression guard for chat title masking (#24584)
Follow-up to #24564 addressing unresolved review findings.

- **DEREM-1**: Add `Test_diff/Chat/TitleMasked` to
`enterprise/audit/diff_internal_test.go` so flipping `title` back to
`ActionTrack` fails loudly. Verified: the case passes today, fails with
a clear diff after flipping to `ActionTrack`, passes again after
reverting.
- **DEREM-4**: Inline comment at `coderd/audit/request.go:138`
explaining why `ResourceTarget` for `database.Chat` returns a UUID
prefix instead of the title.
- **DEREM-5**: Trailing comment on `enterprise/audit/table.go` `title`
entry, matching the surrounding `ActionSecret` comment style.

Won't-fix, with rationale (per user):

- **DEREM-2** (8-char prefix collision risk): `resource_target` is a
display hint, not an identifier; the full UUID lives in `resource_id`.
- **DEREM-3** (named constant for `[:8]`): single call site; extracting
would be ceremony.
- **DEREM-6** (PR title misleading): merged PR title is immutable.
- **DEREM-7** (historical log redaction): the offending version only
shipped to dogfood for a couple of hours and not to customers.

> 🤖
2026-04-22 10:52:52 +00:00