Previously, `chats.last_model_config_id` was not updated when a user
sent a mid-chat message with a different model, and queued messages did
not store their own per-turn model, so promotion ran against whatever
the chat row said at promote time. Chat watch events also did not merge
`last_model_config_id` into the site's root, child, and per-chat
caches, so sidebar labels stayed stale after direct sends and queued
promotions.
- Add nullable `chat_queued_messages.model_config_id`, backfilled from
`chats.last_model_config_id`. Queued inserts round-trip the effective
model id at enqueue time.
- In `coderd/x/chatd`, direct sends update `chats.last_model_config_id`
inside the same transaction that inserts the admitted user message.
Manual promotion and auto-promotion use the queued row's stored
`model_config_id`, with a fallback to `chats.last_model_config_id`
for legacy NULL rows during rollout.
`PromoteQueuedOptions.ModelConfigID`
is now ignored.
- On the site, extract `mergeWatchedChatSummary` and
`mergeWatchedChatIntoCaches` in `site/src/api/queries/chats.ts` so
status-change watch events merge `last_model_config_id` into the
root infinite chat list, the parent-embedded child entry, and the
per-chat `chatKey(chatId)` cache. `updated_at` guards against stale
watch payloads clobbering newer cached state, while diff status
events still merge their PR metadata because they are timestamped
outside the chat row. Watch timestamps are compared as instants so
variable fractional precision does not make fresh events look stale.
- Queued promotion validates stored model config IDs before admission.
Invalid legacy queued IDs fall back to the chat's current model config
instead of dropping the queued message during auto-promotion.
- Backend and frontend regression coverage added for admission, queue
promotion (including FIFO across mixed models, legacy NULL fallback,
and invalid queued model IDs), and chat watch cache merging.
> Mux is acting on Mike's behalf.
Adds a deployment-wide admin override for general delegated subagents.
## What changed
- store the general override in `site_configs` and expose it through the
shared `agent-model-override/{context}` API
- apply the general override when spawning delegated general subagents,
while preserving the existing Explore override behavior
- reuse a shared Agents settings form for the general and Explore
override sections
## Validation
- `make gen`
- `go test ./coderd -run 'TestChatModelOverrides'`
- `go test ./coderd/x/chatd -run
'TestSpawnAgent_(GeneralUsesConfiguredModelOverride|GeneralOverrideLogsAndFallsBackWhenCredentialsUnavailable|GeneralOverrideLogsAndFallsBackWhenProviderDisabled)'`
- `pnpm -C site lint:types`
- `pnpm -C site test:storybook --
AgentSettingsAgentsPageView.stories.tsx`
- `make lint`
- `make pre-commit`
> Mux is acting on Mike's behalf.
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6*
Marks the `GET /api/v2/aibridge/interceptions` endpoint as deprecated in
favor of `/aibridge/sessions`, which provides richer session-level
aggregation including threads and agentic actions.
Changes:
- Add `@Deprecated` Swagger annotation to the endpoint handler
- Add deprecation notice to the
`codersdk.Client.AIBridgeListInterceptions` method
- Regenerated OpenAPI spec with `"deprecated": true` flag
The endpoint remains fully functional.
Fixes https://github.com/coder/internal/issues/1339
> [!WARNING]
> The change of the status code from `404` to `204` could break peoples
code downstream. Adding this as a breaking change incase.
Theres a whole ton of noise around failed requests, these are all
unrelated to the actual thing that is broken at hand (and are
confusing).
* Change `/api/v2/organizations/.../templates/.../versions/.../previous`
to return `204` instead of `404` (actually makes more sense because the
content doesn't exist, but the route is found.
* Remove unnecessary calls to `/api/v2/users/me/appearance` when the
user isn't logged in.
* Remove unnecessary calls to `/api/v2/deployment/stats` when the
deployment stats aren't allowed to be seen.
* Various changes to `workspace-sharing` so we don't make unnecessary
calls.
Whats left:
* `/api/v2/users/me` still `401`s on the login page. This persists as
when the user is logged in but tries to reach the sign-in page they
should be redirected to the app, not sign in again.
* `monaco-editor` is still upset... we theoretically could inject an
environment that can serve workers... but eh.
#### Old
```sh
% pnpm playwright:test -g "create workspace with default and required parameters"
> coder-v2@ playwright:test /home/coder/coder/site
> playwright test --config=e2e/playwright.config.ts -g 'create workspace with default and required parameters'
...
Running 2 tests using 1 worker
✓ 1 …e/setup/addUsersAndLicense.spec.ts:7:5 › setup deployment (8.2s)
2 ….ts:79:5 › create workspace with default and required parameters
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[console][error] Failed to load resource: the server responded with a status of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
[console][error] Failed to load resource: the server responded with a status of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
[console][error] Failed to load resource: the server responded with a status of 404 (Not Found)
[response] url=http://localhost:3111/api/v2/organizations//provisionerdaemons status=404 body={"message":"Resource not found or you do not have access to this resource"}
[console][error] Failed to load resource: the server responded with a status of 404 (Not Found)
[response] url=http://localhost:3111/api/v2/organizations/default/templates/a4e8096d/versions/agreeable_glenn33/previous status=404 body={"message":"No previous template version found for \"agreeable_glenn33\"."}
[console][warning] Could not create web worker(s). Falling back to loading web worker code in main thread, which might cause UI freezes. Please see https://github.com/microsoft/monaco-editor#faq
[console][warning] You must define a function MonacoEnvironment.getWorkerUrl or MonacoEnvironment.getWorker
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[console][error] Failed to load resource: the server responded with a status of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
✓ 2 …5 › create workspace with default and required parameters (7.0s)atus of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
[console][error] Failed to load resource: the server responded with a status of 403 (Forbidden)
[response] url=http://localhost:3111/api/v2/deployment/stats status=403 body={"message":"Forbidden.","detail":"You don't have permission to view this content. If you believe this is a mistake, please contact your administrator or try signing in with different credentials."}
2 passed (56.1s)
```
`23 LOL` (Lines of logs)
#### New
```sh
% pnpm playwright:test -g "create workspace with default and required parameters"
> coder-v2@ playwright:test /home/coder/coder/site
> playwright test --config=e2e/playwright.config.ts -g 'create workspace with default and required parameters'
...
Running 2 tests using 1 worker
✓ 1 …e/setup/addUsersAndLicense.spec.ts:7:5 › setup deployment (8.7s)
2 ….ts:79:5 › create workspace with default and required parameters
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[console][warning] Could not create web worker(s). Falling back to loading web worker code in main thread, which might cause UI freezes. Please see https://github.com/microsoft/monaco-editor#faq
[console][warning] You must define a function MonacoEnvironment.getWorkerUrl or MonacoEnvironment.getWorker
✓ 2 …5 › create workspace with default and required parameters (7.1s)atus of 401 (Unauthorized)
[console][error] Failed to load resource: the server responded with a status of 401 (Unauthorized)
[response] url=http://localhost:3111/api/v2/users/me/appearance status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
[response] url=http://localhost:3111/api/v2/users/me status=401 body={"message":"You are signed out or your session has expired. Please sign in again to continue.","detail":"Cookie \"coder_session_token\" or query parameter must be provided."}
2 passed (32.0s)
```
`9 LOL` (Lines of logs)
The Ctrl+D diff drawer in `coder exp agents` only rendered PR-backed
diffs returned by `/api/experimental/chats/{id}/diff`. Local working
tree changes in a chat's workspace returned an empty diff, so the
drawer showed "No diff contents" with no file summary.
Centralise diff loading behind a single `fetchChatDiffContents` helper
that first hits `/diff`, then falls back to the chat git watcher
WebSocket (`/stream/git`) when the remote diff is empty. Aggregate the
agent's `WorkspaceAgentRepoChanges` into a `ChatDiffContents` value so
the drawer can derive the file summary and styled body from the local
unified diff. Missing workspaces, missing agents, and watcher timeouts
are treated as graceful fallbacks that render the empty-diff
placeholder instead of a hard error.
> Mux is opening this PR on Mike's behalf.
Previously, the sessions list sorted by `MIN(started_at)` across
interceptions, so sessions with old start times but recent activity
would sink to the bottom of the list regardless of how recently they
were used.
`ListAIBridgeSessions` now sorts by `COALESCE(MAX(prompt.created_at),
MIN(started_at)) DESC`, exposed as the non-nullable `last_active_at`
field. Sessions with prompts surface by last activity; sessions with no
prompts fall back to their start time.
The original implementation used two separate columns (`last_active_at`
as a nullable prompt timestamp and `sort_at` as the non-nullable cursor
key). This revision collapses them into a single `last_active_at` that
is always set — simplifying the SQL, the Go conversion, the API type,
and the frontend.
🤖 Generated with [Claude Code](https://claude.ai/claude-code)
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Anthropic HTTP 400 responses (e.g. "image exceeds 5 MB maximum") were
collapsed in the chat UI to the generic headline "Anthropic returned an
unexpected error (HTTP 400)." with no actionable detail — the upstream
message survived to the processor log but was dropped before reaching
the client.
Add a new optional `Detail` field on `codersdk.ChatStreamError` that
carries the upstream provider message alongside the existing normalized
headline. The backend extracts `error.message` from
`fantasy.ProviderError.ResponseBody` (the JSON envelope shared by
Anthropic and OpenAI), falls back to the trimmed provider message when
the body is absent or unparseable, and caps the result at 500 runes. The
frontend threads `Detail` through `useChatStore`, `liveStatusModel`, and
`ChatStatusCallout`, rendering it as a muted secondary line inside the
existing `AlertDescription`.
Before:
<img width="1552" height="185" alt="image"
src="https://github.com/user-attachments/assets/524b588e-3cee-4fad-bc15-6bf3aec0899d"
/>
After:
<img width="814" height="173" alt="image"
src="https://github.com/user-attachments/assets/eae82a89-3ac1-4a33-8d18-ef9f77263d89"
/>
## Persistence
`Detail` is **not** persisted — it disappears on refresh. Persisting it
would require a DB change (today `chats.last_error` is a single nullable
`TEXT` column), and the shape of persisted chat errors is worth a more
deliberate rethink — e.g. promoting `last_error` to `JSONB` so we can
also retain structured fields like `kind`, `statusCode`, `provider`, and
`retryable` instead of only the normalized headline string. That's a
bigger design discussion than this PR should carry.
In the meantime, seeing the upstream error reason *immediately on
failure* is already a large UX improvement over the status quo, and this
PR gets us there without prejudicing the eventual persistence design.
Tracking persistence in CODAGT-239.
Closes CODAGT-235
## Summary
Allow root plan-mode chats to use MCP tools from external servers that
an admin has explicitly approved for plan mode. Workspace MCP and
plan-mode subagents remain blocked.
## Problem
`chatd.go` excluded every MCP tool when `isPlanModeTurn` was true, so
planning had no access to tools like docs search, ticketing, etc.
Lifting that guard wholesale was unsafe: `mcp_server_configs` already
has centralized admin governance, but workspace-local MCP (discovered
from agent `.mcp.json`) does not, and subagents use a narrower trust
boundary.
## Fix
Add an admin-controlled per-server `allow_in_plan_mode` flag (default
`false`) and gate plan-mode MCP access on it.
### Backend / schema
- New migration `000472_mcp_server_allow_in_plan_mode.{up,down}.sql` and
matching fixture update.
- `mcpserverconfigs.sql` + generated code: persist and read the new
column.
- `codersdk/mcp.go`: thread the field through `MCPServerConfig`,
`Create*`, and `Update*` request types.
- `coderd/mcp.go`: validate, persist, and return the flag in
get/list/create/update handlers.
### chatd
- `coderd/x/chatd/chatd.go`: pre-filter selected external MCP configs by
`AllowInPlanMode` before calling `mcpclient.ConnectAll` on plan-mode
root turns. Workspace MCP discovery is skipped entirely on plan-mode
turns.
- Single helper decides whether a tool is available in plan mode, used
both at construction and for active-tool filtering (defense in depth).
Plan-mode subagents, dynamic tools, provider-native tools, computer-use,
and workspace MCP stay unchanged.
- `coderd/x/chatd/prompt.go`: update the root plan-mode overlay text to
match the new boundary.
### UI
- `MCPServerAdminPanel.tsx`: add an explicit toggle ("Allow all tools
from this MCP server in root plan mode") next to the existing governance
controls.
- Regenerated `site/src/api/typesGenerated.ts`.
### Docs
- `docs/ai-coder/agents/architecture.md`: replace the blanket "MCP is
unavailable in plan mode" note with the new root-only, external-only,
admin-approved policy. Explicitly call out that workspace MCP and
plan-mode subagents are still excluded.
### Tests
- Plan-mode visibility (approved vs non-approved external server).
- Plan-mode invocation of an approved external MCP tool.
- End-to-end plan-mode workflow that uses an approved MCP tool and then
reaches `propose_plan`.
- Regressions: workspace MCP still excluded in plan mode; plan-mode
subagents still on the restricted tool boundary; existing tool
allow/deny list filtering still applies.
## Policy precedence
`allow_in_plan_mode` is an **additional** requirement on top of existing
`enabled`, availability, chat-selected / forced server IDs, and tool
allow/deny lists. It approves **all tools on that server** for root plan
mode; a per-tool plan allowlist is deliberately deferred.
## Follow-ups (explicitly out of scope)
- Whether plan-mode subagents should inherit approved external MCP
tools.
- Workspace-local MCP safety model (agent-side `.mcp.json` schema vs. a
coderd-managed workspace MCP config).
## Validation
- `go vet ./coderd/x/chatd/...`
- `go test ./coderd/x/chatd -run 'TestPlan.*|TestMCP.*' -count=1`
- `go test ./coderd/x/chatd -count=1 -timeout 5m` (full chatd suite)
- `make fmt` (no diff)
> Mux opened this PR on Mike's behalf.
## Problem
Coderd can expose an MCP server at `/api/experimental/mcp/http` (we have
this enabled on dogfood). Its workspace tools dialed agents through a
per-call client-side tailnet stack. Every tool call re-created a
WireGuard device, netstack, magicsock + UDP sockets, DERP connection,
coordinator websocket, and their goroutines — in a process that already
runs a long-lived shared tailnet. The duplicate stacks drove up resource
usage under load.
## Fix
Route this server's tool calls through the existing shared tailnet, so
none of those transports are reconstructed per call. Closing an
`AgentConn` now releases a tunnel reference instead of tearing down a
transport.
## Potential follow-up
`coder exp mcp server` still builds a fresh tailnet per call. It pays
per-call latency and causes coordinator/DERP churn. A shared CLI tailnet
is more involved — unlike coderd, the CLI has no existing shared tailnet
to reuse, so it would need a new long-lived client-side tailnet with
reconnect, sleep/wake, and idle-destination handling. There's less
motivation to optimize this, given the client-side MCP does not compete
for resources with coderd.
Closes CODAGT-199
> Generated by mux, but reviewed by a human
GetChats now returns only root chats (parent_chat_id IS NULL).
A new GetChildChatsByParentIDs query fetches children for visible
roots and embeds them in each parent's Children field. The
singular getChat endpoint does the same.
Archive invariant is one-way: parent archived implies child
archived. Parent archive/unarchive cascades via root_chat_id.
Individual child archive is permitted; child unarchive while the
parent is archived is rejected atomically (row lock on child,
re-read parent inside the transaction). Embedded children are
filtered by the caller's archive state so individually-archived
children stay hidden from active-parent views.
Gitsync MarkStale uses GetChatsByWorkspaceIDs directly;
MarkStaleParams.OwnerID removed (dead after the switch).
Frontend: buildChatTree reads from the embedded children field,
WebSocket handlers route child events into the parent's children
array, and archiving a child strips it from the parent cache.
Agents can already see workspace files and take screenshots, but users could not download those artifacts from chat. This PR adds durable chat attachments to chatd. `attach_file`, explicit `computer` screenshot actions (not the automatic post-action screenshots), and `propose_plan` now fetch bytes over the agent connection, store them in `chat_files`, link them to the chat, and carry attachment metadata in tool responses so `buildAssistantPartsForPersist` can materialize ordinary `type:"file"` assistant parts that the chat file APIs serve.
The same storage helpers are reused for other artifact-producing paths. `wait_agent` recordings and thumbnails are stored as chat files and linked back to the parent chat, with best-effort relinking so parent chats retain those artifacts without leaving orphaned rows when chat-file caps reject links. `storeChatAttachment` wraps insert + link in one transaction, files are capped at 10 MB each and 20 per chat, and serving defaults to `Content-Disposition: attachment` with an explicit inline-safe allowlist.
This PR also consolidates chat-file media policy in `coderd/chatfiles`. Uploads and tool-generated attachments share byte-based MIME detection, SVG blocking, inline-safety rules, and compatible `text/plain` refinement for JSON, CSV, and Markdown. Prompt construction still only inlines synthetic pasted text for model consumption; assistant-created attachments are persisted for the user and intentionally not replayed into later LLM turns.
UI follow-up lives in #24281.
Relates to CODAGT-91
Fixes three classes of edit_files bugs and adds structured per-file
diff output for tool callers:
- New IncludeDiff flag on FileEditRequest; when set, the agent
returns FileEditResponse.Files[]{Path, Diff} with unified diffs
computed via go-udiff v0.4.1 Lines + ToUnified (not Unified,
which calls log.Fatalf on internal error).
- Fuzzy match comparators split each line into leading whitespace,
body, trailing whitespace, and ending. The splice substitutes at
each position: on agreement between search and replace the file's
bytes win; on disagreement the replacement's bytes are spliced
verbatim. Carve-outs for empty-body lines, multi-line EOF splices,
and level-aware indent translation for inserted lines.
- Indent-unit detection (GCD for spaces, tab-priority) lets a 4sp
LLM search insert correctly into tab or 2sp files. Falls back to
the previous cLead-inheritance path when units can't be detected
cleanly.
- Empty search is rejected with "search string must not be empty".
- Duplicate file paths in one request are rejected; symlink aliases
resolved via api.resolvePath before the dedup check.
- Frontend EditFilesRenderer consumes the structured files array by
explicit path (no label munging) with per-file synthetic fallback
for older agents or mismatched paths. On error, no diff is
rendered so the synthetic fallback doesn't misrepresent a
rejected edit as applied.
Breaking change: AgentConn.EditFiles changes from (ctx, req) error
to (ctx, req) (FileEditResponse, error) in codersdk/workspacesdk.
Source-breaking for external Go consumers; no compat shim per plan
owner.
Out of scope (tracked in CODAGT-214): level-aware indent for
middle-substituted splice lines. Locked in
TestEditFiles_FuzzyIndent_InsertionLevelAware's Lock_* cases plus
TestEditFiles_ReplaceAll_FuzzyIndentGap.
Injects user secrets into workspace agents at runtime via the agent
manifest. Secrets with an environment variable name are set as
environment variables in every agent session and startup script. Secrets
with a file path are written to disk before startup scripts run.
- Fetch user secrets in GetManifest and convert to proto
- Defensively strip secrets from manifests received by the agent to
avoid accidental leakage
- Add WorkspaceSecret type and proto conversion helpers to agentsdk
- Write secret files eagerly on manifest fetch (0600 perms, 0700 dirs)
- Inject secret env vars per-session in updateCommandEnv
- Expand ~/paths using caller-resolved home directory
- Log file write errors without blocking workspace startup
> This PR was authored by Mux on behalf of Mike.
Introduce Explore mode, a read-only subagent modality for delegated
discovery and code investigation.
## What
Adds a `spawn_explore_agent` tool that creates child chats restricted to
read-only operations. An admin can optionally configure a
deployment-wide
model override so Explore subagents use a model optimized for large
context
or reasoning without changing the root chat's model.
### Backend
- New `ChatModeExplore` enum value (migration 000471).
- `spawn_explore_agent` tool definition with read-only allowlist:
`read_file`, `execute`, `process_output`, `read_skill`,
`read_skill_file`.
Write tools, file editors, and nested subagent spawning are blocked.
- Deployment config storage for the Explore model override
(`agents_chat_explore_model_override` in `site_configs`).
- Model resolution hierarchy: configured override, then current turn
model,
then global default. Silent fallback with warning log when the override
becomes unavailable.
- RBAC: `AsChatd` for daemon reads, `ActionRead` and `ActionUpdate` on
`ResourceDeploymentConfig` for admin API calls.
- Plan mode root chats can use `spawn_explore_agent` for read-only
research,
matching the planning prompt guidance.
- The Explore override config API now reports malformed saved overrides
as
"treated as unset" so admins can clear them explicitly.
### Frontend
- `ExploreModelOverrideSettings` component in admin agent behavior
settings.
Uses `ModelSelector`, handles unavailable model warnings, and supports
explicit Save and Clear actions.
- Malformed saved overrides show a warning and require an explicit Save
to
clear, instead of Clear auto-submitting behind the scenes.
### Tests
- Integration: `TestExploreSubagentIsReadOnly` (full spawn flow, tool
verification, prompt overlay, DB state).
- Unit: tool allowlist tests for explore, plan, and default modes.
- Internal: model override resolution with valid, invalid UUID,
disabled, and
unconfigured override scenarios.
- RBAC: `dbauthz_test.go` for `GetChatExploreModelOverride` and
`UpsertChatExploreModelOverride`.
- API: admin set and clear, malformed stored override reporting,
disabled
model rejection, non-admin denial.
Wire DERPTLSConfig through the CLI, SDK, tailnet, VPN client, agent, and
health checks to allow custom TLS configuration for DERP connections.
The main use case is to be able to set a custom CA and also present
client certs (mTLS). See https://github.com/coder/tailscale/pull/105 for
related changes.
Adds three new global CLI flags:
- `--client-tls-ca-file` / `CODER_CLIENT_TLS_CA_FILE`
- `--client-tls-cert-file` / `CODER_CLIENT_TLS_CERT_FILE`
- `--client-tls-key-file` / `CODER_CLIENT_TLS_KEY_FILE`
Based on community PR #22695 by @ibdafna, with autogeneration issues
fixed (protobuf version mismatches in .pb.go files, golden file
regeneration, lint fixes).
> [!NOTE]
> This PR was authored by Coder Agents on behalf of a Coder team member.
<details>
<summary>Relationship to #22695</summary>
This is a clean reimplementation of the changes from #22695 on top of
current `main`, with the following differences:
- **Removed**: Accidental protobuf version changes in `.pb.go` files
(contributor had `protoc v6.33.4` vs project's `protoc v4.23.4`)
- **Added**: Properly regenerated golden files and docs via `make gen`
- **Fixed**: Lint issue (`var-declaration` revive warning on explicit
type in `createHTTPClient`)
- All meaningful code changes are identical to the original PR
</details>
Add a `chat_client_type` enum (`ui` | `api`) and `client_type` column to
the `chats` table. The column defaults to `api` for new rows so API
callers don't need to set it explicitly. Existing rows are backfilled to
`ui`.
The field flows through `CreateChatRequest`, `chatd.CreateOptions`,
`InsertChat`, and is returned in the `Chat` response via `db2sdk`.
<details>
<summary>Implementation notes (Coder Agents generated)</summary>
### Changes
**Database migration (000469)**
- New enum `chat_client_type` with values `ui`, `api`.
- New `client_type` column, `NOT NULL DEFAULT 'api'`.
- Backfill: `UPDATE chats SET client_type = 'ui'`.
**SQL query** — `InsertChat` now includes `client_type`.
**SDK** — `ChatClientType` type added; `ClientType` field added to both
`CreateChatRequest` (optional, defaults server-side to `api`) and `Chat`
response.
**Handler** — `postChats` maps the request field (defaulting to `api`)
and passes it through `chatd.CreateOptions`.
**Sub-agent** — Child chats inherit their parent's `client_type`.
**db2sdk** — Maps the database value to the SDK type.
### Decision log
- Default is `api` (not `ui`) so existing API integrations get the
correct value without code changes.
- Backfill sets existing rows to `ui` per requirement.
- Child chats inherit `client_type` from parent rather than defaulting.
</details>
> This PR was authored by Mux on behalf of Mike.
## Summary
Adds support for multiple peer root workspace agents sharing the same
`auth_instance_id`, so AWS, Azure, and GCP instance-identity auth can
issue the correct session token for a selected agent instead of assuming
a
single root agent per instance.
## Problem
When a Terraform template attaches two or more `coder_agent` resources
(with `auth = "aws-instance-identity"`) to a single compute instance,
every agent shares the same cloud instance ID. The existing singular
lookup picks whichever agent was created most recently, silently
ignoring
the others.
## Solution
Introduce an optional pre-auth agent selector (`CODER_AGENT_NAME`) and
make the server-side lookup ambiguity-aware.
**Database layer:**
- `GetWorkspaceAgentsByInstanceID` (`:many`): returns all matching root
agents for an instance ID.
- `GetWorkspaceAgentByInstanceIDAndName` (`:one`): returns the named
root
agent for disambiguation.
**SDK and CLI:**
- `agent_name` field added to AWS, Azure, and GCP request structs
(`omitempty` for backward compatibility).
- `CODER_AGENT_NAME` env var and `--agent-name` flag wired into the
agent
bootstrap before instance-identity auth runs.
**Server handler (`handleAuthInstanceID`):**
- When `agent_name` is present: direct lookup by (instance ID, name).
- When absent: legacy lookup, then resource-scoped ambiguity check.
Returns 409 with available agent names if multiple root agents match.
- Whitespace-only names are trimmed and treated as unspecified.
- Sub-agents remain excluded (`parent_id IS NULL` filter).
**Verification template:**
- `examples/templates/aws-multi-agent/` provisions one EC2 instance with
two agents (`main` and `dev`), both using instance-identity auth with
`CODER_AGENT_NAME` set in the cloud-init user data.
## Backward compatibility
Existing single-agent deployments work unchanged. The `agent_name` field
is optional with `omitempty`, and the unnamed path preserves today's
behavior when only one root agent matches.
> This PR was authored by Mux on behalf of Mike.
## Summary
- add persistent plan mode for chats and the chat-specific plan file
flow
- add structured planning tools such as `ask_user_question` and
`propose_plan`
- keep `write_file` and `edit_files` constrained to the chat-specific
plan file during plan turns
- allow shell exploration in plan mode, including subagents, via
`execute` and `process_output`
- block implementation-oriented, provider-native, MCP, dynamic, and
computer-use tools during plan turns
- update the chat UI, tests, and docs for the new planning flow
## Summary
Adds `--ai-gateway-allow-byok` deployment option to control whether
users can use Bring Your Own Key (BYOK) mode with AI Gateway.
When disabled (`--ai-gateway-allow-byok=false`), BYOK requests are
rejected with a 403 and a message directing the admin to enable the
flag. Centralized key authentication works regardless of this setting.
Defaults to `true` (BYOK allowed).
---------
Co-authored-by: Danny Kopping <danny@coder.com>
_Disclaimer: produced mostly by Claude Opus 4.6 following detailed
planning._
## Summary
- Support multiple instances of the same AI Bridge provider type via
indexed env vars (`CODER_AIBRIDGE_PROVIDER_<N>_<KEY>`), following the
`CODER_EXTERNAL_AUTH_<N>_<KEY>` pattern
- Existing single-provider env vars (`CODER_AIBRIDGE_OPENAI_KEY`, etc.)
continue to work unchanged
- Setting both a legacy env var and an indexed provider with the same
name errors at startup to prevent silent misconfiguration
- Mark legacy provider fields (`OpenAI`, `Anthropic`, `Bedrock`) as
deprecated in `AIBridgeConfig` in favor of `Providers`
## Example
```sh
CODER_AIBRIDGE_PROVIDER_0_TYPE=anthropic
CODER_AIBRIDGE_PROVIDER_0_NAME=anthropic-corp
CODER_AIBRIDGE_PROVIDER_0_KEY=sk-ant-corp-xxx
CODER_AIBRIDGE_PROVIDER_0_BASE_URL=https://llm-proxy.internal.example.com/anthropic
CODER_AIBRIDGE_PROVIDER_1_TYPE=anthropic
CODER_AIBRIDGE_PROVIDER_1_NAME=anthropic-direct
CODER_AIBRIDGE_PROVIDER_1_KEY=sk-ant-direct-yyy
```
Each instance is routed by name:
- /api/v2/aibridge/**anthropic-corp**/v1/messages
- /api/v2/aibridge/**anthropic-direct**/v1/messages
Closes
[AIGOV-157](https://linear.app/codercom/issue/AIGOV-157/spike-to-understand-if-there-is-a-simple-way-to-handle-multi-api-key)
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Previously we reserved some env vars that may collide with AI gateway.
These were incomplete and take away flexibility from the user, which
we're prioritizing in the first iteration of the feature.
This was originally added because it was present in `env` output in
dogfood, but it's specifically injected in the dogfood template so it
doesn't make sense to deny across the board for a secret environment
variable name.
* Removes experiment `web-push`.
* Falls back to NoopWebpusher in case of error
* Checks browser capability in FE
* Adds note to agents getting-started docs regarding webpush without TLS
> 🤖
Add secret value validation to reject null bytes and values exceeding 32KB.
The 32KB limit applies uniformly to both env var and file secrets because the
value field is shared and the destination can change after creation.
Add file path validation to also reject null bytes and paths exceeding 4096
bytes.
Wire up secret value validation into both POST and PATCH handlers.
Fixes https://github.com/coder/internal/issues/1436
* Adds organization_id to chats with backfill (workspace org → user org membership → default org)
* No support yet for ACLs (follow-up issue)
- Cross-org workspace binding rejected (both in `CreateChatRequest` and in `create_workspace` tool
- Adds `OrganizationAutocomplete` to `AgentCreateForm`
- Docs updated with `organization_id` in chats-api.md
> 🤖 Written by a Coder Agent. Reviewed by many humans and many agents.
---------
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
Closes#16332
Previously `coder provisioner jobs list` showed no indication of what a workspace
build job was doing (i.e., start, stop, or delete). This adds
`workspace_build_transition` to the provisioner job metadata, exposed in
both the REST API and CLI. Template and workspace name columns were also
added, both available via `-c`.
```
$ coder provisioner jobs list -c id,type,status,"workspace build transition"
ID TYPE STATUS WORKSPACE BUILD TRANSITION
95f35545-a59f-4900-813d-80b8c8fd7a33 template_version_import succeeded
0a903bbe-cef5-4e72-9e62-f7e7b4dfbb7a workspace_build succeeded start
```
The "By model" and "Pull requests" tables on the PR Insights page
(`/agents/settings/insights`) were side-by-side at `lg` breakpoints, and
the Pull requests table was hard-capped at 20 rows by the backend.
- Replaced `lg:grid-cols-2` with a single-column stacked layout so both
tables span the full content width.
- Removed the `LIMIT 20` from the `GetPRInsightsRecentPRs` SQL query so
all PRs in the selected time range are returned.
- Can add this back if we need it. If we do, we should add a little
subheader above this table to indicate that we're not showing all PRs
within the selected timeframe.
- Added client-side pagination to the Pull requests table using
`PaginationWidgetBase` (page size 10), matching the existing pattern in
`ChatCostSummaryView`.
- Renamed the section heading from "Recent" to "Pull requests" since it
now shows the full set for the time range.
<img width="1481" height="1817" alt="image"
src="https://github.com/user-attachments/assets/0066c42f-4d7b-4cee-b64b-6680848edc68"
/>
> 🤖 PR generated with Coder Agents
Go's html/template has a built-in security filter (urlFilter) that only
allows http, https, and mailto URL schemes. Any other scheme gets
replaced with #ZgotmplZ.
The OAuth2 app's callback URL uses custom URI scheme which the filter
considers unsafe. For example the Coder JetBrains plugin exposes a
callback URI with the scheme jetbrains:// - which was effectively
changed by the template engine into #ZgotmplZ. Of course this is not an
actual callback. When users clicked the cancel button nothing happened.
The fix was simple - we now wrap the apps registered callback URI into
htmltemplate.URL. Usually this needs some validation otherwise the
linter will complain about it. The callback URI used by the Cancel logic
is actually validated by our backend when the client app
programmatically registered via the dynamic OAuth2 registration
endpoints, so we refactored the validation around that code and re-used
some of it in the Cancel handling to make sure we don't allow URIs like
`javascript` and `data`, even though in theory these URIs were already
validated.
In addition, while testing this PR with
https://github.com/coder/coder-jetbrains-toolbox/pull/209 I discovered
that we are also not compliant with
https://www.rfc-editor.org/rfc/rfc6749#section-4.1.2.1 which requires
the server to attach the local state if it was provided by the client in
the original request. Also it is optional but generally a good practice
to include `error_description` in the error responses. In fact we follow
this pattern for the other types of error responses. So this is not a
one off.
- resolves#20323
<img width="1485" height="771" alt="Cancel_page_with_invalid_uri"
src="https://github.com/user-attachments/assets/5539d234-9ce3-4dda-b421-d023fc9aa99e"
/>
<img width="486" height="746" alt="Coder Toolbox handling the Cancel
button"
src="https://github.com/user-attachments/assets/acab71a6-d29c-4fa9-80ba-3c0095bbdc8f"
/>
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
Add the five REST endpoints for managing user secrets, SDK client
methods, and handler tests.
Endpoints:
- `POST /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets/{name}`
- `PATCH /api/v2/users/{user}/secrets/{name}`
- `DELETE /api/v2/users/{user}/secrets/{name}`
Routes are registered under the existing `/{user}` group with
`ExtractUserParam`. The delete query was changed from `:exec` to
`:execrows` so the handler can distinguish "not found" from success
(DELETE with `:exec` silently returns nil for zero affected rows).
## Summary
Exposes `credential_kind` and `credential_hint` on AI Bridge session
threads, making credential metadata visible in the session detail API.
Each thread in the `/api/v2/aibridge/sessions/{session_id}` response now
includes:
- `credential_kind`: `centralized` or `byok`
- `credential_hint`: masked credential (e.g. `sk-a...pgAA`)
Values are taken from the thread's root interception.
## Changes
- `codersdk/aibridge.go`: Added `CredentialKind` and `CredentialHint`
fields to `AIBridgeThread`
- `coderd/database/db2sdk/db2sdk.go`: Populated from root interception
in `buildAIBridgeThread`
- `SessionTimeline.stories.tsx`: Added fields to mock thread data
Adds `coder exp chat context add` and `coder exp chat context clear`
commands that run inside a workspace to manage chat context files via
the agent token.
`add` reads instruction and skill files from a directory (defaulting to
cwd) and inserts them as context-file messages into an active chat.
Multiple calls are additive — `instructionFromContextFiles` already
accumulates all context-file parts across messages.
`clear` soft-deletes all context-file messages, causing
`contextFileAgentID()` to return `!found` on the next turn, which
triggers `needsInstructionPersist=true` and re-fetches defaults from the
agent.
Both commands auto-detect the target chat via `CODER_CHAT_ID` (already
set by `agentproc` on chat-spawned processes), or fall back to
single-active-chat resolution for the agent. The `--chat` flag overrides
both.
Also adds sub-agent context inheritance: `createChildSubagentChat` now
copies parent context-file messages to child chats at spawn time, so
delegated sub-agents share the same instruction context without
independently re-fetching from the workspace agent.
<details><summary>Implementation details</summary>
**New files:**
- `cli/exp_chat.go` — CLI command tree under `coder exp chat context`
**Modified files:**
- `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context
from an arbitrary directory without env vars
- `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext`
SDK methods
- `coderd/workspaceagents.go` — POST/DELETE handlers on
`/workspaceagents/me/chat-context`
- `coderd/coderd.go` — Route registration
- `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`,
`SoftDeleteContextFileMessages`
- `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new
queries
- `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent
inheritance
- `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()`
**Auth pattern:** Uses `AgentAuth` (same as `coder external-auth`) —
agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars.
</details>
> 🤖 Generated by Coder Agents
---------
Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>
The agents chat interface displays thumbnails for videos recorded by the
computer use agent. Currently, to display a thumbnail, the frontend
downloads the entire video and shows the first frame. This PR starts
storing a new thumbnail file in the database for every recorded video,
and exposes the file id in the `wait_agent` tool result alongside the
recording file id, so the frontend can fetch just the thumbnail.
Workspace agent logs could still fail after the earlier invalid UTF-8
fix because NUL bytes are valid Go/protobuf strings but are rejected by
Postgres text columns. The legacy HTTP log upload path also bypassed the
old sanitization entirely, and both server insert paths computed
logs_length from the unsanitized input.
Add a shared log-output sanitizer in agentsdk, use it in the protobuf
conversion path and both server-side insert paths, and compute
OutputLength from the sanitized string so overflow accounting matches
what is actually stored. This keeps the old invalid UTF-8 behavior while
also handling embedded NUL bytes consistently across DRPC and HTTP log
ingestion.
Refs [#23292 ](https://github.com/coder/coder/issues/23292)
Refs [#13433 ](https://github.com/coder/coder/issues/13433)
Adds backend validation for user secret environment variable names and file paths.
Env name validation enforces POSIX naming rules and blocks a deliberately aggressive denylist of reserved names and prefixes. The denylist errs on the side of blocking too much since it's easier to remove entries later than to add them after users have created conflicting secrets.
File path validation requires paths to start with ~/ or /.
Adds an optional `CreatedAt` timestamp to `tool-call` and `tool-result`
`ChatMessagePart` variants so the frontend can compute tool execution
duration (`result.created_at - call.created_at`).
Timestamps are recorded at the correct moments in the chatloop:
- **Tool-call**: when the model stream emits the tool call
- **Tool-result**: when tool execution completes (or is interrupted)
These are passed through `PersistedStep.PartCreatedAt` so the
persistence layer can apply accurate timestamps to stored parts.
SSE-published parts also carry `CreatedAt` for real-time display.
Old persisted messages without `created_at` deserialize to `nil` — fully
backward compatible.
<details><summary>Implementation notes (Coder Agents
generated)</summary>
### Why not stamp in `PartFromContent`?
`PartFromContent` is called both for SSE publishing (correct timing) and
during persistence (wrong timing — both tool-call and tool-result would
get the same "persistence time" timestamp, yielding ~0 duration).
Instead, timestamps are captured in the chatloop at the right moments
and carried through `PersistedStep.PartCreatedAt` as a
`map[string]time.Time` keyed by `"call:<id>"` / `"result:<id>"`.
### Interrupted tool calls
`persistInterruptedStep` also stamps `CreatedAt` on synthetic error
results for cancelled/interrupted tool calls, so partial duration is
available.
### Files changed
| File | Change |
|------|--------|
| `codersdk/chats.go` | Add `CreatedAt *time.Time` field |
| `codersdk/chats_test.go` | JSON round-trip test |
| `coderd/database/dbtime/dbtime.go` | Add `TimePtr` helper |
| `coderd/x/chatd/chatloop/chatloop.go` | Track timestamps, pass through
`PersistedStep` |
| `coderd/x/chatd/chatd.go` | Apply timestamps during persistence |
| `coderd/x/chatd/chatprompt/chatprompt_test.go` | Verify
`PartFromContent` does NOT stamp |
| `site/src/api/typesGenerated.ts` | Auto-generated |
</details>
---------
Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>
Adds client-executed dynamic tools to the chat API. Dynamic tools are
declared by the client at chat creation time, presented to the LLM
alongside built-in tools, but executed by the client rather than chatd.
This enables external systems (Slack bots, IDE extensions, Discord bots,
CI/CD integrations) to plug custom tools into the LLM chat loop without
modifying chatd's built-in tool set.
Modeled after OpenAI's Assistants API: the chat pauses with
`requires_action` status when the LLM calls a dynamic tool, the client
POSTs results back via `POST /chats/{id}/tool-results`, and the chat
resumes.
See [this example](https://github.com/coder/coder-slackbot-poc) as a
reference for how this is used. It's highly-configurable, which would
enable creating chats from webhooks, periodically polling, or running as
a Slackbot.
<details>
<summary>Design context</summary>
### Architecture
The chatloop **exits** when it encounters dynamic tools and
**re-enters** when results arrive. No blocking channels, no pubsub for
tool results, no in-memory registry. The DB is the only coordination
mechanism.
```
Phase 1 (chatloop):
LLM response → execute built-in tools only →
Persist(assistant + built-in results) →
status = requires_action → chatloop exits
Phase 2 (POST /tool-results):
Persist(dynamic tool results) →
status = pending → wakeCh → chatloop re-enters
```
### Validation (POST /tool-results)
1. Chat status must be `requires_action` (409 if not)
2. Read chat's `dynamic_tools` → set of dynamic tool names
3. Read last assistant message → extract tool-call parts matching
dynamic tool names
4. Submitted tool_call_ids must match exactly (400 for missing/extra)
5. Persist tool-result message parts, set status to `pending`, signal
wake
### Idempotency
Tool call IDs scoped per LLM step. State machine (`requires_action` →
`pending`) is the guard. First POST wins, subsequent get 409.
### Mixed tool calls
When the LLM calls both built-in and dynamic tools in one step, built-in
tools execute immediately. Their results are persisted in phase 1.
Dynamic tool results arrive via POST in phase 2. The LLM sees all
results when the chatloop resumes.
</details>
> 🤖 Generated by Coder Agents
Fixes https://github.com/coder/coder/issues/23910
Adds periodic cleanup of chats and chat files to the dbpurge background
goroutine, with a configurable retention period exposed in the Agent
settings UI.
> 🤖 Written by a Coder Agent. Reviewed by a human.
Audit and connection log pages were timing out due to expensive COUNT(*)
queries over large tables. This commit adds opt-in count capping: requests can
return a `count_cap` field signaling that the count was truncated at a threshold,
avoiding full table scans that caused page timeouts.
Text-cast UUID comparisons in regosql-generated authorization queries
also contributed to the slowdown by preventing index usage for connection
and audit log queries. These now emit native UUID operators.
Frontend changes handle the capped state in usePaginatedQuery and
PaginationWidget, optionally displaying a capped count in the pagination
UI (e.g. "Showing 2,076 to 2,100 of 2,000+ logs")
Related to:
https://linear.app/codercom/issue/PLAT-31/connectionaudit-log-performance-issue
Needed by #23833
Adds a `chat_file_links` association table to track which files are
associated with each chat.
- `AppendChatFileIDs` query links a file to a chat with deduplication
- `GetChatFileMetadataByIDs` query returns lightweight file metadata by
IDs
- Tool-created files (e.g. `propose_plan`) are linked to the chat after
insert
- User-uploaded files are linked to the chat when the referencing
message is sent
- Single-chat GET endpoint hydrates `files: ChatFileMetadata[]` on the
response
> 🤖 Created by Coder Agents and massaged into shape by a human.