## Summary
Adds a "Generate new title" action that lets users manually regenerate a
chat's title using richer conversation context than the automatic
first-message title path.
## Changes
### Backend
- **New endpoint:** `POST
/api/experimental/chats/{chatID}/title/regenerate` returns the updated
Chat with a regenerated title
- **Manual title algorithm:** Extracts useful user/assistant text turns
→ selects first user turn + last 3 turns → builds context with gap
markers → renders prompt with anti-recency guidance → calls lightweight
model → normalizes output
- **Helpers:** `extractManualTitleTurns`,
`selectManualTitleTurnIndexes`, `buildManualTitleContext`,
`renderManualTitlePrompt`, `generateManualTitle` — all private, with the
public `Server.RegenerateChatTitle` method
- **SDK:** `ExperimentalClient.RegenerateChatTitle(ctx, chatID) (Chat,
error)`
- Persists title via existing `UpdateChatByID` and broadcasts
`ChatEventKindTitleChange`
### Frontend
- API client method + React Query mutation with cache invalidation
- "Generate new title" menu item (with wand icon) in both TopBar and
Sidebar dropdown menus
- Loading/disabled state while regeneration is in-flight
- Error toast on failure
- Stories updated for both menus
### Tests
- `quickgen_test.go`: Table-driven tests for all 4 helper functions
(turn extraction, index selection, context building, prompt rendering)
- `exp_chats_test.go`: Handler tests (ChatNotFound,
NotFoundForDifferentUser, NoDaemon)
## Design notes
- The existing auto-title path (`maybeGenerateChatTitle`, `titleInput`)
is completely unchanged
- Manual regeneration uses richer context (first user turn + last 3
turns + gap markers) vs the auto path's single first message
- Endpoint is experimental and marked with `@x-apidocgen {"skip": true}`
https://github.com/user-attachments/assets/bd5d12a1-61b3-4b7d-83b6-317bdfb60b3c
## Summary
Adds pinned chats to the agents page sidebar with server-side
persistence and drag-to-reorder. Users can pin/unpin chats via the
context menu, and pinned chats appear in a dedicated "Pinned" section
above the time-grouped list.
## Database
Migration `000453_chat_pin_order`: adds `pin_order integer DEFAULT 0 NOT
NULL` column on `chats` (0 = unpinned, 1+ = pinned in display order).
Three SQL queries handle pin operations server-side using CTEs with
`ROW_NUMBER()`:
- `PinChatByID`: normalizes existing orders and appends to end
- `UnpinChatByID`: sets target to 0 and compacts remaining pins
- `UpdateChatPinOrder`: shifts neighbors, clamps to `[1, pinned_count]`
All queries exclude archived chats. `ArchiveChatByID` clears `pin_order`
on archive. The handler rejects pinning archived chats with 400.
## Backend
Pin/unpin/reorder go through the existing `PATCH
/api/experimental/chats/{chat}` via the `pin_order` field on
`UpdateChatRequest`. The handler routes based on current pin state:
`pin_order == 0` unpins, `> 0` on an already-pinned chat reorders, `> 0`
on an unpinned chat appends to end.
## Frontend
- `pinChat` / `unpinChat` / `reorderPinnedChat` optimistic mutations
using shared `isChatListQuery` predicate
- Sidebar renders Pinned section above time groups, excludes pinned
chats from time groups
- Pin/Unpin context menu items (hidden for child/delegated chats)
- `@dnd-kit/core` + `@dnd-kit/sortable` for drag-to-reorder with
`MouseSensor`, `TouchSensor`, and `KeyboardSensor`
- Local pin-order override prevents flash on drop; click blocker
prevents NavLink navigation after drag
---
*PR generated with Coder Agents*
Coder's chat (chatd) can now discover and use MCP servers configured in
a workspace's `.mcp.json` file. This brings project-specific tooling
(GitHub, databases, docs servers, etc.) into the chat without any manual
configuration.
## How it works
The workspace agent reads `.mcp.json` from the workspace directory (same
format Claude Code uses), connects to the declared MCP servers —
spawning child processes for stdio servers and connecting over the
network for HTTP/SSE — and caches their tool lists. Two new agent HTTP
endpoints expose this:
- `GET /api/v0/mcp/tools` returns the cached tool list (supports
`?refresh=true`)
- `POST /api/v0/mcp/call-tool` proxies calls to the correct server
On each chat turn, chatd calls `ListMCPTools` through the existing
`AgentConn` tailnet connection, wraps each tool as a
`fantasy.AgentTool`, and adds them to the LLM's tool set alongside
built-in and admin-configured MCP tools. Tool names are prefixed with
the server name (`github__create_issue`) to avoid collisions.
Failed server connections are logged and skipped — they never block the
agent or break the chat. Child stdio processes are terminated on agent
shutdown.
*Problem:* `publishChatPubsubEvent` was constructing a partial
`codersdk.Chat` that omitted `LastModelConfigID` and other fields. Go's
zero-value UUID caused the sidebar to show "Default model" for chats
received via SSE.
*Solution:*
- Extracted `convertChat`/`convertChats` from `exp_chats.go` into
`db2sdk.Chat`/`db2sdk.Chats`, alongside existing `ChatMessage`,
`ChatQueuedMessage`, and `ChatDiffStatus` converters.
`publishChatPubsubEvent` now calls `db2sdk.Chat(chat, nil)` instead of
maintaining its own copy of the conversion logic
- Added backend integration test
`TestWatchChats/CreatedEventIncludesAllChatFields`
- Added frontend regression tests for nil-UUID and valid model config ID
cases
> 🤖 Created by Coder Agents, reviewed by this human.
Adds an `enabled` toggle to the chat model admin create/edit form so
admins
can disable a model without soft-deleting it. Disabled models stay
visible
in admin settings but stop appearing in user-facing model selectors.
The backend already supported this (`chat_model_configs.enabled` column,
filtered queries, and SDK fields). This change wires it into the admin
UI
and adds coverage on both sides.
**Backend:** three new subtests in `coderd/exp_chats_test.go` verifying
the visibility contract (admin sees disabled models, non-admin doesn't,
update-to-disabled preserves the record).
**Frontend:** `enabled` field added to form logic and seeded from the
existing model (defaults to `true` for new models). A Switch+Tooltip
control renders in the form header, matching the MCP Server panel
pattern.
Two interaction stories cover the create-disabled and toggle-existing
flows.
> **PR Stack**
> 1. **#23351** ← `#23282` *(you are here)*
> 2. #23282 ← `#23275`
> 3. #23275 ← `#23349`
> 4. #23349 ← `main`
---
## Summary
`chatretry.Retry()` used pure exponential backoff (1 s, 2 s, 4 s, …) and
never consulted provider `Retry-After` headers. Fantasy's
`ProviderError` carries `ResponseHeaders` including `Retry-After`, but
`chaterror.Classify()` only parsed error text and silently dropped the
structured transport metadata.
This makes `Retry-After` a first-class signal in the classification →
retry pipeline.
<img width="853" height="346" alt="image"
src="https://github.com/user-attachments/assets/65f012b6-8173-43d2-957e-ab9faddea525"
/>
## Changes
### `coderd/chatd/chaterror/classify.go`
- Added `RetryAfter time.Duration` field to `ClassifiedError` — a
normalized minimum retry delay derived from provider response metadata.
- `Classify()` now calls `extractProviderErrorDetails()` before falling
back to text heuristics. Structured `ProviderError.StatusCode` takes
priority over regex extraction.
- `normalizeClassification()` preserves and clamps `RetryAfter`.
### `coderd/chatd/chaterror/provider_error.go` (new)
Provider-specific extraction, isolated from the text-based
classification logic:
- `extractProviderErrorDetails()` unwraps `*fantasy.ProviderError` from
the error chain via `errors.As`.
- `retryAfterFromHeaders()` parses headers in priority order:
1. `retry-after-ms` (OpenAI-specific, millisecond precision)
2. `retry-after` (standard HTTP — integer seconds or HTTP-date)
- Case-insensitive header key lookup.
### `coderd/chatd/chatretry/chatretry.go`
- `effectiveDelay(attempt, classified)` computes `max(Delay(attempt),
classified.RetryAfter)` — the provider hint acts as a floor without
weakening the local exponential backoff.
- `Retry()` now uses `effectiveDelay` and passes the effective delay to
both `onRetry(...)` and the sleep timer, so downstream payloads, logs,
and the frontend countdown stay aligned automatically.
### Tests
- `classify_test.go`: Structured provider status + `Retry-After`
extraction, `retry-after-ms` priority, HTTP-date parsing, invalid header
fallback, `WithProvider` preservation.
- `chatretry_test.go`: Retry-after-as-floor semantics — longer hint
wins, shorter hint keeps base delay.
## Design notes
- **No SDK/API/frontend changes needed.** `codersdk.ChatStreamRetry`
already carries `DelayMs` and `RetryingAt`, and the frontend already
consumes them. The fix is purely in the server-side delay computation.
- **Existing retryability rules unchanged.** This fixes *when* we sleep,
not *whether* an error is retryable.
- **Provider hint is a floor:** `max(baseDelay, RetryAfter)` ensures we
never retry earlier than the provider asks, and never weaken our own
backoff curve.
## Changes
- **Commit 1**: Remove 17 unnecessary `//nolint` directives:
- `//nolint:varnamelen` — linter not active
- `//nolint:unused` on exported `SlimUnsupported`
- `//nolint:govet` in `coderd/httpmw/csrf` — no longer fires
- `//nolint:revive` on functions refactored since the nolint was added
- `//nolint:paralleltest` citing Go 1.22 loop variable capture
(obsolete)
- Bare `//nolint` narrowed to specific `//nolint:gocritic` with
justification
- **Commit 2**: Fix root causes behind 5 dangerous nolint suppressions:
- Add `MinVersion: tls.VersionTLS12` to TLS client config (removes
`gosec` G402)
- Delete trivial unexported wrappers `apiKey()`/`normalizeProvider()` in
chatprovider (removes `revive` confusing-naming)
- Add doc comments to `StartWithAssert` and `Router` (removes `revive`
exported)
- Rename unused parameters to `_` in integration test helpers
> 🤖 This PR was created using Coder Agents and reviewed by me.
Admins can now control whether the built-in Coder Agents default system
prompt is prepended to their custom instructions, rather than having the
custom prompt silently replace the default.
**Changes:**
- New `include_default_system_prompt` boolean toggle (defaults to `true`
for existing deployments) stored as a site config key — no migration
needed.
- GET `/api/experimental/chats/config/system-prompt` returns the toggle
state, the custom prompt, and a preview of the built-in default.
- PUT persists both the toggle and custom prompt atomically in a single
transaction.
- `resolvedChatSystemPrompt()` composes `[default?, custom?]` joined by
`\n\n`, falling back to the built-in default on DB errors.
- Settings UI adds a Switch toggle with conditional helper text and a
"Preview" button that shows the built-in default prompt via the existing
`TextPreviewDialog`.
- Comprehensive test coverage: 15 subtests covering toggle behavior,
prompt composition matrix, auth boundaries, and integration with chat
creation.
- Adds `GET /api/experimental/chats/by-workspace` endpoint that returns
workspace_id → latest chat_id mapping
- Modifies FE to fetch this alongside the workspace list, gated on
`agents` experiment and render an "Agent" badge similar to the existing
"Task" badge in `WorkspacesTable`
- Badge links to the "latest chat" linked to the given workspace.
Notes:
- Intentionally uses `fetchWithPostFilter` for RBAC to decouple from
workspaces API — will migrate to `workspaces_expanded` view later.
- If users have multiple chats linked to the same workspace, the badge
will link to the most recently updated one.
> 🤖 This PR was created with the help of Coder Agents, and has been
reviewed by my human. 🧑💻
## Summary
Adds an entitlement-gated **AI add-on** column to both the **Users**
table and the **Organization Members** table. When
`ai_governance_user_limit` is entitled, each row shows whether the user
is consuming an AI seat.
## Background
The AI governance add-on tracks which users are consuming AI seats.
Admins need visibility into per-user seat consumption directly from the
user management tables. This change surfaces that information through
both the site-wide Users table and the per-organization Members table,
gated behind the `ai_governance_user_limit` entitlement so the column
only appears when the feature is licensed.
## Implementation
### Backend
- **New SQL query** `GetUserAISeatStates`
(`coderd/database/queries/aiseatstate.sql`) — returns user IDs consuming
an AI seat, derived from:
- Users with entries in `aibridge_interceptions` (AI Bridge usage)
- Users who own workspaces with `has_ai_task = true` builds (AI Tasks
usage)
- **SDK types** — added `has_ai_seat: boolean` to `codersdk.User` and
`codersdk.OrganizationMemberWithUserData`
- **Handler wiring** — both the Users list endpoint (`coderd/users.go`)
and all Members endpoints (`coderd/members.go`) query AI seat state per
page of user IDs and populate the response field
- **dbauthz** — per-user `ActionRead` checks on `ResourceUserObject`
### Frontend
- **Shared `AISeatCell` component**
(`site/src/modules/users/AISeatCell.tsx`) — green `CircleCheck` for
consuming, gray `X` for non-consuming
- **`TableColumnHelpTooltip`** — extended with `ai_addon` variant with
tooltip: *"Users with access to AI features like AI Bridge, Boundary, or
Tasks who are actively consuming a seat."*
- **Column visibility** gated behind
`useFeatureVisibility().ai_governance_user_limit`
## Validation
- Backend: dbauthz full method suite (`TestMethodTestSuite`) passes
including new `GetUserAISeatStates` test
- Backend: `TestGetUsers`, `TestUsersFilter`, CLI golden file tests pass
- Frontend: 7/7 tests pass across `UsersPage.test.tsx` and
`OrganizationMembersPage.test.tsx` (column visibility gating both
directions)
- `go build ./coderd/...` compiles clean
- `pnpm --dir site run lint:types` passes
- `make gen` clean
## Risks
- **Pagination performance**: The AI seat query is scoped to the current
page's user IDs (not a full table scan), keeping it efficient for
paginated views.
- **Semantic scope**: The workspace-side AI seat derivation uses "any
build with `has_ai_task = true`" rather than "latest build only". If the
product intent is latest-build-only, this can be tightened in a
follow-up.
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking:
`xhigh` • Cost: `$27.25`_
<!-- mux-attribution: model=anthropic:claude-opus-4-6 thinking=xhigh
costs=27.25 -->
## Summary
Adds a process-wide cache for three hot database queries in `chatd` that
were hitting Postgres on **every chat turn** despite returning
rarely-changing configuration data:
| Query | Before (50k turns) | After | Reduction |
|---|---|---|---|
| `GetEnabledChatProviders` | ~98.6k calls | ~500-1000 | ~99% |
| `GetChatModelConfigByID` | ~49.2k calls | ~500-1000 | ~98% |
| `GetUserChatCustomPrompt` | ~46.7k calls | ~1000-2000 | ~97% |
These were identified via `coder exp scaletest chat` (5000 concurrent
chats × 10 turns) as the dominant source of Postgres load during chat
processing.
## Design
Follows the established **webpush subscription cache pattern**
(`coderd/webpush/webpush.go`):
- `sync.RWMutex` + `tailscale.com/util/singleflight` (generic) +
generation-based stale prevention + TTL
- 10s TTL for provider/model config, 5s TTL for user prompts
- Negative caching for `sql.ErrNoRows` on user prompts (the common case
— most users don't set custom prompts)
- Deep-clones `ChatModelConfig.Options` (`json.RawMessage` = `[]byte`)
on both store and read paths
### Invalidation
Single pubsub channel (`chat:config_change`) with kind discriminator for
cross-replica cache invalidation. Seven publish points in
`coderd/chats.go` cover all admin mutation endpoints
(create/update/delete for providers and model configs, put for user
prompts).
_This PR was generated with mux and was reviewed by a human_
Follow-up to #23282. The retry and terminal error callouts had a few UX
oddities:
- Auto-retrying states reused backend error text that said "Please try
again" even while the UI was already retrying on behalf of the user.
- Terminal error states also said "Please try again" with no action the
user could take.
- `startup_timeout` had no specific title or retry copy — it fell
through to the generic "Retrying request" heading.
- The kind pill showed raw enum values like `startup_timeout` and
`rate_limit`.
- Terminal error metadata showed a "Retryable" / "Not retryable" label
that does not help users.
- A separate "Provider anthropic" metadata row duplicated information
already present in the message body.
- The `usage-limit` error kind used a hyphen while every backend kind
uses underscores.
Changes:
**Backend (`chaterror/message.go`)**
- Split message generation into `terminalMessage()` and
`retryMessage()`, replacing the old `userFacingMessage()`.
- Terminal messages include HTTP status codes and actionable guidance
(e.g. "Check the API key, permissions, and billing settings.").
- Retry messages are clean factual statements without status codes or
remediation, suitable for the retry countdown UI (e.g. "Anthropic is
temporarily overloaded.").
- Removed "Please try again" / "Please try again later" from all paths.
- `StreamRetryPayload` calls `retryMessage()` instead of forwarding
`classified.Message`.
**Frontend**
- Removed the parallel frontend message-generation system:
`getRetryMessage()`, `getProviderDisplayName()`,
`getRetryProviderSubject()`, and the `PROVIDER_DISPLAY_NAMES` map are
all deleted from `chatStatusHelpers.ts`.
- `liveStatusModel.ts` passes `retryState.error` through directly — the
backend owns the copy.
- Added specific title and retry copy for `startup_timeout`, and
extended the title mapping to cover `auth` and `config`.
- Kind pills now show humanized labels ("Startup timeout", "Rate limit",
etc.) instead of raw enum strings.
- Removed the redundant "Provider anthropic" metadata row.
- Removed the terminal "Retryable" / "Not retryable" badge.
- Normalized `"usage-limit"` → `"usage_limit"` and added it to
`ChatProviderFailureKind` so all error kinds follow the same underscore
convention and live in one enum.
Refs #23282.
## Summary
This change removes the steady-state "resolve the latest workspace
agent" query from chat execution.
Instead of asking the database for the latest build's agent on every
turn, a chat now persists the workspace/build/agent binding it actually
uses and reuses that binding across subsequent turns. The common path
becomes "load the bound agent by ID and dial it", with fallback paths to
repair the binding when it is missing, stale, or intentionally changed.
## What changes
- add `workspace_id`, `build_id`, and `agent_id` binding fields to
`chats`
- expose those fields through the chat API / SDK so the execution
context is explicit
- load the persisted binding first in chatd, instead of always resolving
the latest build's agent
- persist a refreshed binding when chatd has to re-resolve the workspace
agent
- keep child / subagent chats on the same bound workspace context by
inheriting the parent binding
- leave `build_id` / `agent_id` unset for flows like `create_workspace`,
then bind them lazily on the next agent-backed turn
## Runtime behavior
The binding is treated as an optimistic cache of the agent a chat should
use:
- if the bound agent still exists and dials successfully, we use it
without a latest-build lookup
- if the bound agent is missing or no longer reachable, chatd
re-resolves against the latest build and persists the new binding
- if a workspace mutation changes the chat's target workspace, the
binding is updated as part of that mutation
To avoid reintroducing a hot-path query, dialing uses lazy validation:
- start dialing the cached agent immediately
- only validate against the latest build if the dial is still pending
after a short delay
- if validation finds a different agent, cancel the stale dial, switch
to the current agent, and persist the repaired binding
## Result
The hot path stops issuing
`GetWorkspaceAgentsInLatestBuildByWorkspaceID` for every user message,
which is the source of the DB pressure this PR is addressing. At the
same time, chats still converge to the correct workspace agent when the
binding becomes stale due to rebuilds or explicit workspace changes.
Problem: previously, the deployment-wide chat template allowlist was never actually wired in from `chatd.go`
- Extracts `parseChatTemplateAllowlist` into shared `coderd/util/xjson.ParseUUIDList`
- Adds `Server.chatTemplateAllowlist()` method that reads the allowlist from DB
- Passes `AllowedTemplateIDs` callback to `ListTemplates`, `ReadTemplate`, and `CreateWorkspace` tool constructors
> 🤖 Created by Coder Agents and reviewed by a human.
## Problem
MCP OAuth2 auto-discovery stripped the path component from the MCP
server URL
before looking up Protected Resource Metadata. Per RFC 9728 §3.1, the
well-known
URL should be path-aware:
```
{origin}/.well-known/oauth-protected-resource{path}
```
For `https://api.githubcopilot.com/mcp/`, the correct metadata URL is
`https://api.githubcopilot.com/.well-known/oauth-protected-resource/mcp/`,
not
`https://api.githubcopilot.com/.well-known/oauth-protected-resource`
(which
returns 404).
The same issue applied to RFC 8414 Authorization Server Metadata for
issuers
with path components (e.g. `https://github.com/login/oauth` →
`/.well-known/oauth-authorization-server/login/oauth`).
## Fix
Replace the `mcp-go` `OAuthHandler`-based discovery with a
self-contained
implementation that correctly follows path-aware well-known URI
construction for
both RFC 9728 and RFC 8414, falling back to root-level URLs when the
path-aware
form returns an error. Also implements RFC 7591 registration directly,
removing
the `mcp-go/client/transport` dependency from the discovery path.
Note: this fix resolves the discovery half of the problem for servers
like
GitHub Copilot. Full OAuth2 support for GitHub's MCP server also
requires
dynamic client registration (RFC 7591), which GitHub's authorization
server
does not currently support — that will be addressed separately.
- Clarify the system prompt to prefer tools before asking the user for
clarification.
- Limit clarification to cases where ambiguity or user preferences
materially affect the outcome.
- Remove the contradictory instruction to always start by asking
clarifying questions.
> 🤖 This PR has been reviewed by the author.
### Changes
**coder/coder:**
- `coderd/aibridge/aibridge.go` — Added `HeaderCoderBYOKToken` constant,
`IsBYOK()` helper, and updated `ExtractAuthToken` to check the BYOK
header first.
- `enterprise/aibridged/http.go` — BYOK-aware header stripping: in BYOK
mode only the BYOK header is stripped (user's LLM credentials
preserved); in centralized mode all auth headers are stripped.
<hr/>
**NOTE**: `X-Coder-Token` was removed! As of now `ExtractAuthToken`
retrieves token either from `X-Coder-AI-Governance-BYOK-Token` or from
`Authorization`/`X-Api-Key`.
---------
Co-authored-by: Susana Ferreira <susana@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
## Summary
Adds a general-purpose `map[string]string` label system to chats, stored
as jsonb with a GIN index for efficient containment queries.
This is a standalone foundational feature that will be used by the
upcoming Automations feature for session identity (matching webhook
events to existing chats), replacing the need for bespoke session-key
tables.
## Changes
### Database
- **Migration 000451**: Adds `labels jsonb NOT NULL DEFAULT '{}'` column
to `chats` table with a GIN index (`idx_chats_labels`)
- **`InsertChat`**: Accepts labels on creation via `COALESCE(@labels,
'{}')`
- **`UpdateChatByID`**: Supports partial update —
`COALESCE(sqlc.narg('labels'), labels)` preserves existing labels when
NULL is passed
- **`GetChats`**: New `has_labels` filter using PostgreSQL `@>`
containment operator
- **`GetAuthorizedChats`**: Synced with generated `GetChats` (new column
scan + query param)
### API
- **Create chat** (`POST /chats`): Accepts optional `labels` field,
validated before creation
- **Update chat** (`PATCH /chats/{chat}`): Supports `labels` field for
atomic label replacement
- **List chats** (`GET /chats`): Supports `?label=key:value` query
parameters (multiple are AND-ed)
### SDK
- `Chat`, `CreateChatRequest`, `UpdateChatRequest`, `ListChatsOptions`
all gain `Labels` fields
- `UpdateChatRequest.Labels` is a pointer (`*map[string]string`) so
`nil` means "don't change" vs empty map means "clear all"
### Validation (`coderd/httpapi/labels.go`)
- Max 50 labels per chat
- Key: 1–64 chars, must match `[a-zA-Z0-9][a-zA-Z0-9._/-]*` (supports
namespaced keys like `github.repo`, `automation/pr-number`)
- Value: 1–256 chars
- 13 test cases covering all edge cases
### Chat runtime
- `chatd.CreateOptions` gains `Labels` field, threaded through to
`InsertChat`
- Existing `UpdateChatByID` callers (e.g., quickgen title updates) are
unaffected — NULL labels preserve existing values via COALESCE
We had a bug where computer use base64-encoded screenshots would not be
interpreted as screenshots anymore once saved to the db, loaded back
into memory, and sent to Anthropic. Instead, they would be interpreted
as regular text. Once a computer use agent made enough screenshots and
stopped, and you tried sending it another message, you'd get an out of
context error:
<img width="808" height="367" alt="Screenshot 2026-03-23 at 12 02 54"
src="https://github.com/user-attachments/assets/f0bf6be2-4863-47ca-a7a9-9e6d9dfceeed"
/>
This PR fixes that.
## Summary
Introduces a new `context-file` ChatMessagePart type for persisting
workspace instruction files (AGENTS.md) as durable, frontend-visible
message parts. This is the foundation for showing loaded context files
in the chat input's context indicator tooltip.
### Problem
Previously, instruction files were resolved transiently on every turn
via `resolveInstructions()` → `InsertSystem()` and injected into the
in-memory prompt without persistence. The frontend had no knowledge that
instruction files were loaded into context, and there was no way to
surface this information to users.
### Solution
Instruction files are now read **once** when a workspace is first
attached to a chat (matching how [openai/codex handles
it](https://developers.openai.com/codex/guides/agents-md)) and persisted
as `user`-role, `both`-visibility message parts with a new
`context-file` type. This ensures:
- **Durability**: survives page refresh (data is in the DB, returned by
`getChatMessages`)
- **Cache-friendly**: `user`-role avoids the system-message hoisting
that providers do, keeping the instruction content in a stable position
for prompt caching
- **Frontend-visible**: the frontend receives paths and truncation
status for future context indicator rendering
- **Extensible**: the same pattern works for Skills (future)
### Key changes
| Layer | Change |
|---|---|
| **SDK** (`codersdk/chats.go`) | Add `ChatMessagePartTypeContextFile`
with `context_file_path`, `context_file_content` (internal, stripped
from API), `context_file_truncated` fields |
| **Prompt expansion** (`chatprompt`) | Expand `context-file` parts to
`<workspace-context>` text blocks in `partsToMessageParts()` |
| **Chat engine** (`chatd.go`) | Add `persistInstructionFiles()`, called
on first turn with a workspace. Remove per-turn `resolveInstructions()`
+ `InsertSystem()` from `processChat()` and `ReloadMessages` |
| **Frontend** | Ignore `context-file` parts in `messageParsing.ts` and
`streamState.ts` (no rendering yet — follow-up will add tooltip display)
|
### How it works
1. On each turn, `processChat` checks if any loaded message contains
`context-file` parts
2. If not (first turn with a workspace), reads AGENTS.md files via the
workspace agent connection and persists them
3. For this first turn, also injects the instruction text into the
prompt (since messages were loaded before persistence)
4. On all subsequent turns, `ConvertMessagesWithFiles()` encounters the
persisted `context-file` parts and expands them into text automatically
— no extra resolution needed
The `ComputerUseProviderTool` function needed a little bit of an
adjustment because I changed `NewComputerUseTool`'s signature in
upstream fantasy a little bit.
- Stores a deployment-wide agents template allowlist in `site_configs`
(`agents_template_allowlist`)
- Adds `GET/PUT /api/experimental/chats/config/template-allowlist`
endpoints
- Filters `list_templates`, `read_template`, and `create_workspace` chat
tools by allowlist, if defined (empty=all allowed)
- Add "Templates" admin settings tab in Agents UI ([what it looks
like](https://624de63c6aacee003aa84340-sitjilsyrr.chromatic.com/?path=/story/pages-agentspage-agentsettingspageview--template-allowlist))
> 🤖 This PR was created with the help of Coder Agents, and has been
reviewed by my human. 🧑💻
## Problem
When chatd pushes a branch and then creates a PR (e.g. `git push`
followed by `gh pr create`), the gitsync background worker often picks
up the stale `chat_diff_statuses` row between the two operations. At
that point no PR exists yet, so the worker skips the row. However, the
acquisition SQL locks the row for **5 minutes** (crash-recovery
interval), creating a dead zone where the PR diff is invisible in the UI
until the user manually navigates to the chat.
### Root cause
1. `git push` triggers `GIT_ASKPASS` → coderd external-auth handler →
`MarkStale()` sets `stale_at = now - 1s`
2. Background worker acquires the row within ~10s, atomically bumps
`stale_at = NOW() + 5 min` (crash-recovery lock)
3. Worker calls `ResolveBranchPullRequest` → no PR exists yet → returns
`nil` → worker skips with `continue`
4. `gh pr create` completes moments later, but uses its own auth (not
`GIT_ASKPASS`), so no second `MarkStale` fires
5. Row is locked for 5 minutes before the worker can retry
Loading the chat works immediately because `GET /chats/{chat}` calls
`resolveChatDiffStatus` synchronously, which discovers the PR inline.
## Fix
When `ResolveBranchPullRequest` returns nil (no PR yet) **and** the row
was recently marked stale (within 2 minutes), apply a short 15-second
backoff via `BackoffChatDiffStatus` instead of letting the 5-minute
acquisition lock stand. Outside the retry window, the worker skips the
row as before — no indefinite fast-polling for branches that never
receive a PR.
To make the "recently marked stale" check work, `updated_at` is no
longer overwritten by the acquisition and backoff SQL queries. This
preserves it as a reliable "last externally changed" timestamp (set by
`MarkStale` or a successful refresh).
### Behavior summary
| Scenario | `updated_at` age | Backoff | Effective retry |
|---|---|---|---|
| Fresh push, no PR yet | < 2 min | 15s (`NoPRBackoff`) | ~15s |
| Old row, no PR | ≥ 2 min | None (skip) | ~5 min (acquisition lock) |
| Error (any age) | Any | 120s (`DiffStatusTTL`) | ~120s |
| Success (any age) | Any | 120s (`DiffStatusTTL`) | ~120s |
## Changes
- **`coderd/database/queries/chats.sql`** — Remove `updated_at = NOW()`
from `AcquireStaleChatDiffStatuses` and `BackoffChatDiffStatus`
- **`coderd/database/queries.sql.go`** — Regenerated
- **`coderd/x/gitsync/worker.go`** — Add `NoPRBackoff` (15s) and
`NoPRRetryWindow` (2 min) constants; apply short backoff only within the
retry window
- **`coderd/x/gitsync/worker_test.go`** — Add
`TestWorker_NoPR_RecentMarkStale_BacksOffShort` and
`TestWorker_NoPR_OldRow_Skips`
- Add `SanitizePromptText` stripping ~24 invisible Unicode codepoints
and collapsing excessive newlines
- Apply at write and read paths for defense-in-depth
- Frontend: warn in both prompt textareas when invisible characters
detected
- Explicit codepoint list (not blanket `unicode.Cf`) to avoid breaking
flag emoji
- 34 Go tests + idempotency meta-test, 11 TS unit tests, 4 Storybook
stories
> This PR was created with the help of Coder Agents, and was reviewed by my human.
- Add `X-Coder-Owner-Id`, `X-Coder-Chat-Id`, `X-Coder-Subchat-Id`,
`X-Coder-Workspace-Id` headers to all outgoing LLM API requests from
chatd
- Extend `ModelFromConfig` with `extraHeaders` param, forwarded via
Fantasy `WithHeaders` on all 8 providers
- Add `CoderHeaders(database.Chat)` helper to build the header map from
chat state
- Update all 4 `ModelFromConfig` call sites (resolveChatModel,
computer-use override, title gen, push summary)
- Thread `database.Chat` into `generatePushSummary` (was `chatTitle
string`)
- Tests: `TestCoderHeaders` (4 subtests),
`TestModelFromConfig_ExtraHeaders` (OpenAI + Anthropic),
`TestModelFromConfig_NilExtraHeaders`
- Refactor existing `TestModelFromConfig_UserAgent` to use channel-based
signaling
> 🤖 This PR was generated by Coder Agents and self-reviewed by a human.
create_workspace could create a replacement workspace after a single 5s
agent dial failed, even when the existing workspace agent had recently
checked in. That made temporary reachability blips look like dead
workspaces and let chatd replace a running workspace too aggressively.
Use the workspace agent's DB-backed status with the deployment's
AgentInactiveDisconnectTimeout before allowing replacement. Recently
connected and still-connecting agents now reuse the existing workspace,
while disconnected or timed-out agents still allow a new workspace. This
also threads the inactivity timeout through chatd and adds focused
coverage for the reuse and replacement branches.
## Problem
When an MCP tool returns an `EmbeddedResource` content item (e.g. GitHub
MCP server returning file contents via `get_file_contents`), the
`convertCallResult` function falls through to the `default` case,
producing:
```
[unsupported content type: mcp.EmbeddedResource]
```
This loses the actual resource content and shows an unhelpful message in
the chat UI.
## Root Cause
The type switch in `convertCallResult` handles `TextContent`,
`ImageContent`, and `AudioContent`, but not the other two `mcp.Content`
implementations from `mcp-go`:
- `mcp.EmbeddedResource` — wraps a `ResourceContents` (either
`TextResourceContents` or `BlobResourceContents`)
- `mcp.ResourceLink` — contains a URI, name, and description
## Fix
Add two new cases to the type switch:
1. **`mcp.EmbeddedResource`**: nested type switch on `.Resource`:
- `TextResourceContents` → append `.Text` to `textParts`
- `BlobResourceContents` → base64-decode `.Blob` as binary (type
`"image"` or `"media"` based on MIME)
- Unknown → fallback `[unsupported embedded resource type: ...]`
2. **`mcp.ResourceLink`**: render as `[resource: Name (URI)]` text
## Testing
Added 3 new test cases (all passing, full suite 23/23 PASS):
- `TestConnectAll_EmbeddedResourceText` — text resource extraction
- `TestConnectAll_EmbeddedResourceBlob` — binary blob decoding
- `TestConnectAll_ResourceLink` — resource link rendering
Child chats created via `spawn_agent` and `spawn_computer_use_agent`
were not inheriting the parent's `MCPServerIDs`, meaning subagents lost
access to the parent's MCP server tools.
## Changes
- Pass `parent.MCPServerIDs` in the `CreateOptions` for both
`createChildSubagentChat()` and the `spawn_computer_use_agent` tool
handler in `coderd/x/chatd/subagent.go`.
## Tests
Added 3 tests in `subagent_internal_test.go`:
- `TestCreateChildSubagentChat_InheritsMCPServerIDs` — verifies child
chat gets parent's MCP server IDs (multiple servers)
- `TestSpawnComputerUseAgent_InheritsMCPServerIDs` — verifies computer
use subagent gets parent's MCP server IDs via the tool
- `TestCreateChildSubagentChat_NoMCPServersStaysEmpty` — verifies no
regression when parent has no MCP servers
*Disclaimer: implemented by a Coder Agent and reviewed by me.*
Renames the env vars used by chatd integration tests from the canonical
`SOMEPROVIDER_API_KEY` (e.g. `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) to
`SOMEPROVIDER_TEST_API_KEY` (e.g. `ANTHROPIC_TEST_API_KEY`,
`OPENAI_TEST_API_KEY`) so that test-specific keys don't collide with
production/canonical provider credentials.
Relates to https://github.com/coder/internal/issues/1425
See also:
https://codercom.slack.com/archives/C0AGTPWLA3U/p1774433646799499
## Summary
Adds `xhigh` to the OpenAI reasoning effort normalizer so GPT-5.4 class
models can use `reasoning_effort: xhigh` without it being silently
dropped.
## Problem
The SDK schema (`codersdk/chats.go`) already advertises `xhigh` as a
valid `reasoning_effort` value, but the runtime normalizer in
`chatprovider.go` only accepts `minimal|low|medium|high` for the OpenAI
provider. When a user sets `xhigh`, `ReasoningEffortFromChat()` returns
`nil` and the value never reaches the OpenAI API.
## Changes
- **Fantasy dependency**: Updated `kylecarbs/fantasy` (cj/go1.25) which
now includes the `ReasoningEffortXHigh` constant
([kylecarbs/fantasy#9](https://github.com/kylecarbs/fantasy/pull/9)).
- **`chatprovider.go`**: Adds `fantasyopenai.ReasoningEffortXHigh` to
the OpenAI case in `ReasoningEffortFromChat()`.
- **`chatprovider_test.go`**: Adds `OpenAIXHighEffort` test case.
## Upstream
-
[charmbracelet/fantasy#186](https://github.com/charmbracelet/fantasy/pull/186)
Consolidates coderdtest invocations in 7 tests to reduce 23 instances to 7 across:
- `TestGetUser` (3 → 1) — read-only user lookups
- `TestUserTerminalFont` (3 → 1) — each creates own user via
CreateAnotherUser
- `TestUserTaskNotificationAlertDismissed` (3 → 1) — each creates own
user
- `TestUserLogin` (3 → 1) — each creates/deletes own user
- `TestExpMcpConfigureClaudeCode` (5 → 1) — writes to isolated temp dirs
- `TestOAuth2RegistrationTokenSecurity` (3 → 1) — independent
registrations
- `TestOAuth2SpecificErrorScenarios` (3 → 1) — independent error
scenarios
> 🤖 This PR was created with the help of Coder Agents, and has been
reviewed by my human. 🧑💻
## Problem
Template administrators cannot delete templates that have running
prebuilds.
The `deleteTemplate` handler fetches all non-deleted workspaces and
blocks
deletion if any exist, making no distinction between human-owned
workspaces
and prebuild workspaces (owned by the system `PrebuildsSystemUserID`).
This forces admins into a manual multi-step workflow: set
`desired_instances`
to 0 on every preset, wait for the reconciler to drain prebuilds, then
retry
deletion. Prebuilds are an internal system concern that admins should
not need
to manage manually.
## Fix
Replace the blanket `len(workspaces) > 0` guard in `deleteTemplate` with
a
loop that only blocks deletion when a non-prebuild (human-owned)
workspace
exists. Prebuild workspaces — owned by `database.PrebuildsSystemUserID`
— are
now ignored during the check.
Once the template is soft-deleted (`deleted=true`), the existing
prebuilds
reconciler detects `isActive()=false` and cleans up remaining prebuilds
asynchronously. No changes to the reconciler are needed.
The error message and HTTP status for human workspaces remain unchanged.
## Testing
Added two new subtests to `TestDeleteTemplate`:
- **`OnlyPrebuilds`**: deletion succeeds when only prebuild workspaces
exist.
- **`PrebuildsAndHumanWorkspaces`**: deletion is blocked when both
prebuild
and human workspaces exist.
Existing reconciler test ("soft-deleted templates MAY have prebuilds")
already
covers post-deletion prebuild cleanup.
> **PR Stack**
> 1. #23351 ← `#23282`
> 2. #23282 ← `#23275`
> 3. **#23275** ← `#23349` *(you are here)*
> 4. #23349 ← `main`
---
## Summary
Extracts a structured error classification subsystem for agent chat
(`chatd`) so that retry and error payloads carry machine-readable
metadata — error kind, provider name, HTTP status code, and retryability
— instead of raw error strings.
This is the **backend half** of the error-handling work. The frontend
counterpart is in #23282.
## Changes
### New package: `coderd/chatd/chaterror/`
Canonical error classification — extracts error kind, provider, status
code, and user-facing message from raw provider errors. One source of
truth that drives both retry policy and stream payloads.
- **`kind.go`**: Error kind enum (`rate_limit`, `timeout`, `auth`,
`config`, `overloaded`, `unknown`).
- **`signals.go`**: Signal extraction — parses provider name, HTTP
status code, and retryability from error strings and wrapped types.
- **`classify.go`**: Classification logic — maps extracted signals to an
error kind.
- **`message.go`**: User-facing message templates keyed by kind +
signals.
- **`payload.go`**: Projectors that build `ChatStreamError` and
`ChatStreamRetry` payloads from a classified error.
### Modified
- **`codersdk/chats.go`**: Added `Kind`, `Provider`, `Retryable`,
`StatusCode` fields to `ChatStreamError` and `ChatStreamRetry`.
- **`coderd/chatd/chatretry/`**: Thinned to retry-policy only;
classification logic moved to `chaterror`.
- **`coderd/chatd/chatloop/`**: Added per-attempt first-chunk timeout
(60 s) via `guardedStream` wrapper — produces retryable
`startup_timeout` errors instead of hanging forever.
- **`coderd/chatd/chatd.go`**: Publishes normalized retry/error payloads
via `chaterror` projectors.
Nine subtests covering the poll loop, pubsub notification path,
timeout, context cancellation, descendant auth check, and both
error-status branches in handleSubagentDone.
Wire p.clock through awaitSubagentCompletion's timer and ticker
so future tests can use quartz mock clock. Tests use channel-based
coordination and context.WithTimeout instead of time.Sleep.
Coverage: awaitSubagentCompletion 0%->70.3%, handleSubagentDone
0%->100%, checkSubagentCompletion 0%->77.8%,
latestSubagentAssistantMessage 0%->78.9%.
## Summary
Large pasted text that the UI collapses into an attachment chip was
completely invisible to the LLM. Providers only accept specific MIME
types (images, PDFs) in file content blocks — a `text/plain` `FilePart`
is silently dropped, so the model received nothing for pasted content.
## Fix
Detect paste-originated text files by their
`pasted-text-{timestamp}.txt` filename pattern and convert them to
`fantasy.TextPart` with a bounded 128 KiB inline body and truncation
notice. Binary uploads and real uploaded text files keep their existing
`FilePart` semantics.
The detection uses the existing frontend naming convention
(`pasted-text-YYYY-MM-DD-HH-MM-SS.txt`) combined with a text-like MIME
check for defense-in-depth. A TODO marks this for migration to explicit
origin metadata.
<details>
<summary>Review notes: intentionally skipped findings</summary>
A 10-reviewer deep review was run on this change. The following findings
were raised and intentionally dropped after cross-check. Documenting
them here so future reviewers do not re-flag the same concerns:
**"Unresolved file IDs cause silent data loss" (Edge Case Analyst P1)**
— When a file ID is not in the resolver map, `name` stays empty and
paste detection fails. This is pre-existing behavior for ALL file types
(not introduced by this change). The resolver calls `GetChatFilesByIDs`
which returns whatever rows exist; missing IDs simply fall through to an
empty `FilePart`. The Contract Auditor independently traced this path
and confirmed the fallback is safe. If the file was deleted between
message construction and conversion, the model already saw nothing
before this patch — this change does not make it worse.
**"String builder pre-allocation overhead" (Performance Analyst P1)** —
Misidentified scope. `formatSyntheticPasteText` is only called when
`isSyntheticPaste` returns true (actual synthetic pastes), not for every
file part. The `Grow()` call is correct and efficient.
**"Constant naming violates Uber style" (Style Reviewer P1)** —
Over-severity. `syntheticPasteInlineBudget` is standard Go camelCase for
unexported constants, consistent with the Uber guide and surrounding
code.
**"`IsSyntheticPasteForTest` naming is misleading" (Style Reviewer P2)**
— This is the standard Go `export_test.go` pattern. The `ForTest` suffix
is conventional.
</details>
Linear's MCP server (`mcp.linear.app`) returns `token_type="bearer"`
(lowercase) in its OAuth2 token response but rejects requests that use
the lowercase form in the `Authorization` header. RFC 6750 says the
scheme is case-insensitive, but Linear enforces capital-B `Bearer`.
Confirmed by running the actual Linear MCP OAuth flow end-to-end:
- `Authorization: Bearer <token>` → **42 tools, works**
- `Authorization: bearer <token>` → **401 invalid_token**
This is a one-line fix: normalize any case variant of `bearer` to
`Bearer` before building the `Authorization` header, matching the
behavior of the mcp-go library's own OAuth handler.
## Problem
MCP servers like Linear (`mcp.linear.app`) require PKCE (RFC 7636) for
their OAuth2 flow. Without it, the token exchange may succeed but the
resulting access token is immediately rejected with a 401
`invalid_token` error when the chat daemon tries to connect to the MCP
server.
This means users can authenticate successfully in the UI (the OAuth
popup completes, `auth_connected` shows `true`), but the model never
receives the MCP tools — they silently fail to load.
### Root cause
The `mcpServerOAuth2Connect` handler was calling
`oauth2Config.AuthCodeURL(state)` without any PKCE parameters
(`code_challenge`, `code_challenge_method`). The callback was calling
`oauth2Config.Exchange(ctx, code)` without a `code_verifier`. Linear's
MCP OAuth endpoint decoded state confirms it expected PKCE with
`codeChallengeMethod: "plain"`.
### Investigation
- The chat (`c2c04fc5-5622-4b71-a5a9-80508e86f78e`) had the Linear MCP
server ID in `mcp_server_ids`
- `auth_connected: true` (token row exists in DB)
- No "expired" or "empty token" warnings in logs
- Server log showed: `skipping MCP server due to connection failure ...
error="initialize: transport error: request failed with status 401:
{"error":"invalid_token","error_description":"Missing or invalid access
token"}"`
- Decoding Linear's OAuth state revealed PKCE was expected
## Changes
- Generate a PKCE `code_verifier` during the OAuth2 connect step using
`oauth2.GenerateVerifier()` and store it in a cookie scoped to the
callback path
- Include `code_challenge` (S256) in the authorization redirect URL via
`oauth2.S256ChallengeOption()`
- Pass the `code_verifier` during the token exchange in the callback via
`oauth2.VerifierOption()`
- Fix a nil-pointer guard on `api.HTTPClient` in the callback
- Add tests verifying PKCE parameters are sent correctly and backwards
compatibility when no verifier cookie is present
- Detect `-testify.m` sub-test filtering in `SetupSuite` and skip the `Accounting` check.
> 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑💻
Adds a `propose_plan` tool that presents a workspace markdown file as a
dedicated plan card in the agent UI.
The workflow is: the agent uses `write_file`/`edit_files` to build a
plan file (e.g. `/home/coder/PLAN.md`), then calls `propose_plan(path)`
to present it. The backend reads the file via `ReadFile` and the
frontend renders it as an expanded markdown preview card.
**Backend** (`coderd/x/chatd/chattool/proposeplan.go`): new tool
registered as root-chat-only. Validates `.md` suffix, requires an
absolute path, reads raw file content from the workspace agent. Includes
1 MiB size cap.
**Frontend** (`site/src/components/ai-elements/tool/`): dedicated
`ProposePlanTool` component with `ToolCollapsible` + `ScrollArea` +
`Response` markdown renderer, expanded by default. Custom icon
(`ClipboardListIcon`) and filename-based label.
**System prompt** (`coderd/x/chatd/prompt.go`): added `<planning>`
section guiding the agent to research → write plan file → iterate → call
`propose_plan`.
OpenAI Responses follow-up turns were replaying full assistant/tool
history even when `store=true`, which breaks after reasoning +
provider-executed `web_search` output.
This change persists the OpenAI response ID on assistant messages, then
in `coderd/x/chatd` switches `store=true` follow-ups to
`previous_response_id` chaining with a system + new-user-only prompt.
`store=false` and missing-ID cases still fall back to manual replay.
It also updates the fake OpenAI server and integration coverage for the
chaining contract, and carries the rebased path move to `coderd/x/chatd`
plus the migration renumber needed after rebasing onto `main`.
## Problem
`TestConnectAll_MultipleServers` flakes with:
```
net/http: HTTP/1.x transport connection broken: http: CloseIdleConnections called
```
Each MCP client connection implicitly uses `http.DefaultTransport`. When
`httptest.Server.Close()` runs during parallel test cleanup, it calls
`CloseIdleConnections` on `http.DefaultTransport`, breaking in-flight
connections from other goroutines or parallel tests sharing that
transport.
## Fix
Clone the default transport for each MCP connection via
`http.DefaultTransport.(*http.Transport).Clone()`, passed through
`WithHTTPBasicClient` (StreamableHTTP) and `WithHTTPClient` (SSE). This
scopes idle connection cleanup to a single MCP server so it cannot
disrupt unrelated connections.
Fixescoder/internal#1420