Commit Graph

206 Commits

Author SHA1 Message Date
Ethan 9444eddf4e feat(coderd/x/chatd): allow attach_file in root plan-mode chats (#25388)
`attach_file` was registered for plan-mode turns but never added to
`builtinPlanToolAllowed`, so the per-turn `ActiveTools` allowlist
filtered it out and calls failed with `Tool not active in this turn:
attach_file`. This was an omission rather than a deliberate block — the
tool (#24280) landed shortly after plan mode (#24236) and no subsequent
edit to the allowlist picked it up.

Add `attach_file` under the `isRootChat` case, matching how other
artifact-producing tools (`propose_plan`, `write_file`, `edit_files`)
are gated. The tool only reads from the workspace and writes to
chat-attachment storage, so it preserves plan mode's invariant of not
making implementation changes to the workspace. Subagents in plan mode
remain restricted to the minimal read-only surface.
2026-05-19 17:01:23 +10:00
Danielle Maywood 170a6e1fe9 feat: add chat sharing foundation (#25041) 2026-05-18 22:32:05 +01:00
Kyle Carberry 385146000b feat: record created_at/completed_at on reasoning ChatMessageParts (#24789)
Records reasoning start and end times on persisted reasoning
`ChatMessagePart`s so reasoning duration can be computed for stored
chats. Backend-only: no SSE changes and no frontend rendering ship in
this PR.

The `created_at` field on `ChatMessagePart` is extended to also be
present on `reasoning` parts (it previously appeared only on `tool-call`
and `tool-result`), and a new `completed_at` field is added for
`reasoning` parts.

### How timestamps are recorded

- `StreamPartTypeReasoningStart`: stamp `startedAt = dbtime.Now()` on
the active reasoning state.
- `StreamPartTypeReasoningEnd`: stamp `completedAt = dbtime.Now()` and
append both into parallel `[]time.Time` slices on `stepResult`.
- Persistence reads the slices in occurrence order (reasoning has no
provider-side ID) and applies them to the matching `ChatMessagePart` via
`buildAssistantPartsForPersist`. The first reasoning block's stamps go
onto the first reasoning part, and so on.
- `flushActiveState` flushes partial reasoning interrupted before
`StreamPartTypeReasoningEnd` with `startedAt` from the active state and
`completedAt = dbtime.Now()` at the interruption.

### Why two fields, not one?

Tool calls and results are point events. The frontend computes their
duration by subtracting the call's `created_at` from the result's
`created_at`. Reasoning is one assistant part that brackets a span, so
we record both endpoints on the part itself.

### Why not stamp in `PartFromContent`?

Same rationale as #24101: `PartFromContent` is called during both SSE
publishing and persistence. Stamping there would yield incorrect
persistence-time timestamps for reasoning blocks that finished much
earlier in the step. Instead we capture in the chatloop and apply during
persistence.

<details><summary>Implementation plan</summary>

- `codersdk/chats.go`: extend `CreatedAt`'s `variants` to include
`reasoning?`; add `CompletedAt *time.Time` with `variants:"reasoning?"`.
- `coderd/x/chatd/chatloop/chatloop.go`: extend `reasoningState` with
`startedAt`; extend `stepResult` and `PersistedStep` with parallel
`[]time.Time` reasoning slices; stamp on
`ReasoningStart`/`ReasoningEnd`; thread the slices through all
`PersistStep` call sites including the interrupt-safe path; record
partial reasoning in `flushActiveState`.
- `coderd/x/chatd/attachments.go`: walk reasoning parts in occurrence
order and apply `step.ReasoningStartedAt[i]` to `part.CreatedAt` and
`step.ReasoningCompletedAt[i]` to `part.CompletedAt`.

### Tests

- `codersdk/chats_test.go` round-trips `created_at` + `completed_at` on
reasoning parts and verifies omission when absent and partial
interrupted parts.
- `coderd/x/chatd/chatprompt/chatprompt_test.go` asserts
`PartFromContent(ReasoningContent{})` does NOT stamp timestamps.
- `coderd/x/chatd/chatloop/chatloop_test.go`
`TestRun_ReasoningTimestamps` drives a stream with two reasoning blocks
and verifies parallel slices, monotonicity, ordering, non-zero values,
and content-block ordering.
`TestRun_InterruptedReasoningFlushesTimestamps` cancels mid-reasoning
and verifies `flushActiveState` records a non-zero pair.
- `coderd/x/chatd/attachments_test.go` covers
`buildAssistantPartsForPersist` for normal interleaved reasoning,
partial (zero `completed_at`), and missing slices.

</details>

> Generated by Coder Agents.

Co-authored-by: Coder Agent <agent@coder.com>
2026-05-18 12:30:30 -04:00
Kyle Carberry 159089686a fix(coderd/x/chatd): prime workspace MCP cache after create/start (#25298)
## Problem

Mid-turn workspace MCP discovery was broken when an agent was still
cold-starting. `PrepareTools` in `chatd.go` flipped
`workspaceMCPDiscovered = true` *before* calling
`discoverWorkspaceMCPTools`, so a failed discovery attempt permanently
blocked retries within the turn.

Customer-reported repro:

- New chat with no pre-selected workspace.
- LLM calls `create_workspace` mid-turn at `23:35:05`.
- `PrepareTools` fires, dials the agent with a 30s timeout, dial times
out at `23:38:15`, `discoverWorkspaceMCPTools` returns empty.
- Agent connects at `23:38:29`, 14 seconds later.
- `workspaceMCPDiscovered` was already true, so `PrepareTools` never
retried for the rest of the turn. MCP tools only appeared on the next
user message.

A naive retry loop in `PrepareTools` would also miss the bigger picture:
a workspace boot can take several minutes (EC2 cold start, 10 min
startup scripts), and the chatloop only gets a chance to call
`PrepareTools` between LLM steps.

## Fix

Do the workspace MCP discovery from inside the tool that already waits
for the agent. `chattool.CreateWorkspace` and `chattool.StartWorkspace`
call `waitForAgentReady`, which has a 2 min agent-online budget plus a
10 min startup-script budget. By the time they fire `OnChatUpdated`, the
agent is `Ready`. The chatd `onChatUpdated` callback now launches an
async `primeWorkspaceMCPCache` goroutine on every bind that has a valid
workspace ID:

- The primer calls `discoverWorkspaceMCPTools` until it returns a
non-empty list or `workspaceMCPPrimeMaxWait` (30s) elapses, with a 2s
backoff between attempts. The bounded wait handles the short race
between agent-online and the agent's MCP `Connect` settling.
- The primer runs asynchronously so the tool itself never blocks. Some
templates simply do not advertise MCP tools, in which case the primer
would otherwise spend its full budget for nothing.
- The primer shares the chat `ctx` (not a detached one) so it is
canceled together with the chat. A dangling primer would re-dial the
workspace conn after `runChat`'s deferred `workspaceCtx.close()` and
leak that conn.
- `inflight.Add(1)` ensures server shutdown still waits for any
in-progress primer.
- `PrepareTools` is simplified back to a single discovery call. It now
only sets `workspaceMCPDiscovered = true` on success, so an empty result
no longer permanently blocks discovery within the turn. The cache hit
warmed by the primer makes that call cheap in the common case; the dial
fallback handles the rare cache miss.

## Tests

All in `coderd/x/chatd/chatd_internal_test.go`:

- `TestPrimeWorkspaceMCPCache_SuccessOnFirstAttempt` — single
`ListMCPTools` call returning tools populates the cache.
- `TestPrimeWorkspaceMCPCache_RetriesUntilToolsAppear` — first call
empty, second returns tools; primer retries past the backoff and writes
the cache. Uses `quartz.Mock.Trap` on `NewTimer`.
- `TestPrimeWorkspaceMCPCache_GivesUpAfterDeadline` — `ListMCPTools`
always empty; primer stops at `workspaceMCPPrimeMaxWait` and refuses to
cache the empty result so PrepareTools can retry on the next step.

The existing integration test
`TestRunChat_WorkspaceMCPDiscoveryAfterMidTurnCreateWorkspace` continues
to pass and now also exercises the async-primer path end-to-end via the
create_workspace tool.

```
go test ./coderd/x/chatd/... -count=1
go test ./coderd/x/chatd/ -race -count=1
make pre-commit
```

<details>
<summary>Design notes</summary>

- The first iteration of this PR added retry+cooldown+failure-cap logic
inside `PrepareTools`. It worked for the customer's ~30s race window but
did not help workspaces that take several minutes to boot, because
`PrepareTools` only fires between LLM steps. Reviewer pointed out the
right place to handle this is the tool itself; the current
implementation does that.
- Why async: a primer that ran synchronously inside the `OnChatUpdated`
callback blocked the create_workspace tool from returning for up to
`workspaceMCPPrimeMaxWait`, which broke
`TestCreateWorkspaceTool_EndToEnd` and would hurt any template that does
not expose MCP tools. Decoupling lets the tool return immediately and
lets the primer warm the cache concurrently with the next LLM step.
- Why share the chat `ctx` rather than `context.WithoutCancel(ctx)` (the
title-generation pattern): the primer touches
`workspaceCtx.getWorkspaceConn`, which `runChat`'s deferred
`workspaceCtx.close()` invalidates. A detached primer outliving the chat
would dial a fresh conn and leak it.
- The constant naming distinguishes `workspaceMCPDiscoveryTimeout` (35s
per-call dial budget, unchanged from #25169) from
`workspaceMCPPrimeMaxWait` (30s total budget for the post-ready primer
loop) and `workspaceMCPPrimeRetryInterval` (2s between empty-result
retries).

</details>

Follow-up to #25169.

---

_This pull request was generated by Coder Agents._
2026-05-18 07:55:56 -04:00
Ethan e75bd3aca4 fix: preserve Anthropic replay fidelity (#25377)
Anthropic is strict about replaying the latest assistant turn once it
contains signed or redacted reasoning. We were still mutating that turn
in a few Coder-owned places: dropping empty reasoning blocks on replay,
rewriting provider-tool history during sanitization, and in the worst
case sending a prompt we already knew Anthropic would reject.

This patch keeps the latest signed assistant immutable through Coder's
replay and sanitization paths, preserves empty signed or redacted
reasoning anywhere Coder owns the ledger, and fails before the provider
call if the prompt is still unsafe.

It also bumps the existing `coder/fantasy` `coder_2_33` fork that `main`
already uses to the commit containing coder/fantasy#35. These fixes have
also been upstreamed to charmbracelet/fantasy.

Closes CODAGT-409.
2026-05-18 15:20:33 +10:00
Michael Suchacz 792f0b4902 feat: add personal skill resolver (#25362)
> Mux updated this PR on behalf of Mike.

## Stack Context

This stack splits experimental personal skills into smaller reviewable
PRs. Personal skills are user-owned `SKILL.md` files stored by Coder and
injected into chatd alongside workspace skills.

Stack order:
1. #25362 personal skill resolver
2. #25363 storage, permissions, API, and SDK
3. #25365 API test coverage
4. #25366 chattool and chatd integration
5. #25066 settings UI and docs
6. #25386 personal skills slash menu

## What?

Adds the shared personal skill parser and resolver package, plus
reusable skill-name validation exported from `workspacesdk`.

The parser enforces the full personal skill contract: max raw size,
kebab-case name, max name length, and non-empty body.

## Why?

The rest of the stack needs one source-aware resolver for personal and
workspace skills, including collision handling and qualified aliases.
Keeping personal skill constraints in the parser prevents callers from
accidentally parsing invalid personal skills.

## Validation

- `go test ./coderd/x/skills ./codersdk/workspacesdk`
- pre-commit hooks on this branch
2026-05-16 15:33:43 +00:00
Ethan a59b951565 test: skip stale notification chatd flakes (#25376)
These chatd tests are flaking for the same stale control-notification
race tracked by CODAGT-353, so this change skips the newly reflaking
advisor-chain and `TestPatchChatMessage/ChangesModel` tests and rewrites
the older `TODO(hugodutka)` skips to point at the same root cause. This
keeps the known flakes documented consistently until the chatd
notification-flow refactor lands.

Closes CODAGT-427
Closes https://github.com/coder/internal/issues/1510
2026-05-15 17:36:48 +10:00
Ethan a35f71cd8a fix(coderd/x/chatd): retry HTTP/2 stream resets (#25170)
Mid-stream HTTP/2 peer resets from LLM providers can arrive after a 200
streaming response has already emitted provisional parts. Previously
those resets fell through as generic non-retryable errors because
`stream ID` messages did not match retryable transport signals, and
stream IDs could be misread as HTTP statuses.

Classify retryable HTTP/2 RST_STREAM codes as transient timeout
failures, ignore stream IDs during status extraction, and keep the
existing `retry` event as the rollback boundary for provisional message
parts so replacement attempts do not replay failed-attempt output.

Closes CODAGT-382
2026-05-14 11:40:43 +10:00
Michael Suchacz d1a471e29e fix(coderd/x/chatd): retune subagent selection guidance (#25311)
> Mux working on behalf of Mike.

## Summary

- retune chatd subagent guidance to prefer `general` for substantial
delegated work, including read-only synthesis and planning support
- narrow `explore` guidance to repository-local code lookup and bounded
tracing
- add regression tests for planning, spawn tool, and Plan Mode guidance
text

## Tests

- `go test ./coderd/x/chatd -run
'Test(DefaultSystemPromptPlanningGuidance_SteersSubagentSelection|SpawnAgent_DescriptionSteersGeneralForSubstantialResearch|SpawnAgent_PlanModeDescriptionOmitsComputerUse|PlanningOverlaySubagentGuidance_UsesPlanModeSafeDescriptions|ExploreSubagentIsReadOnly)$'`
- `make lint`
- `make test TEST_PACKAGES=./coderd/x/chatd RUN=Guidance && make test
TEST_PACKAGES=./coderd/x/chatd RUN=Description`
- pre-commit hook during `git commit`
2026-05-13 23:10:21 +02:00
Kyle Carberry b0b07536fc feat: add opt-in Coder identity headers for MCP servers (#25153) 2026-05-12 08:54:53 -04:00
Michael Suchacz f1d160c7f4 fix: allow changing model when editing earlier chat message (#25084)
Editing a previous user message and selecting a different model in the
picker silently kept using the original model: the selection was dropped
on the frontend, in the SDK, and in the backend, so both the replacement
user message and the assistant turn that followed ran against the old
model.

Plumb the selected model through all three layers (`AgentChatPage`,
`codersdk.EditChatMessageRequest`, `chatd.EditMessageOptions` /
`Server.EditMessage`), defaulting to the original message's model when
the client does not specify one. The existing `InsertChatMessages` CTE
already advances `chats.last_model_config_id` when the inserted
message's model differs, so the assistant turn picks up the new
selection without further changes. The new model is validated inside the
transaction, so an unknown ID rolls the edit back and returns a 400
`Invalid model config ID.`, mirroring the `SendMessage` path.

Refs: CODAGT-345

This change was generated by a Coder agent.

<details>
<summary>Implementation plan</summary>

# CODAGT-345: Editing an earlier message cannot change model

## Problem

When editing a previous user message in a chat, the user can change the
model in the model picker, but the backend keeps using the original
message's model. The model selection is dropped at three layers:

1. **Frontend:** `AgentChatPage.tsx`'s edit branch builds an
`EditChatMessageRequest` that omits `model_config_id`. The new-message
branch (a few lines below) does include it.
2. **SDK:** `codersdk.EditChatMessageRequest` has no `ModelConfigID`
field at all.
3. **Backend:** `chatd.EditMessageOptions` has no model field, and
`Server.EditMessage` always copies the original message's
`ModelConfigID` into the replacement message.

Once the replacement user message is inserted with the original model,
the `InsertChatMessages` CTE leaves `chats.last_model_config_id`
unchanged, so the assistant turn that follows runs against the old
model.

## Fix

Plumb the selected model through all three layers, defaulting to the
original message's model when the client doesn't override it. This
mirrors the `SendMessage` path, which already accepts a
`model_config_id` and validates it via
`resolveSendMessageModelConfigID`.

### Backend

- `codersdk/chats.go`: add `ModelConfigID *uuid.UUID` to
`EditChatMessageRequest`.
- `coderd/x/chatd/chatd.go`:
  - Add `ModelConfigID uuid.UUID` to `EditMessageOptions`.
- In `EditMessage`, after fetching the edited message, resolve the
model: if `opts.ModelConfigID != uuid.Nil`, validate it exists with
`tx.GetChatModelConfigByID` (using `chatdModelConfigLookupContext`),
otherwise keep `editedMsg.ModelConfigID.UUID`. Pass the resolved ID into
`newChatMessage(...)`.
  - Reuse the existing `ErrInvalidModelConfigID` sentinel.
- `coderd/exp_chats.go` (`patchChatMessage`):
- Read `req.ModelConfigID` (nil-safe), pass into
`chatd.EditMessageOptions`.
- Add a `case xerrors.Is(editErr, chatd.ErrInvalidModelConfigID)` arm
returning 400 `Invalid model config ID.`, matching the
`postChatMessages` handler.

### Frontend

- `site/src/pages/AgentsPage/AgentChatPage.tsx`:
- In the edit branch, set `model_config_id: effectiveSelectedModel ||
undefined` on the `EditChatMessageRequest`.
- On success, persist the chosen model to `lastModelConfigIDStorageKey`
so the next chat from this browser keeps the same default. Mirrors the
new-message branch.

### Generated

- `make site/src/api/typesGenerated.ts` and `make
coderd/apidoc/swagger.json` produce the updated `EditChatMessageRequest`
schema in `typesGenerated.ts`, `coderd/apidoc/{docs.go,swagger.json}`,
and `docs/reference/api/{chats.md,schemas.md}`.

## Tests

- `coderd/x/chatd/chatd_test.go`:
- `TestEditMessageWithModelConfigOverride`: edit with a different model
-> replacement message and `chats.LastModelConfigID` use the new model.
- `TestEditMessagePreservesModelConfigByDefault`: edit without
`ModelConfigID` -> original model preserved.
- `TestEditMessageRejectsUnknownModelConfig`: passes a random UUID ->
`ErrInvalidModelConfigID`, original message still present,
`LastModelConfigID` unchanged (rollback).
- `coderd/exp_chats_test.go` (under `TestPatchChatMessage`):
- `ChangesModel`: end-to-end via SDK; `edited.Message.ModelConfigID` and
`chat.LastModelConfigID` both match the new model.
- `InvalidModelConfigID`: random UUID -> 400 `Invalid model config ID.`.

</details>
2026-05-12 14:51:55 +02:00
Michael Suchacz f847ff3731 test(coderd/x/chatd): skip stale notification flakes (#25177)
Skip the chatd tests that currently flake because the control
notification flow cannot distinguish stale wake/status NOTIFY payloads
from real interrupt requests. Each skipped test includes a TODO to
re-enable it after the chatd notification flow refactor handles stale
notifications correctly.

Supersedes #25133, #25134, #25135, and #25139.

Refs [CODAGT-353](https://linear.app/coder/issue/CODAGT-353),
[CODAGT-356](https://linear.app/coder/issue/CODAGT-356),
[CODAGT-360](https://linear.app/coder/issue/CODAGT-360), and
[CODAGT-361](https://linear.app/coder/issue/CODAGT-361).

> Mux working on behalf of Mike.
2026-05-12 14:50:30 +02:00
Ethan 4e08543ace test(coderd): centralize chat test harness and stabilize flakes (#25171)
Chat tests previously constructed a real `openai` provider with a fake
API key and no `BaseURL`, so background title generation hit
`api.openai.com` and timed out under `-race`. The same root cause
produced several distinct flakes: title regeneration races with
synchronous `UpdateChat`/`ProposeChatTitle`, and pagination races
against `updated_at` bumps from real-network processing.

This moves the fake OpenAI-compatible provider and the chat-settle wait
into first-class `coderdtest` capabilities.
`coderd.Options.ChatProviderAPIKeys` is the new seam tests use to
redirect chat traffic to a local `httptest.Server`.
`coderdtest.WaitForChatSettled` replaces per-test waiters and drains
tracked chat-daemon work after the chat row leaves `pending`/`running`.
The `newChatClient*` constructors funnel through one options builder
that installs the fake provider before the coderd test server so cleanup
ordering is deterministic.

Closes https://github.com/coder/internal/issues/1528 & Closes ENG-2659
Closes https://github.com/coder/internal/issues/1480 & Closes CODAGT-359
Closes https://github.com/coder/internal/issues/1507 & Closes CODAGT-368
Relates to https://github.com/coder/internal/issues/1397 & Relates to
CODAGT-374
2026-05-12 22:13:55 +10:00
Kyle Carberry 376fc80451 fix(coderd/x/chatd): discover workspace MCP tools mid-turn after create_workspace (#25169)
## Problem

In `coderd/x/chatd/chatd.go` `runChat`, workspace MCP discovery is gated
on `chat.WorkspaceID.Valid` at the start of each turn. New chats that
bind their workspace mid-turn (via `create_workspace` or
`start_workspace`) get an empty workspace tool list on the first step,
and the model falls back to `execute` (bash) because no workspace MCP
tools are advertised.

**Repro:** new chat → "create a workspace and use MCP tools". No
`/api/v0/mcp/tools` request hits the agent on turn 1; turn 2 in the same
chat works fine.

## Fix

- Add a `PrepareTools` callback to `chatloop.RunOptions`, analogous to
`PrepareMessages`. It is invoked once before each LLM step with the
current tool list. When it returns non-nil, the chatloop replaces
`opts.Tools`, rebuilds the per-step tool definitions, and appends new
tool names to `opts.ActiveTools` so newly injected tools are callable
immediately.
- Wire `PrepareTools` in `runChat` to trigger workspace MCP discovery
the first time the chat snapshot reports a valid `WorkspaceID`. The
previous top-of-turn discovery path is unchanged for chats that start
with a workspace.
- Extract the discovery logic into `Server.discoverWorkspaceMCPTools` so
the top-of-turn and mid-turn paths share identical behavior (cache,
agent resolution, `ListMCPTools` timeout, invalidation).

Mid-turn discovery stays disabled in plan-mode turns and Explore
subagents, matching the existing top-of-turn gate. The
`workspaceMCPDiscovered` flag prevents redundant dials after the first
successful discovery.

## Tests

- `coderd/x/chatd/chatloop/chatloop_test.go`: two new
`TestRun_PrepareTools*` cases covering injection on the next step and
active-set merging when `ActiveTools` is non-empty.
- `coderd/x/chatd/chatd_test.go`:
`TestRunChat_WorkspaceMCPDiscoveryAfterMidTurnCreateWorkspace` drives
`runChat` through a `create_workspace` tool call against a real Postgres
+ mocked agent conn and asserts the second streamed LLM request
advertises the workspace MCP tool. Verified that the test fails (and
pinpoints the missing tool) when the `PrepareTools` wiring is disabled.

## Validation

```
go test ./coderd/x/chatd/chatloop/... -count=1
go test ./coderd/x/chatd/... -count=1
make lint/emdash
```

<details>
<summary>Decision log</summary>

- Chose a per-step `PrepareTools` callback over mutating `opts.Tools` in
place because `chatloop.Run` builds the `fantasy.Tool` definitions once
at start; a hook is required to let the LLM see new tools on the next
step.
- Returned `[]fantasy.AgentTool` (not also active-tool-names) and let
the chatloop derive name merges via `mergeNewToolNames`. This avoids
leaking plan-mode gating decisions into the callback contract.
- Kept the existing top-of-turn discovery path so chats that already
have a workspace at turn start pay no extra latency.
- Skipped reusing `ReloadMessages` (history reload) since this is purely
a tool-availability concern; coupling it to a history reload would
defeat the chatloop cache prefix optimizations.

</details>

---

_This pull request was generated by Coder Agents._
2026-05-12 00:30:56 -04:00
Kyle Carberry 5a5cd79c4c fix: drop buffered chat parts after their durable message commits (#25164) 2026-05-12 00:30:38 -04:00
Kyle Carberry 0ed57ee343 fix(coderd/x/chatd): checkpoint buffered message_parts to avoid stale replay (#25145) 2026-05-11 17:27:03 -04:00
Thomas Kosiewski e56381eb61 feat: stream advisor tool output (#25032)
Stream advisor output into the advisor tool card while the nested
advisor call is still running.

This keeps the advisor implementation intentionally advisor-specific:
the parent model still receives the same final structured tool result,
while the frontend receives transient `tool-result.result_delta` parts
to render partial advisor text in the expanded card. The final persisted
chat history remains unchanged.

Refs CODAGT-322.

Generated by Coder Agents.

<details>
<summary>Implementation plan</summary>

- Publish advisor text deltas from the nested `chatloop.Run` via
`RunAdvisorOptions.OnAdviceDelta`.
- Forward those deltas through `chatadvisor.Tool` with the parent
advisor tool call ID.
- Emit transient `ChatMessagePartTypeToolResult` websocket parts with
`ResultDelta` from `chatd`.
- Add `result_delta` to the generated tool-result TypeScript variant.
- Accumulate tool result deltas in frontend stream state and keep the
tool running until the final result arrives.
- Render streamed advisor advice in the existing advisor card using
streaming markdown mode, while retaining the updated advisor UI.

</details>
2026-05-11 20:18:49 +02:00
Michael Suchacz 6bb88775ab test(coderd/x/chatd): pin TestGetWorkspaceConn_StatusCheck to mock clock (#25130)
The `TimedOutAgentCacheHit`, `CacheHitHealthyAgent`, and
`CacheHitDBError` subtests of `TestGetWorkspaceConn_StatusCheck` built
their `WorkspaceAgent` timestamps with `time.Now()` in the parent test's
slice literal and then ran the actual check against the server's real
wall clock (`quartz.NewReal()`). On slow Windows CI runners, more than
`agentInactiveDisconnectTimeout` (30s) of wall time can elapse between
slice construction and the parallel subtest body. In that window, the
cached "healthy" agent gets reclassified as disconnected by
`agentDisconnectedFor`, and `CacheHitHealthyAgent` fails with
`errChatAgentDisconnected` instead of returning the cached connection.

Build each agent inside the subtest with `quartz.NewMock(t)` and feed
the same clock into the `Server` so the agent timestamps and the status
math share a single frozen `now`. This matches the pattern already used
by `TestGetWorkspaceConn_DialTimeoutDisconnectedRecoveryThreshold` in
the same file.

Closes https://github.com/coder/internal/issues/1522

<details>
<summary>Verification</summary>

Inserting `time.Sleep(35 * time.Second)` at the top of each subtest's
body reliably reproduces the original failure
(`errChatAgentDisconnected` on `CacheHitHealthyAgent`) on the parent
commit and passes with this change. After removing the synthetic sleep,
`go test ./coderd/x/chatd -run TestGetWorkspaceConn_StatusCheck
-count=50` passes cleanly.

</details>

> Generated by Coder Agents on behalf of the assignee.

Co-authored-by: Coder Agents <noreply@coder.com>
2026-05-11 19:53:58 +02:00
Michael Suchacz 60779ad2ec test(coderd/x/chatd): stop waking acquireLoop in TestResolveExploreToolSnapshot (#25129)
Fixes [CODAGT-367](https://linear.app/codercom/issue/CODAGT-367).

`TestResolveExploreToolSnapshot/*` flaked on CI (Linux and Windows) with
`context deadline exceeded` on the `GetMCPServerConfigsByIDs` call
inside `resolveExploreToolSnapshot`.

Each test setup called `server.CreateChat` twice with `MCPServerIDs` set
to fake `.example.com` URLs. `CreateChat` marks the chat pending and
calls `signalWake`, which causes the chatd background `acquireLoop` to
pick the chat up. That goroutine then dialed the fake MCP URLs
(NXDOMAIN, slower on Windows) and made an OpenAI request with the dbgen
default test key (401). Under CI load, that activity racing the 4
parallel subtests' `GetMCPServerConfigsByIDs` calls was enough to exceed
the 25s test context deadline. The failure logs in the issue showed both
side effects firing in the same job.

`resolveExploreToolSnapshot` only reads `ID`, `MCPServerIDs`,
`PlanMode`, `ParentChatID`, and `Mode` off the parent argument, so the
chats do not need to be persisted. Build them as in-memory
`database.Chat` values instead. The MCP server configs remain in the DB
because the function still queries them via `GetMCPServerConfigsByIDs`.

Verified locally with `go test ./coderd/x/chatd -run
TestResolveExploreToolSnapshot -count=100 -race` (passes, ~5s total) and
the surrounding `TestResolve*` / `TestCreateChildSubagentChat*` /
`TestSpawnAgent_Explore*` tests.

---

_Made by Coder Agents on behalf of @ibetitsmike. [Linear
session](https://linear.app/codercom/issue/CODAGT-367/flake-testresolveexploretoolsnapshot#agent-session-0730f3fe)._
2026-05-11 19:46:59 +02:00
Michael Suchacz 645b8cc63d fix(coderd/x/chatd/chaterror): deflake TestClassify_ParsesRetryAfterHTTPDate (#25128)
The test built a `Retry-After` HTTP-date with
`time.Now().Add(3*time.Second).UTC().Format(http.TimeFormat)`, then
asserted that the parsed `RetryAfter` was `>= 2s`. `http.TimeFormat` has
second precision, so `Format()` truncates up to ~1s. Combined with the
small elapsed time between formatting in the test and `time.Until()` in
production, the value could land just under `offset-1s` (1.997s observed
in CI), failing the lower bound.

Round the formatted target up to the next whole second so the parsed
deadline is never earlier than `now+offset`, and assert against a
symmetric `[offset-1s, offset+1s]` window.

Closes
[CODAGT-365](https://linear.app/codercom/issue/CODAGT-365/flake-testclassify-parsesretryafterhttpdate)
Refs https://github.com/coder/internal/issues/1512

<sub>Created by [Coder Agents](https://coder.com/docs/agent).</sub>

Co-authored-by: Coder Agents <coderagents@coder.com>
2026-05-11 19:09:51 +02:00
Cian Johnston e8508b2d90 fix: recover chatd from poisoned chain anchor on retry (#25097)
When OpenAI's Responses API returns `Previous response with id ... not
found` for a chained turn, classify it as a `ChainBroken` retry, clear
`previous_response_id`, exit chain mode, reload full history, and let
`chatretry` retry. Self-heals chats whose anchor was poisoned before
#25074 stopped truncated streams from being persisted as a successful
turn with a stored response id.

The new state is exposed via the existing
`coderd_chatd_stream_retries_total` counter as a
`chain_broken="true"|"false"` label. Aggregating queries (`sum`, `rate`
over `provider`/`model`/`kind`) keep working without changes; raw-series
matchers without aggregation will now see two series per `(provider,
model, kind)` where they previously saw one. The metric is internal-only
so the blast radius should be small, but if you have dashboards that
index by exact label matchers without aggregation they will need an
extra `sum` or an explicit `chain_broken` selector.

> 🤖 This PR was created with the help of Coder Agents, and was reviewed by a human 🧑‍💻
2026-05-11 17:43:40 +01:00
Michael Suchacz 915956460a feat(coderd/x/chatd): add compact turn status labels (#25043)
> Mux is acting on Mike's behalf.

Changes chat turn-end summaries into compact status labels for the
cached `last_turn_summary` and successful web push body.

Uses a structured-output model call for successful turns, requiring a
2-5 word `label` and validating it to reject agent-centric phrasing.
Pending and requires-action states keep deterministic status labels.

Removes the earlier deterministic tool-signal pipeline in favor of the
smaller structured-output path.
2026-05-11 17:09:42 +02:00
Mathias Fredriksson fb60bb0c08 chore(coderd/x/chatd): instrument PromoteQueued + stream subscriber for ENG-2645 (#25085)
TestPromoteQueuedWhileRequiresActionMixedTools has flaked three times across
Windows and Ubuntu CI runners since 2026-05-06; local repro on the dev
workspace has not surfaced it. The May 8 Ubuntu log shows all four
PromoteQueued post-TX pubsub publishes reaching pg_notify, yet the test still
times out 25s later, so the failure is downstream between the subscriber's
listener and the test's events channel. Adds three Debug-level markers in
chatd.go (no logic change) plus two t.Logf markers in the test's reader so
the next CI occurrence pins down exactly which step failed.

Closes ENG-2645
Closes coder/internal#1523
2026-05-11 08:33:46 +00:00
Ethan 063c06ca5f test: prevent expired contexts in chatd parallel subtests (#25107)
Parallel subtests in `coderd/x/chatd` reused a parent test context with
a `testutil.WaitLong` deadline, so the context could expire before a
subtest was scheduled under load. That made the subagent lifecycle tools
return plain-text context errors instead of the expected JSON payload,
causing flaky JSON unmarshal failures.

Create fresh `chatdTestContext` values inside the affected parallel
subtests and add `chatdTestContext` to the `paralleltestctx` custom
function list so this pattern is caught by `make lint`.

Closes https://github.com/coder/internal/issues/1494
2026-05-11 17:48:27 +10:00
Ethan bd6cc1aaf2 feat(coderd): add stop_workspace chatd tool and recovery classification (#24997)
## Summary

Adds a `stop_workspace` tool to chatd so the model can recover from the
"workspace running but agent dead" failure mode (e.g. an OOM that leaves
the workspace running but the agent unreachable) by stopping and then
starting the workspace.

<img width="924" height="742" alt="image"
src="https://github.com/user-attachments/assets/279dedb6-6e29-4fe1-8754-3a1f01e538bf"
/>



## What changed

**New `stop_workspace` chatd tool**
(`coderd/x/chatd/chattool/stopworkspace.go`). Mirrors `start_workspace`:
shares `WorkspaceMu` to serialize with create/start, waits for any
in-progress build before issuing a stop, and is idempotent only after a
successful Stop transition. Failed stop builds re-attempt rather than
reporting success.

**New `chatStopWorkspace` coderd hook** (`coderd/exp_chats.go`). Mirrors
`chatStartWorkspace` minus the `RequireActiveVersion` gate. Stop should
not be blocked by template version policy.

**Differentiated recovery sentinels** (`coderd/x/chatd/chatd.go`).
`errChatAgentDisconnected` instructs the model to call `stop_workspace`
then `start_workspace`. `errChatDialTimeout` instructs a single retry,
then user escalation if it repeats. The previous single message
conflated transient and persistent failures.

**Two-signal recovery gate.** Recovery is only surfaced when a tool call
times out *and* a fresh DB read of the latest workspace agent says
`Disconnected`. The previous draft escalated on the DB read alone, which
would fire on a 30-second heartbeat blip (e.g. agent respawn) and prompt
a destructive stop/start unnecessarily.

**Cache-hit disconnected handling** now clears the cache and retries a
fresh dial before escalating, rather than returning the recovery
sentinel immediately. Latest-agent classification uses
`GetWorkspaceAgentsInLatestBuildByWorkspaceID` instead of the chat's
bound `AgentID`, so stale bindings after a rebuild don't misclassify.

**Shared chattool helpers** in `coderd/x/chatd/chattool/chattool.go`:
`latestWorkspaceBuildAndJob`, `publishBuildBinding`,
`provisionerJobTerminal`. Applied to both `start_workspace` and
`stop_workspace`.

## Notes

- Reverts an earlier draft that widened `ask_user_question` to root
standard turns. Plan-mode-only behavior is restored.
- The `stop_workspace` tool currently renders via the generic chat
tool-call UI. A follow-up frontend PR will prettify the `stop_workspace`
tool and style it like the `start_workspace` tool.
- Never-connected (`Timeout` status) agents are intentionally excluded
from recovery. They indicate template or startup failure, not the
running-but-dead case this PR targets.

Closes CODAGT-315
2026-05-11 16:23:07 +10:00
Mathias Fredriksson 3925d3941b fix(coderd/x/chatd): wait long enough for cold-start workspace MCP discovery (#25035)
The 5s timeout cancelled cold-start ListMCPTools calls before the
agent's 30s connectTimeout could settle, so workspace MCP tools
never reached the LLM. Bump to 35s and scope to ListMCPTools only.
2026-05-08 17:49:10 +03:00
Ethan b6dbc5614c fix(coderd/x/chatd): handle truncated provider streams (#25074)
coder/fantasy now fails closed when Anthropic or OpenAI Responses
streams close before their provider terminal events instead of yielding
a successful finish.

This bumps the fantasy replacement to coder/fantasy#33 and teaches chat
error classification to treat those failures as retryable timeout errors
with explicit stream-closed messages.

<img width="875" height="311" alt="image"
src="https://github.com/user-attachments/assets/69c6f7b5-c885-46d2-a88b-b7a2b111bd55"
/>
2026-05-08 15:52:42 +10:00
Ethan de9cdca77e fix(coderd): handle external-agent workspaces honestly in chat (#24969)
## Summary

Make Coder's chat agent honest about workspaces that use
`coder_external_agent`. Three behaviors change so the chat stops
pretending it can drive an external workspace through to a usable state
on its own.

<img width="859" height="537" alt="image"
src="https://github.com/user-attachments/assets/0561442b-95f1-4a2d-853c-7e3776114680"
/>


## Problem

External agents are not started by Coder. The user has to run `coder
agent` on their own host with a token Coder generates. Before this
change, the chat agent treated those workspaces like any other:

- `create_workspace` would enqueue a build for an external-agent
template and then wait minutes (~22 worst case) for an agent that was
never going to come up.
- When mid-turn tool calls dialed an external agent that was not
connected, the chat burned the full 30-second dial timeout and returned
generic "the workspace may need to be restarted from the Coder
dashboard" guidance, which is not the action the user can take.
- Nothing told the chat (or the user, through the chat) that the next
action lives outside Coder.

## Fix

Three changes scoped to `coderd/x/chatd/`:

1. **`create_workspace` blocks templates with external agents.** The
tool reads `template_versions.has_external_agent` for the template's
active version and refuses external-agent templates with a message
instructing the chat to pick a different template, or to have the user
create and start the workspace themselves and then attach it.

2. **Attaching an existing external workspace stays open.** No
selection-time gate on attachment; users can still bind a working
external workspace to a chat.

3. **External-agent-aware error handling on connection.** Two
complementary changes both predicated on proven connectivity failures
rather than every dial error:

- **`getWorkspaceConn` preflight and timeout handling.** Before opening
a connection, the cache-miss path reads the agent's status from the
already-loaded row. If the selected agent is external and clearly
offline according to the existing `isAgentUnreachable` helper
(`Disconnected` or `Timeout`, never `Connecting`), it returns an
external-agent-specific error immediately instead of waiting out the
30-second dial timeout. `Connecting` external agents fall through to the
dial so a user who just started the agent on their host can still
succeed in the same turn. The preflight only fires when the agent is
still the latest selected agent for the workspace, so stale-binding
recovery via `dialWithLazyValidation` is unaffected. The post-dial
rewrite is limited to the dial timeout sentinel; stale/no-agent bindings
and non-timeout dial failures preserve their original errors.

- **`waitForAgentReady` timeout-branch rewrite.** The 2-minute retry
loop used by `create_workspace` and `start_workspace` runs unchanged for
all agents. When the loop's outer deadline elapses, the timeout branch
substitutes the external-agent message in place of the raw dial error if
the agent belongs to an external resource.

This applies the same pattern that the cache-hit path of
`getWorkspaceConn` already used (`isAgentUnreachable` returning
`errChatAgentDisconnected`), extended to the cache-miss path and to the
readiness helper, with the external-agent-aware error rewrite layered
only on confirmed offline or timeout paths.

Closes CODAGT-314
2026-05-08 13:51:13 +10:00
Ethan 3a9080fff6 feat: tag chat-originating agent logs with chat_id (#25019)
Workspace-agent logs emitted while serving chatd-driven requests were
not correlated with the originating chat, making agent logs hard to
attribute to the corresponding/originating chat.

This adds agent-side chat context middleware that parses `Coder-Chat-Id`
once, enriches agent access logs and structured handler/background logs,
and adds a chatd bridge log when chat headers are attached to an agent
connection.

Closes CODAGT-324
2026-05-08 13:25:30 +10:00
Dean Sheather e1b1c7ec5b feat: resize chat image attachments client-side for provider budgets (#24533)
Anthropic rejects inline images over 5,242,880 bytes, but our upload
endpoint accepts images up to 10 MiB — so 5–10 MiB images were
reaching the provider and failing. This adds two layers of
protection: the browser resizes oversized images before upload, and
the server rejects any that still slip through before an upstream
request is issued.

Client-side resizing uses `createImageBitmap` with
`resizeWidth`/`resizeHeight` to clamp the decoded bitmap at decode
time, then iteratively shrinks on an `OffscreenCanvas` (falling back
to `HTMLCanvasElement`) until the output fits the applicable budget.
Anthropic (and Bedrock-hosted Claude — fantasy's bedrock provider is
a thin wrapper around the Anthropic client) uses a ~5 MiB budget;
other providers use a ~10 MiB budget to stay under the server cap.
Doing the resize in the browser avoids decoding attacker-controlled
image bytes in `coderd` (image-bomb DoS surface).

Server-side, `chatFileResolver` now takes a provider string and
looks up the inline-image cap via a new
`chatprovider.InlineImageByteCap`
helper; oversized `image/*` files for capped providers are rejected
with a pre-classified `chaterror` before the SDK call. The backstop
fires for older clients, direct API callers, or any image that was
committed to the composer before the user switched to a stricter
provider.

Attachments commit to composer state synchronously with a new
`"processing"` `UploadState` so paste+Enter can't dispatch before
the resize finishes; the `"uploading"` send gate now covers both
states. Dismissed-while-resizing attachments are tracked in a
`WeakSet` so a late swap can't resurrect a removed file.

Closes CODAGT-215
2026-05-08 02:07:33 +10:00
Thomas Kosiewski 273e828442 fix: remove advisor reasoning configuration (#25030) 2026-05-07 15:19:19 +02:00
Mathias Fredriksson 8c08aa1f6c fix(coderd/x/chatd): wake after async chat-row UPDATEs commit (#25036)
The async title-generation and turn-summary goroutines launched from
processChat run autocommit UPDATEs on the chat row after finishActiveChat
has set the chat to pending and signalWake has fired. If the row lock
from one of those UPDATEs is held while acquireLoop's processOnce runs,
AcquireChats's FOR UPDATE SKIP LOCKED skips the freshly-pending chat and
returns no rows. The wake is then consumed with no acquisition, and the
chat sits in pending until the next acquireTicker (default 1s).

Wake again after each UPDATE commits. The second wake covers the race
window without changing the transaction semantics.

Closes coder/internal#1500
2026-05-07 15:53:11 +03:00
Ethan 6fa7e84761 test(coderd/x/chatd): skip TestExploreChatSendMessageCannotMutateMCPSnapshot (#25023)
Skips `TestExploreChatSendMessageCannotMutateMCPSnapshot` while the
chatd redesign is in flight. The test exposes a self-interrupt race in
`processChat`'s control-pubsub subscriber that is structurally fixed by
the redesign in #24444; skipping until then matches the existing
`TestSubscribeRelayEstablishedMidStream` skip in
`enterprise/coderd/x/chatd/chatd_test.go`.

Relates to https://github.com/coder/internal/issues/1493.
2026-05-07 07:56:17 +00:00
Ethan 2ff05608d2 test: stabilize chatdebug heartbeat threshold test (#25022)
`launchHeartbeat` could miss a stale-threshold update during startup if
`SetStaleAfter` ran after the heartbeat ticker was created but before
the goroutine subscribed to `thresholdChan`. In that case, the heartbeat
kept the old interval until a future tick, and the mock-clock test could
time out waiting for `Ticker.Reset` without advancing time.

Subscribe to `thresholdChan` before reading the heartbeat interval so
the channel consistently invalidates the interval. The regression test
now changes the threshold while ticker creation is trapped, making the
startup race deterministic.

Closes https://github.com/coder/internal/issues/1513
2026-05-07 17:12:14 +10:00
Ethan 100ebd9f3b test(coderd/x/chatd): deflake advisor chain mode snapshot (#25021)
`TestAdvisorChainMode_SnapshotKeepsFullHistory` was using the generic
active chatd test server, which leaves periodic pending-chat polling
enabled. That made the test inconsistent with the other OpenAI Responses
API tests and allowed stale pending pubsub notifications to interrupt
the second turn before the advisor request was observed.

Use the existing OpenAI Responses test server helper so pending-chat
acquisition is delayed and the test only starts processing after the
SendMessage pending notification has been published.

Closes https://github.com/coder/internal/issues/1510
2026-05-07 17:12:12 +10:00
Ethan ef0151601e feat: report insufficient quota build failures in chat tools (#24956)
## Summary

When a workspace build fails because the user is over their group quota,
the chat tools currently surface the failure as a bare `"workspace build
failed: insufficient quota"` string with no machine-readable error code
and no visibility into the user's current usage. Agents and the UI
cannot distinguish quota failures from any other Terraform error, so
users see an opaque message and have no clear path to recovery.

This PR tags quota failures with a typed error code at the source and
propagates it through the chat tool layer so callers can react to it
explicitly.

Relates to CODAGT-20

## Changes

**Provisioner runner**

- Add `InsufficientQuotaErrorCode = "INSUFFICIENT_QUOTA"` and set it
explicitly at the `commitQuota` failure site via a new
`failedWorkspaceBuildfCode` helper, so `provisioner_jobs.error_code` is
populated only on the genuine quota path. The substring matcher used for
externally produced sentinels (e.g. `"missing parameter"`, `"required
template variables"`) is intentionally not extended; provider errors
that happen to mention "insufficient quota" stay classified as generic
build failures.

**SDK and API contract**

- Add `JobErrorCodeInsufficientQuota` and a
`JobIsInsufficientQuotaErrorCode` helper to `codersdk`.
- Extend the swagger `enums` tag on `ProvisionerJob.ErrorCode` to
include `INSUFFICIENT_QUOTA`.
- Regenerate `coderd/apidoc`, `docs/reference/api/*`, and
`site/src/api/typesGenerated.ts`.

**chattool create_workspace / start_workspace**

- `waitForBuild` now returns a typed `*workspaceBuildError` carrying
both the message and the `JobErrorCode`, instead of a bare error string.
- New `quotaerror.go` introduces a structured `quotaErrorResult` (with
`error_code`, `title`, `message`, `build_id`, and optional `quota`) and
a best-effort `workspaceQuotaDetails` lookup that wraps owner
authorization internally and fetches `credits_consumed` and `budget`
from the database. Quota lookup failures (including authorization
failures) never block the failure payload.
- On quota-coded build failures, both `create_workspace` and
`start_workspace` now return the structured response (with the recovery
guidance inlined into `message`) instead of the bare `"insufficient
quota"` string. This applies to all three failure paths: post-creation,
an in-progress existing build, and a freshly triggered start build.
Non-quota build failures continue to use the existing
`buildToolResponse` / `newBuildError` path.
- Owner authorization is wrapped only on the call sites that need it
(the `CreateFn` and `StartFn` invocations and the quota-detail lookup),
so idempotent fast paths (already running, already in progress,
existing-workspace early returns) do not pay for an extra RBAC
round-trip or fail when role lookup is transient.

## Out of scope

- No changes to quota math, allowances, or bypass behavior.
- No automatic retries.
- No new quota-inspection tools and no changes to MCP
`coder_create_workspace` (which returns immediately and never observed
the build outcome here).
- No frontend UI changes; those will land in a follow-up PR that
consumes the new `INSUFFICIENT_QUOTA` code.
2026-05-07 15:01:58 +10:00
Mathias Fredriksson 6b0518d051 fix: state-aware queued message promotion (#24819)
PromoteQueued now branches on chat status: synth tool results before
the user message on requires_action, deferred reorder + Waiting on
running so the worker's persist+auto-promote keeps partial output.
Stale heartbeat falls through to the synchronous path; GetStaleChats
picks up Waiting+queue to recover post-cleanup-crash. Endpoint
returns 202.

Closes CODAGT-119
2026-05-06 19:11:56 +03:00
Michael Suchacz 0bfb9f6f13 feat: show agent turn summary in agents sidebar (#24942)
Persists the agent-generated turn-end summary on `chats` and shows it as
the Agents sidebar subtitle when present, falling back to the model
name. Errors still take precedence.

> Mux is acting on Mike's behalf.

## What changes

**Storage.** New nullable `last_turn_summary` column on `chats`
(migration `000486`). New `UpdateChatLastTurnSummary` query normalizes
blank/whitespace input to `NULL`, preserves `updated_at` (so the chat
does not jump to the top of the sidebar on summary writes), and uses an
`expected_updated_at` stale-write guard so an older async summary cannot
overwrite a newer turn.

**Backend.** `coderd/x/chatd/chatd.go` decouples summary generation from
webpush. Generated summaries persist for completed parent turns even
when webpush is unconfigured or has no subscriptions. The same generated
text is reused as the webpush body when webpush is configured, so the
summary model is not called twice. Generic fallback push text is no
longer persisted; it clears any stale summary instead.
Error/interrupt/pending-action terminal paths clear `last_turn_summary`
for the latest turn.

**Frontend.** `AgentsSidebar.tsx` subtitle priority is now `errorReason
|| lastTurnSummary || modelName`, normalized via the existing
`asNonEmptyString` helper from `blockUtils.ts`.

## Tests

- `TestUpdateChatLastTurnSummary` (database): success,
whitespace-to-NULL, stale guard rejects, `updated_at` preserved.
- `TestUpdateLastTurnSummaryRejectsStaleWrites` (chatd internal): direct
stale-`expected_updated_at` test.
- `TestSuccessfulChatPersistsTurnSummaryWithoutWebPush`: persistence
works without webpush subscriptions.
- `TestSuccessfulChatSendsWebPushWithSummary`: same generated text
drives both DB and push body.
-
`TestSuccessfulChatSendsWebPushFallbackWithoutSummaryForEmptyAssistantText`:
fallback text is not persisted.
- `TestErroredChatClearsLastTurnSummaryAndSendsWebPush`: error path
clears the field.
- `TestInterruptChatDoesNotSendWebPushNotification`: interrupt path
clears the field, no push fires.
- `AgentsSidebar.test.tsx`: subtitle priority for summary-present,
error-wins, no-summary fallback, whitespace fallback.
- `AgentsSidebar.stories.tsx`: `ChatWithTurnSummary` and
`ChatWithTurnSummaryAndError`.

## Notes

- No backfill. Existing chats keep showing the model name until their
next turn completes.
- Parent chats only in this iteration; the field is rendered on any
`Chat` if a future change extends generation to children.
- Decoupling generation from webpush adds quickgen model calls for
completed parent turns that previously skipped generation when no
subscriptions existed. Existing parent-only, assistant-text-present,
`PushSummaryModel` configured, and bounded-timeout gates keep this
behavior bounded.
2026-05-06 16:43:35 +02:00
Cian Johnston a74015fc85 refactor: make store and chatID explicit parameter arguments in chattools (#24850)
Fixes CODAGT-175

Addresses a review finding in https://github.com/coder/coder/pull/23827
that the nil-guards for both `database.Store` and `chatID` are both dead
code in practice in the `chattool` package.

- Modifies the return signatures require passing both `database.Store`
and `chatID` explicitly as positional arguments instead of just
parameter struct keys.
- Drops the nil-guards for `database.Store` and `chatID`.
2026-05-06 11:05:16 +01:00
Ethan e5c7fdff86 fix(coderd/x/chatd): refresh chat status and bound subscriber reads on Subscribe (#24095)
Tightens the chat stream subscription path on a few related axes. None
of these changes touch the steady-state event flow; they all concern the
subscribe handshake.

## Motivation

`Server.Subscribe` carries three responsibilities that were entangled:

1. Authorize the caller against the chat row.
2. Arm local + pubsub subscriptions before any DB reads
(subscribe-first-then-query).
3. Build the initial snapshot from a fresh chat row, message history,
and queue.

When all three live in one function and share the request context, a few
unfortunate behaviors fall out:

- The HTTP handler's middleware already loaded and authorized the chat
row, but `Subscribe(chatID)` discarded it and re-fetched on every
WebSocket connection.
- The chat row used to populate the initial `status` event was loaded
*before* the pubsub subscription was armed, so a status transition that
happened in that window was silently lost.
- Control-path DB reads inherited whatever context the caller passed in.
A caller without a deadline could wedge a subscriber goroutine
indefinitely on a stalled DB.
- A transient failure of the chat re-read collapsed the entire
subscription instead of degrading gracefully.

## What changes

**Split the auth boundary out into the type signature.** A new
`SubscribeAuthorized(ctx, chat, ...)` takes the already-authorized row
directly. The HTTP handler in `coderd/exp_chats.go` calls it with the
chat row from `httpmw.ChatParam`, eliminating the redundant
`GetChatByID`. `Subscribe(chatID)` is preserved as a thin wrapper for
callers that don't have a chat row in hand (tests, internal callers); it
does the auth lookup and delegates.

**Re-read the chat after arming subscriptions.** Inside
`SubscribeAuthorized`, after the local stream and pubsub subscriptions
are active, we reload the chat row to populate the initial `status`
event and any enterprise relay setup. Combined with the existing
subscribe-first-then-query pattern, this closes the gap where a status
transition between the middleware's load and the subscription arming
would not appear in either the initial snapshot or a live notification.

**Fall back to the middleware row on refresh failure.** If the
post-subscription refresh fails (transient DB blip, brief pool
exhaustion), we log a warning and reuse the row that proved
authorization in the first place. Messages, queue, and pubsub are all
independent of this row, so the stream still works; the initial `status`
is just slightly stale and self-corrects via the next pubsub event.

**Bound subscriber control-path DB reads.** A new
`streamSubscriberControlFetchContext` helper applies a 5-second fallback
timeout only when the caller has no deadline of their own. Used at the
chat refresh, the initial queue load, and the queue-update goroutine
following pubsub notifications. HTTP-driven callers pass through
unchanged; background callers can no longer hang forever on a stalled DB
and leak subscriber goroutines, pubsub subscriptions, and `chatStreams`
entries.
2026-05-06 14:29:53 +10:00
Ethan 46a60e6d5d refactor: move chat error kinds into codersdk (#24955)
Moves the chat error kind taxonomy from `coderd/x/chatd/chaterror` into
`codersdk.ChatErrorKind` and types `ChatError.Kind` /
`ChatStreamRetry.Kind` so generated TypeScript exposes an SDK-owned
union, including `usage_limit`. Backend chat classification now
references the SDK constants directly while preserving the existing JSON
string values.

Keeps chat usage-limit admission failures on their existing 409 response
shape. The frontend maps structured usage-limit responses to the
SDK-owned `usage_limit` kind, uses generated `TypesGen.ChatErrorKind`
directly, and removes the local string union and alias.
2026-05-06 11:57:48 +10:00
Ethan 4751416b29 fix!: persist structured chat errors (#24919)
**Breaking change for changelog:**

> `codersdk.Chat.last_error` now returns a structured `ChatError` object
(`{message, kind, provider, retryable, status_code, detail}`) instead of
a plain string. The chats API is experimental
(`/api/experimental/chats`), so this ships without a deprecation cycle;
consumers reading `chat.last_error` as a string must update to read
`chat.last_error.message`. SDK/generated TypeScript terminal error
payloads now use the single `ChatError` type; the live stream error
payload type is renamed from `ChatStreamError` to `ChatError`.

Persisted chat errors now carry the same provider-specific detail (kind,
provider, retryable, HTTP status, optional detail) as the live stream,
so refreshing a failed chat rehydrates with the full structured error
instead of a one-line headline.

Existing rows are migrated in place: legacy text errors are wrapped into
`{message, kind: "generic"}` so already-errored chats still render, and
rows with `last_error IS NULL` stay NULL. Internally, persisted fallback
decoding now reuses the existing `chaterror.KindGeneric` constant, with
no JSON value change.

Closes CODAGT-239
2026-05-05 12:56:06 +10:00
Ethan 7e01edeb8e fix: align chat attachment picker with allowed file types (#24917)
The agent chat composer only advertised image uploads to the OS file
picker and filtered drag-and-drop and paste events to `image/*`, even
though the backend accepts text, CSV, JSON, PDF, and a narrower set of
image types.

Move the allowed chat attachment media types into `codersdk` so the
frontend picker and backend enforcement share one source of truth. Use
the generated TypeScript list to drive the file input `accept` attribute
and the drag-and-drop and paste filters, while adding common text
extensions so platforms without MIME registrations still surface those
files in the picker.
2026-05-05 12:25:13 +10:00
Michael Suchacz 632dcdb63a feat: add personal chat model overrides (#24715) 2026-05-05 00:57:51 +02:00
Michael Suchacz 0bb09935bc feat: add computer-use provider selection for AI agents (#24772)
Adds a deployment-wide setting to select the computer-use provider
(Anthropic or OpenAI) for AI agents, plus the OpenAI computer-use runner
needed to honor that selection.

The setting is stored in `site_configs` under
`agents_computer_use_provider`, defaults to Anthropic when unset, and is
exposed via experimental GET/PUT endpoints under
`/api/experimental/chats/config/computer-use-provider`. The chatd
computer-use tool now dispatches to either `runAnthropicComputerUse` or
`runOpenAIComputerUse` based on the resolved provider, with
provider-specific result metadata for OpenAI screenshots.

Frontend adds a provider dropdown to the Agents Experiments settings
page nested under the virtual desktop toggle, with disabled state
handling while virtual desktop is off and skeleton loaders while config
queries are in flight.

Hugo and Codex review follow-up:
- Uses shared provider validation and clearer computer-use constant
names.
- Removes stale OpenAI pending-safety-checks commentary.
- Documents why provider result metadata is needed for OpenAI
screenshots.
- Keeps the computer-use subagent visible when provider credentials are
missing, then returns a clear spawn-time configuration error.
- Uses OpenAI's recommended 1600x900 screenshot geometry to preserve the
native 16:9 aspect ratio.
- Moves OpenAI-specific computer-use helpers into
`coderd/x/chatd/chatopenai/computeruse` after rebasing onto the provider
package refactor in `main`.
- Converts OpenAI pixel scroll deltas to Coder desktop wheel-click
amounts.
- Preserves OpenAI pointer modifiers with key down/up desktop actions
and rejects unsupported non-left double-click buttons explicitly.
- Maps OpenAI back/forward side-button clicks to browser navigation key
actions.
- Defaults omitted OpenAI click buttons to left-click.
- Retries mouse release cleanup if the final OpenAI drag release fails.
- Keeps computer-use subagent availability messages stable when provider
config cannot be loaded, while logging the backend error.
- Releases remaining OpenAI modifier keys if a synthetic key-up cleanup
action fails.
- Updates Storybook interaction stories so provider snapshots show the
selected final provider.

> Mux updated this PR description on behalf of Mike.
2026-05-04 20:30:50 +02:00
Michael Suchacz 033ed0bb82 feat: add admin-configurable chat title generation model (#24838)
Adds an admin-configurable deployment-wide setting that controls which
model is used for chat title generation. Admins can pick any enabled
chat model config from the Agents settings page, or leave the setting
unset to keep the existing fast-models-then-chat-model fallback
algorithm.

When a model is selected, both automatic and manual title generation use
only that model, with no silent fallback. When the configured model is
disabled, missing credentials, or otherwise unusable, automatic title
generation skips entirely (best-effort) and manual title regeneration
returns a clear error, so admins notice the misconfiguration instead of
silently routing title traffic through another provider.

## Surface

- New deployment-wide setting stored as a `site_configs` row
(`agents_chat_title_generation_model_override`).
- New experimental endpoint `GET/PUT
/api/experimental/chats/config/model-override/{context}`.
- Frontend: title generation now appears as a third dropdown on the
Agents admin settings page alongside the existing general and explore
context overrides.

## DRY refactors folded in

Title generation is integrated as a third value of the existing
`ChatModelOverrideContext` type alongside `general` and `explore`,
sharing the parameterized HTTP route, SDK methods, generated types, and
frontend API plumbing rather than introducing a parallel surface. The
`Agent` prefix was dropped from the type and route since title
generation is not a delegated agent.

The chatd model-override resolver is also shared.
`resolveConfiguredModelOverride` now takes a `failureMode` parameter:

- Subagent overrides use soft failure: misconfigured overrides are
logged and the parent model is used.
- Title generation uses hard failure: misconfigured overrides return an
explicit error so manual title regeneration surfaces the
misconfiguration and automatic title generation skips instead of
silently falling back.

> Mux is acting on Mike's behalf.
2026-05-04 13:13:00 +02:00
Michael Suchacz 203b0a9df8 refactor(coderd/x/chatd): extract OpenAI logic into chatopenai package (#24788)
Extracts OpenAI-specific logic from `coderd/x/chatd` into
`coderd/x/chatd/chatopenai` so the main chat path no longer references
`fantasyopenai` directly for chain mode info, response IDs, web search
tooling, or option mapping.

Structural refactor. The only deliberate behavioral narrowing is
consolidating Responses store checks and related keyed option or
metadata access on `opts[fantasyopenai.Name]`. That is documented by
`TestIsResponsesStoreEnabledIgnoresMalformedNonOpenAIKey` and is
unreachable in production where Responses options always live under
`fantasyopenai.Name`.

Summary:

- Moves OpenAI Responses chain mode info, response ID helpers, web
search tool construction, and provider option conversion into
`chatopenai`.
- Keeps Anthropic, Google, OpenRouter, and Vercel provider branches as
thin, existing code paths.
- `chatopenai` only imports `chatprompt` from chatd subpackages. It does
not import `chatd`, `chatloop`, `chatprovider`, or `chaterror`.
- Follow-up review fixes align helper names, keyed provider option
access, map cloning behavior, and PR documentation with the extracted
package boundary.
- Final sweep trims unused chain-mode state, removes a duplicate
store-check test case, drops an unused provider-tool parameter, and
shares the chat-message test helper through `chattest`.

> Mux is updating this PR on Mike's behalf.
2026-05-04 11:17:19 +02:00
Ethan 3a153ebb15 fix(coderd/x/chatd): replay retry phase on subscribe (#24569)
Retry events were previously fire-and-forget, so subscribers that
connected after a retry started only saw durable history plus
`status=running` and could not tell the stream was backing off.

Keep the current retry phase in `chatStreamState`, capture it atomically
with subscriber registration, replay it in the initial snapshot for
same-chat late joiners, and clear it when streaming resumes or ends so
reconnects get consistent retry state without duplicate delivery at the
subscription boundary.

Relates to CODAGT-139
2026-05-04 11:48:39 +10:00
Kyle Carberry d889ba1842 feat: add user_oidc auth type for MCP servers (#24793)
Adds a 5th MCP server authentication mode, `user_oidc` ("User OIDC
Identity"), that forwards the calling user's OIDC access token from
`user_links.oauth_access_token` to the upstream MCP server as
`Authorization: Bearer <token>`.

The token is read from `user_links` and refreshed transparently via
`oauth2.TokenSource` before each MCP request. No new per-MCP-server
secret storage and no per-user connect/disconnect step.

**Limitation**: only users who logged in via OIDC have a forwardable
token. Users authenticated via password or GitHub will see requests sent
without an `Authorization` header, and the upstream MCP server is
expected to respond with 401. A pluggable token source (e.g. CLI-minted
E2E tokens) is left as future work.

<details>
<summary>Implementation notes</summary>

- Schema: new
`coderd/database/migrations/000481_mcp_user_oidc_auth.{up,down}.sql`
relaxes the `mcp_server_configs.auth_type` CHECK constraint to include
`user_oidc`. Down migration deletes affected rows before restoring the
old constraint.
- SDK validation: `codersdk/mcp.go` extends `oneof` for
`CreateMCPServerConfigRequest` and `UpdateMCPServerConfigRequest`.
- Handler: `coderd/mcp.go` adds `case "user_oidc":` to the
field-clearing switch on update. The existing list and detail handlers
already report `auth_connected = true` for any non-`oauth2` auth type.
- Header construction: `coderd/x/chatd/mcpclient/mcpclient.go`
introduces a `UserOIDCTokenSource` interface and adds the `user_oidc`
case to `buildAuthHeaders`. `ConnectAll` / `connectOne` /
`buildAuthHeaders` gain `userID uuid.UUID, oidcSrc UserOIDCTokenSource`
parameters.
- Wiring: `coderd/x/chatd/chatd.go` adds `OIDCTokenSource` to `Config` /
`Server` and passes `chat.OwnerID` plus the source through `ConnectAll`.
`coderd/coderd.go` constructs the source next to the `chatd.New` call
when `options.OIDCConfig` is non-nil.
- Token source: `oidcMCPTokenSource` lives in `coderd/mcp.go`. It reads
the user's OIDC link, refreshes via `oauth2.TokenSource`, and writes the
refreshed token back to `user_links`. Logic is duplicated from
`provisionerdserver.ObtainOIDCAccessToken` to avoid an MCP ->
provisionerdserver dependency. The two copies must be kept in sync; a
comment on `oidcMCPTokenSource` records this.
- Frontend: `MCPServerAdminPanel.tsx` adds the new dropdown option, an
explanatory helper block (no admin-configurable fields), and a Storybook
story (`CreateServerUserOIDC`).
- Tests:
- `mcpclient_test.go`: `TestConnectAll_UserOIDCAuth`,
`TestConnectAll_UserOIDCAuth_NoLink`,
`TestConnectAll_UserOIDCAuth_NilSource`. All existing tests updated for
the new signature.
- `mcp_test.go`: extends `TestMCPServerConfigsAuthConnected` to assert
`auth_connected=true` for `user_oidc`; adds
`TestMCPServerConfigsUserOIDCClearsFields` and
`TestMCPServerConfigsUserOIDCDirect`.
- Docs: `docs/ai-coder/agents/platform-controls/mcp-servers.md`
describes the new mode and its OIDC-only limitation.

</details>

This PR was created by Coder Agents.

---------

Co-authored-by: Coder Agents <agents@coder.com>
2026-05-03 11:31:48 -04:00
Jaayden Halko efda5c2c12 feat: disable Git controls when Git is not active (#24673)
closes CODAGT-148

In chats with no Git context (no repositories known to the watcher, no
PR tab, no remote diff), the refresh button fires an "Unable to refresh
git status" toast because the watcher WebSocket never opens.

Derive `isGitActive = repositories.size > 0 || showRemoteTab` in
`GitPanel` and use it to:

- Disable the refresh button, unified-diff toggle, and split-diff toggle
  with a "Git is not set up for this chat" tooltip.
- Show a dedicated empty state explaining how to enable Git, replacing
  the generic "No pushed changes yet" copy.

Chats with at least one repository or a PR tab are unaffected; all
controls remain enabled and behave as before.

Adds a `GitNotActive` Storybook story with play-function assertions
covering the disabled controls and empty-state copy.
2026-05-01 14:46:46 +01:00