Fixes three classes of edit_files bugs and adds structured per-file
diff output for tool callers:
- New IncludeDiff flag on FileEditRequest; when set, the agent
returns FileEditResponse.Files[]{Path, Diff} with unified diffs
computed via go-udiff v0.4.1 Lines + ToUnified (not Unified,
which calls log.Fatalf on internal error).
- Fuzzy match comparators split each line into leading whitespace,
body, trailing whitespace, and ending. The splice substitutes at
each position: on agreement between search and replace the file's
bytes win; on disagreement the replacement's bytes are spliced
verbatim. Carve-outs for empty-body lines, multi-line EOF splices,
and level-aware indent translation for inserted lines.
- Indent-unit detection (GCD for spaces, tab-priority) lets a 4sp
LLM search insert correctly into tab or 2sp files. Falls back to
the previous cLead-inheritance path when units can't be detected
cleanly.
- Empty search is rejected with "search string must not be empty".
- Duplicate file paths in one request are rejected; symlink aliases
resolved via api.resolvePath before the dedup check.
- Frontend EditFilesRenderer consumes the structured files array by
explicit path (no label munging) with per-file synthetic fallback
for older agents or mismatched paths. On error, no diff is
rendered so the synthetic fallback doesn't misrepresent a
rejected edit as applied.
Breaking change: AgentConn.EditFiles changes from (ctx, req) error
to (ctx, req) (FileEditResponse, error) in codersdk/workspacesdk.
Source-breaking for external Go consumers; no compat shim per plan
owner.
Out of scope (tracked in CODAGT-214): level-aware indent for
middle-substituted splice lines. Locked in
TestEditFiles_FuzzyIndent_InsertionLevelAware's Lock_* cases plus
TestEditFiles_ReplaceAll_FuzzyIndentGap.
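A minimal sketch of the indent-unit detection described above (GCD for spaces, tab-priority). The function name `detectIndentUnit` and the rule that a space width below 2 counts as "not detected cleanly" are assumptions for illustration, not the agent's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// gcd returns the greatest common divisor of a and b; gcd(0, n) == n.
func gcd(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// detectIndentUnit guesses the indent unit of text. Hypothetical helper:
// any tab-indented line wins (tab-priority); otherwise the unit is the GCD
// of all leading-space widths. ok is false when no clean unit is found.
func detectIndentUnit(text string) (unit string, ok bool) {
	width := 0
	for _, line := range strings.Split(text, "\n") {
		if strings.TrimSpace(line) == "" {
			continue // blank lines carry no indent information
		}
		if strings.HasPrefix(line, "\t") {
			return "\t", true
		}
		if n := len(line) - len(strings.TrimLeft(line, " ")); n > 0 {
			width = gcd(width, n)
		}
	}
	if width < 2 {
		// Assumption: widths of 0 or 1 are treated as undetectable so the
		// caller can fall back to another strategy.
		return "", false
	}
	return strings.Repeat(" ", width), true
}

func main() {
	for _, src := range []string{
		"a\n    b\n        c",
		"a\n  b\n    c",
		"a\n\tb\n  c",
		"a\nb",
	} {
		unit, ok := detectIndentUnit(src)
		fmt.Printf("%q %v\n", unit, ok)
	}
}
```

With a unit for both the search text and the target file, an inserted line's indent level can be computed in one unit and re-rendered in the other.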
* Adds `streamJanitorLoop` to clean up stale streams every 30s
* Zeroes dropped slots to aid GC eligibility
* Adds regression tests in coderd/x/chatd and enterprise/coderd/x/chatd
> 🤖
Adds production-observability metrics to coderd/x/chatd/ for
model-level correlation and a chatStreams memory-leak investigation.
- Label per-request chatd metrics (steps_total, message_count,
prompt_size_bytes, tool_result_size_bytes, ttft_seconds,
compaction_total) with `model` and enrich the per-turn logger
with provider/model.
- Add `coderd_chatd_stream_retries_total{provider, model, kind}`
counter incremented in chatloop before OnRetry.
- Register a prometheus.Collector exposing `streams_active`,
`stream_buffer_size_max`, `stream_buffer_events`,
`stream_subscribers` from p.chatStreams.
- Add `coderd_chatd_stream_buffer_dropped_total` counter,
incremented per publishToStream drop independently of the
existing log-rate-limited bufferDropCount.
- Snapshot logger/model before the title-generation goroutine to
avoid a data race with the logger/model rebind below it.
> 🤖
> This PR was authored by Mux on behalf of Mike.
Introduce Explore mode, a read-only subagent modality for delegated
discovery and code investigation.
## What
Adds a `spawn_explore_agent` tool that creates child chats restricted to
read-only operations. An admin can optionally configure a deployment-wide
model override so Explore subagents use a model optimized for large
context or reasoning without changing the root chat's model.
### Backend
- New `ChatModeExplore` enum value (migration 000471).
- `spawn_explore_agent` tool definition with read-only allowlist:
`read_file`, `execute`, `process_output`, `read_skill`,
`read_skill_file`. Write tools, file editors, and nested subagent
spawning are blocked.
- Deployment config storage for the Explore model override
(`agents_chat_explore_model_override` in `site_configs`).
- Model resolution hierarchy: configured override, then current turn
model, then global default. Silent fallback with warning log when the
override becomes unavailable.
- RBAC: `AsChatd` for daemon reads, `ActionRead` and `ActionUpdate` on
`ResourceDeploymentConfig` for admin API calls.
- Plan mode root chats can use `spawn_explore_agent` for read-only
research, matching the planning prompt guidance.
- The Explore override config API now reports malformed saved overrides
as "treated as unset" so admins can clear them explicitly.
### Frontend
- `ExploreModelOverrideSettings` component in admin agent behavior
settings. Uses `ModelSelector`, handles unavailable model warnings, and
supports explicit Save and Clear actions.
- Malformed saved overrides show a warning and require an explicit Save
to clear, instead of Clear auto-submitting behind the scenes.
### Tests
- Integration: `TestExploreSubagentIsReadOnly` (full spawn flow, tool
verification, prompt overlay, DB state).
- Unit: tool allowlist tests for explore, plan, and default modes.
- Internal: model override resolution with valid, invalid UUID,
disabled, and unconfigured override scenarios.
- RBAC: `dbauthz_test.go` for `GetChatExploreModelOverride` and
`UpsertChatExploreModelOverride`.
- API: admin set and clear, malformed stored override reporting,
disabled model rejection, non-admin denial.
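The model resolution hierarchy listed under Backend (configured override, then current turn model, then global default) can be sketched roughly as follows. The function name, the string model IDs, and the `available` callback are hypothetical stand-ins; the real code resolves model configs by ID.

```go
package main

import (
	"fmt"
	"log"
)

// resolveExploreModel picks the model for an Explore subagent.
// Hypothetical sketch: available reports whether a model ID is usable.
func resolveExploreModel(override, turnModel, globalDefault string, available func(string) bool) string {
	if override != "" {
		if available(override) {
			return override
		}
		// Fall back without failing the spawn, but leave a warning behind.
		log.Printf("warning: explore model override %q unavailable, falling back", override)
	}
	if turnModel != "" {
		return turnModel
	}
	return globalDefault
}

func main() {
	available := func(id string) bool { return id == "large-context" }
	fmt.Println(resolveExploreModel("large-context", "turn-model", "default", available))
	fmt.Println(resolveExploreModel("removed-model", "turn-model", "default", available))
	fmt.Println(resolveExploreModel("", "", "default", available))
}
```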
> Mux working on behalf of Mike.
## Summary
- add an enabled chat model config lookup by ID for internal callers
- keep `spawn_agent` unchanged while threading an internal model
override through child subagent chat creation
- extend chatd coverage for inherited bindings, plan mode, and internal
override behavior
## Validation
- `go test ./coderd/x/chatd ./coderd/database/dbauthz`
- `make lint`
Two `InsertChatParams` blocks in `startworkspace_test.go` were missing
the `ClientType` field. Since the `chat_client_type` enum column is `NOT
NULL`, Postgres rejects the Go zero value (`""`), causing
`TestStartWorkspace` subtests `StoppedWorkspaceReportsAutoUpdate` and
`ManualUpdateRequired` to fail with:
```
pq: invalid input value for enum chat_client_type: ""
```
Closes https://github.com/coder/internal/issues/1471
## Problem
When a template has `require_active_version` enabled and the chat agent
tries to start a workspace that is stopped on an older template version,
the agent gets stuck in an infinite loop: `start_workspace` fails with a
403 (the old version is not the active version and the user lacks
`ActionUpdate` on the template), then `create_workspace` sees the
existing stopped workspace and tells the agent to use `start_workspace`,
repeat forever.
The root cause is that `chatStartWorkspace()` passes the start build
request through without setting `TemplateVersionID`, so `wsbuilder`
defaults to the previous build's template version — which RBAC rejects
when `RequireActiveVersion` is true.
## Fix
In `chatStartWorkspace()` (`coderd/exp_chats.go`), when the template's
access control has `RequireActiveVersion` enabled, explicitly set
`req.TemplateVersionID` to `template.ActiveVersionID` before calling
`postWorkspaceBuildsInternal()`. This mirrors how the autobuild executor
handles the same scenario (`coderd/autobuild/lifecycle_executor.go`).
If the new active version introduces required parameters that cannot be
resolved automatically (no defaults, no previous values), the build
fails at parameter validation before a provisioner job is created. In
that case, a clear error message tells the user to update and start the
workspace from the UI instead of surfacing a raw internal error.
On successful auto-update, the tool response includes
`updated_to_active_version`, `update_reason`, and a human-readable
`message` so the model can explain to the user what happened.
<img width="782" height="122" alt="image"
src="https://github.com/user-attachments/assets/289430d6-066e-41cf-bc97-cd013dcf717d"
/>
### Changes
- **`coderd/exp_chats.go`**: `chatStartWorkspace()` loads the template,
checks `RequireActiveVersion` via `AccessControlStore`, and pins the
build to the active version when required. New
`isChatStartWorkspaceManualUpdateRequiredError()` classifies parameter
validation failures from both the dynamic parameters path
(`DiagnosticError`) and the classic path (`ErrParameterValidation`
sentinel).
- **`coderd/wsbuilder/wsbuilder.go`**: New `ErrParameterValidation`
sentinel error, wrapped into the classic parameter validation
`BuildError` so callers can use `errors.Is` instead of string matching.
- **`coderd/x/chatd/chattool/startworkspace.go`**:
`waitForAgentAndRespond` now returns `map[string]any` instead of
`fantasy.ToolResponse`, letting the caller annotate the result (e.g.
auto-update metadata) before converting. Error handling for `StartFn`
checks for `httperror.Responder` errors to surface clean messages for
the manual-update case.
- **`coderd/x/chatd/chattool/startworkspace_test.go`**: Two new tests —
`StoppedWorkspaceReportsAutoUpdate` (verifies auto-update fields in
response) and `ManualUpdateRequired` (verifies clean error message
without internal wrapping).
### Follow-up
The manual-update error message could include a direct link to the
workspace settings page, but the chattool layer does not currently have
access to the deployment's access URL. Plumbing it through is
straightforward but out of scope for this fix.
Closes CODAGT-192
Add a `chat_client_type` enum (`ui` | `api`) and `client_type` column to
the `chats` table. The column defaults to `api` for new rows so API
callers don't need to set it explicitly. Existing rows are backfilled to
`ui`.
The field flows through `CreateChatRequest`, `chatd.CreateOptions`,
`InsertChat`, and is returned in the `Chat` response via `db2sdk`.
<details>
<summary>Implementation notes (Coder Agents generated)</summary>
### Changes
**Database migration (000469)**
- New enum `chat_client_type` with values `ui`, `api`.
- New `client_type` column, `NOT NULL DEFAULT 'api'`.
- Backfill: `UPDATE chats SET client_type = 'ui'`.
**SQL query** — `InsertChat` now includes `client_type`.
**SDK** — `ChatClientType` type added; `ClientType` field added to both
`CreateChatRequest` (optional, defaults server-side to `api`) and `Chat`
response.
**Handler** — `postChats` maps the request field (defaulting to `api`)
and passes it through `chatd.CreateOptions`.
**Sub-agent** — Child chats inherit their parent's `client_type`.
**db2sdk** — Maps the database value to the SDK type.
### Decision log
- Default is `api` (not `ui`) so existing API integrations get the
correct value without code changes.
- Backfill sets existing rows to `ui` per requirement.
- Child chats inherit `client_type` from parent rather than defaulting.
</details>
> This PR was authored by Mux on behalf of Mike.
## Summary
- add persistent plan mode for chats and the chat-specific plan file
flow
- add structured planning tools such as `ask_user_question` and
`propose_plan`
- keep `write_file` and `edit_files` constrained to the chat-specific
plan file during plan turns
- allow shell exploration in plan mode, including subagents, via
`execute` and `process_output`
- block implementation-oriented, provider-native, MCP, dynamic, and
computer-use tools during plan turns
- update the chat UI, tests, and docs for the new planning flow
The `coderd_chatd_message_count` histogram's current max bucket of 128
is being hit in production. This increases the exponential bucket count
from 8 to 11, extending coverage from `1..128` to `1..1024`.
Before: `1, 2, 4, 8, 16, 32, 64, 128`
After: `1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024`
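For reference, the bucket boundaries above can be reproduced with a small stand-alone helper. This is a sketch that mirrors the behavior of `prometheus.ExponentialBuckets(start, factor, count)` from the Prometheus Go client; whether the histogram is defined through that exact helper is an assumption.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds, starting at start and
// multiplying by factor each step. Illustrative stand-in for the
// Prometheus client helper of the same shape.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	fmt.Println("before:", exponentialBuckets(1, 2, 8))
	fmt.Println("after: ", exponentialBuckets(1, 2, 11))
}
```

Each extra bucket doubles the covered range, so three more buckets move the top boundary from 128 to 1024.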
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Context files (AGENTS.md) and skills were only fetched from the
workspace on the first turn or when the agent changed. On subsequent
turns, stale content from persisted messages was used. This meant that
if AGENTS.md or skills were modified on the workspace between turns, the
agent wouldn't see the changes until the user created a new chat.
## Changes
- Extract `fetchWorkspaceContext` from `persistInstructionFiles` to
allow fetching workspace context without persisting
- On subsequent turns, re-fetch fresh context from the workspace instead
of reading stale persisted content; falls back to persisted messages if
the workspace dial fails
- Update `ReloadMessages` callback to re-derive instruction and skills
from reloaded database messages after compaction, instead of using
captured closure variables
- Add `formatSystemInstructionsFromParts` helper to build system
instructions directly from agent parts without requiring separate
OS/directory params
- Add tests for the new helper
<details><summary>Implementation Notes</summary>
### Root cause
In `runChat`, the `else if hasContextFiles` branch (subsequent turns)
called `instructionFromContextFiles(messages)` which read stale content
from persisted DB messages. The `ReloadMessages` callback
(post-compaction) also used captured `instruction`/`skills` closure
variables from the start of the turn, never re-deriving them.
### Approach
1. **Extract `fetchWorkspaceContext`** — Pure refactor of the fetch-only
part of `persistInstructionFiles` (agent connection, context config
retrieval, content sanitization, metadata stamping). Returns parts +
skills without persisting.
2. **Subsequent turns**: Instead of reading from persisted messages,
launch a `g2` goroutine that calls `fetchWorkspaceContext` to get fresh
context from the workspace. Falls back gracefully to persisted messages
if the workspace is unreachable.
3. **ReloadMessages**: Re-derive `instruction` from
`instructionFromContextFiles(reloadedMsgs)` and `skills` from
`skillsFromParts(reloadedMsgs)` using the freshly loaded messages, with
fallback to captured values if the reloaded messages don't contain
context (e.g. compacted away).
</details>
> 🤖 Generated by Coder Agents
## Problem
`resolveDeploymentSystemPrompt` was called inside `InTx` closures in
both `CreateChat` (`coderd/x/chatd/chatd.go`) and
`createChildSubagentChatWithOptions` (`coderd/x/chatd/subagent.go`).
That method uses `p.db` (the root store) internally to call
`GetChatSystemPromptConfig`, which requires a second DB pool checkout
while the transaction already holds one connection.
Under concurrent chat creation load (e.g., the chat scaletest at 4800
chats), this causes pool starvation: every in-flight create holds one
connection and blocks waiting for another, leading to `idle in
transaction` pileups and cascading timeouts across the entire coderd DB
pool — including unrelated background work like prebuild metrics and the
chat acquire loop.
## Fix
Move the `resolveDeploymentSystemPrompt` call before `p.db.InTx(...)` in
both call sites. The system prompt config is a read-only
deployment-level setting that does not need transactional consistency
with the chat insert, so fetching it before the transaction is both safe
and preferable (it also shortens transaction lifetime).
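The starvation pattern and the fix can be illustrated with a toy pool. Everything here (`pool`, `readConfig`, `createNested`, `createHoisted`) is hypothetical and only models connection checkouts; a real pool blocks instead of returning an error, which is what produces the pileup.

```go
package main

import (
	"errors"
	"fmt"
)

// pool is a toy connection pool: each token in slots is one checkout.
type pool struct{ slots chan struct{} }

func newPool(size int) *pool { return &pool{slots: make(chan struct{}, size)} }

var errPoolExhausted = errors.New("pool exhausted")

// tryAcquire checks out a connection, or reports false when none is free.
func (p *pool) tryAcquire() bool {
	select {
	case p.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

func (p *pool) release() { <-p.slots }

// readConfig models a read through the root store: it needs its own
// connection, independent of any open transaction.
func (p *pool) readConfig() (string, error) {
	if !p.tryAcquire() {
		return "", errPoolExhausted
	}
	defer p.release()
	return "system prompt", nil
}

// createNested reads config inside the transaction, so it needs two
// connections at the same time.
func createNested(p *pool) error {
	if !p.tryAcquire() { // transaction connection
		return errPoolExhausted
	}
	defer p.release()
	_, err := p.readConfig()
	return err
}

// createHoisted reads config first, then opens the transaction, so it
// never holds more than one connection.
func createHoisted(p *pool) error {
	if _, err := p.readConfig(); err != nil {
		return err
	}
	if !p.tryAcquire() { // transaction connection
		return errPoolExhausted
	}
	defer p.release()
	return nil
}

func main() {
	p := newPool(1)
	fmt.Println("nested: ", createNested(p))
	fmt.Println("hoisted:", createHoisted(p))
}
```

With a pool of size N, N concurrent nested creates each hold one connection and wait for a second that never frees up; the hoisted form cannot reach that state.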
## Backporting
The `CreateChat` instance of this bug is also present on `release/2.32`
(`coderd/x/chatd/chatd.go` line 907). The `subagent.go` instance is not
— the child-subagent-chat creation path with its own `InTx` was added
after the branch cut.
This should be backported, but because this is only in the chat creation
path, and that's not typically hit with a great deal of concurrency in
the real world, I don't think an urgent patch for 2.32 is necessary.
## Lint gap
The existing `InTx` ruleguard rule in `scripts/rules.go` catches direct
outer-store usage (`p.db.GetFoo()`) and passing the outer store as a
function argument inside `InTx` closures, but it explicitly cannot catch
indirect access through receiver methods like
`p.resolveDeploymentSystemPrompt()` — the rule documents this blind spot
at line 273. Catching this class of bug would require interprocedural
analysis (following the callee's body to see if it touches `p.db`),
which is beyond what ruleguard's AST pattern matching can express. We're
considering a lightweight custom `go/analysis` analyzer (similar to
`paralleltestctx`) that does 1-level same-package callee inspection to
detect this pattern. In the meantime, this PR adds guidance to
`AGENTS.md` so AI reviewers can flag the pattern during code review.
Addresses review findings from #23827 that were added post-merge:
- Persisted attachments now store `organizationId`; mismatched orgs
pruned on restore
- Workspace selection reconciliation: stale IDs from previous orgs
dropped via derived `effectiveWorkspaceId`
- Org picker uses `permittedOrganizations()` for RBAC-aware filtering
- Org picker hidden when user belongs to only one org
- Ref-sync `useEffect` replaced with `useEffectEvent`
- `CreateWorkspace()` and `ListTemplates()` take `organizationID` and
`db` as required function parameters instead of optional struct fields —
compiler enforces them, removes scattered nil guards
- Cross-org template check in `CreateWorkspace` is now unconditional
- `ListTemplates` org-scoping filter now has test coverage
- `setupChatInfra` comment fixed; test helpers use params structs
instead of positional UUIDs
- Enterprise test documents that org admin only sees own chats (handler
hardcodes `OwnerID` — future work needs sidebar UI before lifting that
restriction)
> 🤖
We noticed during higher active workspace counts that the agent
connection metric, generated via a query to the database, would report a
relatively high share of agents as disconnected, somewhere between 5
and 20%. However, other metrics, such as the number of websocket
connections, suggested that all agent connections were healthy.
Looking at the `Agents` function in prometheus metrics, plus the query
execution time (not accounting for actual database round-trip time),
revealed that these disconnected reports were almost certainly false
positives caused by clock drift in the way we generate the metric
values. At 10k metrics, with a p50 of 2ms and p99 of 5ms, the entire
`agents` function could take upwards of 50s to execute. Because we were
doing a database round trip to query the apps for each agent
individually, and grabbing a `time.Now` value on each iteration of that
loop, it's likely that the agents reported as disconnected were those
whose last heartbeat was furthest in the past.
The fix here is to set a consistent `now` before fetching agent data to
avoid clock drift inflating the inactive timeout comparison, and replace
the per-agent app query N+1 with a single batched lookup to prevent loop
execution time from pushing agents over the disconnected threshold.
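A small sketch of why a single `now` matters. The `agent` type, the 30s timeout, and the 15s of simulated per-iteration delay are illustrative assumptions chosen to make the effect visible, not values from the real collector.

```go
package main

import (
	"fmt"
	"time"
)

type agent struct {
	name          string
	lastHeartbeat time.Time
}

// countDisconnected classifies every agent against the same now, so loop
// execution time cannot push an agent over the timeout.
func countDisconnected(agents []agent, now time.Time, timeout time.Duration) int {
	n := 0
	for _, a := range agents {
		if now.Sub(a.lastHeartbeat) > timeout {
			n++
		}
	}
	return n
}

// demoDisconnected compares a single snapshot of now with a clock that
// advances on every iteration (standing in for per-agent query latency).
func demoDisconnected() (snapshot, drifting int) {
	base := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	timeout := 30 * time.Second
	agents := []agent{
		{"a", base.Add(-10 * time.Second)},
		{"b", base.Add(-20 * time.Second)},
		{"c", base.Add(-40 * time.Second)},
	}
	snapshot = countDisconnected(agents, base, timeout)
	for i, a := range agents {
		now := base.Add(time.Duration(i+1) * 15 * time.Second) // simulated drift
		if now.Sub(a.lastHeartbeat) > timeout {
			drifting++
		}
	}
	return snapshot, drifting
}

func main() {
	snapshot, drifting := demoDisconnected()
	fmt.Println("snapshot now:", snapshot, "per-iteration now:", drifting)
}
```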
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Three SQL queries (`GetUserGroupSpendLimit`,
`ResolveUserChatSpendLimit`, `GetUserChatSpendInPeriod`) aggregated chat
spend limits and usage globally across all organizations. A restrictive
group limit in org A would bleed into org B.
## Changes
- Add `organization_id` parameter to all three SQL queries in
`coderd/database/queries/chats.sql`
- When nil UUID is passed, queries fall back to global behavior
(backward compat for HTTP dashboard endpoints)
- When real org ID is passed, limits and spend are scoped to that
organization
- Thread `organizationID` through `ResolveUsageLimitStatus` →
`checkUsageLimit` → all chatd call sites
- Update dbauthz wrappers for new param structs
- HTTP endpoints (`chatCostSummary`, `getMyChatUsageLimitStatus`) pass
`uuid.Nil` with TODO for future org-scoped UI
- Add `TestResolveUsageLimitStatus_OrgScoped` with 5 test cases covering
org isolation, nil-UUID fallback, spend scoping, and user override
priority
Closes coder/internal#1466
> 🤖
> This PR was authored by Mux on behalf of Mike.
Chats sharing one workspace (e.g. sibling subagents) all wrote to
`/home/coder/PLAN.md`, causing plan file collisions. This change derives
a unique plan path per chat from the workspace home directory and chat
ID.
## Changes
* `write_file`, `edit_files`, and `propose_plan` reject any `plan.md`
variant (case-insensitive) at the workspace home root, with a clear
error pointing to the chat-specific path.
* Root chats receive a `<plan-file-path>` block inlined in the main
system prompt with the concrete path.
* Prompt and tool descriptions no longer hardcode `/home/coder/PLAN.md`.
* Plan path handling is POSIX-only (forward-slash), relying on the
contract that workspace agent paths are normalized before reaching
chatd.
* Updated `ProposePlanTool.stories.tsx` to use per-chat path examples.
* Full test coverage for plan path detection, legacy-path rejection in
all three tools, inline prompt rendering, and fallback behavior.
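A rough sketch of the legacy-path check, assuming POSIX paths as stated above. `isLegacyPlanPath` is a hypothetical name and the real detection may differ in detail.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isLegacyPlanPath reports whether p names any case variant of plan.md
// directly under the workspace home directory. Hypothetical helper; uses
// package path (forward-slash only), matching the POSIX-only contract.
func isLegacyPlanPath(home, p string) bool {
	clean := path.Clean(p)
	return path.Dir(clean) == path.Clean(home) &&
		strings.EqualFold(path.Base(clean), "plan.md")
}

func main() {
	home := "/home/coder"
	for _, p := range []string{
		"/home/coder/PLAN.md",
		"/home/coder/./Plan.md",
		"/home/coder/project/PLAN.md",
		"/home/coder/notes.md",
	} {
		fmt.Println(p, isLegacyPlanPath(home, p))
	}
}
```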
Fixes https://github.com/coder/internal/issues/1436
* Adds organization_id to chats with backfill (workspace org → user org membership → default org)
* No support yet for ACLs (follow-up issue)
- Cross-org workspace binding rejected (both in `CreateChatRequest` and in `create_workspace` tool)
- Adds `OrganizationAutocomplete` to `AgentCreateForm`
- Docs updated with `organization_id` in chats-api.md
> 🤖 Written by a Coder Agent. Reviewed by many humans and many agents.
---------
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
The startup-timeout integration tests in `chatloop` used a 5ms real-time
budget and relied on wall-clock scheduling to fire the startup guard
timer before the first stream part arrived. On loaded CI runners the
timer sometimes lost the race, producing `attempts == 2` instead of
`attempts == 1` and flaking `TestRun_FirstPartDisarmsStartupTimeout`.
Replace the real `time.Timer` in `startupGuard` with a `quartz.Timer` so
tests can control time deterministically. Production behavior is
unchanged: `RunOptions.Clock` defaults to `quartz.NewReal()` when nil,
and the startup timeout still covers both opening the provider stream
and waiting for the first stream part.
- Add `RunOptions.Clock quartz.Clock` with nil-safe default.
- Tag the startup guard timer as `"startupGuard"` for quartz trap
targeting.
- Rewrite the four startup-timeout integration tests to use
`quartz.NewMock(t)` with trap/advance/release sequences instead of
wall-clock sleeps.
- Add `awaitRunResult` helper so tests fail with a clear message instead
of hanging when `Run` does not complete.
Closes https://github.com/coder/internal/issues/1460
Adds `coder exp chat context add` and `coder exp chat context clear`
commands that run inside a workspace to manage chat context files via
the agent token.
`add` reads instruction and skill files from a directory (defaulting to
cwd) and inserts them as context-file messages into an active chat.
Multiple calls are additive — `instructionFromContextFiles` already
accumulates all context-file parts across messages.
`clear` soft-deletes all context-file messages, causing
`contextFileAgentID()` to return `!found` on the next turn, which
triggers `needsInstructionPersist=true` and re-fetches defaults from the
agent.
Both commands auto-detect the target chat via `CODER_CHAT_ID` (already
set by `agentproc` on chat-spawned processes), or fall back to
single-active-chat resolution for the agent. The `--chat` flag overrides
both.
Also adds sub-agent context inheritance: `createChildSubagentChat` now
copies parent context-file messages to child chats at spawn time, so
delegated sub-agents share the same instruction context without
independently re-fetching from the workspace agent.
<details><summary>Implementation details</summary>
**New files:**
- `cli/exp_chat.go` — CLI command tree under `coder exp chat context`
**Modified files:**
- `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context
from an arbitrary directory without env vars
- `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext`
SDK methods
- `coderd/workspaceagents.go` — POST/DELETE handlers on
`/workspaceagents/me/chat-context`
- `coderd/coderd.go` — Route registration
- `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`,
`SoftDeleteContextFileMessages`
- `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new
queries
- `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent
inheritance
- `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()`
**Auth pattern:** Uses `AgentAuth` (same as `coder external-auth`) —
agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars.
</details>
> 🤖 Generated by Coder Agents
---------
Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>
The agents chat interface displays thumbnails for videos recorded by the
computer use agent. Currently, to display a thumbnail, the frontend
downloads the entire video and shows the first frame. This PR starts
storing a new thumbnail file in the database for every recorded video,
and exposes the file id in the `wait_agent` tool result alongside the
recording file id, so the frontend can fetch just the thumbnail.
Adds an optional `CreatedAt` timestamp to `tool-call` and `tool-result`
`ChatMessagePart` variants so the frontend can compute tool execution
duration (`result.created_at - call.created_at`).
Timestamps are recorded at the correct moments in the chatloop:
- **Tool-call**: when the model stream emits the tool call
- **Tool-result**: when tool execution completes (or is interrupted)
These are passed through `PersistedStep.PartCreatedAt` so the
persistence layer can apply accurate timestamps to stored parts.
SSE-published parts also carry `CreatedAt` for real-time display.
Old persisted messages without `created_at` deserialize to `nil` — fully
backward compatible.
<details><summary>Implementation notes (Coder Agents
generated)</summary>
### Why not stamp in `PartFromContent`?
`PartFromContent` is called both for SSE publishing (correct timing) and
during persistence (wrong timing — both tool-call and tool-result would
get the same "persistence time" timestamp, yielding ~0 duration).
Instead, timestamps are captured in the chatloop at the right moments
and carried through `PersistedStep.PartCreatedAt` as a
`map[string]time.Time` keyed by `"call:<id>"` / `"result:<id>"`.
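As a sketch of how that map yields a duration, assuming the key scheme above. `toolDuration` and `sampleStamps` are hypothetical helpers; the frontend does the equivalent subtraction on the serialized `created_at` values.

```go
package main

import (
	"fmt"
	"time"
)

// toolDuration returns result time minus call time for one tool call ID.
// ok is false when either timestamp is missing, for example for old
// persisted messages that have no created_at.
func toolDuration(partCreatedAt map[string]time.Time, toolCallID string) (time.Duration, bool) {
	call, okCall := partCreatedAt["call:"+toolCallID]
	result, okResult := partCreatedAt["result:"+toolCallID]
	if !okCall || !okResult {
		return 0, false
	}
	return result.Sub(call), true
}

// sampleStamps builds example timestamps: the result lands 1.5s after
// the call was emitted.
func sampleStamps() map[string]time.Time {
	t0 := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	return map[string]time.Time{
		"call:tc_1":   t0,
		"result:tc_1": t0.Add(1500 * time.Millisecond),
	}
}

func main() {
	d, ok := toolDuration(sampleStamps(), "tc_1")
	fmt.Println(d, ok)
	_, ok = toolDuration(sampleStamps(), "tc_missing")
	fmt.Println(ok)
}
```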
### Interrupted tool calls
`persistInterruptedStep` also stamps `CreatedAt` on synthetic error
results for cancelled/interrupted tool calls, so partial duration is
available.
### Files changed
| File | Change |
|------|--------|
| `codersdk/chats.go` | Add `CreatedAt *time.Time` field |
| `codersdk/chats_test.go` | JSON round-trip test |
| `coderd/database/dbtime/dbtime.go` | Add `TimePtr` helper |
| `coderd/x/chatd/chatloop/chatloop.go` | Track timestamps, pass through
`PersistedStep` |
| `coderd/x/chatd/chatd.go` | Apply timestamps during persistence |
| `coderd/x/chatd/chatprompt/chatprompt_test.go` | Verify
`PartFromContent` does NOT stamp |
| `site/src/api/typesGenerated.ts` | Auto-generated |
</details>
---------
Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>
Adds client-executed dynamic tools to the chat API. Dynamic tools are
declared by the client at chat creation time, presented to the LLM
alongside built-in tools, but executed by the client rather than chatd.
This enables external systems (Slack bots, IDE extensions, Discord bots,
CI/CD integrations) to plug custom tools into the LLM chat loop without
modifying chatd's built-in tool set.
Modeled after OpenAI's Assistants API: the chat pauses with
`requires_action` status when the LLM calls a dynamic tool, the client
POSTs results back via `POST /chats/{id}/tool-results`, and the chat
resumes.
See [this example](https://github.com/coder/coder-slackbot-poc) as a
reference for how this is used. It's highly configurable, which would
enable creating chats from webhooks, polling periodically, or running as
a Slackbot.
<details>
<summary>Design context</summary>
### Architecture
The chatloop **exits** when it encounters dynamic tools and
**re-enters** when results arrive. No blocking channels, no pubsub for
tool results, no in-memory registry. The DB is the only coordination
mechanism.
```
Phase 1 (chatloop):
LLM response → execute built-in tools only →
Persist(assistant + built-in results) →
status = requires_action → chatloop exits
Phase 2 (POST /tool-results):
Persist(dynamic tool results) →
status = pending → wakeCh → chatloop re-enters
```
### Validation (POST /tool-results)
1. Chat status must be `requires_action` (409 if not)
2. Read chat's `dynamic_tools` → set of dynamic tool names
3. Read last assistant message → extract tool-call parts matching
dynamic tool names
4. Submitted tool_call_ids must match exactly (400 for missing/extra)
5. Persist tool-result message parts, set status to `pending`, signal
wake
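Step 4 above (submitted IDs must match exactly) amounts to a two-way set comparison. A minimal sketch, with `diffToolCallIDs` as a hypothetical helper name:

```go
package main

import (
	"fmt"
	"sort"
)

// diffToolCallIDs compares the tool call IDs the chat is waiting on with
// the IDs a client submitted. Any missing or extra ID would map to a 400
// in the flow described above. Duplicate submissions are counted once.
func diffToolCallIDs(expected, submitted []string) (missing, extra []string) {
	want := make(map[string]bool, len(expected))
	for _, id := range expected {
		want[id] = true
	}
	got := make(map[string]bool, len(submitted))
	for _, id := range submitted {
		if !want[id] && !got[id] {
			extra = append(extra, id)
		}
		got[id] = true
	}
	for id := range want {
		if !got[id] {
			missing = append(missing, id)
		}
	}
	// Sort for stable error messages; map iteration order is random.
	sort.Strings(missing)
	sort.Strings(extra)
	return missing, extra
}

func main() {
	missing, extra := diffToolCallIDs(
		[]string{"call_1", "call_2"},
		[]string{"call_2", "call_3"},
	)
	fmt.Println("missing:", missing, "extra:", extra)
}
```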
### Idempotency
Tool call IDs scoped per LLM step. State machine (`requires_action` →
`pending`) is the guard. First POST wins, subsequent get 409.
### Mixed tool calls
When the LLM calls both built-in and dynamic tools in one step, built-in
tools execute immediately. Their results are persisted in phase 1.
Dynamic tool results arrive via POST in phase 2. The LLM sees all
results when the chatloop resumes.
</details>
> 🤖 Generated by Coder Agents
## Summary
Replaces N per-chat heartbeat goroutines with a single centralized
heartbeat loop that issues one `UPDATE` per 30s interval for all running
chats on a worker.
## Problem
Each running chat spawned a dedicated goroutine that issued an
individual `UPDATE chats SET heartbeat_at = NOW() WHERE id = $1 AND
worker_id = $2 AND status = 'running'` query every 30 seconds. At 10,000
concurrent chats this produces **~333 DB queries/second** just for
heartbeats, plus ~333 `ActivityBumpWorkspace` CTE queries/second from
`trackWorkspaceUsage`.
## Solution
New `UpdateChatHeartbeats` (plural) SQL query replaces the old singular
`UpdateChatHeartbeat`:
```sql
UPDATE chats
SET heartbeat_at = @now::timestamptz
WHERE worker_id = @worker_id::uuid
AND status = 'running'::chat_status
RETURNING id;
```
A single `heartbeatLoop` goroutine on the `Server`:
1. Ticks every `chatHeartbeatInterval` (30s)
2. Issues one batch UPDATE for all registered chats
3. Detects stolen/completed chats via set-difference (equivalent of old
`rows == 0`)
4. Calls `trackWorkspaceUsage` for surviving chats
`processChat` registers an entry in the heartbeat registry instead of
spawning a goroutine.
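The set-difference in step 3 can be sketched as below. String IDs stand in for UUIDs, and `lostChats` is a hypothetical name: any chat in the registry that the batch `UPDATE ... RETURNING id` did not return is no longer running on this worker.

```go
package main

import (
	"fmt"
	"sort"
)

// lostChats returns the registered chat IDs that are absent from updated,
// the IDs returned by the batch heartbeat query. This is the batch
// equivalent of the old per-chat rows == 0 check.
func lostChats(registered map[string]struct{}, updated []string) []string {
	alive := make(map[string]struct{}, len(updated))
	for _, id := range updated {
		alive[id] = struct{}{}
	}
	var lost []string
	for id := range registered {
		if _, ok := alive[id]; !ok {
			lost = append(lost, id)
		}
	}
	sort.Strings(lost) // stable order for logging and tests
	return lost
}

func main() {
	registered := map[string]struct{}{"chat-a": {}, "chat-b": {}, "chat-c": {}}
	fmt.Println(lostChats(registered, []string{"chat-a", "chat-c"}))
}
```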
## Impact
| Metric | Before (10K chats) | After (10K chats) |
|---|---|---|
| Heartbeat queries/sec | ~333 | ~0.03 (1 per 30s per replica) |
| Heartbeat goroutines | 10,000 | 1 |
| Self-interrupt detection | Per-chat `rows==0` | Batch set-difference |
---
> 🤖 Generated by Coder Agents
<details><summary>Implementation notes</summary>
- Uses `@now` parameter instead of `NOW()` so tests with `quartz.Mock`
can control timestamps.
- `heartbeatEntry` stores `context.CancelCauseFunc` + workspace state
for the centralized loop.
- `recoverStaleChats` is unaffected — it reads `heartbeat_at` which is
still updated.
- The old singular `UpdateChatHeartbeat` is removed entirely.
- `dbauthz` wrapper uses system-level `rbac.ResourceChat` authorization
(same pattern as `AcquireChats`).
</details>
> This PR was authored by Mux on behalf of Mike.
External MCP tools returned by `ConnectAll` were ordered by goroutine
completion, making the tool list nondeterministic across chat turns.
This broke prompt-cache stability since tools are serialized in order.
Sort tools by their model-visible name after all connections complete,
matching the existing pattern in workspace MCP tools
(`agent/x/agentmcp/manager.go`). Also guards against a nil-client panic
in cleanup when a connected server contributes zero tools after
filtering.
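A minimal sketch of the ordering fix, with a hypothetical `tool` type: collect results in whatever order the goroutines finish, then sort once by the model-visible name so the serialized tool list is identical on every turn.

```go
package main

import (
	"fmt"
	"sort"
)

// tool is an illustrative stand-in for an MCP tool; only the
// model-visible name matters for ordering.
type tool struct{ Name string }

// sortToolsByName orders tools in place by name, which makes the list
// independent of goroutine completion order.
func sortToolsByName(tools []tool) {
	sort.SliceStable(tools, func(i, j int) bool {
		return tools[i].Name < tools[j].Name
	})
}

func main() {
	// Arrival order as it might come out of concurrent connections.
	tools := []tool{{"github__search"}, {"docs__lookup"}, {"jira__create"}}
	sortToolsByName(tools)
	fmt.Println(tools)
}
```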
Needed by #23833
Adds a `chat_file_links` association table to track which files are
associated with each chat.
- `AppendChatFileIDs` query links a file to a chat with deduplication
- `GetChatFileMetadataByIDs` query returns lightweight file metadata by
IDs
- Tool-created files (e.g. `propose_plan`) are linked to the chat after
insert
- User-uploaded files are linked to the chat when the referencing
message is sent
- Single-chat GET endpoint hydrates `files: ChatFileMetadata[]` on the
response
> 🤖 Created by Coder Agents and massaged into shape by a human.
Fixes https://github.com/coder/internal/issues/1418
The `TestRun_ActiveToolsPrepareBehavior` test asserts
`persistedStep.Runtime > 0`, but on Windows the timer resolution (~15ms)
means the in-memory mock model can complete within the same clock tick,
producing a measured duration of `0s`.
Change the assertion from `require.Greater` to `require.GreaterOrEqual`
so that a legitimately measured zero duration on low-resolution clocks
does not cause a flake.
> Generated by Coder Agents
## Fix flaky TestAwaitSubagentCompletion/CompletesViaPubsub
Fixes coder/internal#1435
### Root Cause
During `createParentChildChats`, the processor publishes notifications
on `ChatStreamNotifyChannel(child.ID)` via PostgreSQL `LISTEN/NOTIFY`.
After `drainInflight()` returns, these stale notifications can still be
buffered in the pgListener's `NotifyChan()`. When
`awaitSubagentCompletion` subscribes and a stale notification is
dispatched between `setChatStatus(Waiting)` and
`insertAssistantMessage`, `checkSubagentCompletion` sees `done=true`
(status is `Waiting`) but returns an empty report because the message
hasn't been committed yet.
### Fix
Swap the order: insert the assistant message **before** transitioning
the status to `Waiting`. This guarantees the report is committed before
the status makes the chat appear complete to `checkSubagentCompletion`.
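The ordering argument can be illustrated with a small in-memory sketch. `chatState` and its methods are hypothetical stand-ins for the database-backed chat row and messages; the real code commits through the store, not a mutex-guarded struct.

```go
package main

import (
	"fmt"
	"sync"
)

// chatState is a hypothetical in-memory stand-in for a chat row plus its
// final assistant message.
type chatState struct {
	mu     sync.Mutex
	status string
	report string
}

func (c *chatState) insertAssistantMessage(report string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.report = report
}

func (c *chatState) setStatus(status string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.status = status
}

// checkCompletion mirrors the shape of the check described above: the chat
// counts as done once its status is "waiting", and the report is whatever
// has been stored at that moment.
func (c *chatState) checkCompletion() (done bool, report string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.status == "waiting", c.report
}

// complete writes the report first and flips the status last, so any
// observer that sees done == true also sees the report.
func (c *chatState) complete(report string) {
	c.insertAssistantMessage(report)
	c.setStatus("waiting")
}

func main() {
	c := &chatState{status: "running"}

	c.insertAssistantMessage("subagent finished")
	// A stale wake-up landing here is harmless: the chat is not done yet.
	done, report := c.checkCompletion()
	fmt.Println(done, report)

	c.setStatus("waiting")
	done, report = c.checkCompletion()
	fmt.Println(done, report)
}
```

With the original order (status first, message second), the first check could observe a done chat with an empty report, which is the failure mode described under Root Cause.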
### Verification
- 50 consecutive runs of the specific test: all pass
- 10 runs of the full `TestAwaitSubagentCompletion` suite: all pass
- 20 runs with `-race`: all pass
> Generated by Coder Agents
Adds a `system_prompt` field to `CreateChatRequest` that allows API
consumers to provide custom instructions when creating a chat. The
per-chat prompt is stored as a separate system message (`role=system`,
`visibility=model`) in the `chat_messages` table, inserted between the
deployment system prompt and the workspace awareness message.
Also moves deployment system prompt resolution from the HTTP handler
(`resolvedChatSystemPrompt`) into `chatd.CreateChat` where it belongs.
The handler no longer assembles system prompts —
`CreateOptions.SystemPrompt` is now purely the per-chat user prompt, and
the deployment prompt is resolved internally by chatd.
No database schema changes required.
**Message insertion order:**
1. Deployment system prompt (resolved by chatd, existing)
2. Per-chat user system prompt (new, from `CreateOptions.SystemPrompt`)
3. Workspace awareness (existing)
4. Initial user message (existing)
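
The insertion order above can be sketched as follows. The `message` type, its `Kind` labels, and the rule that empty prompts are skipped are all assumptions for illustration, not the actual `chatd.CreateChat` implementation.

```go
package main

import "fmt"

// message is a hypothetical, simplified chat message. Kind is an
// illustrative label, not a real column in chat_messages.
type message struct {
	Kind    string
	Content string
}

// initialMessages assembles the create-time messages in the documented
// order. Skipping empty prompts is an assumption of this sketch.
func initialMessages(deploymentPrompt, perChatPrompt, workspaceAwareness, userMessage string) []message {
	var msgs []message
	add := func(kind, content string) {
		if content != "" {
			msgs = append(msgs, message{Kind: kind, Content: content})
		}
	}
	add("deployment_system_prompt", deploymentPrompt)
	add("per_chat_system_prompt", perChatPrompt)
	add("workspace_awareness", workspaceAwareness)
	add("user", userMessage)
	return msgs
}

func main() {
	msgs := initialMessages(
		"You are a coding assistant.",
		"Answer in short bullet points.",
		"Workspace: dev-1",
		"Fix the flaky test.",
	)
	for i, m := range msgs {
		fmt.Printf("%d. %s\n", i+1, m.Kind)
	}
}
```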
🤖 Generated with [Coder Agents](https://coder.com/agents)
## Summary
Move `ConvertMessagesWithFiles` into the `g2` errgroup so prompt
conversion runs concurrently with instruction persistence, user prompt
resolution, MCP server connections, and workspace MCP tool discovery.
## Problem
In `runChat`, the setup before the first LLM `Stream()` call is
sequential across two errgroups:
```
g.Wait() // model + messages + MCP configs
ConvertMessagesWithFiles() // sequential; blocks g2 from starting
g2.Wait() // instructions + user prompt + MCP connect + workspace MCP
```
`ConvertMessagesWithFiles` can take non-trivial time on conversations
with file attachments (batch DB resolution), and it was blocking g2 from
starting.
## Fix
`ConvertMessagesWithFiles` only reads the `messages` slice (available
after `g.Wait()`) and resolves file references via the database. No g2
task reads or writes the `prompt` variable. This makes it safe to
overlap with g2:
```
g.Wait()
g2.Wait() // now includes ConvertMessagesWithFiles in parallel
```
The `InsertSystem` call for parent chats and the `promptErr` check are
deferred to after `g2.Wait()`, preserving correctness.
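
A minimal sketch of the pattern, using `sync.WaitGroup` from the standard library in place of the `errgroup` the real code uses, so the example stays self-contained. `convertMessages` and `setup` are hypothetical names; the point is that `prompt` and `promptErr` are written by one goroutine and read only after the wait.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// convertMessages is a hypothetical stand-in for ConvertMessagesWithFiles.
// It only reads its input slice.
func convertMessages(messages []string) ([]string, error) {
	if len(messages) == 0 {
		return nil, errors.New("no messages to convert")
	}
	out := make([]string, len(messages))
	for i, m := range messages {
		out[i] = "prompt:" + m
	}
	return out, nil
}

// setup runs prompt conversion alongside the other setup tasks. prompt and
// promptErr are written by exactly one goroutine and read only after Wait,
// so there is no data race.
func setup(messages []string, otherTasks ...func()) ([]string, error) {
	var (
		wg        sync.WaitGroup
		prompt    []string
		promptErr error
	)
	wg.Add(1)
	go func() {
		defer wg.Done()
		prompt, promptErr = convertMessages(messages)
	}()
	for _, task := range otherTasks {
		wg.Add(1)
		go func(task func()) {
			defer wg.Done()
			task()
		}(task)
	}
	wg.Wait()
	// Check the conversion error before anything uses prompt.
	if promptErr != nil {
		return nil, promptErr
	}
	return prompt, nil
}

func main() {
	prompt, err := setup([]string{"hello", "world"}, func() {}, func() {})
	fmt.Println(prompt, err)
}
```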
<details><summary>Decision log</summary>
- `ConvertMessagesWithFiles` is read-only on `messages` — no mutation,
safe for concurrent access
- `prompt` and `promptErr` are written only by the conversion goroutine,
read only after `g2.Wait()` — no data race
- Error from prompt conversion is checked immediately after `g2.Wait()`,
before any code that uses `prompt`
- `chatloop.Run` now uses `:=` instead of `=` since the prior `err`
declaration from `prompt, err :=` was removed
</details>
> Generated by Coder Agents
Piggybacks on #23878. Moves instruction file reading and skill discovery
from `chatd` (server-side, via multiple `LS`/`ReadFile` round-trips
through the agent connection) to the agent itself (local filesystem
access).
This intentionally drops backward compatibility with older agents that
don't support the context-config endpoint. Agents and server are
deployed together; there is no rolling-update contract to maintain here.
## What changed
The agent's `GET /api/v0/context-config` response now returns
`[]ChatMessagePart` directly — the same types chatd persists. This
eliminates intermediate type conversions and makes the protocol
extensible.
| Field | Type | Description |
|---|---|---|
| `parts` | `[]ChatMessagePart` | Context-file and skill parts, ready to
persist |
| `working_dir` | `string` | Agent's resolved working directory |
Removed from the response: `instructions_dirs`, `instructions_file`,
`skills_dirs`, `skill_meta_file`, `mcp_config_files` — the agent reads
files locally and returns their content as parts.
Removed from chatd: all legacy `LS`/`ReadFile` fallback code
(`readHomeInstructionFile`, `readInstructionDirFile`, `DiscoverSkills`
via LS, etc).
## Why
The previous architecture had the agent resolve paths and serve them over
HTTP, after which `chatd` made N+1 round-trips back through the agent
connection to read the files. The agent has direct filesystem access and
should just read the files.
## Key design decisions
- **Agent returns `ChatMessagePart` directly** — same types chatd
persists. No intermediate `InstructionFileEntry`/`SkillEntry` types
needed.
- **`SkillMeta.MetaFile`** — persisted via `ContextFileSkillMetaFile` on
the skill part, so custom meta file names
(`CODER_AGENT_EXP_SKILL_META_FILE`) survive across chat turns.
- **No pre-read body** — `read_skill` always dials the workspace to
fetch the skill body on demand. Simpler than caching the body in the
response.
- **MCP config paths kept agent-internal** — `MCPConfigFiles()` getter,
not sent over the wire.
- **No backward compat fallback** — old agents that don't support
context-config get no instruction files. This is acceptable since agent
and server deploy together.
This PR introduces screen recording of the computer use agent using the
virtual desktop.
- Screen recording is triggered by a `wait_agent` tool call. Recording
is stopped by a successful `wait_agent` tool call or when there hasn't
been any desktop activity for 10 minutes.
- Recordings are handled by the `portabledesktop` CLI via the `record`
command. The videos are sped up in periods of inactivity.
- Recordings are saved to the `chat_files` table in the database.
There's a hard limit of 100MB per recording. Larger recordings are
dropped.
- A successful `wait_agent` tool call on a computer use subagent returns
a `recording_file_id`, which the frontend can later use to display the
corresponding video.
`isContextLimitKey` had a fallback heuristic that matched any key starting with `"max"` containing `"context"`, causing false positives on keys like `"max_context_version"`. A provider returning such metadata would have the value parsed as a context limit.
Replace substring matching on the separator-stripped key with word-level matching. A new `metadataKeyWords` function tokenizes keys by splitting on separators and camelCase boundaries. The fallback then requires `"context"` paired with a limit-related word (`"limit"`, `"window"` + qualifier, `"length"` + qualifier, or `"tokens"` + qualifier). Known exact forms like `"context_window"` remain in the fast-path switch.
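
A rough sketch of word-level matching, not the actual `metadataKeyWords` or `isContextLimitKey` code. `keyWords` and `looksLikeContextLimit` are hypothetical names, and treating `"max"` as the qualifier is an assumption, since the description does not say which qualifiers are accepted.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// keyWords splits a metadata key into lowercase words, breaking on any
// non-alphanumeric separator and on lower-to-upper camelCase boundaries.
func keyWords(key string) []string {
	var (
		words []string
		cur   []rune
		prev  rune
	)
	flush := func() {
		if len(cur) > 0 {
			words = append(words, strings.ToLower(string(cur)))
			cur = cur[:0]
		}
	}
	for _, r := range key {
		switch {
		case !unicode.IsLetter(r) && !unicode.IsDigit(r):
			flush() // separator such as '_', '-', '.', or ' '
		case unicode.IsUpper(r) && unicode.IsLower(prev):
			flush() // camelCase boundary
			cur = append(cur, r)
		default:
			cur = append(cur, r)
		}
		prev = r
	}
	flush()
	return words
}

// looksLikeContextLimit requires the word "context" plus a limit-related
// word. Using "max" as the qualifier is an assumption of this sketch.
func looksLikeContextLimit(key string) bool {
	has := map[string]bool{}
	for _, w := range keyWords(key) {
		has[w] = true
	}
	if !has["context"] {
		return false
	}
	if has["limit"] {
		return true
	}
	return has["max"] && (has["window"] || has["length"] || has["tokens"])
}

func main() {
	for _, k := range []string{"max_context_version", "maxContextTokens", "context-limit"} {
		fmt.Println(k, keyWords(k), looksLikeContextLimit(k))
	}
}
```

Matching on whole words is what rules out `"max_context_version"`: it contains `"max"` and `"context"`, but no limit-related word.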
Closes https://github.com/coder/coder/issues/23332
- Extend `TestChatTemplateAllowlistEnforcement` to also exercise
`read_template` and `create_workspace` through the allowlist
- Mock LLM now chains 4 tool calls: list_templates, read_template
(blocked), read_template (allowed), create_workspace (blocked)
- Wire dummy `CreateWorkspace` config into test server so the tool
reaches the allowlist check
- Generalize tool result collection to support multiple calls per tool
name
> 🤖 Created by Coder Agents and reviewed by Kyle the human.
- Change `errChatHasNoWorkspaceAgent` message from cryptic `"chat has no
workspace agent"` to actionable `"workspace has no running agent: the
workspace may be stopped. Use the start_workspace tool to start it, or
create_workspace to create a new one"`
- Update test assertions to match the new message substring
> 🤖 Written by a Coder Agent. Reviewed by a human.
- stabilize `TestAwaitSubagentCompletion/CompletesViaPubsub` by waiting
for durable completion state before sending the synthetic pubsub wake
- add coverage for successful subagent completion with an empty report
> 🤖 Written by a Coder Agent. Reviewed by a human.
Previously, `CreateChat` inserted the `chats` row with the DB default
status (`waiting`), then updated it to `pending` in the same transaction
via `setChatPendingWithStore`. This wasted two extra queries per chat
creation (`GetChatByID` + `UpdateChatStatus`) and rewrote the same row
immediately after inserting it.
Now `CreateChat` passes the status directly to `InsertChat`, so the row
is written once in its final create-time state. The
`setChatPendingWithStore` helper is removed entirely. `InsertChat` now
requires an explicit `status` parameter at all callsites instead of
relying on a DB column default.
## Motivation
On an experimental branch we're trialing firing all chatd notifications
from plpgsql triggers. The old two-step insert made that awkward: in an
`AFTER INSERT` trigger, `NEW` only contained the insert-time row
(`waiting`), not the final committed state (`pending`). To emit the
correct event payload the trigger had to be deferred and re-read the row
from `chats` at commit time.
With this change, `NEW` already contains the correct row to publish — no
deferred trigger, no extra `SELECT`, simpler and cheaper trigger logic.
That said, this seems like a worthwhile change regardless of the trigger
experiment: writing the final row state once removes unnecessary DB work
on every chat creation and makes the create path easier to reason about.