## Summary
- add an `IS DISTINCT FROM` guard to `InsertChatMessage`'s
`updated_chat` CTE so `chats.last_model_config_id` is only rewritten
when the incoming `model_config_id` actually changes
- regenerate the query layer
- add focused regression coverage for the two meaningful behaviors:
same-model inserts and real model switches
- trim redundant message-field assertions so the new test stays focused
on the guard behavior
## Proof this is an improvement
This PR reduces work in the hottest chat write query without changing
the insert behavior.
### Why the old query did unnecessary work
Before this change, `InsertChatMessage` always ran this update whenever
`model_config_id` was non-null:
```sql
UPDATE chats
SET last_model_config_id = sqlc.narg('model_config_id')::uuid
WHERE id = @chat_id::uuid
AND sqlc.narg('model_config_id')::uuid IS NOT NULL
```
That means the query rewrote the `chats` row even when
`chats.last_model_config_id` was already equal to the incoming value.
### What changes in this PR
This PR adds:
```sql
AND chats.last_model_config_id IS DISTINCT FROM sqlc.narg('model_config_id')::uuid
```
So same-model inserts still insert the message, but they no longer
perform a redundant `UPDATE chats`.
### Why this matters on the hot path
From the chat scaletest investigation that motivated this change:
- `InsertChatMessage` (+ `updated_chat` CTE) was the hottest write query
- about **104k calls**
- about **0.69 ms average latency**
- about **71.8 s total DB execution time**
We also verified common callsites where the update is provably
redundant:
- `CreateChat` inserts the chat with `LastModelConfigID =
opts.ModelConfigID`, then immediately inserts initial system/user
messages with that same model config
- follow-up user messages commonly pass `lockedChat.LastModelConfigID`
straight into `InsertChatMessage`
- assistant/tool/summary persistence keeps the current model in the
common case; only real switches or fallback cases need the chat row
update
That means a meaningful fraction of executions of the hottest DB write
query move from:
- **before:** insert message **+** rewrite chat row
- **after:** insert message only
This should reduce row churn and write contention on `chats`, especially
against other chat-row writers like `UpdateChatStatus` and
`GetChatByIDForUpdate`.
Adds the `head_branch` field (the source/feature branch name of a PR) to
the diff status pipeline. Previously only `base_branch` (target branch)
and the head commit SHA were captured from the GitHub API, but not the
head branch name itself.
## Changes
- **Migration 438**: Add `head_branch` nullable TEXT column to
`chat_diff_statuses`
- **gitprovider**: Parse `head.ref` from the GitHub API response
(alongside `head.sha`) and add `HeadBranch` to `PRStatus`
- **gitsync**: Wire `HeadBranch` through `refreshOne()` into the DB
upsert params
- **worker**: Map `HeadBranch` in `chatDiffStatusFromRow()`
- **coderd**: Convert `HeadBranch` in `convertChatDiffStatus()`
- **codersdk**: Expose as `head_branch` (`*string`, omitempty) in
`ChatDiffStatus` API response
- **Tests**: Updated `github_test.go` pull JSON fixtures and assertions
Surfaces cache token data in the analytics views and fixes table
spacing.
### Changes
- **Cache token columns**: Added cache read and cache write token counts
to all analytics views (user and admin), from SQL queries through Go SDK
types to the frontend tables and summary cards.
- **Table spacing fix**: Replaced the bare React fragment in
`ChatCostSummaryView` with a `space-y-6` container so the model and chat
breakdown tables no longer overlap.
### Data flow
`chat_messages` table already stores `cache_read_tokens` and
`cache_creation_tokens` (and uses them for cost calculation). This PR
aggregates and displays them alongside input/output tokens in:
- Summary cards (6 cards: Total Cost, Input, Output, Cache Read, Cache
Write, Messages)
- Per-model breakdown table
- Per-chat breakdown table
- Admin per-user table
Implement the backend for the desktop feature for agents.
- Adds a new `/api/experimental/chats/$id/desktop` endpoint to coderd
which exposes a VNC stream from a
[portabledesktop](https://github.com/coder/portabledesktop) process
running inside the workspace
- Adds a new `spawn_computer_use_agent` tool to chatd, which spawns a
subagent that has access to the `computer` tool which lets it interact
with the `portabledesktop` process running inside the workspace
- Adds the plumbing to make the above possible
There's a follow up frontend PR here:
https://github.com/coder/coder/pull/23006
Add cost tracking for LLM chat interactions with microdollar precision.
## Changes
- Add `chatcost` package for per-message cost calculation using
`shopspring/decimal` for intermediate arithmetic
- **Ceil rounding policy**: fractional micros round UP to next whole
micro (applied once after summing all components)
- Database migration: `total_cost_micros` BIGINT column with historical
backfill and `created_at` index
- API endpoints: per-user cost summary and admin rollup under
`/api/experimental/chats/cost/`
- SDK types: `ChatCostSummary`, `ChatCostModelBreakdown`,
`ChatCostUserRollup`
- Fix `modeloptionsgen` to handle `decimal.Decimal` as opaque numeric
type
- Update frontend pricing test fixtures for string decimal types
## Design decisions
- `NULL` = unpriced (no matching model config), `0` = free
- Reasoning tokens included in output tokens (no double-counting)
- Integer microdollars (BIGINT) for storage and API responses
- Price config uses `decimal.Decimal` for exact parsing; totals use
`int64`
Frontend: #23037
Migration 000434 converts chat_messages.role from text to a Postgres
enum, rebuilds the partial index, and adds content_version smallint.
The column is backfilled with DEFAULT 0, then the default is dropped
so future inserts must set it explicitly.
Version 0 uses the role-aware heuristic from #22958. Version 1 (all
new inserts) stores []ChatMessagePart JSON for all roles, including
system messages. ParseContent takes database.ChatMessage directly
and dispatches on version internally. Unknown versions error.
All string(codersdk.ChatMessageRole*) casts at DB write sites are
replaced with database.ChatMessageRole* constants from sqlc.
Refs #22958
## Summary
Scale-tested the `chatd` package with mock-based benchmarks to identify
performance bottlenecks. This PR fixes 6 of the 8 identified issues,
ranked by severity.
## Changes
### 1. Parallel tool execution (HIGH) — `chatloop.go`
`executeTools` ran tool calls sequentially. Now dispatches all calls
concurrently via goroutines with `sync.WaitGroup`. Results are
pre-allocated by index (no mutex needed). `onResult` callbacks fire as
each tool completes.
### 2. Pubsub-backed subagent await (HIGH) — `subagent.go`
`awaitSubagentCompletion` polled the DB every 200ms. Now subscribes to
the child chat's `ChatStreamNotifyChannel` via pubsub for near-instant
notifications. Fallback poll reduced to 5s. Falls back to 200ms only
when `pubsub == nil` (single-instance / in-memory).
### 3. Per-chat stream locking (MEDIUM) — `chatd.go`
Replaced single global `streamMu` + `map[uuid.UUID]*chatStreamState`
with `sync.Map` where each `chatStreamState` has its own `sync.Mutex`.
Zero cross-chat contention.
### 4. Batch chat acquisition (MEDIUM) — `chatd.go`
`processOnce` acquired 1 chat per tick. Now loops up to
`maxChatsPerAcquire = 10` per tick, avoiding idle time when many chats
are pending.
### 5. Reduced heartbeat frequency (LOW-MEDIUM) — `chatd.go`
`chatHeartbeatInterval` changed from 30s to 60s. Safe given the 5-minute
`DefaultInFlightChatStaleAfter`.
### 6. O(depth) descendant check (LOW) — `subagent.go`
Replaced top-down BFS (`O(total_descendants)` queries) with bottom-up
parent-chain walk (`O(depth)` queries). Includes cycle protection.
## Not addressed (intentionally)
- Message serialization overhead
- Buffer eviction (`buffer[1:]` pattern)
Adds `pull_request_title` and `pull_request_draft` to the chat diff
status pipeline (DB → provider → SDK → frontend). The GitHub provider
now fetches the PR title alongside existing status fields.
The agents sidebar now displays PR-state-aware icons for chats that have
a linked pull request (when the chat is in waiting/completed state):
- **Open PR**: `GitPullRequestArrow` (green)
- **Draft PR**: `GitPullRequestDraft` (gray)
- **Merged PR**: `GitMerge` (purple)
- **Closed PR**: `GitPullRequestClosed` (red)
Running/pending/paused/error chats keep their existing activity icons
(spinner, pause, error triangle).
### Changes
**Database migration** (`000432`): Adds `pull_request_title TEXT` and
`pull_request_draft BOOLEAN` columns to `chat_diff_statuses`.
**Backend pipeline**:
- `gitprovider.PRStatus` gains a `Title` field
- GitHub provider decodes the `title` from the API response
- `gitsync` and `coderd/chats.go` pass title + draft through to the DB
upsert
- `codersdk.ChatDiffStatus` exposes both new fields in the API response
**Frontend** (`AgentsSidebar.tsx`): New `getPRIconConfig()` function
resolves the appropriate Lucide git icon based on `pull_request_state`
and `pull_request_draft`. Only applies when the chat is in a terminal
state (waiting/completed).
**Real-time sync**: No changes needed — the existing
`diff_status_change` pubsub event already propagates the full
`ChatDiffStatus` including the new fields.
Adds a `created_by` column (nullable UUID) to the `chat_messages` table
to track which user created each message. Only user-sent messages
populate this field; assistant, tool, system, and summary messages leave
it null.
The column is threaded through the full stack: SQL migration, query
updates, generated Go/TypeScript types, db2sdk conversion, chatd
(including subagent paths), and API handlers. All API handlers that
insert user messages now pass the authenticated user's ID as
`created_by`.
No foreign key constraint was added, matching the existing pattern used
by `chat_model_configs.created_by`.
- Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_`
- Extracts and refactors existing GitHub PR sync logic to new packages
`coderd/gitsync` and `coderd/externalauth/gitprovider`
- Associated wiring and tests
Created using Opus 4.6
Instead of the static 'Agent has finished running.' text, extract a
summary from the last assistant message to give users meaningful context
about what the agent accomplished. Falls back to the static text if no
suitable message is found.
Co-authored-by: Kyle Carberry <kyle@carberry.com>
Adds offset and cursor-based pagination to the `GET
/api/experimental/chats` endpoint, following the exact same patterns
used by `GetUsers` and `GetTemplateVersionsByTemplateID`.
## Changes
### Database
- Add `after_id`, `offset_opt`, `limit_opt` params to
`GetChatsByOwnerID` SQL query
- Use composite `(updated_at, id) DESC` cursor for stable, deterministic
pagination
- Add migration with composite index on `chats (owner_id, updated_at
DESC, id DESC)`
### Backend
- Use `ParsePagination()` in `listChats` handler (matches `users.go`
pattern)
- Add `Pagination` field to `ListChatsOptions` SDK struct
### Frontend
- Add `infiniteChats()` query factory using `useInfiniteQuery` with
offset-based page params (same pattern as `infiniteWorkspaceBuilds`)
- Update `AgentsPage` to use `useInfiniteQuery`
- Add "Show more" button at the bottom of the agents sidebar (matches
`HistorySidebar` pattern)
- Keep existing `chats()` query for non-paginated uses (e.g., parent
chat lookup in `AgentDetail`)
### Tests
- Add `TestListChats/Pagination` covering `limit`, `after_id` cursor,
`offset`, and no-limit behavior
## Bug
After compaction in the chat loop, the loop re-enters and calls the LLM
with a prompt that has **no non-system messages**. Anthropic (and most
providers) require at least one user/assistant/tool message, so the API
errors with empty messages.
## Root Cause
The compaction summary was stored as `role=system`. After compaction,
`GetChatMessagesForPromptByChatID` returns only:
- The compressed system summary (matched by the CTE)
- Original non-compressed system messages (system prompts)
All original user/assistant/tool messages are excluded (they predate the
summary). The compaction assistant/tool messages are `compressed=TRUE`
and don't match the main query's `compressed=FALSE` clauses.
So `ReloadMessages` returned only system messages. The Anthropic
provider moves system messages into a separate `system` field, leaving
the `messages` API field as `[]`.
## Fix
1. **Changed compaction summary from `role=system` to `role=user`** —
the summary now appears as a user message in the reloaded prompt, giving
the model valid conversational context to respond to.
2. **Simplified the CTE** — removed the `role = 'system'` check and
narrowed `visibility IN ('model', 'both')` to just `visibility =
'model'`. The summary is the only compressed message with
`visibility=model` (the assistant has `visibility=user`, the tool has
`visibility=both`), so the role check was redundant.
## Test
`PostRunCompactionReEntryIncludesUserSummary`: verifies the re-entry
prompt contains a user message (the compaction summary) after compaction
+ reload.
Despite the SDK type having an `Archived` field for chats, this data was
never fetched from the database — the `GetChatsByOwnerID` query
hardcoded `AND archived = false`, and the `convertChat` function never
mapped the field.
This PR adds an optional `archived` query parameter to `GET
/api/experimental/chats`:
| Value | Behavior |
|-------|----------|
| *(not provided)* | Returns all chats (active and archived) |
| `archived=false` | Returns only non-archived chats |
| `archived=true` | Returns only archived chats |
This follows the same pattern used by template versions
(`sqlc.narg('archived')` nullable boolean).
Also fixes `convertChat` to populate the `Archived` field in API
responses, which was never being set despite existing on the SDK type.
## Problem
The pubsub notification handler in `chatd` re-fetched **all** messages
from the DB on every new message notification, then filtered in Go with
`msg.ID > lastMessageID`. This grows linearly with conversation length —
every new message triggers a full table scan of that chat's history.
The `AfterMessageID` field in the pubsub notification payload was
clearly designed for cursor-based fetching, but no matching query
existed.
## Fix
- Add `GetChatMessagesByChatIDAfter` SQL query with `WHERE id >
@after_id`, so the database does the filtering instead of Go.
- Use it in the pubsub notification handler in `chatd.go`, passing
`lastMessageID` as the cursor.
- Implement the dbauthz wrapper (was a `panic("not implemented")` stub
from codegen) with the same read-check-on-parent-chat pattern as
adjacent methods.
- Add dbauthz test coverage for the new method.
**Not changed:** The initial snapshot in `Subscribe()` still loads all
messages — that's correct, since a newly-connecting client needs the
full conversation state. The waste was only in the ongoing notification
path.
## Problem
When archiving an agent with subagents, the children briefly flash in
the sidebar as root-level items before disappearing. Two issues:
1. **Backend:** Archive used N+1 queries — a recursive DFS
(`archiveChatTree`, no transaction) or BFS loop (`chatd.ArchiveChat`,
N+1 queries in a tx) to walk the tree and archive each chat
individually.
2. **Frontend:** The SSE `deleted` event handler only filtered out the
parent chat from the cache. Children remained briefly, got promoted to
root-level by `buildChatTree`, then disappeared on the next re-fetch.
## Fix
**Backend:** Replace both tree-walk implementations with a single SQL
query:
```sql
UPDATE chats SET archived = true, updated_at = NOW()
WHERE id = @id OR root_chat_id = @id;
```
This leverages the existing `root_chat_id` column (already indexed) to
archive the entire tree atomically.
**Frontend:** When a `deleted` event arrives, also filter out any chats
whose `root_chat_id` matches the deleted chat, so children vanish from
the sidebar immediately with the parent.
## Changes
- `coderd/database/queries/chats.sql` — Added `ArchiveChatTreeByID`
query
- `coderd/chats.go` — Use single query, delete `archiveChatTree`
function
- `coderd/chatd/chatd.go` — Simplify `ArchiveChat` to use single query
- `coderd/database/dbauthz/dbauthz.go` — Auth wrapper for new query
- `coderd/chats_test.go` — Added `TestArchiveChat/ArchivesChildren`
subtest
- `site/src/pages/AgentsPage/AgentsPage.tsx` — Filter children in SSE
handler
- Generated files updated via `make gen`
## Summary
Remove the `workspace_agent_id` column from the `chats` table and
dynamically look up the first workspace agent instead.
## Problem
When a workspace is stopped and restarted, the workspace agent gets a
new ID. The `workspace_agent_id` stored on the chat at creation time
becomes stale, making the agent unreachable. This caused chats to break
after workspace restarts.
## Solution
Instead of persisting the agent ID, dynamically look up the first agent
from the workspace's latest build via
`GetWorkspaceAgentsInLatestBuildByWorkspaceID` whenever an agent
connection is needed. The `workspace_id` on the chat remains stable
across restarts.
This behavior may be refined later (e.g., agent selection heuristics),
but picking the first agent resolves the immediate breakage.
## Changes
- **Migration 000425**: Drop `workspace_agent_id` column from `chats`
- **SQL queries**: Remove `workspace_agent_id` from `InsertChat` and
`UpdateChatWorkspace`
- **chatd.go**: `getWorkspaceConn` and `resolveInstructions` now look up
agents dynamically from workspace ID
- **chatd.go**: Remove `refreshChatWorkspaceSnapshot` (no longer needed)
- **createworkspace.go**: Stop persisting agent ID when associating
workspace with chat
- **subagent.go**: Stop passing agent ID to child chats
- **SDK/frontend**: Remove `WorkspaceAgentID` / `workspace_agent_id`
from Chat type
---------
Co-authored-by: Kyle Carberry <kylecarbs@gmail.com>
Adds a nullable `last_error` column to the `chats` table so error
reasons survive page reloads.
**Backend:**
- Migration adds `last_error TEXT` (nullable) to chats
- `UpdateChatStatus` writes the error reason when status transitions to
`error`, clears it (NULL) on recovery
- `convertChat` maps `sql.NullString` to `*string` in the SDK
**Frontend:**
- Sidebar falls back to `chat.last_error` when no stream error reason is
cached
- Chat detail page does the same for `persistedErrorReason`
- Fixtures updated for new required field
## Summary
The UI has always labeled the action as "Archive agent" but the backend
was performing a hard `DELETE`, permanently destroying chats and all
their messages.
This change replaces the hard delete with a soft archive, consistent
with the pattern used by template versions.
## Changes
### Database
- **Migration 000423**: Add `archived boolean DEFAULT false NOT NULL`
column to `chats` table
- Replace `DeleteChatByID` query with `ArchiveChatByID` (`UPDATE SET
archived = true`)
- Add `UnarchiveChatByID` query (`UPDATE SET archived = false`)
- Filter archived chats from `GetChatsByOwnerID` (`WHERE archived =
false`)
### API
- Remove `DELETE /api/experimental/chats/{chat}`
- Add `POST /api/experimental/chats/{chat}/archive` — archives a chat
and all its descendants
- Add `POST /api/experimental/chats/{chat}/unarchive` — unarchives a
single chat (API only, no UI yet)
### Backend
- `archiveChatTree()` recursively archives child chats (replaces
`deleteChatTree()` which hard-deleted)
- Chat daemon's `ArchiveChat()` archives the full chat tree in a
transaction
- Authorization uses `ActionUpdate` instead of `ActionDelete`
### SDK
- Replace `DeleteChat()` with `ArchiveChat()` and `UnarchiveChat()`
- Add `Archived` field to `Chat` struct
### Frontend
- `archiveChat` API call uses `POST .../archive` instead of `DELETE`
- No UI changes — the "Archive agent" button now actually archives
instead of deleting
## Design Decision
This follows the **template version archive pattern** (Pattern B in the
codebase):
- `archived boolean` column (not `deleted boolean`)
- Dedicated `POST .../archive` and `POST .../unarchive` routes (not
repurposing `DELETE`)
- Reversible — users can unarchive via the API (UI for this will come
later)