## Problem
The chat listing endpoint (`GetChatsByOwnerID`) was using
`fetchWithPostFilter`, which fetches N rows from the database and then
filters them in Go memory using RBAC checks. This causes a pagination
bug: if the user requests `limit=25` but some rows fail the auth check,
fewer than 25 rows are returned even though more authorized rows exist
in the database. The client may incorrectly assume it has reached the
end of the list.
## Solution
Switch to the same pattern used by `GetWorkspaces`, `GetTemplates`, and
`GetUsers`: `prepareSQLFilter` + `GetAuthorized*` variant. The RBAC
filter is compiled to a SQL WHERE clause and injected into the query
before `ORDER BY`/`LIMIT`, so the database returns exactly the requested
number of authorized rows.
Additionally, `GetChatsByOwnerID` is renamed to `GetChats` with
`OwnerID` as an optional (nullable) filter parameter, matching the
`GetWorkspaces` naming convention.
## Changes
| File | Change |
|------|--------|
| `queries/chats.sql` | Renamed to `GetChats`, `owner_id` now optional
via CASE/NULL, added `-- @authorize_filter` |
| `queries.sql.go` | Renamed constant, params struct (`GetChatsParams`),
and method |
| `querier.go` | Interface method renamed |
| `modelqueries.go` | Added `chatQuerier` interface +
`GetAuthorizedChats` impl |
| `dbauthz/dbauthz.go` | `GetChats` now uses `prepareSQLFilter` instead
of `fetchWithPostFilter` |
| `dbauthz/dbauthz_test.go` | Updated tests for SQL filter pattern |
| `dbmock/dbmock.go` | Renamed + added mock for `GetAuthorizedChats` |
| `dbmetrics/querymetrics.go` | Renamed + added metrics wrapper |
| `rbac/regosql/configs.go` | Added `ChatConverter` (maps `org_owner` to
empty string literal since `chats` has no `organization_id` column) |
| `rbac/authz.go` | Added `ConfigChats()` |
| `chats.go` | Handler uses renamed method with `uuid.NullUUID` |
| `searchquery/search.go` | Updated return type |
| `gitsync/worker.go` | Updated interface and call site |
| Various test files | Updated for renamed types |
Creates a new table `ai_seat_state` to keep track of when users consume an ai_seat. Once a user consumes an AI seat, they will forever in this table (as it stands today).
Adds cursor-based pagination to the chat messages endpoint.
## Backend
- New `GetChatMessagesByChatIDPaginated` SQL query: returns messages in
`id DESC` order with a `before_id` keyset cursor and configurable
`limit`
- Handler parses `?before_id=N&limit=N` query params, uses the `LIMIT
N+1` trick to set `has_more` without a separate COUNT query
- Queued messages only returned on the first page (no cursor) since
they're always the most recent
- SDK client updated with `ChatMessagesPaginationOptions`
- Fully backward compatible: omitting params returns the 50 newest
messages
## Frontend
- Switches `getChatMessages` from `useQuery` to `useInfiniteQuery` with
cursor chaining via `getNextPageParam`
- Pages flattened and sorted by `id` ascending for chronological display
- `MessagesPaginationSentinel` component uses `IntersectionObserver`
(200px rootMargin prefetch) inside the existing `flex-col-reverse`
scroll container
- `flex-col-reverse` handles scroll anchoring natively when older
messages are prepended — no manual `scrollTop` adjustment needed (same
pattern as coder/blink)
## Why cursor-based instead of offset/limit
Offset-based pagination breaks when new messages arrive while paginating
backward (offsets shift, causing duplicates or missed messages). The
`before_id` cursor is stable regardless of inserts — each page is
deterministic.
## Problem
The sidebar diff status (PR icon, +additions/-deletions, file count) was
not updating in real-time. Users had to reload the page to see changes.
Two root causes:
1. **Frontend**: The `diff_status_change` WebSocket handler in
`AgentsPage.tsx` had an early `return` (line 398) that skipped
`updateInfiniteChatsCache`, so the sidebar's cache was never updated.
Even for other event types, the cache merge only spread `status` and
`title` — never `diff_status`.
2. **Server**: `publishChatPubsubEvent` in `chatd.go` constructed a
minimal `Chat` payload without `DiffStatus`, so even if the frontend
consumed the event, `updatedChat.diff_status` would be `undefined`.
## Fix
### Server (`coderd/chatd/chatd.go`)
- `publishChatPubsubEvent` now accepts an optional
`*codersdk.ChatDiffStatus` parameter; when non-nil it's set on the
outgoing `Chat` payload.
- `PublishDiffStatusChange` fetches the diff status from the DB,
converts it, and passes it through.
- Added `convertDBChatDiffStatus` (mirrors `coderd/chats.go`'s converter
to avoid circular import).
- All other callers pass `nil`.
### Frontend (`site/src/pages/AgentsPage/AgentsPage.tsx`)
- Removed the early `return` so `diff_status_change` events fall through
to the cache update logic.
- Added `isDiffStatusEvent` flag and spread `diff_status` into both the
infinite chats cache (sidebar) and the individual chat cache.
## Summary
- add an `IS DISTINCT FROM` guard to `InsertChatMessage`'s
`updated_chat` CTE so `chats.last_model_config_id` is only rewritten
when the incoming `model_config_id` actually changes
- regenerate the query layer
- add focused regression coverage for the two meaningful behaviors:
same-model inserts and real model switches
- trim redundant message-field assertions so the new test stays focused
on the guard behavior
## Proof this is an improvement
This PR reduces work in the hottest chat write query without changing
the insert behavior.
### Why the old query did unnecessary work
Before this change, `InsertChatMessage` always ran this update whenever
`model_config_id` was non-null:
```sql
UPDATE chats
SET last_model_config_id = sqlc.narg('model_config_id')::uuid
WHERE id = @chat_id::uuid
AND sqlc.narg('model_config_id')::uuid IS NOT NULL
```
That means the query rewrote the `chats` row even when
`chats.last_model_config_id` was already equal to the incoming value.
### What changes in this PR
This PR adds:
```sql
AND chats.last_model_config_id IS DISTINCT FROM sqlc.narg('model_config_id')::uuid
```
So same-model inserts still insert the message, but they no longer
perform a redundant `UPDATE chats`.
### Why this matters on the hot path
From the chat scaletest investigation that motivated this change:
- `InsertChatMessage` (+ `updated_chat` CTE) was the hottest write query
- about **104k calls**
- about **0.69 ms average latency**
- about **71.8 s total DB execution time**
We also verified common callsites where the update is provably
redundant:
- `CreateChat` inserts the chat with `LastModelConfigID =
opts.ModelConfigID`, then immediately inserts initial system/user
messages with that same model config
- follow-up user messages commonly pass `lockedChat.LastModelConfigID`
straight into `InsertChatMessage`
- assistant/tool/summary persistence keeps the current model in the
common case; only real switches or fallback cases need the chat row
update
That means a meaningful fraction of executions of the hottest DB write
query move from:
- **before:** insert message **+** rewrite chat row
- **after:** insert message only
This should reduce row churn and write contention on `chats`, especially
against other chat-row writers like `UpdateChatStatus` and
`GetChatByIDForUpdate`.
Adds the `head_branch` field (the source/feature branch name of a PR) to
the diff status pipeline. Previously only `base_branch` (target branch)
and the head commit SHA were captured from the GitHub API, but not the
head branch name itself.
## Changes
- **Migration 438**: Add `head_branch` nullable TEXT column to
`chat_diff_statuses`
- **gitprovider**: Parse `head.ref` from the GitHub API response
(alongside `head.sha`) and add `HeadBranch` to `PRStatus`
- **gitsync**: Wire `HeadBranch` through `refreshOne()` into the DB
upsert params
- **worker**: Map `HeadBranch` in `chatDiffStatusFromRow()`
- **coderd**: Convert `HeadBranch` in `convertChatDiffStatus()`
- **codersdk**: Expose as `head_branch` (`*string`, omitempty) in
`ChatDiffStatus` API response
- **Tests**: Updated `github_test.go` pull JSON fixtures and assertions
Surfaces cache token data in the analytics views and fixes table
spacing.
### Changes
- **Cache token columns**: Added cache read and cache write token counts
to all analytics views (user and admin), from SQL queries through Go SDK
types to the frontend tables and summary cards.
- **Table spacing fix**: Replaced the bare React fragment in
`ChatCostSummaryView` with a `space-y-6` container so the model and chat
breakdown tables no longer overlap.
### Data flow
`chat_messages` table already stores `cache_read_tokens` and
`cache_creation_tokens` (and uses them for cost calculation). This PR
aggregates and displays them alongside input/output tokens in:
- Summary cards (6 cards: Total Cost, Input, Output, Cache Read, Cache
Write, Messages)
- Per-model breakdown table
- Per-chat breakdown table
- Admin per-user table
Implement the backend for the desktop feature for agents.
- Adds a new `/api/experimental/chats/$id/desktop` endpoint to coderd
which exposes a VNC stream from a
[portabledesktop](https://github.com/coder/portabledesktop) process
running inside the workspace
- Adds a new `spawn_computer_use_agent` tool to chatd, which spawns a
subagent that has access to the `computer` tool which lets it interact
with the `portabledesktop` process running inside the workspace
- Adds the plumbing to make the above possible
There's a follow up frontend PR here:
https://github.com/coder/coder/pull/23006
Add cost tracking for LLM chat interactions with microdollar precision.
## Changes
- Add `chatcost` package for per-message cost calculation using
`shopspring/decimal` for intermediate arithmetic
- **Ceil rounding policy**: fractional micros round UP to next whole
micro (applied once after summing all components)
- Database migration: `total_cost_micros` BIGINT column with historical
backfill and `created_at` index
- API endpoints: per-user cost summary and admin rollup under
`/api/experimental/chats/cost/`
- SDK types: `ChatCostSummary`, `ChatCostModelBreakdown`,
`ChatCostUserRollup`
- Fix `modeloptionsgen` to handle `decimal.Decimal` as opaque numeric
type
- Update frontend pricing test fixtures for string decimal types
## Design decisions
- `NULL` = unpriced (no matching model config), `0` = free
- Reasoning tokens included in output tokens (no double-counting)
- Integer microdollars (BIGINT) for storage and API responses
- Price config uses `decimal.Decimal` for exact parsing; totals use
`int64`
Frontend: #23037
Migration 000434 converts chat_messages.role from text to a Postgres
enum, rebuilds the partial index, and adds content_version smallint.
The column is backfilled with DEFAULT 0, then the default is dropped
so future inserts must set it explicitly.
Version 0 uses the role-aware heuristic from #22958. Version 1 (all
new inserts) stores []ChatMessagePart JSON for all roles, including
system messages. ParseContent takes database.ChatMessage directly
and dispatches on version internally. Unknown versions error.
All string(codersdk.ChatMessageRole*) casts at DB write sites are
replaced with database.ChatMessageRole* constants from sqlc.
Refs #22958
File-reference parts in user messages were flattened to `TextContent` at
write time because fantasy has no file-reference content type. The
frontend never saw them as structured parts.
This moves all write paths (user, assistant, tool) from fantasy envelope
format to `codersdk.ChatMessagePart`. The streaming layer (`chatloop`)
is untouched, the conversion happens at the serialization boundary in
`persistStep`.
Old rows are still readable. `ParseContent` uses a structural heuristic
(`isFantasyEnvelopeFormat`) to distinguish legacy envelopes from SDK
parts. We chose this over try/fallback because fantasy envelopes
partially unmarshal into `ChatMessagePart` (the `type` field matches)
while silently losing content. A guard test enforces that no SDK part
can produce the envelope shape.
This is forward-only: new rows are unreadable by old code. Chat is
behind a feature flag so rollback risk is contained.
Also adds a typed `ChatMessageRole` to replace raw strings and
`fantasy.MessageRole*` casts at the persistence boundary. The type
covers `ChatMessage.Role`, `ChatStreamMessagePart.Role`, the
`PublishMessagePart` callback chain, and all DB write sites.
`fantasy.MessageRole*` remains only where we build `fantasy.Message`
structs for LLM dispatch.
Separately, `ProviderMetadata` was leaking to SSE clients via
`publishMessagePart`. `StripInternal` now runs on both the SSE and REST
paths, covering this.
Other cleanup:
- Old `db2sdk.contentBlockToPart` silently dropped metadata on
text/reasoning/tool-call content. New code preserves it.
- `providerMetadataToOptions` now logs warnings instead of silently
returning nil.
- `db2sdk` shrinks from ~250 lines of parallel conversion to ~15 lines
delegating to `chatprompt.ParseContent()`, removing the `fantasy` import
entirely.
Refs #22821
The timeout was started before the unbounded Stepper loop, so
under CI load the deadline could expire before reaching the
operations that actually use it.
Also bumps TestMigration000387 from WaitLong to WaitSuperLong.
Fixescoder/internal#1398
## Problem
1. **Personal behavior prompt not applied**: The chatd background worker
was missing `ActionReadPersonal` on `ResourceUser` in its RBAC subject.
When `resolveUserPrompt` calls `GetUserChatCustomPrompt`, the dbauthz
layer checks `ActionReadPersonal` on the user — which the chatd role
didn't have. The error was silently swallowed (returns `""`), so the
user's custom prompt was never injected into the system messages.
2. **Sequential DB calls on chat startup**: Several independent database
queries in `runChat` and `resolveChatModel` were running sequentially,
adding unnecessary latency before the LLM stream begins.
## Changes
### RBAC fix (`dbauthz.go`)
- Add `rbac.ResourceUser.Type: {policy.ActionReadPersonal}` to
`subjectChatd` site permissions
- This is the minimal permission needed — `ActionRead` on User remains
denied
### Parallelization (`chatd.go`)
Three parallelization points using `errgroup.Group`:
1. **`resolveChatModel`**: `resolveModelConfig` and
`GetEnabledChatProviders` run concurrently (both needed for
`ModelFromConfig`, which stays sequential after the wait)
2. **`runChat` startup**: `resolveChatModel` and
`GetChatMessagesForPromptByChatID` run concurrently (completely
independent)
3. **`runChat` prompt assembly**: `resolveInstructions` and
`resolveUserPrompt` run concurrently (both produce strings;
`InsertSystem` calls maintain correct order after the wait)
Same pattern applied to the `ReloadMessages` callback.
### Test (`dbauthz_test.go`)
- Add assertion in `TestAsChatd/AllowedActions` that
`ActionReadPersonal` on `ResourceUser` is permitted
## What
Adds provider-native web search tools to the chat system. Anthropic,
OpenAI, and Google all offer server-side web search — this wires them up
as opt-in per-model config options using the existing
`ChatModelProviderOptions` JSONB column (no migration).
Web search is **off by default**.
## Config
Set `web_search_enabled: true` in the model config provider options:
```json
{
"provider_options": {
"anthropic": {
"web_search_enabled": true,
"allowed_domains": ["docs.coder.com", "github.com"]
}
}
}
```
Available options per provider:
- **Anthropic**: `web_search_enabled`, `allowed_domains`,
`blocked_domains`
- **OpenAI**: `web_search_enabled`, `search_context_size`
(`low`/`medium`/`high`), `allowed_domains`
- **Google**: `web_search_enabled`
## Backend
- `codersdk/chats.go` — new fields on the per-provider option structs
- `coderd/chatd/chatd.go` — `buildProviderTools()` reads config, creates
`ProviderDefinedTool` entries (uses `anthropic.WebSearchTool()` helper
from fantasy)
- `coderd/chatd/chatloop/chatloop.go` — `ProviderTools` on `RunOptions`,
merged into `Call.Tools`. Provider-executed tool calls skip local
execution. `StreamPartTypeToolResult` with `ProviderExecuted: true` is
accumulated inline (matching fantasy's own agent.go pattern) instead of
post-stream synthesis.
- `coderd/chatd/chatprompt/` — `MarshalToolResult` carries
`ProviderMetadata` through DB persistence so multi-turn round-trips work
(Anthropic needs `encrypted_content` back)
## Frontend
- Source citations render **inline** at the tool-call position (not
bottom-of-message), using `ToolCollapsible` so they look like other tool
cards — collapsed "Searched N results" with globe icon, expand to see
source pills
- Provider-executed tool calls/results are hidden from the normal tool
card UI
- Tool-role messages with only provider-executed results return `null`
(no empty bubble)
- Both persisted (messageParsing.ts) and streaming (streamState.ts)
paths group consecutive `source` parts into a single `{ type: "sources"
}` render block
## Fantasy changes
The fantasy fork (`kylecarbs/fantasy` branch `cj/go1.25`) has the
Anthropic tool code merged in, but will hopefully go upstream from:
https://github.com/charmbracelet/fantasy/pull/163
## Summary
Scale-tested the `chatd` package with mock-based benchmarks to identify
performance bottlenecks. This PR fixes 6 of the 8 identified issues,
ranked by severity.
## Changes
### 1. Parallel tool execution (HIGH) — `chatloop.go`
`executeTools` ran tool calls sequentially. Now dispatches all calls
concurrently via goroutines with `sync.WaitGroup`. Results are
pre-allocated by index (no mutex needed). `onResult` callbacks fire as
each tool completes.
### 2. Pubsub-backed subagent await (HIGH) — `subagent.go`
`awaitSubagentCompletion` polled the DB every 200ms. Now subscribes to
the child chat's `ChatStreamNotifyChannel` via pubsub for near-instant
notifications. Fallback poll reduced to 5s. Falls back to 200ms only
when `pubsub == nil` (single-instance / in-memory).
### 3. Per-chat stream locking (MEDIUM) — `chatd.go`
Replaced single global `streamMu` + `map[uuid.UUID]*chatStreamState`
with `sync.Map` where each `chatStreamState` has its own `sync.Mutex`.
Zero cross-chat contention.
### 4. Batch chat acquisition (MEDIUM) — `chatd.go`
`processOnce` acquired 1 chat per tick. Now loops up to
`maxChatsPerAcquire = 10` per tick, avoiding idle time when many chats
are pending.
### 5. Reduced heartbeat frequency (LOW-MEDIUM) — `chatd.go`
`chatHeartbeatInterval` changed from 30s to 60s. Safe given the 5-minute
`DefaultInFlightChatStaleAfter`.
### 6. O(depth) descendant check (LOW) — `subagent.go`
Replaced top-down BFS (`O(total_descendants)` queries) with bottom-up
parent-chain walk (`O(depth)` queries). Includes cycle protection.
## Not addressed (intentionally)
- Message serialization overhead
- Buffer eviction (`buffer[1:]` pattern)
Adds `pull_request_title` and `pull_request_draft` to the chat diff
status pipeline (DB → provider → SDK → frontend). The GitHub provider
now fetches the PR title alongside existing status fields.
The agents sidebar now displays PR-state-aware icons for chats that have
a linked pull request (when the chat is in waiting/completed state):
- **Open PR**: `GitPullRequestArrow` (green)
- **Draft PR**: `GitPullRequestDraft` (gray)
- **Merged PR**: `GitMerge` (purple)
- **Closed PR**: `GitPullRequestClosed` (red)
Running/pending/paused/error chats keep their existing activity icons
(spinner, pause, error triangle).
### Changes
**Database migration** (`000432`): Adds `pull_request_title TEXT` and
`pull_request_draft BOOLEAN` columns to `chat_diff_statuses`.
**Backend pipeline**:
- `gitprovider.PRStatus` gains a `Title` field
- GitHub provider decodes the `title` from the API response
- `gitsync` and `coderd/chats.go` pass title + draft through to the DB
upsert
- `codersdk.ChatDiffStatus` exposes both new fields in the API response
**Frontend** (`AgentsSidebar.tsx`): New `getPRIconConfig()` function
resolves the appropriate Lucide git icon based on `pull_request_state`
and `pull_request_draft`. Only applies when the chat is in a terminal
state (waiting/completed).
**Real-time sync**: No changes needed — the existing
`diff_status_change` pubsub event already propagates the full
`ChatDiffStatus` including the new fields.
Adds a `created_by` column (nullable UUID) to the `chat_messages` table
to track which user created each message. Only user-sent messages
populate this field; assistant, tool, system, and summary messages leave
it null.
The column is threaded through the full stack: SQL migration, query
updates, generated Go/TypeScript types, db2sdk conversion, chatd
(including subagent paths), and API handlers. All API handlers that
insert user messages now pass the authenticated user's ID as
`created_by`.
No foreign key constraint was added, matching the existing pattern used
by `chat_model_configs.created_by`.
Removes the backend and frontend logic that extracted compact titles
from reasoning/thinking blocks. The `Title` field on `ChatMessagePart`
remains for other part types (e.g. source), but reasoning blocks no
longer have titles derived from first-line markdown bold text or
provider metadata summaries.
**Backend:**
- Remove `ReasoningTitleFromFirstLine`, `reasoningTitleFromContent`,
`reasoningSummaryTitle`, `compactReasoningSummaryTitle`, and
`reasoningSummaryHeadline` from chatprompt
- Simplify `marshalContentBlock` to plain `json.Marshal` (no title
injection)
- Remove title tracking maps and `setReasoningTitleFromText` from
chatloop stream processing
- Remove `reasoningStoredTitle` from db2sdk
- Remove related tests from db2sdk_test
**Frontend:**
- Remove `mergeThinkingTitles` from blockUtils
- Simplify `appendTextBlock` to always merge consecutive thinking blocks
- Remove `applyStreamThinkingTitle` from streamState
- Simplify reasoning/thinking stream handler to ignore title-only parts
- Update tests accordingly
Net: **-487 lines / +42 lines**
- Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_`
- Extracts and refactors existing GitHub PR sync logic to new packages
`coderd/gitsync` and `coderd/externalauth/gitprovider`
- Associated wiring and tests
Created using Opus 4.6
## Problem
When multiple concurrent callers (e.g., parallel workspace builds) read
the same single-use OAuth2 refresh token from the database and race to
exchange it with the provider, the first caller succeeds but subsequent
callers get `bad_refresh_token`. The losing caller then **clears the
valid new token** from the database, permanently breaking the auth link
until the user manually re-authenticates.
This is reliably reproducible when launching multiple workspaces
simultaneously with GitHub App external auth and user-to-server token
expiration enabled.
## Solution
Two layers of protection:
### 1. Singleflight deduplication (`Config.RefreshToken` +
`ObtainOIDCAccessToken`)
Concurrent callers for the same user/provider share a single refresh
call via `golang.org/x/sync/singleflight`, keyed by `userID`. The
singleflight callback re-reads the link from the database to pick up any
token already refreshed by a prior in-flight call, avoiding redundant
IDP round-trips entirely.
### 2. Optimistic locking on `UpdateExternalAuthLinkRefreshToken`
The SQL `WHERE` clause now includes `AND oauth_refresh_token =
@old_oauth_refresh_token`, so if two replicas (HA) race past
singleflight, the loser's destructive UPDATE is a harmless no-op rather
than overwriting the winner's valid token.
## Changes
| File | Change |
|------|--------|
| `coderd/externalauth/externalauth.go` | Added `singleflight.Group` to
`Config`; split `RefreshToken` into public wrapper +
`refreshTokenInner`; pass `OldOauthRefreshToken` to DB update |
| `coderd/provisionerdserver/provisionerdserver.go` | Wrapped OIDC
refresh in `ObtainOIDCAccessToken` with package-level singleflight |
| `coderd/database/queries/externalauth.sql` | Added optimistic lock
(`WHERE ... AND oauth_refresh_token = @old_oauth_refresh_token`) |
| `coderd/database/queries.sql.go` | Regenerated |
| `coderd/database/querier.go` | Regenerated |
| `coderd/database/dbauthz/dbauthz_test.go` | Updated test params for
new field |
| `coderd/externalauth/externalauth_test.go` | Added
`ConcurrentRefreshDedup` test; updated existing tests for singleflight
DB re-read |
## Testing
- **New test `ConcurrentRefreshDedup`**: 5 goroutines call
`RefreshToken` concurrently, asserts IDP refresh called exactly once,
all callers get same token.
- All existing `TestRefreshToken/*` subtests updated and passing.
- `TestObtainOIDCAccessToken` passing.
- `dbauthz` tests passing.
## Problem
`TestMigrate/Parallel` flakes with:
```
timeout: can't acquire database lock
```
## Root Cause
The test runs two concurrent `migrations.Up(db)` calls on the same
database. golang-migrate wraps every `Lock()` call with a [15-second
timeout](https://github.com/golang-migrate/migrate/blob/v4.19.0/migrate.go#L29)
(`DefaultLockTimeout`). Our `pgTxnDriver.Lock()` uses
`pg_advisory_xact_lock`, which blocks until the lock is available. With
430+ migrations, the first caller can hold the lock well beyond 15s (the
failing test ran for 25.88s), causing the second caller to hit the
timeout.
## Fix
Set `m.LockTimeout = 2 * time.Minute` after creating the
`migrate.Migrate` instance in `setup()`. Since `pg_advisory_xact_lock`
releases automatically when the transaction commits, there's no risk of
a stuck lock — we just need to wait long enough for a concurrent
migration to finish.
Adds a user-level custom prompt to the database.
I'll be doing a follow-up for the UI, as we currently do not have
user-level settings (it's just admin). I'll also make it very obvious
for chats where there is a user-level prompt, but I don't know how yet.
Instead of the static 'Agent has finished running.' text, extract a
summary from the last assistant message to give users meaningful context
about what the agent accomplished. Falls back to the static text if no
suitable message is found.
Co-authored-by: Kyle Carberry <kyle@carberry.com>
Adds offset and cursor-based pagination to the `GET
/api/experimental/chats` endpoint, following the exact same patterns
used by `GetUsers` and `GetTemplateVersionsByTemplateID`.
## Changes
### Database
- Add `after_id`, `offset_opt`, `limit_opt` params to
`GetChatsByOwnerID` SQL query
- Use composite `(updated_at, id) DESC` cursor for stable, deterministic
pagination
- Add migration with composite index on `chats (owner_id, updated_at
DESC, id DESC)`
### Backend
- Use `ParsePagination()` in `listChats` handler (matches `users.go`
pattern)
- Add `Pagination` field to `ListChatsOptions` SDK struct
### Frontend
- Add `infiniteChats()` query factory using `useInfiniteQuery` with
offset-based page params (same pattern as `infiniteWorkspaceBuilds`)
- Update `AgentsPage` to use `useInfiniteQuery`
- Add "Show more" button at the bottom of the agents sidebar (matches
`HistorySidebar` pattern)
- Keep existing `chats()` query for non-paginated uses (e.g., parent
chat lookup in `AgentDetail`)
### Tests
- Add `TestListChats/Pagination` covering `limit`, `after_id` cursor,
`offset`, and no-limit behavior
## Problem
The Admin → Agents → System Prompt textarea saved only to the browser's
`localStorage`. The value was never sent to the backend, never stored in
the database, and never injected into chats. Entering text, clicking
Save, and refreshing the page showed no changes — the prompt was
effectively a no-op.
## Root Cause
Three disconnected layers:
1. **Frontend** wrote to `localStorage`, never called an API.
2. **`handleCreateChat`** never read `savedSystemPrompt`.
3. **Backend** hardcoded `chatd.DefaultSystemPrompt` on every chat
creation — no field in `CreateChatRequest` accepted a custom prompt.
## Changes
### Database
- Added `GetChatSystemPrompt` / `UpsertChatSystemPrompt` queries on the
existing `site_configs` table (no migration needed).
### API
- `GET /api/experimental/chats/system-prompt` — returns the configured
prompt (any authenticated user).
- `PUT /api/experimental/chats/system-prompt` — sets the prompt
(admin-only, `rbac: deployment_config update`).
- Input validation: max 32 KiB prompt length.
### Backend
- `resolvedChatSystemPrompt(ctx)` checks for a custom prompt in the DB,
falls back to `chatd.DefaultSystemPrompt` when empty/unset.
- Logs a warning on DB errors instead of silently swallowing them.
- Replaced the hardcoded `defaultChatSystemPrompt()` call in chat
creation.
### Frontend
- Replaced `localStorage` read/write with React Query
`useQuery`/`useMutation` backed by the new endpoints.
- Fixed `useEffect` draft sync to avoid clobbering in-progress user
edits on refetch.
- Added `try/catch` error handling on save (draft stays dirty for
retry).
- Save button disabled during mutation (`isSavingSystemPrompt`).
- Query key follows kebab-case convention (`chat-system-prompt`).
### UX
- Added hint: "When empty, the built-in default prompt is used."
### Tests
- `TestChatSystemPrompt`: GET returns empty when unset, admin can set,
non-admin gets 403.
- dbauthz `TestMethodTestSuite` coverage for both new querier methods.
## Bug
After compaction in the chat loop, the loop re-enters and calls the LLM
with a prompt that has **no non-system messages**. Anthropic (and most
providers) require at least one user/assistant/tool message, so the API
errors with empty messages.
## Root Cause
The compaction summary was stored as `role=system`. After compaction,
`GetChatMessagesForPromptByChatID` returns only:
- The compressed system summary (matched by the CTE)
- Original non-compressed system messages (system prompts)
All original user/assistant/tool messages are excluded (they predate the
summary). The compaction assistant/tool messages are `compressed=TRUE`
and don't match the main query's `compressed=FALSE` clauses.
So `ReloadMessages` returned only system messages. The Anthropic
provider moves system messages into a separate `system` field, leaving
the `messages` API field as `[]`.
## Fix
1. **Changed compaction summary from `role=system` to `role=user`** —
the summary now appears as a user message in the reloaded prompt, giving
the model valid conversational context to respond to.
2. **Simplified the CTE** — removed the `role = 'system'` check and
narrowed `visibility IN ('model', 'both')` to just `visibility =
'model'`. The summary is the only compressed message with
`visibility=model` (the assistant has `visibility=user`, the tool has
`visibility=both`), so the role check was redundant.
## Test
`PostRunCompactionReEntryIncludesUserSummary`: verifies the re-entry
prompt contains a user message (the compaction summary) after compaction
+ reload.
This change adds support for image attachments to chat via add button
and clipboard paste. Files are stored in a new `chat_files` table and
referenced by ID in message content. File data is resolved from storage
at LLM dispatch time, keeping the message content column small.
Upload validates MIME types via content type or content sniffing against
an allowlist (png, jpeg, gif, webp). The retrieval endpoint serves files
with immutable caching headers. On the frontend, uploads start eagerly
on attach with a background fetch to pre-warm the browser HTTP cache so
the timeline renders instantly after send.
To allow tasks to be shareable, we need to share both the `task`
resource and the `workspace` resource, and their sharing state needs to
be kept in sync. We've already implemented all of the necessary ACL
functionality for workspaces, so we can just sort of proxy those ACLs
back to the task as well.
Follow-up to #22612. Running `git status --short` in a loop during `make
-B -j gen` still showed intermediate states for several files. This PR
fixes the remaining ones.
The main issues:
- `generate.sh` ran `gofmt` and `goimports` in-place after moving files
into the source tree. Now it formats in a workdir first and only `mv`s
the final result.
- `protoc` targets wrote directly to the source tree. Wrapped with
`scripts/atomic_protoc.sh` which redirects output to a tmpdir.
- Several generators used hardcoded `/tmp/` paths. On systems where
`/tmp` is tmpfs, `mv` degrades to copy+delete. Switched to a
project-local `_gen/` directory (gitignored, same filesystem).
- `apidoc/.gen` and `cli/index.md` used `cp` for final output. Replaced
with `mv`.
- `manifest.json` was written twice (unformatted, then formatted). Now
`.gen` writes to a staging file and the manifest target does one
formatted atomic write.
- `biome_format.sh` silently skipped files in gitignored dirs. Added
`--vcs-enabled=false`.
Two helpers reduce the Makefile boilerplate: `scripts/atomic_protoc.sh`
(wraps protoc) and an `atomic_write` Make define
(stdout-to-temp-to-target pattern). `.PRECIOUS` now also covers `.pb.go`
and mock files.
Verification: `make -B -j gen` x3 with `git status` polling, no changes.
Refs #22612
`make gen` could not run with `-j` because inter-target dependency edges
were missing. Multiple recipes compile `coderd/rbac` (which includes
generated files like `object_gen.go`), and without explicit ordering,
parallel runs produced syntax errors from mid-write reads.
Three main changes:
**Dependency graph fixes** declare the compile-time chain through
`coderd/rbac` so that `object_gen.go` is written before anything that
imports it is compiled. The DB generation targets use a GNU Make 4.3+
grouped target (`&:`) so Make knows `generate.sh` co-produces
`querier.go`, `unique_constraint.go`, `dbmetrics`, and `dbauthz` in a
single invocation. `SKIP_DUMP_SQL=1` avoids re-entrant `make` inside
`generate.sh` when the Makefile already guarantees `dump.sql` is fresh.
**`scripts/atomicwrite` package** replaces `os.WriteFile` in all gen
scripts with a temp-file-in-same-dir + rename pattern, preventing
interrupted runs from leaving partial files.
**`.PRECIOUS` and shell atomic writes** protect git-tracked generated
files from Make's default delete-on-error behavior. Since these files
are committed, deletion is worse than staleness -- `git restore` is the
recovery path.
CI now runs `make -j --output-sync -B gen` (~32s, down from ~85s
serial).
| Scenario | Before | After |
|-----------------------------------|--------------------|----------|
| `make gen` (serial) | 95s | 95s |
| `make -j gen` (parallel) | race error | **22s** |
| CI `make -j --output-sync -B gen` | forced serial ~85s | **~32s** |
Previously, WorkspaceBuildBuilder.doInTX() inserted provisioner jobs
with empty tags and used a loop in AcquireProvisionerJob that could
match other tests' pending jobs when parallel tests share a database.
Add a unique tag (jobID -> "true") to each provisioner job at insert
time, then use that tag in AcquireProvisionerJob to target only the
correct job. This follows the same pattern used in dbgen.ProvisionerJob.
Closescoder/internal#1367
closes https://github.com/coder/internal/issues/464
# Summary
This PR resolves a flaky test that was sensitive to DST transitions in
various time zones. The root of the flake was:
* a bug; the query and its tests assume 24 hours per day
* the tests used local system time, which resulted in failures for dates
proximal to DST transitions
# Changes
Query:
The original query assumed 24 hour intervals between each day, which is
not a valid assumption. It now increments `1 day` at a time.
Database tests:
Database level tests for the query all assumed 24 hour days. They now
increment in DST-aware days instead. Instead of using time.Now() as a
base for testing, the test uses a series of dates over the course of an
entire year, to ensure that DST transition dates are present in every
test run.
# API Endpoint
The endpoint that delivers the user status chart now accepts an IANA
timezone name as a parameter and passes it, keeping the existing offset
as a fallback, to the database query.
API level tests were added to ensure the correct response form and error
behaviour. Correctness of content is tested at the database level.
Despite the SDK type having an `Archived` field for chats, this data was
never fetched from the database — the `GetChatsByOwnerID` query
hardcoded `AND archived = false`, and the `convertChat` function never
mapped the field.
This PR adds an optional `archived` query parameter to `GET
/api/experimental/chats`:
| Value | Behavior |
|-------|----------|
| *(not provided)* | Returns all chats (active and archived) |
| `archived=false` | Returns only non-archived chats |
| `archived=true` | Returns only archived chats |
This follows the same pattern used by template versions
(`sqlc.narg('archived')` nullable boolean).
Also fixes `convertChat` to populate the `Archived` field in API
responses, which was never being set despite existing on the SDK type.
Adds database columns and server-side logic to track interception lineage via tool call IDs. When an interception ends, the server resolves the correlating tool call ID to find the parent interception and links them via `parent_id`.
New `provider_tool_call_id` column on `aibridge_tool_usages` and `parent_id` column on `aibridge_interceptions`, with indexes for lookup. `findParentInterceptionID` queries by tool call ID and filters out the current interception to find the parent.
Adapted from the [coder/coder `dk/prompt_provenance_poc`](https://github.com/coder/coder/compare/main...dk/prompt_provenance_poc) branch.
Depends on [coder/aibridge#188](https://github.com/coder/aibridge/pull/188).
Closes https://github.com/coder/internal/issues/1334
## Summary
Removes `time.Sleep` calls in two test files by replacing them with
deterministic or event-driven alternatives.
### Changes
**`coderd/provisionerjobs_test.go`** (34.5s → 0.25s)
Replaced `time.Sleep(1500ms)` with a direct SQL `UPDATE` to bump
`created_at` by 2 seconds. The sleep existed purely to ensure different
timestamps for sort-order testing. The fix is deterministic and cannot
flake. Uses `NewDBWithSQLDB` (the test already required real Postgres
via `WithDumpOnFailure`).
**`coderd/database/pubsub/pubsub_test.go`** (2.05s → 1.3s)
Replaced `time.Sleep(1s)` with a `testutil.Eventually` retry loop that
publishes and checks for subscriber receipt. This is the idiomatic
pattern in the codebase. The old sleep waited for pq.Listener to
re-issue LISTEN after reconnect; the new code polls until it actually
works.
## Problem
The pubsub notification handler in `chatd` re-fetched **all** messages
from the DB on every new message notification, then filtered in Go with
`msg.ID > lastMessageID`. This grows linearly with conversation length —
every new message triggers a full table scan of that chat's history.
The `AfterMessageID` field in the pubsub notification payload was
clearly designed for cursor-based fetching, but no matching query
existed.
## Fix
- Add `GetChatMessagesByChatIDAfter` SQL query with `WHERE id >
@after_id`, so the database does the filtering instead of Go.
- Use it in the pubsub notification handler in `chatd.go`, passing
`lastMessageID` as the cursor.
- Implement the dbauthz wrapper (was a `panic("not implemented")` stub
from codegen) with the same read-check-on-parent-chat pattern as
adjacent methods.
- Add dbauthz test coverage for the new method.
**Not changed:** The initial snapshot in `Subscribe()` still loads all
messages — that's correct, since a newly-connecting client needs the
full conversation state. The waste was only in the ongoing notification
path.
## Problem
When archiving an agent with subagents, the children briefly flash in
the sidebar as root-level items before disappearing. Two issues:
1. **Backend:** Archive used N+1 queries — a recursive DFS
(`archiveChatTree`, no transaction) or BFS loop (`chatd.ArchiveChat`,
N+1 queries in a tx) to walk the tree and archive each chat
individually.
2. **Frontend:** The SSE `deleted` event handler only filtered out the
parent chat from the cache. Children remained briefly, got promoted to
root-level by `buildChatTree`, then disappeared on the next re-fetch.
## Fix
**Backend:** Replace both tree-walk implementations with a single SQL
query:
```sql
UPDATE chats SET archived = true, updated_at = NOW()
WHERE id = @id OR root_chat_id = @id;
```
This leverages the existing `root_chat_id` column (already indexed) to
archive the entire tree atomically.
**Frontend:** When a `deleted` event arrives, also filter out any chats
whose `root_chat_id` matches the deleted chat, so children vanish from
the sidebar immediately with the parent.
## Changes
- `coderd/database/queries/chats.sql` — Added `ArchiveChatTreeByID`
query
- `coderd/chats.go` — Use single query, delete `archiveChatTree`
function
- `coderd/chatd/chatd.go` — Simplify `ArchiveChat` to use single query
- `coderd/database/dbauthz/dbauthz.go` — Auth wrapper for new query
- `coderd/chats_test.go` — Added `TestArchiveChat/ArchivesChildren`
subtest
- `site/src/pages/AgentsPage/AgentsPage.tsx` — Filter children in SSE
handler
- Generated files updated via `make gen`
Add a new SubjectTypeChatd RBAC subject with minimal permissions:
- Chat: CRUD
- Workspace: Read
- DeploymentConfig: Read
Replace all 10 AsSystemRestricted calls in coderd/chatd/chatd.go:
- Line 890: Use AsChatd instead of AsSystemRestricted for the background
processor context.
- Subscribe() path (5 calls): Remove system escalation entirely; these
run under the authenticated user's context from the HTTP handler.
- processChat path (4 calls): Remove redundant per-call wraps; the
context already carries AsChatd from the processor start.
Add TestAsChatd verifying allowed and denied actions.
Created using Mux (Opus 4.6)
## Summary
Remove the `workspace_agent_id` column from the `chats` table and
dynamically look up the first workspace agent instead.
## Problem
When a workspace is stopped and restarted, the workspace agent gets a
new ID. The `workspace_agent_id` stored on the chat at creation time
becomes stale, making the agent unreachable. This caused chats to break
after workspace restarts.
## Solution
Instead of persisting the agent ID, dynamically look up the first agent
from the workspace's latest build via
`GetWorkspaceAgentsInLatestBuildByWorkspaceID` whenever an agent
connection is needed. The `workspace_id` on the chat remains stable
across restarts.
This behavior may be refined later (e.g., agent selection heuristics),
but picking the first agent resolves the immediate breakage.
## Changes
- **Migration 000425**: Drop `workspace_agent_id` column from `chats`
- **SQL queries**: Remove `workspace_agent_id` from `InsertChat` and
`UpdateChatWorkspace`
- **chatd.go**: `getWorkspaceConn` and `resolveInstructions` now look up
agents dynamically from workspace ID
- **chatd.go**: Remove `refreshChatWorkspaceSnapshot` (no longer needed)
- **createworkspace.go**: Stop persisting agent ID when associating
workspace with chat
- **subagent.go**: Stop passing agent ID to child chats
- **SDK/frontend**: Remove `WorkspaceAgentID` / `workspace_agent_id`
from Chat type
---------
Co-authored-by: Kyle Carberry <kylecarbs@gmail.com>