OpenAI Responses follow-up turns were replaying full assistant/tool
history even when `store=true`, which breaks after reasoning +
provider-executed `web_search` output.
This change persists the OpenAI response ID on assistant messages, then
in `coderd/x/chatd` switches `store=true` follow-ups to
`previous_response_id` chaining with a system + new-user-only prompt.
`store=false` and missing-ID cases still fall back to manual replay.
It also updates the fake OpenAI server and integration coverage for the
chaining contract, and carries the rebased path move to `coderd/x/chatd`
plus the migration renumber needed after rebasing onto `main`.
Fallback to the configured model name in PR Insights when a model config
has a blank display name.
This updates both the by-model breakdown and recent PR rows, and adds a
regression test for blank display names.
_Disclaimer:_ _produced_ _by_ _Claude_ _Opus_ _4\.6,_ _reviewed_ _by_ _me._
**This is a breaking change.** Users who are not have `owner` or sitewide `auditor` roles will no longer be able to view interceptions.
Regular users should not need to view this information; in fact, it could be used by a malicious insider to see what information we track and don't track to exfiltrate data or perform actions unobserved.
---
Changed authorization for AI Bridge interception-related operations from system-level permissions to resource-specific permissions. The following functions now authorize against `rbac.ResourceAibridgeInterception` instead of `rbac.ResourceSystem`:
- `ListAIBridgeTokenUsagesByInterceptionIDs`
- `ListAIBridgeToolUsagesByInterceptionIDs`
- `ListAIBridgeUserPromptsByInterceptionIDs`
Updated RBAC roles to grant AI Bridge interception permissions:
- **User/Member roles**: Can create and update AI Bridge interceptions but cannot read them back
- **Service accounts**: Same create/update permissions without read access
- **Owners/Auditors**: Retain full read access to all interceptions
Removed system-level authorization bypass in `populatedAndConvertAIBridgeInterceptions` function, allowing proper resource-level authorization checks.
Updated tests to reflect the new permission model where members cannot view AI Bridge interceptions, even their own, while owners and auditors maintain full visibility.
<!--
If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.
-->
_Disclaimer:_ _initially_ _produced_ _by_ _Claude_ _Opus_ _4\.6,_ _heavily_ _modified_ _and_ _reviewed_ _by_ _me._
Closes https://github.com/coder/internal/issues/1360
Adds a new `/api/v2/aibridge/sessions` API which returns "sessions".
Sessions, as defined in the [RFC](https://www.notion.so/coderhq/AI-Bridge-Sessions-Threads-2ccd579be59280f28021d3baf7472fbe?source=copy_link), are a set of interceptions logically grouped by a session key issued by the client.
The API design for this endpoint was done in [this doc](https://github.com/coder/internal/issues/1360).
If the client has not provided a session ID, we will revert to the thread root ID, and if that's not present we use the interception's own ID (i.e. a session of a single interception - which is effectively what we show currently in our `/api/v2/aibridge/interceptions` API).
The SQL query looks gnarly but it's relatively simple, and seems to perform well (~200ms) even when I import dogfood's `aibridge_*` tables into my workspace. If we need to improve performance on this later we can investigate materialized views, perhaps, but for now I don't think it's warranted.
---
_The PR looks large but it's got a lot of generated code; the actual changes aren't huge._
## What
Adds per-user per-model auto-compaction threshold overrides. Users can
now customize the percentage of context window usage that triggers chat
compaction, independently for each enabled model.
## Why
The compaction threshold was previously only configurable at the
deployment level (`chat_model_configs.compression_threshold`). Different
users have different preferences — some want aggressive compaction to
keep costs low, others prefer higher thresholds to retain more context.
This gives users control without requiring admin intervention.
## Architecture
**Storage:** Reuses the existing `user_configs` table (no migration
needed). Overrides are stored as key/value pairs with keys shaped
`chat_compaction_threshold:<modelConfigID>` and integer percent values.
**API:** Three new experimental endpoints under
`/api/experimental/chats/config/`:
- `GET /user-compaction-thresholds` — list all overrides for the current
user
- `PUT /user-compaction-thresholds/{modelConfig}` — upsert an override
(validates model exists and is enabled, validates 0–100 range)
- `DELETE /user-compaction-thresholds/{modelConfig}` — clear an override
(idempotent)
**Runtime resolution:** In `coderd/chatd/chatd.go`, a new
`resolveUserCompactionThreshold()` helper runs at the start of each chat
turn (inside `runChat()`), after the model config is resolved but before
`CompactionOptions` is built. If a valid override exists, it replaces
`modelConfig.CompressionThreshold`. The threshold source
(`user_override` vs `model_default`) is logged with each compaction
event.
**Precedence:** `effectiveThreshold = userOverride ??
modelConfig.CompressionThreshold`
**UI:** New "Context Compaction" subsection in the Agents → Settings →
Behavior tab, placed after Personal Instructions. Shows one row per
enabled model with the system default, a number input for the override,
and Save/Reset controls.
## Testing
- 9 API subtests covering CRUD, validation (boundary values 0/100,
out-of-range rejection), upsert behavior, idempotent delete, user
isolation, and non-existent model config
- 4 dbauthz tests (16 scenarios) verifying `ActionReadPersonal` /
`ActionUpdatePersonal` on all query methods
- 4 Storybook stories with play functions (Default, WithOverrides,
Loading, Error)
<details>
<summary>Implementation plan</summary>
### Phase 1 — Tests
- Backend API tests in `coderd/chats_test.go` (9 subtests)
- Database auth wrapper tests in
`coderd/database/dbauthz/dbauthz_test.go` (4 methods)
- Frontend stories in `UserCompactionThresholdSettings.stories.tsx` (4
stories)
### Phase 2 — Backend preference surface
- 4 SQL queries in `coderd/database/queries/users.sql` (list, get,
upsert, delete)
- `make gen` to propagate into generated artifacts
- Auth/metrics wrappers in dbauthz and dbmetrics
- SDK types and client methods in `codersdk/chats.go`
- HTTP handlers and routes in `coderd/chats.go` and `coderd/coderd.go`
- Key prefix constant shared between handlers and runtime
### Phase 3 — Runtime override
- `resolveUserCompactionThreshold()` helper in `coderd/chatd/chatd.go`
- Override injection in `runChat()` before building `CompactionOptions`
- `threshold_source` field added to compaction log
### Phase 4 — Settings UI
- API client methods and React Query hooks in `site/src/api/`
- `UserCompactionThresholdSettings` component extracted from
`SettingsPageContent`
- Per-model mutation tracking (only the active row disables during save)
- 100% warning, "System default" label, helpful empty state copy
### Phase 5 — Refactor and review fixes
- Consolidated key prefix constant in `codersdk`
- Explicit PUT range validation (not just struct tags)
- GET handler gracefully skips malformed rows instead of 500
- Boundary value, upsert, and non-existent model config tests
- UX improvements: per-model mutation state, aria-live on errors
</details>
The slices package provides type-safe generic replacements for the
old typed sort convenience functions. The codebase already uses
slices.Sort in 43 call sites; this finishes the migration for the
remaining 29.
- sort.Strings(x) -> slices.Sort(x)
- sort.Float64s(x) -> slices.Sort(x)
- sort.StringsAreSorted(x) -> slices.IsSorted(x)
- Moves `coderd/chatd/`, `coderd/gitsync/`, `enterprise/coderd/chatd/`
under `x/` parent directories to signal instability
- Adds `Experimental:` glue code comments in `coderd/coderd.go`
> 🤖 This PR was created with the help of Coder Agents, and was
reviewed by my human. 🧑💻
Continuation of https://github.com/coder/coder/pull/23067
Add filtering to the paginated org member endpoint (pretty much the same
as what I did in the previous PR with group members, except there I also
had to add pagination since it was missing).
Partially addresses #21813 (still need to make changes to the "add user"
button to be complete)
Since there are a lot of user tests already, I moved them into
`coderdtest` to be shared.
- Add `agents_workspace_ttl` site config (default: whatever the template
says a.k.a. `0s`)
- Expose via GET/PUT `/api/experimental/chats/config/workspace-ttl`
- Chat tool reads setting and passes `TTLMillis` on workspace creation
- Existing autostop infrastructure handles the rest (zero changes to
LifecycleExecutor, CalculateAutostop, or activity bumping)
- ⚠️ Template-level `UserAutostopEnabled=false` overrides this global
default. Not touching this.
- Frontend: "Workspace Lifetime" control in /agents/settings Behavior
tab (admin-only)
> This PR was created with the help of Coder Agents, and has been
reviewed by several humans and robots. 🤖🤝🧑💻
Adds a warning comment to dbauthz.AsSystemRestricted to hopefully nudge agents away from it.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
## Summary
The search field on `/agents/settings/usage` previously only matched
against usernames. This updates the SQL query to also match against the
user's display name via `ILIKE`, and updates the frontend placeholder
and variable names to reflect the broader search scope.
## Changes
- **SQL** (`coderd/database/queries/chats.sql`,
`coderd/database/queries.sql.go`): Added `OR u.name ILIKE '%' ||
@username::text || '%'` to the `GetChatCostPerUser` query's WHERE
clause.
- **Frontend** (`site/src/pages/AgentsPage/SettingsPageContent.tsx`):
Renamed `usernameFilter`/`debouncedUsername` to
`searchFilter`/`debouncedSearch`, updated placeholder to "Search by name
or username".
---
PR generated with Coder Agents
- Wire `workspacestats.ActivityBumpWorkspace` into `trackWorkspaceUsage`
so the workspace build deadline is extended each time the chat heartbeat
fires
- Prevents mid-conversation autostop for chat workspaces
- Updates `TestHeartbeatBumpsWorkspaceUsage` verifying the deadline bump
> This PR was created with the help of Coder Agents, and was reviewed by two humans and their pet robots 🧑💻🤝🤖
## Summary
Adds the database schema, API endpoints, SDK types, and encryption
wrappers for admin-managed MCP (Model Context Protocol) server
configurations that chatd can consume. This is the backend foundation
for allowing external MCP tools (Sentry, Linear, GitHub, etc.) to be
used during AI chat sessions.
## Database
Two new tables:
- **`mcp_server_configs`**: Admin-managed server definitions with URL,
transport (Streamable HTTP / SSE), auth config (none / OAuth2 / API key
/ custom headers), tool allow/deny lists, and an availability policy
(`force_on` / `default_on` / `default_off`). Includes CHECK constraints
on transport, auth_type, and availability values.
- **`mcp_server_user_tokens`**: Per-user OAuth2 tokens for servers
requiring individual authentication. Cascades on user/config deletion.
New column on `chats` table:
- **`mcp_server_ids UUID[]`**: Per-chat MCP server selection, following
the same pattern as `model_config_id` — passed at chat creation,
changeable per-message with nil-means-no-change semantics.
## API Endpoints
All routes are under `/api/experimental/mcp/servers/` and gated behind
the `agents` experiment.
**Admin endpoints** (`ResourceDeploymentConfig` auth):
- `POST /` — Create MCP server config
- `PATCH /{id}` — Update MCP server config (full-replace)
- `DELETE /{id}` — Delete MCP server config
**Authenticated endpoints** (all users, enabled servers only for
non-admins):
- `GET /` — List configs (admins see all, members see enabled-only with
admin fields redacted)
- `GET /{id}` — Get config by ID (with `auth_connected` populated
per-user)
**OAuth2 per-user auth flow:**
- `GET /{id}/oauth2/connect` — Initiate OAuth2 flow (state cookie CSRF
protection)
- `GET /{id}/oauth2/callback` — Handle OAuth2 callback, store tokens
- `DELETE /{id}/oauth2/disconnect` — Remove stored OAuth2 tokens
## Security
- **Secrets never returned**: `OAuth2ClientSecret`, `APIKeyValue`, and
`CustomHeaders` are never in API responses — only boolean indicators
(`has_oauth2_secret`, `has_api_key`, `has_custom_headers`).
- **Field redaction for non-admins**: `convertMCPServerConfigRedacted`
strips `OAuth2ClientID`, auth URLs, scopes, and `APIKeyHeader` from
non-admin responses.
- **dbcrypt encryption at rest**: All 5 secret fields use `dbcrypt_keys`
encryption with full encrypt-on-write / decrypt-on-read wrappers (11
dbcrypt method overrides + 2 helpers), following the same pattern as
`chat_providers.api_key`.
- **OAuth2 CSRF protection**: State parameter stored in `HttpOnly`
cookie with `HTTPCookies.Apply()` for correct `Secure`/`SameSite` behind
TLS-terminating proxies.
- **dbauthz authorization**: All 18 querier methods have authorization
wrappers. Read operations use `ActionRead`, write operations use
`ActionUpdate` on `ResourceDeploymentConfig`.
## Governance Model
| Control | Implementation |
|---------|---------------|
| **Global kill switch** | `enabled` defaults to `false` |
| **Availability policy** | `force_on` (always injected), `default_on`
(pre-selected), `default_off` (opt-in) |
| **Per-chat selection** | `mcp_server_ids` on `CreateChatRequest` /
`CreateChatMessageRequest` |
| **Auth gate** | OAuth2 servers require per-user auth before tools are
injected |
| **Tool-level allow/deny** | Arrays on `mcp_server_configs` for
granular tool filtering |
| **Secrets encrypted at rest** | Uses `dbcrypt_keys` (same pattern as
`chat_providers.api_key`) |
## Tests
8 test functions covering:
- Full CRUD lifecycle (create, list, update, delete)
- Non-admin visibility filtering (enabled-only, field redaction)
- `auth_connected` population for OAuth2 vs non-OAuth2 servers
- Availability policy validation (valid values + invalid rejection)
- Unique slug enforcement (409 Conflict)
- OAuth2 disconnect idempotency
- Chat creation with `mcp_server_ids` persistence
## Known Limitations (Deferred)
These are documented and intentional for an experimental feature:
- **Audit logging** not yet wired — will add when feature stabilizes
- **Cross-field validation** (e.g., OAuth2 fields required when
`auth_type=oauth2`) — admin-only endpoint, will add when stabilizing
- **`force_on` auto-injection** — query exists but not yet wired into
chatd tool injection (follow-up)
- **Additional test coverage** — 403 auth tests, GET-by-ID tests,
callback CSRF tests planned for follow-up
## What's NOT in this PR
- Frontend UI (admin panel + chat picker)
- Actual MCP client connections (`chatd/chatmcp/` manager)
- Tool injection into `chatloop/`
## Problem
The `/agents/settings/insights` page had several issues:
1. **Duplicate PRs** in "Recent Pull Requests" — multiple chats
referencing the same PR URL each produced a row
2. **Wildly wrong costs** — the cost subquery summed ALL messages across
the entire chat *tree* (`GROUP BY root_chat_id`), so every chat in a
tree got the same inflated total. When aggregated, the same tree cost
was counted N× per PR in that tree
3. **UI clutter** — too many stat cards, too many table columns, mixed
naming conventions
## Fix
### Backend (SQL)
- **Deduplicate by PR URL** using `DISTINCT ON (COALESCE(cds.url,
c.id::text))` across all 4 queries
- **Fix cost computation**: use two CTEs — `pr_costs` sums cost from ALL
chats that reference a PR (so review chats contribute), `deduped` picks
one row per PR for state/additions/deletions via DISTINCT ON
- **Tests**: 3 subtests covering multi-chat cost summing, different PRs
no duplication, and duplicate URL counted once
### Frontend
- **3 stat cards** (down from 5): Merged, Merge rate, Cost / merge
- **2-line chart** (down from 3): created (dashed) + merged (solid)
- **4-column model table** (down from 7): Model, Merged, Merge rate,
Cost/merge
- **4-column recent table** (down from 7): Title, Status, Cost, Created
— with `table-fixed` to prevent overflow
- **Consistent naming**: no mixed PR/PRs abbreviation, contextual labels
since page title establishes context
Adds a `deleted` boolean column to the `chat_messages` table. Messages
are never physically deleted from the database — instead they are marked
as deleted so that usage and cost data is preserved.
## Changes
### Migration
- New migration (000444) adds `deleted boolean NOT NULL DEFAULT false`
to `chat_messages`
### SQL queries
- `DeleteChatMessagesAfterID` → `SoftDeleteChatMessagesAfterID` (UPDATE
SET deleted=true instead of DELETE)
- New `SoftDeleteChatMessageByID` query for single-message soft-delete
- All read queries now filter `deleted = false`:
- `GetChatMessageByID`
- `GetChatMessagesByChatID`
- `GetChatMessagesByChatIDDescPaginated`
- `GetChatMessagesForPromptByChatID` (both CTE and main query)
- `GetLastChatMessageByRole`
- Cost/usage queries (`GetChatCostSummary`, `GetChatCostPerModel`, etc.)
intentionally still include deleted messages to preserve accurate spend
tracking
### EditMessage behavior
- Previously: updated the message content in-place + hard-deleted
subsequent messages
- Now: soft-deletes the original message + soft-deletes subsequent
messages + inserts a new message with the updated content
- This preserves the original message data (tokens, cost, content) in
the database
Follow-up to #23220, addressing Cian's review comments:
- **SQL casing**: Uppercase `UNNEST` to match `NULLIF`/`COALESCE`
convention in the query.
- **Builder pattern**: `chatMessage` struct now uses unexported fields
with a `newChatMessage` constructor for required fields (role, content,
visibility, modelConfigID, contentVersion) and chainable builder methods
(`withCreatedBy`, `withCompressed`, `withUsage`, `withContextLimit`,
`withTotalCostMicros`, `withRuntimeMs`) for optional/nullable fields.
- **Batch test in chats_test**: Replaced the `for i := 0; i < 2` loop
with a single batch insert of 2 messages to actually exercise the batch
logic.
- **Multi-message querier test**: Added `BatchInsertMultipleMessages`
test verifying 3-message batch insert with role ordering, sequential
IDs, nullable field semantics (NULL for zero UUIDs and zero ints), and
token/cost assertions.
---------
Co-authored-by: Cian Johnston <cian@coder.com>
Replaces the singular `InsertChatMessage` query with
`InsertChatMessages` that uses PostgreSQL's `unnest()` for batch
inserts. This reduces the number of database round-trips when inserting
multiple messages in a single transaction.
## Changes
- **SQL**: New `InsertChatMessages :many` query using `unnest()` arrays
following the existing codebase pattern (e.g.,
`InsertWorkspaceAgentStats`). Preserves the CTE that updates
`chats.last_model_config_id` using the last non-null model config from
the batch. Uses `NULLIF` for UUID columns to handle NULL foreign keys.
- **Go layers**: Updated `querier.go`, `dbauthz.go`,
`dbmetrics/querymetrics.go`, `dbmock/dbmock.go`, and `queries.sql.go` to
use the new batch signature (`[]ChatMessage` return type, array params).
- **chatd.go**: All call sites converted to batch inserts:
- **CreateChat**: System prompt + user message batched into one call
- **persistStep**: Assistant message + tool messages batched into one
call
- **persistSummary**: Hidden summary + assistant + tool messages batched
into one call
- Single-message sites use the same API with single-element arrays
- **Helper**: New `appendChatMessage` function simplifies building batch
params at each call site.
- **Tests**: All test files updated to use the new API.
Builds on top of #23213.
## What
Adds a new admin-only **PR Insights** page for the `/agents` analytics
view — a dashboard for engineering leaders to understand code shipped by
AI agents.
### Backend
- `GET /api/v2/chats/insights/pull-requests` — admin-only endpoint
- 4 SQL queries in `chatinsights.sql` aggregating `chat_diff_statuses`
joined with chat cost data (via root chat tree rollup)
- Runs 5 parallel DB queries: current summary, previous summary (for
trends), time series, per-model breakdown, recent PRs
- SDK types auto-generate to TypeScript
### Frontend (`PRInsightsView`)
- **Stat cards**: PRs created, Merged, Merge rate, Lines shipped,
Cost/merged PR — with trend badges comparing to previous period
- **Activity chart**: Stacked area chart (created/merged/closed) using
git color tokens (`git-added-bright`, `git-merged-bright`,
`git-deleted-bright`)
- **Model performance table**: Per-model PR counts, inline merge rate
bars, diff stats, cost breakdown
- **Recent PRs table**: Status badges, review state icons, author info,
external links
- **Time range filter**: 7d/14d/30d/90d button group
- **4 Storybook stories**: Default, HighPerformance, LowVolume, NoPRs
### Data source
All PR data comes from the existing `chat_diff_statuses` table
(populated by the `gitsync.Worker` background job that polls GitHub
every 120s). No new data collection required.
### Screenshot
View in Storybook: `pages/AgentsPage/PRInsightsView`
## Summary
Adds a `runtime_ms` column to `chat_messages` that records the
wall-clock duration (in milliseconds) of each LLM step. This covers LLM
streaming, tool execution, and retries — the full time the agent is
"alive" for a step.
This is the foundation for billing by agent alive time. The column
follows the same pattern as `total_cost_micros`: stored per assistant
message, aggregatable with `SUM()` over time periods by user.
## Changes
- **Migration**: adds nullable `runtime_ms bigint` to `chat_messages`.
- **chatloop**: adds `Runtime time.Duration` field to `PersistedStep`,
measures `time.Since(stepStart)` at the beginning of each step (covering
stream + tool execution + retries).
- **chatd**: passes `step.Runtime.Milliseconds()` to the assistant
message `InsertChatMessage` call; all other message types (system, user,
tool) get `NULL`.
- **Tests**: adds `runtime > 0` assertion in chatloop tests.
## Billing query pattern
Once ready, aggregation mirrors the existing cost queries:
```sql
SELECT COALESCE(SUM(cm.runtime_ms), 0)::bigint AS total_runtime_ms
FROM chat_messages cm
JOIN chats c ON c.id = cm.chat_id
WHERE c.owner_id = @user_id
AND cm.created_at >= @start_time
AND cm.created_at < @end_time
AND cm.runtime_ms IS NOT NULL;
```
Adds a new `site_config` entry that controls whether the virtual desktop
feature for Coder Agents is enabled. It can be set via a new
`/api/experimental/chats/config/desktop-enabled` endpoint, which will be
used by the frontend.
Introduce a three-way workspace sharing setting (none, everyone,
service_accounts) replacing the boolean workspace_sharing_disabled.
In service_accounts mode, only service account-owned workspaces can be
shared while regular members' share permissions are removed. Adds a
new organization-service-account system role with per-org permissions
reconciled alongside the existing organization-member system role.
Related to:
https://linear.app/codercom/issue/PLAT-28/feat-service-accounts-sharing-mode-and-rbac-role
---------
Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
Co-authored-by: Kayla はな <mckayla@hey.com>
## Problem
The chat listing endpoint (`GetChatsByOwnerID`) was using
`fetchWithPostFilter`, which fetches N rows from the database and then
filters them in Go memory using RBAC checks. This causes a pagination
bug: if the user requests `limit=25` but some rows fail the auth check,
fewer than 25 rows are returned even though more authorized rows exist
in the database. The client may incorrectly assume it has reached the
end of the list.
## Solution
Switch to the same pattern used by `GetWorkspaces`, `GetTemplates`, and
`GetUsers`: `prepareSQLFilter` + `GetAuthorized*` variant. The RBAC
filter is compiled to a SQL WHERE clause and injected into the query
before `ORDER BY`/`LIMIT`, so the database returns exactly the requested
number of authorized rows.
Additionally, `GetChatsByOwnerID` is renamed to `GetChats` with
`OwnerID` as an optional (nullable) filter parameter, matching the
`GetWorkspaces` naming convention.
## Changes
| File | Change |
|------|--------|
| `queries/chats.sql` | Renamed to `GetChats`, `owner_id` now optional
via CASE/NULL, added `-- @authorize_filter` |
| `queries.sql.go` | Renamed constant, params struct (`GetChatsParams`),
and method |
| `querier.go` | Interface method renamed |
| `modelqueries.go` | Added `chatQuerier` interface +
`GetAuthorizedChats` impl |
| `dbauthz/dbauthz.go` | `GetChats` now uses `prepareSQLFilter` instead
of `fetchWithPostFilter` |
| `dbauthz/dbauthz_test.go` | Updated tests for SQL filter pattern |
| `dbmock/dbmock.go` | Renamed + added mock for `GetAuthorizedChats` |
| `dbmetrics/querymetrics.go` | Renamed + added metrics wrapper |
| `rbac/regosql/configs.go` | Added `ChatConverter` (maps `org_owner` to
empty string literal since `chats` has no `organization_id` column) |
| `rbac/authz.go` | Added `ConfigChats()` |
| `chats.go` | Handler uses renamed method with `uuid.NullUUID` |
| `searchquery/search.go` | Updated return type |
| `gitsync/worker.go` | Updated interface and call site |
| Various test files | Updated for renamed types |
Creates a new table `ai_seat_state` to keep track of when users consume an ai_seat. Once a user consumes an AI seat, they will forever in this table (as it stands today).
Adds cursor-based pagination to the chat messages endpoint.
## Backend
- New `GetChatMessagesByChatIDPaginated` SQL query: returns messages in
`id DESC` order with a `before_id` keyset cursor and configurable
`limit`
- Handler parses `?before_id=N&limit=N` query params, uses the `LIMIT
N+1` trick to set `has_more` without a separate COUNT query
- Queued messages only returned on the first page (no cursor) since
they're always the most recent
- SDK client updated with `ChatMessagesPaginationOptions`
- Fully backward compatible: omitting params returns the 50 newest
messages
## Frontend
- Switches `getChatMessages` from `useQuery` to `useInfiniteQuery` with
cursor chaining via `getNextPageParam`
- Pages flattened and sorted by `id` ascending for chronological display
- `MessagesPaginationSentinel` component uses `IntersectionObserver`
(200px rootMargin prefetch) inside the existing `flex-col-reverse`
scroll container
- `flex-col-reverse` handles scroll anchoring natively when older
messages are prepended — no manual `scrollTop` adjustment needed (same
pattern as coder/blink)
## Why cursor-based instead of offset/limit
Offset-based pagination breaks when new messages arrive while paginating
backward (offsets shift, causing duplicates or missed messages). The
`before_id` cursor is stable regardless of inserts — each page is
deterministic.
## Problem
The sidebar diff status (PR icon, +additions/-deletions, file count) was
not updating in real-time. Users had to reload the page to see changes.
Two root causes:
1. **Frontend**: The `diff_status_change` WebSocket handler in
`AgentsPage.tsx` had an early `return` (line 398) that skipped
`updateInfiniteChatsCache`, so the sidebar's cache was never updated.
Even for other event types, the cache merge only spread `status` and
`title` — never `diff_status`.
2. **Server**: `publishChatPubsubEvent` in `chatd.go` constructed a
minimal `Chat` payload without `DiffStatus`, so even if the frontend
consumed the event, `updatedChat.diff_status` would be `undefined`.
## Fix
### Server (`coderd/chatd/chatd.go`)
- `publishChatPubsubEvent` now accepts an optional
`*codersdk.ChatDiffStatus` parameter; when non-nil it's set on the
outgoing `Chat` payload.
- `PublishDiffStatusChange` fetches the diff status from the DB,
converts it, and passes it through.
- Added `convertDBChatDiffStatus` (mirrors `coderd/chats.go`'s converter
to avoid circular import).
- All other callers pass `nil`.
### Frontend (`site/src/pages/AgentsPage/AgentsPage.tsx`)
- Removed the early `return` so `diff_status_change` events fall through
to the cache update logic.
- Added `isDiffStatusEvent` flag and spread `diff_status` into both the
infinite chats cache (sidebar) and the individual chat cache.
## Summary
- add an `IS DISTINCT FROM` guard to `InsertChatMessage`'s
`updated_chat` CTE so `chats.last_model_config_id` is only rewritten
when the incoming `model_config_id` actually changes
- regenerate the query layer
- add focused regression coverage for the two meaningful behaviors:
same-model inserts and real model switches
- trim redundant message-field assertions so the new test stays focused
on the guard behavior
## Proof this is an improvement
This PR reduces work in the hottest chat write query without changing
the insert behavior.
### Why the old query did unnecessary work
Before this change, `InsertChatMessage` always ran this update whenever
`model_config_id` was non-null:
```sql
UPDATE chats
SET last_model_config_id = sqlc.narg('model_config_id')::uuid
WHERE id = @chat_id::uuid
AND sqlc.narg('model_config_id')::uuid IS NOT NULL
```
That means the query rewrote the `chats` row even when
`chats.last_model_config_id` was already equal to the incoming value.
### What changes in this PR
This PR adds:
```sql
AND chats.last_model_config_id IS DISTINCT FROM sqlc.narg('model_config_id')::uuid
```
So same-model inserts still insert the message, but they no longer
perform a redundant `UPDATE chats`.
### Why this matters on the hot path
From the chat scaletest investigation that motivated this change:
- `InsertChatMessage` (+ `updated_chat` CTE) was the hottest write query
- about **104k calls**
- about **0.69 ms average latency**
- about **71.8 s total DB execution time**
We also verified common callsites where the update is provably
redundant:
- `CreateChat` inserts the chat with `LastModelConfigID =
opts.ModelConfigID`, then immediately inserts initial system/user
messages with that same model config
- follow-up user messages commonly pass `lockedChat.LastModelConfigID`
straight into `InsertChatMessage`
- assistant/tool/summary persistence keeps the current model in the
common case; only real switches or fallback cases need the chat row
update
That means a meaningful fraction of executions of the hottest DB write
query move from:
- **before:** insert message **+** rewrite chat row
- **after:** insert message only
This should reduce row churn and write contention on `chats`, especially
against other chat-row writers like `UpdateChatStatus` and
`GetChatByIDForUpdate`.
Adds the `head_branch` field (the source/feature branch name of a PR) to
the diff status pipeline. Previously only `base_branch` (target branch)
and the head commit SHA were captured from the GitHub API, but not the
head branch name itself.
## Changes
- **Migration 438**: Add `head_branch` nullable TEXT column to
`chat_diff_statuses`
- **gitprovider**: Parse `head.ref` from the GitHub API response
(alongside `head.sha`) and add `HeadBranch` to `PRStatus`
- **gitsync**: Wire `HeadBranch` through `refreshOne()` into the DB
upsert params
- **worker**: Map `HeadBranch` in `chatDiffStatusFromRow()`
- **coderd**: Convert `HeadBranch` in `convertChatDiffStatus()`
- **codersdk**: Expose as `head_branch` (`*string`, omitempty) in
`ChatDiffStatus` API response
- **Tests**: Updated `github_test.go` pull JSON fixtures and assertions
Surfaces cache token data in the analytics views and fixes table
spacing.
### Changes
- **Cache token columns**: Added cache read and cache write token counts
to all analytics views (user and admin), from SQL queries through Go SDK
types to the frontend tables and summary cards.
- **Table spacing fix**: Replaced the bare React fragment in
`ChatCostSummaryView` with a `space-y-6` container so the model and chat
breakdown tables no longer overlap.
### Data flow
`chat_messages` table already stores `cache_read_tokens` and
`cache_creation_tokens` (and uses them for cost calculation). This PR
aggregates and displays them alongside input/output tokens in:
- Summary cards (6 cards: Total Cost, Input, Output, Cache Read, Cache
Write, Messages)
- Per-model breakdown table
- Per-chat breakdown table
- Admin per-user table
Implement the backend for the desktop feature for agents.
- Adds a new `/api/experimental/chats/$id/desktop` endpoint to coderd
which exposes a VNC stream from a
[portabledesktop](https://github.com/coder/portabledesktop) process
running inside the workspace
- Adds a new `spawn_computer_use_agent` tool to chatd, which spawns a
subagent that has access to the `computer` tool which lets it interact
with the `portabledesktop` process running inside the workspace
- Adds the plumbing to make the above possible
There's a follow up frontend PR here:
https://github.com/coder/coder/pull/23006
Add cost tracking for LLM chat interactions with microdollar precision.
## Changes
- Add `chatcost` package for per-message cost calculation using
`shopspring/decimal` for intermediate arithmetic
- **Ceil rounding policy**: fractional micros round UP to next whole
micro (applied once after summing all components)
- Database migration: `total_cost_micros` BIGINT column with historical
backfill and `created_at` index
- API endpoints: per-user cost summary and admin rollup under
`/api/experimental/chats/cost/`
- SDK types: `ChatCostSummary`, `ChatCostModelBreakdown`,
`ChatCostUserRollup`
- Fix `modeloptionsgen` to handle `decimal.Decimal` as opaque numeric
type
- Update frontend pricing test fixtures for string decimal types
## Design decisions
- `NULL` = unpriced (no matching model config), `0` = free
- Reasoning tokens included in output tokens (no double-counting)
- Integer microdollars (BIGINT) for storage and API responses
- Price config uses `decimal.Decimal` for exact parsing; totals use
`int64`
Frontend: #23037
Migration 000434 converts chat_messages.role from text to a Postgres
enum, rebuilds the partial index, and adds content_version smallint.
The column is backfilled with DEFAULT 0, then the default is dropped
so future inserts must set it explicitly.
Version 0 uses the role-aware heuristic from #22958. Version 1 (all
new inserts) stores []ChatMessagePart JSON for all roles, including
system messages. ParseContent takes database.ChatMessage directly
and dispatches on version internally. Unknown versions error.
All string(codersdk.ChatMessageRole*) casts at DB write sites are
replaced with database.ChatMessageRole* constants from sqlc.
Refs #22958
File-reference parts in user messages were flattened to `TextContent` at
write time because fantasy has no file-reference content type. The
frontend never saw them as structured parts.
This moves all write paths (user, assistant, tool) from fantasy envelope
format to `codersdk.ChatMessagePart`. The streaming layer (`chatloop`)
is untouched, the conversion happens at the serialization boundary in
`persistStep`.
Old rows are still readable. `ParseContent` uses a structural heuristic
(`isFantasyEnvelopeFormat`) to distinguish legacy envelopes from SDK
parts. We chose this over try/fallback because fantasy envelopes
partially unmarshal into `ChatMessagePart` (the `type` field matches)
while silently losing content. A guard test enforces that no SDK part
can produce the envelope shape.
This is forward-only: new rows are unreadable by old code. Chat is
behind a feature flag so rollback risk is contained.
Also adds a typed `ChatMessageRole` to replace raw strings and
`fantasy.MessageRole*` casts at the persistence boundary. The type
covers `ChatMessage.Role`, `ChatStreamMessagePart.Role`, the
`PublishMessagePart` callback chain, and all DB write sites.
`fantasy.MessageRole*` remains only where we build `fantasy.Message`
structs for LLM dispatch.
Separately, `ProviderMetadata` was leaking to SSE clients via
`publishMessagePart`. `StripInternal` now runs on both the SSE and REST
paths, covering this.
Other cleanup:
- Old `db2sdk.contentBlockToPart` silently dropped metadata on
text/reasoning/tool-call content. New code preserves it.
- `providerMetadataToOptions` now logs warnings instead of silently
returning nil.
- `db2sdk` shrinks from ~250 lines of parallel conversion to ~15 lines
delegating to `chatprompt.ParseContent()`, removing the `fantasy` import
entirely.
Refs #22821
The timeout was started before the unbounded Stepper loop, so
under CI load the deadline could expire before reaching the
operations that actually use it.
Also bumps TestMigration000387 from WaitLong to WaitSuperLong.
Fixescoder/internal#1398
## Problem
1. **Personal behavior prompt not applied**: The chatd background worker
was missing `ActionReadPersonal` on `ResourceUser` in its RBAC subject.
When `resolveUserPrompt` calls `GetUserChatCustomPrompt`, the dbauthz
layer checks `ActionReadPersonal` on the user — which the chatd role
didn't have. The error was silently swallowed (returns `""`), so the
user's custom prompt was never injected into the system messages.
2. **Sequential DB calls on chat startup**: Several independent database
queries in `runChat` and `resolveChatModel` were running sequentially,
adding unnecessary latency before the LLM stream begins.
## Changes
### RBAC fix (`dbauthz.go`)
- Add `rbac.ResourceUser.Type: {policy.ActionReadPersonal}` to
`subjectChatd` site permissions
- This is the minimal permission needed — `ActionRead` on User remains
denied
### Parallelization (`chatd.go`)
Three parallelization points using `errgroup.Group`:
1. **`resolveChatModel`**: `resolveModelConfig` and
`GetEnabledChatProviders` run concurrently (both needed for
`ModelFromConfig`, which stays sequential after the wait)
2. **`runChat` startup**: `resolveChatModel` and
`GetChatMessagesForPromptByChatID` run concurrently (completely
independent)
3. **`runChat` prompt assembly**: `resolveInstructions` and
`resolveUserPrompt` run concurrently (both produce strings;
`InsertSystem` calls maintain correct order after the wait)
Same pattern applied to the `ReloadMessages` callback.
### Test (`dbauthz_test.go`)
- Add assertion in `TestAsChatd/AllowedActions` that
`ActionReadPersonal` on `ResourceUser` is permitted
## What
Adds provider-native web search tools to the chat system. Anthropic,
OpenAI, and Google all offer server-side web search — this wires them up
as opt-in per-model config options using the existing
`ChatModelProviderOptions` JSONB column (no migration).
Web search is **off by default**.
## Config
Set `web_search_enabled: true` in the model config provider options:
```json
{
"provider_options": {
"anthropic": {
"web_search_enabled": true,
"allowed_domains": ["docs.coder.com", "github.com"]
}
}
}
```
Available options per provider:
- **Anthropic**: `web_search_enabled`, `allowed_domains`,
`blocked_domains`
- **OpenAI**: `web_search_enabled`, `search_context_size`
(`low`/`medium`/`high`), `allowed_domains`
- **Google**: `web_search_enabled`
## Backend
- `codersdk/chats.go` — new fields on the per-provider option structs
- `coderd/chatd/chatd.go` — `buildProviderTools()` reads config, creates
`ProviderDefinedTool` entries (uses `anthropic.WebSearchTool()` helper
from fantasy)
- `coderd/chatd/chatloop/chatloop.go` — `ProviderTools` on `RunOptions`,
merged into `Call.Tools`. Provider-executed tool calls skip local
execution. `StreamPartTypeToolResult` with `ProviderExecuted: true` is
accumulated inline (matching fantasy's own agent.go pattern) instead of
post-stream synthesis.
- `coderd/chatd/chatprompt/` — `MarshalToolResult` carries
`ProviderMetadata` through DB persistence so multi-turn round-trips work
(Anthropic needs `encrypted_content` back)
## Frontend
- Source citations render **inline** at the tool-call position (not
bottom-of-message), using `ToolCollapsible` so they look like other tool
cards — collapsed "Searched N results" with globe icon, expand to see
source pills
- Provider-executed tool calls/results are hidden from the normal tool
card UI
- Tool-role messages with only provider-executed results return `null`
(no empty bubble)
- Both persisted (messageParsing.ts) and streaming (streamState.ts)
paths group consecutive `source` parts into a single `{ type: "sources"
}` render block
## Fantasy changes
The fantasy fork (`kylecarbs/fantasy` branch `cj/go1.25`) has the
Anthropic tool code merged in, but will hopefully go upstream from:
https://github.com/charmbracelet/fantasy/pull/163
## Summary
Scale-tested the `chatd` package with mock-based benchmarks to identify
performance bottlenecks. This PR fixes 6 of the 8 identified issues,
ranked by severity.
## Changes
### 1. Parallel tool execution (HIGH) — `chatloop.go`
`executeTools` ran tool calls sequentially. Now dispatches all calls
concurrently via goroutines with `sync.WaitGroup`. Results are
pre-allocated by index (no mutex needed). `onResult` callbacks fire as
each tool completes.
### 2. Pubsub-backed subagent await (HIGH) — `subagent.go`
`awaitSubagentCompletion` polled the DB every 200ms. Now subscribes to
the child chat's `ChatStreamNotifyChannel` via pubsub for near-instant
notifications. Fallback poll reduced to 5s. Falls back to 200ms only
when `pubsub == nil` (single-instance / in-memory).
### 3. Per-chat stream locking (MEDIUM) — `chatd.go`
Replaced single global `streamMu` + `map[uuid.UUID]*chatStreamState`
with `sync.Map` where each `chatStreamState` has its own `sync.Mutex`.
Zero cross-chat contention.
### 4. Batch chat acquisition (MEDIUM) — `chatd.go`
`processOnce` acquired 1 chat per tick. Now loops up to
`maxChatsPerAcquire = 10` per tick, avoiding idle time when many chats
are pending.
### 5. Reduced heartbeat frequency (LOW-MEDIUM) — `chatd.go`
`chatHeartbeatInterval` changed from 30s to 60s. Safe given the 5-minute
`DefaultInFlightChatStaleAfter`.
### 6. O(depth) descendant check (LOW) — `subagent.go`
Replaced top-down BFS (`O(total_descendants)` queries) with bottom-up
parent-chain walk (`O(depth)` queries). Includes cycle protection.
## Not addressed (intentionally)
- Message serialization overhead
- Buffer eviction (`buffer[1:]` pattern)
Adds `pull_request_title` and `pull_request_draft` to the chat diff
status pipeline (DB → provider → SDK → frontend). The GitHub provider
now fetches the PR title alongside existing status fields.
The agents sidebar now displays PR-state-aware icons for chats that have
a linked pull request (when the chat is in waiting/completed state):
- **Open PR**: `GitPullRequestArrow` (green)
- **Draft PR**: `GitPullRequestDraft` (gray)
- **Merged PR**: `GitMerge` (purple)
- **Closed PR**: `GitPullRequestClosed` (red)
Running/pending/paused/error chats keep their existing activity icons
(spinner, pause, error triangle).
### Changes
**Database migration** (`000432`): Adds `pull_request_title TEXT` and
`pull_request_draft BOOLEAN` columns to `chat_diff_statuses`.
**Backend pipeline**:
- `gitprovider.PRStatus` gains a `Title` field
- GitHub provider decodes the `title` from the API response
- `gitsync` and `coderd/chats.go` pass title + draft through to the DB
upsert
- `codersdk.ChatDiffStatus` exposes both new fields in the API response
**Frontend** (`AgentsSidebar.tsx`): New `getPRIconConfig()` function
resolves the appropriate Lucide git icon based on `pull_request_state`
and `pull_request_draft`. Only applies when the chat is in a terminal
state (waiting/completed).
**Real-time sync**: No changes needed — the existing
`diff_status_change` pubsub event already propagates the full
`ChatDiffStatus` including the new fields.
Adds a `created_by` column (nullable UUID) to the `chat_messages` table
to track which user created each message. Only user-sent messages
populate this field; assistant, tool, system, and summary messages leave
it null.
The column is threaded through the full stack: SQL migration, query
updates, generated Go/TypeScript types, db2sdk conversion, chatd
(including subagent paths), and API handlers. All API handlers that
insert user messages now pass the authenticated user's ID as
`created_by`.
No foreign key constraint was added, matching the existing pattern used
by `chat_model_configs.created_by`.