coder

mirror of https://github.com/coder/coder.git synced 2026-06-04 13:38:21 +00:00

Author	SHA1	Message	Date
Kyle Carberry	742694eb20	fix: filter empty text/reasoning parts before sending to LLM (#23284 ) ## Problem Anthropic rejects requests containing empty text content blocks with: ``` messages: text content blocks must be non-empty ``` Empty text parts (`""` or whitespace-only like `" "`) get persisted in the database when a stream sends `TextStart`/`TextEnd` with no `TextDelta` in between. On the next turn, these parts are loaded from the DB and sent to Anthropic, which rejects them. ## Fix Filter empty/whitespace-only text and reasoning parts at the two LLM dispatch boundaries, without modifying persistence (the raw record is preserved): - `partsToMessageParts()` in `chatprompt.go` — filters when converting persisted DB messages to fantasy message parts for LLM calls. This is the last gateway before the Anthropic provider creates `TextBlockParam` objects. - `toResponseMessages()` in `chatloop.go` — filters when building in-flight conversation messages between steps within a single turn. Note: `flushActiveState()` (the interruption path) already had this guard — the normal `TextEnd` streaming path did not, but since we're not changing persistence, the fix is applied at the dispatch layer.	2026-03-19 12:10:54 -04:00
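The dispatch-boundary filter described in 742694eb20 can be sketched as below. This is only an illustration: the `part` type and the `filterEmptyTextParts` name are hypothetical stand-ins, not the real types used by `partsToMessageParts()` or `toResponseMessages()`.

```go
package main

import "strings"

// part is a hypothetical stand-in for a persisted message part. The real
// types in chatprompt.go and chatloop.go are not shown in the commit message.
type part struct {
	Kind string // e.g. "text", "reasoning", "tool-call"
	Text string
}

// filterEmptyTextParts drops text and reasoning parts whose content is empty
// or whitespace-only. Parts of any other kind pass through unchanged, and the
// input slice (the persisted record) is not modified.
func filterEmptyTextParts(parts []part) []part {
	out := make([]part, 0, len(parts))
	for _, p := range parts {
		isTextual := p.Kind == "text" || p.Kind == "reasoning"
		if isTextual && strings.TrimSpace(p.Text) == "" {
			continue
		}
		out = append(out, p)
	}
	return out
}
```

Filtering on a copy mirrors the commit's stated choice to leave persistence untouched and only clean what is sent to the provider.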
Kyle Carberry	86cb313765	fix: update fantasy to fix OpenAI reasoning replay with Store enabled (#23297 ) ## Problem When `Store: true` is set for OpenAI Responses API calls (the new default), multi-turn conversations with reasoning models fail on the second message: ``` stream response: bad request: Item 'rs_xxx' of type 'reasoning' was provided without its required following item. ``` The fantasy library was reconstructing full `OfReasoning` input items (with encrypted content and summary) when replaying assistant messages. The API cannot pair these reconstructed reasoning items with the output items that originally followed them because the output items are sent as plain `OfMessage` without server-side IDs. ## Fix Updates the fantasy dependency (`kylecarbs/fantasy@cj/go1.25`) to skip reasoning parts during conversation replay in `toResponsesPrompt`. With `Store` enabled, the API already has the reasoning persisted server-side — it doesn't need to be replayed in the input. Fantasy PR: https://github.com/charmbracelet/fantasy/pull/181 ## Testing Adds `TestOpenAIReasoningRoundTrip` integration test that: 1. Sends a query to `o4-mini` (reasoning model with `Store: true`) 2. Verifies reasoning content is persisted 3. Sends a follow-up message — this was the failing step 4. Verifies the follow-up completes successfully Requires `OPENAI_API_KEY` env var to run.	2026-03-19 15:36:29 +00:00
Michael Suchacz	6d214644f6	fix: make TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease deterministic (#23279 ) Eliminates the timing flake in `TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease` by making the chatd worker loop clock-controllable. ## Changes `coderd/chatd/chatd.go` - Replace `time.NewTicker` calls in `Server.start()` with `p.clock.NewTicker` using named quartz tags `("chatd", "acquire")` and `("chatd", "stale-recovery")`. `coderd/chatd/chatd_test.go` - Inject `quartz.NewMock(t)` into the test via `newActiveTestServer` config override. - Trap the acquire ticker so the test controls exactly when pending chats are reacquired. - Rewrite the test flow as explicit clock-advance steps instead of wall-clock polling. `AGENTS.md` - Document the PR title scope rule (scope must be a real path containing all changed files). ## Validation - `go test ./coderd/chatd -run TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease -count=100` ✅ - `go test ./coderd/chatd` ✅ - `make lint` ✅	2026-03-19 15:14:00 +00:00
Kyle Carberry	d8ff67fb68	feat: add MCP server configuration backend for chats (#23227 ) ## Summary Adds the database schema, API endpoints, SDK types, and encryption wrappers for admin-managed MCP (Model Context Protocol) server configurations that chatd can consume. This is the backend foundation for allowing external MCP tools (Sentry, Linear, GitHub, etc.) to be used during AI chat sessions. ## Database Two new tables: - `mcp_server_configs`: Admin-managed server definitions with URL, transport (Streamable HTTP / SSE), auth config (none / OAuth2 / API key / custom headers), tool allow/deny lists, and an availability policy (`force_on` / `default_on` / `default_off`). Includes CHECK constraints on transport, auth_type, and availability values. - `mcp_server_user_tokens`: Per-user OAuth2 tokens for servers requiring individual authentication. Cascades on user/config deletion. New column on `chats` table: - `mcp_server_ids UUID[]`: Per-chat MCP server selection, following the same pattern as `model_config_id` — passed at chat creation, changeable per-message with nil-means-no-change semantics. ## API Endpoints All routes are under `/api/experimental/mcp/servers/` and gated behind the `agents` experiment. 
Admin endpoints (`ResourceDeploymentConfig` auth): - `POST /` — Create MCP server config - `PATCH /{id}` — Update MCP server config (full-replace) - `DELETE /{id}` — Delete MCP server config Authenticated endpoints (all users, enabled servers only for non-admins): - `GET /` — List configs (admins see all, members see enabled-only with admin fields redacted) - `GET /{id}` — Get config by ID (with `auth_connected` populated per-user) OAuth2 per-user auth flow: - `GET /{id}/oauth2/connect` — Initiate OAuth2 flow (state cookie CSRF protection) - `GET /{id}/oauth2/callback` — Handle OAuth2 callback, store tokens - `DELETE /{id}/oauth2/disconnect` — Remove stored OAuth2 tokens ## Security - Secrets never returned: `OAuth2ClientSecret`, `APIKeyValue`, and `CustomHeaders` are never in API responses — only boolean indicators (`has_oauth2_secret`, `has_api_key`, `has_custom_headers`). - Field redaction for non-admins: `convertMCPServerConfigRedacted` strips `OAuth2ClientID`, auth URLs, scopes, and `APIKeyHeader` from non-admin responses. - dbcrypt encryption at rest: All 5 secret fields use `dbcrypt_keys` encryption with full encrypt-on-write / decrypt-on-read wrappers (11 dbcrypt method overrides + 2 helpers), following the same pattern as `chat_providers.api_key`. - OAuth2 CSRF protection: State parameter stored in `HttpOnly` cookie with `HTTPCookies.Apply()` for correct `Secure`/`SameSite` behind TLS-terminating proxies. - dbauthz authorization: All 18 querier methods have authorization wrappers. Read operations use `ActionRead`, write operations use `ActionUpdate` on `ResourceDeploymentConfig`. 
## Governance Model

| Control | Implementation |
|---------|----------------|
| Global kill switch | `enabled` defaults to `false` |
| Availability policy | `force_on` (always injected), `default_on` (pre-selected), `default_off` (opt-in) |
| Per-chat selection | `mcp_server_ids` on `CreateChatRequest` / `CreateChatMessageRequest` |
| Auth gate | OAuth2 servers require per-user auth before tools are injected |
| Tool-level allow/deny | Arrays on `mcp_server_configs` for granular tool filtering |
| Secrets encrypted at rest | Uses `dbcrypt_keys` (same pattern as `chat_providers.api_key`) |

## Tests 8 test functions covering: - Full CRUD lifecycle (create, list, update, delete) - Non-admin visibility filtering (enabled-only, field redaction) - `auth_connected` population for OAuth2 vs non-OAuth2 servers - Availability policy validation (valid values + invalid rejection) - Unique slug enforcement (409 Conflict) - OAuth2 disconnect idempotency - Chat creation with `mcp_server_ids` persistence ## Known Limitations (Deferred) These are documented and intentional for an experimental feature: - Audit logging not yet wired: will add when feature stabilizes - Cross-field validation (e.g., OAuth2 fields required when `auth_type=oauth2`): admin-only endpoint, will add when stabilizing - `force_on` auto-injection: query exists but not yet wired into chatd tool injection (follow-up) - Additional test coverage: 403 auth tests, GET-by-ID tests, callback CSRF tests planned for follow-up ## What's NOT in this PR - Frontend UI (admin panel + chat picker) - Actual MCP client connections (`chatd/chatmcp/` manager) - Tool injection into `chatloop/`	2026-03-19 14:07:36 +00:00
Kyle Carberry	fdc2366227	chore: update fantasy dep to rebased cj/go1.25 branch (#23242 ) Updates the `charm.land/fantasy` replace to the rebased `cj/go1.25` branch on `kylecarbs/fantasy`, which now includes: - chore: downgrade to Go 1.25 - feat: anthropic computer use - chore: use kylecarbs/openai-go fork for coder/coder compat Switches the `openai-go/v3` replace from `SasSwart/openai-go` → `kylecarbs/openai-go`, which is the same SasSwart perf fork plus a fix for `WithJSONSet` being clobbered by deferred body serialization. Without the fix, `NewStreaming` silently drops `stream: true` from requests. See https://github.com/kylecarbs/openai-go/pull/2 for details.	2026-03-19 12:59:39 +00:00
Ethan	cda460f5df	perf(coderd/chatd): skip same-replica stream DB rereads (#23218 ) ## Problem Scaletest follow-up storms showed that the chat stream path was doing a same-replica DB reread for every durable message it had already delivered locally. In a 600-chat / 10-turn run, `/stream`-attributed `GetChatMessagesByChatID` calls reached about 14.2k across 5,400 follow-up turns — roughly 2.63 rereads per turn. The primary coderd replicas saturated their DB pools at 60/60 open connections during the storm window. The root cause: when pubsub was active, `Subscribe()` suppressed local durable `message` events and relied entirely on pubsub notify → `GetChatMessagesByChatID` for catch-up. Same-replica subscribers paid the full DB round-trip even though the persisting process was on the same replica. ## Solution Add a bounded per-chat durable message cache to `chatStreamState` so that same-replica subscribers can catch up from memory instead of the database. ### How it works 1. `publishMessage()` caches the SDK event in `chatStreamState` before local fanout and pubsub notify. 2. `publishEditedMessage()` replaces the cache with only the edited message, then publishes `FullRefresh`. 3. `Subscribe()` handles ordinary `AfterMessageID` notifies by first consulting the per-chat durable cache and only falling back to `GetChatMessagesByChatID` on cache miss. 4. `FullRefresh` always forces a DB reread (cache is bypassed). ### Safety properties - If the cache misses (e.g. message expired or remote replica), the DB catch-up still runs — no silent message loss. - `FullRefresh` (edits) always rereads from the database. - Remote replicas still use the pubsub + DB path unchanged. - The cache is bounded (`maxDurableMessageCacheSize = 256`) and scoped per chat — no unbounded memory growth. ## Impact This change removes the entire same-replica portion of the stream rereads. 
Based on the 600-chat follow-up run, the upper bound on saved work is the same-replica share of about 14.2k `GetChatMessagesByChatID` rereads, with the observed total stream reread rate at about 2.63 rereads per follow-up turn.	2026-03-19 14:02:00 +11:00
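The bounded per-chat cache from cda460f5df might look roughly like the sketch below. Only the size constant's value (256) and the "fall back to the database on a miss" rule come from the commit message. The type names, method names, and the exact hit condition are assumptions made for illustration.

```go
package main

// cachedMessage is a hypothetical stand-in for the cached SDK event.
type cachedMessage struct {
	ID   int64
	Body string
}

// durableCache holds at most max recent messages for one chat, in the order
// they were published (assumed here to be ascending by ID).
type durableCache struct {
	max  int // the commit uses maxDurableMessageCacheSize = 256
	msgs []cachedMessage
}

// add appends a message and evicts the oldest entries beyond the bound.
func (c *durableCache) add(m cachedMessage) {
	c.msgs = append(c.msgs, m)
	if len(c.msgs) > c.max {
		// Copy so the evicted prefix can be garbage collected.
		c.msgs = append([]cachedMessage(nil), c.msgs[len(c.msgs)-c.max:]...)
	}
}

// replaceWith models the edit path: the cache keeps only the edited message.
func (c *durableCache) replaceWith(m cachedMessage) {
	c.msgs = []cachedMessage{m}
}

// after returns the cached messages with ID > afterID. ok is false when the
// cache cannot prove it covers that range (it is empty, older entries may
// have been evicted, or nothing newer is cached). On a miss the caller must
// fall back to the database read, so no message is silently lost.
func (c *durableCache) after(afterID int64) (msgs []cachedMessage, ok bool) {
	if len(c.msgs) == 0 || c.msgs[0].ID > afterID {
		return nil, false
	}
	for _, m := range c.msgs {
		if m.ID > afterID {
			msgs = append(msgs, m)
		}
	}
	return msgs, len(msgs) > 0
}
```

Treating an empty result as a miss is a conservative choice in this sketch: a notify for a message written by another replica will not be in the local cache, so the database path still runs.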
Hugo Dutka	d285a3e74e	fix: handle null bytes in chat messages (#22946) This PR fixes a bug where a tool result containing binary data was not persisted to the database. `jsonb` in Postgres cannot store null bytes, which tool results sometimes contain. This change encodes them with a special escape sequence before saving to the database and decodes them on read.	2026-03-18 21:19:25 +01:00
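The commit message for d285a3e74e does not say which escape sequence is used, so the following is only a sketch of one reversible scheme. The marker character (U+E000) and the function names are assumptions, not the encoding the PR ships.

```go
package main

import "strings"

// escapeMarker is a hypothetical marker rune from the Unicode private use
// area. The escape sequence chosen by the actual PR is not stated.
const escapeMarker = "\uE000"

var (
	// Encoding escapes the marker itself first-class so decoding stays
	// unambiguous: marker -> marker+"1", NUL -> marker+"0".
	nullEncoder = strings.NewReplacer(escapeMarker, escapeMarker+"1", "\x00", escapeMarker+"0")
	// Decoding reverses both mappings in a single left-to-right pass.
	nullDecoder = strings.NewReplacer(escapeMarker+"0", "\x00", escapeMarker+"1", escapeMarker)
)

// encodeNulls returns s with every null byte replaced by an escape sequence,
// making it safe to store in a jsonb column.
func encodeNulls(s string) string { return nullEncoder.Replace(s) }

// decodeNulls restores the original string produced before encodeNulls.
func decodeNulls(s string) string { return nullDecoder.Replace(s) }
```

Escaping the marker as well as the null byte is what makes the round trip lossless for inputs that already happen to contain the marker.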
Cian Johnston	14ed3e3644	feat: bump workspace last_used_at on chat heartbeat (#23205 ) - coderd: Wires `options.WorkspaceUsageTracker` into the chatd config. - chatd: Adds `UsageTracker` and calls `UsageTracker.Add(workspaceID)` on each heartbeat tick - chatd: adds tests to verify `last_used_at` bump behaviour > 🤖 This PR was created with the help of Coder Agents, and will be reviewed by my human. 🧑‍💻	2026-03-18 19:07:21 +00:00
Kyle Carberry	1f0d896fc9	feat: add deleted flag to chat messages for soft-delete (#23223 ) Adds a `deleted` boolean column to the `chat_messages` table. Messages are never physically deleted from the database — instead they are marked as deleted so that usage and cost data is preserved. ## Changes ### Migration - New migration (000444) adds `deleted boolean NOT NULL DEFAULT false` to `chat_messages` ### SQL queries - `DeleteChatMessagesAfterID` → `SoftDeleteChatMessagesAfterID` (UPDATE SET deleted=true instead of DELETE) - New `SoftDeleteChatMessageByID` query for single-message soft-delete - All read queries now filter `deleted = false`: - `GetChatMessageByID` - `GetChatMessagesByChatID` - `GetChatMessagesByChatIDDescPaginated` - `GetChatMessagesForPromptByChatID` (both CTE and main query) - `GetLastChatMessageByRole` - Cost/usage queries (`GetChatCostSummary`, `GetChatCostPerModel`, etc.) intentionally still include deleted messages to preserve accurate spend tracking ### EditMessage behavior - Previously: updated the message content in-place + hard-deleted subsequent messages - Now: soft-deletes the original message + soft-deletes subsequent messages + inserts a new message with the updated content - This preserves the original message data (tokens, cost, content) in the database	2026-03-18 14:37:09 -04:00
Cian Johnston	0b13ba978a	fix: rename chat logger from coderd.chats.chat-processor to coderd.chatd.processor (#23246) - Rename logger `coderd.chats` to `coderd.chatd` in `coderd.go` - Rename sub-logger `chat-processor` to `processor` in `chatd/chatd.go`	2026-03-18 17:48:47 +00:00
Kyle Carberry	d4a072b61e	fix: address review comments on InsertChatMessages (#23239 ) Follow-up to #23220, addressing Cian's review comments: - SQL casing: Uppercase `UNNEST` to match `NULLIF`/`COALESCE` convention in the query. - Builder pattern: `chatMessage` struct now uses unexported fields with a `newChatMessage` constructor for required fields (role, content, visibility, modelConfigID, contentVersion) and chainable builder methods (`withCreatedBy`, `withCompressed`, `withUsage`, `withContextLimit`, `withTotalCostMicros`, `withRuntimeMs`) for optional/nullable fields. - Batch test in chats_test: Replaced the `for i := 0; i < 2` loop with a single batch insert of 2 messages to actually exercise the batch logic. - Multi-message querier test: Added `BatchInsertMultipleMessages` test verifying 3-message batch insert with role ordering, sequential IDs, nullable field semantics (NULL for zero UUIDs and zero ints), and token/cost assertions. --------- Co-authored-by: Cian Johnston <cian@coder.com>	2026-03-18 17:06:44 +00:00
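The builder pattern described in d4a072b61e could be sketched as follows. The constructor and `with...` method names come from the commit message, but the field set is reduced and the field types and value receivers are assumptions.

```go
package main

// chatMessage uses unexported fields so callers must go through the
// constructor. Only a subset of the fields named in the commit is modeled.
type chatMessage struct {
	role      string
	content   string
	createdBy *string // nil is assumed to map to SQL NULL
	runtimeMs *int64  // nil is assumed to map to SQL NULL
}

// newChatMessage takes the required fields. The real constructor also takes
// visibility, modelConfigID, and contentVersion.
func newChatMessage(role, content string) chatMessage {
	return chatMessage{role: role, content: content}
}

// withCreatedBy sets the optional author and returns the updated copy so
// calls can be chained.
func (m chatMessage) withCreatedBy(id string) chatMessage {
	m.createdBy = &id
	return m
}

// withRuntimeMs sets the optional step runtime in milliseconds.
func (m chatMessage) withRuntimeMs(ms int64) chatMessage {
	m.runtimeMs = &ms
	return m
}
```

A call site would then read `newChatMessage("assistant", text).withRuntimeMs(42)`, leaving every optional field it does not mention as nil.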
Kyle Carberry	483adc59fe	feat: replace InsertChatMessage with batch InsertChatMessages (#23220 ) Replaces the singular `InsertChatMessage` query with `InsertChatMessages` that uses PostgreSQL's `unnest()` for batch inserts. This reduces the number of database round-trips when inserting multiple messages in a single transaction. ## Changes - SQL: New `InsertChatMessages :many` query using `unnest()` arrays following the existing codebase pattern (e.g., `InsertWorkspaceAgentStats`). Preserves the CTE that updates `chats.last_model_config_id` using the last non-null model config from the batch. Uses `NULLIF` for UUID columns to handle NULL foreign keys. - Go layers: Updated `querier.go`, `dbauthz.go`, `dbmetrics/querymetrics.go`, `dbmock/dbmock.go`, and `queries.sql.go` to use the new batch signature (`[]ChatMessage` return type, array params). - chatd.go: All call sites converted to batch inserts: - CreateChat: System prompt + user message batched into one call - persistStep: Assistant message + tool messages batched into one call - persistSummary: Hidden summary + assistant + tool messages batched into one call - Single-message sites use the same API with single-element arrays - Helper: New `appendChatMessage` function simplifies building batch params at each call site. - Tests: All test files updated to use the new API. Builds on top of #23213.	2026-03-18 16:27:07 +00:00
Kyle Carberry	4dd8531f37	feat: track step runtime_ms on chat messages (#23219 ) ## Summary Adds a `runtime_ms` column to `chat_messages` that records the wall-clock duration (in milliseconds) of each LLM step. This covers LLM streaming, tool execution, and retries — the full time the agent is "alive" for a step. This is the foundation for billing by agent alive time. The column follows the same pattern as `total_cost_micros`: stored per assistant message, aggregatable with `SUM()` over time periods by user. ## Changes - Migration: adds nullable `runtime_ms bigint` to `chat_messages`. - chatloop: adds `Runtime time.Duration` field to `PersistedStep`, measures `time.Since(stepStart)` at the beginning of each step (covering stream + tool execution + retries). - chatd: passes `step.Runtime.Milliseconds()` to the assistant message `InsertChatMessage` call; all other message types (system, user, tool) get `NULL`. - Tests: adds `runtime > 0` assertion in chatloop tests. ## Billing query pattern Once ready, aggregation mirrors the existing cost queries: ```sql SELECT COALESCE(SUM(cm.runtime_ms), 0)::bigint AS total_runtime_ms FROM chat_messages cm JOIN chats c ON c.id = cm.chat_id WHERE c.owner_id = @user_id AND cm.created_at >= @start_time AND cm.created_at < @end_time AND cm.runtime_ms IS NOT NULL; ```	2026-03-18 10:57:35 -04:00
Kyle Carberry	b83b93ea5c	feat: add workspace awareness system message on chat creation (#23213 ) When a chat is created via `chatd`, a system message is now inserted informing the model whether the chat was created with or without a workspace. With workspace: > This chat is attached to a workspace. You can use workspace tools like execute, read_file, write_file, etc. Without workspace: > There is no workspace associated with this chat yet. Create one using the create_workspace tool before using workspace tools like execute, read_file, write_file, etc. This is a model-only visibility system message (not shown to users) that helps the model understand its available capabilities upfront — particularly important for subagents spawned without a workspace, which previously would attempt to use workspace tools and fail. Changes: - `coderd/chatd/chatd.go`: Added workspace awareness constants and inserted the system message in `CreateChat` after the system prompt, before the initial user message. - `coderd/chatd/chatd_test.go`: Added `TestCreateChatInsertsWorkspaceAwarenessMessage` with sub-tests for both with-workspace and without-workspace cases.	2026-03-18 14:01:46 +00:00
Ethan	fc3508dc60	feat: configure acquire chat batch size (#23196 ) ## Summary - add a hidden deployment config option for chat acquire batch size (`CODER_CHAT_ACQUIRE_BATCH_SIZE` / `chat.acquireBatchSize`) - thread the configured value into chatd startup while preserving the existing default of `10` - clamp the deployment value to the `int32` range before passing it into chatd - regenerate the API/docs/types/testdata artifacts for the new config field ## Why `chatd` currently acquires pending chats in batches of `10` via a compile-time default. This change makes that batch size operator-configurable from deployment config, so we can tune acquisition behavior without another code change.	2026-03-19 00:54:32 +11:00
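Clamping the configured batch size to the `int32` range, as fc3508dc60 describes, is a small operation. A minimal sketch, assuming the deployment value arrives as an `int64` and using a hypothetical helper name:

```go
package main

import "math"

// clampToInt32 narrows v to the int32 range instead of letting a plain
// conversion wrap around. The helper name is hypothetical.
func clampToInt32(v int64) int32 {
	if v > math.MaxInt32 {
		return math.MaxInt32
	}
	if v < math.MinInt32 {
		return math.MinInt32
	}
	return int32(v)
}
```

How the real code treats zero or negative values (for example, falling back to the default of `10`) is not stated in the commit message, so the sketch leaves them as-is.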
Kyle Carberry	d42008e93d	fix: persist partial assistant response when chat is interrupted mid-stream (#23193) ## Problem When a user cancels a streaming chat response mid-stream, the partial content disappears entirely, both from the UI and the database. The streamed text vanishes as if the response never happened. ## Root Causes Three issues combine to prevent partial message persistence on interrupt: ### 1. StreamPartTypeError only matched `context.Canceled` (`chatloop.go`) The interrupt detection in `processStepStream` checked: ```go errors.Is(part.Error, context.Canceled) && errors.Is(context.Cause(ctx), ErrInterrupted) ``` But some providers propagate `ErrInterrupted` directly as the stream error rather than wrapping it in `context.Canceled`. This caused the condition to fail, so `flushActiveState` was never called and partial text accumulated in `activeTextContent` was lost. ### 2. No post-loop interrupt check (`chatloop.go`) If the stream iterator stops yielding parts without producing a `StreamPartTypeError` (e.g., a provider that silently closes the response body on cancel), there was no check after the `for part := range stream` loop to detect the interrupt and flush active state. ### 3. Worker ownership check blocked interrupted persists (`chatd.go`) `InterruptChat` → `setChatWaiting` clears `worker_id` in the DB before the chatloop detects the interrupt. When `persistInterruptedStep` (using `context.WithoutCancel`) tried to write the partial message, the ownership check: ```go if !lockedChat.WorkerID.Valid || lockedChat.WorkerID.UUID != p.workerID { return chatloop.ErrInterrupted // always blocks! } ``` unconditionally rejected the write. The error was silently logged as a warning. ## Fix - Broaden the `StreamPartTypeError` interrupt detection to match both `context.Canceled` and `ErrInterrupted` as the stream error. 
- Add a post-loop interrupt check in `processStepStream` that flushes active state when the context was canceled with `ErrInterrupted`. - Allow `persistStep` to write when the chat is in `waiting` status (interrupt) even if `worker_id` was cleared. The `pending` status (from `EditMessage`, where history is truncated) still correctly blocks stale writes. ## Testing Added `TestInterruptChatPersistsPartialResponse` — an end-to-end integration test that: 1. Streams partial text chunks from a mock LLM 2. Waits for the chatloop to publish `message_part` events (confirming chunks were processed) 3. Interrupts the chat mid-stream 4. Verifies the partial assistant message is persisted in the database with the expected text content	2026-03-18 11:48:28 +00:00
Hugo Dutka	2cf47ec384	feat: virtual desktop settings toggle backend (#23171) Adds a new `site_config` entry that controls whether the virtual desktop feature for Coder Agents is enabled. It can be set via a new `/api/experimental/chats/config/desktop-enabled` endpoint, which will be used by the frontend.	2026-03-18 09:35:13 +01:00
Ethan	11481d7bed	perf(coderd/chatd): reduce lock contention in instruction cache and persistStep (#23144) ## Summary Two targeted performance improvements to the chatd server, identified through benchmarking. ### 1. RWMutex for instruction cache The instruction cache is read on every chat turn to fetch the home instruction file for a workspace agent. Writes only occur on cache misses (once per agent per 5-minute TTL window), making the access pattern ~90%+ reads. Switching from `sync.Mutex` to `sync.RWMutex` and using `RLock`/`RUnlock` on the read path allows concurrent readers instead of serializing them. Benchmark (200 concurrent chats):

| | ns/op |
|---|---|
| Mutex | 108 |
| RWMutex | 32 |
| Speedup | 3.4x |

### 2. Hoist JSON marshaling out of persistStep transaction `MarshalParts`, `PartFromContent`, `CalculateTotalCostMicros`, and the `usageForCost` struct population are pure CPU work that ran inside the `FOR UPDATE` transaction in `persistStep`. They have zero dependency on the database transaction. Moving all marshal and cost-calculation calls above `p.db.InTx()` means the row lock is held only for `GetChatByIDForUpdate` + `InsertChatMessage` calls. Benchmark (16 goroutines contending on same lock):

| Tool calls | Inside lock | Outside lock | Speedup |
|---|---|---|---|
| 1 | 13,977 ns/op | 1,055 ns/op | 13x |
| 5 | 38,203 ns/op | 3,769 ns/op | 10x |
| 10 | 67,353 ns/op | 7,284 ns/op | 9x |
| 20 | 145,864 ns/op | 14,045 ns/op | 10x |

No behavioral changes in either commit.	2026-03-18 16:12:14 +11:00
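The read-mostly locking change in 11481d7bed follows a common Go pattern, sketched below. The type and method names are hypothetical. Only the 5-minute TTL and the move from `sync.Mutex` to `sync.RWMutex` come from the commit message.

```go
package main

import (
	"sync"
	"time"
)

// instructionTTL matches the 5-minute window mentioned in the commit.
const instructionTTL = 5 * time.Minute

type cacheEntry struct {
	value   string
	expires time.Time
}

// instructionCache is a hypothetical read-mostly cache keyed by agent ID.
type instructionCache struct {
	mu      sync.RWMutex
	entries map[string]cacheEntry
}

func newInstructionCache() *instructionCache {
	return &instructionCache{entries: make(map[string]cacheEntry)}
}

// get takes only the read lock, so concurrent readers do not serialize.
func (c *instructionCache) get(agentID string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.entries[agentID]
	if !ok || !time.Now().Before(e.expires) {
		return "", false
	}
	return e.value, true
}

// set takes the exclusive lock. It runs only after a cache miss.
func (c *instructionCache) set(agentID, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[agentID] = cacheEntry{value: value, expires: time.Now().Add(instructionTTL)}
}
```

The benefit depends on reads dominating writes, which is the access pattern the commit reports.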
Kyle Carberry	b779c9ee33	fix: use SQL-level auth filtering for chat listing (#23159 ) ## Problem The chat listing endpoint (`GetChatsByOwnerID`) was using `fetchWithPostFilter`, which fetches N rows from the database and then filters them in Go memory using RBAC checks. This causes a pagination bug: if the user requests `limit=25` but some rows fail the auth check, fewer than 25 rows are returned even though more authorized rows exist in the database. The client may incorrectly assume it has reached the end of the list. ## Solution Switch to the same pattern used by `GetWorkspaces`, `GetTemplates`, and `GetUsers`: `prepareSQLFilter` + `GetAuthorized*` variant. The RBAC filter is compiled to a SQL WHERE clause and injected into the query before `ORDER BY`/`LIMIT`, so the database returns exactly the requested number of authorized rows. Additionally, `GetChatsByOwnerID` is renamed to `GetChats` with `OwnerID` as an optional (nullable) filter parameter, matching the `GetWorkspaces` naming convention. 
## Changes

| File | Change |
|------|--------|
| `queries/chats.sql` | Renamed to `GetChats`, `owner_id` now optional via CASE/NULL, added `-- @authorize_filter` |
| `queries.sql.go` | Renamed constant, params struct (`GetChatsParams`), and method |
| `querier.go` | Interface method renamed |
| `modelqueries.go` | Added `chatQuerier` interface + `GetAuthorizedChats` impl |
| `dbauthz/dbauthz.go` | `GetChats` now uses `prepareSQLFilter` instead of `fetchWithPostFilter` |
| `dbauthz/dbauthz_test.go` | Updated tests for SQL filter pattern |
| `dbmock/dbmock.go` | Renamed + added mock for `GetAuthorizedChats` |
| `dbmetrics/querymetrics.go` | Renamed + added metrics wrapper |
| `rbac/regosql/configs.go` | Added `ChatConverter` (maps `org_owner` to empty string literal since `chats` has no `organization_id` column) |
| `rbac/authz.go` | Added `ConfigChats()` |
| `chats.go` | Handler uses renamed method with `uuid.NullUUID` |
| `searchquery/search.go` | Updated return type |
| `gitsync/worker.go` | Updated interface and call site |
| Various test files | Updated for renamed types |
	2026-03-17 12:46:24 -04:00
Kyle Carberry	075dfecd12	refactor: consolidate experimental chats API types (#23143 ) ## Summary Consolidates three areas of type duplication in the experimental chats API: ### 1. Merge archive/unarchive into `PATCH /{chat}` - Before: `POST /{chat}/archive` + `POST /{chat}/unarchive` (two endpoints, two handlers with mirrored logic) - After: `PATCH /{chat}` accepting `{ "archived": true/false }` via `UpdateChatRequest` - Removes one endpoint and ~30 lines of duplicated handler code ### 2. Collapse identical request/response prompt types - `ChatSystemPromptResponse` + `UpdateChatSystemPromptRequest` → `ChatSystemPrompt` - `UserChatCustomPromptResponse` + `UpdateUserChatCustomPromptRequest` → `UserChatCustomPrompt` - These pairs were field-for-field identical (single string field) ### 3. Merge duplicate reasoning options types - `ChatModelOpenRouterReasoningOptions` + `ChatModelVercelReasoningOptions` → `ChatModelReasoningOptions` - Same 4 fields, same types — only field ordering and enum value sets differed - Unified type uses the superset of enum values ### Files changed - `codersdk/chats.go` — SDK types and client methods - `coderd/chats.go` — Handler consolidation - `coderd/coderd.go` — Route change - `coderd/chats_test.go` — Test updates - `site/src/api/api.ts` — Frontend API client - `site/src/api/queries/chats.ts` — Query mutations - `site/src/api/queries/chats.test.ts` — Test mocks - `site/src/pages/AgentsPage/AgentsPage.tsx` — Call site - Generated files (`typesGenerated.ts`, `chatModelOptionsGenerated.json`) ### Testing - All Go tests pass (`TestArchiveChat`, `TestUnarchiveChat`, `TestChatSystemPrompt`) - All frontend tests pass (31/31 in `chats.test.ts`)	2026-03-17 14:31:11 +00:00
Ethan	41bd7acf66	perf(chatd): remove redundant chat rereads (#23161 ) ## Summary This PR removes two redundant chat rereads in `chatd`. ### Archive / unarchive - `archiveChat` and `unarchiveChat` already come through `httpmw.ChatParam`, so the handlers already have the `database.Chat` row. - Pass that row into `chatd.ArchiveChat` / `chatd.UnarchiveChat` instead of rereading by ID before publishing the sidebar events. ### End-of-turn cleanup - `processChat` no longer calls `GetChatByID` after the cleanup transaction just to refresh the chat snapshot. - Title generation already persists the generated title and emits its own `title_change` event. - To preserve best-effort title freshness for the cleanup path, the async title-generation goroutine stores the generated title in per-turn shared state and cleanup overlays it if available before publishing the `status_change` event and dispatching push notifications. ## Why - removes one DB read from archive / unarchive requests - removes one DB read from completed turns, which is the larger hot-path win - keeps the existing pubsub/event contract intact instead of broadening this into a larger event-model redesign ## Notes - `title_change` remains the authoritative title update for clients - cleanup does not wait for title generation; it uses the generated title only when it is already available	2026-03-18 00:52:06 +11:00
Ethan	a33605df58	perf(coderd/chatd): reuse workspace context within a turn (#23145 ) ## Summary - reuse workspace agent context within a single `runChat()` turn - remove duplicate latest-build agent lookups between `resolveInstructions()` and `getWorkspaceConn()` - avoid the extra `GetWorkspaceAgentByID` fetch when the selected `WorkspaceAgent` already has the needed metadata - add focused internal tests for reuse and refresh-on-dial-failure ## Why This came out of a 5000-chat / 10-turn scaletest on bravo against a single workspace. The run completed successfully, but coderd stayed DB-pool bound, and one workspace-backed hot path stood out: - `GetWorkspaceAgentsInLatestBuildByWorkspaceID ≈ 46.7k` - `GetWorkspaceByID ≈ 48.0k` - `GetWorkspaceAgentByID ≈ 2.2k` Within one `runChat()` turn, chatd was rediscovering the same workspace agent multiple times just to resolve instructions and open the workspace connection. ## What this changes This PR introduces a turn-local workspace context helper so a single acquired turn can: - resolve the selected workspace agent once - reuse that agent for instruction resolution - reuse the same `AgentConn` for workspace tools and reload/compaction This stays turn-local only, so a later turn on another replica still rebuilds fresh context from the DB. ## Expected impact This is an incremental improvement, not a full fix. It should reduce duplicated workspace-agent lookups and shave some DB pressure from a hot path for workspace-backed chats, while preserving multi-replica correctness. ## Testing - `go test ./coderd/chatd/...` - `golangci-lint run ./coderd/chatd/...`	2026-03-18 00:33:44 +11:00
Michael Suchacz	5d0eb772da	fix(cored): fix flaky TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease (#23147)	2026-03-17 19:08:22 +11:00
Ethan	04fca84872	perf(coderd): reduce duplicated reads in push and webpush paths (#23115 ) ## Background A 5000-chat scaletest (~50k turns, ~2m45s wall time) completed successfully, but the main bottleneck was DB pool starvation from repeated reads, not individually expensive SQL. The push/webpush path showed a few especially noisy reads: - `GetLastChatMessageByRole` for push body generation - `GetEnabledChatProviders` + `GetChatModelConfigByID` for push summary model resolution - `GetWebpushSubscriptionsByUserID` for every webpush dispatch This PR keeps the optimizations that remove those duplicate reads while leaving stream behavior unchanged. ## What changes in this PR ### 1. Reuse resolved chat state for push notifications `maybeSendPushNotification` used to re-read the last assistant message and re-resolve the chat model/provider after `runChat` had already done that work. Now `runChat` returns the final assistant text plus the already-resolved model and provider keys, and the push goroutine uses that state directly. That removes the extra push-path reads for: - `GetLastChatMessageByRole` - the second `resolveChatModel` path - the provider/model lookups that came with that second resolution ### 2. Cache webpush subscriptions during dispatch `Dispatch()` previously hit `GetWebpushSubscriptionsByUserID` on every push. A small per-user in-memory cache now avoids those repeated reads. The follow-up fix keeps that optimization correct: `InvalidateUser()` bumps a per-user generation so an older in-flight fetch cannot repopulate the cache with pre-mutation data after subscribe/unsubscribe. That preserves the cache win without letting local subscription changes be silently overwritten by stale fetch results. ## Why this is safe - The push change only reuses data already produced during the same chat run. It does not change notification semantics; if there is no assistant text to summarize, the existing fallback body still applies. 
- The webpush change keeps the existing TTL and `410 Gone` cleanup behavior. The generation guard only prevents stale in-flight fetches from poisoning the shared cache after invalidation. - The final PR does not change stream setup, pubsub/relay behavior, or chat status snapshot timing. ## Deliberately not included - No stream-path optimization in `Subscribe`. - No inline pubsub message payloads. - No distributed cross-replica webpush cache invalidation.	2026-03-17 13:50:47 +11:00
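The generation guard described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the PR: `subCache`, `newSubCache`, and `Get` are hypothetical names, subscriptions are reduced to plain strings, and the TTL the commit mentions is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// subCache is a hypothetical per-user cache used only for illustration.
type subCache struct {
	mu      sync.Mutex
	gen     map[string]uint64   // per-user generation counter
	entries map[string][]string // cached subscriptions per user
}

func newSubCache() *subCache {
	return &subCache{gen: map[string]uint64{}, entries: map[string][]string{}}
}

// InvalidateUser drops the cached entry and bumps the generation, so a fetch
// that started before this call cannot repopulate the cache afterwards.
func (c *subCache) InvalidateUser(user string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.gen[user]++
	delete(c.entries, user)
}

// Get returns the cached value or calls fetch. The fetched value is stored
// only if the user's generation did not change while fetch was running.
func (c *subCache) Get(user string, fetch func() []string) []string {
	c.mu.Lock()
	if subs, ok := c.entries[user]; ok {
		c.mu.Unlock()
		return subs
	}
	startGen := c.gen[user]
	c.mu.Unlock()

	subs := fetch() // the slow read happens outside the lock

	c.mu.Lock()
	defer c.mu.Unlock()
	if c.gen[user] == startGen {
		c.entries[user] = subs
	}
	return subs
}

func main() {
	c := newSubCache()
	// Simulate an unsubscribe landing while a fetch is in flight.
	stale := c.Get("alice", func() []string {
		c.InvalidateUser("alice")
		return []string{"old-endpoint"}
	})
	fresh := c.Get("alice", func() []string { return []string{"new-endpoint"} })
	fmt.Println(stale, fresh)
}
```

The in-flight caller still receives its own fetch result, but that result is never written to the shared cache, so the next reader fetches again.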
Michael Suchacz	1031da9738	feat: add agent chat spend limiting (backend) (#23071 ) Introduces deployment-scoped spend limiting for Coder Agents, enabling administrators to control LLM costs at global, group, and individual user levels. ## Changes - Database migration (000437): `chat_usage_limit_config` (singleton), `chat_usage_limit_overrides` (per-user), `chat_usage_limit_group_overrides` (per-group) - Single-query limit resolution: individual override > min(group) > global default via `ResolveUserChatSpendLimit` - Fail-open enforcement in chatd with documented TOCTOU trade-off - Experimental API under `/api/experimental/chats/usage-limits` for CRUD on limits - `AsChatd` RBAC subject for narrowly-scoped daemon access (replaces `AsSystemRestricted`) - Generated TypeScript types for the frontend SDK ## Hierarchy 1. Individual user override (highest) 2. Minimum of group limits 3. Global default 4. Disabled / unlimited Currency stored as micro-dollars (`1,000,000` = $1.00). Frontend PR: #23072	2026-03-17 01:24:03 +01:00
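The limit hierarchy in the commit above (individual override, then minimum of group limits, then global default, else unlimited) might look like this in plain Go. The commit resolves it in a single SQL query via `ResolveUserChatSpendLimit`; the function `resolveLimit` and its signature below are assumptions made for illustration only.

```go
package main

import "fmt"

// resolveLimit is a hypothetical in-memory version of the precedence rules.
// Amounts are micro-dollars (1,000,000 = $1.00). The boolean is false when
// no limit applies, i.e. spending is unlimited.
func resolveLimit(userOverride *int64, groupLimits []int64, globalDefault *int64) (int64, bool) {
	// 1. An individual user override always wins.
	if userOverride != nil {
		return *userOverride, true
	}
	// 2. Otherwise take the strictest (smallest) group limit.
	if len(groupLimits) > 0 {
		lowest := groupLimits[0]
		for _, g := range groupLimits[1:] {
			if g < lowest {
				lowest = g
			}
		}
		return lowest, true
	}
	// 3. Otherwise fall back to the global default, if one is configured.
	if globalDefault != nil {
		return *globalDefault, true
	}
	// 4. Disabled / unlimited.
	return 0, false
}

func main() {
	global := int64(50_000_000) // $50.00
	limit, ok := resolveLimit(nil, []int64{20_000_000, 5_000_000}, &global)
	fmt.Println(limit, ok) // the $5.00 group limit applies
}
```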
Kyle Carberry	6972d073a2	fix: improve background process handling for agent tools (#23132 ) ## Problem Models frequently use shell `&` instead of `run_in_background=true` when starting long-running processes through `/agents`, causing them to die shortly after starting. This happens because: 1. No guidance in tool schema — The `ExecuteArgs` struct had zero `description` tags. The model saw `run_in_background: boolean (optional)` with no explanation of when/why to use it. 2. Shell `&` is silently broken — `sh -c "command &"` forks the process, the shell exits immediately, and the forked child becomes an orphan not tracked by the process manager. 3. No process group isolation — The SSH subsystem sets `Setsid: true` on spawned processes, but the agent process manager set no `SysProcAttr` at all. Signals only hit the top-level `sh`, not child processes. ## Investigation Compared our implementation against openai/codex and coder/mux: \| Aspect \| codex \| mux \| coder/coder (before) \| \|--------\|-------\|-----\|---------------------\| \| Background flag \| Yield/resume with `session_id` \| `run_in_background` with rich description \| `run_in_background` with no description \| \| `&` handling \| `setsid()` + `killpg()` \| `detached: true` + `killProcessTree()` \| Nothing — orphaned children escape \| \| Process isolation \| `setsid()` on every spawn \| `set -m; nohup ... setsid` for background \| No `SysProcAttr` at all \| \| Signal delivery \| `killpg(pgid, sig)` — entire group \| `kill -15 -\$pid` — negative PID \| `proc.cmd.Process.Signal()` — PID only \| ## Changes ### Fix 1: Add descriptions to `ExecuteArgs` (highest impact) The model now sees explicit guidance: "Use for long-running processes like dev servers, file watchers, or builds. Do NOT use shell & — it will not work correctly." ### Fix 2: Update tool description The top-level execute tool description now reinforces: "Use run_in_background=true for long-running processes. 
Never use shell '&' for backgrounding." ### Fix 3: Detect trailing `&` and auto-promote to background Defense-in-depth: if the model still uses `command &`, we strip the `&` and promote to `run_in_background=true` automatically. Correctly distinguishes `&` from `&&`. ### Fix 4: Process group isolation (`Setpgid`) New platform-specific files (`proc_other.go` / `proc_windows.go`) following the same pattern as `agentssh/exec_other.go`. Every spawned process gets its own process group. ### Fix 5: Process group signaling `signal()` now uses `syscall.Kill(-pid, sig)` on Unix to signal the entire process group, ensuring child processes from shell pipelines are also cleaned up. ## Testing All existing `agent/agentproc` tests pass. Both packages compile cleanly.	2026-03-16 16:22:10 -04:00
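Fix 3 above (detect a trailing `&` and auto-promote, while leaving `&&` alone) can be approximated with a small string check. This is a sketch under stated assumptions: `stripTrailingAmpersand` is a hypothetical helper, and it does not handle quoting or an escaped `\&`, which the real implementation may treat differently.

```go
package main

import (
	"fmt"
	"strings"
)

// stripTrailingAmpersand reports whether cmd ends in a single backgrounding
// "&" and, if so, returns the command without it. A trailing "&&" is left
// untouched because it is the logical AND operator, not backgrounding.
func stripTrailingAmpersand(cmd string) (string, bool) {
	const space = " \t\r\n"
	trimmed := strings.TrimRight(cmd, space)
	if !strings.HasSuffix(trimmed, "&") || strings.HasSuffix(trimmed, "&&") {
		return cmd, false
	}
	return strings.TrimRight(strings.TrimSuffix(trimmed, "&"), space), true
}

func main() {
	for _, cmd := range []string{"npm run dev &", "make && make test", "ls"} {
		stripped, background := stripTrailingAmpersand(cmd)
		fmt.Printf("%q -> %q background=%v\n", cmd, stripped, background)
	}
}
```

A caller would then run the stripped command as if `run_in_background=true` had been set.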
Kyle Carberry	741af057dc	feat: paginate chat messages endpoint with cursor-based infinite scroll (#23083 ) Adds cursor-based pagination to the chat messages endpoint. ## Backend - New `GetChatMessagesByChatIDPaginated` SQL query: returns messages in `id DESC` order with a `before_id` keyset cursor and configurable `limit` - Handler parses `?before_id=N&limit=N` query params, uses the `LIMIT N+1` trick to set `has_more` without a separate COUNT query - Queued messages only returned on the first page (no cursor) since they're always the most recent - SDK client updated with `ChatMessagesPaginationOptions` - Fully backward compatible: omitting params returns the 50 newest messages ## Frontend - Switches `getChatMessages` from `useQuery` to `useInfiniteQuery` with cursor chaining via `getNextPageParam` - Pages flattened and sorted by `id` ascending for chronological display - `MessagesPaginationSentinel` component uses `IntersectionObserver` (200px rootMargin prefetch) inside the existing `flex-col-reverse` scroll container - `flex-col-reverse` handles scroll anchoring natively when older messages are prepended — no manual `scrollTop` adjustment needed (same pattern as coder/blink) ## Why cursor-based instead of offset/limit Offset-based pagination breaks when new messages arrive while paginating backward (offsets shift, causing duplicates or missed messages). The `before_id` cursor is stable regardless of inserts — each page is deterministic.	2026-03-16 16:40:59 +00:00
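The `before_id` cursor and the `LIMIT N+1` trick from the commit above can be shown against an in-memory slice. The function `pageBefore` is hypothetical and stands in for the SQL query; it assumes IDs are already sorted in descending order and uses `0` to mean "no cursor".

```go
package main

import "fmt"

// pageBefore returns up to limit IDs strictly below beforeID, newest first,
// plus a hasMore flag. It reads at most limit+1 rows: the extra row is only
// used to detect whether another page exists, so no COUNT query is needed.
func pageBefore(idsDesc []int64, beforeID int64, limit int) ([]int64, bool) {
	var rows []int64
	for _, id := range idsDesc {
		if beforeID != 0 && id >= beforeID {
			continue // keyset filter: WHERE id < before_id
		}
		rows = append(rows, id)
		if len(rows) == limit+1 {
			break // emulate LIMIT N+1
		}
	}
	hasMore := len(rows) > limit
	if hasMore {
		rows = rows[:limit] // drop the probe row before returning
	}
	return rows, hasMore
}

func main() {
	ids := []int64{5, 4, 3, 2, 1}
	cursor := int64(0)
	for {
		page, hasMore := pageBefore(ids, cursor, 2)
		fmt.Println(page, hasMore)
		if !hasMore {
			break
		}
		cursor = page[len(page)-1] // oldest ID on this page becomes the cursor
	}
}
```

Because each page is defined by an ID boundary rather than an offset, rows inserted with larger IDs while paginating backward do not shift later pages.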
Kyle Carberry	6f97539122	fix: update sidebar diff status on WebSocket events (#23116 ) ## Problem The sidebar diff status (PR icon, +additions/-deletions, file count) was not updating in real-time. Users had to reload the page to see changes. Two root causes: 1. Frontend: The `diff_status_change` WebSocket handler in `AgentsPage.tsx` had an early `return` (line 398) that skipped `updateInfiniteChatsCache`, so the sidebar's cache was never updated. Even for other event types, the cache merge only spread `status` and `title` — never `diff_status`. 2. Server: `publishChatPubsubEvent` in `chatd.go` constructed a minimal `Chat` payload without `DiffStatus`, so even if the frontend consumed the event, `updatedChat.diff_status` would be `undefined`. ## Fix ### Server (`coderd/chatd/chatd.go`) - `publishChatPubsubEvent` now accepts an optional `*codersdk.ChatDiffStatus` parameter; when non-nil it's set on the outgoing `Chat` payload. - `PublishDiffStatusChange` fetches the diff status from the DB, converts it, and passes it through. - Added `convertDBChatDiffStatus` (mirrors `coderd/chats.go`'s converter to avoid circular import). - All other callers pass `nil`. ### Frontend (`site/src/pages/AgentsPage/AgentsPage.tsx`) - Removed the early `return` so `diff_status_change` events fall through to the cache update logic. - Added `isDiffStatusEvent` flag and spread `diff_status` into both the infinite chats cache (sidebar) and the individual chat cache.	2026-03-16 15:41:32 +00:00
Mathias Fredriksson	72689c2552	fix(coderd): improve error handling in chattest, chattool, and chats (#23047 ) - Use t.Errorf in chattest non-streaming helpers so encoding failures fail the test - Thread testing.TB into writeResponsesAPIStreaming and log SSE write errors instead of silently dropping them - Bump createworkspace DB error log from Warn to Error - Use errors.Join for timeout + output error in execute.go	2026-03-13 21:41:24 +02:00
Hugo Dutka	84527390c6	feat: chat desktop backend (#23005 ) Implement the backend for the desktop feature for agents. - Adds a new `/api/experimental/chats/$id/desktop` endpoint to coderd that exposes a VNC stream from a [portabledesktop](https://github.com/coder/portabledesktop) process running inside the workspace - Adds a new `spawn_computer_use_agent` tool to chatd, which spawns a subagent with access to the `computer` tool, letting it interact with the `portabledesktop` process running inside the workspace - Adds the plumbing to make the above possible. There's a follow-up frontend PR here: https://github.com/coder/coder/pull/23006	2026-03-13 19:49:34 +01:00
Mathias Fredriksson	9d33c340ec	fix(coderd): handle ignored errors across coderd packages (#22851 ) Handle previously ignored error return values in coderd: - coderd/chats.go: check sendEvent errors, log on failure - coderd/chatd/chattest: thread testing.TB through server structs, replace log.Printf with t.Logf, check writeSSEEvent errors - coderd/chatd/chattool/createworkspace.go: log UpdateChatWorkspace failure instead of discarding both return values - coderd/chatd/chattool/execute.go: surface ProcessOutput error in the timeout message returned to the caller - coderd/provisionerdserver: log stream.Send failure in the DownloadFile error helper	2026-03-13 19:53:20 +02:00
Michael Suchacz	c3b6284955	feat: add chat cost analytics backend (#23036 ) Add cost tracking for LLM chat interactions with microdollar precision. ## Changes - Add `chatcost` package for per-message cost calculation using `shopspring/decimal` for intermediate arithmetic - Ceil rounding policy: fractional micros round UP to next whole micro (applied once after summing all components) - Database migration: `total_cost_micros` BIGINT column with historical backfill and `created_at` index - API endpoints: per-user cost summary and admin rollup under `/api/experimental/chats/cost/` - SDK types: `ChatCostSummary`, `ChatCostModelBreakdown`, `ChatCostUserRollup` - Fix `modeloptionsgen` to handle `decimal.Decimal` as opaque numeric type - Update frontend pricing test fixtures for string decimal types ## Design decisions - `NULL` = unpriced (no matching model config), `0` = free - Reasoning tokens included in output tokens (no double-counting) - Integer microdollars (BIGINT) for storage and API responses - Price config uses `decimal.Decimal` for exact parsing; totals use `int64` Frontend: #23037	2026-03-13 18:30:49 +01:00
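The rounding policy in the commit above (sum all components exactly, then round up once to whole micro-dollars) is illustrated below. The commit uses `shopspring/decimal`; this sketch swaps in the standard library's `math/big.Rat` to stay dependency-free. The function `costMicros` and the assumption that prices are quoted in USD per one million tokens are both hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
)

// costMicros sums tokens*price over all components using exact rational
// arithmetic and applies ceil rounding once at the end. With prices given
// as USD per one million tokens (an assumption), tokens*price is already
// expressed in micro-dollars.
func costMicros(tokens []int64, pricesPerMillion []string) (int64, error) {
	if len(tokens) != len(pricesPerMillion) {
		return 0, fmt.Errorf("got %d token counts but %d prices", len(tokens), len(pricesPerMillion))
	}
	total := new(big.Rat)
	for i, n := range tokens {
		price, ok := new(big.Rat).SetString(pricesPerMillion[i])
		if !ok {
			return 0, fmt.Errorf("invalid price %q", pricesPerMillion[i])
		}
		total.Add(total, price.Mul(price, new(big.Rat).SetInt64(n)))
	}
	// Ceil for non-negative totals: integer quotient, plus one if any
	// fractional remainder is left.
	q, r := new(big.Int).QuoRem(total.Num(), total.Denom(), new(big.Int))
	if r.Sign() > 0 {
		q.Add(q, big.NewInt(1))
	}
	return q.Int64(), nil
}

func main() {
	// 7 tokens at $0.15/M is 1.05 micros; 3 tokens at $0.25/M is 0.75 micros.
	// The exact sum is 1.8 micros, which rounds up once to 2. Rounding each
	// component separately would give 2 + 1 = 3 instead.
	micros, err := costMicros([]int64{7, 3}, []string{"0.15", "0.25"})
	fmt.Println(micros, err)
}
```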
Mathias Fredriksson	4a79af1a0d	refactor: add chat_message_role enum and content_version column (#23042 ) Migration 000434 converts chat_messages.role from text to a Postgres enum, rebuilds the partial index, and adds content_version smallint. The column is backfilled with DEFAULT 0, then the default is dropped so future inserts must set it explicitly. Version 0 uses the role-aware heuristic from #22958. Version 1 (all new inserts) stores []ChatMessagePart JSON for all roles, including system messages. ParseContent takes database.ChatMessage directly and dispatches on version internally. Unknown versions error. All string(codersdk.ChatMessageRole) casts at DB write sites are replaced with database.ChatMessageRole constants from sqlc. Refs #22958	2026-03-13 16:47:36 +00:00
Mathias Fredriksson	bdbcd3428b	feat(coderd/chatd): unify chat storage on SDK parts and fix file-reference rendering (#22958 ) File-reference parts in user messages were flattened to `TextContent` at write time because fantasy has no file-reference content type. The frontend never saw them as structured parts. This moves all write paths (user, assistant, tool) from fantasy envelope format to `codersdk.ChatMessagePart`. The streaming layer (`chatloop`) is untouched, the conversion happens at the serialization boundary in `persistStep`. Old rows are still readable. `ParseContent` uses a structural heuristic (`isFantasyEnvelopeFormat`) to distinguish legacy envelopes from SDK parts. We chose this over try/fallback because fantasy envelopes partially unmarshal into `ChatMessagePart` (the `type` field matches) while silently losing content. A guard test enforces that no SDK part can produce the envelope shape. This is forward-only: new rows are unreadable by old code. Chat is behind a feature flag so rollback risk is contained. Also adds a typed `ChatMessageRole` to replace raw strings and `fantasy.MessageRole` casts at the persistence boundary. The type covers `ChatMessage.Role`, `ChatStreamMessagePart.Role`, the `PublishMessagePart` callback chain, and all DB write sites. `fantasy.MessageRole` remains only where we build `fantasy.Message` structs for LLM dispatch. Separately, `ProviderMetadata` was leaking to SSE clients via `publishMessagePart`. `StripInternal` now runs on both the SSE and REST paths, covering this. Other cleanup: - Old `db2sdk.contentBlockToPart` silently dropped metadata on text/reasoning/tool-call content. New code preserves it. - `providerMetadataToOptions` now logs warnings instead of silently returning nil. - `db2sdk` shrinks from ~250 lines of parallel conversion to ~15 lines delegating to `chatprompt.ParseContent()`, removing the `fantasy` import entirely. Refs #22821	2026-03-13 17:53:26 +02:00
Kyle Carberry	690e3a87d8	feat: move chat messages to dedicated /chats/{id}/messages endpoint (#23021 ) ## Summary Moves the messages response out of `GET /chats/{id}` and into a dedicated `GET /chats/{id}/messages` endpoint. ### Backend - `GET /chats/{id}` now returns just the `Chat` object (no messages) - `GET /chats/{id}/messages` is a new endpoint returning `ChatMessagesResponse` with `messages` and `queued_messages` - Added `ChatMessagesResponse` SDK type and `GetChatMessages` client method ### Frontend - `getChat()` API method returns `Chat` instead of `ChatWithMessages` - Added `getChatMessages()` API method for the new endpoint - Split `chatQuery` into two: `chatQuery` (metadata) and `chatMessagesQuery` (messages) - Updated all cache mutations, optimistic updates, and websocket handlers - Updated tests and stories ### Files changed \| File \| Change \| \|---\|---\| \| `coderd/coderd.go` \| Register `GET /messages` route \| \| `coderd/chats.go` \| Simplify `getChat`, add `getChatMessages` handler \| \| `codersdk/chats.go` \| New type + method, update `GetChat` return \| \| `site/src/api/api.ts` \| New method, update `getChat` \| \| `site/src/api/queries/chats.ts` \| New query, update cache mutations \| \| `site/src/pages/AgentsPage/AgentDetail.tsx` \| Use separate queries \| \| `site/src/pages/AgentsPage/AgentDetail/ChatContext.ts` \| Update types and cache writes \| \| `site/src/pages/AgentsPage/AgentsPage.tsx` \| Update websocket cache handler \|	2026-03-13 08:35:46 -04:00
Atif Ali	7777072d7a	feat(chatd): set User-Agent on all outgoing LLM requests (#22965 )	2026-03-13 15:12:04 +05:00
Kyle Carberry	0e1846fe2a	fix(agent): reap exited processes and scope process list by chat ID (#22944 )	2026-03-12 14:51:05 -07:00
Kyle Carberry	42c12176a0	fix(chatd): persist interrupted tool call steps instead of losing them (#23011 ) ## Problem When a chat is interrupted while tools are executing, the step content (text, reasoning, tool calls, and partial tool results) was being lost. Two gaps existed: 1. During tool execution: `executeTools` returns with error results for interrupted tools, but the subsequent `PersistStep(ctx, ...)` fails on the canceled context and returns `ErrInterrupted` without persisting anything. 2. PersistStep race: If the context is canceled between the post-tool interrupt check and the `PersistStep` call, the same loss occurs. This is inconsistent with how we handle stream interruptions (which properly flush and persist partial content via `persistInterruptedStep`) and how [coder/blink](https://github.com/coder/blink) handles interruptions (always inserting the response message regardless of execution phase). ## Fix Two changes in `chatloop.go`: - Post-tool-execution interrupt check: After `executeTools` returns, check if the context was interrupted and route through `persistInterruptedStep` (which uses `context.WithoutCancel` internally) to save the accumulated content. - PersistStep fallback: If `PersistStep` returns `ErrInterrupted`, retry via `persistInterruptedStep` so partial content is not lost. ## Tests - `TestRun_InterruptedDuringToolExecutionPersistsStep`: Verifies that when a tool is blocked and the chat is interrupted, the step (text + reasoning + tool call + tool error result) is persisted via the interrupt-safe path. - `TestRun_PersistStepInterruptedFallback`: Verifies that when `PersistStep` itself returns `ErrInterrupted`, the step is retried via the fallback path and content is saved.	2026-03-12 16:59:16 -04:00
Kyle Carberry	072e9a212f	fix(chatloop): keep provider-executed tool results in assistant message (#23012 ) ## Problem When a step contains both provider-executed tool calls (e.g. Anthropic web search) and local tool calls in parallel, the next loop iteration fails with the Anthropic API claiming the regular tool call has no result. However, sending a new user message (which reloads messages from the DB) works fine. ## Root cause `toResponseMessages` was placing all tool results into the tool-role message, regardless of `ProviderExecuted`. When Fantasy's Anthropic provider later converted these messages for the API, it moved the provider tool result from the tool message to the end of the previous assistant message (`prevMsg.Content = append(...)`). This placed `web_search_tool_result` after the regular `tool_use` block: ``` assistant: [server_tool_use(A), tool_use(B), web_search_tool_result(A)] ← wrong order user: [tool_result(B)] ``` The persistence layer in `chatd.go` already handles this correctly — provider-executed tool results stay in the assistant message, producing the expected ordering: ``` assistant: [server_tool_use(A), web_search_tool_result(A), tool_use(B)] ← correct order user: [tool_result(B)] ``` This is why reloading from the DB fixed it. ## Fix In the `ContentTypeToolResult` case of `toResponseMessages`, route provider-executed results to `assistantParts` instead of `toolParts`, matching the persistence layer's behavior. ## Testing Added `TestToResponseMessages_ProviderExecutedToolResultInAssistantMessage` which verifies that mixed provider+local tool results are split correctly between the assistant and tool messages.	2026-03-12 20:22:09 +00:00
Kyle Carberry	fc9e04da67	fix(chatd): handle soft-deleted workspaces in chattool start/create (#22997 ) ## Problem Both `start_workspace` and `create_workspace` chattool tools failed to handle soft-deleted workspaces correctly. Coder uses soft-delete for workspaces (`deleted = true` on the row). Both tools called `GetWorkspaceByID`, which queries `workspaces_expanded` with no `deleted = false` filter — so it returns the workspace row even when soft-deleted. The only deletion check was for `sql.ErrNoRows`, which never fires because the row still exists. ### `start_workspace` behavior (before fix) 1. Loads the soft-deleted workspace successfully 2. Finds the latest build (a delete transition) 3. Falls through to attempt to start the deleted workspace 4. Produces a confusing downstream error ### `create_workspace` behavior (before fix) 1. `checkExistingWorkspace` loads the soft-deleted workspace 2. If a delete build is in-progress: waits for it, then falsely reports `already_exists` — blocks new workspace creation 3. If the delete build succeeded: accidentally allows creation (because no agents are found), but via fragile logic rather than an explicit check ## Fix Add `ws.Deleted` checks immediately after `GetWorkspaceByID` succeeds in both tools: - `startworkspace.go`: Returns `"workspace was deleted; use create_workspace to make a new one"` - `createworkspace.go` (`checkExistingWorkspace`): Returns `(nil, false, nil)` to allow new workspace creation ## Tests - `TestStartWorkspace/DeletedWorkspace` — verifies `start_workspace` returns deleted error and never calls `StartFn` - `TestCheckExistingWorkspace_DeletedWorkspace` — verifies `checkExistingWorkspace` allows creation for soft-deleted workspaces	2026-03-12 16:09:17 +00:00
Kyle Carberry	a6697b1b29	fix(chatd): fix PE tool result persistence via fantasy bump (#22996 ) Fixes Anthropic 400 error on multi-turn conversations with web search: > web_search tool use with id srvtoolu_... was found without a corresponding web_search_tool_result block Provider-executed tool results (e.g. `web_search`) had a nil `Result` field, which serialized as `"result":null`. Fantasy's `UnmarshalToolResultOutputContent` couldn't deserialize `null` back, so the entire assistant message became unreadable after persistence. On the next LLM call, Anthropic rejected the conversation because `server_tool_use` had no matching `web_search_tool_result`. Fix: Bump the fantasy fork to e4bbc7bb3054 which returns `nil, nil` for null `Result` JSON instead of erroring. Testing: Added `integration_test.go` with `TestAnthropicWebSearchRoundTrip` (requires `ANTHROPIC_API_KEY`) that: - Sends a query triggering web search - Verifies the persisted assistant message contains all parts the UI needs: `tool-call(PE)`, `source`, `tool-result(PE)`, and `text` - Sends a follow-up to confirm the round-trip works with Anthropic	2026-03-12 16:04:30 +00:00
Kyle Carberry	c3923f2ccd	fix(chatd): keep provider-executed tool results in assistant content (#22991 ) ## Problem Anthropic's API returns a 400 error when `web_search` tool results are missing: ``` web_search tool use with id srvtoolu_... was found without a corresponding web_search_tool_result block ``` Root cause: `persistStep` in `chatd.go` splits ALL `ToolResultContent` blocks into separate tool-role DB rows. Provider-executed (PE) tool results like `web_search` must stay in the assistant message — Anthropic expects `server_tool_use` and `web_search_tool_result` in the same turn. The previous fix (#22976) added repair passes to drop PE results during reconstruction, which fixed cross-step orphans but broke the normal case (PE result correctly in the same step). ## Fix Three changes that address the root cause: 1. `persistStep` (chatd.go): Check `ProviderExecuted` before splitting `ToolResultContent` into tool rows. PE results stay in `assistantBlocks` and are stored in the assistant content column. 2. `ToMessageParts` (chatprompt.go): Propagate the `ProviderExecuted` field to `ToolResultPart` so the fantasy Anthropic provider can identify PE results and reconstruct the `web_search_tool_result` block. 3. Keep existing repair passes for backward compatibility with legacy DB data where PE results were incorrectly persisted as separate tool messages. ## Tests - `TestProviderExecutedResultInAssistantContent` — PE result stored inline in assistant content round-trips correctly with `ProviderExecuted` preserved. - `TestProviderExecutedResult_LegacyToolRow` — legacy PE results in tool-role rows are still dropped correctly. - All existing tests pass (including the 3 PE tests from #22976).	2026-03-12 09:49:53 -04:00
Kyle Carberry	53bfbf7c03	fix(chatd): improve compaction prompt to preserve forward momentum (#22989 ) ## Problem The summarization prompt explicitly tells the model to "Omit pleasantries and next-step suggestions" and the summary prefix frames the compacted context as passive history: `Summary of earlier chat context:`. After compaction mid-task, the model reads a factual recap with no forward momentum, loses its direction, and either stops or asks the user what to do. ## Research I compared our compaction prompt against several other agents: \| Agent \| Key Pattern \| \|---\|---\| \| Codex \| Prompt says "Include what remains to be done (clear next steps)". Prefix: "Another language model started to solve this problem..." \| \| Mux \| Includes "Current state of the work (what's done, what's in progress)" + appends the user's follow-up intent \| \| Continue \| "Make sure it is clear what the current stream of work was at the very end prior to compaction so that you can continue exactly where you left off" \| \| Copilot Chat \| Dedicated sections for Active Work State, Recent Operations, Pre-Summary State, and a Continuation Plan with explicit next actions \| Every other major agent explicitly preserves forward intent and in-progress state. Coder was the only one telling the model to omit next steps. ## Changes Summary prompt: - Removes `Omit next-step suggestions` - Adds structured `Include:` list with explicit items for in-progress work, remaining work, and the specific action being performed when compaction fired - Frames the operation as `context compaction` (matching Codex's framing) Summary prefix: - Old: `Summary of earlier chat context:` - New: `The following is a summary of the earlier conversation. The assistant was actively working when the context was compacted. 
Continue the work described below:` The prefix is the first thing the model reads post-compaction — framing it as an active handoff with an explicit "Continue" directive primes the model to resume work rather than wait.	2026-03-12 13:03:06 +00:00
Michael Suchacz	fba00a6b3a	feat(agents): add chat model pricing metadata (#22959 ) ## Summary - add chat model pricing metadata to the agents admin form and SDK metadata - split pricing into its own section and show default pricing as placeholders - apply default pricing when admins leave pricing fields blank	2026-03-12 07:37:33 +01:00
Kyle Carberry	3325b86903	fix(chatd): skip provider-executed tools in message repair (#22976 )	2026-03-12 02:54:14 +00:00
Kyle Carberry	58f295059c	fix: grant chatd ActionReadPersonal on User and parallelize runChat DB calls (#22970 ) ## Problem 1. Personal behavior prompt not applied: The chatd background worker was missing `ActionReadPersonal` on `ResourceUser` in its RBAC subject. When `resolveUserPrompt` calls `GetUserChatCustomPrompt`, the dbauthz layer checks `ActionReadPersonal` on the user — which the chatd role didn't have. The error was silently swallowed (returns `""`), so the user's custom prompt was never injected into the system messages. 2. Sequential DB calls on chat startup: Several independent database queries in `runChat` and `resolveChatModel` were running sequentially, adding unnecessary latency before the LLM stream begins. ## Changes ### RBAC fix (`dbauthz.go`) - Add `rbac.ResourceUser.Type: {policy.ActionReadPersonal}` to `subjectChatd` site permissions - This is the minimal permission needed — `ActionRead` on User remains denied ### Parallelization (`chatd.go`) Three parallelization points using `errgroup.Group`: 1. `resolveChatModel`: `resolveModelConfig` and `GetEnabledChatProviders` run concurrently (both needed for `ModelFromConfig`, which stays sequential after the wait) 2. `runChat` startup: `resolveChatModel` and `GetChatMessagesForPromptByChatID` run concurrently (completely independent) 3. `runChat` prompt assembly: `resolveInstructions` and `resolveUserPrompt` run concurrently (both produce strings; `InsertSystem` calls maintain correct order after the wait) Same pattern applied to the `ReloadMessages` callback. ### Test (`dbauthz_test.go`) - Add assertion in `TestAsChatd/AllowedActions` that `ActionReadPersonal` on `ResourceUser` is permitted	2026-03-11 22:07:46 +00:00
Kyle Carberry	57dc23f603	feat(chatd): add provider-native web search tools to chats (#22909 ) ## What Adds provider-native web search tools to the chat system. Anthropic, OpenAI, and Google all offer server-side web search — this wires them up as opt-in per-model config options using the existing `ChatModelProviderOptions` JSONB column (no migration). Web search is off by default. ## Config Set `web_search_enabled: true` in the model config provider options: ```json { "provider_options": { "anthropic": { "web_search_enabled": true, "allowed_domains": ["docs.coder.com", "github.com"] } } } ``` Available options per provider: - Anthropic: `web_search_enabled`, `allowed_domains`, `blocked_domains` - OpenAI: `web_search_enabled`, `search_context_size` (`low`/`medium`/`high`), `allowed_domains` - Google: `web_search_enabled` ## Backend - `codersdk/chats.go` — new fields on the per-provider option structs - `coderd/chatd/chatd.go` — `buildProviderTools()` reads config, creates `ProviderDefinedTool` entries (uses `anthropic.WebSearchTool()` helper from fantasy) - `coderd/chatd/chatloop/chatloop.go` — `ProviderTools` on `RunOptions`, merged into `Call.Tools`. Provider-executed tool calls skip local execution. `StreamPartTypeToolResult` with `ProviderExecuted: true` is accumulated inline (matching fantasy's own agent.go pattern) instead of post-stream synthesis. 
- `coderd/chatd/chatprompt/` — `MarshalToolResult` carries `ProviderMetadata` through DB persistence so multi-turn round-trips work (Anthropic needs `encrypted_content` back) ## Frontend - Source citations render inline at the tool-call position (not bottom-of-message), using `ToolCollapsible` so they look like other tool cards — collapsed "Searched N results" with globe icon, expand to see source pills - Provider-executed tool calls/results are hidden from the normal tool card UI - Tool-role messages with only provider-executed results return `null` (no empty bubble) - Both persisted (messageParsing.ts) and streaming (streamState.ts) paths group consecutive `source` parts into a single `{ type: "sources" }` render block ## Fantasy changes The fantasy fork (`kylecarbs/fantasy` branch `cj/go1.25`) has the Anthropic tool code merged in, but will hopefully go upstream from: https://github.com/charmbracelet/fantasy/pull/163	2026-03-11 21:33:15 +00:00
Kyle Carberry	1f37df4db3	perf(chatd): fix six scale bottlenecks identified by benchmarking (#22957 ) ## Summary Scale-tested the `chatd` package with mock-based benchmarks to identify performance bottlenecks. This PR fixes 6 of the 8 identified issues, ranked by severity. ## Changes ### 1. Parallel tool execution (HIGH) — `chatloop.go` `executeTools` ran tool calls sequentially. Now dispatches all calls concurrently via goroutines with `sync.WaitGroup`. Results are pre-allocated by index (no mutex needed). `onResult` callbacks fire as each tool completes. ### 2. Pubsub-backed subagent await (HIGH) — `subagent.go` `awaitSubagentCompletion` polled the DB every 200ms. Now subscribes to the child chat's `ChatStreamNotifyChannel` via pubsub for near-instant notifications. Fallback poll reduced to 5s. Falls back to 200ms only when `pubsub == nil` (single-instance / in-memory). ### 3. Per-chat stream locking (MEDIUM) — `chatd.go` Replaced single global `streamMu` + `map[uuid.UUID]*chatStreamState` with `sync.Map` where each `chatStreamState` has its own `sync.Mutex`. Zero cross-chat contention. ### 4. Batch chat acquisition (MEDIUM) — `chatd.go` `processOnce` acquired 1 chat per tick. Now loops up to `maxChatsPerAcquire = 10` per tick, avoiding idle time when many chats are pending. ### 5. Reduced heartbeat frequency (LOW-MEDIUM) — `chatd.go` `chatHeartbeatInterval` changed from 30s to 60s. Safe given the 5-minute `DefaultInFlightChatStaleAfter`. ### 6. O(depth) descendant check (LOW) — `subagent.go` Replaced top-down BFS (`O(total_descendants)` queries) with bottom-up parent-chain walk (`O(depth)` queries). Includes cycle protection. ## Not addressed (intentionally) - Message serialization overhead - Buffer eviction (`buffer[1:]` pattern)	2026-03-11 14:00:08 -04:00
Kyle Carberry	bb59477648	feat(db): add created_by column to chat_messages table (#22940 ) Adds a `created_by` column (nullable UUID) to the `chat_messages` table to track which user created each message. Only user-sent messages populate this field; assistant, tool, system, and summary messages leave it null. The column is threaded through the full stack: SQL migration, query updates, generated Go/TypeScript types, db2sdk conversion, chatd (including subagent paths), and API handlers. All API handlers that insert user messages now pass the authenticated user's ID as `created_by`. No foreign key constraint was added, matching the existing pattern used by `chat_model_configs.created_by`.	2026-03-11 10:00:38 -04:00
Kyle Carberry	0a026fde39	refactor: remove reasoning title extraction from chat pipeline (#22926 ) Removes the backend and frontend logic that extracted compact titles from reasoning/thinking blocks. The `Title` field on `ChatMessagePart` remains for other part types (e.g. source), but reasoning blocks no longer have titles derived from first-line markdown bold text or provider metadata summaries. Backend: - Remove `ReasoningTitleFromFirstLine`, `reasoningTitleFromContent`, `reasoningSummaryTitle`, `compactReasoningSummaryTitle`, and `reasoningSummaryHeadline` from chatprompt - Simplify `marshalContentBlock` to plain `json.Marshal` (no title injection) - Remove title tracking maps and `setReasoningTitleFromText` from chatloop stream processing - Remove `reasoningStoredTitle` from db2sdk - Remove related tests from db2sdk_test Frontend: - Remove `mergeThinkingTitles` from blockUtils - Simplify `appendTextBlock` to always merge consecutive thinking blocks - Remove `applyStreamThinkingTitle` from streamState - Simplify reasoning/thinking stream handler to ignore title-only parts - Update tests accordingly Net: -487 lines / +42 lines	2026-03-11 11:01:26 +00:00

107 Commits