coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 13:08:25 +00:00

Author	SHA1	Message	Date
Cian Johnston	6194bd6f57	fix: address post-merge review findings for chat org scoping (#24297 ) Addresses review findings from #23827 that were added post-merge: - Persisted attachments now store `organizationId`; mismatched orgs pruned on restore - Workspace selection reconciliation: stale IDs from previous orgs dropped via derived `effectiveWorkspaceId` - Org picker uses `permittedOrganizations()` for RBAC-aware filtering - Org picker hidden when user belongs to only one org - Ref-sync `useEffect` replaced with `useEffectEvent` - `CreateWorkspace()` and `ListTemplates()` take `organizationID` and `db` as required function parameters instead of optional struct fields — compiler enforces them, removes scattered nil guards - Cross-org template check in `CreateWorkspace` is now unconditional - `ListTemplates` org-scoping filter now has test coverage - `setupChatInfra` comment fixed; test helpers use params structs instead of positional UUIDs - Enterprise test documents that org admin only sees own chats (handler hardcodes `OwnerID` — future work needs sidebar UI before lifting that restriction) > 🤖	2026-04-15 11:39:05 +01:00
Callum Styan	730edba87a	fix: fix false positive disconnected agent metric reporting (#24225 ) We noticed during higher active workspace counts that the agent connection metric, generated via a query to the database, would report a relatively high number of agents as disconnected. Somewhere between 5 and 20%. However, other metrics such as the number of websocket connections would suggest that all agent connections are healthy. Looking at the `Agents` function in prometheus metrics, plus the query execution time (not accounting for actual database round-trip time) revealed that this reporting of agents as disconnected was almost certainly false positives due to clock drift in the way we're generating the metric values. At 10k metrics, with a p50 of 2ms and p99 of 5ms, the entire `agents` function could take upwards of 50s to execute. Because we were doing a database round trip to query the apps for each agent individually, and grabbing a `time.Now` value on each iteration of that loop, it's likely the portion of agents that were reported as disconnected were those whose last heartbeat was furthest in the past. The fix here is to set a consistent `now` before fetching agent data to avoid clock drift inflating the inactive timeout comparison, and replace the per-agent app query N+1 with a single batched lookup to prevent loop execution time from pushing agents over the disconnected threshold. Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 22:23:06 -07:00
Cian Johnston	c552f9f281	fix: stop group spend limits from leaking across org boundaries (#24294 ) Three SQL queries (`GetUserGroupSpendLimit`, `ResolveUserChatSpendLimit`, `GetUserChatSpendInPeriod`) aggregated chat spend limits and usage globally across all organizations. A restrictive group limit in org A would bleed into org B. ## Changes - Add `organization_id` parameter to all three SQL queries in `coderd/database/queries/chats.sql` - When nil UUID is passed, queries fall back to global behavior (backward compat for HTTP dashboard endpoints) - When real org ID is passed, limits and spend are scoped to that organization - Thread `organizationID` through `ResolveUsageLimitStatus` → `checkUsageLimit` → all chatd call sites - Update dbauthz wrappers for new param structs - HTTP endpoints (`chatCostSummary`, `getMyChatUsageLimitStatus`) pass `uuid.Nil` with TODO for future org-scoped UI - Add `TestResolveUsageLimitStatus_OrgScoped` with 5 test cases covering org isolation, nil-UUID fallback, spend scoping, and user override priority Closes coder/internal#1466 > 🤖	2026-04-14 16:56:17 +01:00
Yevhenii Shcherbina	b78eba9f9d	feat: make sure creds are always masked (#24241 ) ## Summary Adds a `sanitizeCredentialHint` safety check in the db-to-SDK conversion layer to ensure credential hints are always masked before being exposed in the API. Also adds `credential_kind` and `credential_hint` assertions to the session threads API test.	2026-04-13 10:14:38 -04:00
Thomas Kosiewski	6ab30123bf	feat: add chat debug log tables, queries, and SDK types (#23913 )	2026-04-13 15:06:06 +02:00
Cian Johnston	22062ec52e	feat: add organization scoping to chats (#23827 ) Fixes https://github.com/coder/internal/issues/1436 * Adds organization_id to chats with backfill (workspace org → user org membership → default org) * No support yet for ACLs (follow-up issue) - Cross-org workspace binding rejected (both in `CreateChatRequest` and in the `create_workspace` tool) - Adds `OrganizationAutocomplete` to `AgentCreateForm` - Docs updated with `organization_id` in chats-api.md > 🤖 Written by a Coder Agent. Reviewed by many humans and many agents. --------- Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>	2026-04-13 12:31:25 +01:00
Mathias Fredriksson	a62ead8588	fix(coderd): sort pinned chats first in GetChats pagination (#24222 ) The GetChats SQL query ordered by (updated_at, id) DESC with no pin_order awareness. A pinned chat with an old updated_at could land on page 2+ and be invisible in the sidebar's Pinned section. Add a 4-column ORDER BY: pinned-first flag DESC, negated pin_order DESC, updated_at DESC, id DESC. The negation trick keeps all sort columns DESC so the cursor tuple < comparison still works. Update the after_id cursor clause to match the expanded sort key. Fix the false handler comment claiming PinChatByID bumps updated_at.	2026-04-10 17:13:19 +00:00
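The negation trick in this commit can be sketched in Go, as an in-memory analogue of the SQL ordering rather than the query itself. The `chat` type, the integer IDs, and the convention that `PinOrder == 0` means "not pinned" are assumptions made for illustration. Every column of the key sorts descending, so a single tuple "less than" comparison serves as the cursor predicate.

```go
package main

import (
	"fmt"
	"sort"
)

// chat is a hypothetical, simplified row. PinOrder == 0 is assumed to mean
// "not pinned"; a lower positive PinOrder should appear earlier.
type chat struct {
	ID        int
	PinOrder  int
	UpdatedAt int64
}

// sortKey builds the four-column key: pinned flag, negated pin order,
// updated_at, id. Negating PinOrder lets all columns sort descending.
func sortKey(c chat) [4]int64 {
	pinned := int64(0)
	if c.PinOrder > 0 {
		pinned = 1
	}
	return [4]int64{pinned, -int64(c.PinOrder), c.UpdatedAt, int64(c.ID)}
}

// less reports whether key a is lexicographically smaller than key b.
func less(a, b [4]int64) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// sortChats orders chats descending on the whole key tuple.
func sortChats(chats []chat) {
	sort.Slice(chats, func(i, j int) bool {
		return less(sortKey(chats[j]), sortKey(chats[i]))
	})
}

// after returns the chats that follow cursor, using one tuple comparison.
func after(sorted []chat, cursor chat) []chat {
	var out []chat
	for _, c := range sorted {
		if less(sortKey(c), sortKey(cursor)) {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	chats := []chat{
		{ID: 1, PinOrder: 0, UpdatedAt: 300},
		{ID: 2, PinOrder: 2, UpdatedAt: 100},
		{ID: 3, PinOrder: 1, UpdatedAt: 50},
		{ID: 4, PinOrder: 0, UpdatedAt: 200},
	}
	sortChats(chats)
	for _, c := range chats {
		fmt.Println(c.ID)
	}
	fmt.Println(len(after(chats, chats[1])))
}
```

In this sample the pinned chats (3, then 2) come first even though their `UpdatedAt` values are the oldest, and paging after chat 2 yields the two unpinned chats.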
J. Scott Miller	7bde763b66	feat: add workspace build transition to provisioner job list (#24131 ) Closes #16332 Previously `coder provisioner jobs list` showed no indication of what a workspace build job was doing (i.e., start, stop, or delete). This adds `workspace_build_transition` to the provisioner job metadata, exposed in both the REST API and CLI. Template and workspace name columns were also added, both available via `-c`. ``` $ coder provisioner jobs list -c id,type,status,"workspace build transition" ID TYPE STATUS WORKSPACE BUILD TRANSITION 95f35545-a59f-4900-813d-80b8c8fd7a33 template_version_import succeeded 0a903bbe-cef5-4e72-9e62-f7e7b4dfbb7a workspace_build succeeded start ```	2026-04-10 09:50:11 -05:00
Matt Vollmer	36141fafad	feat: stack insights tables vertically and paginate Pull requests table (#24198 ) The "By model" and "Pull requests" tables on the PR Insights page (`/agents/settings/insights`) were side-by-side at `lg` breakpoints, and the Pull requests table was hard-capped at 20 rows by the backend. - Replaced `lg:grid-cols-2` with a single-column stacked layout so both tables span the full content width. - Removed the `LIMIT 20` from the `GetPRInsightsRecentPRs` SQL query so all PRs in the selected time range are returned. - Can add this back if we need it. If we do, we should add a little subheader above this table to indicate that we're not showing all PRs within the selected timeframe. - Added client-side pagination to the Pull requests table using `PaginationWidgetBase` (page size 10), matching the existing pattern in `ChatCostSummaryView`. - Renamed the section heading from "Recent" to "Pull requests" since it now shows the full set for the time range. <img width="1481" height="1817" alt="image" src="https://github.com/user-attachments/assets/0066c42f-4d7b-4cee-b64b-6680848edc68" /> > 🤖 PR generated with Coder Agents	2026-04-10 10:48:54 -04:00
Garrett Delfosse	3462c31f43	fix: update directory for terraform-managed subagents (#24220 ) When a devcontainer subagent is terraform-managed, the provisioner sets its directory to the host-side `workspace_folder` path at build time. At runtime, the agent injection code determines the correct container-internal path from `devcontainer read-configuration` and sends it via `CreateSubAgent`. However, the `CreateSubAgent` handler only updated `display_apps` for pre-existing agents, ignoring the `Directory` field. This caused SSH/terminal sessions to land in `~` instead of the workspace folder (e.g. `/workspaces/foo`). Add `UpdateWorkspaceAgentDirectoryByID` query and call it in the terraform-managed subagent update path to also persist the directory. Fixes PLAT-118 <details><summary>Root cause analysis</summary> Two code paths set the subagent `Directory` field: 1. Provisioner (build time): `insertDevcontainerSubagent` in `provisionerdserver.go` stores `dc.GetWorkspaceFolder()` — the host-side path from the `coder_devcontainer` Terraform resource (e.g. `/home/coder/project`). 2. Agent injection (runtime): `maybeInjectSubAgentIntoContainerLocked` in `api.go` reads the devcontainer config and gets the correct container-internal path (e.g. `/workspaces/project`), then calls `client.Create(ctx, subAgentConfig)`. For terraform-managed subagents (those with `req.Id != nil`), `CreateSubAgent` in `coderd/agentapi/subagent.go` recognized the pre-existing agent and entered the update path — but only called `UpdateWorkspaceAgentDisplayAppsByID`, discarding the `Directory` field from the request. The agent kept the stale host-side path, which doesn't exist inside the container, causing `expandPathToAbs` to fall back to `~`. </details> > [!NOTE] > Generated by Coder Agents	2026-04-10 10:11:22 -04:00
Zach	95cff8c5fb	feat: add REST API handlers and client methods for user secrets (#24107 ) Add the five REST endpoints for managing user secrets, SDK client methods, and handler tests. Endpoints: - `POST /api/v2/users/{user}/secrets` - `GET /api/v2/users/{user}/secrets` - `GET /api/v2/users/{user}/secrets/{name}` - `PATCH /api/v2/users/{user}/secrets/{name}` - `DELETE /api/v2/users/{user}/secrets/{name}` Routes are registered under the existing `/{user}` group with `ExtractUserParam`. The delete query was changed from `:exec` to `:execrows` so the handler can distinguish "not found" from success (DELETE with `:exec` silently returns nil for zero affected rows).	2026-04-09 12:12:55 -06:00
Yevhenii Shcherbina	8237822441	feat: byok observability api (#24207 ) ## Summary Exposes `credential_kind` and `credential_hint` on AI Bridge session threads, making credential metadata visible in the session detail API. Each thread in the `/api/v2/aibridge/sessions/{session_id}` response now includes: - `credential_kind`: `centralized` or `byok` - `credential_hint`: masked credential (e.g. `sk-a...pgAA`) Values are taken from the thread's root interception. ## Changes - `codersdk/aibridge.go`: Added `CredentialKind` and `CredentialHint` fields to `AIBridgeThread` - `coderd/database/db2sdk/db2sdk.go`: Populated from root interception in `buildAIBridgeThread` - `SessionTimeline.stories.tsx`: Added fields to mock thread data	2026-04-09 11:41:17 -04:00
Kyle Carberry	391b22aef7	feat: add CLI commands for managing chat context from workspaces (#24105 ) Adds `coder exp chat context add` and `coder exp chat context clear` commands that run inside a workspace to manage chat context files via the agent token. `add` reads instruction and skill files from a directory (defaulting to cwd) and inserts them as context-file messages into an active chat. Multiple calls are additive — `instructionFromContextFiles` already accumulates all context-file parts across messages. `clear` soft-deletes all context-file messages, causing `contextFileAgentID()` to return `!found` on the next turn, which triggers `needsInstructionPersist=true` and re-fetches defaults from the agent. Both commands auto-detect the target chat via `CODER_CHAT_ID` (already set by `agentproc` on chat-spawned processes), or fall back to single-active-chat resolution for the agent. The `--chat` flag overrides both. Also adds sub-agent context inheritance: `createChildSubagentChat` now copies parent context-file messages to child chats at spawn time, so delegated sub-agents share the same instruction context without independently re-fetching from the workspace agent. 
<details><summary>Implementation details</summary> New files: - `cli/exp_chat.go` — CLI command tree under `coder exp chat context` Modified files: - `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context from an arbitrary directory without env vars - `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext` SDK methods - `coderd/workspaceagents.go` — POST/DELETE handlers on `/workspaceagents/me/chat-context` - `coderd/coderd.go` — Route registration - `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`, `SoftDeleteContextFileMessages` - `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new queries - `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent inheritance - `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()` Auth pattern: Uses `AgentAuth` (same as `coder external-auth`) — agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars. </details> > 🤖 Generated by Coder Agents --------- Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>	2026-04-09 16:33:00 +02:00
Atif Ali	584c61acb5	fix: mark connecting agents as unhealthy instead of healthy (#24044 ) ## Problem Workspaces showed as "Healthy" immediately after creation while the agent was still downloading, starting, or connecting. If the agent never connected, the workspace stayed "Healthy" for the entire connection timeout (~120s), then abruptly flipped to "Unhealthy". ## Root cause In `db2sdk.WorkspaceAgent`, the health switch had no case for `WorkspaceAgentConnecting`. Agents in `connecting` status with a non-`off` lifecycle (e.g. `created` after a fresh build) fell through to the `default` case and were marked `Healthy = true`. ## Fix Add an explicit case for `WorkspaceAgentConnecting` that sets `Healthy = false` with reason `"agent has not yet connected"`. The case is placed after the existing `!connected + off` case (which correctly catches stopped agents as "not running") and before the `timeout`/`disconnected` cases. ``` Status + Lifecycle → Health reason ────────────────────────────────────────────────────── any !connected + off → "agent is not running" connecting + created/starting → "agent has not yet connected" ← NEW timeout + any → "agent is taking too long to connect" disconnected + any → "agent has lost connection" connected + start_error → "agent startup script exited with an error" connected + shutting_down → "agent is shutting down" connected + ready/starting → healthy ``` The frontend already handles this case — `getAgentHealthIssue()` returns "Workspace agent is still connecting" with `severity: "info"` for unhealthy workspaces with connecting agents. ## Test changes - Healthy test: now actually connects the agent via `agenttest.New` before asserting health (previously passed due to the bug). - New Connecting test: verifies a never-connected agent is correctly marked unhealthy. - Mixed health test: connects a1 and waits for the mixed state (`a1.Healthy && !workspace.Healthy`) to avoid a race where both agents are initially connecting. 
- Sub-agent excluded test: connects the parent agent and waits for it to be healthy before creating the sub-agent. - TestWorkspaceAgent/Connect: flipped assertion to `Health.Healthy == false` for a `dbfake` agent that never connects. <details> <summary>Review notes</summary> ### Known follow-up The `healthy:false` workspace search filter maps to `[disconnected, timeout]` and does not include `connecting`. This is a pre-existing gap that is now more consequential — a workspace unhealthy solely due to a connecting agent won't appear in `healthy:false` results. Worth a follow-up issue. ### Deep review findings addressed \| Finding \| Severity \| Status \| \|---------\|----------\|--------\| \| Mixed health test race (all 3 reviewers) \| P2 \| Fixed — tightened `Eventually` condition \| \| `TestWorkspaceAgent/Connect` assertion break \| P1 \| Fixed — flipped assertion \| \| CLI renders red for connecting agents \| Obs \| Acknowledged — design trade-off, accurate but visually strong for transient state \| \| Switch case ordering overlap \| Obs \| Documented with inline comment \| </details> > 🤖 This PR was created with the help of Coder Agents, and needs a human review. 🧑💻	2026-04-09 13:21:28 +05:00
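The status and lifecycle table in this commit message can be read as an ordered switch. The Go sketch below uses plain strings and a hypothetical `agentHealth` function instead of the real `db2sdk.WorkspaceAgent` types, and it treats every combination the table does not list as healthy, which is an assumption. Case order matters: the `off` check must come before the `connecting` check so that stopped agents report "not running".

```go
package main

import "fmt"

// agentHealth is a hypothetical, string-based rendering of the table in the
// commit message. It returns whether the agent is healthy and, if not, why.
func agentHealth(status, lifecycle string) (bool, string) {
	switch {
	case status != "connected" && lifecycle == "off":
		return false, "agent is not running"
	case status == "connecting":
		// The case added by the fix; previously this fell through to default.
		return false, "agent has not yet connected"
	case status == "timeout":
		return false, "agent is taking too long to connect"
	case status == "disconnected":
		return false, "agent has lost connection"
	case lifecycle == "start_error":
		return false, "agent startup script exited with an error"
	case lifecycle == "shutting_down":
		return false, "agent is shutting down"
	default:
		// Assumption: anything not listed in the table counts as healthy.
		return true, ""
	}
}

func main() {
	for _, c := range [][2]string{
		{"connecting", "off"},
		{"connecting", "created"},
		{"connected", "ready"},
	} {
		healthy, reason := agentHealth(c[0], c[1])
		fmt.Println(c[0], c[1], healthy, reason)
	}
}
```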
Zach	9b91af8ab7	feat: add user secrets SDK types and db2sdk converters (#24102 ) Adds the SDK types and database-to-SDK conversion helpers for the user secrets feature.	2026-04-08 16:48:41 -06:00
Yevhenii Shcherbina	7f496c2f18	feat: byok-observability for aibridge (#23808 ) ## Summary Adds `credential_kind` and `credential_hint` columns to `aibridge_interceptions` to record how each LLM request was authenticated and provide a masked credential identifier for audit purposes. This enables admins to distinguish between centralized API keys, personal API keys, and subscription-based credentials in the interceptions audit log. ## Changes - New migration adding `credential_kind` and `credential_hint` to `aibridge_interceptions` - Updated `InsertAIBridgeInterception` query and proto definition to carry the new fields - Wired proto fields through `translator.go` and `aibridgedserver.go` to the database Depends on https://github.com/coder/aibridge/pull/239	2026-04-08 13:24:28 -04:00
Kyle Carberry	b969d66978	feat: add dynamic tools support for chat API (#24036 ) Adds client-executed dynamic tools to the chat API. Dynamic tools are declared by the client at chat creation time, presented to the LLM alongside built-in tools, but executed by the client rather than chatd. This enables external systems (Slack bots, IDE extensions, Discord bots, CI/CD integrations) to plug custom tools into the LLM chat loop without modifying chatd's built-in tool set. Modeled after OpenAI's Assistants API: the chat pauses with `requires_action` status when the LLM calls a dynamic tool, the client POSTs results back via `POST /chats/{id}/tool-results`, and the chat resumes. See [this example](https://github.com/coder/coder-slackbot-poc) as a reference for how this is used. It's highly-configurable, which would enable creating chats from webhooks, periodically polling, or running as a Slackbot. <details> <summary>Design context</summary> ### Architecture The chatloop exits when it encounters dynamic tools and re-enters when results arrive. No blocking channels, no pubsub for tool results, no in-memory registry. The DB is the only coordination mechanism. ``` Phase 1 (chatloop): LLM response → execute built-in tools only → Persist(assistant + built-in results) → status = requires_action → chatloop exits Phase 2 (POST /tool-results): Persist(dynamic tool results) → status = pending → wakeCh → chatloop re-enters ``` ### Validation (POST /tool-results) 1. Chat status must be `requires_action` (409 if not) 2. Read chat's `dynamic_tools` → set of dynamic tool names 3. Read last assistant message → extract tool-call parts matching dynamic tool names 4. Submitted tool_call_ids must match exactly (400 for missing/extra) 5. Persist tool-result message parts, set status to `pending`, signal wake ### Idempotency Tool call IDs scoped per LLM step. State machine (`requires_action` → `pending`) is the guard. First POST wins, subsequent get 409. 
### Mixed tool calls When the LLM calls both built-in and dynamic tools in one step, built-in tools execute immediately. Their results are persisted in phase 1. Dynamic tool results arrive via POST in phase 2. The LLM sees all results when the chatloop resumes. </details> > 🤖 Generated by Coder Agents	2026-04-08 11:54:44 -04:00
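The validation steps listed for `POST /tool-results` can be sketched in Go as follows. The function names `diffIDs` and `checkSubmission` are hypothetical, and the 200 success status is an assumption; only the 409 and 400 cases are stated in the commit message. The sketch checks the state-machine guard first and then requires the submitted tool-call IDs to match the pending ones exactly.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
)

// diffIDs returns pending IDs that were not submitted (missing) and
// submitted IDs that were never requested (extra), both sorted.
func diffIDs(pending, submitted []string) (missing, extra []string) {
	want := make(map[string]bool, len(pending))
	for _, id := range pending {
		want[id] = true
	}
	got := make(map[string]bool, len(submitted))
	for _, id := range submitted {
		if got[id] {
			continue // ignore duplicates in the submission
		}
		got[id] = true
		if !want[id] {
			extra = append(extra, id)
		}
	}
	for id := range want {
		if !got[id] {
			missing = append(missing, id)
		}
	}
	sort.Strings(missing)
	sort.Strings(extra)
	return missing, extra
}

// checkSubmission is a hypothetical validator returning an HTTP status.
func checkSubmission(chatStatus string, pending, submitted []string) int {
	if chatStatus != "requires_action" {
		return http.StatusConflict // a later POST after the first one wins
	}
	missing, extra := diffIDs(pending, submitted)
	if len(missing) > 0 || len(extra) > 0 {
		return http.StatusBadRequest
	}
	return http.StatusOK // assumed success status
}

func main() {
	pending := []string{"call_1", "call_2"}
	fmt.Println(checkSubmission("requires_action", pending, []string{"call_1"}))
	fmt.Println(checkSubmission("requires_action", pending, []string{"call_2", "call_1"}))
	fmt.Println(checkSubmission("pending", pending, []string{"call_1", "call_2"}))
}
```

Because the status check runs first, a second POST for the same step is rejected with 409 once the chat has moved to `pending`, which matches the idempotency note in the design context.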
Kyle Carberry	c5d720f73d	feat(coderd): add telemetry for agents chats and messages (#24068 ) Adds telemetry collection for the agents chat system (`/agents`) to the existing telemetry snapshot pipeline. Three new snapshot fields: - `Chats` — per-chat metadata (id, owner, status, mode, workspace_id, root_chat_id, has_parent, archived, model config) collected time-windowed via `createdAfter` - `ChatMessageSummaries` — per-chat aggregated message metrics (counts by role, token sums by type, cost, runtime, model count, compression count) collected time-windowed - `ChatModelConfigs` — model configuration metadata (provider, model, context limit, enabled, default) collected as full dump No PII is included — titles, message content, and URLs are excluded at the SQL level. Only structural metadata flows through telemetry. <details><summary>Implementation plan</summary> ### SQL Queries (`coderd/database/queries/chats.sql`) - `GetChatsCreatedAfter` — time-windowed chat metadata - `GetChatMessageSummariesPerChat` — per-chat message aggregates via `GROUP BY` - `GetChatModelConfigsForTelemetry` — full dump of model configs ### Telemetry (`coderd/telemetry/telemetry.go`) - `Chat`, `ChatMessageSummary`, `ChatModelConfig` structs - `ConvertChat`, `ConvertChatMessageSummary`, `ConvertChatModelConfig` conversion functions - Three `eg.Go()` blocks in `createSnapshot()` following the existing collection pattern ### Authorization (`coderd/database/dbauthz/dbauthz.go`) - System-only access for all three queries via `rbac.ResourceSystem` ### Tests - `TestChatsTelemetry` in `coderd/telemetry/telemetry_test.go` — creates chats (root + child), messages with token/cost data, model configs; verifies all snapshot fields - dbauthz test entries for all three queries in `coderd/database/dbauthz/dbauthz_test.go` </details> > 🤖 Generated by Coder Agents	2026-04-08 09:47:44 -04:00
Cian Johnston	233343c010	feat: add chat and chat_files cleanup to dbpurge (#23833 ) Fixes https://github.com/coder/coder/issues/23910 Adds periodic cleanup of chats and chat files to the dbpurge background goroutine, with a configurable retention period exposed in the Agent settings UI. > 🤖 Written by a Coder Agent. Reviewed by a human.	2026-04-08 11:08:09 +01:00
Zach	565a15bc9b	feat: update user secrets queries for REST API and injection (#23998 ) Update queries as prep work for user secrets API development: - Switch all lookups and mutations from ID-based to user_id + name - Split list query into metadata-only (for API responses) and with-values (for provisioner/agent) - Add partial update support using CASE WHEN pattern for write-only value fields - Include value_key_id in create for dbcrypt encryption support - Update dbauthz wrappers and remove stale methods from dbmetrics	2026-04-07 09:03:28 -06:00
Kyle Carberry	684f21740d	perf(coderd): batch chat heartbeat queries into single UPDATE per interval (#24037 ) ## Summary Replaces N per-chat heartbeat goroutines with a single centralized heartbeat loop that issues one `UPDATE` per 30s interval for all running chats on a worker. ## Problem Each running chat spawned a dedicated goroutine that issued an individual `UPDATE chats SET heartbeat_at = NOW() WHERE id = $1 AND worker_id = $2 AND status = 'running'` query every 30 seconds. At 10,000 concurrent chats this produces ~333 DB queries/second just for heartbeats, plus ~333 `ActivityBumpWorkspace` CTE queries/second from `trackWorkspaceUsage`. ## Solution New `UpdateChatHeartbeats` (plural) SQL query replaces the old singular `UpdateChatHeartbeat`: ```sql UPDATE chats SET heartbeat_at = @now::timestamptz WHERE worker_id = @worker_id::uuid AND status = 'running'::chat_status RETURNING id; ``` A single `heartbeatLoop` goroutine on the `Server`: 1. Ticks every `chatHeartbeatInterval` (30s) 2. Issues one batch UPDATE for all registered chats 3. Detects stolen/completed chats via set-difference (equivalent of old `rows == 0`) 4. Calls `trackWorkspaceUsage` for surviving chats `processChat` registers an entry in the heartbeat registry instead of spawning a goroutine. ## Impact \| Metric \| Before (10K chats) \| After (10K chats) \| \|---\|---\|---\| \| Heartbeat queries/sec \| ~333 \| ~0.03 (1 per 30s per replica) \| \| Heartbeat goroutines \| 10,000 \| 1 \| \| Self-interrupt detection \| Per-chat `rows==0` \| Batch set-difference \| --- > 🤖 Generated by Coder Agents <details><summary>Implementation notes</summary> - Uses `@now` parameter instead of `NOW()` so tests with `quartz.Mock` can control timestamps. - `heartbeatEntry` stores `context.CancelCauseFunc` + workspace state for the centralized loop. - `recoverStaleChats` is unaffected — it reads `heartbeat_at` which is still updated. - The old singular `UpdateChatHeartbeat` is removed entirely. 
- `dbauthz` wrapper uses system-level `rbac.ResourceChat` authorization (same pattern as `AcquireChats`). </details>	2026-04-07 10:25:46 -04:00
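The set-difference step that replaces the old per-chat `rows == 0` check can be sketched in Go. The function name `lostChats` and the use of string IDs are assumptions made for illustration. Any chat that the worker has registered but that the batch `UPDATE ... RETURNING id` did not return is treated as stolen or completed.

```go
package main

import (
	"fmt"
	"sort"
)

// lostChats returns the registered chat IDs that are absent from the IDs
// returned by the batch heartbeat UPDATE, sorted for stable output.
func lostChats(registered map[string]struct{}, updated []string) []string {
	alive := make(map[string]struct{}, len(updated))
	for _, id := range updated {
		alive[id] = struct{}{}
	}
	var lost []string
	for id := range registered {
		if _, ok := alive[id]; !ok {
			lost = append(lost, id)
		}
	}
	sort.Strings(lost)
	return lost
}

func main() {
	registered := map[string]struct{}{"chat-a": {}, "chat-b": {}, "chat-c": {}}
	// Pretend the batch UPDATE only returned two of the three chats.
	updated := []string{"chat-a", "chat-c"}
	fmt.Println(lostChats(registered, updated))
}
```

In a loop like the one described, the lost chats would be cancelled and the surviving ones passed on to usage tracking.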
George K	86ca61d6ca	perf: cap count queries and emit native UUID comparisons for audit/connection logs (#23835 ) Audit and connection log pages were timing out due to expensive COUNT(*) queries over large tables. This commit adds opt-in count capping: requests can return a `count_cap` field signaling that the count was truncated at a threshold, avoiding full table scans that caused page timeouts. Text-cast UUID comparisons in regosql-generated authorization queries also contributed to the slowdown by preventing index usage for connection and audit log queries. These now emit native UUID operators. Frontend changes handle the capped state in usePaginatedQuery and PaginationWidget, optionally displaying a capped count in the pagination UI (e.g. "Showing 2,076 to 2,100 of 2,000+ logs") Related to: https://linear.app/codercom/issue/PLAT-31/connectionaudit-log-performance-issue	2026-04-07 07:24:53 -07:00
Cian Johnston	d5a1792f07	feat: track chat file associations with chat_file_links on chats (#23537 ) Needed by #23833 Adds a `chat_file_links` association table to track which files are associated with each chat. - `AppendChatFileIDs` query links a file to a chat with deduplication - `GetChatFileMetadataByIDs` query returns lightweight file metadata by IDs - Tool-created files (e.g. `propose_plan`) are linked to the chat after insert - User-uploaded files are linked to the chat when the referencing message is sent - Single-chat GET endpoint hydrates `files: ChatFileMetadata[]` on the response > 🤖 Created by Coder Agents and massaged into shape by a human.	2026-04-07 12:05:29 +01:00
Kyle Carberry	a2ce74f398	feat: add total_runtime_ms to chat cost analytics endpoints (#24050 ) Surface the aggregated `runtime_ms` from `chat_messages` through all four cost analytics queries (summary, per-model, per-chat, per-user). This is the key billing metric for agent compute time. The per-chat breakdown already groups by `root_chat_id`, so subagent runtime is automatically rolled up under the parent chat — no additional query changes needed. <details> <summary>Implementation details</summary> SQL (`coderd/database/queries/chats.sql`): Added `COALESCE(SUM(cm.runtime_ms), 0)::bigint AS total_runtime_ms` to `GetChatCostSummary`, `GetChatCostPerModel`, `GetChatCostPerChat`, and `GetChatCostPerUser`. Go SDK (`codersdk/chats.go`): Added `TotalRuntimeMs int64` to `ChatCostSummary`, `ChatCostModelBreakdown`, `ChatCostChatBreakdown`, and `ChatCostUserRollup`. Handler (`coderd/exp_chats.go`): Wired the new field through all converter functions and the response assembly. Tests (`coderd/exp_chats_test.go`): Updated fixture to seed non-zero `runtime_ms` values and added assertions for the new field at summary, per-model, and per-chat levels. </details> > 🤖 Generated by Coder Agents	2026-04-06 12:10:57 -04:00
Jon Ayers	a1d51f0dab	feat: batch connection logs to avoid DB lock contention (#23727 ) - Running 30k connections was generating a ton of lock contention in the DB	2026-04-03 15:47:26 -05:00
Jon Ayers	333503f74e	feat: improve coordinator peer mapping performance (#23696 ) - Skipping DB querying entirely for peers that aren't actually connected to our coordinator - Opportunistically batching the queries for peers	2026-04-03 14:22:58 -05:00
Paweł Banaszewski	8369fa88fd	feat: add columns for cached tokens from aibridge (#23832 ) Two new columns added to aibridge_token_usages: - cache_read_input_tokens (BIGINT, default 0) - cache_write_input_tokens (BIGINT, default 0) Migration backfills existing rows by extracting values from the metadata JSONB column (cache_read_input, input_cached, prompt_cached for reads (max value selected since only 1 should be set), cache_creation_input for writes). All references to data from metadata were updated to reference new columns. No other changes than changing where data is extracted from. Requires aibridge library version bump to include: https://github.com/coder/aibridge/pull/229 Fixes: https://github.com/coder/aibridge/issues/150	2026-04-03 16:27:31 +02:00
Zach	990c006f28	feat(coderd/database): add value_key_id column to user_secrets for encryption (#23997 ) Add a nullable `value_key_id` column to the `user_secrets` table with a foreign key to `dbcrypt_keys`. This is the column dbcrypt uses to track which encryption key encrypted a given secret's value. This is required for encryption of user secret values. The column was missing from the original migration (000357).	2026-04-02 15:40:32 -06:00
Michael Suchacz	7d0a0c6495	feat: provider key policies and user provider settings (#23751 )	2026-04-02 19:46:42 +02:00
Susana Ferreira	fb788530b3	feat: add provider_name column to aibridge interceptions (#23960 ) ## Description Adds `provider_name` to aibridge interceptions to store the provider instance name alongside the provider type. This allows distinguishing between multiple instances of the same provider type (e.g. `copilot` vs `copilot-business`). ## Changes * Add `provider_name` column to `aibridge_interceptions` table with backfill from `provider`. * Add `provider_name` field to the proto `RecordInterceptionRequest` message. * Add `ProviderName` to the `codersdk.AIBridgeInterception` API response. _Disclaimer: initially produced by Claude Opus 4.6, modified and reviewed by @ssncferreira ._	2026-04-02 10:58:13 +01:00
Ethan	7757cd8e08	refactor(coderd/x/chatd): insert chats directly as pending on creation (#23888 ) Previously, `CreateChat` inserted the `chats` row with the DB default status (`waiting`), then updated it to `pending` in the same transaction via `setChatPendingWithStore`. This wasted two extra queries per chat creation (`GetChatByID` + `UpdateChatStatus`) and rewrote the same row immediately after inserting it. Now `CreateChat` passes the status directly to `InsertChat`, so the row is written once in its final create-time state. The `setChatPendingWithStore` helper is removed entirely. `InsertChat` now requires an explicit `status` parameter at all callsites instead of relying on a DB column default. ## Motivation On an experimental branch we're trialing firing all chatd notifications from plpgsql triggers. The old two-step insert made that awkward: in an `AFTER INSERT` trigger, `NEW` only contained the insert-time row (`waiting`), not the final committed state (`pending`). To emit the correct event payload the trigger had to be deferred and re-read the row from `chats` at commit time. With this change, `NEW` already contains the correct row to publish — no deferred trigger, no extra `SELECT`, simpler and cheaper trigger logic. That said, this seems like a worthwhile change regardless of the trigger experiment: writing the final row state once removes unnecessary DB work on every chat creation and makes the create path easier to reason about.	2026-04-02 14:13:51 +11:00
Ethan	5cba59af79	fix(coderd): unarchive child chats with parents (#23761 ) Unarchiving a root chat now restores descendant chats in the database and emits lifecycle events for every affected chat so passive sessions converge without a full refetch. This keeps archive and unarchive symmetric at both the data and watch-stream layers by returning the affected chat family from the database, using those post-update rows for chatd pubsub fanout, and covering descendant lifecycle delivery with a watch-level regression test. Closes #23666	2026-04-01 15:30:25 +11:00
Danny Kopping	9fa103929a	perf: make `ListAIBridgeSessions` 10x faster (#23774 ) _Disclaimer: produced using Claude Opus 4.6, reviewed by me, and validated against Dogfood dataset._ The `ListAIBridgeSessions` query materialized and aggregated all matching interceptions before paginating, then ran expensive token/prompt lookups across the full dataset. For a page of 25 sessions against ~200k interceptions (our dogfood dataset), this meant: - Three CTEs scanning all rows (filtered_interceptions, session_tokens, session_root) - ARRAY_AGG(fi.id) collecting every interception ID per session - Lateral prompt lookup via ANY(array_of_all_ids) running for every session, not just the page - ~90MB of disk sorts and JIT compilation kicking in The improvement is to restructure to paginate first and enrich after: a single CTE groups interceptions into sessions with only cheap aggregates (MIN, MAX, COUNT), applies cursor pagination and LIMIT, then lateral joins fetch metadata, tokens, and prompts for just the ~25-row page. Measured against 220k interceptions / 160k sessions: \| Metric \| Before \| After \| \|--------------------\|--------\|-------\| \| Execution time \| 1800ms \| 185ms \| \| Shared buffer hits \| 737k \| 2.6k \| \| Disk sort spill \| 86MB \| 16MB \| \| Lateral loops \| 160k \| 25 \| https://grafana.dev.coder.com/goto/fbODPGtvR?orgId=1 the results are identical, just _much_ faster. --- Also includes some additional tests which I added prior to refactoring the query to ensure no regressions on edge-cases. --------- Signed-off-by: Danny Kopping <danny@coder.com>	2026-03-31 14:42:23 +02:00
Cian Johnston	3ce82bb885	feat: add chat-access site-wide role to gate chat creation (#23724 ) - Add `chat-access` built-in role granting chat CRUD at User scope - Exclude `ResourceChat` from member, org member, and org service account `allPermsExcept` calls - Allow system, owner, and user-admin to assign the new role - Migration auto-assigns role to users who have ever created a chat - Update RBAC test matrix: `memberMe` denied, `chatAccessUser` allowed Breaking change: Members without `chat-access` lose chat creation ability. Migration covers existing chat creators. Members who have never created a chat do not get this role automatically applied. > 🤖 This PR was created by a Coder Agent and reviewed by me.	2026-03-31 10:07:21 +01:00
Kyle Carberry	a5cc579453	feat: add last_injected_context column to chats table (#23798 ) Adds a nullable JSONB column `last_injected_context` to the `chats` table that stores the most recently persisted injected context parts (AGENTS.md context-file and skill message parts). The column is updated only when `persistInstructionFiles()` runs — on first workspace attach or when the agent changes — so there are no redundant writes on subsequent turns. Internal fields (`ContextFileContent`, `ContextFileOS`, `ContextFileDirectory`, `SkillDir`) are stripped at write time so the column only holds small metadata. No stripping needed on the read path. <details> <summary>Implementation notes</summary> - New migration `000456` adds nullable `last_injected_context JSONB` column. - New SQL query `UpdateChatLastInjectedContext` writes the column without touching `updated_at`. - `persistInstructionFiles()` strips internal fields from parts via `StripInternal()` before persisting. - Sentinel path (no AGENTS.md) persists skill-only parts when skills exist. - `codersdk.Chat` exposes `LastInjectedContext []ChatMessagePart` (omitempty). - `db2sdk.Chat()` passes through the already-clean data. </details>	2026-03-30 14:11:30 -04:00
Jake Howell	71a492a374	feat: implement `<ClientFilter />` to AI Bridge request logs (#22694 ) Closes #22136 This pull-request implements a `<ClientFilter />` to our `Request Logs` page for AI Bridge. This will allow the user to select a client which they wish to filter against. Technically the backend is able to actually filter against multiple clients at once however the frontend doesn't currently have a nice way of supporting this (future improvement). <img width="1447" height="831" alt="image" src="https://github.com/user-attachments/assets/0be234e2-25f2-4a89-b971-d74817395da1" /> --------- Co-authored-by: Jeremy Ruppel <jeremy.ruppel@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 17:18:28 -04:00
Kyle Carberry	bcdc35ee3e	feat: add chat read/unread indicator to sidebar (#23129 ) ## Summary Adds read/unread tracking for chats so users can see which agent conversations have new assistant messages they haven't viewed. ## Backend Changes - Adds `last_read_message_id` column to the `chats` table (migration 000439). - Computes `has_unread` as a virtual column in `GetChatsByOwnerID` using an `EXISTS` subquery checking for assistant messages beyond the read cursor. - Exposes `has_unread` on the `codersdk.Chat` struct and auto-generated TypeScript types. - Updates `last_read_message_id` on stream connect/disconnect in `streamChat`, avoiding per-message API calls during active streaming. - Uses `context.WithoutCancel` for the deferred disconnect write so the DB update succeeds even after the client disconnects. ## Frontend Changes - Bold title (`font-semibold`) for unread chats in the sidebar. - Small blue dot indicator next to the relative timestamp. - Suppresses unread indicator for the currently active chat via `isActive` from NavLink. ## Design Decisions - Only `assistant` messages count as unread — the user's own messages don't trigger the indicator. - No foreign key on `last_read_message_id` since messages can be deleted (via rollback/truncation) and the column is just a high-water mark. - Zero API calls during streaming: exactly 2 DB writes per stream session (connect + disconnect). - Unread state refreshes on chat list load and window focus. The `watchChats` WebSocket optimistically marks non-active chats as unread on `status_change` events, but does not carry a server-computed `has_unread` field. Navigating to a chat optimistically clears its unread indicator in the cache.	2026-03-27 12:15:04 -04:00
Kyle Carberry	d973a709df	feat: add model_intent option to MCP server configs (#23717 ) Add a per-MCP-server `model_intent` toggle that wraps tool schemas with a `model_intent` field, requiring the LLM to provide a human-readable description of each tool call's purpose. The intent string is shown as a status label in the UI instead of opaque tool names, and is transparently stripped before the call reaches the remote MCP server. Built-in tools have rich specialized renderers (terminal blocks, file diffs, etc.) and don't need this. MCP tools hit `GenericToolRenderer` which only shows raw tool names and JSON — that's where model_intent adds value. The model learns what to provide via the JSON Schema `description` on the `model_intent` property itself — no system prompt changes needed. <details> <summary>Implementation details</summary> ### Architecture Inspired by the `withModelIntent()` pattern from `coder/blink`, adapted for Go + React. The wrapping is entirely in the `mcpclient` layer — tool implementations never see `model_intent`. Schema wrapping (`mcpToolWrapper.Info()`): When enabled, wraps the original tool parameters under a `properties` key and adds a `model_intent` string field with a rich description that teaches the model inline. Input unwrapping (`mcpToolWrapper.Run()`): Strips `model_intent` and unwraps `properties` before forwarding to the remote MCP server. Handles three input shapes models may produce: 1. `{ model_intent, properties: {...} }` — correct format 2. `{ model_intent, key: val, ... }` — flat, no wrapper 3. Malformed — falls through gracefully Frontend extraction: `streamState.ts` extracts `model_intent` from incrementally parsed streaming JSON. `messageParsing.ts` extracts it from persisted tool call args. UI rendering: `GenericToolRenderer` shows the capitalized intent string as the primary label when available, falling back to the raw tool name. 
### Changes - Database: `model_intent` boolean column on `mcp_server_configs` - SDK: `ModelIntent` field on config/create/update types - API: pass-through in create/update handlers + converter - mcpclient: schema wrapping in `Info()`, input unwrapping in `Run()` - Frontend: extraction from streaming + persisted args - UI: intent label in `GenericToolRenderer`, toggle in admin panel - Tests: 6 new tests (schema wrapping, unwrapping, passthrough, fallback) ### Decision log - Option lives on MCPServerConfig, not model config: Built-in tools already have rich renderers; only MCP tools benefit from model_intent. - No system prompt changes: The JSON Schema `description` on the `model_intent` property teaches the model inline. - Pointer bool on update request: Follows existing pattern (`*bool`) so PATCH requests don't reset the value when omitted. </details>	2026-03-27 14:23:25 +00:00
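The input unwrapping described for `mcpToolWrapper.Run()` in d973a709df can be sketched roughly as follows. This is a minimal illustration, not the actual `mcpclient` code: `unwrapModelIntent` is a hypothetical helper, and both the malformed-input behavior (forward an empty argument object) and the rule for recognizing the wrapped shape (a lone `properties` object) are assumptions made for the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unwrapModelIntent is a hypothetical helper (not the real mcpclient code).
// It returns the intent string, if any, and the arguments that would be
// forwarded to the remote MCP server with model_intent stripped.
func unwrapModelIntent(raw []byte) (string, map[string]any) {
	var input map[string]any
	if err := json.Unmarshal(raw, &input); err != nil || input == nil {
		// Shape 3: malformed input. Assumption: forward empty arguments.
		return "", map[string]any{}
	}

	intent, _ := input["model_intent"].(string)
	delete(input, "model_intent")

	// Shape 1: { model_intent, properties: {...} }.
	// Assumption: the wrapper is recognized when "properties" is the only
	// remaining key and holds an object.
	if props, ok := input["properties"].(map[string]any); ok && len(input) == 1 {
		return intent, props
	}

	// Shape 2: { model_intent, key: val, ... } with no wrapper.
	return intent, input
}

func main() {
	intent, args := unwrapModelIntent(
		[]byte(`{"model_intent":"list files in /tmp","properties":{"path":"/tmp"}}`),
	)
	fmt.Println(intent, args)
}
```

A tool whose real parameters consist of a single `properties` object would be ambiguous under this rule; the sketch does not try to resolve that case.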
Michael Suchacz	2312e5c428	feat: add manual chat title regeneration (#23633 ) ## Summary Adds a "Generate new title" action that lets users manually regenerate a chat's title using richer conversation context than the automatic first-message title path. ## Changes ### Backend - New endpoint: `POST /api/experimental/chats/{chatID}/title/regenerate` returns the updated Chat with a regenerated title - Manual title algorithm: Extracts useful user/assistant text turns → selects first user turn + last 3 turns → builds context with gap markers → renders prompt with anti-recency guidance → calls lightweight model → normalizes output - Helpers: `extractManualTitleTurns`, `selectManualTitleTurnIndexes`, `buildManualTitleContext`, `renderManualTitlePrompt`, `generateManualTitle` — all private, with the public `Server.RegenerateChatTitle` method - SDK: `ExperimentalClient.RegenerateChatTitle(ctx, chatID) (Chat, error)` - Persists title via existing `UpdateChatByID` and broadcasts `ChatEventKindTitleChange` ### Frontend - API client method + React Query mutation with cache invalidation - "Generate new title" menu item (with wand icon) in both TopBar and Sidebar dropdown menus - Loading/disabled state while regeneration is in-flight - Error toast on failure - Stories updated for both menus ### Tests - `quickgen_test.go`: Table-driven tests for all 4 helper functions (turn extraction, index selection, context building, prompt rendering) - `exp_chats_test.go`: Handler tests (ChatNotFound, NotFoundForDifferentUser, NoDaemon) ## Design notes - The existing auto-title path (`maybeGenerateChatTitle`, `titleInput`) is completely unchanged - Manual regeneration uses richer context (first user turn + last 3 turns + gap markers) vs the auto path's single first message - Endpoint is experimental and marked with `@x-apidocgen {"skip": true}`	2026-03-27 01:47:19 +01:00
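The turn selection in 2312e5c428 (first user turn plus the last 3 turns, with gap markers) might look something like the sketch below. The `turn` type, the function names, and the `[...]` marker text are hypothetical; the real helpers (`selectManualTitleTurnIndexes`, `buildManualTitleContext`) are private and their signatures are not shown in the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// turn is a hypothetical, simplified stand-in for a chat turn.
type turn struct {
	Role string
	Text string
}

// selectTurnIndexes returns the sorted, de-duplicated indexes of the first
// user turn and the last `tail` turns.
func selectTurnIndexes(turns []turn, tail int) []int {
	picked := make(map[int]bool)
	for i, t := range turns {
		if t.Role == "user" {
			picked[i] = true
			break
		}
	}
	start := len(turns) - tail
	if start < 0 {
		start = 0
	}
	for i := start; i < len(turns); i++ {
		picked[i] = true
	}
	out := make([]int, 0, len(picked))
	for i := range picked {
		out = append(out, i)
	}
	sort.Ints(out)
	return out
}

// buildContext renders the selected turns and inserts a gap marker wherever
// turns were skipped. The "[...]" marker text is an assumption.
func buildContext(turns []turn, idx []int) string {
	var b strings.Builder
	for n, i := range idx {
		if n > 0 && i > idx[n-1]+1 {
			b.WriteString("[...]\n")
		}
		fmt.Fprintf(&b, "%s: %s\n", turns[i].Role, turns[i].Text)
	}
	return b.String()
}

func main() {
	turns := []turn{
		{"user", "fix login bug"}, {"assistant", "looking"},
		{"user", "also add tests"}, {"assistant", "added tests"},
		{"user", "ship it"}, {"assistant", "done"},
	}
	fmt.Print(buildContext(turns, selectTurnIndexes(turns, 3)))
}
```

Keeping the first user turn alongside the most recent turns is what lets the title reflect the original request rather than only the tail of the conversation.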
Matt Vollmer	113aaa79a0	feat: add pinned chats with drag-to-reorder (#23615 ) https://github.com/user-attachments/assets/bd5d12a1-61b3-4b7d-83b6-317bdfb60b3c ## Summary Adds pinned chats to the agents page sidebar with server-side persistence and drag-to-reorder. Users can pin/unpin chats via the context menu, and pinned chats appear in a dedicated "Pinned" section above the time-grouped list. ## Database Migration `000453_chat_pin_order`: adds `pin_order integer DEFAULT 0 NOT NULL` column on `chats` (0 = unpinned, 1+ = pinned in display order). Three SQL queries handle pin operations server-side using CTEs with `ROW_NUMBER()`: - `PinChatByID`: normalizes existing orders and appends to end - `UnpinChatByID`: sets target to 0 and compacts remaining pins - `UpdateChatPinOrder`: shifts neighbors, clamps to `[1, pinned_count]` All queries exclude archived chats. `ArchiveChatByID` clears `pin_order` on archive. The handler rejects pinning archived chats with 400. ## Backend Pin/unpin/reorder go through the existing `PATCH /api/experimental/chats/{chat}` via the `pin_order` field on `UpdateChatRequest`. The handler routes based on current pin state: `pin_order == 0` unpins, `> 0` on an already-pinned chat reorders, `> 0` on an unpinned chat appends to end. ## Frontend - `pinChat` / `unpinChat` / `reorderPinnedChat` optimistic mutations using shared `isChatListQuery` predicate - Sidebar renders Pinned section above time groups, excludes pinned chats from time groups - Pin/Unpin context menu items (hidden for child/delegated chats) - `@dnd-kit/core` + `@dnd-kit/sortable` for drag-to-reorder with `MouseSensor`, `TouchSensor`, and `KeyboardSensor` - Local pin-order override prevents flash on drop; click blocker prevents NavLink navigation after drag --- PR generated with Coder Agents	2026-03-26 16:52:02 -04:00
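The handler routing described in 113aaa79a0 reduces to a small decision on the current and requested `pin_order`. The sketch below is illustrative only: `pinAction`, `routePin`, and `clampPinOrder` are hypothetical names, and in the actual change the clamping to `[1, pinned_count]` happens inside the `UpdateChatPinOrder` SQL query rather than in Go.

```go
package main

import "fmt"

// pinAction names the three operations described in the commit message.
type pinAction string

const (
	actionUnpin   pinAction = "unpin"
	actionReorder pinAction = "reorder"
	actionAppend  pinAction = "append"
)

// routePin picks an operation from the chat's current pin_order and the
// requested one (0 = unpinned, 1+ = pinned in display order).
func routePin(current, requested int32) pinAction {
	switch {
	case requested == 0:
		return actionUnpin
	case current > 0:
		return actionReorder
	default:
		return actionAppend
	}
}

// clampPinOrder mirrors the [1, pinned_count] clamp applied on reorder.
func clampPinOrder(requested, pinnedCount int32) int32 {
	if requested > pinnedCount {
		requested = pinnedCount
	}
	if requested < 1 {
		requested = 1
	}
	return requested
}

func main() {
	fmt.Println(routePin(0, 2), routePin(3, 1), routePin(3, 0))
	fmt.Println(clampPinOrder(9, 4))
}
```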
Cian Johnston	bfee7e6245	fix: populate all chat fields in pubsub events (#23664 ) Problem: `publishChatPubsubEvent` was constructing a partial `codersdk.Chat` that omitted `LastModelConfigID` and other fields. Go's zero-value UUID caused the sidebar to show "Default model" for chats received via SSE. Solution: - Extracted `convertChat`/`convertChats` from `exp_chats.go` into `db2sdk.Chat`/`db2sdk.Chats`, alongside existing `ChatMessage`, `ChatQueuedMessage`, and `ChatDiffStatus` converters. `publishChatPubsubEvent` now calls `db2sdk.Chat(chat, nil)` instead of maintaining its own copy of the conversion logic - Added backend integration test `TestWatchChats/CreatedEventIncludesAllChatFields` - Added frontend regression tests for nil-UUID and valid model config ID cases > 🤖 Created by Coder Agents, reviewed by this human.	2026-03-26 16:49:26 +00:00
Danny Kopping	801e57d430	feat: session detail API (#23203 )	2026-03-26 18:09:53 +02:00
Michael Suchacz	4f063cdc47	feat: separate default and additional Coder Agents system prompts (#23616 ) Admins can now control whether the built-in Coder Agents default system prompt is prepended to their custom instructions, rather than having the custom prompt silently replace the default. Changes: - New `include_default_system_prompt` boolean toggle (defaults to `true` for existing deployments) stored as a site config key — no migration needed. - GET `/api/experimental/chats/config/system-prompt` returns the toggle state, the custom prompt, and a preview of the built-in default. - PUT persists both the toggle and custom prompt atomically in a single transaction. - `resolvedChatSystemPrompt()` composes `[default?, custom?]` joined by `\n\n`, falling back to the built-in default on DB errors. - Settings UI adds a Switch toggle with conditional helper text and a "Preview" button that shows the built-in default prompt via the existing `TextPreviewDialog`. - Comprehensive test coverage: 15 subtests covering toggle behavior, prompt composition matrix, auth boundaries, and integration with chat creation.	2026-03-26 13:32:41 +01:00
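The `[default?, custom?]` composition from 4f063cdc47 can be sketched in a few lines. This is an assumption-laden illustration: `composeSystemPrompt` is a hypothetical name, the whitespace trimming is added for the sketch, and the fallback to the built-in default on DB errors is not modeled.

```go
package main

import (
	"fmt"
	"strings"
)

// composeSystemPrompt joins the built-in default (when the toggle is on) and
// the admin's custom prompt with a blank line, skipping empty parts.
func composeSystemPrompt(includeDefault bool, defaultPrompt, custom string) string {
	parts := make([]string, 0, 2)
	if d := strings.TrimSpace(defaultPrompt); includeDefault && d != "" {
		parts = append(parts, d)
	}
	if c := strings.TrimSpace(custom); c != "" {
		parts = append(parts, c)
	}
	return strings.Join(parts, "\n\n")
}

func main() {
	fmt.Printf("%q\n", composeSystemPrompt(true, "You are Coder Agents.", "Prefer Go."))
	fmt.Printf("%q\n", composeSystemPrompt(false, "You are Coder Agents.", "Prefer Go."))
}
```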
Cian Johnston	d175e799da	feat: show agent badge on workspace list (#23453 ) - Adds `GET /api/experimental/chats/by-workspace` endpoint that returns workspace_id → latest chat_id mapping - Modifies FE to fetch this alongside the workspace list, gated on `agents` experiment and render an "Agent" badge similar to the existing "Task" badge in `WorkspacesTable` - Badge links to the "latest chat" linked to the given workspace. Notes: - Intentionally uses `fetchWithPostFilter` for RBAC to decouple from workspaces API — will migrate to `workspaces_expanded` view later. - If users have multiple chats linked to the same workspace, the badge will link to the most recently updated one. > 🤖 This PR was created with the help of Coder Agents, and has been reviewed by my human. 🧑‍💻	2026-03-26 11:30:12 +00:00
Jaayden Halko	3fb7c6264f	feat: display the AI add-on column in the UI on the Users and Organization Members tables (#23291 ) ## Summary Adds an entitlement-gated AI add-on column to both the Users table and the Organization Members table. When `ai_governance_user_limit` is entitled, each row shows whether the user is consuming an AI seat. ## Background The AI governance add-on tracks which users are consuming AI seats. Admins need visibility into per-user seat consumption directly from the user management tables. This change surfaces that information through both the site-wide Users table and the per-organization Members table, gated behind the `ai_governance_user_limit` entitlement so the column only appears when the feature is licensed. ## Implementation ### Backend - New SQL query `GetUserAISeatStates` (`coderd/database/queries/aiseatstate.sql`) — returns user IDs consuming an AI seat, derived from: - Users with entries in `aibridge_interceptions` (AI Bridge usage) - Users who own workspaces with `has_ai_task = true` builds (AI Tasks usage) - SDK types — added `has_ai_seat: boolean` to `codersdk.User` and `codersdk.OrganizationMemberWithUserData` - Handler wiring — both the Users list endpoint (`coderd/users.go`) and all Members endpoints (`coderd/members.go`) query AI seat state per page of user IDs and populate the response field - dbauthz — per-user `ActionRead` checks on `ResourceUserObject` ### Frontend - Shared `AISeatCell` component (`site/src/modules/users/AISeatCell.tsx`) — green `CircleCheck` for consuming, gray `X` for non-consuming - `TableColumnHelpTooltip` — extended with `ai_addon` variant with tooltip: "Users with access to AI features like AI Bridge, Boundary, or Tasks who are actively consuming a seat." 
- Column visibility gated behind `useFeatureVisibility().ai_governance_user_limit` ## Validation - Backend: dbauthz full method suite (`TestMethodTestSuite`) passes including new `GetUserAISeatStates` test - Backend: `TestGetUsers`, `TestUsersFilter`, CLI golden file tests pass - Frontend: 7/7 tests pass across `UsersPage.test.tsx` and `OrganizationMembersPage.test.tsx` (column visibility gating both directions) - `go build ./coderd/...` compiles clean - `pnpm --dir site run lint:types` passes - `make gen` clean ## Risks - Pagination performance: The AI seat query is scoped to the current page's user IDs (not a full table scan), keeping it efficient for paginated views. - Semantic scope: The workspace-side AI seat derivation uses "any build with `has_ai_task = true`" rather than "latest build only". If the product intent is latest-build-only, this can be tightened in a follow-up. --- _Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking: `xhigh` • Cost: `$27.25`_ <!-- mux-attribution: model=anthropic:claude-opus-4-6 thinking=xhigh costs=27.25 -->	2026-03-26 10:36:40 +00:00
Ethan	61e31ec5cc	perf(coderd/x/chatd): persist workspace agent binding across chat turns (#23274 ) ## Summary This change removes the steady-state "resolve the latest workspace agent" query from chat execution. Instead of asking the database for the latest build's agent on every turn, a chat now persists the workspace/build/agent binding it actually uses and reuses that binding across subsequent turns. The common path becomes "load the bound agent by ID and dial it", with fallback paths to repair the binding when it is missing, stale, or intentionally changed. ## What changes - add `workspace_id`, `build_id`, and `agent_id` binding fields to `chats` - expose those fields through the chat API / SDK so the execution context is explicit - load the persisted binding first in chatd, instead of always resolving the latest build's agent - persist a refreshed binding when chatd has to re-resolve the workspace agent - keep child / subagent chats on the same bound workspace context by inheriting the parent binding - leave `build_id` / `agent_id` unset for flows like `create_workspace`, then bind them lazily on the next agent-backed turn ## Runtime behavior The binding is treated as an optimistic cache of the agent a chat should use: - if the bound agent still exists and dials successfully, we use it without a latest-build lookup - if the bound agent is missing or no longer reachable, chatd re-resolves against the latest build and persists the new binding - if a workspace mutation changes the chat's target workspace, the binding is updated as part of that mutation To avoid reintroducing a hot-path query, dialing uses lazy validation: - start dialing the cached agent immediately - only validate against the latest build if the dial is still pending after a short delay - if validation finds a different agent, cancel the stale dial, switch to the current agent, and persist the repaired binding ## Result The hot path stops issuing `GetWorkspaceAgentsInLatestBuildByWorkspaceID` for 
every user message, which is the source of the DB pressure this PR is addressing. At the same time, chats still converge to the correct workspace agent when the binding becomes stale due to rebuilds or explicit workspace changes.	2026-03-26 17:22:38 +11:00
Kyle Carberry	d4660d8a69	feat: add labels to chats (#23594 ) ## Summary Adds a general-purpose `map[string]string` label system to chats, stored as jsonb with a GIN index for efficient containment queries. This is a standalone foundational feature that will be used by the upcoming Automations feature for session identity (matching webhook events to existing chats), replacing the need for bespoke session-key tables. ## Changes ### Database - Migration 000451: Adds `labels jsonb NOT NULL DEFAULT '{}'` column to `chats` table with a GIN index (`idx_chats_labels`) - `InsertChat`: Accepts labels on creation via `COALESCE(@labels, '{}')` - `UpdateChatByID`: Supports partial update: `COALESCE(sqlc.narg('labels'), labels)` preserves existing labels when NULL is passed - `GetChats`: New `has_labels` filter using PostgreSQL `@>` containment operator - `GetAuthorizedChats`: Synced with generated `GetChats` (new column scan + query param) ### API - Create chat (`POST /chats`): Accepts optional `labels` field, validated before creation - Update chat (`PATCH /chats/{chat}`): Supports `labels` field for atomic label replacement - List chats (`GET /chats`): Supports `?label=key:value` query parameters (multiple are AND-ed) ### SDK - `Chat`, `CreateChatRequest`, `UpdateChatRequest`, `ListChatsOptions` all gain `Labels` fields - `UpdateChatRequest.Labels` is a pointer (`*map[string]string`) so `nil` means "don't change" vs empty map means "clear all" ### Validation (`coderd/httpapi/labels.go`) - Max 50 labels per chat - Key: 1–64 chars, must match `[a-zA-Z0-9][a-zA-Z0-9._/-]*` (supports namespaced keys like `github.repo`, `automation/pr-number`) - Value: 1–256 chars - 13 test cases covering all edge cases ### Chat runtime - `chatd.CreateOptions` gains `Labels` field, threaded through to `InsertChat` - Existing `UpdateChatByID` callers (e.g., quickgen title updates) are unaffected, since NULL labels preserve existing values via COALESCE	2026-03-25 17:26:26 +00:00
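The label validation rules listed for d4660d8a69 could be implemented along these lines. This is a sketch, not the contents of `coderd/httpapi/labels.go`: `validateLabels` and the error messages are hypothetical, the key pattern is read as one alphanumeric character followed by zero or more characters from the listed class (which is what a 1–64 character limit implies), and counting value length in runes rather than bytes is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

const (
	maxLabels   = 50
	maxKeyLen   = 64
	maxValueLen = 256
)

// labelKeyRe anchors the pattern so the whole key must match.
var labelKeyRe = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._/-]*$`)

// validateLabels is a hypothetical validator applying the documented limits.
func validateLabels(labels map[string]string) error {
	if len(labels) > maxLabels {
		return fmt.Errorf("too many labels: %d (max %d)", len(labels), maxLabels)
	}
	for k, v := range labels {
		// Keys are ASCII-only per the pattern, so byte length equals char count.
		if len(k) < 1 || len(k) > maxKeyLen || !labelKeyRe.MatchString(k) {
			return fmt.Errorf("invalid label key %q", k)
		}
		// Assumption: value length is measured in runes.
		if n := utf8.RuneCountInString(v); n < 1 || n > maxValueLen {
			return fmt.Errorf("invalid value length %d for label %q", n, k)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateLabels(map[string]string{
		"github.repo":          "coder/coder",
		"automation/pr-number": "23594",
	}))
	fmt.Println(validateLabels(map[string]string{"-bad": "x"}))
}
```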
Cian Johnston	796872f4de	feat: add deployment-wide template allowlist for chats (#23262 ) - Stores a deployment-wide agents template allowlist in `site_configs` (`agents_template_allowlist`) - Adds `GET/PUT /api/experimental/chats/config/template-allowlist` endpoints - Filters `list_templates`, `read_template`, and `create_workspace` chat tools by allowlist, if defined (empty=all allowed) - Add "Templates" admin settings tab in Agents UI ([what it looks like](https://624de63c6aacee003aa84340-sitjilsyrr.chromatic.com/?path=/story/pages-agentspage-agentsettingspageview--template-allowlist)) > 🤖 This PR was created with the help of Coder Agents, and has been reviewed by my human. 🧑‍💻	2026-03-25 15:19:17 +00:00
Kyle Carberry	40395c6e32	fix(coderd): fast-retry PR discovery after git push (#23579 ) ## Problem When chatd pushes a branch and then creates a PR (e.g. `git push` followed by `gh pr create`), the gitsync background worker often picks up the stale `chat_diff_statuses` row between the two operations. At that point no PR exists yet, so the worker skips the row. However, the acquisition SQL locks the row for 5 minutes (crash-recovery interval), creating a dead zone where the PR diff is invisible in the UI until the user manually navigates to the chat. ### Root cause 1. `git push` triggers `GIT_ASKPASS` → coderd external-auth handler → `MarkStale()` sets `stale_at = now - 1s` 2. Background worker acquires the row within ~10s, atomically bumps `stale_at = NOW() + 5 min` (crash-recovery lock) 3. Worker calls `ResolveBranchPullRequest` → no PR exists yet → returns `nil` → worker skips with `continue` 4. `gh pr create` completes moments later, but uses its own auth (not `GIT_ASKPASS`), so no second `MarkStale` fires 5. Row is locked for 5 minutes before the worker can retry Loading the chat works immediately because `GET /chats/{chat}` calls `resolveChatDiffStatus` synchronously, which discovers the PR inline. ## Fix When `ResolveBranchPullRequest` returns nil (no PR yet) and the row was recently marked stale (within 2 minutes), apply a short 15-second backoff via `BackoffChatDiffStatus` instead of letting the 5-minute acquisition lock stand. Outside the retry window, the worker skips the row as before — no indefinite fast-polling for branches that never receive a PR. To make the "recently marked stale" check work, `updated_at` is no longer overwritten by the acquisition and backoff SQL queries. This preserves it as a reliable "last externally changed" timestamp (set by `MarkStale` or a successful refresh). 
### Behavior summary \| Scenario \| `updated_at` age \| Backoff \| Effective retry \| \|---\|---\|---\|---\| \| Fresh push, no PR yet \| < 2 min \| 15s (`NoPRBackoff`) \| ~15s \| \| Old row, no PR \| ≥ 2 min \| None (skip) \| ~5 min (acquisition lock) \| \| Error (any age) \| Any \| 120s (`DiffStatusTTL`) \| ~120s \| \| Success (any age) \| Any \| 120s (`DiffStatusTTL`) \| ~120s \| ## Changes - `coderd/database/queries/chats.sql` — Remove `updated_at = NOW()` from `AcquireStaleChatDiffStatuses` and `BackoffChatDiffStatus` - `coderd/database/queries.sql.go` — Regenerated - `coderd/x/gitsync/worker.go` — Add `NoPRBackoff` (15s) and `NoPRRetryWindow` (2 min) constants; apply short backoff only within the retry window - `coderd/x/gitsync/worker_test.go` — Add `TestWorker_NoPR_RecentMarkStale_BacksOffShort` and `TestWorker_NoPR_OldRow_Skips`	2026-03-25 10:09:44 -04:00
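The no-PR branch of the behavior table in 40395c6e32 comes down to one comparison on the age of `updated_at`. The sketch below reuses the two durations named in the commit (15s and 2 min), but the identifiers and function shapes are hypothetical and not taken from `coderd/x/gitsync/worker.go`.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// Values from the commit message; identifiers are illustrative.
	noPRBackoff     = 15 * time.Second
	noPRRetryWindow = 2 * time.Minute
)

// backoffForAge reports the short backoff to apply when no PR was found.
// ok is false when the row is outside the retry window and should simply be
// skipped, leaving the 5-minute acquisition lock in place.
func backoffForAge(age time.Duration) (backoff time.Duration, ok bool) {
	if age < noPRRetryWindow {
		return noPRBackoff, true
	}
	return 0, false
}

// noPRBackoffAt derives the age from the row's updated_at timestamp, which
// the commit preserves as the "last externally changed" time.
func noPRBackoffAt(updatedAt, now time.Time) (time.Duration, bool) {
	return backoffForAge(now.Sub(updatedAt))
}

func main() {
	now := time.Now()
	fmt.Println(noPRBackoffAt(now.Add(-30*time.Second), now))
	fmt.Println(noPRBackoffAt(now.Add(-10*time.Minute), now))
}
```

Bounding the fast retry to a window is what prevents indefinite 15-second polling for branches that never receive a PR.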
Asher	81188b9ac9	feat: add filtering by service account (#23468 ) You can now include or exclude service accounts using `service_account:true` or `service_account:false`, or by using the filter dropdown.	2026-03-24 10:13:25 -08:00


1441 Commits