coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Thomas Kosiewski	06bad73df4	feat: add admin-configurable advisor API, SDK, and queries (#24621 ) ## Summary Add the admin-configurable advisor configuration: database-backed storage, SDK types, and the experimental HTTP handlers that back the admin settings UI (later PRs). Follows the same "site-configs" pattern as Virtual Desktop. ## Motivation The advisor needs runtime-tunable knobs (enable/disable, per-run cap, max output tokens, reasoning effort, optional model override) without a service restart or redeploy. Using the existing `site_configs` K/V table keeps this pattern consistent with other admin features and avoids a bespoke schema. ## Changes ### Database (`coderd/database/queries/siteconfig.sql`) - `GetChatAdvisorConfig` returns the stored JSON blob (default `'{}'`) under key `agents_advisor_config`. - `UpsertChatAdvisorConfig` uses the standard `INSERT ... ON CONFLICT` pattern. - Regenerated via `make gen` (queries.sql.go + mocks). ### SDK (`codersdk/chats.go`) - `AdvisorConfig` type with `Enabled`, `MaxUsesPerRun`, `MaxOutputTokens`, `ReasoningEffort` (`""` / `low` / `medium` / `high`), `ModelConfigID uuid.UUID`. - Client methods: `ChatAdvisorConfig(ctx)` / `UpdateChatAdvisorConfig(ctx, cfg)`. ### API (`coderd/exp_chats.go`) - `GET /api/experimental/chats/config/advisor`: reads current config; relies on `ActorFromContext` validation. - `PUT /api/experimental/chats/config/advisor`: requires `policy.ActionUpdate` on `rbac.ResourceDeploymentConfig`. - Handlers unmarshal `{}` to a typed zero value and re-marshal on upsert for schema stability. - Tests in `exp_chats_test.go` cover empty defaults, round-trip update, unauthorized update, and invalid body. ## Stack context This is PR 3 of 6 in the advisor feature stack. Consumed by: - PR 4 (`feat/advisor-04-chatd-runtime`), which reads this config on every `runChat`. - PR 6 (`feat/advisor-06-admin-settings-ui`), which renders the admin form. ## Scope / non-goals - No `chatd` read path (lands in PR 4). - No UI (lands in PR 6). - `agents_advisor_config` remains a single-row JSON blob; we intentionally do not shard per-org/per-template yet. ## Validation - `make gen` - `go test ./coderd/database/... -run TestChatAdvisor` - `go test ./coderd/... -run TestChatAdvisorConfig` - `make lint` --- <details> <summary>📋 Implementation Plan (shared across the advisor stack)</summary> # Plan: Add a Mux-style advisor tool to coder agents/chatd ## Outcome Add a first-class `advisor` tool to agent chats in `coderd/x/chatd` that feels native to Coder: - it is a built-in server-side tool, not an MCP/dynamic-tool workaround; - it performs a nested tool-less model call for strategic advice; - it is exposed only when eligible, and the prompt mentions it only when it is actually available; - it is treated as a planning-only tool so it does not run alongside action tools in the same batch; - it tracks usage/cost separately enough for operators to reason about it; - it has a minimally polished UI in the Agents page; - and it ships with explicit dogfooding evidence, including screenshots and repro videos. ## Design decisions to lock before coding 1. Primary architecture: native built-in tool in `chattool/`, backed by a small `chatadvisor` package. 2. Nested model execution: reuse chatd's existing model/provider stack for a one-step, tool-less advisor call rather than inventing a new provider pathway. 3. Execution policy: treat `advisor` as an exclusive/planning-only tool; mixed batches must return structured policy errors and force the model to retry cleanly. 4. Availability: initial rollout is for root agent chats only; disable for child/sub-agent chats until recursion/cost policy is proven. 5. Prompt sync: use one eligibility boolean to drive both tool registration and advisor guidance injection. 6. Persistence/cost split: MVP should keep advisor usage visible in result metadata and server metrics; only add DB schema if product/billing explicitly needs queryable advisor-specific cost. 7. UI scope: generic tool rendering is an acceptable temporary milestone during backend bring-up, but the release candidate should include a dedicated lightweight advisor renderer. ## Delivery model The work should be executed as coordinated workstreams with one integration owner and parallel contributors for low-conflict areas. The integration owner should own `coderd/x/chatd/chatd.go` because prompt assembly, tool registration, and model resolution all converge there. ## Detailed workstreams ### Repo evidence used for this plan <details> <summary>Mux reference and current chatd seams</summary> Mux reference implementation - `src/node/services/tools/advisor.ts` — native advisor tool implementation. - `src/common/constants/advisor.ts` — advisor prompt/constants and truncation policy. - `src/common/utils/tools/tools.ts` — conditional tool registration. - `src/node/services/streamContextBuilder.ts` — injects advisor guidance only when the tool is available. Current chatd seams - `coderd/x/chatd/chatd.go` - `processChat()` — tool assembly, prompt assembly, and chatloop invocation. - `resolveChatModel()` — current model/provider/key resolution seam. - `type Config struct` — server-level chatd configuration surface. - `coderd/x/chatd/chatloop/chatloop.go` - `Run()` — main streaming/model loop. - `executeTools()` — built-in tool execution/batching seam. - `coderd/x/chatd/chattool/` — built-in tool implementations. - `site/src/pages/AgentsPage/components/ChatElements/tools/Tool.tsx` — tool renderer dispatch. - `site/src/pages/AgentsPage/components/ChatConversation/messageParsing.ts` and `ConversationTimeline.tsx` — tool/result merge and rendering flow. </details> ### Workstream map and ownership \| Workstream \| Primary owner \| Main files \| Can run in parallel? \| Done when \| \|---\|---\|---\|---\|---\| \| 0. Integration + gating \| Integration lead \| `coderd/x/chatd/chatd.go` \| No; central merge lane \| Tool registration, prompt sync, and model selection are wired together \| \| 1. Advisor runtime + tool \| Backend agent \| new `coderd/x/chatd/chatadvisor/`, new `coderd/x/chatd/chattool/advisor.go` \| Yes \| Tool can perform a tool-less advisor call in memory and return structured results \| \| 2. Planning-only execution policy \| Chatloop agent \| `coderd/x/chatd/chatloop/chatloop.go`, related tests \| Yes \| Mixed `advisor` + action-tool batches are rejected cleanly and deterministically \| \| 3. Metrics/usage/config \| Backend/telemetry agent \| `chatd.go`, `chatloop/metrics.go`, optional config plumbing \| Partially; coordinate with integration lead \| Advisor usage is separately visible in metadata/metrics and limits are enforced \| \| 4. Frontend rendering \| Frontend agent \| `site/.../tools/Tool.tsx`, new `AdvisorTool.tsx`, stories \| Yes after result schema stabilizes \| Advisor renders as a readable card and story tests pass \| \| 5. Dogfood + QA evidence \| QA agent \| dev server, Storybook, dogfood output \| After backend + UI are usable \| Repro videos, screenshots, and a concise QA report exist \| ### Parallelization rules - Do not split `coderd/x/chatd/chatd.go` across multiple execution agents without an integration lead. That file owns prompt building, tool registration, model resolution, and cost persistence. - Workstreams 1 and 2 can be developed in parallel and then stacked onto the integration branch. - Workstream 4 should begin once the backend result schema is agreed on, even if the backend is still behind a feature flag. - Any agent that needs to re-check Mux behavior should clone `coder/mux` into a temporary directory (for example, `$(mktemp -d)/mux`) and inspect it read-only; do not vendor or copy code from Mux directly. ## Phase 0 — Preflight and guardrails ### Goals - Align the team on the smallest shippable architecture. - Prevent scope creep into MCP/dynamic-tool/sub-agent variants. - Decide upfront what is MVP vs. follow-up. ### Tasks 1. Confirm the MVP boundary. - Ship a built-in advisor tool first. - Do not make MCP, dynamic tools, or sub-agents the primary implementation. - Do not add transient streaming phases in the first backend PR unless they fall out almost for free. 2. Confirm local workflow hygiene before coding. - Ensure the repo is using the project git hooks from `scripts/githooks`. - Do not bypass hooks with `--no-verify`. - Use `./scripts/develop.sh` for the full dev server rather than manual build/run commands. 3. Lock the model-selection policy. - Recommended MVP: advisor uses the same resolved provider/model/cost config as the current chat, with advisor-specific max-output and usage caps. - Follow-up only if required: add a separate `AdvisorModelConfigID`-style override that resolves through the existing `configCache`/model-config path. Do not invent a new free-form `provider:model` parser if chatd already stores provider/model separately. 4. Lock the persistence policy. - Recommended MVP: no DB migration. Persist advisor-visible metadata in the tool result and record separate metrics in memory/Prometheus. - Only if product/billing explicitly asks for queryable advisor cost: add a later DB migration or usage table, following the normal `queries/.sql` + `make gen` workflow. 5. Create an execution ADR note in the work item or tracking doc.* - Capture: built-in tool, tool-less nested call, root-chat-only rollout, exclusive execution policy, MVP no-DB-migration default. ### Quality gate - Everyone on the team can state the same answers to these questions: - Is advisor a built-in tool? Yes. - Can advisor run with action tools in the same batch? No. - Does advisor get tools of its own? No. - Is a DB migration required for MVP? No, unless billing insists. ## Phase 1 — Build the advisor runtime and tool wrapper ### Goals Create the core advisor implementation in a way that is easy to test and keeps `chattool/` thin. ### Files to add - `coderd/x/chatd/chatadvisor/types.go` - `coderd/x/chatd/chatadvisor/guidance.go` - `coderd/x/chatd/chatadvisor/handoff.go` - `coderd/x/chatd/chatadvisor/runtime.go` - `coderd/x/chatd/chatadvisor/runner.go` - `coderd/x/chatd/chattool/advisor.go` ### Responsibilities by file 1. `types.go` - Define the input/result schema used by the tool and UI. - Keep the result shape close to Mux so the UI and model both have predictable cases. - Recommended result variants: - `advice` - `limit_reached` - `error` Recommended shape: ```go type AdvisorArgs struct { Question string `json:"question"` } type AdvisorResult struct { Type string `json:"type"` Advice string `json:"advice,omitempty"` Error string `json:"error,omitempty"` AdvisorModel string `json:"advisor_model,omitempty"` RemainingUses int `json:"remaining_uses,omitempty"` Usage AdvisorUsageResult `json:"usage,omitempty"` } ``` 2. `guidance.go`* - Hold two strings: - the nested advisor system prompt; - the parent-agent guidance block to inject into the outer system prompt. - The nested advisor prompt must say, in plain language: - you are advising the parent agent; - you do not address the end user directly; - you do not claim actions happened; - you return concise strategic guidance and tradeoffs. 3. `runtime.go` - Define the per-run runtime state. - Recommended fields: - resolved model + model config; - provider keys/options reused from the outer chat; - `MaxUsesPerRun`; - `MaxOutputTokens`; - atomic/current call counter; - callback(s) to obtain the current prompt snapshot and current-step snapshot; - optional metrics/usage hook. - Add fail-fast validation for impossible config: nil model, non-positive limits, empty prompt builders, etc. 4. `handoff.go` - Build the advisor handoff message from: - the explicit question; - the exact prompt/messages the parent model just used; - the current step's text/reasoning snapshot, if available; - the most recent relevant tool outputs, if they are already in the prompt snapshot. - Important: use the already-prepared outer prompt tail, not a fresh DB reload. That keeps the advisor aligned with compaction and the exact context the outer model saw. - Apply hard truncation budgets with recent-context bias. 5. `runner.go` - Execute the nested advisor call. - Recommended implementation: call `chatloop.Run()` in an in-memory, one-step mode: - `Tools: nil` - `ProviderTools: nil` - `MaxSteps: 1` - `PersistStep`: capture the assistant output in memory instead of writing DB rows - Reuse the existing provider/model/cost path instead of building a second provider runner. - Assert that no tool definitions are passed to the nested call. 6. `chattool/advisor.go` - Keep this file thin and consistent with other built-ins. - Responsibilities: - decode `AdvisorArgs`; - validate `Question` is non-empty and bounded; - call the `chatadvisor` runner; - return a structured tool response. ### Defensive programming requirements - Assert `Question` is non-empty after trimming. - Assert runtime limits are positive. - Assert the nested advisor call runs with zero tools/provider tools. - Assert `AdvisorResult.Type` is one of the known variants before returning. - Assert remaining uses never goes negative. ### Acceptance criteria - A unit test can call the advisor tool with a fake model and receive a stable `advice` result. - The nested advisor call is impossible to run with tools accidentally attached. - The core logic lives in `chatadvisor/`, not embedded inside `chatd.go`. ## Phase 2 — Wire advisor into chatd and keep prompt/tool availability in sync ### Goals Register the tool in the right place, expose it only when eligible, and inject system guidance only when the tool is present. ### Files to modify - `coderd/x/chatd/chatd.go` - optionally a small helper file if `chatd.go` becomes too crowded ### Tasks 1. Compute one eligibility boolean in `processChat()`. Recommended inputs: - server-level advisor enabled flag; - root chat only (`chat.ParentChatID == uuid.Nil` or equivalent existing root/child check); - a usable resolved model/provider exists; - optional experiment/workspace/org gate if product wants staged rollout. 2. Create the runtime once per outer chat run. - Use the model/config/keys resolved by `resolveChatModel()`. - Reuse provider options from the current chat's `ChatModelCallConfig`. - Set `MaxUsesPerRun` and `MaxOutputTokens` from advisor config defaults. 3. Register the tool in the built-in tool block. - Insert after the skill tools and before MCP tools in `processChat()`. - Record `builtinToolNames["advisor"] = true` so metrics stay bounded. 4. Inject advisor guidance into the outer system prompt using the same boolean. - Use `chatprompt.InsertSystem()` in the same prompt assembly path that already injects user/system instructions. - Place the block near the existing instruction insertion, before plan-path/skill context blocks. - Wrap the guidance in an explicit tag like `<advisor-guidance>` so it is easy to spot in tests and future refactors. 5. Keep advisor out of child chats for the first release. - That avoids recursion/cost blowups with `spawn_agent` / `wait_agent` flows. - Document this explicitly in the rollout notes and tests. ### Acceptance criteria - If advisor is disabled, neither the tool nor the prompt guidance appears. - If advisor is enabled, both the tool and the prompt guidance appear. - Root chats can use advisor; child chats cannot. - Built-in tool names include `advisor` so metrics do not collapse it into the generic `mcp` label. ## Phase 3 — Enforce planning-only execution policy in `chatloop` ### Goals Prevent the model from calling `advisor` and action tools in the same execution batch. ### Files to modify - `coderd/x/chatd/chatloop/chatloop.go` - related chatloop tests ### Recommended implementation Keep the MVP small; do not build a general policy engine yet. 1. Add a minimal field to `chatloop.RunOptions`, for example: ```go ExclusiveToolName string ``` 2. In `Run()` / `executeTools()`, detect the case where the exclusive tool appears in the same local-tool batch as any other locally executed tool. 3. When that happens, synthesize structured tool-result errors for the affected calls instead of executing anything in the batch. - `advisor` should receive a clear error like: _advisor must be called by itself before action tools_. - The sibling action tools should receive a paired policy error like: _this tool was skipped because advisor must run alone_. 4. Let the outer model see those tool errors and retry cleanly. - This is simpler and safer than partial execution or hidden deferral. - It preserves deterministic transcript history for debugging. 5. Pass the just-finished step snapshot into the tool execution context. - The advisor runtime should be able to see the current step's text/reasoning content, because that is often the best hint about what the outer model is trying to decide. ### Why this is the right fit - It matches the intended semantics: advisor is consulted before* taking action. - It avoids subtle race conditions caused by concurrent built-in tool execution. - It keeps the behavior easy to test with fake models. ### Acceptance criteria - A model-emitted batch containing only `advisor` succeeds. - A model-emitted batch containing `advisor` plus any other locally executed tool returns deterministic policy errors and executes nothing. - Non-advisor tool execution stays unchanged for normal chats. ## Phase 4 — Usage limits, metrics, and configuration ### Goals Make advisor safe to operate without over-designing billing/storage in the first release. ### Files to modify - `coderd/x/chatd/chatd.go` - `coderd/x/chatd/chatloop/metrics.go` as needed - `coderd/x/chatd/chatd.go` `Config` struct and constructor path - optional follow-up config/db files only if a separate advisor model or persistent billing is required ### Tasks 1. Add explicit server config knobs for MVP. Recommended fields on `chatd.Config` or a nested advisor config struct: - `AdvisorEnabled bool` - `AdvisorMaxUsesPerRun int` - `AdvisorMaxOutputTokens int64` 2. Track usage per outer run. - Reset the counter for each `processChat()` invocation. - Return `remaining_uses` in the tool result. - Return `limit_reached` when the cap is exhausted. 3. Expose advisor usage metadata in the tool result. - Include model name and token/cost summary if available. - Use the same `callConfig.Cost` calculation path as the outer chat for MVP if advisor reuses the same model. 4. Record server-side metrics. - Count advisor invocations, failures, and latency. - Ensure they show up under the built-in tool label `advisor`. 5. Optional decision gate: separate advisor model. - If product insists on a stronger/different advisor model, add a follow-up config hook that resolves another existing chat model config through the same `configCache` path. - Keep that out of the first landing PR unless it is required for acceptance. 6. Optional decision gate: queryable advisor cost. - If this becomes required, spin a follow-up DB task: - update `coderd/database/queries/.sql`; - add migration files; - run `make gen`; - update audit mappings if a new auditable type/field is introduced. ### Acceptance criteria - Advisor calls are capped per outer run. - Limit exhaustion is user-visible in the tool result. - Metrics distinguish advisor calls from other built-in tools. - MVP does not require a schema migration unless explicitly approved. ## Phase 5 — Frontend rendering and Storybook coverage ### Goals Make advisor feel intentional in the Agents UI without blocking the backend on fancy streaming UI. ### Files to modify - `site/src/pages/AgentsPage/components/ChatElements/tools/Tool.tsx` - new `site/src/pages/AgentsPage/components/ChatElements/tools/AdvisorTool.tsx` - Storybook story file(s) in the same tools directory ### Delivery strategy 1. Intermediate milestone during backend bring-up:* rely on the existing generic tool renderer if needed. - This is acceptable only as a short-lived integration checkpoint. 2. Release milestone: add a dedicated lightweight `AdvisorTool` renderer. - Reuse existing primitives: - `ToolCollapsible` - `ToolIcon` - `Response` for markdown/prose rendering - `ScrollArea` if the advice can be long - Keep styling light and consistent with the Agents page. - Do not add unnecessary React memoization in `site/src/pages/AgentsPage/`; that area is already React-Compiler aware. 3. Render the structured result states cleanly. - `advice` — readable prose/markdown with optional metadata footer. - `limit_reached` — warning-style message. - `error` — error state with visible fallback text. - `running` — existing tool loading state/spinner is enough for MVP. 4. Add Storybook coverage instead of ad-hoc component tests. Recommended stories: - successful advice; - running/loading; - limit reached; - error. 5. Keep the UI contract narrow. - Prefer one text field like `advice` plus small metadata rather than a deeply nested schema. - That keeps the UI resilient to prompt iteration. ### Acceptance criteria - The advisor tool card renders readable content rather than raw quoted JSON in the final release branch. - Running, limit, and error states are visibly distinct. - Storybook stories and play assertions cover the new states. - Existing tool rendering flows remain unchanged. ## Phase 6 — Automated tests and validation gates ### Backend tests to add 1. Advisor runtime/tool tests - question validation; - tool-less nested execution assertion; - success result shaping; - limit-reached result shaping; - error result shaping. 2. Prompt/gating tests in chatd - advisor disabled ⇒ no tool, no guidance; - advisor enabled/root chat ⇒ tool + guidance; - child chat ⇒ advisor absent. 3. Chatloop policy tests - advisor alone runs; - advisor + action tool mixed batch returns deterministic policy errors; - non-advisor tools still execute normally. 4. Usage/metrics tests - per-run cap resets correctly; - builtin tool labeling includes `advisor`; - returned metadata includes model/usage summary when available. ### Frontend tests to add - Storybook `play()` assertions for the advisor renderer states. - Verify expand/collapse behavior and visible fallback text. - Verify the message timeline still renders adjacent tools correctly. ### Recommended command sequence Run these as the implementation matures, not only at the end: 1. Backend-focused gate after phases 1–4: - `make test RUN=TestAdvisor` - `make test RUN=TestChatloopAdvisor` - `make lint` 2. Frontend-focused gate after phase 5: - `pnpm test:storybook src/pages/AgentsPage/components/ChatElements/tools/AdvisorTool.stories.tsx` - `pnpm lint` - `pnpm format` 3. Final repo gate before handoff: - `make pre-commit` - run any additional targeted `make test RUN=...` selections covering touched chatd paths > Use the exact new test names the implementing agents create; the names above are recommended anchors, not existing tests. ## Dogfooding plan ### Principle Dogfood the change as a real agent feature, not just a unit-tested backend. Per the dogfood and `agent-browser` skills, the reviewer should get watchable repro videos plus screenshots that make the behavior obvious without reading logs. ### Required setup 1. Start the full dev environment with: - `./scripts/develop.sh` 2. If the frontend renderer changes, also start Storybook from `site/` with: - `pnpm storybook --no-open` 3. Use `agent-browser` directly — never `npx agent-browser`. 4. Use named browser sessions and an output folder such as: - `./dogfood-output/advisor/` - with subfolders `screenshots/` and `videos/` ### Evidence protocol For every interactive scenario below: 1. Start video recording before the action. 2. Capture step-by-step screenshots at human pace. 3. Capture one annotated screenshot of the final state. 4. Stop the recording. 5. Note the exact pass/fail observation in the QA report. For static UI states (for example Storybook error/limit cards), an annotated screenshot is sufficient; video is optional but still encouraged by this project’s review preference. ### Dogfood scenarios #### Scenario A — Happy path in the real Agents UI Goal: prove that a root agent chat can invoke advisor and produce a readable recommendation before taking further action. Steps: 1. Open the Agents page with an advisor-enabled root chat. 2. Start a repro video. 3. Send a prompt that should reasonably trigger strategic planning, such as an architecture or multi-tradeoff question. 4. Capture screenshots of: - the prompt before send; - the running advisor state; - the completed advisor card and the assistant’s follow-up response. 5. Stop recording. Pass criteria: - advisor appears in the timeline; - the rendered result is readable; - the assistant can continue after consuming the advisor output. #### Scenario B — Advisor unavailable path Goal: prove the feature is truly gated. Suggested variants (at least one is required, both are better): - feature flag/config off; - child/sub-agent chat. Evidence: - annotated screenshot of the chat/tool state showing advisor is absent; - short video if toggling the gate live is part of the repro. Pass criteria: - no advisor tool is available; - no advisor-specific prompt behavior leaks through. #### Scenario C — UI states in Storybook Goal: prove the renderer handles non-happy states cleanly. Required story states: - success/advice; - running; - limit reached; - error. Evidence: - one screenshot per state; - at least one short video showing collapse/expand behavior. Pass criteria: - success renders readable advice; - limit/error have visible fallback text; - the component behaves like the other tool cards. #### Scenario D — Regression sweep of nearby tools Goal: ensure advisor does not break the surrounding chat timeline. Check at minimum: - another existing built-in tool still renders correctly near advisor; - sub-agent/tool cards still expand/collapse normally; - no obvious console errors appear in the Agents page during the advisor flow. Evidence: - screenshots of adjacent tool cards; - console/error capture if anything suspicious appears. ### `agent-browser` usage notes for the QA agent - Prefer `agent-browser batch` for 2+ sequential commands when no intermediate parsing is needed. - Use `snapshot -i` to discover interactive refs. - Re-snapshot after navigation or major DOM changes. - Avoid `wait --load networkidle` unless the page is known to go idle; prefer explicit element/text waits or short fixed waits. - Record videos at human pace and include pauses that a reviewer can follow. ## Rollout plan ### Initial rollout - Gate behind a server-side advisor-enabled flag. - Enable only for selected internal/root agent chats first. - Watch metrics for: - invocation count; - failure rate; - latency; - obvious retry loops. ### Expansion conditions Expand beyond the initial rollout only after the following are true: - mixed-batch policy behavior is stable; - cost impact is understood; - frontend UX is readable in production-like dogfood; - no recursion surprises have appeared with sub-agent flows. ### Explicit non-goals for the first release - advisor inside child/sub-agent chats; - provider-agnostic streaming phase UI; - MCP-based external advisor implementation; - mandatory DB-backed advisor cost reporting. ## Final acceptance checklist - [ ] `advisor` is a built-in chatd tool, not an MCP/dynamic-tool substitute. - [ ] The nested advisor call is tool-less and bounded to one in-memory step. - [ ] One eligibility boolean controls both tool registration and prompt guidance injection. - [ ] Root chats can use advisor; child chats cannot in the initial rollout. - [ ] Mixed advisor/action batches produce deterministic policy errors instead of partial execution. - [ ] Per-run usage caps and limit-reached behavior work. - [ ] Advisor usage is visible in metadata/metrics without forcing a DB migration for MVP. - [ ] The Agents UI has a readable advisor card and Storybook coverage. - [ ] Dogfooding produced screenshots and repro videos for the required scenarios. - [ ] Validation commands (`make lint`, targeted `make test`, Storybook tests, `make pre-commit`) passed before handoff. ## Suggested PR split 1. PR 1 — Backend foundation - `chatadvisor/` package - `chattool/advisor.go` - `chatloop` exclusive policy - chatd gating/prompt sync - backend tests 2. PR 2 — Frontend + QA - advisor renderer - stories/play assertions - dogfood artifacts and QA notes 3. PR 3 — Optional follow-ups only if demanded by stakeholders - separate advisor model override - persistent advisor billing/queryability - transient phase-stream UX </details> --- _Generated with [`mux`](https://github.com/coder/mux) • Model: `anthropic:claude-opus-4-7` • Thinking: `max`_	2026-04-30 14:53:08 +02:00
Asher	be57af5ff0	feat: add exit code and status to workspace agent scripts (#24505 ) For scripts that have not finished or in dry run cases these will be omitted.	2026-04-29 12:24:26 -08:00
Zach	1c30d52b2b	feat: audit user secret create, update, and delete (#24756 ) Emit user secret audit log entries for create/update/delete operations. Reads stay un-audited, matching every other resource. Audit log entries record changes in user secret name, environment variable name, file path, and value. The secret value column is marked `ActionSecret` so the diff records the change without showing the ciphertext or plaintext. Closes a TOCTOU window on delete to ensure no phantom audit logs for a delete of a non-existent secret. Secret update accepts a small TOCTOU window matching the other audited resources (templates, workspaces, chats). The two-query pattern is wrapped in a transaction so audit state can't leak from a failed mutation.	2026-04-29 12:57:47 -06:00
Cian Johnston	a876287d36	feat: auto-archive inactive chats with audit trail (#24642 ) Adds a background job in `dbpurge` that periodically archives chats inactive beyond a configurable threshold. Each archived root chat gets a background audit entry tagged `chat_auto_archive`. Disabled by default. * New `AutoArchiveInactiveChats` SQL query with LATERAL last-activity subquery and partial index on archive candidates * `site_configs`-backed `auto_archive_days` setting with admin-only PUT, any-authenticated-user GET * Cascade archive via `root_chat_id`; pinned chats and active threads exempt * Root-only audit dispatch on detached context, matching manual archive (`patchChat`) behavior * 11 subtests covering disabled no-op, boundary, deleted messages, child activity, pinned exemption, multi-owner, idempotency, and batch pagination PR #24643 adds per-owner digest notifications. PR #24704 adds the requisite UI controls. > 🤖	2026-04-24 14:18:28 +01:00
Danielle Maywood	3a9a60dff8	feat: add collapsible thinking blocks with configurable display mode (#24635 )	2026-04-24 11:29:08 +00:00
Michael Suchacz	3d90546aae	feat: add general subagent model override (#24610 ) Adds a deployment-wide admin override for general delegated subagents. ## What changed - store the general override in `site_configs` and expose it through the shared `agent-model-override/{context}` API - apply the general override when spawning delegated general subagents, while preserving the existing Explore override behavior - reuse a shared Agents settings form for the general and Explore override sections ## Validation - `make gen` - `go test ./coderd -run 'TestChatModelOverrides'` - `go test ./coderd/x/chatd -run 'TestSpawnAgent_(GeneralUsesConfiguredModelOverride\|GeneralOverrideLogsAndFallsBackWhenCredentialsUnavailable\|GeneralOverrideLogsAndFallsBackWhenProviderDisabled)'` - `pnpm -C site lint:types` - `pnpm -C site test:storybook -- AgentSettingsAgentsPageView.stories.tsx` - `make lint` - `make pre-commit` > Mux is acting on Mike's behalf.	2026-04-24 12:37:20 +02:00
Ethan	ad1906589d	fix(coderd): allow deleting chat providers used in historical chats (#24568 ) Drop the `chat_model_configs.provider -> chat_providers.provider` foreign key and soft-delete model configs when their provider is removed. The provider row is now hard-deleted inside a transaction that also tombstones its model configs and promotes a replacement default when needed. Historical chats and messages keep pointing at the soft-deleted model config rows, which are hidden from live/admin queries but still resolve for read. The runtime chat path already falls back to the default model config when a soft-deleted config is looked up. Replaces the lost FK validation in the create/update model-config handlers with an explicit provider lookup that returns the existing `Chat provider is not configured.` 400. ## UX Admin deleting a chat provider that has historical usage - Before: blocked with 400 `Provider models are still referenced by existing chats.` Admins had no in-product way to remove a provider that had ever been used. - After: delete succeeds (204). Any model configs under that provider are soft-deleted. If the removed provider owned the default model config, one of the remaining live configs is auto-promoted to the new default. The promotion is deterministic (`ensureDefaultChatModelConfig` picks the first live config by `provider ASC, model ASC, updated_at DESC, id DESC`); there is no picker, and no toast or response detail names which config became the new default. End users with chats that used a deleted provider's model - Old chats still open and their history still renders unchanged. - Sending a new turn in such a chat silently falls back to the current default model. No banner or warning tells the user the original model is gone. - The model picker no longer lists the deleted model. - If no default model config exists at all after the delete, sending a new turn fails with `no default chat model config is available`. Admin creating or updating a model config against a provider that is not configured - Same as before: 400 `Chat provider is not configured.` Only the detection mechanism changed (explicit `FOR UPDATE` lookup inside the transaction, which also serializes against a concurrent provider delete). Admin updating a model config whose row disappears mid-transaction - Now returns the standard 404 `Resource not found or you do not have access to this resource` instead of the previous 500 that leaked `sql: no rows in result set` in the detail. Unrelated internal races (for example a race on the promoted default candidate) are still reported as 500 so they are not misclassified as "your target is gone". Closes CODAGT-23	2026-04-22 19:34:34 +10:00
Jaayden Halko	410f9a5e19	feat: allow renaming of agent chat title (#24489 ) Co-authored-by: Coder Agents <noreply@coder.com>	2026-04-20 14:00:46 +01:00
Thomas Kosiewski	df7e838c21	feat(coderd): wire debug logging into chat lifecycle (#23917 )	2026-04-20 12:27:16 +02:00
Mathias Fredriksson	fc2493780f	fix: exclude subagent chats from sidebar pagination (#24404 ) GetChats now returns only root chats (parent_chat_id IS NULL). A new GetChildChatsByParentIDs query fetches children for visible roots and embeds them in each parent's Children field. The singular getChat endpoint does the same. Archive invariant is one-way: parent archived implies child archived. Parent archive/unarchive cascades via root_chat_id. Individual child archive is permitted; child unarchive while the parent is archived is rejected atomically (row lock on child, re-read parent inside the transaction). Embedded children are filtered by the caller's archive state so individually-archived children stay hidden from active-parent views. Gitsync MarkStale uses GetChatsByWorkspaceIDs directly; MarkStaleParams.OwnerID removed (dead after the switch). Frontend: buildChatTree reads from the embedded children field, WebSocket handlers route child events into the parent's children array, and archiving a child strips it from the parent cache.	2026-04-20 13:19:59 +03:00
Spike Curtis	e19b21b7d5	chore: add GetLatestWorkspaceBuildWithStatusByWorkspaceID query (#24441 ) <!-- If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting. --> relates to GRU-18 Adds new database query supporting the Agent Connection Watch we will add.	2026-04-17 22:47:08 -04:00
Thomas Kosiewski	91f9de27a1	feat(coderd): add chat debug service and summary aggregation (#23916 )	2026-04-17 16:27:53 +02:00
Dean Sheather	4ba74dcdc8	feat(coderd): add PR status summary to telemetry snapshots (#24379 ) Adds aggregate PR counts (total, open, merged, closed) from `chat_diff_statuses` to telemetry snapshots, giving visibility into AI agent PR outcomes across deployments. The existing telemetry system reports `Chats`, `ChatMessageSummaries`, and `ChatModelConfigs`, but had no PR-level data. This adds a `ChatDiffStatusSummary` field to the `Snapshot` struct with four all-time counts derived from a single aggregate query. <details> <summary>Implementation details</summary> - New SQL query `GetChatDiffStatusSummary` counts `chat_diff_statuses` rows with non-NULL `pull_request_state`, grouped by state (open/merged/closed). - `ChatDiffStatusSummary` struct added to telemetry `Snapshot`, collected via a parallel `eg.Go()` block in `createSnapshot()`. - `dbauthz` wrapper uses `rbac.ResourceSystem` (telemetry-only pattern). - Test covers both empty state (zero counts) and populated state (mixed states + NULL-state exclusion). </details> > 🤖 Generated by Coder Agents	2026-04-17 21:56:11 +10:00
Michael Suchacz	73b5058923	feat: add Explore mode as subagent-only modality (#24448 ) > This PR was authored by Mux on behalf of Mike. Introduce Explore mode, a read-only subagent modality for delegated discovery and code investigation. ## What Adds a `spawn_explore_agent` tool that creates child chats restricted to read-only operations. An admin can optionally configure a deployment-wide model override so Explore subagents use a model optimized for large context or reasoning without changing the root chat's model. ### Backend - New `ChatModeExplore` enum value (migration 000471). - `spawn_explore_agent` tool definition with read-only allowlist: `read_file`, `execute`, `process_output`, `read_skill`, `read_skill_file`. Write tools, file editors, and nested subagent spawning are blocked. - Deployment config storage for the Explore model override (`agents_chat_explore_model_override` in `site_configs`). - Model resolution hierarchy: configured override, then current turn model, then global default. Silent fallback with warning log when the override becomes unavailable. - RBAC: `AsChatd` for daemon reads, `ActionRead` and `ActionUpdate` on `ResourceDeploymentConfig` for admin API calls. - Plan mode root chats can use `spawn_explore_agent` for read-only research, matching the planning prompt guidance. - The Explore override config API now reports malformed saved overrides as "treated as unset" so admins can clear them explicitly. ### Frontend - `ExploreModelOverrideSettings` component in admin agent behavior settings. Uses `ModelSelector`, handles unavailable model warnings, and supports explicit Save and Clear actions. - Malformed saved overrides show a warning and require an explicit Save to clear, instead of Clear auto-submitting behind the scenes. ### Tests - Integration: `TestExploreSubagentIsReadOnly` (full spawn flow, tool verification, prompt overlay, DB state). - Unit: tool allowlist tests for explore, plan, and default modes. - Internal: model override resolution with valid, invalid UUID, disabled, and unconfigured override scenarios. - RBAC: `dbauthz_test.go` for `GetChatExploreModelOverride` and `UpsertChatExploreModelOverride`. - API: admin set and clear, malformed stored override reporting, disabled model rejection, non-admin denial.	2026-04-17 13:40:17 +02:00
Michael Suchacz	1092093e98	feat: add internal subagent model override wiring (#24399 ) > Mux working on behalf of Mike. ## Summary - add an enabled chat model config lookup by ID for internal callers - keep `spawn_agent` unchanged while threading an internal model override through child subagent chat creation - extend chatd coverage for inherited bindings, plan mode, and internal override behavior ## Validation - `go test ./coderd/x/chatd ./coderd/database/dbauthz` - `make lint`	2026-04-16 17:08:02 +02:00
Michael Suchacz	e5707a13d6	feat: support multiple agents with shared instance-identity auth (#24325 ) > This PR was authored by Mux on behalf of Mike. ## Summary Adds support for multiple peer root workspace agents sharing the same `auth_instance_id`, so AWS, Azure, and GCP instance-identity auth can issue the correct session token for a selected agent instead of assuming a single root agent per instance. ## Problem When a Terraform template attaches two or more `coder_agent` resources (with `auth = "aws-instance-identity"`) to a single compute instance, every agent shares the same cloud instance ID. The existing singular lookup picks whichever agent was created most recently, silently ignoring the others. ## Solution Introduce an optional pre-auth agent selector (`CODER_AGENT_NAME`) and make the server-side lookup ambiguity-aware. Database layer: - `GetWorkspaceAgentsByInstanceID` (`:many`): returns all matching root agents for an instance ID. - `GetWorkspaceAgentByInstanceIDAndName` (`:one`): returns the named root agent for disambiguation. SDK and CLI: - `agent_name` field added to AWS, Azure, and GCP request structs (`omitempty` for backward compatibility). - `CODER_AGENT_NAME` env var and `--agent-name` flag wired into the agent bootstrap before instance-identity auth runs. Server handler (`handleAuthInstanceID`): - When `agent_name` is present: direct lookup by (instance ID, name). - When absent: legacy lookup, then resource-scoped ambiguity check. Returns 409 with available agent names if multiple root agents match. - Whitespace-only names are trimmed and treated as unspecified. - Sub-agents remain excluded (`parent_id IS NULL` filter). Verification template: - `examples/templates/aws-multi-agent/` provisions one EC2 instance with two agents (`main` and `dev`), both using instance-identity auth with `CODER_AGENT_NAME` set in the cloud-init user data. ## Backward compatibility Existing single-agent deployments work unchanged. The `agent_name` field is optional with `omitempty`, and the unnamed path preserves today's behavior when only one root agent matches.	2026-04-16 13:59:09 +02:00
Michael Suchacz	1cf0354f72	feat: add plan mode with restricted tool boundary (#24236 ) > This PR was authored by Mux on behalf of Mike. ## Summary - add persistent plan mode for chats and the chat-specific plan file flow - add structured planning tools such as `ask_user_question` and `propose_plan` - keep `write_file` and `edit_files` constrained to the chat-specific plan file during plan turns - allow shell exploration in plan mode, including subagents, via `execute` and `process_output` - block implementation-oriented, provider-native, MCP, dynamic, and computer-use tools during plan turns - update the chat UI, tests, and docs for the new planning flow	2026-04-16 11:12:01 +02:00
Cian Johnston	c552f9f281	fix: stop group spend limits from leaking across org boundaries (#24294 ) Three SQL queries (`GetUserGroupSpendLimit`, `ResolveUserChatSpendLimit`, `GetUserChatSpendInPeriod`) aggregated chat spend limits and usage globally across all organizations. A restrictive group limit in org A would bleed into org B. ## Changes - Add `organization_id` parameter to all three SQL queries in `coderd/database/queries/chats.sql` - When nil UUID is passed, queries fall back to global behavior (backward compat for HTTP dashboard endpoints) - When real org ID is passed, limits and spend are scoped to that organization - Thread `organizationID` through `ResolveUsageLimitStatus` → `checkUsageLimit` → all chatd call sites - Update dbauthz wrappers for new param structs - HTTP endpoints (`chatCostSummary`, `getMyChatUsageLimitStatus`) pass `uuid.Nil` with TODO for future org-scoped UI - Add `TestResolveUsageLimitStatus_OrgScoped` with 5 test cases covering org isolation, nil-UUID fallback, spend scoping, and user override priority Closes coder/internal#1466 > 🤖	2026-04-14 16:56:17 +01:00
Thomas Kosiewski	6ab30123bf	feat: add chat debug log tables, queries, and SDK types (#23913 )	2026-04-13 15:06:06 +02:00
Matt Vollmer	36141fafad	feat: stack insights tables vertically and paginate Pull requests table (#24198 ) The "By model" and "Pull requests" tables on the PR Insights page (`/agents/settings/insights`) were side-by-side at `lg` breakpoints, and the Pull requests table was hard-capped at 20 rows by the backend. - Replaced `lg:grid-cols-2` with a single-column stacked layout so both tables span the full content width. - Removed the `LIMIT 20` from the `GetPRInsightsRecentPRs` SQL query so all PRs in the selected time range are returned. - Can add this back if we need it. If we do, we should add a little subheader above this table to indicate that we're not showing all PRs within the selected timeframe. - Added client-side pagination to the Pull requests table using `PaginationWidgetBase` (page size 10), matching the existing pattern in `ChatCostSummaryView`. - Renamed the section heading from "Recent" to "Pull requests" since it now shows the full set for the time range. <img width="1481" height="1817" alt="image" src="https://github.com/user-attachments/assets/0066c42f-4d7b-4cee-b64b-6680848edc68" /> > 🤖 PR generated with Coder Agents	2026-04-10 10:48:54 -04:00
Garrett Delfosse	3462c31f43	fix: update directory for terraform-managed subagents (#24220 ) When a devcontainer subagent is terraform-managed, the provisioner sets its directory to the host-side `workspace_folder` path at build time. At runtime, the agent injection code determines the correct container-internal path from `devcontainer read-configuration` and sends it via `CreateSubAgent`. However, the `CreateSubAgent` handler only updated `display_apps` for pre-existing agents, ignoring the `Directory` field. This caused SSH/terminal sessions to land in `~` instead of the workspace folder (e.g. `/workspaces/foo`). Add `UpdateWorkspaceAgentDirectoryByID` query and call it in the terraform-managed subagent update path to also persist the directory. Fixes PLAT-118 <details><summary>Root cause analysis</summary> Two code paths set the subagent `Directory` field: 1. Provisioner (build time): `insertDevcontainerSubagent` in `provisionerdserver.go` stores `dc.GetWorkspaceFolder()` — the host-side path from the `coder_devcontainer` Terraform resource (e.g. `/home/coder/project`). 2. Agent injection (runtime): `maybeInjectSubAgentIntoContainerLocked` in `api.go` reads the devcontainer config and gets the correct container-internal path (e.g. `/workspaces/project`), then calls `client.Create(ctx, subAgentConfig)`. For terraform-managed subagents (those with `req.Id != nil`), `CreateSubAgent` in `coderd/agentapi/subagent.go` recognized the pre-existing agent and entered the update path — but only called `UpdateWorkspaceAgentDisplayAppsByID`, discarding the `Directory` field from the request. The agent kept the stale host-side path, which doesn't exist inside the container, causing `expandPathToAbs` to fall back to `~`. </details> > [!NOTE] > Generated by Coder Agents	2026-04-10 10:11:22 -04:00
Zach	95cff8c5fb	feat: add REST API handlers and client methods for user secrets (#24107 ) Add the five REST endpoints for managing user secrets, SDK client methods, and handler tests. Endpoints: - `POST /api/v2/users/{user}/secrets` - `GET /api/v2/users/{user}/secrets` - `GET /api/v2/users/{user}/secrets/{name}` - `PATCH /api/v2/users/{user}/secrets/{name}` - `DELETE /api/v2/users/{user}/secrets/{name}` Routes are registered under the existing `/{user}` group with `ExtractUserParam`. The delete query was changed from `:exec` to `:execrows` so the handler can distinguish "not found" from success (DELETE with `:exec` silently returns nil for zero affected rows).	2026-04-09 12:12:55 -06:00
Kyle Carberry	391b22aef7	feat: add CLI commands for managing chat context from workspaces (#24105 ) Adds `coder exp chat context add` and `coder exp chat context clear` commands that run inside a workspace to manage chat context files via the agent token. `add` reads instruction and skill files from a directory (defaulting to cwd) and inserts them as context-file messages into an active chat. Multiple calls are additive — `instructionFromContextFiles` already accumulates all context-file parts across messages. `clear` soft-deletes all context-file messages, causing `contextFileAgentID()` to return `!found` on the next turn, which triggers `needsInstructionPersist=true` and re-fetches defaults from the agent. Both commands auto-detect the target chat via `CODER_CHAT_ID` (already set by `agentproc` on chat-spawned processes), or fall back to single-active-chat resolution for the agent. The `--chat` flag overrides both. Also adds sub-agent context inheritance: `createChildSubagentChat` now copies parent context-file messages to child chats at spawn time, so delegated sub-agents share the same instruction context without independently re-fetching from the workspace agent. <details><summary>Implementation details</summary> New files: - `cli/exp_chat.go` — CLI command tree under `coder exp chat context` Modified files: - `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context from an arbitrary directory without env vars - `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext` SDK methods - `coderd/workspaceagents.go` — POST/DELETE handlers on `/workspaceagents/me/chat-context` - `coderd/coderd.go` — Route registration - `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`, `SoftDeleteContextFileMessages` - `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new queries - `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent inheritance - `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()` Auth pattern: Uses `AgentAuth` (same as `coder external-auth`) — agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars. </details> > 🤖 Generated by Coder Agents --------- Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>	2026-04-09 16:33:00 +02:00
Kyle Carberry	c5d720f73d	feat(coderd): add telemetry for agents chats and messages (#24068 ) Adds telemetry collection for the agents chat system (`/agents`) to the existing telemetry snapshot pipeline. Three new snapshot fields: - `Chats` — per-chat metadata (id, owner, status, mode, workspace_id, root_chat_id, has_parent, archived, model config) collected time-windowed via `createdAfter` - `ChatMessageSummaries` — per-chat aggregated message metrics (counts by role, token sums by type, cost, runtime, model count, compression count) collected time-windowed - `ChatModelConfigs` — model configuration metadata (provider, model, context limit, enabled, default) collected as full dump No PII is included — titles, message content, and URLs are excluded at the SQL level. Only structural metadata flows through telemetry. <details><summary>Implementation plan</summary> ### SQL Queries (`coderd/database/queries/chats.sql`) - `GetChatsCreatedAfter` — time-windowed chat metadata - `GetChatMessageSummariesPerChat` — per-chat message aggregates via `GROUP BY` - `GetChatModelConfigsForTelemetry` — full dump of model configs ### Telemetry (`coderd/telemetry/telemetry.go`) - `Chat`, `ChatMessageSummary`, `ChatModelConfig` structs - `ConvertChat`, `ConvertChatMessageSummary`, `ConvertChatModelConfig` conversion functions - Three `eg.Go()` blocks in `createSnapshot()` following the existing collection pattern ### Authorization (`coderd/database/dbauthz/dbauthz.go`) - System-only access for all three queries via `rbac.ResourceSystem` ### Tests - `TestChatsTelemetry` in `coderd/telemetry/telemetry_test.go` — creates chats (root + child), messages with token/cost data, model configs; verifies all snapshot fields - dbauthz test entries for all three queries in `coderd/database/dbauthz/dbauthz_test.go` </details> > 🤖 Generated by Coder Agents	2026-04-08 09:47:44 -04:00
Cian Johnston	233343c010	feat: add chat and chat_files cleanup to dbpurge (#23833 ) Fixes https://github.com/coder/coder/issues/23910 Adds periodic cleanup of chats and chat files to the dbpurge background goroutine, with a configurable retention period exposed in the Agent settings UI. > 🤖 Written by a Coder Agent. Reviewed by a human.	2026-04-08 11:08:09 +01:00
Zach	565a15bc9b	feat: update user secrets queries for REST API and injection (#23998 ) Update queries as prep work for user secrets API development: - Switch all lookups and mutations from ID-based to user_id + name - Split list query into metadata-only (for API responses) and with-values (for provisioner/agent) - Add partial update support using CASE WHEN pattern for write-only value fields - Include value_key_id in create for dbcrypt encryption support - Update dbauthz wrappers and remove stale methods from dbmetrics	2026-04-07 09:03:28 -06:00
Kyle Carberry	684f21740d	perf(coderd): batch chat heartbeat queries into single UPDATE per interval (#24037 ) ## Summary Replaces N per-chat heartbeat goroutines with a single centralized heartbeat loop that issues one `UPDATE` per 30s interval for all running chats on a worker. ## Problem Each running chat spawned a dedicated goroutine that issued an individual `UPDATE chats SET heartbeat_at = NOW() WHERE id = $1 AND worker_id = $2 AND status = 'running'` query every 30 seconds. At 10,000 concurrent chats this produces ~333 DB queries/second just for heartbeats, plus ~333 `ActivityBumpWorkspace` CTE queries/second from `trackWorkspaceUsage`. ## Solution New `UpdateChatHeartbeats` (plural) SQL query replaces the old singular `UpdateChatHeartbeat`: ```sql UPDATE chats SET heartbeat_at = @now::timestamptz WHERE worker_id = @worker_id::uuid AND status = 'running'::chat_status RETURNING id; ``` A single `heartbeatLoop` goroutine on the `Server`: 1. Ticks every `chatHeartbeatInterval` (30s) 2. Issues one batch UPDATE for all registered chats 3. Detects stolen/completed chats via set-difference (equivalent of old `rows == 0`) 4. Calls `trackWorkspaceUsage` for surviving chats `processChat` registers an entry in the heartbeat registry instead of spawning a goroutine. ## Impact \| Metric \| Before (10K chats) \| After (10K chats) \| \|---\|---\|---\| \| Heartbeat queries/sec \| ~333 \| ~0.03 (1 per 30s per replica) \| \| Heartbeat goroutines \| 10,000 \| 1 \| \| Self-interrupt detection \| Per-chat `rows==0` \| Batch set-difference \| --- > 🤖 Generated by Coder Agents <details><summary>Implementation notes</summary> - Uses `@now` parameter instead of `NOW()` so tests with `quartz.Mock` can control timestamps. - `heartbeatEntry` stores `context.CancelCauseFunc` + workspace state for the centralized loop. - `recoverStaleChats` is unaffected — it reads `heartbeat_at` which is still updated. - The old singular `UpdateChatHeartbeat` is removed entirely. - `dbauthz` wrapper uses system-level `rbac.ResourceChat` authorization (same pattern as `AcquireChats`). </details>	2026-04-07 10:25:46 -04:00
Cian Johnston	d5a1792f07	feat: track chat file associations with chat_file_links on chats (#23537 ) Needed by #23833 Adds a `chat_file_links` association table to track which files are associated with each chat. - `AppendChatFileIDs` query links a file to a chat with deduplication - `GetChatFileMetadataByIDs` query returns lightweight file metadata by IDs - Tool-created files (e.g. `propose_plan`) are linked to the chat after insert - User-uploaded files are linked to the chat when the referencing message is sent - Single-chat GET endpoint hydrates `files: ChatFileMetadata[]` on the response > 🤖 Created by Coder Agents and massaged into shape by a human.	2026-04-07 12:05:29 +01:00
Jon Ayers	a1d51f0dab	feat: batch connection logs to avoid DB lock contention (#23727 ) - Running 30k connections was generating a ton of lock contention in the DB	2026-04-03 15:47:26 -05:00
Jon Ayers	333503f74e	feat: improve coordinator peer mapping performance (#23696 ) - Skipping DB querying entirely for peers that aren't actually connected to our coordinator - Opportunistically batching the queries for peers	2026-04-03 14:22:58 -05:00
Michael Suchacz	7d0a0c6495	feat: provider key policies and user provider settings (#23751 )	2026-04-02 19:46:42 +02:00
Ethan	5cba59af79	fix(coderd): unarchive child chats with parents (#23761 ) Unarchiving a root chat now restores descendant chats in the database and emits lifecycle events for every affected chat so passive sessions converge without a full refetch. This keeps archive and unarchive symmetric at both the data and watch-stream layers by returning the affected chat family from the database, using those post-update rows for chatd pubsub fanout, and covering descendant lifecycle delivery with a watch-level regression test. Closes #23666	2026-04-01 15:30:25 +11:00
Kyle Carberry	a5cc579453	feat: add last_injected_context column to chats table (#23798 ) Adds a nullable JSONB column `last_injected_context` to the `chats` table that stores the most recently persisted injected context parts (AGENTS.md context-file and skill message parts). The column is updated only when `persistInstructionFiles()` runs — on first workspace attach or when the agent changes — so there are no redundant writes on subsequent turns. Internal fields (`ContextFileContent`, `ContextFileOS`, `ContextFileDirectory`, `SkillDir`) are stripped at write time so the column only holds small metadata. No stripping needed on the read path. <details> <summary>Implementation notes</summary> - New migration `000456` adds nullable `last_injected_context JSONB` column. - New SQL query `UpdateChatLastInjectedContext` writes the column without touching `updated_at`. - `persistInstructionFiles()` strips internal fields from parts via `StripInternal()` before persisting. - Sentinel path (no AGENTS.md) persists skill-only parts when skills exist. - `codersdk.Chat` exposes `LastInjectedContext []ChatMessagePart` (omitempty). - `db2sdk.Chat()` passes through the already-clean data. </details>	2026-03-30 14:11:30 -04:00
Jake Howell	71a492a374	feat: implement `<ClientFilter />` to AI Bridge request logs (#22694 ) Closes #22136 This pull-request implements a `<ClientFilter />` to our `Request Logs` page for AI Bridge. This will allow the user to select a client which they wish to filter against. Technically the backend is able to actually filter against multiple clients at once however the frontend doesn't currently have a nice way of supporting this (future improvement). <img width="1447" height="831" alt="image" src="https://github.com/user-attachments/assets/0be234e2-25f2-4a89-b971-d74817395da1" /> --------- Co-authored-by: Jeremy Ruppel <jeremy.ruppel@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 17:18:28 -04:00
Kyle Carberry	bcdc35ee3e	feat: add chat read/unread indicator to sidebar (#23129 ) ## Summary Adds read/unread tracking for chats so users can see which agent conversations have new assistant messages they haven't viewed. ## Backend Changes - Adds `last_read_message_id` column to the `chats` table (migration 000439). - Computes `has_unread` as a virtual column in `GetChatsByOwnerID` using an `EXISTS` subquery checking for assistant messages beyond the read cursor. - Exposes `has_unread` on the `codersdk.Chat` struct and auto-generated TypeScript types. - Updates `last_read_message_id` on stream connect/disconnect in `streamChat`, avoiding per-message API calls during active streaming. - Uses `context.WithoutCancel` for the deferred disconnect write so the DB update succeeds even after the client disconnects. ## Frontend Changes - Bold title (`font-semibold`) for unread chats in the sidebar. - Small blue dot indicator next to the relative timestamp. - Suppresses unread indicator for the currently active chat via `isActive` from NavLink. ## Design Decisions - Only `assistant` messages count as unread — the user's own messages don't trigger the indicator. - No foreign key on `last_read_message_id` since messages can be deleted (via rollback/truncation) and the column is just a high-water mark. - Zero API calls during streaming: exactly 2 DB writes per stream session (connect + disconnect). - Unread state refreshes on chat list load and window focus. The `watchChats` WebSocket optimistically marks non-active chats as unread on `status_change` events, but does not carry a server-computed `has_unread` field. Navigating to a chat optimistically clears its unread indicator in the cache.	2026-03-27 12:15:04 -04:00
Michael Suchacz	2312e5c428	feat: add manual chat title regeneration (#23633 ) ## Summary Adds a "Generate new title" action that lets users manually regenerate a chat's title using richer conversation context than the automatic first-message title path. ## Changes ### Backend - New endpoint: `POST /api/experimental/chats/{chatID}/title/regenerate` returns the updated Chat with a regenerated title - Manual title algorithm: Extracts useful user/assistant text turns → selects first user turn + last 3 turns → builds context with gap markers → renders prompt with anti-recency guidance → calls lightweight model → normalizes output - Helpers: `extractManualTitleTurns`, `selectManualTitleTurnIndexes`, `buildManualTitleContext`, `renderManualTitlePrompt`, `generateManualTitle` — all private, with the public `Server.RegenerateChatTitle` method - SDK: `ExperimentalClient.RegenerateChatTitle(ctx, chatID) (Chat, error)` - Persists title via existing `UpdateChatByID` and broadcasts `ChatEventKindTitleChange` ### Frontend - API client method + React Query mutation with cache invalidation - "Generate new title" menu item (with wand icon) in both TopBar and Sidebar dropdown menus - Loading/disabled state while regeneration is in-flight - Error toast on failure - Stories updated for both menus ### Tests - `quickgen_test.go`: Table-driven tests for all 4 helper functions (turn extraction, index selection, context building, prompt rendering) - `exp_chats_test.go`: Handler tests (ChatNotFound, NotFoundForDifferentUser, NoDaemon) ## Design notes - The existing auto-title path (`maybeGenerateChatTitle`, `titleInput`) is completely unchanged - Manual regeneration uses richer context (first user turn + last 3 turns + gap markers) vs the auto path's single first message - Endpoint is experimental and marked with `@x-apidocgen {"skip": true}`	2026-03-27 01:47:19 +01:00
Matt Vollmer	113aaa79a0	feat: add pinned chats with drag-to-reorder (#23615 ) https://github.com/user-attachments/assets/bd5d12a1-61b3-4b7d-83b6-317bdfb60b3c ## Summary Adds pinned chats to the agents page sidebar with server-side persistence and drag-to-reorder. Users can pin/unpin chats via the context menu, and pinned chats appear in a dedicated "Pinned" section above the time-grouped list. ## Database Migration `000453_chat_pin_order`: adds `pin_order integer DEFAULT 0 NOT NULL` column on `chats` (0 = unpinned, 1+ = pinned in display order). Three SQL queries handle pin operations server-side using CTEs with `ROW_NUMBER()`: - `PinChatByID`: normalizes existing orders and appends to end - `UnpinChatByID`: sets target to 0 and compacts remaining pins - `UpdateChatPinOrder`: shifts neighbors, clamps to `[1, pinned_count]` All queries exclude archived chats. `ArchiveChatByID` clears `pin_order` on archive. The handler rejects pinning archived chats with 400. ## Backend Pin/unpin/reorder go through the existing `PATCH /api/experimental/chats/{chat}` via the `pin_order` field on `UpdateChatRequest`. The handler routes based on current pin state: `pin_order == 0` unpins, `> 0` on an already-pinned chat reorders, `> 0` on an unpinned chat appends to end. ## Frontend - `pinChat` / `unpinChat` / `reorderPinnedChat` optimistic mutations using shared `isChatListQuery` predicate - Sidebar renders Pinned section above time groups, excludes pinned chats from time groups - Pin/Unpin context menu items (hidden for child/delegated chats) - `@dnd-kit/core` + `@dnd-kit/sortable` for drag-to-reorder with `MouseSensor`, `TouchSensor`, and `KeyboardSensor` - Local pin-order override prevents flash on drop; click blocker prevents NavLink navigation after drag --- PR generated with Coder Agents	2026-03-26 16:52:02 -04:00
Danny Kopping	801e57d430	feat: session detail API (#23203 )	2026-03-26 18:09:53 +02:00
Michael Suchacz	4f063cdc47	feat: separate default and additional Coder Agents system prompts (#23616 ) Admins can now control whether the built-in Coder Agents default system prompt is prepended to their custom instructions, rather than having the custom prompt silently replace the default. Changes: - New `include_default_system_prompt` boolean toggle (defaults to `true` for existing deployments) stored as a site config key — no migration needed. - GET `/api/experimental/chats/config/system-prompt` returns the toggle state, the custom prompt, and a preview of the built-in default. - PUT persists both the toggle and custom prompt atomically in a single transaction. - `resolvedChatSystemPrompt()` composes `[default?, custom?]` joined by `\n\n`, falling back to the built-in default on DB errors. - Settings UI adds a Switch toggle with conditional helper text and a "Preview" button that shows the built-in default prompt via the existing `TextPreviewDialog`. - Comprehensive test coverage: 15 subtests covering toggle behavior, prompt composition matrix, auth boundaries, and integration with chat creation.	2026-03-26 13:32:41 +01:00
Cian Johnston	d175e799da	feat: show agent badge on workspace list (#23453 ) - Adds `GET /api/experimental/chats/by-workspace` endpoint that returns workspace_id → latest chat_id mapping - Modifies FE to fetch this alongside the workspace list, gated on `agents` experiment and render an "Agent" badge similar to the existing "Task" badge in `WorkspacesTable` - Badge links to the "latest chat" linked to the given workspace. Notes: - Intentionally uses `fetchWithPostFilter` for RBAC to decouple from workspaces API — will migrate to `workspaces_expanded` view later. - If users have multiple chats linked to the same workspace, the badge will link to the most recently updated one. > 🤖 This PR was created with the help of Coder Agents, and has been reviewed by my human. 🧑‍💻	2026-03-26 11:30:12 +00:00
Jaayden Halko	3fb7c6264f	feat: display the AI add-on column in the UI on the Users and Organization Members tables (#23291 ) ## Summary Adds an entitlement-gated AI add-on column to both the Users table and the Organization Members table. When `ai_governance_user_limit` is entitled, each row shows whether the user is consuming an AI seat. ## Background The AI governance add-on tracks which users are consuming AI seats. Admins need visibility into per-user seat consumption directly from the user management tables. This change surfaces that information through both the site-wide Users table and the per-organization Members table, gated behind the `ai_governance_user_limit` entitlement so the column only appears when the feature is licensed. ## Implementation ### Backend - New SQL query `GetUserAISeatStates` (`coderd/database/queries/aiseatstate.sql`) — returns user IDs consuming an AI seat, derived from: - Users with entries in `aibridge_interceptions` (AI Bridge usage) - Users who own workspaces with `has_ai_task = true` builds (AI Tasks usage) - SDK types — added `has_ai_seat: boolean` to `codersdk.User` and `codersdk.OrganizationMemberWithUserData` - Handler wiring — both the Users list endpoint (`coderd/users.go`) and all Members endpoints (`coderd/members.go`) query AI seat state per page of user IDs and populate the response field - dbauthz — per-user `ActionRead` checks on `ResourceUserObject` ### Frontend - Shared `AISeatCell` component (`site/src/modules/users/AISeatCell.tsx`) — green `CircleCheck` for consuming, gray `X` for non-consuming - `TableColumnHelpTooltip` — extended with `ai_addon` variant with tooltip: "Users with access to AI features like AI Bridge, Boundary, or Tasks who are actively consuming a seat." - Column visibility gated behind `useFeatureVisibility().ai_governance_user_limit` ## Validation - Backend: dbauthz full method suite (`TestMethodTestSuite`) passes including new `GetUserAISeatStates` test - Backend: `TestGetUsers`, `TestUsersFilter`, CLI golden file tests pass - Frontend: 7/7 tests pass across `UsersPage.test.tsx` and `OrganizationMembersPage.test.tsx` (column visibility gating both directions) - `go build ./coderd/...` compiles clean - `pnpm --dir site run lint:types` passes - `make gen` clean ## Risks - Pagination performance: The AI seat query is scoped to the current page's user IDs (not a full table scan), keeping it efficient for paginated views. - Semantic scope: The workspace-side AI seat derivation uses "any build with `has_ai_task = true`" rather than "latest build only". If the product intent is latest-build-only, this can be tightened in a follow-up. --- _Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking: `xhigh` • Cost: `$27.25`_ <!-- mux-attribution: model=anthropic:claude-opus-4-6 thinking=xhigh costs=27.25 -->	2026-03-26 10:36:40 +00:00
Ethan	61e31ec5cc	perf(coderd/x/chatd): persist workspace agent binding across chat turns (#23274 ) ## Summary This change removes the steady-state "resolve the latest workspace agent" query from chat execution. Instead of asking the database for the latest build's agent on every turn, a chat now persists the workspace/build/agent binding it actually uses and reuses that binding across subsequent turns. The common path becomes "load the bound agent by ID and dial it", with fallback paths to repair the binding when it is missing, stale, or intentionally changed. ## What changes - add `workspace_id`, `build_id`, and `agent_id` binding fields to `chats` - expose those fields through the chat API / SDK so the execution context is explicit - load the persisted binding first in chatd, instead of always resolving the latest build's agent - persist a refreshed binding when chatd has to re-resolve the workspace agent - keep child / subagent chats on the same bound workspace context by inheriting the parent binding - leave `build_id` / `agent_id` unset for flows like `create_workspace`, then bind them lazily on the next agent-backed turn ## Runtime behavior The binding is treated as an optimistic cache of the agent a chat should use: - if the bound agent still exists and dials successfully, we use it without a latest-build lookup - if the bound agent is missing or no longer reachable, chatd re-resolves against the latest build and persists the new binding - if a workspace mutation changes the chat's target workspace, the binding is updated as part of that mutation To avoid reintroducing a hot-path query, dialing uses lazy validation: - start dialing the cached agent immediately - only validate against the latest build if the dial is still pending after a short delay - if validation finds a different agent, cancel the stale dial, switch to the current agent, and persist the repaired binding ## Result The hot path stops issuing `GetWorkspaceAgentsInLatestBuildByWorkspaceID` for every user message, which is the source of the DB pressure this PR is addressing. At the same time, chats still converge to the correct workspace agent when the binding becomes stale due to rebuilds or explicit workspace changes.	2026-03-26 17:22:38 +11:00
Kyle Carberry	d4660d8a69	feat: add labels to chats (#23594 ) ## Summary Adds a general-purpose `map[string]string` label system to chats, stored as jsonb with a GIN index for efficient containment queries. This is a standalone foundational feature that will be used by the upcoming Automations feature for session identity (matching webhook events to existing chats), replacing the need for bespoke session-key tables. ## Changes ### Database - Migration 000451: Adds `labels jsonb NOT NULL DEFAULT '{}'` column to `chats` table with a GIN index (`idx_chats_labels`) - `InsertChat`: Accepts labels on creation via `COALESCE(@labels, '{}')` - `UpdateChatByID`: Supports partial update — `COALESCE(sqlc.narg('labels'), labels)` preserves existing labels when NULL is passed - `GetChats`: New `has_labels` filter using PostgreSQL `@>` containment operator - `GetAuthorizedChats`: Synced with generated `GetChats` (new column scan + query param) ### API - Create chat (`POST /chats`): Accepts optional `labels` field, validated before creation - Update chat (`PATCH /chats/{chat}`): Supports `labels` field for atomic label replacement - List chats (`GET /chats`): Supports `?label=key:value` query parameters (multiple are AND-ed) ### SDK - `Chat`, `CreateChatRequest`, `UpdateChatRequest`, `ListChatsOptions` all gain `Labels` fields - `UpdateChatRequest.Labels` is a pointer (`map[string]string`) so `nil` means "don't change" vs empty map means "clear all" ### Validation (`coderd/httpapi/labels.go`) - Max 50 labels per chat - Key: 1–64 chars, must match `[a-zA-Z0-9][a-zA-Z0-9._/-]` (supports namespaced keys like `github.repo`, `automation/pr-number`) - Value: 1–256 chars - 13 test cases covering all edge cases ### Chat runtime - `chatd.CreateOptions` gains `Labels` field, threaded through to `InsertChat` - Existing `UpdateChatByID` callers (e.g., quickgen title updates) are unaffected — NULL labels preserve existing values via COALESCE	2026-03-25 17:26:26 +00:00
Cian Johnston	796872f4de	feat: add deployment-wide template allowlist for chats (#23262 ) - Stores a deployment-wide agents template allowlist in `site_configs` (`agents_template_allowlist`) - Adds `GET/PUT /api/experimental/chats/config/template-allowlist` endpoints - Filters `list_templates`, `read_template`, and `create_workspace` chat tools by allowlist, if defined (empty=all allowed) - Add "Templates" admin settings tab in Agents UI ([what it looks like](https://624de63c6aacee003aa84340-sitjilsyrr.chromatic.com/?path=/story/pages-agentspage-agentsettingspageview--template-allowlist)) > 🤖 This PR was created with the help of Coder Agents, and has been reviewed by my human. 🧑‍💻	2026-03-25 15:19:17 +00:00
Danny Kopping	43a1af3cd6	feat: session list API (#23202 ) <!-- If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting. --> _Disclaimer:_ _initially_ _produced_ _by_ _Claude_ _Opus_ _4\.6,_ _heavily_ _modified_ _and_ _reviewed_ _by_ _me._ Closes https://github.com/coder/internal/issues/1360 Adds a new `/api/v2/aibridge/sessions` API which returns "sessions". Sessions, as defined in the [RFC](https://www.notion.so/coderhq/AI-Bridge-Sessions-Threads-2ccd579be59280f28021d3baf7472fbe?source=copy_link), are a set of interceptions logically grouped by a session key issued by the client. The API design for this endpoint was done in [this doc](https://github.com/coder/internal/issues/1360). If the client has not provided a session ID, we will revert to the thread root ID, and if that's not present we use the interception's own ID (i.e. a session of a single interception - which is effectively what we show currently in our `/api/v2/aibridge/interceptions` API). The SQL query looks gnarly but it's relatively simple, and seems to perform well (~200ms) even when I import dogfood's `aibridge_*` tables into my workspace. If we need to improve performance on this later we can investigate materialized views, perhaps, but for now I don't think it's warranted. --- _The PR looks large but it's got a lot of generated code; the actual changes aren't huge._	2026-03-24 08:58:47 +02:00
Michael Suchacz	82f965a0ae	feat: per-user per-model chat compaction threshold overrides (#23412 ) ## What Adds per-user per-model auto-compaction threshold overrides. Users can now customize the percentage of context window usage that triggers chat compaction, independently for each enabled model. ## Why The compaction threshold was previously only configurable at the deployment level (`chat_model_configs.compression_threshold`). Different users have different preferences — some want aggressive compaction to keep costs low, others prefer higher thresholds to retain more context. This gives users control without requiring admin intervention. ## Architecture Storage: Reuses the existing `user_configs` table (no migration needed). Overrides are stored as key/value pairs with keys shaped `chat_compaction_threshold:<modelConfigID>` and integer percent values. API: Three new experimental endpoints under `/api/experimental/chats/config/`: - `GET /user-compaction-thresholds` — list all overrides for the current user - `PUT /user-compaction-thresholds/{modelConfig}` — upsert an override (validates model exists and is enabled, validates 0–100 range) - `DELETE /user-compaction-thresholds/{modelConfig}` — clear an override (idempotent) Runtime resolution: In `coderd/chatd/chatd.go`, a new `resolveUserCompactionThreshold()` helper runs at the start of each chat turn (inside `runChat()`), after the model config is resolved but before `CompactionOptions` is built. If a valid override exists, it replaces `modelConfig.CompressionThreshold`. The threshold source (`user_override` vs `model_default`) is logged with each compaction event. Precedence: `effectiveThreshold = userOverride ?? modelConfig.CompressionThreshold` UI: New "Context Compaction" subsection in the Agents → Settings → Behavior tab, placed after Personal Instructions. Shows one row per enabled model with the system default, a number input for the override, and Save/Reset controls. ## Testing - 9 API subtests covering CRUD, validation (boundary values 0/100, out-of-range rejection), upsert behavior, idempotent delete, user isolation, and non-existent model config - 4 dbauthz tests (16 scenarios) verifying `ActionReadPersonal` / `ActionUpdatePersonal` on all query methods - 4 Storybook stories with play functions (Default, WithOverrides, Loading, Error) <details> <summary>Implementation plan</summary> ### Phase 1 — Tests - Backend API tests in `coderd/chats_test.go` (9 subtests) - Database auth wrapper tests in `coderd/database/dbauthz/dbauthz_test.go` (4 methods) - Frontend stories in `UserCompactionThresholdSettings.stories.tsx` (4 stories) ### Phase 2 — Backend preference surface - 4 SQL queries in `coderd/database/queries/users.sql` (list, get, upsert, delete) - `make gen` to propagate into generated artifacts - Auth/metrics wrappers in dbauthz and dbmetrics - SDK types and client methods in `codersdk/chats.go` - HTTP handlers and routes in `coderd/chats.go` and `coderd/coderd.go` - Key prefix constant shared between handlers and runtime ### Phase 3 — Runtime override - `resolveUserCompactionThreshold()` helper in `coderd/chatd/chatd.go` - Override injection in `runChat()` before building `CompactionOptions` - `threshold_source` field added to compaction log ### Phase 4 — Settings UI - API client methods and React Query hooks in `site/src/api/` - `UserCompactionThresholdSettings` component extracted from `SettingsPageContent` - Per-model mutation tracking (only the active row disables during save) - 100% warning, "System default" label, helpful empty state copy ### Phase 5 — Refactor and review fixes - Consolidated key prefix constant in `codersdk` - Explicit PUT range validation (not just struct tags) - GET handler gracefully skips malformed rows instead of 500 - Boundary value, upsert, and non-existent model config tests - UX improvements: per-model mutation state, aria-live on errors </details>	2026-03-24 00:48:18 +01:00
Asher	24ab216dd1	feat: add new group members endpoint with filtering and pagination (#23067 ) Partially addresses #21813 (still need to make changes to the "add user" button to be complete) Since there are a lot of user tests already, I moved them into `coderdtest` to be shared.	2026-03-20 12:43:03 -08:00
Cian Johnston	ff8dcca2c7	feat: add global chat workspace TTL setting (#23265 ) - Add `agents_workspace_ttl` site config (default: whatever the template says a.k.a. `0s`) - Expose via GET/PUT `/api/experimental/chats/config/workspace-ttl` - Chat tool reads setting and passes `TTLMillis` on workspace creation - Existing autostop infrastructure handles the rest (zero changes to LifecycleExecutor, CalculateAutostop, or activity bumping) - ⚠️ Template-level `UserAutostopEnabled=false` overrides this global default. Not touching this. - Frontend: "Workspace Lifetime" control in /agents/settings Behavior tab (admin-only) > This PR was created with the help of Coder Agents, and has been reviewed by several humans and robots. 🤖🤝🧑‍💻	2026-03-20 17:38:39 +00:00
Kyle Carberry	d8ff67fb68	feat: add MCP server configuration backend for chats (#23227 ) ## Summary Adds the database schema, API endpoints, SDK types, and encryption wrappers for admin-managed MCP (Model Context Protocol) server configurations that chatd can consume. This is the backend foundation for allowing external MCP tools (Sentry, Linear, GitHub, etc.) to be used during AI chat sessions. ## Database Two new tables: - `mcp_server_configs`: Admin-managed server definitions with URL, transport (Streamable HTTP / SSE), auth config (none / OAuth2 / API key / custom headers), tool allow/deny lists, and an availability policy (`force_on` / `default_on` / `default_off`). Includes CHECK constraints on transport, auth_type, and availability values. - `mcp_server_user_tokens`: Per-user OAuth2 tokens for servers requiring individual authentication. Cascades on user/config deletion. New column on `chats` table: - `mcp_server_ids UUID[]`: Per-chat MCP server selection, following the same pattern as `model_config_id` — passed at chat creation, changeable per-message with nil-means-no-change semantics. ## API Endpoints All routes are under `/api/experimental/mcp/servers/` and gated behind the `agents` experiment. Admin endpoints (`ResourceDeploymentConfig` auth): - `POST /` — Create MCP server config - `PATCH /{id}` — Update MCP server config (full-replace) - `DELETE /{id}` — Delete MCP server config Authenticated endpoints (all users, enabled servers only for non-admins): - `GET /` — List configs (admins see all, members see enabled-only with admin fields redacted) - `GET /{id}` — Get config by ID (with `auth_connected` populated per-user) OAuth2 per-user auth flow: - `GET /{id}/oauth2/connect` — Initiate OAuth2 flow (state cookie CSRF protection) - `GET /{id}/oauth2/callback` — Handle OAuth2 callback, store tokens - `DELETE /{id}/oauth2/disconnect` — Remove stored OAuth2 tokens ## Security - Secrets never returned: `OAuth2ClientSecret`, `APIKeyValue`, and `CustomHeaders` are never in API responses — only boolean indicators (`has_oauth2_secret`, `has_api_key`, `has_custom_headers`). - Field redaction for non-admins: `convertMCPServerConfigRedacted` strips `OAuth2ClientID`, auth URLs, scopes, and `APIKeyHeader` from non-admin responses. - dbcrypt encryption at rest: All 5 secret fields use `dbcrypt_keys` encryption with full encrypt-on-write / decrypt-on-read wrappers (11 dbcrypt method overrides + 2 helpers), following the same pattern as `chat_providers.api_key`. - OAuth2 CSRF protection: State parameter stored in `HttpOnly` cookie with `HTTPCookies.Apply()` for correct `Secure`/`SameSite` behind TLS-terminating proxies. - dbauthz authorization: All 18 querier methods have authorization wrappers. Read operations use `ActionRead`, write operations use `ActionUpdate` on `ResourceDeploymentConfig`. ## Governance Model \| Control \| Implementation \| \|---------\|---------------\| \| Global kill switch \| `enabled` defaults to `false` \| \| Availability policy \| `force_on` (always injected), `default_on` (pre-selected), `default_off` (opt-in) \| \| Per-chat selection \| `mcp_server_ids` on `CreateChatRequest` / `CreateChatMessageRequest` \| \| Auth gate \| OAuth2 servers require per-user auth before tools are injected \| \| Tool-level allow/deny \| Arrays on `mcp_server_configs` for granular tool filtering \| \| Secrets encrypted at rest \| Uses `dbcrypt_keys` (same pattern as `chat_providers.api_key`) \| ## Tests 8 test functions covering: - Full CRUD lifecycle (create, list, update, delete) - Non-admin visibility filtering (enabled-only, field redaction) - `auth_connected` population for OAuth2 vs non-OAuth2 servers - Availability policy validation (valid values + invalid rejection) - Unique slug enforcement (409 Conflict) - OAuth2 disconnect idempotency - Chat creation with `mcp_server_ids` persistence ## Known Limitations (Deferred) These are documented and intentional for an experimental feature: - Audit logging not yet wired — will add when feature stabilizes - Cross-field validation (e.g., OAuth2 fields required when `auth_type=oauth2`) — admin-only endpoint, will add when stabilizing - `force_on` auto-injection — query exists but not yet wired into chatd tool injection (follow-up) - Additional test coverage — 403 auth tests, GET-by-ID tests, callback CSRF tests planned for follow-up ## What's NOT in this PR - Frontend UI (admin panel + chat picker) - Actual MCP client connections (`chatd/chatmcp/` manager) - Tool injection into `chatloop/`	2026-03-19 14:07:36 +00:00
Kyle Carberry	1f0d896fc9	feat: add deleted flag to chat messages for soft-delete (#23223 ) Adds a `deleted` boolean column to the `chat_messages` table. Messages are never physically deleted from the database — instead they are marked as deleted so that usage and cost data is preserved. ## Changes ### Migration - New migration (000444) adds `deleted boolean NOT NULL DEFAULT false` to `chat_messages` ### SQL queries - `DeleteChatMessagesAfterID` → `SoftDeleteChatMessagesAfterID` (UPDATE SET deleted=true instead of DELETE) - New `SoftDeleteChatMessageByID` query for single-message soft-delete - All read queries now filter `deleted = false`: - `GetChatMessageByID` - `GetChatMessagesByChatID` - `GetChatMessagesByChatIDDescPaginated` - `GetChatMessagesForPromptByChatID` (both CTE and main query) - `GetLastChatMessageByRole` - Cost/usage queries (`GetChatCostSummary`, `GetChatCostPerModel`, etc.) intentionally still include deleted messages to preserve accurate spend tracking ### EditMessage behavior - Previously: updated the message content in-place + hard-deleted subsequent messages - Now: soft-deletes the original message + soft-deletes subsequent messages + inserts a new message with the updated content - This preserves the original message data (tokens, cost, content) in the database	2026-03-18 14:37:09 -04:00

1 2 3 4 5 ...

318 Commits