mirror of
https://github.com/coder/coder.git
synced 2026-06-02 20:48:20 +00:00
e57525002cc57f6a759c3b476a638fd107099aa3
629 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
e57525002c |
chore: remove agents experiment flag and mark feature as beta (#24432)
Remove the `ExperimentAgents` feature flag so the Agents feature is always available without requiring `--experiments=agents`. The feature is now in beta. Existing deployments that still pass `--experiments=agents` will get a harmless "ignoring unknown experiment" warning on startup.
### Changes
**Backend:**
- Remove `RequireExperimentWithDevBypass` middleware from chat and MCP server routes
- Always include `AgentsAccessRole` in assignable site roles (later refactored to org-scoped on main; rebase keeps that)
- Always set `AgentsTabVisible = true`, then drop the entire dead `AgentsTabVisible` metadata pipeline (Go htmlState field, populateHTMLState goroutine, HTML meta tag, useEmbeddedMetadata registration, mock); no production consumer reads it. `AgentsNavItem` already gates on `permissions.createChat`.
- Make `blob:` CSP `img-src` addition unconditional
- Remove `ExperimentAgents` constant, `DisplayName` case, and `ExperimentsKnown` entry
**CLI:**
- Graduate the agents TUI from `coder exp agents` to `coder agents` (moved from `AGPLExperimental()` to `CoreSubcommands()`)
- Drop the `agent` alias so it does not collide with the hidden workspace-agent command
- Rename implementation files `cli/exp_agents_*.go` -> `cli/agents_*.go` and internal identifiers (`expChatsTUIModel` -> `chatsTUIModel`, `newExpChatsTUIModel` -> `newChatsTUIModel`, `setupExpAgentsBackend` -> `setupAgentsBackend`, `startExpAgentsSession` -> `startAgentsSession`, `expAgentsPtr` -> `agentsPtr`, `expAgentsSession` -> `agentsSession`, `TestExpAgents*` -> `TestAgents*`). `expClient` (the `*codersdk.ExperimentalClient` local) is kept; `coderd/exp_chats*.go` and other still-experimental `cli/exp_*.go` commands are intentionally untouched.
**Frontend:**
- Remove experiment check from `AgentsNavItem` - render when `canCreateChat` is true
- Remove `agentsEnabled` experiment check from `WorkspacesPage`, then gate `chatsByWorkspace` on `permissions.createChat` so users without chat access don't trigger the per-page DB query (Copilot review feedback)
- Add `FeatureStageBadge` (beta) next to the Coder logo in the Agents sidebar (desktop + mobile)
**Docs:**
- Remove experiment flag setup instructions from `early-access.md` and `getting-started.md` (and rename `early-access.md`'s "Enable Coder Agents" heading to "Set up Coder Agents", since there is no enablement step left)
- Update `chats-api.md` and `getting-started.md`'s Chats API note to say "beta" instead of "experimental"
- `docs/manifest.json`: drop "experimental" from the Chats API sidebar description
- `make gen` regenerated `docs/reference/cli/agents.md` and the CLI index
- `scripts/check_emdash.sh`: exclude `cli/testdata/*.golden` and `enterprise/cli/testdata/*.golden` from the new repo-wide emdash lint, since serpent emits emdash borders in every generated `--help` golden file
**Tests:**
- Remove `ExperimentAgents` setup from all test files (14 occurrences across 7 files)
- Update stale "with the agents experiment" comments in `coderd/x/chatd/integration_test.go` and `coderd/mcp_test.go`
<img width="1185" height="900" alt="image" src="https://github.com/user-attachments/assets/b420bc8f-41d6-42c6-abd8-ad572533d651" />
> 🤖 Generated by Coder Agents
|
||
|
|
06bad73df4 |
feat: add admin-configurable advisor API, SDK, and queries (#24621)
## Summary
Add the **admin-configurable advisor configuration**: database-backed storage, SDK types, and the experimental HTTP handlers that will back the admin settings UI (the UI itself lands in a later PR). Follows the same "site-configs" pattern as Virtual Desktop.
## Motivation
The advisor needs runtime-tunable knobs (enable/disable, per-run cap, max output tokens, reasoning effort, optional model override) without a service restart or redeploy. Using the existing `site_configs` K/V table keeps this pattern consistent with other admin features and avoids a bespoke schema.
## Changes
### Database (`coderd/database/queries/siteconfig.sql`)
- `GetChatAdvisorConfig` returns the stored JSON blob (default `'{}'`) under key `agents_advisor_config`.
- `UpsertChatAdvisorConfig` uses the standard `INSERT ... ON CONFLICT` pattern.
- Regenerated via `make gen` (queries.sql.go + mocks).
### SDK (`codersdk/chats.go`)
- `AdvisorConfig` type with `Enabled`, `MaxUsesPerRun`, `MaxOutputTokens`, `ReasoningEffort` (`""` / `low` / `medium` / `high`), `ModelConfigID uuid.UUID`.
- Client methods: `ChatAdvisorConfig(ctx)` / `UpdateChatAdvisorConfig(ctx, cfg)`.
### API (`coderd/exp_chats.go`)
- `GET /api/experimental/chats/config/advisor`: reads current config; relies on `ActorFromContext` validation.
- `PUT /api/experimental/chats/config/advisor`: requires `policy.ActionUpdate` on `rbac.ResourceDeploymentConfig`.
- Handlers unmarshal `{}` to a typed zero value and re-marshal on upsert for schema stability.
- Tests in `exp_chats_test.go` cover empty defaults, round-trip update, unauthorized update, and invalid body.
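The `{}`-to-zero-value round trip described above can be sketched as follows. This is a minimal illustration, not the handler code from the PR: the JSON tag names are assumed, `ModelConfigID` is a plain string here (the PR uses `uuid.UUID`) so that only the standard library is needed, and `normalizeAdvisorConfig` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AdvisorConfig mirrors the fields listed for the SDK type. The JSON tag
// names are assumptions, and ModelConfigID is a string in this sketch.
type AdvisorConfig struct {
	Enabled         bool   `json:"enabled"`
	MaxUsesPerRun   int    `json:"max_uses_per_run"`
	MaxOutputTokens int64  `json:"max_output_tokens"`
	ReasoningEffort string `json:"reasoning_effort"`
	ModelConfigID   string `json:"model_config_id"`
}

// normalizeAdvisorConfig (hypothetical) decodes a stored blob into the typed
// struct and re-encodes it, so every stored value carries the same keys.
func normalizeAdvisorConfig(raw string) (AdvisorConfig, string, error) {
	var cfg AdvisorConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return AdvisorConfig{}, "", fmt.Errorf("decode advisor config: %w", err)
	}
	out, err := json.Marshal(cfg)
	if err != nil {
		return AdvisorConfig{}, "", fmt.Errorf("encode advisor config: %w", err)
	}
	return cfg, string(out), nil
}

func main() {
	// The default stored value '{}' decodes to the typed zero value.
	cfg, blob, err := normalizeAdvisorConfig(`{}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Enabled, blob)
}
```

Re-marshaling through the typed struct means a reader never has to distinguish a missing key from an explicit zero, which is presumably what "schema stability" refers to.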
## Stack context
This is **PR 3 of 6** in the advisor feature stack. Consumed by:
- PR 4 (`feat/advisor-04-chatd-runtime`), which reads this config on every `runChat`.
- PR 6 (`feat/advisor-06-admin-settings-ui`), which renders the admin form.
## Scope / non-goals
- No `chatd` read path (lands in PR 4).
- No UI (lands in PR 6).
- `agents_advisor_config` remains a single-row JSON blob; we intentionally do not shard per-org/per-template yet.
## Validation
- `make gen`
- `go test ./coderd/database/... -run TestChatAdvisor`
- `go test ./coderd/... -run TestChatAdvisorConfig`
- `make lint`
---
<details>
<summary>📋 Implementation Plan (shared across the advisor stack)</summary>
# Plan: Add a Mux-style advisor tool to coder agents/chatd
## Outcome
Add a first-class `advisor` tool to agent chats in `coderd/x/chatd` that feels native to Coder:
- it is a built-in server-side tool, not an MCP/dynamic-tool workaround;
- it performs a nested **tool-less** model call for strategic advice;
- it is exposed only when eligible, and the prompt mentions it only when it is actually available;
- it is treated as a **planning-only** tool so it does not run alongside action tools in the same batch;
- it tracks advisor usage and cost separately, so operators can reason about them;
- it has a minimally polished UI in the Agents page;
- and it ships with explicit dogfooding evidence, including screenshots and repro videos.
## Design decisions to lock before coding
1. **Primary architecture:** native built-in tool in `chattool/`, backed by a small `chatadvisor` package.
2. **Nested model execution:** reuse chatd's existing model/provider stack for a one-step, tool-less advisor call rather than inventing a new provider pathway.
3. **Execution policy:** treat `advisor` as an exclusive/planning-only tool; mixed batches must return structured policy errors and force the model to retry cleanly.
4. **Availability:** initial rollout is for root agent chats only; disable for child/sub-agent chats until recursion/cost policy is proven.
5. **Prompt sync:** use one eligibility boolean to drive both tool registration and advisor guidance injection.
6. **Persistence/cost split:** MVP should keep advisor usage visible in result metadata and server metrics; only add DB schema if product/billing explicitly needs queryable advisor-specific cost.
7. **UI scope:** generic tool rendering is an acceptable temporary milestone during backend bring-up, but the release candidate should include a dedicated lightweight advisor renderer.
## Delivery model
The work should be executed as coordinated workstreams with one integration owner and parallel contributors for low-conflict areas. The integration owner should own `coderd/x/chatd/chatd.go` because prompt assembly, tool registration, and model resolution all converge there.
## Detailed workstreams
### Repo evidence used for this plan
<details>
<summary>Mux reference and current chatd seams</summary>
**Mux reference implementation**
- `src/node/services/tools/advisor.ts` — native advisor tool implementation.
- `src/common/constants/advisor.ts` — advisor prompt/constants and truncation policy.
- `src/common/utils/tools/tools.ts` — conditional tool registration.
- `src/node/services/streamContextBuilder.ts` — injects advisor guidance only when the tool is available.
**Current chatd seams**
- `coderd/x/chatd/chatd.go`
- `processChat()` — tool assembly, prompt assembly, and chatloop invocation.
- `resolveChatModel()` — current model/provider/key resolution seam.
- `type Config struct` — server-level chatd configuration surface.
- `coderd/x/chatd/chatloop/chatloop.go`
- `Run()` — main streaming/model loop.
- `executeTools()` — built-in tool execution/batching seam.
- `coderd/x/chatd/chattool/` — built-in tool implementations.
- `site/src/pages/AgentsPage/components/ChatElements/tools/Tool.tsx` — tool renderer dispatch.
- `site/src/pages/AgentsPage/components/ChatConversation/messageParsing.ts` and `ConversationTimeline.tsx` — tool/result merge and rendering flow.
</details>
### Workstream map and ownership
| Workstream | Primary owner | Main files | Can run in parallel? | Done when |
|---|---|---|---|---|
| 0. Integration + gating | Integration lead | `coderd/x/chatd/chatd.go` | No; central merge lane | Tool registration, prompt sync, and model selection are wired together |
| 1. Advisor runtime + tool | Backend agent | new `coderd/x/chatd/chatadvisor/`, new `coderd/x/chatd/chattool/advisor.go` | Yes | Tool can perform a tool-less advisor call in memory and return structured results |
| 2. Planning-only execution policy | Chatloop agent | `coderd/x/chatd/chatloop/chatloop.go`, related tests | Yes | Mixed `advisor` + action-tool batches are rejected cleanly and deterministically |
| 3. Metrics/usage/config | Backend/telemetry agent | `chatd.go`, `chatloop/metrics.go`, optional config plumbing | Partially; coordinate with integration lead | Advisor usage is separately visible in metadata/metrics and limits are enforced |
| 4. Frontend rendering | Frontend agent | `site/.../tools/Tool.tsx`, new `AdvisorTool.tsx`, stories | Yes after result schema stabilizes | Advisor renders as a readable card and story tests pass |
| 5. Dogfood + QA evidence | QA agent | dev server, Storybook, dogfood output | After backend + UI are usable | Repro videos, screenshots, and a concise QA report exist |
### Parallelization rules
- **Do not split `coderd/x/chatd/chatd.go` across multiple execution agents without an integration lead.** That file owns prompt building, tool registration, model resolution, and cost persistence.
- Workstreams 1 and 2 can be developed in parallel and then stacked onto the integration branch.
- Workstream 4 should begin once the backend result schema is agreed on, even if the backend is still behind a feature flag.
- Any agent that needs to re-check Mux behavior should clone `coder/mux` into a temporary directory (for example, `$(mktemp -d)/mux`) and inspect it read-only; do not vendor or copy code from Mux directly.
## Phase 0 — Preflight and guardrails
### Goals
- Align the team on the smallest shippable architecture.
- Prevent scope creep into MCP/dynamic-tool/sub-agent variants.
- Decide upfront what is MVP vs. follow-up.
### Tasks
1. **Confirm the MVP boundary.**
- Ship a built-in advisor tool first.
- Do **not** make MCP, dynamic tools, or sub-agents the primary implementation.
- Do **not** add transient streaming phases in the first backend PR unless they fall out almost for free.
2. **Confirm local workflow hygiene before coding.**
- Ensure the repo is using the project git hooks from `scripts/githooks`.
- Do not bypass hooks with `--no-verify`.
- Use `./scripts/develop.sh` for the full dev server rather than manual build/run commands.
3. **Lock the model-selection policy.**
- **Recommended MVP:** advisor uses the same resolved provider/model/cost config as the current chat, with advisor-specific max-output and usage caps.
- **Follow-up only if required:** add a separate `AdvisorModelConfigID`-style override that resolves through the existing `configCache`/model-config path. Do not invent a new free-form `provider:model` parser if chatd already stores provider/model separately.
4. **Lock the persistence policy.**
- **Recommended MVP:** no DB migration. Persist advisor-visible metadata in the tool result and record separate metrics in memory/Prometheus.
- **Only if product/billing explicitly asks for queryable advisor cost:** add a later DB migration or usage table, following the normal `queries/*.sql` + `make gen` workflow.
5. **Create an execution ADR note in the work item or tracking doc.**
- Capture: built-in tool, tool-less nested call, root-chat-only rollout, exclusive execution policy, MVP no-DB-migration default.
### Quality gate
- Everyone on the team can state the same answers to these questions:
- Is advisor a built-in tool? **Yes.**
- Can advisor run with action tools in the same batch? **No.**
- Does advisor get tools of its own? **No.**
- Is a DB migration required for MVP? **No, unless billing insists.**
## Phase 1 — Build the advisor runtime and tool wrapper
### Goals
Create the core advisor implementation in a way that is easy to test and keeps `chattool/` thin.
### Files to add
- `coderd/x/chatd/chatadvisor/types.go`
- `coderd/x/chatd/chatadvisor/guidance.go`
- `coderd/x/chatd/chatadvisor/handoff.go`
- `coderd/x/chatd/chatadvisor/runtime.go`
- `coderd/x/chatd/chatadvisor/runner.go`
- `coderd/x/chatd/chattool/advisor.go`
### Responsibilities by file
1. **`types.go`**
- Define the input/result schema used by the tool and UI.
- Keep the result shape close to Mux so the UI and model both have predictable cases.
- Recommended result variants:
- `advice`
- `limit_reached`
- `error`
Recommended shape:
```go
type AdvisorArgs struct {
	Question string `json:"question"`
}
type AdvisorResult struct {
	Type          string              `json:"type"`
	Advice        string              `json:"advice,omitempty"`
	Error         string              `json:"error,omitempty"`
	AdvisorModel  string              `json:"advisor_model,omitempty"`
	RemainingUses int                 `json:"remaining_uses,omitempty"`
	Usage         *AdvisorUsageResult `json:"usage,omitempty"`
}
```
2. **`guidance.go`**
- Hold two strings:
- the nested advisor system prompt;
- the parent-agent guidance block to inject into the outer system prompt.
- The nested advisor prompt must say, in plain language:
- you are advising the parent agent;
- you do not address the end user directly;
- you do not claim actions happened;
- you return concise strategic guidance and tradeoffs.
3. **`runtime.go`**
- Define the per-run runtime state.
- Recommended fields:
- resolved model + model config;
- provider keys/options reused from the outer chat;
- `MaxUsesPerRun`;
- `MaxOutputTokens`;
- atomic/current call counter;
- callback(s) to obtain the current prompt snapshot and current-step snapshot;
- optional metrics/usage hook.
- Add fail-fast validation for impossible config: nil model, non-positive limits, empty prompt builders, etc.
4. **`handoff.go`**
- Build the advisor handoff message from:
- the explicit question;
- the exact prompt/messages the parent model just used;
- the current step's text/reasoning snapshot, if available;
- the most recent relevant tool outputs, if they are already in the prompt snapshot.
- **Important:** use the already-prepared outer prompt tail, not a fresh DB reload. That keeps the advisor aligned with compaction and the exact context the outer model saw.
- Apply hard truncation budgets with recent-context bias.
5. **`runner.go`**
- Execute the nested advisor call.
- **Recommended implementation:** call `chatloop.Run()` in an in-memory, one-step mode:
- `Tools: nil`
- `ProviderTools: nil`
- `MaxSteps: 1`
- `PersistStep`: capture the assistant output in memory instead of writing DB rows
- Reuse the existing provider/model/cost path instead of building a second provider runner.
- Assert that no tool definitions are passed to the nested call.
6. **`chattool/advisor.go`**
- Keep this file thin and consistent with other built-ins.
- Responsibilities:
- decode `AdvisorArgs`;
- validate `Question` is non-empty and bounded;
- call the `chatadvisor` runner;
- return a structured tool response.
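The "hard truncation budgets with recent-context bias" mentioned for `handoff.go` could look roughly like the sketch below. The plan does not say whether budgets are measured in tokens or bytes; this sketch assumes byte length, and `truncateRecent` is a hypothetical name.

```go
package main

import "fmt"

// truncateRecent (hypothetical) keeps the newest messages whose combined
// byte length fits within budget, preserving their original order.
// Older messages are dropped first; the scan stops at the first message
// that would exceed the remaining budget.
func truncateRecent(messages []string, budget int) []string {
	start := len(messages)
	used := 0
	for i := len(messages) - 1; i >= 0; i-- {
		if used+len(messages[i]) > budget {
			break
		}
		used += len(messages[i])
		start = i
	}
	return messages[start:]
}

func main() {
	history := []string{"old context", "tool output", "latest step"}
	// Each entry is 11 bytes, so a 22-byte budget keeps the last two.
	fmt.Println(truncateRecent(history, 22))
}
```

Walking from the tail keeps the context closest to the current decision, which matches the plan's preference for the already-prepared prompt tail over a fresh reload.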
### Defensive programming requirements
- Assert `Question` is non-empty after trimming.
- Assert runtime limits are positive.
- Assert the nested advisor call runs with zero tools/provider tools.
- Assert `AdvisorResult.Type` is one of the known variants before returning.
- Assert remaining uses never goes negative.
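Two of the assertions above, sketched under assumptions: the plan only says the question must be "bounded", so `maxQuestionLen` is an invented limit, and both function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxQuestionLen is an assumed bound; the plan does not give a number.
const maxQuestionLen = 4000

// validateQuestion (hypothetical) trims the question and rejects empty or
// oversized input before any nested model call is made.
func validateQuestion(q string) (string, error) {
	q = strings.TrimSpace(q)
	if q == "" {
		return "", errors.New("advisor question must not be empty")
	}
	if len(q) > maxQuestionLen {
		return "", fmt.Errorf("advisor question exceeds %d bytes", maxQuestionLen)
	}
	return q, nil
}

// validResultType (hypothetical) reports whether t is one of the three
// result variants named in the plan.
func validResultType(t string) bool {
	switch t {
	case "advice", "limit_reached", "error":
		return true
	default:
		return false
	}
}

func main() {
	q, err := validateQuestion("  Should we shard the table?  ")
	fmt.Println(q, err, validResultType("advice"))
}
```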
### Acceptance criteria
- A unit test can call the advisor tool with a fake model and receive a stable `advice` result.
- The nested advisor call is impossible to run with tools accidentally attached.
- The core logic lives in `chatadvisor/`, not embedded inside `chatd.go`.
## Phase 2 — Wire advisor into chatd and keep prompt/tool availability in sync
### Goals
Register the tool in the right place, expose it only when eligible, and inject system guidance only when the tool is present.
### Files to modify
- `coderd/x/chatd/chatd.go`
- optionally a small helper file if `chatd.go` becomes too crowded
### Tasks
1. **Compute one eligibility boolean in `processChat()`.**
Recommended inputs:
- server-level advisor enabled flag;
- root chat only (`chat.ParentChatID == uuid.Nil` or equivalent existing root/child check);
- a usable resolved model/provider exists;
- optional experiment/workspace/org gate if product wants staged rollout.
2. **Create the runtime once per outer chat run.**
- Use the model/config/keys resolved by `resolveChatModel()`.
- Reuse provider options from the current chat's `ChatModelCallConfig`.
- Set `MaxUsesPerRun` and `MaxOutputTokens` from advisor config defaults.
3. **Register the tool in the built-in tool block.**
- Insert after the skill tools and before MCP tools in `processChat()`.
- Record `builtinToolNames["advisor"] = true` so metrics stay bounded.
4. **Inject advisor guidance into the outer system prompt using the same boolean.**
- Use `chatprompt.InsertSystem()` in the same prompt assembly path that already injects user/system instructions.
- Place the block near the existing instruction insertion, before plan-path/skill context blocks.
- Wrap the guidance in an explicit tag like `<advisor-guidance>` so it is easy to spot in tests and future refactors.
5. **Keep advisor out of child chats for the first release.**
- That avoids recursion/cost blowups with `spawn_agent` / `wait_agent` flows.
- Document this explicitly in the rollout notes and tests.
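The single eligibility boolean from tasks 1 and 4 can be sketched as below. The struct fields, the `assemble` helper, and the guidance text are illustrative assumptions and do not reflect chatd's actual types or prompt.

```go
package main

import "fmt"

// advisorInputs gathers the gating inputs listed in the plan. The field
// names are assumptions for this sketch.
type advisorInputs struct {
	ServerEnabled bool // server-level advisor enabled flag
	IsRootChat    bool // true when the chat has no parent chat
	ModelResolved bool // a usable model/provider was resolved
}

// advisorEligible is the one boolean that drives both tool registration
// and prompt guidance injection.
func advisorEligible(in advisorInputs) bool {
	return in.ServerEnabled && in.IsRootChat && in.ModelResolved
}

// assemble (hypothetical) hangs both side effects off the same check, so
// the tool list and the prompt cannot drift apart.
func assemble(in advisorInputs) (tools []string, systemPrompt string) {
	systemPrompt = "base instructions"
	if advisorEligible(in) {
		tools = append(tools, "advisor")
		systemPrompt += "\n<advisor-guidance>consult advisor before acting</advisor-guidance>"
	}
	return tools, systemPrompt
}

func main() {
	tools, prompt := assemble(advisorInputs{ServerEnabled: true, IsRootChat: true, ModelResolved: true})
	fmt.Println(tools)
	fmt.Println(prompt)
}
```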
### Acceptance criteria
- If advisor is disabled, neither the tool nor the prompt guidance appears.
- If advisor is enabled, both the tool and the prompt guidance appear.
- Root chats can use advisor; child chats cannot.
- Built-in tool names include `advisor` so metrics do not collapse it into the generic `mcp` label.
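The last criterion depends on how tool names map to metric labels. A minimal sketch of that mapping, assuming a lookup keyed by the `builtinToolNames` set; `metricLabel` is a hypothetical helper and the real labeling code may differ.

```go
package main

import "fmt"

// metricLabel (hypothetical) maps a tool name to a bounded metric label:
// known built-ins keep their own name, everything else shares "mcp".
func metricLabel(builtinToolNames map[string]bool, tool string) string {
	if builtinToolNames[tool] {
		return tool
	}
	return "mcp"
}

func main() {
	builtinToolNames := map[string]bool{"advisor": true}
	fmt.Println(metricLabel(builtinToolNames, "advisor"))
	fmt.Println(metricLabel(builtinToolNames, "some_external_tool"))
}
```

Because unknown names share one label, label cardinality stays bounded by the size of the built-in set regardless of how many external tools are attached.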
## Phase 3 — Enforce planning-only execution policy in `chatloop`
### Goals
Prevent the model from calling `advisor` and action tools in the same execution batch.
### Files to modify
- `coderd/x/chatd/chatloop/chatloop.go`
- related chatloop tests
### Recommended implementation
Keep the MVP small; do **not** build a general policy engine yet.
1. Add a minimal field to `chatloop.RunOptions`, for example:
```go
ExclusiveToolName *string
```
2. In `Run()` / `executeTools()`, detect the case where the exclusive tool appears in the same local-tool batch as any other locally executed tool.
3. When that happens, synthesize structured tool-result errors for the affected calls instead of executing anything in the batch.
- `advisor` should receive a clear error like: _advisor must be called by itself before action tools_.
- The sibling action tools should receive a paired policy error like: _this tool was skipped because advisor must run alone_.
4. Let the outer model see those tool errors and retry cleanly.
- This is simpler and safer than partial execution or hidden deferral.
- It preserves deterministic transcript history for debugging.
5. Pass the just-finished step snapshot into the tool execution context.
- The advisor runtime should be able to see the current step's text/reasoning content, because that is often the best hint about what the outer model is trying to decide.
### Why this is the right fit
- It matches the intended semantics: advisor is consulted **before** taking action.
- It avoids subtle race conditions caused by concurrent built-in tool execution.
- It keeps the behavior easy to test with fake models.
### Acceptance criteria
- A model-emitted batch containing only `advisor` succeeds.
- A model-emitted batch containing `advisor` plus any other locally executed tool returns deterministic policy errors and executes nothing.
- Non-advisor tool execution stays unchanged for normal chats.
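The mixed-batch check can be sketched as a pure function over the batch. `toolCall` and `toolResult` are simplified stand-ins rather than chatloop's real types, and `exclusivePolicyErrors` is a hypothetical name. The sketch takes the exclusive name as a plain string, where an empty string means no exclusive tool is configured.

```go
package main

import "fmt"

// toolCall and toolResult are simplified stand-ins for chatloop's types.
type toolCall struct {
	ID   string
	Name string
}

type toolResult struct {
	CallID  string
	IsError bool
	Content string
}

// exclusivePolicyErrors (hypothetical) returns one synthesized error result
// per call when the exclusive tool shares a batch with any other tool. It
// returns nil when the batch may execute normally.
func exclusivePolicyErrors(batch []toolCall, exclusive string) []toolResult {
	if exclusive == "" {
		return nil
	}
	hasExclusive, hasOther := false, false
	for _, c := range batch {
		if c.Name == exclusive {
			hasExclusive = true
		} else {
			hasOther = true
		}
	}
	if !hasExclusive || !hasOther {
		return nil
	}
	results := make([]toolResult, 0, len(batch))
	for _, c := range batch {
		msg := "this tool was skipped because " + exclusive + " must run alone"
		if c.Name == exclusive {
			msg = exclusive + " must be called by itself before action tools"
		}
		results = append(results, toolResult{CallID: c.ID, IsError: true, Content: msg})
	}
	return results
}

func main() {
	batch := []toolCall{{ID: "1", Name: "advisor"}, {ID: "2", Name: "execute"}}
	for _, r := range exclusivePolicyErrors(batch, "advisor") {
		fmt.Println(r.CallID, r.Content)
	}
}
```

Deciding on the whole batch before executing anything is what makes the outcome deterministic: either every call runs, or every call gets a policy error.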
## Phase 4 — Usage limits, metrics, and configuration
### Goals
Make advisor safe to operate without over-designing billing/storage in the first release.
### Files to modify
- `coderd/x/chatd/chatd.go`, including the `Config` struct and constructor path
- `coderd/x/chatd/chatloop/metrics.go` as needed
- optional follow-up config/db files only if a separate advisor model or persistent billing is required
### Tasks
1. **Add explicit server config knobs for MVP.**
Recommended fields on `chatd.Config` or a nested advisor config struct:
- `AdvisorEnabled bool`
- `AdvisorMaxUsesPerRun int`
- `AdvisorMaxOutputTokens int64`
2. **Track usage per outer run.**
- Reset the counter for each `processChat()` invocation.
- Return `remaining_uses` in the tool result.
- Return `limit_reached` when the cap is exhausted.
3. **Expose advisor usage metadata in the tool result.**
- Include model name and token/cost summary if available.
- Use the same `callConfig.Cost` calculation path as the outer chat for MVP if advisor reuses the same model.
4. **Record server-side metrics.**
- Count advisor invocations, failures, and latency.
- Ensure they show up under the built-in tool label `advisor`.
5. **Optional decision gate: separate advisor model.**
- If product insists on a stronger/different advisor model, add a follow-up config hook that resolves another existing chat model config through the same `configCache` path.
- Keep that out of the first landing PR unless it is required for acceptance.
6. **Optional decision gate: queryable advisor cost.**
- If this becomes required, spin a follow-up DB task:
- update `coderd/database/queries/*.sql`;
- add migration files;
- run `make gen`;
- update audit mappings if a new auditable type/field is introduced.
### Acceptance criteria
- Advisor calls are capped per outer run.
- Limit exhaustion is user-visible in the tool result.
- Metrics distinguish advisor calls from other built-in tools.
- MVP does not require a schema migration unless explicitly approved.
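The per-run cap can be sketched with an atomic counter, as the runtime section suggests. `usageCounter` and its methods are hypothetical names; the sketch assumes one counter is created per `processChat()` invocation, which is what resets the cap between runs.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// usageCounter (hypothetical) caps advisor calls for one outer run.
type usageCounter struct {
	max  int64
	used atomic.Int64
}

// newUsageCounter fails fast on a non-positive limit.
func newUsageCounter(max int64) (*usageCounter, error) {
	if max <= 0 {
		return nil, fmt.Errorf("max uses must be positive, got %d", max)
	}
	return &usageCounter{max: max}, nil
}

// acquire reserves one use. It returns the remaining uses and true on
// success, or 0 and false once the cap is exhausted. The compare-and-swap
// loop keeps the count from ever exceeding max under concurrent callers.
func (c *usageCounter) acquire() (remaining int64, ok bool) {
	for {
		cur := c.used.Load()
		if cur >= c.max {
			return 0, false
		}
		if c.used.CompareAndSwap(cur, cur+1) {
			return c.max - (cur + 1), true
		}
	}
}

func main() {
	c, err := newUsageCounter(2)
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		remaining, ok := c.acquire()
		fmt.Println(remaining, ok)
	}
}
```

A failed `acquire` maps to the `limit_reached` result variant. One detail worth checking against the recommended result shape: with `encoding/json`, a field tagged `omitempty` is dropped when its value is 0, so the last successful call would serialize without `remaining_uses`.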
## Phase 5 — Frontend rendering and Storybook coverage
### Goals
Make advisor feel intentional in the Agents UI without blocking the backend on fancy streaming UI.
### Files to modify
- `site/src/pages/AgentsPage/components/ChatElements/tools/Tool.tsx`
- new `site/src/pages/AgentsPage/components/ChatElements/tools/AdvisorTool.tsx`
- Storybook story file(s) in the same tools directory
### Delivery strategy
1. **Intermediate milestone during backend bring-up:** rely on the existing generic tool renderer if needed.
- This is acceptable only as a short-lived integration checkpoint.
2. **Release milestone:** add a dedicated lightweight `AdvisorTool` renderer.
- Reuse existing primitives:
- `ToolCollapsible`
- `ToolIcon`
- `Response` for markdown/prose rendering
- `ScrollArea` if the advice can be long
- Keep styling light and consistent with the Agents page.
- Do not add unnecessary React memoization in `site/src/pages/AgentsPage/`; that area is already React-Compiler aware.
3. **Render the structured result states cleanly.**
- `advice` — readable prose/markdown with optional metadata footer.
- `limit_reached` — warning-style message.
- `error` — error state with visible fallback text.
- `running` — existing tool loading state/spinner is enough for MVP.
4. **Add Storybook coverage instead of ad-hoc component tests.**
Recommended stories:
- successful advice;
- running/loading;
- limit reached;
- error.
5. **Keep the UI contract narrow.**
- Prefer one text field like `advice` plus small metadata rather than a deeply nested schema.
- That keeps the UI resilient to prompt iteration.
### Acceptance criteria
- The advisor tool card renders readable content rather than raw quoted JSON in the final release branch.
- Running, limit, and error states are visibly distinct.
- Storybook stories and play assertions cover the new states.
- Existing tool rendering flows remain unchanged.
## Phase 6 — Automated tests and validation gates
### Backend tests to add
1. **Advisor runtime/tool tests**
- question validation;
- tool-less nested execution assertion;
- success result shaping;
- limit-reached result shaping;
- error result shaping.
2. **Prompt/gating tests in chatd**
- advisor disabled ⇒ no tool, no guidance;
- advisor enabled/root chat ⇒ tool + guidance;
- child chat ⇒ advisor absent.
3. **Chatloop policy tests**
- advisor alone runs;
- advisor + action tool mixed batch returns deterministic policy errors;
- non-advisor tools still execute normally.
4. **Usage/metrics tests**
- per-run cap resets correctly;
- builtin tool labeling includes `advisor`;
- returned metadata includes model/usage summary when available.
### Frontend tests to add
- Storybook `play()` assertions for the advisor renderer states.
- Verify expand/collapse behavior and visible fallback text.
- Verify the message timeline still renders adjacent tools correctly.
### Recommended command sequence
Run these as the implementation matures, not only at the end:
1. Backend-focused gate after phases 1–4:
- `make test RUN=TestAdvisor`
- `make test RUN=TestChatloopAdvisor`
- `make lint`
2. Frontend-focused gate after phase 5:
- `pnpm test:storybook src/pages/AgentsPage/components/ChatElements/tools/AdvisorTool.stories.tsx`
- `pnpm lint`
- `pnpm format`
3. Final repo gate before handoff:
- `make pre-commit`
- run any additional targeted `make test RUN=...` selections covering touched chatd paths
> Use the exact new test names the implementing agents create; the names above are recommended anchors, not existing tests.
## Dogfooding plan
### Principle
Dogfood the change as a real agent feature, not just a unit-tested backend. Per the dogfood and `agent-browser` skills, the reviewer should get **watchable repro videos** plus screenshots that make the behavior obvious without reading logs.
### Required setup
1. Start the full dev environment with:
- `./scripts/develop.sh`
2. If the frontend renderer changes, also start Storybook from `site/` with:
- `pnpm storybook --no-open`
3. Use `agent-browser` directly — **never `npx agent-browser`**.
4. Use named browser sessions and an output folder such as:
- `./dogfood-output/advisor/`
- with subfolders `screenshots/` and `videos/`
### Evidence protocol
For every interactive scenario below:
1. Start video recording **before** the action.
2. Capture step-by-step screenshots at human pace.
3. Capture one annotated screenshot of the final state.
4. Stop the recording.
5. Note the exact pass/fail observation in the QA report.
For static UI states (for example Storybook error/limit cards), an annotated screenshot is sufficient; video is optional but still encouraged by this project’s review preference.
### Dogfood scenarios
#### Scenario A — Happy path in the real Agents UI
**Goal:** prove that a root agent chat can invoke advisor and produce a readable recommendation before taking further action.
Steps:
1. Open the Agents page with an advisor-enabled root chat.
2. Start a repro video.
3. Send a prompt that should reasonably trigger strategic planning, such as an architecture or multi-tradeoff question.
4. Capture screenshots of:
- the prompt before send;
- the running advisor state;
- the completed advisor card and the assistant’s follow-up response.
5. Stop recording.
Pass criteria:
- advisor appears in the timeline;
- the rendered result is readable;
- the assistant can continue after consuming the advisor output.
#### Scenario B — Advisor unavailable path
**Goal:** prove the feature is truly gated.
Suggested variants (at least one is required, both are better):
- feature flag/config off;
- child/sub-agent chat.
Evidence:
- annotated screenshot of the chat/tool state showing advisor is absent;
- short video if toggling the gate live is part of the repro.
Pass criteria:
- no advisor tool is available;
- no advisor-specific prompt behavior leaks through.
#### Scenario C — UI states in Storybook
**Goal:** prove the renderer handles non-happy states cleanly.
Required story states:
- success/advice;
- running;
- limit reached;
- error.
Evidence:
- one screenshot per state;
- at least one short video showing collapse/expand behavior.
Pass criteria:
- success renders readable advice;
- limit/error have visible fallback text;
- the component behaves like the other tool cards.
#### Scenario D — Regression sweep of nearby tools
**Goal:** ensure advisor does not break the surrounding chat timeline.
Check at minimum:
- another existing built-in tool still renders correctly near advisor;
- sub-agent/tool cards still expand/collapse normally;
- no obvious console errors appear in the Agents page during the advisor flow.
Evidence:
- screenshots of adjacent tool cards;
- console/error capture if anything suspicious appears.
### `agent-browser` usage notes for the QA agent
- Prefer `agent-browser batch` for 2+ sequential commands when no intermediate parsing is needed.
- Use `snapshot -i` to discover interactive refs.
- Re-snapshot after navigation or major DOM changes.
- Avoid `wait --load networkidle` unless the page is known to go idle; prefer explicit element/text waits or short fixed waits.
- Record videos at human pace and include pauses that a reviewer can follow.
## Rollout plan
### Initial rollout
- Gate behind a server-side advisor-enabled flag.
- Enable only for selected internal/root agent chats first.
- Watch metrics for:
- invocation count;
- failure rate;
- latency;
- obvious retry loops.
### Expansion conditions
Expand beyond the initial rollout only after the following are true:
- mixed-batch policy behavior is stable;
- cost impact is understood;
- frontend UX is readable in production-like dogfood;
- no recursion surprises have appeared with sub-agent flows.
### Explicit non-goals for the first release
- advisor inside child/sub-agent chats;
- provider-agnostic streaming phase UI;
- MCP-based external advisor implementation;
- mandatory DB-backed advisor cost reporting.
## Final acceptance checklist
- [ ] `advisor` is a built-in chatd tool, not an MCP/dynamic-tool substitute.
- [ ] The nested advisor call is tool-less and bounded to one in-memory step.
- [ ] One eligibility boolean controls both tool registration and prompt guidance injection.
- [ ] Root chats can use advisor; child chats cannot in the initial rollout.
- [ ] Mixed advisor/action batches produce deterministic policy errors instead of partial execution.
- [ ] Per-run usage caps and limit-reached behavior work.
- [ ] Advisor usage is visible in metadata/metrics without forcing a DB migration for MVP.
- [ ] The Agents UI has a readable advisor card and Storybook coverage.
- [ ] Dogfooding produced screenshots and repro videos for the required scenarios.
- [ ] Validation commands (`make lint`, targeted `make test`, Storybook tests, `make pre-commit`) passed before handoff.
## Suggested PR split
1. **PR 1 — Backend foundation**
- `chatadvisor/` package
- `chattool/advisor.go`
- `chatloop` exclusive policy
- chatd gating/prompt sync
- backend tests
2. **PR 2 — Frontend + QA**
- advisor renderer
- stories/play assertions
- dogfood artifacts and QA notes
3. **PR 3 — Optional follow-ups only if demanded by stakeholders**
- separate advisor model override
- persistent advisor billing/queryability
- transient phase-stream UX
</details>
---
_Generated with [`mux`](https://github.com/coder/mux) • Model: `anthropic:claude-opus-4-7` • Thinking: `max`_
|
||
|
|
dd49a818f9 |
fix: export chatd.Start to separate server lifecycle (#24761)
chatd.New() no longer auto-starts the acquire/wake loop. Callers that want chat processing call server.Start() explicitly. Tests that want a passive server skip Start(); heartbeat, stream janitor, and stale recovery still run. Closes coder/internal#1502 |
||
|
|
a876287d36 |
feat: auto-archive inactive chats with audit trail (#24642)
Adds a background job in `dbpurge` that periodically archives chats inactive beyond a configurable threshold. Each archived root chat gets a background audit entry tagged `chat_auto_archive`. Disabled by default. * New `AutoArchiveInactiveChats` SQL query with LATERAL last-activity subquery and partial index on archive candidates * `site_configs`-backed `auto_archive_days` setting with admin-only PUT, any-authenticated-user GET * Cascade archive via `root_chat_id`; pinned chats and active threads exempt * Root-only audit dispatch on detached context, matching manual archive (`patchChat`) behavior * 11 subtests covering disabled no-op, boundary, deleted messages, child activity, pinned exemption, multi-owner, idempotency, and batch pagination PR #24643 adds per-owner digest notifications. PR #24704 adds the requisite UI controls. > 🤖 |
||
|
|
3d90546aae |
feat: add general subagent model override (#24610)
Adds a deployment-wide admin override for general delegated subagents.
## What changed
- store the general override in `site_configs` and expose it through the
shared `agent-model-override/{context}` API
- apply the general override when spawning delegated general subagents,
while preserving the existing Explore override behavior
- reuse a shared Agents settings form for the general and Explore
override sections
## Validation
- `make gen`
- `go test ./coderd -run 'TestChatModelOverrides'`
- `go test ./coderd/x/chatd -run
'TestSpawnAgent_(GeneralUsesConfiguredModelOverride|GeneralOverrideLogsAndFallsBackWhenCredentialsUnavailable|GeneralOverrideLogsAndFallsBackWhenProviderDisabled)'`
- `pnpm -C site lint:types`
- `pnpm -C site test:storybook --
AgentSettingsAgentsPageView.stories.tsx`
- `make lint`
- `make pre-commit`
> Mux is acting on Mike's behalf.
|
||
|
|
411ed21059 | fix(coderd): omit frame-ancestors CSP for embed routes (#24529) | ||
|
|
410f9a5e19 |
feat: allow renaming of agent chat title (#24489)
Co-authored-by: Coder Agents <noreply@coder.com> |
||
|
|
18a30a7a10 | feat: add chat debug HTTP handlers and API docs (#23918) | ||
|
|
615be176b8 | fix(coderd): add frame-ancestors CSP directive to prevent clickjacking (#24474) | ||
|
|
73b5058923 |
feat: add Explore mode as subagent-only modality (#24448)
> This PR was authored by Mux on behalf of Mike. Introduce Explore mode, a read-only subagent modality for delegated discovery and code investigation. ## What Adds a `spawn_explore_agent` tool that creates child chats restricted to read-only operations. An admin can optionally configure a deployment-wide model override so Explore subagents use a model optimized for large context or reasoning without changing the root chat's model. ### Backend - New `ChatModeExplore` enum value (migration 000471). - `spawn_explore_agent` tool definition with read-only allowlist: `read_file`, `execute`, `process_output`, `read_skill`, `read_skill_file`. Write tools, file editors, and nested subagent spawning are blocked. - Deployment config storage for the Explore model override (`agents_chat_explore_model_override` in `site_configs`). - Model resolution hierarchy: configured override, then current turn model, then global default. Silent fallback with warning log when the override becomes unavailable. - RBAC: `AsChatd` for daemon reads, `ActionRead` and `ActionUpdate` on `ResourceDeploymentConfig` for admin API calls. - Plan mode root chats can use `spawn_explore_agent` for read-only research, matching the planning prompt guidance. - The Explore override config API now reports malformed saved overrides as "treated as unset" so admins can clear them explicitly. ### Frontend - `ExploreModelOverrideSettings` component in admin agent behavior settings. Uses `ModelSelector`, handles unavailable model warnings, and supports explicit Save and Clear actions. - Malformed saved overrides show a warning and require an explicit Save to clear, instead of Clear auto-submitting behind the scenes. ### Tests - Integration: `TestExploreSubagentIsReadOnly` (full spawn flow, tool verification, prompt overlay, DB state). - Unit: tool allowlist tests for explore, plan, and default modes. - Internal: model override resolution with valid, invalid UUID, disabled, and unconfigured override scenarios. 
- RBAC: `dbauthz_test.go` for `GetChatExploreModelOverride` and `UpsertChatExploreModelOverride`. - API: admin set and clear, malformed stored override reporting, disabled model rejection, non-admin denial. |
||
|
|
1cf0354f72 |
feat: add plan mode with restricted tool boundary (#24236)
> This PR was authored by Mux on behalf of Mike.
## Summary
- add persistent plan mode for chats and the chat-specific plan file flow
- add structured planning tools such as `ask_user_question` and `propose_plan`
- keep `write_file` and `edit_files` constrained to the chat-specific plan file during plan turns
- allow shell exploration in plan mode, including subagents, via `execute` and `process_output`
- block implementation-oriented, provider-native, MCP, dynamic, and computer-use tools during plan turns
- update the chat UI, tests, and docs for the new planning flow |
||
|
|
d7439a9de0 |
feat: add Prometheus metrics for chatd subsystem (#24371)
Adds 7 Prometheus metrics to the chatd subsystem and introduces typed
`ActivityBumpReason` for deadline bump attribution.
| Metric | Type | Labels |
|--------|------|--------|
| `coderd_chatd_chats` | Gauge | `state` (streaming, waiting) |
| `coderd_chatd_message_count` | Histogram | `provider` |
| `coderd_chatd_prompt_size_bytes` | Histogram | `provider` |
| `coderd_chatd_tool_result_size_bytes` | Histogram | `provider`,
`tool_name` |
| `coderd_chatd_ttft_seconds` | Histogram | `provider` |
| `coderd_chatd_compaction_total` | Counter | `provider`, `result` |
| `coderd_chatd_steps_total` | Counter | `provider` |
> 🤖
|
||
|
|
116323d3cf |
feat: graduate web-push from experiment to always-on (#24310)
* Removes experiment `web-push`.
* Falls back to NoopWebpusher in case of error
* Checks browser capability in FE
* Adds note to agents getting-started docs regarding webpush without TLS
> 🤖
|
||
|
|
95cff8c5fb |
feat: add REST API handlers and client methods for user secrets (#24107)
Add the five REST endpoints for managing user secrets, SDK client
methods, and handler tests.
Endpoints:
- `POST /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets/{name}`
- `PATCH /api/v2/users/{user}/secrets/{name}`
- `DELETE /api/v2/users/{user}/secrets/{name}`
Routes are registered under the existing `/{user}` group with
`ExtractUserParam`. The delete query was changed from `:exec` to
`:execrows` so the handler can distinguish "not found" from success
(DELETE with `:exec` silently returns nil for zero affected rows).
|
||
|
|
391b22aef7 |
feat: add CLI commands for managing chat context from workspaces (#24105)
Adds `coder exp chat context add` and `coder exp chat context clear` commands that run inside a workspace to manage chat context files via the agent token. `add` reads instruction and skill files from a directory (defaulting to cwd) and inserts them as context-file messages into an active chat. Multiple calls are additive — `instructionFromContextFiles` already accumulates all context-file parts across messages. `clear` soft-deletes all context-file messages, causing `contextFileAgentID()` to return `!found` on the next turn, which triggers `needsInstructionPersist=true` and re-fetches defaults from the agent. Both commands auto-detect the target chat via `CODER_CHAT_ID` (already set by `agentproc` on chat-spawned processes), or fall back to single-active-chat resolution for the agent. The `--chat` flag overrides both. Also adds sub-agent context inheritance: `createChildSubagentChat` now copies parent context-file messages to child chats at spawn time, so delegated sub-agents share the same instruction context without independently re-fetching from the workspace agent. 
<details><summary>Implementation details</summary> **New files:** - `cli/exp_chat.go` — CLI command tree under `coder exp chat context` **Modified files:** - `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context from an arbitrary directory without env vars - `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext` SDK methods - `coderd/workspaceagents.go` — POST/DELETE handlers on `/workspaceagents/me/chat-context` - `coderd/coderd.go` — Route registration - `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`, `SoftDeleteContextFileMessages` - `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new queries - `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent inheritance - `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()` **Auth pattern:** Uses `AgentAuth` (same as `coder external-auth`) — agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars. </details> > 🤖 Generated by Coder Agents --------- Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com> |
||
|
|
b969d66978 |
feat: add dynamic tools support for chat API (#24036)
Adds client-executed dynamic tools to the chat API. Dynamic tools are
declared by the client at chat creation time, presented to the LLM
alongside built-in tools, but executed by the client rather than chatd.
This enables external systems (Slack bots, IDE extensions, Discord bots,
CI/CD integrations) to plug custom tools into the LLM chat loop without
modifying chatd's built-in tool set.
Modeled after OpenAI's Assistants API: the chat pauses with
`requires_action` status when the LLM calls a dynamic tool, the client
POSTs results back via `POST /chats/{id}/tool-results`, and the chat
resumes.
See [this example](https://github.com/coder/coder-slackbot-poc) as a
reference for how this is used. It is highly configurable, which would
enable creating chats from webhooks, polling periodically, or running as
a Slackbot.
<details>
<summary>Design context</summary>
### Architecture
The chatloop **exits** when it encounters dynamic tools and
**re-enters** when results arrive. No blocking channels, no pubsub for
tool results, no in-memory registry. The DB is the only coordination
mechanism.
```
Phase 1 (chatloop):
LLM response → execute built-in tools only →
Persist(assistant + built-in results) →
status = requires_action → chatloop exits
Phase 2 (POST /tool-results):
Persist(dynamic tool results) →
status = pending → wakeCh → chatloop re-enters
```
### Validation (POST /tool-results)
1. Chat status must be `requires_action` (409 if not)
2. Read chat's `dynamic_tools` → set of dynamic tool names
3. Read last assistant message → extract tool-call parts matching
dynamic tool names
4. Submitted tool_call_ids must match exactly (400 for missing/extra)
5. Persist tool-result message parts, set status to `pending`, signal
wake
### Idempotency
Tool call IDs are scoped per LLM step. The state machine
(`requires_action` → `pending`) is the guard: the first POST wins, and
subsequent POSTs get a 409.
### Mixed tool calls
When the LLM calls both built-in and dynamic tools in one step, built-in
tools execute immediately. Their results are persisted in phase 1.
Dynamic tool results arrive via POST in phase 2. The LLM sees all
results when the chatloop resumes.
</details>
> 🤖 Generated by Coder Agents
|
||
|
|
233343c010 |
feat: add chat and chat_files cleanup to dbpurge (#23833)
Fixes https://github.com/coder/coder/issues/23910 Adds periodic cleanup of chats and chat files to the dbpurge background goroutine, with a configurable retention period exposed in the Agent settings UI. > 🤖 Written by a Coder Agent. Reviewed by a human. |
||
|
|
919dc299fc |
feat: agent reads context files and discovers skills locally (#23935)
Piggybacks on #23878. Moves instruction file reading and skill discovery from `chatd` (server-side, via multiple `LS`/`ReadFile` round-trips through the agent connection) to the agent itself (local filesystem access). This intentionally drops backward compatibility with older agents that don't support the context-config endpoint. Agents and server are deployed together; there is no rolling-update contract to maintain here.
## What changed
The agent's `GET /api/v0/context-config` response now returns `[]ChatMessagePart` directly, the same types chatd persists. This eliminates intermediate type conversions and makes the protocol extensible.
| Field | Type | Description |
|---|---|---|
| `parts` | `[]ChatMessagePart` | Context-file and skill parts, ready to persist |
| `working_dir` | `string` | Agent's resolved working directory |
Removed from the response: `instructions_dirs`, `instructions_file`, `skills_dirs`, `skill_meta_file`, `mcp_config_files`. The agent reads files locally and returns their content as parts.
Removed from chatd: all legacy `LS`/`ReadFile` fallback code (`readHomeInstructionFile`, `readInstructionDirFile`, `DiscoverSkills` via LS, etc).
## Why
The previous architecture had the agent resolve paths, serve them over HTTP, then `chatd` make N+1 round-trips back through the agent connection to read files. The agent has direct filesystem access and should just read the files.
## Key design decisions
- **Agent returns `ChatMessagePart` directly**: same types chatd persists. No intermediate `InstructionFileEntry`/`SkillEntry` types needed.
- **`SkillMeta.MetaFile`**: persisted via `ContextFileSkillMetaFile` on the skill part, so custom meta file names (`CODER_AGENT_EXP_SKILL_META_FILE`) survive across chat turns.
- **No pre-read body**: `read_skill` always dials the workspace to fetch the skill body on demand. Simpler than caching the body in the response.
- **MCP config paths kept agent-internal**: `MCPConfigFiles()` getter, not sent over the wire.
- **No backward compat fallback**: old agents that don't support context-config get no instruction files. This is acceptable since agent and server deploy together. |
||
|
|
7d0a0c6495 | feat: provider key policies and user provider settings (#23751) | ||
|
|
2312e5c428 |
feat: add manual chat title regeneration (#23633)
## Summary
Adds a "Generate new title" action that lets users manually regenerate a
chat's title using richer conversation context than the automatic
first-message title path.
## Changes
### Backend
- **New endpoint:** `POST
/api/experimental/chats/{chatID}/title/regenerate` returns the updated
Chat with a regenerated title
- **Manual title algorithm:** Extracts useful user/assistant text turns
→ selects first user turn + last 3 turns → builds context with gap
markers → renders prompt with anti-recency guidance → calls lightweight
model → normalizes output
- **Helpers:** `extractManualTitleTurns`,
`selectManualTitleTurnIndexes`, `buildManualTitleContext`,
`renderManualTitlePrompt`, `generateManualTitle` — all private, with the
public `Server.RegenerateChatTitle` method
- **SDK:** `ExperimentalClient.RegenerateChatTitle(ctx, chatID) (Chat,
error)`
- Persists title via existing `UpdateChatByID` and broadcasts
`ChatEventKindTitleChange`
### Frontend
- API client method + React Query mutation with cache invalidation
- "Generate new title" menu item (with wand icon) in both TopBar and
Sidebar dropdown menus
- Loading/disabled state while regeneration is in-flight
- Error toast on failure
- Stories updated for both menus
### Tests
- `quickgen_test.go`: Table-driven tests for all 4 helper functions
(turn extraction, index selection, context building, prompt rendering)
- `exp_chats_test.go`: Handler tests (ChatNotFound,
NotFoundForDifferentUser, NoDaemon)
## Design notes
- The existing auto-title path (`maybeGenerateChatTitle`, `titleInput`)
is completely unchanged
- Manual regeneration uses richer context (first user turn + last 3
turns + gap markers) vs the auto path's single first message
- Endpoint is experimental and marked with `@x-apidocgen {"skip": true}`
|
||
|
|
d175e799da |
feat: show agent badge on workspace list (#23453)
- Adds a `GET /api/experimental/chats/by-workspace` endpoint that returns a workspace_id → latest chat_id mapping
- Modifies the frontend to fetch this alongside the workspace list (gated on the `agents` experiment) and to render an "Agent" badge similar to the existing "Task" badge in `WorkspacesTable`
- The badge links to the latest chat linked to the given workspace.

Notes:
- Intentionally uses `fetchWithPostFilter` for RBAC to decouple from the workspaces API; this will migrate to the `workspaces_expanded` view later.
- If users have multiple chats linked to the same workspace, the badge links to the most recently updated one.

> 🤖 This PR was created with the help of Coder Agents, and has been reviewed by my human. 🧑💻 |
||
|
|
796872f4de |
feat: add deployment-wide template allowlist for chats (#23262)
- Stores a deployment-wide agents template allowlist in `site_configs` (`agents_template_allowlist`) - Adds `GET/PUT /api/experimental/chats/config/template-allowlist` endpoints - Filters `list_templates`, `read_template`, and `create_workspace` chat tools by allowlist, if defined (empty=all allowed) - Add "Templates" admin settings tab in Agents UI ([what it looks like](https://624de63c6aacee003aa84340-sitjilsyrr.chromatic.com/?path=/story/pages-agentspage-agentsettingspageview--template-allowlist)) > 🤖 This PR was created with the help of Coder Agents, and has been reviewed by my human. 🧑💻 |
||
|
|
c0a323a751 |
fix(coderd): use DB liveness for chat workspace reuse (#23551)
create_workspace could create a replacement workspace after a single 5s agent dial failed, even when the existing workspace agent had recently checked in. That made temporary reachability blips look like dead workspaces and let chatd replace a running workspace too aggressively. Use the workspace agent's DB-backed status with the deployment's AgentInactiveDisconnectTimeout before allowing replacement. Recently connected and still-connecting agents now reuse the existing workspace, while disconnected or timed-out agents still allow a new workspace. This also threads the inactivity timeout through chatd and adds focused coverage for the reuse and replacement branches. |
||
|
|
82f965a0ae |
feat: per-user per-model chat compaction threshold overrides (#23412)
## What
Adds per-user per-model auto-compaction threshold overrides. Users can
now customize the percentage of context window usage that triggers chat
compaction, independently for each enabled model.
## Why
The compaction threshold was previously only configurable at the
deployment level (`chat_model_configs.compression_threshold`). Different
users have different preferences — some want aggressive compaction to
keep costs low, others prefer higher thresholds to retain more context.
This gives users control without requiring admin intervention.
## Architecture
**Storage:** Reuses the existing `user_configs` table (no migration
needed). Overrides are stored as key/value pairs with keys shaped
`chat_compaction_threshold:<modelConfigID>` and integer percent values.
**API:** Three new experimental endpoints under
`/api/experimental/chats/config/`:
- `GET /user-compaction-thresholds` — list all overrides for the current
user
- `PUT /user-compaction-thresholds/{modelConfig}` — upsert an override
(validates model exists and is enabled, validates 0–100 range)
- `DELETE /user-compaction-thresholds/{modelConfig}` — clear an override
(idempotent)
**Runtime resolution:** In `coderd/chatd/chatd.go`, a new
`resolveUserCompactionThreshold()` helper runs at the start of each chat
turn (inside `runChat()`), after the model config is resolved but before
`CompactionOptions` is built. If a valid override exists, it replaces
`modelConfig.CompressionThreshold`. The threshold source
(`user_override` vs `model_default`) is logged with each compaction
event.
**Precedence:** `effectiveThreshold = userOverride ??
modelConfig.CompressionThreshold`
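A minimal sketch of the precedence rule, assuming overrides are available as a plain key/value map. The function name and the map-based lookup are hypothetical; only the key shape, the 0–100 range, and the `user_override`/`model_default` source labels come from the description above.

```go
package main

import (
	"fmt"
	"strconv"
)

// keyPrefix follows the key shape chat_compaction_threshold:<modelConfigID>.
const keyPrefix = "chat_compaction_threshold:"

// effectiveThreshold returns the threshold to use and where it came from.
// userConfigs stands in for the user's rows in user_configs (an assumption).
// Missing, malformed, or out-of-range overrides fall back to the model default.
func effectiveThreshold(userConfigs map[string]string, modelConfigID string, modelDefault int) (int, string) {
	raw, ok := userConfigs[keyPrefix+modelConfigID]
	if !ok {
		return modelDefault, "model_default"
	}
	v, err := strconv.Atoi(raw)
	if err != nil || v < 0 || v > 100 {
		return modelDefault, "model_default"
	}
	return v, "user_override"
}

func main() {
	cfg := map[string]string{keyPrefix + "model-a": "85"}
	fmt.Println(effectiveThreshold(cfg, "model-a", 70)) // override present
	fmt.Println(effectiveThreshold(cfg, "model-b", 70)) // no override
}
```

Returning the source alongside the value makes it easy to log which path was taken for each compaction event.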
**UI:** New "Context Compaction" subsection in the Agents → Settings →
Behavior tab, placed after Personal Instructions. Shows one row per
enabled model with the system default, a number input for the override,
and Save/Reset controls.
## Testing
- 9 API subtests covering CRUD, validation (boundary values 0/100,
out-of-range rejection), upsert behavior, idempotent delete, user
isolation, and non-existent model config
- 4 dbauthz tests (16 scenarios) verifying `ActionReadPersonal` /
`ActionUpdatePersonal` on all query methods
- 4 Storybook stories with play functions (Default, WithOverrides,
Loading, Error)
<details>
<summary>Implementation plan</summary>
### Phase 1 — Tests
- Backend API tests in `coderd/chats_test.go` (9 subtests)
- Database auth wrapper tests in
`coderd/database/dbauthz/dbauthz_test.go` (4 methods)
- Frontend stories in `UserCompactionThresholdSettings.stories.tsx` (4
stories)
### Phase 2 — Backend preference surface
- 4 SQL queries in `coderd/database/queries/users.sql` (list, get,
upsert, delete)
- `make gen` to propagate into generated artifacts
- Auth/metrics wrappers in dbauthz and dbmetrics
- SDK types and client methods in `codersdk/chats.go`
- HTTP handlers and routes in `coderd/chats.go` and `coderd/coderd.go`
- Key prefix constant shared between handlers and runtime
### Phase 3 — Runtime override
- `resolveUserCompactionThreshold()` helper in `coderd/chatd/chatd.go`
- Override injection in `runChat()` before building `CompactionOptions`
- `threshold_source` field added to compaction log
### Phase 4 — Settings UI
- API client methods and React Query hooks in `site/src/api/`
- `UserCompactionThresholdSettings` component extracted from
`SettingsPageContent`
- Per-model mutation tracking (only the active row disables during save)
- 100% warning, "System default" label, helpful empty state copy
### Phase 5 — Refactor and review fixes
- Consolidated key prefix constant in `codersdk`
- Explicit PUT range validation (not just struct tags)
- GET handler gracefully skips malformed rows instead of 500
- Boundary value, upsert, and non-existent model config tests
- UX improvements: per-model mutation state, aria-live on errors
</details>
|
||
|
|
80a172f932 |
chore: move chatd and related packages to /x/ subpackage (#23445)
- Moves `coderd/chatd/`, `coderd/gitsync/`, `enterprise/coderd/chatd/` under `x/` parent directories to signal instability - Adds `Experimental:` glue code comments in `coderd/coderd.go` > 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑💻 |
||
|
|
ea37f1ff86 |
feat: pass session token as query param on agent chat WebSockets (#23405)
## Problem
When the Coder chat UI is embedded in a VS Code webview, the session token is set via the `Coder-Session-Token` header for HTTP requests. However, browsers cannot attach custom headers to WebSocket connections, and the VS Code Electron webview environment does not support cookies set via `Set-Cookie` from iframe origins. This causes all chat WebSocket connections to fail with authorization errors.
## Solution
Pass the session token as a `coder_session_token` query parameter on all chat-related WebSocket connections. The backend already accepts this parameter (see `APITokenFromRequest` in `coderd/httpmw/apikey.go`). The token is only included when `API.getSessionToken()` returns a value, which only happens in the embed bootstrap flow. Normal browser sessions use cookies and are unaffected.
> Built with [Coder Agents](https://coder.com/agents) |
||
|
|
ff8dcca2c7 |
feat: add global chat workspace TTL setting (#23265)
- Add `agents_workspace_ttl` site config (default: whatever the template says a.k.a. `0s`) - Expose via GET/PUT `/api/experimental/chats/config/workspace-ttl` - Chat tool reads setting and passes `TTLMillis` on workspace creation - Existing autostop infrastructure handles the rest (zero changes to LifecycleExecutor, CalculateAutostop, or activity bumping) - ⚠️ Template-level `UserAutostopEnabled=false` overrides this global default. Not touching this. - Frontend: "Workspace Lifetime" control in /agents/settings Behavior tab (admin-only) > This PR was created with the help of Coder Agents, and has been reviewed by several humans and robots. 🤖🤝🧑💻 |
||
|
|
83809bb380 |
feat: add token-to-cookie endpoint for embedded chat WebSocket auth (#23280)
## Problem The VS Code extension embeds the Coder agent chat UI in an iframe, passing the session token via `postMessage`. HTTP requests use the `Coder-Session-Token` header, but browser WebSocket connections **cannot carry custom headers** — they rely on cookies. This causes all WebSocket requests (e.g. streaming chat messages) to fail with authorization errors in the embedded iframe. ## Solution Add `POST /api/v2/users/me/session/token-to-cookie` — a lightweight endpoint that converts the current (already-validated) session token into a `Set-Cookie` response. The frontend embed bootstrap flow calls this immediately after `API.setSessionToken(token)`, before any WebSocket connections are opened. ### Backend (`coderd/userauth.go`, `coderd/coderd.go`) - New handler `postSessionTokenCookie` behind `apiKeyMiddleware`. - Reads the validated token via `httpmw.APITokenFromRequest(r)`. - Sets an `HttpOnly` cookie with the API key's expiry, applying site-wide cookie config (Secure, SameSite, host prefix) via `HTTPCookies.Apply`. - Returns `204 No Content`. ### Frontend (`site/src/pages/AgentsPage/EmbedContext.tsx`) - `bootstrapChatEmbedSessionFn` now calls the new endpoint after setting the header token and before fetching user/permissions. - The cookie is in place before any WebSocket connections are opened. ## Security - **No privilege escalation**: The token is already valid — this just moves it from a header credential to a cookie credential. - **POST only**: Avoids CSRF-via-navigation. - **Same origin**: The iframe loads from the Coder server, so the cookie applies to the correct domain. - **HttpOnly**: The cookie is not accessible to JavaScript. > Built with [Coder Agents](https://coder.com/agents) 🤖 |
||
|
|
d8ff67fb68 |
feat: add MCP server configuration backend for chats (#23227)
## Summary
Adds the database schema, API endpoints, SDK types, and encryption
wrappers for admin-managed MCP (Model Context Protocol) server
configurations that chatd can consume. This is the backend foundation
for allowing external MCP tools (Sentry, Linear, GitHub, etc.) to be
used during AI chat sessions.
## Database
Two new tables:
- **`mcp_server_configs`**: Admin-managed server definitions with URL,
transport (Streamable HTTP / SSE), auth config (none / OAuth2 / API key
/ custom headers), tool allow/deny lists, and an availability policy
(`force_on` / `default_on` / `default_off`). Includes CHECK constraints
on transport, auth_type, and availability values.
- **`mcp_server_user_tokens`**: Per-user OAuth2 tokens for servers
requiring individual authentication. Cascades on user/config deletion.
New column on `chats` table:
- **`mcp_server_ids UUID[]`**: Per-chat MCP server selection, following
the same pattern as `model_config_id` — passed at chat creation,
changeable per-message with nil-means-no-change semantics.
## API Endpoints
All routes are under `/api/experimental/mcp/servers/` and gated behind
the `agents` experiment.
**Admin endpoints** (`ResourceDeploymentConfig` auth):
- `POST /` — Create MCP server config
- `PATCH /{id}` — Update MCP server config (full-replace)
- `DELETE /{id}` — Delete MCP server config
**Authenticated endpoints** (all users, enabled servers only for
non-admins):
- `GET /` — List configs (admins see all, members see enabled-only with
admin fields redacted)
- `GET /{id}` — Get config by ID (with `auth_connected` populated
per-user)
**OAuth2 per-user auth flow:**
- `GET /{id}/oauth2/connect` — Initiate OAuth2 flow (state cookie CSRF
protection)
- `GET /{id}/oauth2/callback` — Handle OAuth2 callback, store tokens
- `DELETE /{id}/oauth2/disconnect` — Remove stored OAuth2 tokens
## Security
- **Secrets never returned**: `OAuth2ClientSecret`, `APIKeyValue`, and
`CustomHeaders` are never in API responses — only boolean indicators
(`has_oauth2_secret`, `has_api_key`, `has_custom_headers`).
- **Field redaction for non-admins**: `convertMCPServerConfigRedacted`
strips `OAuth2ClientID`, auth URLs, scopes, and `APIKeyHeader` from
non-admin responses.
- **dbcrypt encryption at rest**: All 5 secret fields use `dbcrypt_keys`
encryption with full encrypt-on-write / decrypt-on-read wrappers (11
dbcrypt method overrides + 2 helpers), following the same pattern as
`chat_providers.api_key`.
- **OAuth2 CSRF protection**: State parameter stored in `HttpOnly`
cookie with `HTTPCookies.Apply()` for correct `Secure`/`SameSite` behind
TLS-terminating proxies.
- **dbauthz authorization**: All 18 querier methods have authorization
wrappers. Read operations use `ActionRead`, write operations use
`ActionUpdate` on `ResourceDeploymentConfig`.
## Governance Model
| Control | Implementation |
|---------|---------------|
| **Global kill switch** | `enabled` defaults to `false` |
| **Availability policy** | `force_on` (always injected), `default_on`
(pre-selected), `default_off` (opt-in) |
| **Per-chat selection** | `mcp_server_ids` on `CreateChatRequest` /
`CreateChatMessageRequest` |
| **Auth gate** | OAuth2 servers require per-user auth before tools are
injected |
| **Tool-level allow/deny** | Arrays on `mcp_server_configs` for
granular tool filtering |
| **Secrets encrypted at rest** | Uses `dbcrypt_keys` (same pattern as
`chat_providers.api_key`) |
## Tests
8 test functions covering:
- Full CRUD lifecycle (create, list, update, delete)
- Non-admin visibility filtering (enabled-only, field redaction)
- `auth_connected` population for OAuth2 vs non-OAuth2 servers
- Availability policy validation (valid values + invalid rejection)
- Unique slug enforcement (409 Conflict)
- OAuth2 disconnect idempotency
- Chat creation with `mcp_server_ids` persistence
## Known Limitations (Deferred)
These are documented and intentional for an experimental feature:
- **Audit logging** not yet wired — will add when feature stabilizes
- **Cross-field validation** (e.g., OAuth2 fields required when
`auth_type=oauth2`) — admin-only endpoint, will add when stabilizing
- **`force_on` auto-injection** — query exists but not yet wired into
chatd tool injection (follow-up)
- **Additional test coverage** — 403 auth tests, GET-by-ID tests,
callback CSRF tests planned for follow-up
## What's NOT in this PR
- Frontend UI (admin panel + chat picker)
- Actual MCP client connections (`chatd/chatmcp/` manager)
- Tool injection into `chatloop/`
|
||
|
|
7a98b4a876 |
fix(coderd): gate OAuth2 well-known endpoints behind experiment flag (#23278)
- Add `RequireExperimentWithDevBypass` middleware to `/.well-known/oauth-authorization-server` and `/.well-known/oauth-protected-resource` routes, matching the existing `/oauth2` routes. - Clients can now detect OAuth2 support via unauthenticated discovery (404 = not available). Fixes #21608 |
||
|
|
be1c06dec9 |
feat: add endpoint and CLI for users to view their own OIDC claims (#23053)
- Adds a new API endpoint, `GET /api/v2/users/oidc-claims`, that returns only the **merged claims** (not the separate id_token/userinfo breakdown). It is scoped exclusively to the authenticated user's own identity: there is no user parameter, so users cannot view each other's claims.
- Adds a new CLI command, `coder users oidc-claims`, that calls the above endpoint.
- The existing owner-only debug endpoint is preserved unchanged for admins who need the full claim breakdown.

> 🤖 This PR was created with the help of Coder Agents, and will be reviewed by my human. 🧑💻 |
||
|
|
14ed3e3644 |
feat: bump workspace last_used_at on chat heartbeat (#23205)
- coderd: Wires `options.WorkspaceUsageTracker` into the chatd config.
- chatd: Adds `UsageTracker` and calls `UsageTracker.Add(workspaceID)` on each heartbeat tick
- chatd: adds tests to verify `last_used_at` bump behaviour
> 🤖 This PR was created with the help of Coder Agents, and will be reviewed by my human. 🧑💻
|
||
|
|
90cf4f0a91 |
refactor: consolidate chat streaming endpoints under /stream (#23248)
Moves per-chat streaming/watch endpoints under a `/stream` sub-route for
better API consistency:
| Before | After |
|--------|-------|
| `GET /{chat}/stream` | `GET /{chat}/stream/` |
| `GET /{chat}/desktop` | `GET /{chat}/stream/desktop` |
| `GET /{chat}/git/watch` | `GET /{chat}/stream/git` |
### Changes
- **`coderd/coderd.go`** — Route definitions: replaced flat routes with
`r.Route("/stream", ...)` sub-router
- **`site/src/api/api.ts`** — Updated WebSocket URLs for `watchChatGit`
and `watchChatDesktop`
- **`coderd/chats_test.go`** — Updated desktop test URL
- **`coderd/workspaceagents_internal_test.go`** — Updated git watcher
test URLs (route mounts + dial URLs)
- **`site/src/pages/AgentsPage/AgentDetail.stories.tsx`** — Updated
storybook WebSocket mock paths
|
||
|
|
0b13ba978a |
fix: rename chat logger from coderd.chats.chat-processor to coderd.chatd.processor (#23246)
- Rename logger `coderd.chats` to `coderd.chatd` in `coderd.go`
- Rename sub-logger `chat-processor` to `processor` in `chatd/chatd.go`
|
||
|
|
d6fef96d72 |
feat: add PR insights analytics dashboard (#23215)
## What
Adds a new admin-only **PR Insights** page for the `/agents` analytics view: a dashboard for engineering leaders to understand code shipped by AI agents.
### Backend
- `GET /api/v2/chats/insights/pull-requests` (admin-only endpoint)
- 4 SQL queries in `chatinsights.sql` aggregating `chat_diff_statuses` joined with chat cost data (via root chat tree rollup)
- Runs 5 parallel DB queries: current summary, previous summary (for trends), time series, per-model breakdown, recent PRs
- SDK types auto-generate to TypeScript
### Frontend (`PRInsightsView`)
- **Stat cards**: PRs created, Merged, Merge rate, Lines shipped, Cost/merged PR, with trend badges comparing to previous period
- **Activity chart**: Stacked area chart (created/merged/closed) using git color tokens (`git-added-bright`, `git-merged-bright`, `git-deleted-bright`)
- **Model performance table**: Per-model PR counts, inline merge rate bars, diff stats, cost breakdown
- **Recent PRs table**: Status badges, review state icons, author info, external links
- **Time range filter**: 7d/14d/30d/90d button group
- **4 Storybook stories**: Default, HighPerformance, LowVolume, NoPRs
### Data source
All PR data comes from the existing `chat_diff_statuses` table (populated by the `gitsync.Worker` background job that polls GitHub every 120s). No new data collection required.
### Screenshot
View in Storybook: `pages/AgentsPage/PRInsightsView`
|
||
|
|
fc3508dc60 |
feat: configure acquire chat batch size (#23196)
## Summary
- add a hidden deployment config option for chat acquire batch size (`CODER_CHAT_ACQUIRE_BATCH_SIZE` / `chat.acquireBatchSize`)
- thread the configured value into chatd startup while preserving the existing default of `10`
- clamp the deployment value to the `int32` range before passing it into chatd
- regenerate the API/docs/types/testdata artifacts for the new config field
## Why
`chatd` currently acquires pending chats in batches of `10` via a compile-time default. This change makes that batch size operator-configurable from deployment config, so we can tune acquisition behavior without another code change.
|
||
|
|
2cf47ec384 |
feat: virtual desktop settings toggle backend (#23171)
Adds a new `site_config` entry that controls whether the virtual desktop feature for Coder Agents is enabled. It can be set via a new `/api/experimental/chats/config/desktop-enabled` endpoint, which will be used by the frontend. |
||
|
|
075dfecd12 |
refactor: consolidate experimental chats API types (#23143)
## Summary
Consolidates three areas of type duplication in the experimental chats
API:
### 1. Merge archive/unarchive into `PATCH /{chat}`
- **Before:** `POST /{chat}/archive` + `POST /{chat}/unarchive` (two
endpoints, two handlers with mirrored logic)
- **After:** `PATCH /{chat}` accepting `{ "archived": true/false }` via
`UpdateChatRequest`
- Removes one endpoint and ~30 lines of duplicated handler code
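A minimal sketch of how one `PATCH` body can replace the two mirrored handlers. `UpdateChatRequest` is named in the commit, but its field layout here (a pointer so that an omitted `archived` can be told apart from `false`) is an assumption, and `chat` and `applyUpdate` are hypothetical stand-ins for the real handler.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// UpdateChatRequest is named in the commit; the pointer field is an
// assumption that lets the handler distinguish "absent" from "false".
type UpdateChatRequest struct {
	Archived *bool `json:"archived,omitempty"`
}

// chat is a hypothetical stand-in for the stored chat record.
type chat struct {
	Archived bool
}

// applyUpdate decodes a PATCH body and applies it, so archive and
// unarchive share a single code path.
func applyUpdate(c *chat, body []byte) error {
	var req UpdateChatRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return err
	}
	if req.Archived == nil {
		return errors.New("archived is required")
	}
	c.Archived = *req.Archived
	return nil
}

func main() {
	c := &chat{}
	fmt.Println(applyUpdate(c, []byte(`{"archived": true}`)), c.Archived)
	fmt.Println(applyUpdate(c, []byte(`{"archived": false}`)), c.Archived)
}
```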
### 2. Collapse identical request/response prompt types
- `ChatSystemPromptResponse` + `UpdateChatSystemPromptRequest` →
`ChatSystemPrompt`
- `UserChatCustomPromptResponse` + `UpdateUserChatCustomPromptRequest` →
`UserChatCustomPrompt`
- These pairs were field-for-field identical (single string field)
### 3. Merge duplicate reasoning options types
- `ChatModelOpenRouterReasoningOptions` +
`ChatModelVercelReasoningOptions` → `ChatModelReasoningOptions`
- Same 4 fields, same types — only field ordering and enum value sets
differed
- Unified type uses the superset of enum values
### Files changed
- `codersdk/chats.go` — SDK types and client methods
- `coderd/chats.go` — Handler consolidation
- `coderd/coderd.go` — Route change
- `coderd/chats_test.go` — Test updates
- `site/src/api/api.ts` — Frontend API client
- `site/src/api/queries/chats.ts` — Query mutations
- `site/src/api/queries/chats.test.ts` — Test mocks
- `site/src/pages/AgentsPage/AgentsPage.tsx` — Call site
- Generated files (`typesGenerated.ts`,
`chatModelOptionsGenerated.json`)
### Testing
- All Go tests pass (`TestArchiveChat`, `TestUnarchiveChat`,
`TestChatSystemPrompt`)
- All frontend tests pass (31/31 in `chats.test.ts`)
|
||
|
|
1031da9738 |
feat: add agent chat spend limiting (backend) (#23071)
Introduces deployment-scoped spend limiting for Coder Agents, enabling administrators to control LLM costs at global, group, and individual user levels.
## Changes
- **Database migration (000437)**: `chat_usage_limit_config` (singleton), `chat_usage_limit_overrides` (per-user), `chat_usage_limit_group_overrides` (per-group)
- **Single-query limit resolution**: individual override > min(group) > global default via `ResolveUserChatSpendLimit`
- **Fail-open enforcement** in chatd with documented TOCTOU trade-off
- **Experimental API** under `/api/experimental/chats/usage-limits` for CRUD on limits
- **`AsChatd` RBAC subject** for narrowly-scoped daemon access (replaces `AsSystemRestricted`)
- **Generated TypeScript types** for the frontend SDK
## Hierarchy
1. Individual user override (highest)
2. Minimum of group limits
3. Global default
4. Disabled / unlimited

Currency stored as micro-dollars (`1,000,000` = $1.00).
Frontend PR: #23072
|
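The limit hierarchy in #23071 can be illustrated in plain Go. This is only an in-memory sketch of the precedence order; the commit resolves the limit in a single SQL query (`ResolveUserChatSpendLimit`), and the function name and signature below are hypothetical.

```go
package main

import "fmt"

// resolveSpendLimit is a hypothetical in-memory illustration of the
// precedence described in the commit:
// individual override > min(group limits) > global default > unlimited.
// Amounts are micro-dollars (1,000,000 = $1.00). The boolean result is
// false when no limit applies.
func resolveSpendLimit(userOverride *int64, groupLimits []int64, globalDefault *int64) (int64, bool) {
	if userOverride != nil {
		return *userOverride, true
	}
	if len(groupLimits) > 0 {
		lowest := groupLimits[0]
		for _, g := range groupLimits[1:] {
			if g < lowest {
				lowest = g
			}
		}
		return lowest, true
	}
	if globalDefault != nil {
		return *globalDefault, true
	}
	return 0, false
}

func main() {
	global := int64(5_000_000)
	override := int64(10_000_000)
	groups := []int64{3_000_000, 2_000_000}

	fmt.Println(resolveSpendLimit(&override, groups, &global))
	fmt.Println(resolveSpendLimit(nil, groups, &global))
	fmt.Println(resolveSpendLimit(nil, nil, &global))
	fmt.Println(resolveSpendLimit(nil, nil, nil))
}
```

Note that an individual override wins even when it is higher than the group minimum, which is what "highest" priority means in the hierarchy above.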
||
|
|
93b9d70a9b |
chore: add audit log entry when ai seat is consumed (#22683)
When an ai seat is consumed, an audit log entry is made. This only happens the first time a seat is used. |
||
|
|
abf59ee7a6 |
feat: track ai seat usage (#22682)
When a user uses an AI feature, we record them in the `ai_seat_state` as consuming a seat. Added debouncing to prevent excessive writes to the db for this feature, since there is no need for frequent updates. |
||
|
|
27cbf5474b |
refactor: remove /diff-status endpoint, include diff_status in chat payload (#23082)
The `/chats/{chat}/diff-status` endpoint was redundant because:
- The `Chat` type already has a `DiffStatus` field
- Listing chats already resolves and returns `diff_status`
- The `getChat` endpoint was the only one not resolving it (passing
`nil`)
## Changes
**Backend:**
- `getChat` now calls `resolveChatDiffStatus` and includes the result in
the response
- Removed `getChatDiffStatus` handler, route (`GET /diff-status`), and
SDK method
- Tests updated to use `GetChat` instead of `GetChatDiffStatus`
**Frontend:**
- `AgentDetail.tsx`: uses `chatQuery.data?.diff_status` instead of
separate query
- `RemoteDiffPanel.tsx`: accepts `diffStatus` as a prop instead of
fetching internally
- `AgentsPage.tsx`: `diff_status_change` events now invalidate the chat
query
- Removed `chatDiffStatus` query, `chatDiffStatusKey`, and
`getChatDiffStatus` API method
|
||
|
|
36665e17b2 |
feat: add WatchAllWorkspaceBuilds endpoint for autostart scaletests (#22057)
This PR adds a `WatchAllWorkspaces` function with a `watch-all-workspaces` endpoint, which can be used to listen on a single global pubsub channel for _all_ workspace build updates, and makes use of it in the autostart scaletest. This removes the need for a workspace watch pubsub channel _per_ workspace, which has auth overhead associated with each call.
This is especially relevant in situations such as the autostart scaletest, where we need to start/stop a set of workspaces before we can configure their autostart config. The overhead associated with all the watch requests skews the scaletest results and makes it harder to reason about the performance of the autostart feature itself.
The autostart scaletest also no longer generates its own metrics, nor does it wait for all the workspaces to actually start via autostart. We should update the scaletest dashboard after both PRs are merged to measure autostart performance via the new metrics.
The new function/endpoint and its usage in the autostart scaletest are gated behind an experiment feature flag. We should discuss whether to enable the endpoint in prod by default; if so, we can remove the experiment.
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Callum Styan <callum@coder.com>
|
||
|
|
84527390c6 |
feat: chat desktop backend (#23005)
Implement the backend for the desktop feature for agents.
- Adds a new `/api/experimental/chats/$id/desktop` endpoint to coderd which exposes a VNC stream from a [portabledesktop](https://github.com/coder/portabledesktop) process running inside the workspace
- Adds a new `spawn_computer_use_agent` tool to chatd, which spawns a subagent that has access to the `computer` tool which lets it interact with the `portabledesktop` process running inside the workspace
- Adds the plumbing to make the above possible
There's a follow up frontend PR here: https://github.com/coder/coder/pull/23006
|
||
|
|
c3b6284955 |
feat: add chat cost analytics backend (#23036)
Add cost tracking for LLM chat interactions with microdollar precision.
## Changes
- Add `chatcost` package for per-message cost calculation using `shopspring/decimal` for intermediate arithmetic
- **Ceil rounding policy**: fractional micros round UP to next whole micro (applied once after summing all components)
- Database migration: `total_cost_micros` BIGINT column with historical backfill and `created_at` index
- API endpoints: per-user cost summary and admin rollup under `/api/experimental/chats/cost/`
- SDK types: `ChatCostSummary`, `ChatCostModelBreakdown`, `ChatCostUserRollup`
- Fix `modeloptionsgen` to handle `decimal.Decimal` as opaque numeric type
- Update frontend pricing test fixtures for string decimal types
## Design decisions
- `NULL` = unpriced (no matching model config), `0` = free
- Reasoning tokens included in output tokens (no double-counting)
- Integer microdollars (BIGINT) for storage and API responses
- Price config uses `decimal.Decimal` for exact parsing; totals use `int64`

Frontend: #23037
|
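The ceil rounding policy in #23036 (round once, after summing all components) can be sketched as follows. The commit uses `shopspring/decimal`; this sketch swaps in the standard library's `math/big` so it stays self-contained. The `component` type, the `costMicros` name, and the pricing unit (USD per one million tokens) are assumptions for illustration.

```go
package main

import (
	"fmt"
	"math/big"
)

// component is a hypothetical pricing input: a token count and a price
// in USD per one million tokens, kept as a decimal string for exactness.
type component struct {
	tokens          int64
	pricePerMillion string
}

// ceilRat rounds an exact rational up to the next whole integer.
func ceilRat(r *big.Rat) int64 {
	// Denom is always positive, so DivMod yields the floor and a
	// non-negative remainder.
	q, m := new(big.Int).DivMod(r.Num(), r.Denom(), new(big.Int))
	if m.Sign() != 0 {
		q.Add(q, big.NewInt(1))
	}
	return q.Int64()
}

// costMicros sums all components exactly, then applies ceil rounding once.
func costMicros(parts []component) (int64, error) {
	total := new(big.Rat)
	for _, p := range parts {
		price, ok := new(big.Rat).SetString(p.pricePerMillion)
		if !ok {
			return 0, fmt.Errorf("invalid price %q", p.pricePerMillion)
		}
		// tokens * (USD per 1M tokens) is numerically the cost in micro-dollars.
		total.Add(total, price.Mul(price, new(big.Rat).SetInt64(p.tokens)))
	}
	return ceilRat(total), nil
}

func main() {
	// 3 * 0.15 = 0.45 and 2 * 0.6 = 1.2 micros; the exact sum is 1.65.
	total, err := costMicros([]component{{3, "0.15"}, {2, "0.6"}})
	fmt.Println(total, err)
}
```

Rounding each component separately would give 1 + 2 = 3 micros here, while rounding the exact sum of 1.65 gives 2, which is why the policy applies the ceiling only once.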
||
|
|
df2360f56a |
feat(coderd): add consolidated /debug/profile endpoint for pprof collection (#22892)
## Summary
Adds a new `GET /api/v2/debug/profile` endpoint that collects multiple pprof profiles in a single request and returns them as a tar.gz archive. This allows collecting profiles (including block and mutex) without requiring `CODER_PPROF_ENABLE` to be set, and without restarting `coderd`.
Closes #21679
## What it does
The endpoint:
- Temporarily enables block and mutex profiling (normally disabled at runtime)
- Runs CPU profile and/or trace for a configurable duration (default 10s, max 60s)
- Collects snapshot profiles (heap, allocs, block, mutex, goroutine, threadcreate)
- Returns a tar.gz archive containing all requested `.prof` files
- Uses an atomic bool to prevent concurrent collections (returns 409 Conflict)
- Is protected by the existing debug endpoint RBAC (owner-only)
**Supported profile types:** cpu, heap, allocs, block, mutex, goroutine, threadcreate, trace
**Query parameters:**
- `duration`: How long to run timed profiles (default: `10s`, max: `60s`)
- `profiles`: Comma-separated list of profile types (default: `cpu,heap,allocs,block,mutex,goroutine`)
## Additional changes
- **SDK client method** (`codersdk.Client.DebugCollectProfile`) for easy programmatic access
- **`coder support bundle --pprof` integration**: tries the consolidated endpoint first, falls back to individual `/debug/pprof/*` endpoints for older servers
- **8 new tests** covering defaults, custom profiles, trace+CPU, validation errors, authorization, and conflict detection
|
||
|
|
690e3a87d8 |
feat: move chat messages to dedicated /chats/{id}/messages endpoint (#23021)
## Summary
Moves the messages response out of `GET /chats/{id}` and into a
dedicated `GET /chats/{id}/messages` endpoint.
### Backend
- `GET /chats/{id}` now returns just the `Chat` object (no messages)
- `GET /chats/{id}/messages` is a new endpoint returning
`ChatMessagesResponse` with `messages` and `queued_messages`
- Added `ChatMessagesResponse` SDK type and `GetChatMessages` client
method
### Frontend
- `getChat()` API method returns `Chat` instead of `ChatWithMessages`
- Added `getChatMessages()` API method for the new endpoint
- Split `chatQuery` into two: `chatQuery` (metadata) and
`chatMessagesQuery` (messages)
- Updated all cache mutations, optimistic updates, and websocket
handlers
- Updated tests and stories
### Files changed
| File | Change |
|---|---|
| `coderd/coderd.go` | Register `GET /messages` route |
| `coderd/chats.go` | Simplify `getChat`, add `getChatMessages` handler |
| `codersdk/chats.go` | New type + method, update `GetChat` return |
| `site/src/api/api.ts` | New method, update `getChat` |
| `site/src/api/queries/chats.ts` | New query, update cache mutations |
| `site/src/pages/AgentsPage/AgentDetail.tsx` | Use separate queries |
| `site/src/pages/AgentsPage/AgentDetail/ChatContext.ts` | Update types and cache writes |
| `site/src/pages/AgentsPage/AgentsPage.tsx` | Update websocket cache handler |
|
||
|
|
bc27274aba |
feat(coderd): refactors github pr sync functionality (#22715)
- Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_`
- Extracts and refactors existing GitHub PR sync logic to new packages `coderd/gitsync` and `coderd/externalauth/gitprovider`
- Associated wiring and tests
Created using Opus 4.6
|
||
|
|
b6d1a11c58 |
feat(chatd): add user-level custom prompt for agent chats (#22896)
Adds a user-level custom prompt to the database. I'll be doing a follow-up for the UI, as we currently do not have user-level settings (it's just admin). I'll also make it very obvious for chats where there is a user-level prompt, but I don't know how yet. |
||
|
|
c933ddcffd |
fix(agents): persist system prompt server-side instead of localStorage (#22857)
## Problem
The Admin → Agents → System Prompt textarea saved only to the browser's `localStorage`. The value was never sent to the backend, never stored in the database, and never injected into chats. Entering text, clicking Save, and refreshing the page showed no changes, so the prompt was effectively a no-op.
## Root Cause
Three disconnected layers:
1. **Frontend** wrote to `localStorage`, never called an API.
2. **`handleCreateChat`** never read `savedSystemPrompt`.
3. **Backend** hardcoded `chatd.DefaultSystemPrompt` on every chat creation; no field in `CreateChatRequest` accepted a custom prompt.
## Changes
### Database
- Added `GetChatSystemPrompt` / `UpsertChatSystemPrompt` queries on the existing `site_configs` table (no migration needed).
### API
- `GET /api/experimental/chats/system-prompt`: returns the configured prompt (any authenticated user).
- `PUT /api/experimental/chats/system-prompt`: sets the prompt (admin-only, `rbac: deployment_config update`).
- Input validation: max 32 KiB prompt length.
### Backend
- `resolvedChatSystemPrompt(ctx)` checks for a custom prompt in the DB, falls back to `chatd.DefaultSystemPrompt` when empty/unset.
- Logs a warning on DB errors instead of silently swallowing them.
- Replaced the hardcoded `defaultChatSystemPrompt()` call in chat creation.
### Frontend
- Replaced `localStorage` read/write with React Query `useQuery`/`useMutation` backed by the new endpoints.
- Fixed `useEffect` draft sync to avoid clobbering in-progress user edits on refetch.
- Added `try/catch` error handling on save (draft stays dirty for retry).
- Save button disabled during mutation (`isSavingSystemPrompt`).
- Query key follows kebab-case convention (`chat-system-prompt`).
### UX
- Added hint: "When empty, the built-in default prompt is used."
### Tests
- `TestChatSystemPrompt`: GET returns empty when unset, admin can set, non-admin gets 403.
- dbauthz `TestMethodTestSuite` coverage for both new querier methods.
|