mirror of
https://github.com/coder/coder.git
synced 2026-06-03 13:08:25 +00:00
06bad73df4
## Summary
Add the **admin-configurable advisor configuration**: database-backed storage, SDK types, and the experimental HTTP handlers that will back the admin settings UI (added in a later PR). This follows the same "site-configs" pattern as Virtual Desktop.
## Motivation
The advisor needs runtime-tunable knobs (enable/disable, per-run cap, max output tokens, reasoning effort, optional model override) without a service restart or redeploy. Using the existing `site_configs` K/V table keeps this pattern consistent with other admin features and avoids a bespoke schema.
## Changes
### Database (`coderd/database/queries/siteconfig.sql`)
- `GetChatAdvisorConfig` returns the stored JSON blob (default `'{}'`) under key `agents_advisor_config`.
- `UpsertChatAdvisorConfig` uses the standard `INSERT ... ON CONFLICT` pattern.
- Regenerated via `make gen` (queries.sql.go + mocks).
### SDK (`codersdk/chats.go`)
- `AdvisorConfig` type with `Enabled`, `MaxUsesPerRun`, `MaxOutputTokens`, `ReasoningEffort` (`""` / `low` / `medium` / `high`), `ModelConfigID uuid.UUID`.
- Client methods: `ChatAdvisorConfig(ctx)` / `UpdateChatAdvisorConfig(ctx, cfg)`.
### API (`coderd/exp_chats.go`)
- `GET /api/experimental/chats/config/advisor`: reads current config; relies on `ActorFromContext` validation.
- `PUT /api/experimental/chats/config/advisor`: requires `policy.ActionUpdate` on `rbac.ResourceDeploymentConfig`.
- Handlers unmarshal `{}` to a typed zero value and re-marshal on upsert for schema stability.
- Tests in `exp_chats_test.go` cover empty defaults, round-trip update, unauthorized update, and invalid body.
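
The zero-value round trip described above can be sketched roughly as follows. This is a stdlib-only illustration, not the handler code from this PR: the JSON tag names, the helper names `decodeStored` and `encodeForUpsert`, and the effort check are assumptions, and `ModelConfigID` is a plain string here rather than `uuid.UUID` so the example has no external dependencies.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AdvisorConfig is a stand-in for the SDK type. Tag names are assumptions.
type AdvisorConfig struct {
	Enabled         bool   `json:"enabled"`
	MaxUsesPerRun   int    `json:"max_uses_per_run"`
	MaxOutputTokens int64  `json:"max_output_tokens"`
	ReasoningEffort string `json:"reasoning_effort"`
	ModelConfigID   string `json:"model_config_id"` // uuid.UUID in the PR
}

// validEffort reports whether s is one of the values listed in the SDK
// section: "", "low", "medium", or "high".
func validEffort(s string) bool {
	switch s {
	case "", "low", "medium", "high":
		return true
	}
	return false
}

// decodeStored (hypothetical) turns the stored blob into a typed value.
// An empty object decodes to the zero value of AdvisorConfig.
func decodeStored(raw string) (AdvisorConfig, error) {
	var cfg AdvisorConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return AdvisorConfig{}, fmt.Errorf("decode advisor config: %w", err)
	}
	return cfg, nil
}

// encodeForUpsert (hypothetical) re-marshals the typed value so the stored
// blob always carries the same set of keys.
func encodeForUpsert(cfg AdvisorConfig) (string, error) {
	if !validEffort(cfg.ReasoningEffort) {
		return "", fmt.Errorf("invalid reasoning effort %q", cfg.ReasoningEffort)
	}
	out, err := json.Marshal(cfg)
	if err != nil {
		return "", fmt.Errorf("encode advisor config: %w", err)
	}
	return string(out), nil
}

func main() {
	cfg, err := decodeStored("{}")
	if err != nil {
		panic(err)
	}
	cfg.Enabled = true
	cfg.ReasoningEffort = "low"
	blob, err := encodeForUpsert(cfg)
	if err != nil {
		panic(err)
	}
	fmt.Println(blob)
}
```

Because every field is written on each upsert, a later reader never has to distinguish a missing key from a zero value.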
## Stack context
This is **PR 3 of 6** in the advisor feature stack. Consumed by:
- PR 4 (`feat/advisor-04-chatd-runtime`), which reads this config on every `runChat`.
- PR 6 (`feat/advisor-06-admin-settings-ui`), which renders the admin form.
## Scope / non-goals
- No `chatd` read path (lands in PR 4).
- No UI (lands in PR 6).
- `agents_advisor_config` remains a single-row JSON blob; we intentionally do not shard per-org/per-template yet.
## Validation
- `make gen`
- `go test ./coderd/database/... -run TestChatAdvisor`
- `go test ./coderd/... -run TestChatAdvisorConfig`
- `make lint`
---
<details>
<summary>📋 Implementation Plan (shared across the advisor stack)</summary>
# Plan: Add a Mux-style advisor tool to coder agents/chatd
## Outcome
Add a first-class `advisor` tool to agent chats in `coderd/x/chatd` that feels native to Coder:
- it is a built-in server-side tool, not an MCP/dynamic-tool workaround;
- it performs a nested **tool-less** model call for strategic advice;
- it is exposed only when eligible, and the prompt mentions it only when it is actually available;
- it is treated as a **planning-only** tool so it does not run alongside action tools in the same batch;
- it tracks usage and cost separately, in enough detail for operators to reason about it;
- it has a minimally polished UI in the Agents page;
- and it ships with explicit dogfooding evidence, including screenshots and repro videos.
## Design decisions to lock before coding
1. **Primary architecture:** native built-in tool in `chattool/`, backed by a small `chatadvisor` package.
2. **Nested model execution:** reuse chatd's existing model/provider stack for a one-step, tool-less advisor call rather than inventing a new provider pathway.
3. **Execution policy:** treat `advisor` as an exclusive/planning-only tool; mixed batches must return structured policy errors and force the model to retry cleanly.
4. **Availability:** initial rollout is for root agent chats only; disable for child/sub-agent chats until recursion/cost policy is proven.
5. **Prompt sync:** use one eligibility boolean to drive both tool registration and advisor guidance injection.
6. **Persistence/cost split:** MVP should keep advisor usage visible in result metadata and server metrics; only add DB schema if product/billing explicitly needs queryable advisor-specific cost.
7. **UI scope:** generic tool rendering is an acceptable temporary milestone during backend bring-up, but the release candidate should include a dedicated lightweight advisor renderer.
## Delivery model
The work should be executed as coordinated workstreams with one integration owner and parallel contributors for low-conflict areas. The integration owner should own `coderd/x/chatd/chatd.go` because prompt assembly, tool registration, and model resolution all converge there.
## Detailed workstreams
### Repo evidence used for this plan
<details>
<summary>Mux reference and current chatd seams</summary>
**Mux reference implementation**
- `src/node/services/tools/advisor.ts`: native advisor tool implementation.
- `src/common/constants/advisor.ts`: advisor prompt/constants and truncation policy.
- `src/common/utils/tools/tools.ts`: conditional tool registration.
- `src/node/services/streamContextBuilder.ts`: injects advisor guidance only when the tool is available.
**Current chatd seams**
- `coderd/x/chatd/chatd.go`
  - `processChat()`: tool assembly, prompt assembly, and chatloop invocation.
  - `resolveChatModel()`: current model/provider/key resolution seam.
  - `type Config struct`: server-level chatd configuration surface.
- `coderd/x/chatd/chatloop/chatloop.go`
  - `Run()`: main streaming/model loop.
  - `executeTools()`: built-in tool execution/batching seam.
- `coderd/x/chatd/chattool/`: built-in tool implementations.
- `site/src/pages/AgentsPage/components/ChatElements/tools/Tool.tsx`: tool renderer dispatch.
- `site/src/pages/AgentsPage/components/ChatConversation/messageParsing.ts` and `ConversationTimeline.tsx`: tool/result merge and rendering flow.
</details>
### Workstream map and ownership
| Workstream | Primary owner | Main files | Can run in parallel? | Done when |
|---|---|---|---|---|
| 0. Integration + gating | Integration lead | `coderd/x/chatd/chatd.go` | No; central merge lane | Tool registration, prompt sync, and model selection are wired together |
| 1. Advisor runtime + tool | Backend agent | new `coderd/x/chatd/chatadvisor/`, new `coderd/x/chatd/chattool/advisor.go` | Yes | Tool can perform a tool-less advisor call in memory and return structured results |
| 2. Planning-only execution policy | Chatloop agent | `coderd/x/chatd/chatloop/chatloop.go`, related tests | Yes | Mixed `advisor` + action-tool batches are rejected cleanly and deterministically |
| 3. Metrics/usage/config | Backend/telemetry agent | `chatd.go`, `chatloop/metrics.go`, optional config plumbing | Partially; coordinate with integration lead | Advisor usage is separately visible in metadata/metrics and limits are enforced |
| 4. Frontend rendering | Frontend agent | `site/.../tools/Tool.tsx`, new `AdvisorTool.tsx`, stories | Yes after result schema stabilizes | Advisor renders as a readable card and story tests pass |
| 5. Dogfood + QA evidence | QA agent | dev server, Storybook, dogfood output | After backend + UI are usable | Repro videos, screenshots, and a concise QA report exist |
### Parallelization rules
- **Do not split `coderd/x/chatd/chatd.go` across multiple execution agents without an integration lead.** That file owns prompt building, tool registration, model resolution, and cost persistence.
- Workstreams 1 and 2 can be developed in parallel and then stacked onto the integration branch.
- Workstream 4 should begin once the backend result schema is agreed on, even if the backend is still behind a feature flag.
- Any agent that needs to re-check Mux behavior should clone `coder/mux` into a temporary directory (for example, `$(mktemp -d)/mux`) and inspect it read-only; do not vendor or copy code from Mux directly.
## Phase 0 — Preflight and guardrails
### Goals
- Align the team on the smallest shippable architecture.
- Prevent scope creep into MCP/dynamic-tool/sub-agent variants.
- Decide upfront what is MVP vs. follow-up.
### Tasks
1. **Confirm the MVP boundary.**
- Ship a built-in advisor tool first.
- Do **not** make MCP, dynamic tools, or sub-agents the primary implementation.
- Do **not** add transient streaming phases in the first backend PR unless they fall out almost for free.
2. **Confirm local workflow hygiene before coding.**
- Ensure the repo is using the project git hooks from `scripts/githooks`.
- Do not bypass hooks with `--no-verify`.
- Use `./scripts/develop.sh` for the full dev server rather than manual build/run commands.
3. **Lock the model-selection policy.**
- **Recommended MVP:** advisor uses the same resolved provider/model/cost config as the current chat, with advisor-specific max-output and usage caps.
- **Follow-up only if required:** add a separate `AdvisorModelConfigID`-style override that resolves through the existing `configCache`/model-config path. Do not invent a new free-form `provider:model` parser if chatd already stores provider/model separately.
4. **Lock the persistence policy.**
- **Recommended MVP:** no DB migration. Persist advisor-visible metadata in the tool result and record separate metrics in memory/Prometheus.
- **Only if product/billing explicitly asks for queryable advisor cost:** add a later DB migration or usage table, following the normal `queries/*.sql` + `make gen` workflow.
5. **Create an execution ADR note in the work item or tracking doc.**
- Capture: built-in tool, tool-less nested call, root-chat-only rollout, exclusive execution policy, MVP no-DB-migration default.
### Quality gate
- Everyone on the team can state the same answers to these questions:
- Is advisor a built-in tool? **Yes.**
- Can advisor run with action tools in the same batch? **No.**
- Does advisor get tools of its own? **No.**
- Is a DB migration required for MVP? **No, unless billing insists.**
## Phase 1 — Build the advisor runtime and tool wrapper
### Goals
Create the core advisor implementation in a way that is easy to test and keeps `chattool/` thin.
### Files to add
- `coderd/x/chatd/chatadvisor/types.go`
- `coderd/x/chatd/chatadvisor/guidance.go`
- `coderd/x/chatd/chatadvisor/handoff.go`
- `coderd/x/chatd/chatadvisor/runtime.go`
- `coderd/x/chatd/chatadvisor/runner.go`
- `coderd/x/chatd/chattool/advisor.go`
### Responsibilities by file
1. **`types.go`**
- Define the input/result schema used by the tool and UI.
- Keep the result shape close to Mux so the UI and model both have predictable cases.
- Recommended result variants:
- `advice`
- `limit_reached`
- `error`
Recommended shape:
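```go
// AdvisorArgs is the tool input decoded from the model's tool call.
type AdvisorArgs struct {
	Question string `json:"question"`
}

// AdvisorResult is the structured tool output shared by the model and the UI.
// Type is one of "advice", "limit_reached", or "error".
// Note: with omitempty, a RemainingUses of 0 is dropped from the JSON output.
// AdvisorUsageResult is referenced here but not defined in this plan.
type AdvisorResult struct {
	Type          string              `json:"type"`
	Advice        string              `json:"advice,omitempty"`
	Error         string              `json:"error,omitempty"`
	AdvisorModel  string              `json:"advisor_model,omitempty"`
	RemainingUses int                 `json:"remaining_uses,omitempty"`
	Usage         *AdvisorUsageResult `json:"usage,omitempty"`
}
```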
2. **`guidance.go`**
- Hold two strings:
- the nested advisor system prompt;
- the parent-agent guidance block to inject into the outer system prompt.
- The nested advisor prompt must say, in plain language:
- you are advising the parent agent;
- you do not address the end user directly;
- you do not claim actions happened;
- you return concise strategic guidance and tradeoffs.
3. **`runtime.go`**
- Define the per-run runtime state.
- Recommended fields:
- resolved model + model config;
- provider keys/options reused from the outer chat;
- `MaxUsesPerRun`;
- `MaxOutputTokens`;
- atomic/current call counter;
- callback(s) to obtain the current prompt snapshot and current-step snapshot;
- optional metrics/usage hook.
- Add fail-fast validation for impossible config: nil model, non-positive limits, empty prompt builders, etc.
4. **`handoff.go`**
- Build the advisor handoff message from:
- the explicit question;
- the exact prompt/messages the parent model just used;
- the current step's text/reasoning snapshot, if available;
- the most recent relevant tool outputs, if they are already in the prompt snapshot.
- **Important:** use the already-prepared outer prompt tail, not a fresh DB reload. That keeps the advisor aligned with compaction and the exact context the outer model saw.
- Apply hard truncation budgets with recent-context bias.
5. **`runner.go`**
- Execute the nested advisor call.
- **Recommended implementation:** call `chatloop.Run()` in an in-memory, one-step mode:
- `Tools: nil`
- `ProviderTools: nil`
- `MaxSteps: 1`
- `PersistStep`: capture the assistant output in memory instead of writing DB rows
- Reuse the existing provider/model/cost path instead of building a second provider runner.
- Assert that no tool definitions are passed to the nested call.
6. **`chattool/advisor.go`**
- Keep this file thin and consistent with other built-ins.
- Responsibilities:
- decode `AdvisorArgs`;
- validate `Question` is non-empty and bounded;
- call the `chatadvisor` runner;
- return a structured tool response.
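
The "hard truncation budgets with recent-context bias" idea from `handoff.go` could look something like the sketch below. It is only an illustration under stated assumptions: `buildHandoff` is a hypothetical helper, messages are plain strings, and the budget is counted in bytes rather than tokens.

```go
package main

import (
	"fmt"
	"strings"
)

// buildHandoff (hypothetical) always keeps the question, then walks the
// prompt tail from newest to oldest and keeps messages until the byte
// budget is spent, so the most recent context survives truncation.
func buildHandoff(question string, promptTail []string, budget int) string {
	kept := []string{}
	used := 0
	for i := len(promptTail) - 1; i >= 0; i-- {
		msg := promptTail[i]
		if used+len(msg) > budget {
			break
		}
		used += len(msg)
		kept = append(kept, msg)
	}
	// kept is newest-first; reverse it to restore chronological order.
	for l, r := 0, len(kept)-1; l < r; l, r = l+1, r-1 {
		kept[l], kept[r] = kept[r], kept[l]
	}

	var b strings.Builder
	b.WriteString("Question: " + question + "\n")
	if omitted := len(promptTail) - len(kept); omitted > 0 {
		fmt.Fprintf(&b, "[%d earlier message(s) omitted]\n", omitted)
	}
	for _, m := range kept {
		b.WriteString(m + "\n")
	}
	return b.String()
}

func main() {
	tail := []string{"old tool output", "assistant: plan A", "tool: tests failed"}
	fmt.Print(buildHandoff("Should we retry or change approach?", tail, 40))
}
```

Making the omission explicit in the handoff text lets the nested advisor know that it is looking at a truncated view rather than the full history.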
### Defensive programming requirements
- Assert `Question` is non-empty after trimming.
- Assert runtime limits are positive.
- Assert the nested advisor call runs with zero tools/provider tools.
- Assert `AdvisorResult.Type` is one of the known variants before returning.
- Assert remaining uses never goes negative.
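
A minimal sketch of the first and fourth checks above, assuming hypothetical helper names and a hypothetical length bound (`maxQuestionLen` is not specified in this plan):

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxQuestionLen is a hypothetical bound chosen for illustration only.
const maxQuestionLen = 4000

// validateQuestion trims the question and rejects empty or oversized input.
func validateQuestion(q string) (string, error) {
	q = strings.TrimSpace(q)
	if q == "" {
		return "", errors.New("question must not be empty")
	}
	if len(q) > maxQuestionLen {
		return "", fmt.Errorf("question is %d bytes, limit is %d", len(q), maxQuestionLen)
	}
	return q, nil
}

// validateResultType accepts only the result variants named in this plan.
func validateResultType(t string) error {
	switch t {
	case "advice", "limit_reached", "error":
		return nil
	}
	return fmt.Errorf("unknown advisor result type %q", t)
}

func main() {
	q, err := validateQuestion("  Which migration strategy is safer?  ")
	fmt.Printf("%q %v\n", q, err)
	fmt.Println(validateResultType("maybe"))
}
```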
### Acceptance criteria
- A unit test can call the advisor tool with a fake model and receive a stable `advice` result.
- The nested advisor call is impossible to run with tools accidentally attached.
- The core logic lives in `chatadvisor/`, not embedded inside `chatd.go`.
## Phase 2 — Wire advisor into chatd and keep prompt/tool availability in sync
### Goals
Register the tool in the right place, expose it only when eligible, and inject system guidance only when the tool is present.
### Files to modify
- `coderd/x/chatd/chatd.go`
- optionally a small helper file if `chatd.go` becomes too crowded
### Tasks
1. **Compute one eligibility boolean in `processChat()`.**
Recommended inputs:
- server-level advisor enabled flag;
- root chat only (`chat.ParentChatID == uuid.Nil` or equivalent existing root/child check);
- a usable resolved model/provider exists;
- optional experiment/workspace/org gate if product wants staged rollout.
2. **Create the runtime once per outer chat run.**
- Use the model/config/keys resolved by `resolveChatModel()`.
- Reuse provider options from the current chat's `ChatModelCallConfig`.
- Set `MaxUsesPerRun` and `MaxOutputTokens` from advisor config defaults.
3. **Register the tool in the built-in tool block.**
- Insert after the skill tools and before MCP tools in `processChat()`.
- Record `builtinToolNames["advisor"] = true` so metrics stay bounded.
4. **Inject advisor guidance into the outer system prompt using the same boolean.**
- Use `chatprompt.InsertSystem()` in the same prompt assembly path that already injects user/system instructions.
- Place the block near the existing instruction insertion, before plan-path/skill context blocks.
- Wrap the guidance in an explicit tag like `<advisor-guidance>` so it is easy to spot in tests and future refactors.
5. **Keep advisor out of child chats for the first release.**
- That avoids recursion/cost blowups with `spawn_agent` / `wait_agent` flows.
- Document this explicitly in the rollout notes and tests.
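
The single-boolean rule from the tasks above can be illustrated with a small sketch. Everything here is a simplified stand-in: `advisorGate` and `assemble` are hypothetical names, `read_file` is a made-up tool name, tools are plain strings, and the prompt is concatenated directly instead of going through `chatprompt.InsertSystem()`.

```go
package main

import "fmt"

// advisorGate collects the inputs recommended for the eligibility check.
type advisorGate struct {
	ServerEnabled bool // server-level advisor enabled flag
	IsRootChat    bool // e.g. chat.ParentChatID == uuid.Nil
	ModelResolved bool // a usable resolved model/provider exists
}

// eligible is the one boolean that drives both registration and guidance.
func (g advisorGate) eligible() bool {
	return g.ServerEnabled && g.IsRootChat && g.ModelResolved
}

// advisorGuidance is placeholder text; the real block lives in guidance.go.
const advisorGuidance = "<advisor-guidance>\n(placeholder guidance text)\n</advisor-guidance>"

// assemble returns the tool names and system prompt for one chat run.
// Both outputs change together or not at all.
func assemble(g advisorGate, baseTools []string, basePrompt string) ([]string, string) {
	tools := append([]string(nil), baseTools...)
	prompt := basePrompt
	if g.eligible() {
		tools = append(tools, "advisor")
		prompt += "\n" + advisorGuidance
	}
	return tools, prompt
}

func main() {
	root := advisorGate{ServerEnabled: true, IsRootChat: true, ModelResolved: true}
	child := root
	child.IsRootChat = false
	for _, g := range []advisorGate{root, child} {
		tools, prompt := assemble(g, []string{"read_file"}, "base prompt")
		fmt.Println(tools, len(prompt) > len("base prompt"))
	}
}
```

Routing both decisions through `eligible()` means a test can never observe the tool without the guidance, or the guidance without the tool.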
### Acceptance criteria
- If advisor is disabled, neither the tool nor the prompt guidance appears.
- If advisor is enabled, both the tool and the prompt guidance appear.
- Root chats can use advisor; child chats cannot.
- Built-in tool names include `advisor` so metrics do not collapse it into the generic `mcp` label.
## Phase 3 — Enforce planning-only execution policy in `chatloop`
### Goals
Prevent the model from calling `advisor` and action tools in the same execution batch.
### Files to modify
- `coderd/x/chatd/chatloop/chatloop.go`
- related chatloop tests
### Recommended implementation
Keep the MVP small; do **not** build a general policy engine yet.
1. Add a minimal field to `chatloop.RunOptions`, for example:
```go
// ExclusiveToolName, when non-nil, names the one tool that must run alone in a batch.
ExclusiveToolName *string
```
2. In `Run()` / `executeTools()`, detect the case where the exclusive tool appears in the same local-tool batch as any other locally executed tool.
3. When that happens, synthesize structured tool-result errors for the affected calls instead of executing anything in the batch.
- `advisor` should receive a clear error like: _advisor must be called by itself before action tools_.
- The sibling action tools should receive a paired policy error like: _this tool was skipped because advisor must run alone_.
4. Let the outer model see those tool errors and retry cleanly.
- This is simpler and safer than partial execution or hidden deferral.
- It preserves deterministic transcript history for debugging.
5. Pass the just-finished step snapshot into the tool execution context.
- The advisor runtime should be able to see the current step's text/reasoning content, because that is often the best hint about what the outer model is trying to decide.
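
Steps 2 and 3 above could be sketched as a pure function that inspects a batch before anything executes. The `toolCall` and `toolResult` types and the `checkExclusive` name are hypothetical simplifications, not the real `chatloop` types, and `read_file` is a made-up tool name.

```go
package main

import "fmt"

// toolCall and toolResult are simplified stand-ins for the chatloop types.
type toolCall struct {
	ID   string
	Name string
}

type toolResult struct {
	CallID  string
	IsError bool
	Content string
}

// checkExclusive reports whether the batch mixes the exclusive tool with any
// other tool. If so, it returns one policy error per call and the caller
// should execute nothing in the batch.
func checkExclusive(calls []toolCall, exclusive *string) ([]toolResult, bool) {
	if exclusive == nil {
		return nil, false
	}
	hasExclusive, hasOther := false, false
	for _, c := range calls {
		if c.Name == *exclusive {
			hasExclusive = true
		} else {
			hasOther = true
		}
	}
	if !hasExclusive || !hasOther {
		return nil, false
	}
	results := make([]toolResult, 0, len(calls))
	for _, c := range calls {
		msg := fmt.Sprintf("this tool was skipped because %s must run alone", *exclusive)
		if c.Name == *exclusive {
			msg = fmt.Sprintf("%s must be called by itself before action tools", *exclusive)
		}
		results = append(results, toolResult{CallID: c.ID, IsError: true, Content: msg})
	}
	return results, true
}

func main() {
	name := "advisor"
	batch := []toolCall{{ID: "1", Name: "advisor"}, {ID: "2", Name: "read_file"}}
	results, violated := checkExclusive(batch, &name)
	fmt.Println(violated)
	for _, r := range results {
		fmt.Println(r.CallID, r.Content)
	}
}
```

Results are produced in call order, so the same mixed batch always yields the same transcript entries.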
### Why this is the right fit
- It matches the intended semantics: advisor is consulted **before** taking action.
- It avoids subtle race conditions caused by concurrent built-in tool execution.
- It keeps the behavior easy to test with fake models.
### Acceptance criteria
- A model-emitted batch containing only `advisor` succeeds.
- A model-emitted batch containing `advisor` plus any other locally executed tool returns deterministic policy errors and executes nothing.
- Non-advisor tool execution stays unchanged for normal chats.
## Phase 4 — Usage limits, metrics, and configuration
### Goals
Make advisor safe to operate without over-designing billing/storage in the first release.
### Files to modify
- `coderd/x/chatd/chatd.go`
- `coderd/x/chatd/chatloop/metrics.go` as needed
- `coderd/x/chatd/chatd.go`: the `Config` struct and constructor path
- optional follow-up config/db files only if a separate advisor model or persistent billing is required
### Tasks
1. **Add explicit server config knobs for MVP.**
Recommended fields on `chatd.Config` or a nested advisor config struct:
- `AdvisorEnabled bool`
- `AdvisorMaxUsesPerRun int`
- `AdvisorMaxOutputTokens int64`
2. **Track usage per outer run.**
- Reset the counter for each `processChat()` invocation.
- Return `remaining_uses` in the tool result.
- Return `limit_reached` when the cap is exhausted.
3. **Expose advisor usage metadata in the tool result.**
- Include model name and token/cost summary if available.
- Use the same `callConfig.Cost` calculation path as the outer chat for MVP if advisor reuses the same model.
4. **Record server-side metrics.**
- Count advisor invocations, failures, and latency.
- Ensure they show up under the built-in tool label `advisor`.
5. **Optional decision gate: separate advisor model.**
- If product insists on a stronger/different advisor model, add a follow-up config hook that resolves another existing chat model config through the same `configCache` path.
- Keep that out of the first landing PR unless it is required for acceptance.
6. **Optional decision gate: queryable advisor cost.**
- If this becomes required, spin a follow-up DB task:
- update `coderd/database/queries/*.sql`;
- add migration files;
- run `make gen`;
- update audit mappings if a new auditable type/field is introduced.
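
The per-run cap from task 2 can be sketched with an atomic counter. This is an assumption-laden illustration: `usageTracker`, `newUsageTracker`, and `acquire` are hypothetical names, and a fresh tracker is assumed to be created for each `processChat()` invocation so the count resets per outer run.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// usageTracker counts advisor calls within one outer chat run.
type usageTracker struct {
	max  int64
	used atomic.Int64
}

// newUsageTracker fails fast on a non-positive limit.
func newUsageTracker(max int) (*usageTracker, error) {
	if max <= 0 {
		return nil, fmt.Errorf("max uses per run must be positive, got %d", max)
	}
	return &usageTracker{max: int64(max)}, nil
}

// acquire reserves one use. It returns the remaining uses after this call,
// or ok=false when the cap is exhausted (the limit_reached case). The
// compare-and-swap loop keeps the count from overshooting under concurrency,
// so the remaining value never goes negative.
func (u *usageTracker) acquire() (remaining int, ok bool) {
	for {
		cur := u.used.Load()
		if cur >= u.max {
			return 0, false
		}
		if u.used.CompareAndSwap(cur, cur+1) {
			return int(u.max - cur - 1), true
		}
	}
}

func main() {
	tracker, err := newUsageTracker(2)
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		remaining, ok := tracker.acquire()
		fmt.Println(remaining, ok)
	}
}
```

With a cap of 2, the third call reports `ok == false`, which the tool wrapper would translate into a `limit_reached` result.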
### Acceptance criteria
- Advisor calls are capped per outer run.
- Limit exhaustion is user-visible in the tool result.
- Metrics distinguish advisor calls from other built-in tools.
- MVP does not require a schema migration unless explicitly approved.
## Phase 5 — Frontend rendering and Storybook coverage
### Goals
Make advisor feel intentional in the Agents UI without blocking the backend on fancy streaming UI.
### Files to modify
- `site/src/pages/AgentsPage/components/ChatElements/tools/Tool.tsx`
- new `site/src/pages/AgentsPage/components/ChatElements/tools/AdvisorTool.tsx`
- Storybook story file(s) in the same tools directory
### Delivery strategy
1. **Intermediate milestone during backend bring-up:** rely on the existing generic tool renderer if needed.
- This is acceptable only as a short-lived integration checkpoint.
2. **Release milestone:** add a dedicated lightweight `AdvisorTool` renderer.
- Reuse existing primitives:
- `ToolCollapsible`
- `ToolIcon`
- `Response` for markdown/prose rendering
- `ScrollArea` if the advice can be long
- Keep styling light and consistent with the Agents page.
- Do not add unnecessary React memoization in `site/src/pages/AgentsPage/`; that area is already React-Compiler aware.
3. **Render the structured result states cleanly.**
- `advice` — readable prose/markdown with optional metadata footer.
- `limit_reached` — warning-style message.
- `error` — error state with visible fallback text.
- `running` — existing tool loading state/spinner is enough for MVP.
4. **Add Storybook coverage instead of ad-hoc component tests.**
Recommended stories:
- successful advice;
- running/loading;
- limit reached;
- error.
5. **Keep the UI contract narrow.**
- Prefer one text field like `advice` plus small metadata rather than a deeply nested schema.
- That keeps the UI resilient to prompt iteration.
### Acceptance criteria
- The advisor tool card renders readable content rather than raw quoted JSON in the final release branch.
- Running, limit, and error states are visibly distinct.
- Storybook stories and play assertions cover the new states.
- Existing tool rendering flows remain unchanged.
## Phase 6 — Automated tests and validation gates
### Backend tests to add
1. **Advisor runtime/tool tests**
- question validation;
- tool-less nested execution assertion;
- success result shaping;
- limit-reached result shaping;
- error result shaping.
2. **Prompt/gating tests in chatd**
- advisor disabled ⇒ no tool, no guidance;
- advisor enabled/root chat ⇒ tool + guidance;
- child chat ⇒ advisor absent.
3. **Chatloop policy tests**
- advisor alone runs;
- advisor + action tool mixed batch returns deterministic policy errors;
- non-advisor tools still execute normally.
4. **Usage/metrics tests**
- per-run cap resets correctly;
- builtin tool labeling includes `advisor`;
- returned metadata includes model/usage summary when available.
### Frontend tests to add
- Storybook `play()` assertions for the advisor renderer states.
- Verify expand/collapse behavior and visible fallback text.
- Verify the message timeline still renders adjacent tools correctly.
### Recommended command sequence
Run these as the implementation matures, not only at the end:
1. Backend-focused gate after phases 1–4:
- `make test RUN=TestAdvisor`
- `make test RUN=TestChatloopAdvisor`
- `make lint`
2. Frontend-focused gate after phase 5:
- `pnpm test:storybook src/pages/AgentsPage/components/ChatElements/tools/AdvisorTool.stories.tsx`
- `pnpm lint`
- `pnpm format`
3. Final repo gate before handoff:
- `make pre-commit`
- run any additional targeted `make test RUN=...` selections covering touched chatd paths
> Use the exact new test names the implementing agents create; the names above are recommended anchors, not existing tests.
## Dogfooding plan
### Principle
Dogfood the change as a real agent feature, not just a unit-tested backend. Per the dogfood and `agent-browser` skills, the reviewer should get **watchable repro videos** plus screenshots that make the behavior obvious without reading logs.
### Required setup
1. Start the full dev environment with:
- `./scripts/develop.sh`
2. If the frontend renderer changes, also start Storybook from `site/` with:
- `pnpm storybook --no-open`
3. Use `agent-browser` directly — **never `npx agent-browser`**.
4. Use named browser sessions and an output folder such as:
- `./dogfood-output/advisor/`
- with subfolders `screenshots/` and `videos/`
### Evidence protocol
For every interactive scenario below:
1. Start video recording **before** the action.
2. Capture step-by-step screenshots at human pace.
3. Capture one annotated screenshot of the final state.
4. Stop the recording.
5. Note the exact pass/fail observation in the QA report.
For static UI states (for example Storybook error/limit cards), an annotated screenshot is sufficient; video is optional but still encouraged by this project’s review preference.
### Dogfood scenarios
#### Scenario A — Happy path in the real Agents UI
**Goal:** prove that a root agent chat can invoke advisor and produce a readable recommendation before taking further action.
Steps:
1. Open the Agents page with an advisor-enabled root chat.
2. Start a repro video.
3. Send a prompt that should reasonably trigger strategic planning, such as an architecture or multi-tradeoff question.
4. Capture screenshots of:
- the prompt before send;
- the running advisor state;
- the completed advisor card and the assistant’s follow-up response.
5. Stop recording.
Pass criteria:
- advisor appears in the timeline;
- the rendered result is readable;
- the assistant can continue after consuming the advisor output.
#### Scenario B — Advisor unavailable path
**Goal:** prove the feature is truly gated.
Suggested variants (at least one is required, both are better):
- feature flag/config off;
- child/sub-agent chat.
Evidence:
- annotated screenshot of the chat/tool state showing advisor is absent;
- short video if toggling the gate live is part of the repro.
Pass criteria:
- no advisor tool is available;
- no advisor-specific prompt behavior leaks through.
#### Scenario C — UI states in Storybook
**Goal:** prove the renderer handles non-happy states cleanly.
Required story states:
- success/advice;
- running;
- limit reached;
- error.
Evidence:
- one screenshot per state;
- at least one short video showing collapse/expand behavior.
Pass criteria:
- success renders readable advice;
- limit/error have visible fallback text;
- the component behaves like the other tool cards.
#### Scenario D — Regression sweep of nearby tools
**Goal:** ensure advisor does not break the surrounding chat timeline.
Check at minimum:
- another existing built-in tool still renders correctly near advisor;
- sub-agent/tool cards still expand/collapse normally;
- no obvious console errors appear in the Agents page during the advisor flow.
Evidence:
- screenshots of adjacent tool cards;
- console/error capture if anything suspicious appears.
### `agent-browser` usage notes for the QA agent
- Prefer `agent-browser batch` for 2+ sequential commands when no intermediate parsing is needed.
- Use `snapshot -i` to discover interactive refs.
- Re-snapshot after navigation or major DOM changes.
- Avoid `wait --load networkidle` unless the page is known to go idle; prefer explicit element/text waits or short fixed waits.
- Record videos at human pace and include pauses that a reviewer can follow.
## Rollout plan
### Initial rollout
- Gate behind a server-side advisor-enabled flag.
- Enable only for selected internal/root agent chats first.
- Watch metrics for:
- invocation count;
- failure rate;
- latency;
- obvious retry loops.
### Expansion conditions
Expand beyond the initial rollout only after the following are true:
- mixed-batch policy behavior is stable;
- cost impact is understood;
- frontend UX is readable in production-like dogfood;
- no recursion surprises have appeared with sub-agent flows.
### Explicit non-goals for the first release
- advisor inside child/sub-agent chats;
- provider-agnostic streaming phase UI;
- MCP-based external advisor implementation;
- mandatory DB-backed advisor cost reporting.
## Final acceptance checklist
- [ ] `advisor` is a built-in chatd tool, not an MCP/dynamic-tool substitute.
- [ ] The nested advisor call is tool-less and bounded to one in-memory step.
- [ ] One eligibility boolean controls both tool registration and prompt guidance injection.
- [ ] Root chats can use advisor; child chats cannot in the initial rollout.
- [ ] Mixed advisor/action batches produce deterministic policy errors instead of partial execution.
- [ ] Per-run usage caps and limit-reached behavior work.
- [ ] Advisor usage is visible in metadata/metrics without forcing a DB migration for MVP.
- [ ] The Agents UI has a readable advisor card and Storybook coverage.
- [ ] Dogfooding produced screenshots and repro videos for the required scenarios.
- [ ] Validation commands (`make lint`, targeted `make test`, Storybook tests, `make pre-commit`) passed before handoff.
## Suggested PR split
1. **PR 1 — Backend foundation**
- `chatadvisor/` package
- `chattool/advisor.go`
- `chatloop` exclusive policy
- chatd gating/prompt sync
- backend tests
2. **PR 2 — Frontend + QA**
- advisor renderer
- stories/play assertions
- dogfood artifacts and QA notes
3. **PR 3 — Optional follow-ups only if demanded by stakeholders**
- separate advisor model override
- persistent advisor billing/queryability
- transient phase-stream UX
</details>
---
_Generated with [`mux`](https://github.com/coder/mux) • Model: `anthropic:claude-opus-4-7` • Thinking: `max`_
334 lines
13 KiB
SQL
-- name: UpsertDefaultProxy :exec
|
|
-- The default proxy is implied and not actually stored in the database.
|
|
-- So we need to store it's configuration here for display purposes.
|
|
-- The functional values are immutable and controlled implicitly.
|
|
INSERT INTO site_configs (key, value)
|
|
VALUES
|
|
('default_proxy_display_name', @display_name :: text),
|
|
('default_proxy_icon_url', @icon_url :: text)
|
|
ON CONFLICT
|
|
(key)
|
|
DO UPDATE SET value = EXCLUDED.value WHERE site_configs.key = EXCLUDED.key
|
|
;
|
|
|
|
-- name: GetDefaultProxyConfig :one
|
|
SELECT
|
|
COALESCE((SELECT value FROM site_configs WHERE key = 'default_proxy_display_name'), 'Default') :: text AS display_name,
|
|
COALESCE((SELECT value FROM site_configs WHERE key = 'default_proxy_icon_url'), '/emojis/1f3e1.png') :: text AS icon_url
|
|
;
|
|
|
|
-- name: InsertDeploymentID :exec
|
|
INSERT INTO site_configs (key, value) VALUES ('deployment_id', $1);
|
|
|
|
-- name: GetDeploymentID :one
|
|
SELECT value FROM site_configs WHERE key = 'deployment_id';
|
|
|
|
-- name: InsertDERPMeshKey :exec
|
|
INSERT INTO site_configs (key, value) VALUES ('derp_mesh_key', $1);
|
|
|
|
-- name: GetDERPMeshKey :one
|
|
SELECT value FROM site_configs WHERE key = 'derp_mesh_key';
|
|
|
|
-- name: UpsertLastUpdateCheck :exec
|
|
INSERT INTO site_configs (key, value) VALUES ('last_update_check', $1)
|
|
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'last_update_check';
|
|
|
|
-- name: GetLastUpdateCheck :one
|
|
SELECT value FROM site_configs WHERE key = 'last_update_check';
|
|
|
|
-- name: UpsertAnnouncementBanners :exec
|
|
INSERT INTO site_configs (key, value) VALUES ('announcement_banners', $1)
|
|
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'announcement_banners';
|
|
|
|
-- name: GetAnnouncementBanners :one
|
|
SELECT value FROM site_configs WHERE key = 'announcement_banners';
|
|
|
|
-- name: UpsertLogoURL :exec
|
|
INSERT INTO site_configs (key, value) VALUES ('logo_url', $1)
|
|
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'logo_url';
|
|
|
|
-- name: GetLogoURL :one
|
|
SELECT value FROM site_configs WHERE key = 'logo_url';
|
|
|
|
-- name: UpsertApplicationName :exec
|
|
INSERT INTO site_configs (key, value) VALUES ('application_name', $1)
|
|
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'application_name';
|
|
|
|
-- name: GetApplicationName :one
|
|
SELECT value FROM site_configs WHERE key = 'application_name';
|
|
|
|
-- name: GetHealthSettings :one
|
|
SELECT
|
|
COALESCE((SELECT value FROM site_configs WHERE key = 'health_settings'), '{}') :: text AS health_settings
|
|
;
|
|
|
|
-- name: UpsertHealthSettings :exec
|
|
INSERT INTO site_configs (key, value) VALUES ('health_settings', $1)
|
|
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'health_settings';
|
|
|
|
-- name: GetNotificationsSettings :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'notifications_settings'), '{}') :: text AS notifications_settings
;

-- name: UpsertNotificationsSettings :exec
INSERT INTO site_configs (key, value) VALUES ('notifications_settings', $1)
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'notifications_settings';

-- name: GetPrebuildsSettings :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'prebuilds_settings'), '{}') :: text AS prebuilds_settings
;

-- name: UpsertPrebuildsSettings :exec
INSERT INTO site_configs (key, value) VALUES ('prebuilds_settings', $1)
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'prebuilds_settings';

-- name: GetRuntimeConfig :one
SELECT value FROM site_configs WHERE site_configs.key = $1;

-- name: UpsertRuntimeConfig :exec
INSERT INTO site_configs (key, value) VALUES ($1, $2)
ON CONFLICT (key) DO UPDATE SET value = $2 WHERE site_configs.key = $1;

-- name: DeleteRuntimeConfig :exec
DELETE FROM site_configs
WHERE site_configs.key = $1;

-- name: GetOAuth2GithubDefaultEligible :one
SELECT
	CASE
		WHEN value = 'true' THEN TRUE
		ELSE FALSE
	END
FROM site_configs
WHERE key = 'oauth2_github_default_eligible';

-- name: UpsertOAuth2GithubDefaultEligible :exec
INSERT INTO site_configs (key, value)
VALUES (
	'oauth2_github_default_eligible',
	CASE
		WHEN sqlc.arg(eligible)::bool THEN 'true'
		ELSE 'false'
	END
)
ON CONFLICT (key) DO UPDATE
SET value = CASE
	WHEN sqlc.arg(eligible)::bool THEN 'true'
	ELSE 'false'
END
WHERE site_configs.key = 'oauth2_github_default_eligible';

-- name: UpsertWebpushVAPIDKeys :exec
INSERT INTO site_configs (key, value)
VALUES
	('webpush_vapid_public_key', @vapid_public_key :: text),
	('webpush_vapid_private_key', @vapid_private_key :: text)
ON CONFLICT (key)
DO UPDATE SET value = EXCLUDED.value WHERE site_configs.key = EXCLUDED.key;

-- name: GetWebpushVAPIDKeys :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'webpush_vapid_public_key'), '') :: text AS vapid_public_key,
	COALESCE((SELECT value FROM site_configs WHERE key = 'webpush_vapid_private_key'), '') :: text AS vapid_private_key;

-- name: GetChatSystemPrompt :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'agents_chat_system_prompt'), '') :: text AS chat_system_prompt;

-- GetChatSystemPromptConfig returns both chat system prompt settings in a
-- single read to avoid torn reads between separate site-config lookups.
-- The include-default fallback preserves the legacy behavior where a
-- non-empty custom prompt implied opting out before the explicit toggle
-- existed.
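--
-- Worked cases for include_default_system_prompt, read off the COALESCE
-- in this query (an illustrative summary, not additional behavior):
--   toggle row = 'true'                          -> true
--   toggle row present, any other value          -> false
--   no toggle row, custom prompt non-empty       -> false (legacy opt-out)
--   no toggle row, custom prompt empty or unset  -> true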
-- name: GetChatSystemPromptConfig :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'agents_chat_system_prompt'), '') :: text AS chat_system_prompt,
	COALESCE(
		(SELECT value = 'true' FROM site_configs WHERE key = 'agents_chat_include_default_system_prompt'),
		NOT EXISTS (
			SELECT 1
			FROM site_configs
			WHERE key = 'agents_chat_system_prompt'
				AND value != ''
		)
	) :: boolean AS include_default_system_prompt;

-- name: UpsertChatSystemPrompt :exec
INSERT INTO site_configs (key, value) VALUES ('agents_chat_system_prompt', $1)
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'agents_chat_system_prompt';

-- name: GetChatPlanModeInstructions :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'agents_chat_plan_mode_instructions'), '') :: text AS plan_mode_instructions;

-- name: UpsertChatPlanModeInstructions :exec
INSERT INTO site_configs (key, value) VALUES ('agents_chat_plan_mode_instructions', $1)
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'agents_chat_plan_mode_instructions';

-- name: GetChatExploreModelOverride :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'agents_chat_explore_model_override'), '') :: text AS model_config_id;

-- name: UpsertChatExploreModelOverride :exec
INSERT INTO site_configs (key, value) VALUES ('agents_chat_explore_model_override', $1)
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'agents_chat_explore_model_override';

-- name: GetChatGeneralModelOverride :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'agents_chat_general_model_override'), '') :: text AS model_config_id;

-- name: UpsertChatGeneralModelOverride :exec
INSERT INTO site_configs (key, value) VALUES ('agents_chat_general_model_override', $1)
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'agents_chat_general_model_override';

-- name: GetChatDesktopEnabled :one
SELECT
	COALESCE((SELECT value = 'true' FROM site_configs WHERE key = 'agents_desktop_enabled'), false) :: boolean AS enable_desktop;

-- name: UpsertChatDesktopEnabled :exec
INSERT INTO site_configs (key, value)
VALUES (
	'agents_desktop_enabled',
	CASE
		WHEN sqlc.arg(enable_desktop)::bool THEN 'true'
		ELSE 'false'
	END
)
ON CONFLICT (key) DO UPDATE
SET value = CASE
	WHEN sqlc.arg(enable_desktop)::bool THEN 'true'
	ELSE 'false'
END
WHERE site_configs.key = 'agents_desktop_enabled';

-- GetChatAdvisorConfig returns the deployment-wide runtime configuration
-- for the experimental chat advisor as a JSON blob. Callers unmarshal the
-- result into codersdk.AdvisorConfig. Returns '{}' when unset so zero
-- values apply by default.
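--
-- Illustrative sketch of the default, assuming nothing beyond this query
-- (the stored payload shown is a hypothetical example; its key name is an
-- assumption and is not defined in this file):
--   no 'agents_advisor_config' row  -> '{}'
--   row value '{"enabled":true}'    -> '{"enabled":true}' (returned verbatim as text)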
-- name: GetChatAdvisorConfig :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'agents_advisor_config'), '{}') :: text AS advisor_config;

-- UpsertChatAdvisorConfig stores the deployment-wide runtime configuration
-- for the experimental chat advisor. Callers marshal codersdk.AdvisorConfig
-- to JSON before invoking this query.
-- name: UpsertChatAdvisorConfig :exec
INSERT INTO site_configs (key, value) VALUES ('agents_advisor_config', $1)
ON CONFLICT (key) DO UPDATE SET value = $1 WHERE site_configs.key = 'agents_advisor_config';

-- GetChatDebugLoggingAllowUsers returns the runtime admin setting that
-- allows users to opt into chat debug logging when the deployment does
-- not already force debug logging on globally.
-- name: GetChatDebugLoggingAllowUsers :one
SELECT
	COALESCE((SELECT value = 'true' FROM site_configs WHERE key = 'agents_chat_debug_logging_allow_users'), false) :: boolean AS allow_users;

-- UpsertChatDebugLoggingAllowUsers updates the runtime admin setting that
-- allows users to opt into chat debug logging.
-- name: UpsertChatDebugLoggingAllowUsers :exec
INSERT INTO site_configs (key, value)
VALUES (
	'agents_chat_debug_logging_allow_users',
	CASE
		WHEN sqlc.arg(allow_users)::bool THEN 'true'
		ELSE 'false'
	END
)
ON CONFLICT (key) DO UPDATE
SET value = CASE
	WHEN sqlc.arg(allow_users)::bool THEN 'true'
	ELSE 'false'
END
WHERE site_configs.key = 'agents_chat_debug_logging_allow_users';

-- GetChatTemplateAllowlist returns the JSON-encoded template allowlist.
-- Returns an empty string when no allowlist has been configured (all templates allowed).
-- name: GetChatTemplateAllowlist :one
SELECT
	COALESCE((SELECT value FROM site_configs WHERE key = 'agents_template_allowlist'), '') :: text AS template_allowlist;

-- GetChatIncludeDefaultSystemPrompt preserves the legacy default
-- for deployments created before the explicit include-default toggle.
-- When the toggle is unset, a non-empty custom prompt implies false;
-- otherwise the setting defaults to true.
-- name: GetChatIncludeDefaultSystemPrompt :one
SELECT
	COALESCE(
		(SELECT value = 'true' FROM site_configs WHERE key = 'agents_chat_include_default_system_prompt'),
		NOT EXISTS (
			SELECT 1
			FROM site_configs
			WHERE key = 'agents_chat_system_prompt'
				AND value != ''
		)
	) :: boolean AS include_default_system_prompt;

-- name: UpsertChatIncludeDefaultSystemPrompt :exec
INSERT INTO site_configs (key, value)
VALUES (
	'agents_chat_include_default_system_prompt',
	CASE
		WHEN sqlc.arg(include_default_system_prompt)::bool THEN 'true'
		ELSE 'false'
	END
)
ON CONFLICT (key) DO UPDATE
SET value = CASE
	WHEN sqlc.arg(include_default_system_prompt)::bool THEN 'true'
	ELSE 'false'
END
WHERE site_configs.key = 'agents_chat_include_default_system_prompt';

-- name: GetChatWorkspaceTTL :one
-- Returns the global TTL for chat workspaces as a Go duration string.
-- Returns "0s" (disabled) when no value has been configured.
SELECT
	COALESCE(
		(SELECT value FROM site_configs WHERE key = 'agents_workspace_ttl'),
		'0s'
	)::text AS workspace_ttl;

-- name: UpsertChatTemplateAllowlist :exec
INSERT INTO site_configs (key, value) VALUES ('agents_template_allowlist', @template_allowlist)
ON CONFLICT (key) DO UPDATE SET value = @template_allowlist WHERE site_configs.key = 'agents_template_allowlist';

-- name: UpsertChatWorkspaceTTL :exec
INSERT INTO site_configs (key, value)
VALUES ('agents_workspace_ttl', @workspace_ttl::text)
ON CONFLICT (key) DO UPDATE
SET value = @workspace_ttl::text
WHERE site_configs.key = 'agents_workspace_ttl';

-- name: GetChatRetentionDays :one
-- Returns the chat retention period in days. Chats archived longer
-- than this and orphaned chat files older than this are purged by
-- dbpurge. Returns 30 (days) when no value has been configured.
-- A value of 0 disables chat purging entirely.
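--
-- Worked cases (illustrative, derived from the COALESCE in this query):
--   no 'agents_chat_retention_days' row -> 30
--   row value '0'                       -> 0 (purging disabled)
--   row value '7'                       -> 7
-- A stored value that is not a valid integer would make the
-- value::integer cast fail rather than fall back to 30.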
SELECT COALESCE(
	(SELECT value::integer FROM site_configs
	WHERE key = 'agents_chat_retention_days'),
	30
) :: integer AS retention_days;

-- name: UpsertChatRetentionDays :exec
INSERT INTO site_configs (key, value)
VALUES ('agents_chat_retention_days', CAST(@retention_days AS integer)::text)
ON CONFLICT (key) DO UPDATE SET value = CAST(@retention_days AS integer)::text
WHERE site_configs.key = 'agents_chat_retention_days';

-- name: GetChatAutoArchiveDays :one
-- Auto-archive window in days. 0 disables.
SELECT COALESCE(
	(SELECT value::integer FROM site_configs
	WHERE key = 'agents_chat_auto_archive_days'),
	@default_auto_archive_days::integer
) :: integer AS auto_archive_days;

-- name: UpsertChatAutoArchiveDays :exec
INSERT INTO site_configs (key, value)
VALUES ('agents_chat_auto_archive_days', CAST(@auto_archive_days AS integer)::text)
ON CONFLICT (key) DO UPDATE SET value = CAST(@auto_archive_days AS integer)::text
WHERE site_configs.key = 'agents_chat_auto_archive_days';