Commit Graph

492 Commits

Author SHA1 Message Date
Michael Suchacz 06526a5822 feat: use AI provider chat APIs (#25415) 2026-05-22 07:53:23 +02:00
Michael Suchacz 40878eeba4 feat: add AI provider schema expansion (#25412) 2026-05-22 02:16:01 +02:00
Steven Masley 9b6eadab77 fix: drop N+1 db query on template ACL available (#25465)
Fixes
[PLAT-149](https://linear.app/codercom/issue/PLAT-149/template-permissions-search-is-extremely-slow-with-many-groups).

`/acl/available` ran a db query per group. A deployment with >5,000
groups made this route extremely slow.
2026-05-20 22:40:50 +00:00
Danny Kopping dd3223451b feat: add AI providers HTTP CRUD handlers (#24894) 2026-05-20 10:21:36 +02:00
Michael Suchacz 5a8d0016a5 feat: add personal skill storage, API, and SDK (#25363)
> Mux updated this PR on behalf of Mike.

## Stack Context

This PR is the storage, permissions, API, and SDK layer for experimental
personal skills. #25362 has landed on `main`, so this branch is
restacked directly on `main`.

Stack order:
1. #25363 storage, permissions, API, and SDK
2. #25365 API test coverage
3. #25366 chattool and chatd integration
4. #25066 settings UI and docs
5. #25386 personal skills slash menu

## What?

Adds the `user_skills` database table, generated queries, RBAC resources
and scopes, audit resource handling, experimental user-scoped CRUD
endpoints, SDK types, and generated API/site types.

Follow-up review and restack fixes:
- Enforce a bounded personal skill description in parser and database
constraints.
- Return `403 Forbidden` for unauthorized create and update attempts.
- Return explicit conflict responses when soft-deleted users are
targeted.
- Keep user admins out of personal skills, while site owners can read
and delete but not create or update.
- Document trigger-raised constraint names and keep schema constants
covered by tests.
- Reuse `UserSkillMetadata` in the full `UserSkill` SDK response type.
- Generate user skill IDs in Go instead of relying on a database
default.
- Rebase on latest `main` and renumber the user skills migration to
`000502_user_skills`.

## Why?

Personal skills need durable user-owned storage with owner
authorization, limited site-owner moderation, and a hidden API surface
before chatd can consume them.

## Validation

- `make gen`
- `go test ./coderd/database -run '^TestUserSkillSchemaConstants$'
-count=1`
- `go test ./coderd/database/dbauthz -run
'^TestMethodTestSuite/TestUserSkills$' -count=1`
- `go test ./coderd -run '^TestPatchUserSkill$' -count=1`
- `go test ./codersdk ./coderd/database/db2sdk`
- `make lint`
- pre-commit hook on `97fd58108d`
2026-05-20 00:09:09 +02:00
Danielle Maywood 170a6e1fe9 feat: add chat sharing foundation (#25041) 2026-05-18 22:32:05 +01:00
Garrett Delfosse 78d4cf9e47 fix: soft-delete stale workspace agents on new build (#25207) 2026-05-18 08:33:29 -04:00
Yevhenii Shcherbina 238968cfa0 feat: add per-group AI budget table and endpoints (#25203)
Closes
https://linear.app/codercom/issue/AIGOV-284/add-group-budgets-table-and-crud-api

## Summary

Adds the `group_ai_budgets` table and the following endpoints:

- `GET /api/v2/groups/{group}/ai/budget`
- `PUT /api/v2/groups/{group}/ai/budget`
- `DELETE /api/v2/groups/{group}/ai/budget`

Each group may have at most one budget row. If no row exists, no budget
is enforced.

### Feature gate
  
Added `RequireFeatureMW(FeatureAIBridge)` on the `/ai/budget` sub-route.

## RBAC

Authorization reuses `rbac.ResourceGroup` with the existing
`.InOrganization(...).WithID(...)` scoping model.

The `dbauthz` wrappers load the parent `groups` row and authorize
against it.

No new resource type is introduced. As a result, anyone with
`group:update` permissions (Owner, OrgAdmin, or UserAdmin within the
organization) can manage AI budgets for that group.

## Read access for group members

`database.Group.RBACObject()` grants `policy.ActionRead` to all members
of the group through the group ACL:

```go
func (g Group) RBACObject() rbac.Object {
	return rbac.ResourceGroup.WithID(g.ID).
		InOrg(g.OrganizationID).
		// Group members can read the group.
		WithGroupACL(map[string][]policy.Action{
			g.ID.String(): {
				policy.ActionRead,
			},
		})
}
```

Because the `GET` endpoint authorizes against the same loaded `Group`
object, any group member can call:

```text
GET /api/v2/groups/{group}/ai/budget
```

`PUT` and `DELETE` remain admin-only. The group ACL grants only
`ActionRead`, so write operations continue to require role-based
`group:update` permissions.

## Alternative considered

A dedicated `rbac.ResourceGroupAiBudget` resource would allow budget
management to be separated from general group administration.

We decided not to add that complexity for now.
2026-05-14 15:54:37 -04:00
Danny Kopping 841b777ccd feat: add ai_providers table, queries, dbauthz, audit, RBAC (#24892) 2026-05-14 16:10:46 +02:00
Danielle Maywood 25a803221e feat: add shell tool display mode preference (#25029) 2026-05-14 14:25:07 +01:00
Michael Suchacz cb37047dce feat: dedicated /prompts endpoint for chat history cycle (#25083)
Follow-up to #25004. The merged change cycles only through messages
already loaded in the in-memory chat store (page size 50). Long chats
and chats whose oldest turns have rolled out of the page lose access to
their earlier prompts in the composer's up/down arrow cycle. This PR
adds a dedicated server endpoint that returns the full prompt history,
newest first, and rewires the composer to use it.

## What changed

### Endpoint

`GET /api/experimental/chats/{chat}/prompts?limit=N`

```go
type ChatPrompt struct { ID int64; Text string }
type ChatPromptsResponse struct { Prompts []ChatPrompt }
```

- `limit`: `0..2000`. `0` (the default) is treated as the server-side
default of 500; out-of-range values return `400`. Negative values are
rejected by the SDK's `PositiveInt32` parser before reaching the
handler.
- Auth: parent-chat read in `dbauthz`, mirroring
`GetChatMessagesByChatID`.
- The SQL filters `role='user'`, `deleted=false`, `visibility IN
('user','both')`, guards the lateral with `jsonb_typeof(content) =
'array'` so legacy V0 scalar-string rows are silently skipped, then
unrolls `content` JSONB with `WITH ORDINALITY` and concatenates only
`type='text'` parts in original order via `string_agg(... ORDER BY
ordinality)`. Messages whose joined text is whitespace-only are dropped
via `HAVING ... ~ '\S'` so cycling never lands on a blank entry.

### Partial index (migration `000494`)

```sql
CREATE INDEX idx_chat_messages_user_prompts
ON chat_messages (chat_id, id DESC)
WHERE deleted = false
  AND role = 'user'
  AND visibility IN ('user', 'both');
```

The partial WHERE matches the query's filter exactly and the key order
matches `ORDER BY id DESC`, so the planner gets both the filter and the
ordering from the index without a sort step.

`EXPLAIN ANALYZE` on a synthetic 51-chat × 5,000-message dataset (≈260k
rows, 10k user prompts in the target chat, `random_page_cost=1.1`):

| | Plan | Buffers hit | Time |
|---|---|---|---|
| Without index | `Index Scan Backward using chat_messages_pkey`,
**250,848 rows removed by filter** | 6,683 | 32.4 ms |
| With index | `Index Scan using idx_chat_messages_user_prompts`, no
filter | 38 | 1.3 ms |

≈25× faster, 175× fewer buffer hits.

### Frontend

- `chatPromptsKey` / `chatPromptsQuery` factories in
`site/src/api/queries/chats.ts` (`staleTime: 30s`, `enabled: chatId !==
""`, asks the server for 500 prompts).
- `ChatPageContent.tsx` replaces the in-memory derivation with
`useQuery(chatPromptsQuery(chatId ?? ""))`. The composer's existing
`cycleHistorySnapshotRef` anchors the in-flight cycle so a refetch
arriving mid-cycle cannot shift the indexed prompt out from under the
user.
- `getEditableUserMessagePayload` now concatenates user-message text
parts verbatim, mirroring the server's `string_agg(part->>'text', ''
ORDER BY ordinality)`, instead of routing through the streaming-oriented
`parseMessageContent` / `appendText` pipeline (which drops
whitespace-only chunks — correct for assistant streams, wrong for a
user's persisted message). This keeps the cycle and the edit path in
agreement on the same message. File blocks are still pulled separately
via
`parseMessageContent(...).blocks.filter(isEditableUserMessageFileBlock)`.
- Cache invalidation in `createChatMessage.onSuccess`,
`editChatMessage.onSettled`, and `useChatStore.upsertCacheMessages`
(only when an upserted message has `role === "user"`).
- Page-level stories pre-seed `chatPromptsKey(CHAT_ID)` from the same
`messagesData` to keep them offline.

## Tests

- New `TestGetChatUserPrompts` in `coderd/exp_chats_test.go` with five
subtests:
- `NewestFirstFiltering` — multi-part concatenation, non-text parts
skipped, whitespace-only filtered, soft-deleted excluded, `model`-only
visibility excluded, assistant-role excluded by `cm.role = 'user'`,
legacy V0 scalar row silently excluded by the `jsonb_typeof` guard,
ordering newest first.
- `LimitClampsResults` — explicit `limit=2` returns the two newest
prompts.
  - `InvalidLimitRejected` — `limit=5000` is `400 Bad Request`.
- `NotFoundForOtherUsers` — a separate user in the same org gets `404`,
not the prompts.
- `EmptyResultIsJSONArray` — zero-message chat and assistant-only chat
both return `Prompts: []` (non-nil, empty).
- New unit test in `messageParsing.test.ts` asserting that
`getEditableUserMessagePayload(["hello", " ", "world"])` returns `"hello
world"`, locking in the agreement with the SQL `string_agg`.
- `dbauthz_test.go` adds the
`MethodTestSuite.TestChats/GetChatUserPromptsByChatID` entry, asserting
parent-chat `policy.ActionRead`.
- `pnpm test src/pages/AgentsPage` — 1159 passed, 2 skipped.
- `make gen` produces no diff.

## Manual verification

Seeded a dev chat with Claude Sonnet 4.6 via the aibridge Anthropic
provider and posted 20 user prompts end-to-end. Verified that the
`/prompts` endpoint returns 20 rows newest-first, that `limit=10` clamps
correctly, that `limit=0` uses the server default of 500, and that the
up/down keyboard cycle in the composer walks the same sequence (and
reverses correctly back to the empty draft).

## Out of scope

- Cross-chat history.
- Per-user opt-out for the cycle.
- File-reference / attachment cycling — the cycle continues to reproduce
plain text only, by design.

<details>
<summary>Implementation plan</summary>

# CODAGT-319 Follow-up — Dedicated `/prompts` endpoint

## Context

The merged feature ([#25004](https://github.com/coder/coder/pull/25004)
/ [d32842f](https://github.com/coder/coder/commit/d32842f)) cycles only
through messages already loaded in the in-memory chat store, which is
capped at the first 50 messages of the current page. Long chats and
chats whose oldest turns have rolled out of the page can no longer
recall their full prompt history. This follow-up exposes a dedicated
server endpoint that returns the user-authored prompts in a chat, newest
first, and rewires the composer to use it.

## Design

### Endpoint

`GET /api/experimental/chats/{chat}/prompts?limit=N`

Returns:

```go
type ChatPrompt struct {
    ID   int64
    Text string
}
type ChatPromptsResponse struct {
    Prompts []ChatPrompt
}
```

- `limit`: `0..2000`. `0` (the default) → server-side default of 500.
The wire-level default is encoded in SQL as `COALESCE(NULLIF($limit, 0),
500)`. Negatives are rejected upstream by `PositiveInt32`; the handler
only caps the upper bound.
- Auth: parent-chat read in `dbauthz`, mirroring
`GetChatMessagesByChatID`.
- Listed under the experimental router so we can iterate without API
guarantees.

### SQL

The query lives in `coderd/database/queries/chats.sql` as
`GetChatUserPromptsByChatID`:

- Filters `role='user'`, `deleted=false`, `visibility IN
('user','both')` to mirror the composer's "what the user actually typed
and can re-send" contract.
- Guards the lateral with `jsonb_typeof(content) = 'array'` so legacy V0
rows whose content is a scalar JSON string (predates migration `000434`)
are silently excluded instead of raising `"cannot extract elements from
a scalar"`.
- Unrolls `content` JSONB with `jsonb_array_elements WITH ORDINALITY`
and concatenates only `type='text'` parts, preserving original order via
`string_agg(... ORDER BY ordinality)`.
- Casts the result to `text` so sqlc emits a `string` field instead of
`[]byte`.
- Drops whitespace-only prompts via `HAVING string_agg(...) ~ '\S'` so
cycling never lands on a blank entry.
- Orders by `cm.id DESC` (`id` is a sequence, so this is "newest first"
without relying on `created_at`).

### Index

New partial index added in migration `000494`:

```sql
CREATE INDEX idx_chat_messages_user_prompts
ON chat_messages (chat_id, id DESC)
WHERE deleted = false
  AND role = 'user'
  AND visibility IN ('user', 'both');
```

The partial WHERE clause matches the query's filter exactly, so the
planner can use the index for both filtering and ordering without a sort
step.

### Frontend

- `chatPromptsKey(chatId)` and `chatPromptsQuery(chatId)` factories in
`site/src/api/queries/chats.ts`. `staleTime: 30s`, `enabled: chatId !==
""`. Asks the server for 500 prompts (well below the 2000 max, plenty
for the cycle).
- `ChatPageContent.tsx` replaces the in-memory derivation with
`useQuery(chatPromptsQuery(chatId ?? ""))`. The composer's
`cycleHistorySnapshotRef` already takes a stable snapshot at cycle
entry, so a refetch arriving mid-cycle cannot shift the indexed prompt
out from under the user.
- `getEditableUserMessagePayload` extracts the edit-path text from raw
user-message parts (filter `type === "text"`, join verbatim) instead of
going through `parseMessageContent` / `appendText`, which is built for
assistant streams and intentionally drops whitespace-only chunks.
Without this, cycling and clicking Edit on the same message could
produce different draft text for messages with whitespace-only
interleaved text parts.
- Cache invalidation: `createChatMessage.onSuccess`,
`editChatMessage.onSettled`, and `useChatStore.upsertCacheMessages`
(when at least one upserted message has `role === "user"`) all
invalidate `chatPromptsKey(chatId)`.

### Tests

- `TestGetChatUserPrompts` (`coderd/exp_chats_test.go`) covers:
- `NewestFirstFiltering` — multi-part concatenation, non-text parts
skipped, whitespace-only filtered, soft-deleted excluded, `model`-only
visibility excluded, assistant-role excluded by `cm.role = 'user'`,
legacy V0 scalar row silently excluded by the `jsonb_typeof` guard,
ordering newest first.
- `LimitClampsResults` — explicit `limit=2` returns the two newest
prompts.
  - `InvalidLimitRejected` — `limit=5000` is `400 Bad Request`.
- `NotFoundForOtherUsers` — a separate user in the same org gets `404`,
not the prompts.
- `EmptyResultIsJSONArray` — zero-message chat and assistant-only chat
both return `Prompts: []` (non-nil, empty).
- `messageParsing.test.ts` adds a unit test asserting that
`getEditableUserMessagePayload(["hello", " ", "world"])` returns `"hello
world"`, locking in the agreement with the SQL `string_agg`.
- `dbauthz_test.go` adds the
`MethodTestSuite.TestChats/GetChatUserPromptsByChatID` entry, asserting
the parent-chat `policy.ActionRead`.

## Out of scope

- Cross-chat history.
- Per-user opt-out for the cycle.
- File-reference / attachment cycling — the cycle still reproduces plain
text only, by design.

</details>

<details>
<summary>coder-agents-review history</summary>

Four review rounds, eight unique findings, all addressed in this PR
(approved twice). Rebased onto `main` twice after R4: first to pick up
new migrations `000491` / `000492`, then again for
`000493_idx_chat_diff_statuses_url_lower`. The prompts-index migration
was renumbered `000491 → 000493 → 000494` via
`coderd/database/migrations/fix_migration_numbers.sh`; no other diff
changes.

| Round | Head | Outcome |
|---|---|---|
| R1 | `725422ab` | `COMMENTED` — 7 findings (DEREM-1..7) |
| R2 | `ab2a8936` | `COMMENTED` — 1 new (DEREM-10) + 1 reraised
(DEREM-5) |
| R3 | `648c5d1f` | **`APPROVED`** — 7 fixed, DEREM-5 deferred via
#25125 |
| R4 | `93b6f450` | **`APPROVED`** — DEREM-5 also fixed in-PR, #25125
closed |

| ID | Where | Resolution |
|---|---|---|
| DEREM-1 | `chats.sql` | Added `jsonb_typeof(content) = 'array'` guard
against V0 scalar rows |
| DEREM-2 | `exp_chats.go` | Removed dead `limit < 0` branch (SDK
rejects upstream) |
| DEREM-3 | `useChatStore.ts` | Rewrote misleading invalidation comment
|
| DEREM-4 | `exp_chats_test.go` | `NewestFirstFiltering` now inserts an
assistant-role message so the `role='user'` filter is exercised
end-to-end |
| DEREM-5 | `messageParsing.ts` | Rewrote
`getEditableUserMessagePayload` to concatenate text parts verbatim,
mirroring the SQL `string_agg` |
| DEREM-6 | `exp_chats.go` | Tightened swagger doc + error message to
spell out the 0–2000 range |
| DEREM-7 | `exp_chats_test.go` | Added `EmptyResultIsJSONArray` subtest
|
| DEREM-10 | `exp_chats_test.go` | `NewestFirstFiltering` now inserts a
raw V0 scalar-content row; verified locally that removing the guard
makes the test fail |

</details>

---

This PR was created on behalf of @ibetitsmike by Coder Agents.
2026-05-14 12:43:12 +02:00
Jaayden Halko 024132e8a4 feat: add theme_mode, theme_light, theme_dark to UserAppearanceSettings (#25076)
Part 1: Backend portion of a change broken into 2 PRs.
Part 2: #25077 

Adds three new UserAppearanceSettings fields (theme_mode, theme_light,
theme_dark) on top of the existing theme_preference and terminal_font.
Replaces GetUserThemePreference and GetUserTerminalFont with a single
GetUserAppearanceSettings aggregate query. The PUT handler is wrapped in
db.InTx so sync-mode's mode + slot writes can never half-apply.
2026-05-14 05:44:05 +01:00
Ethan 8955599bd0 fix: bump sqlc fork to v1.31.1 merge, strip pg_dump meta-commands (#25105)
Closes https://github.com/coder/internal/issues/965

Recent `pg_dump` patch releases (13.22+ / 14.19+ / 15.14+ / 16.10+ /
17.6+) emit `\restrict` / `\unrestrict` psql meta-commands at the head
and tail of schema dumps. These broke both `sqlc` and our
`scripts/migrate-test` schema-equality check. PR #19696 worked around it
by pinning `pg_dump` to a Docker image.

This change unpins the workaround now that `sqlc` handles the
meta-commands:

* Bumps the coder/sqlc fork pin to [`337309b` on
coder/sqlc:main](https://github.com/coder/sqlc/commit/337309bfb9524f38466a5090e310040fc7af0203),
the merge of upstream v1.31.1 (coder/sqlc#6). v1.31.1 includes
[sqlc-dev/sqlc#4390](https://github.com/sqlc-dev/sqlc/pull/4390), the
upstream `\restrict` / `\unrestrict` parser fix. Updated in three places
that pin the fork SHA: `flake.nix` (`sqlc-custom`),
`.github/actions/setup-sqlc/action.yaml`, and the
`dogfood/coder/ubuntu-{22,26}.04` Dockerfiles. The flake's `sha256` /
`vendorHash` are reset to `pkgs.lib.fakeSha256`; Nix will surface the
real hashes on first build, per the existing comment block.
* Reverts #19696's Docker pin in `coderd/database/dbtestutil/db.go`.
Local `pg_dump` (13+) and the `postgres:13` Docker fallback both work
again.
* Strips `\restrict` / `\unrestrict` lines in `normalizeDump` so
`scripts/migrate-test`'s schema comparison is stable across `pg_dump`
versions (the token in those lines is randomized per run).
`TestNormalizeDumpStripsRestrict` locks the behavior in.
* Regenerates with v1.31.1, picking up the version stamp and one
upstream correctness fix in `DeleteLicense`
([sqlc-dev/sqlc#4383](https://github.com/sqlc-dev/sqlc/pull/4383): don't
shadow the input parameter when scanning a single-column return).
2026-05-13 18:55:24 +10:00
Michael Suchacz 96333acda3 fix(coderd): filter build instance agents in SQL (#25031)
Replaces the per-agent Go-side template-version filter in
`handleAuthInstanceID` with a purpose-built SQL query.

`GetWorkspaceBuildAgentsByInstanceID` joins `workspace_agents ->
workspace_resources -> workspace_builds -> provisioner_jobs ->
workspaces` and excludes:

- non-`workspace_build` provisioner jobs (template-version-import,
dry-run)
- deleted agents and sub-agents
- deleted workspaces

The handler:

- drops the per-candidate `GetWorkspaceResourceByID` /
`GetProvisionerJobByID` lookups
- drops the `provisioner_jobs.input` JSON parsing and the follow-up
`GetWorkspaceBuildByID` call
- compares `latestHistory.ID` against `selected.WorkspaceBuildID`
returned directly from the query
- preserves the existing recycled-instance safety check and matching
response codes

One intentional behavior tightening: agents whose workspace is deleted
now return 404 (previously they could reach the recycled-instance check
and return 400, or 200 if the stale build was still latest). This
matches the existing token-auth path, which already refuses to
authenticate against deleted workspaces.

The original `GetWorkspaceAgentsByInstanceID` query is intentionally
untouched. It remains the generic raw lookup used elsewhere in tests and
helpers.

The dbauthz wrapper for the new query uses the system-read fast path
with `fetchWithPostFilter` for non-system reads, with `RBACObject()`
delegating to the embedded `WorkspaceTable`.

Tests:

- new `TestGetWorkspaceBuildAgentsByInstanceID` covering newest-first
ordering, exclusion of deleted/sub agents, exclusion of template-import
and dry-run jobs, and exclusion of deleted workspaces
- new dbauthz mock test for `GetWorkspaceBuildAgentsByInstanceID`
- new `TestPostWorkspaceAuthAWSInstanceIdentity/RecycledInstanceID`
exercising the recycled-instance rejection branch (HTTP 400 when the
agent's build is no longer latest)
- existing `TestPostWorkspaceAuth{AWS,Azure,Google}InstanceIdentity`
continue to cover the handler end to end (including the template-version
+ workspace-build same-instance-ID scenario via
`setupInstanceIDWorkspace`)

> Mux is acting on Mike's behalf.
2026-05-12 14:55:56 +02:00
Thomas Kosiewski 5c3b59151e feat: add Cmd/Ctrl+Enter send setting (#25062)
Adds an Agents General setting to require Cmd/Ctrl+Enter before sending
chat messages. When enabled, plain Enter inserts a newline in agent chat
inputs while the send button remains available.

The preference is now persisted server-side through
`/api/v2/users/{user}/preferences`, alongside the existing user
preference settings, and is applied to both the create-agent input and
existing chat composer. Storybook and API coverage verify the setting,
keyboard behavior, validation, and persistence.

<details>
<summary>Coder Agents notes</summary>

Generated by Coder Agents from a Slack request. Dogfooded with
agent-browser against the Storybook settings and chat input stories.

</details>
2026-05-12 10:09:34 +02:00
Zach b221632615 fix: wipe user secrets when user is soft-deleted (#24985)
Extend the delete_deleted_user_resources() trigger so that secrets
belonging to a soft-deleted user are removed in the same transaction as
the existing api_keys and user_links cleanup.

user_secrets.user_id has ON DELETE CASCADE, but Coder soft-deletes users
by flipping users.deleted rather than removing the row, so the foreign key
cascade never fires and secrets would otherwise survive deletion.

Assisted by Coder Agents.
2026-05-11 09:07:30 -06:00
Yevhenii Shcherbina 4124d1137d feat: add ai_model_prices table (#24932)
# Summary

Implements
https://linear.app/codercom/issue/AIGOV-282/add-ai-model-price-table-and-seed-generator

This PR lays the groundwork for AI Bridge cost controls (per the AI
Governance RFC). It adds the foundation needed for future cost tracking:
a place to store per-model token prices, a way to keep those prices in
sync with upstream pricing data, and a startup mechanism that ensures
every deployment has prices loaded before AI Bridge starts processing
requests.

The price data comes from [models.dev](https://models.dev/), a
community-maintained catalogue of AI provider pricing. A generator
script fetches the latest prices, filters to Anthropic and OpenAI for
now, and produces a seed file checked into the repository.

On every server startup the seed is applied to the database, so new
releases automatically pick up any price corrections that landed since
the previous one. Existing rows are overwritten with the latest prices;
rows for models no longer in the seed are left untouched.

# Batching the AI model price seed: three approaches

Context: at server startup we seed the `ai_model_prices` table from an
embedded JSON price book (~70 rows today, will grow as we add providers,
potentially 4000+).

Each row is:

```text
(provider, model, input_price, output_price, cache_read_price, cache_write_price)
```

Any of the four price columns can be:

- `NULL` → “price unknown for this dimension”
- explicit `0` → “free”

The batch must be an UPSERT so re-running is idempotent and existing
rows pick up new prices.

We considered three implementations.

---

## Approach 1 — Per-row UPSERT in a Go loop

```go
for _, row := range rows {
    if err := db.UpsertAIModelPrice(ctx, database.UpsertAIModelPriceParams{
        Provider:   row.Provider,
        Model:      row.Model,
        InputPrice: nullInt64(row.InputPrice),
        // ...
    }); err != nil {
        return err
    }
}
```

### Pros

- Trivial.
- NULL handling falls out naturally from `sql.NullInt64`.

### Cons

- `N` round-trips per seed.
- With ~70 rows that means ~70 statement executions on every startup,
even inside a transaction.
- Doesn't scale gracefully as the price book grows, potentially 4000+.

---

## Approach 2 — `UNNEST` with parallel arrays

Pass each column as a separate Go slice. Postgres unnests them in
parallel into a virtual table, then `INSERT ... SELECT`.

```sql
INSERT INTO ai_model_prices (
    provider,
    model,
    input_price,
    output_price,
    cache_read_price,
    cache_write_price
)
SELECT
    UNNEST(@providers::text[]),
    UNNEST(@models::text[]),
    NULLIF(UNNEST(@input_prices::bigint[]), -1),
    NULLIF(UNNEST(@output_prices::bigint[]), -1),
    NULLIF(UNNEST(@cache_read_prices::bigint[]), -1),
    NULLIF(UNNEST(@cache_write_prices::bigint[]), -1)
ON CONFLICT (provider, model) DO UPDATE SET
    input_price       = EXCLUDED.input_price,
    output_price      = EXCLUDED.output_price,
    cache_read_price  = EXCLUDED.cache_read_price,
    cache_write_price = EXCLUDED.cache_write_price,
    updated_at        = NOW();
```

Go side: flatten rows into six parallel slices.

Use a sentinel (`-1`) for “missing”, since `lib/pq` can't encode `NULL`
into a `bigint[]` element.

```go
providers := make([]string, len(rows))
models    := make([]string, len(rows))
inputs    := make([]int64,  len(rows))
outputs   := make([]int64,  len(rows))
cacheR    := make([]int64,  len(rows))
cacheW    := make([]int64,  len(rows))

for i, r := range rows {
    providers[i] = r.Provider
    models[i]    = r.Model

    inputs[i] = -1
    if r.InputPrice != nil {
        inputs[i] = *r.InputPrice
    }

    outputs[i] = -1
    if r.OutputPrice != nil {
        outputs[i] = *r.OutputPrice
    }

    cacheR[i] = -1
    if r.CacheReadPrice != nil {
        cacheR[i] = *r.CacheReadPrice
    }

    cacheW[i] = -1
    if r.CacheWritePrice != nil {
        cacheW[i] = *r.CacheWritePrice
    }
}

return db.UpsertAIModelPrices(ctx, database.UpsertAIModelPricesParams{
    Providers:        providers,
    Models:           models,
    InputPrices:      inputs,
    OutputPrices:     outputs,
    CacheReadPrices:  cacheR,
    CacheWritePrices: cacheW,
})
```

### Pros

- Single round-trip.

### Cons

- The generated `sqlc` params become plain `[]int64`, which can't
represent `NULL`.

---

## Approach 3 — `jsonb_array_elements` over a single `@seed::jsonb`
(chosen)

Pass the raw seed JSON as one parameter; let Postgres expand and parse
it.

```sql
INSERT INTO ai_model_prices (
    provider,
    model,
    input_price,
    output_price,
    cache_read_price,
    cache_write_price
)
SELECT
    elem->>'provider',
    elem->>'model',
    (elem->>'input_price')::bigint,
    (elem->>'output_price')::bigint,
    (elem->>'cache_read_price')::bigint,
    (elem->>'cache_write_price')::bigint
FROM jsonb_array_elements(@seed::jsonb) AS elem
ON CONFLICT (provider, model) DO UPDATE SET
    input_price       = EXCLUDED.input_price,
    output_price      = EXCLUDED.output_price,
    cache_read_price  = EXCLUDED.cache_read_price,
    cache_write_price = EXCLUDED.cache_write_price,
    updated_at        = NOW();
```

Go side reduces to:

```go
return db.UpsertAIModelPrices(ctx, seedJSON)
```

### Pros

- Single round-trip.
- NULLs fall out naturally:
  - `(elem->>'cache_write_price')::bigint` becomes `NULL`
  - no sentinels
- The seed is already JSON:
- Existing precedent:
  - `jsonb_array_elements` is already used elsewhere in the codebase

### Cons

- Less type-safe at the SQL boundary than `UNNEST`
- Slightly less standard than `UNNEST`
- Readers need familiarity with:
  - `jsonb_array_elements`
  - `->>` extraction syntax
- Postgres pays JSON parse cost
  - negligible at our scale

---

---

# Decision

We picked Approach 3.

It collapses the round-trips like `UNNEST` does, but without:

- nullable-array workarounds
- sentinel values
2026-05-08 16:45:14 -04:00
Danielle Maywood e7958713a9 feat: add code diff display mode preference (#25027) 2026-05-07 20:15:28 +01:00
Mathias Fredriksson 6b0518d051 fix: state-aware queued message promotion (#24819)
PromoteQueued now branches on chat status: synth tool results before
the user message on requires_action, deferred reorder + Waiting on
running so the worker's persist+auto-promote keeps partial output.
Stale heartbeat falls through to the synchronous path; GetStaleChats
picks up Waiting+queue to recover post-cleanup-crash. Endpoint
returns 202.

Closes CODAGT-119
2026-05-06 19:11:56 +03:00
Michael Suchacz 0bfb9f6f13 feat: show agent turn summary in agents sidebar (#24942)
Persists the agent-generated turn-end summary on `chats` and shows it as
the Agents sidebar subtitle when present, falling back to the model
name. Errors still take precedence.

> Mux is acting on Mike's behalf.

## What changes

**Storage.** New nullable `last_turn_summary` column on `chats`
(migration `000486`). New `UpdateChatLastTurnSummary` query normalizes
blank/whitespace input to `NULL`, preserves `updated_at` (so the chat
does not jump to the top of the sidebar on summary writes), and uses an
`expected_updated_at` stale-write guard so an older async summary cannot
overwrite a newer turn.

**Backend.** `coderd/x/chatd/chatd.go` decouples summary generation from
webpush. Generated summaries persist for completed parent turns even
when webpush is unconfigured or has no subscriptions. The same generated
text is reused as the webpush body when webpush is configured, so the
summary model is not called twice. Generic fallback push text is no
longer persisted; it clears any stale summary instead.
Error/interrupt/pending-action terminal paths clear `last_turn_summary`
for the latest turn.

**Frontend.** `AgentsSidebar.tsx` subtitle priority is now `errorReason
|| lastTurnSummary || modelName`, normalized via the existing
`asNonEmptyString` helper from `blockUtils.ts`.

## Tests

- `TestUpdateChatLastTurnSummary` (database): success,
whitespace-to-NULL, stale guard rejects, `updated_at` preserved.
- `TestUpdateLastTurnSummaryRejectsStaleWrites` (chatd internal): direct
stale-`expected_updated_at` test.
- `TestSuccessfulChatPersistsTurnSummaryWithoutWebPush`: persistence
works without webpush subscriptions.
- `TestSuccessfulChatSendsWebPushWithSummary`: same generated text
drives both DB and push body.
-
`TestSuccessfulChatSendsWebPushFallbackWithoutSummaryForEmptyAssistantText`:
fallback text is not persisted.
- `TestErroredChatClearsLastTurnSummaryAndSendsWebPush`: error path
clears the field.
- `TestInterruptChatDoesNotSendWebPushNotification`: interrupt path
clears the field, no push fires.
- `AgentsSidebar.test.tsx`: subtitle priority for summary-present,
error-wins, no-summary fallback, whitespace fallback.
- `AgentsSidebar.stories.tsx`: `ChatWithTurnSummary` and
`ChatWithTurnSummaryAndError`.

## Notes

- No backfill. Existing chats keep showing the model name until their
next turn completes.
- Parent chats only in this iteration; the field is rendered on any
`Chat` if a future change extends generation to children.
- Decoupling generation from webpush adds quickgen model calls for
completed parent turns that previously skipped generation when no
subscriptions existed. Existing parent-only, assistant-text-present,
`PushSummaryModel` configured, and bounded-timeout gates keep this
behavior bounded.
2026-05-06 16:43:35 +02:00
Michael Suchacz 2874d4b4cd feat: add chat debug retention purge (#24943)
> Mux is acting on Mike's behalf.

Adds configurable retention for chat debug data, including the purge
query, updated_at index, site config, experimental API, SDK types,
frontend lifecycle setting, and docs.

The purge deletes debug runs older than the configured retention window
and relies on existing cascades to delete steps. The default retention
is 30 days, and setting the value to 0 disables the purge.
2026-05-05 22:37:13 +02:00
Zach 1b2a1af097 feat: report user secrets adoption summary in telemetry (#24854)
Add a deployment-wide user secrets summary to the telemetry snapshot so
we can track adoption of user secrets
The summary reports:

- A breakdown of secrets by which injection fields are populated:
EnvNameOnly, FilePathOnly, Both, Neither
- The distribution of secrets per user (max, p25, p50, p75, p90)

All metrics are scoped to active non-system users. Soft-deleted users
are excluded. The percentile distribution is computed across the entire
active non-system user base, including users with zero secrets, so the
percentiles reflect deployment-wide adoption.

Assisted by Coder Agents.
2026-05-05 10:56:39 -06:00
Michael Suchacz 632dcdb63a feat: add personal chat model overrides (#24715) 2026-05-05 00:57:51 +02:00
Michael Suchacz 0bb09935bc feat: add computer-use provider selection for AI agents (#24772)
Adds a deployment-wide setting to select the computer-use provider
(Anthropic or OpenAI) for AI agents, plus the OpenAI computer-use runner
needed to honor that selection.

The setting is stored in `site_configs` under
`agents_computer_use_provider`, defaults to Anthropic when unset, and is
exposed via experimental GET/PUT endpoints under
`/api/experimental/chats/config/computer-use-provider`. The chatd
computer-use tool now dispatches to either `runAnthropicComputerUse` or
`runOpenAIComputerUse` based on the resolved provider, with
provider-specific result metadata for OpenAI screenshots.

Frontend adds a provider dropdown to the Agents Experiments settings
page nested under the virtual desktop toggle, with disabled state
handling while virtual desktop is off and skeleton loaders while config
queries are in flight.

Hugo and Codex review follow-up:
- Uses shared provider validation and clearer computer-use constant
names.
- Removes stale OpenAI pending-safety-checks commentary.
- Documents why provider result metadata is needed for OpenAI
screenshots.
- Keeps the computer-use subagent visible when provider credentials are
missing, then returns a clear spawn-time configuration error.
- Uses OpenAI's recommended 1600x900 screenshot geometry to preserve the
native 16:9 aspect ratio.
- Moves OpenAI-specific computer-use helpers into
`coderd/x/chatd/chatopenai/computeruse` after rebasing onto the provider
package refactor in `main`.
- Converts OpenAI pixel scroll deltas to Coder desktop wheel-click
amounts.
- Preserves OpenAI pointer modifiers with key down/up desktop actions
and rejects unsupported non-left double-click buttons explicitly.
- Maps OpenAI back/forward side-button clicks to browser navigation key
actions.
- Defaults omitted OpenAI click buttons to left-click.
- Retries mouse release cleanup if the final OpenAI drag release fails.
- Keeps computer-use subagent availability messages stable when provider
config cannot be loaded, while logging the backend error.
- Releases remaining OpenAI modifier keys if a synthetic key-up cleanup
action fails.
- Updates Storybook interaction stories so provider snapshots show the
selected final provider.

> Mux updated this PR description on behalf of Mike.
2026-05-04 20:30:50 +02:00
Michael Suchacz 033ed0bb82 feat: add admin-configurable chat title generation model (#24838)
Adds an admin-configurable deployment-wide setting that controls which
model is used for chat title generation. Admins can pick any enabled
chat model config from the Agents settings page, or leave the setting
unset to keep the existing fast-models-then-chat-model fallback
algorithm.

When a model is selected, both automatic and manual title generation use
only that model, with no silent fallback. When the configured model is
disabled, missing credentials, or otherwise unusable, automatic title
generation skips entirely (best-effort) and manual title regeneration
returns a clear error, so admins notice the misconfiguration instead of
silently routing title traffic through another provider.

## Surface

- New deployment-wide setting stored as a `site_configs` row
(`agents_chat_title_generation_model_override`).
- New experimental endpoint `GET/PUT
/api/experimental/chats/config/model-override/{context}`.
- Frontend: title generation now appears as a third dropdown on the
Agents admin settings page alongside the existing general and explore
context overrides.

## DRY refactors folded in

Title generation is integrated as a third value of the existing
`ChatModelOverrideContext` type alongside `general` and `explore`,
sharing the parameterized HTTP route, SDK methods, generated types, and
frontend API plumbing rather than introducing a parallel surface. The
`Agent` prefix was dropped from the type and route since title
generation is not a delegated agent.

The chatd model-override resolver is also shared.
`resolveConfiguredModelOverride` now takes a `failureMode` parameter:

- Subagent overrides use soft failure: misconfigured overrides are
logged and the parent model is used.
- Title generation uses hard failure: misconfigured overrides return an
explicit error so manual title regeneration surfaces the
misconfiguration and automatic title generation skips instead of
silently falling back.

> Mux is acting on Mike's behalf.
2026-05-04 13:13:00 +02:00
Jon Ayers 6b9637d85a feat: replace pgcoordinator pg_notify triggers with app-level Publish() (#24717) 2026-05-01 15:00:08 -05:00
Thomas Kosiewski 06bad73df4 feat: add admin-configurable advisor API, SDK, and queries (#24621)
## Summary

Add the **admin-configurable advisor configuration**: database-backed storage, SDK types, and the experimental HTTP handlers that back the admin settings UI (later PRs). Follows the same "site-configs" pattern as Virtual Desktop.

## Motivation

The advisor needs runtime-tunable knobs (enable/disable, per-run cap, max output tokens, reasoning effort, optional model override) without a service restart or redeploy. Using the existing `site_configs` K/V table keeps this pattern consistent with other admin features and avoids a bespoke schema.

## Changes

### Database (`coderd/database/queries/siteconfig.sql`)
- `GetChatAdvisorConfig` returns the stored JSON blob (default `'{}'`) under key `agents_advisor_config`.
- `UpsertChatAdvisorConfig` uses the standard `INSERT ... ON CONFLICT` pattern.
- Regenerated via `make gen` (queries.sql.go + mocks).

### SDK (`codersdk/chats.go`)
- `AdvisorConfig` type with `Enabled`, `MaxUsesPerRun`, `MaxOutputTokens`, `ReasoningEffort` (`""` / `low` / `medium` / `high`), `ModelConfigID uuid.UUID`.
- Client methods: `ChatAdvisorConfig(ctx)` / `UpdateChatAdvisorConfig(ctx, cfg)`.

### API (`coderd/exp_chats.go`)
- `GET /api/experimental/chats/config/advisor`: reads current config; relies on `ActorFromContext` validation.
- `PUT /api/experimental/chats/config/advisor`: requires `policy.ActionUpdate` on `rbac.ResourceDeploymentConfig`.
- Handlers unmarshal `{}` to a typed zero value and re-marshal on upsert for schema stability.
- Tests in `exp_chats_test.go` cover empty defaults, round-trip update, unauthorized update, and invalid body.

## Stack context

This is **PR 3 of 6** in the advisor feature stack. Consumed by:
- PR 4 (`feat/advisor-04-chatd-runtime`), which reads this config on every `runChat`.
- PR 6 (`feat/advisor-06-admin-settings-ui`), which renders the admin form.

## Scope / non-goals

- No `chatd` read path (lands in PR 4).
- No UI (lands in PR 6).
- `agents_advisor_config` remains a single-row JSON blob; we intentionally do not shard per-org/per-template yet.

## Validation

- `make gen`
- `go test ./coderd/database/... -run TestChatAdvisor`
- `go test ./coderd/... -run TestChatAdvisorConfig`
- `make lint`

---

<details>
<summary>📋 Implementation Plan (shared across the advisor stack)</summary>

# Plan: Add a Mux-style advisor tool to coder agents/chatd

## Outcome

Add a first-class `advisor` tool to agent chats in `coderd/x/chatd` that feels native to Coder:

- it is a built-in server-side tool, not an MCP/dynamic-tool workaround;
- it performs a nested **tool-less** model call for strategic advice;
- it is exposed only when eligible, and the prompt mentions it only when it is actually available;
- it is treated as a **planning-only** tool so it does not run alongside action tools in the same batch;
- it tracks usage/cost separately enough for operators to reason about it;
- it has a minimally polished UI in the Agents page;
- and it ships with explicit dogfooding evidence, including screenshots and repro videos.

## Design decisions to lock before coding

1. **Primary architecture:** native built-in tool in `chattool/`, backed by a small `chatadvisor` package.
2. **Nested model execution:** reuse chatd's existing model/provider stack for a one-step, tool-less advisor call rather than inventing a new provider pathway.
3. **Execution policy:** treat `advisor` as an exclusive/planning-only tool; mixed batches must return structured policy errors and force the model to retry cleanly.
4. **Availability:** initial rollout is for root agent chats only; disable for child/sub-agent chats until recursion/cost policy is proven.
5. **Prompt sync:** use one eligibility boolean to drive both tool registration and advisor guidance injection.
6. **Persistence/cost split:** MVP should keep advisor usage visible in result metadata and server metrics; only add DB schema if product/billing explicitly needs queryable advisor-specific cost.
7. **UI scope:** generic tool rendering is an acceptable temporary milestone during backend bring-up, but the release candidate should include a dedicated lightweight advisor renderer.

## Delivery model

The work should be executed as coordinated workstreams with one integration owner and parallel contributors for low-conflict areas. The integration owner should own `coderd/x/chatd/chatd.go` because prompt assembly, tool registration, and model resolution all converge there.

## Detailed workstreams

### Repo evidence used for this plan

<details>
<summary>Mux reference and current chatd seams</summary>

**Mux reference implementation**

- `src/node/services/tools/advisor.ts` — native advisor tool implementation.
- `src/common/constants/advisor.ts` — advisor prompt/constants and truncation policy.
- `src/common/utils/tools/tools.ts` — conditional tool registration.
- `src/node/services/streamContextBuilder.ts` — injects advisor guidance only when the tool is available.

**Current chatd seams**

- `coderd/x/chatd/chatd.go`
  - `processChat()` — tool assembly, prompt assembly, and chatloop invocation.
  - `resolveChatModel()` — current model/provider/key resolution seam.
  - `type Config struct` — server-level chatd configuration surface.
- `coderd/x/chatd/chatloop/chatloop.go`
  - `Run()` — main streaming/model loop.
  - `executeTools()` — built-in tool execution/batching seam.
- `coderd/x/chatd/chattool/` — built-in tool implementations.
- `site/src/pages/AgentsPage/components/ChatElements/tools/Tool.tsx` — tool renderer dispatch.
- `site/src/pages/AgentsPage/components/ChatConversation/messageParsing.ts` and `ConversationTimeline.tsx` — tool/result merge and rendering flow.

</details>

### Workstream map and ownership

| Workstream | Primary owner | Main files | Can run in parallel? | Done when |
|---|---|---|---|---|
| 0. Integration + gating | Integration lead | `coderd/x/chatd/chatd.go` | No; central merge lane | Tool registration, prompt sync, and model selection are wired together |
| 1. Advisor runtime + tool | Backend agent | new `coderd/x/chatd/chatadvisor/`, new `coderd/x/chatd/chattool/advisor.go` | Yes | Tool can perform a tool-less advisor call in memory and return structured results |
| 2. Planning-only execution policy | Chatloop agent | `coderd/x/chatd/chatloop/chatloop.go`, related tests | Yes | Mixed `advisor` + action-tool batches are rejected cleanly and deterministically |
| 3. Metrics/usage/config | Backend/telemetry agent | `chatd.go`, `chatloop/metrics.go`, optional config plumbing | Partially; coordinate with integration lead | Advisor usage is separately visible in metadata/metrics and limits are enforced |
| 4. Frontend rendering | Frontend agent | `site/.../tools/Tool.tsx`, new `AdvisorTool.tsx`, stories | Yes after result schema stabilizes | Advisor renders as a readable card and story tests pass |
| 5. Dogfood + QA evidence | QA agent | dev server, Storybook, dogfood output | After backend + UI are usable | Repro videos, screenshots, and a concise QA report exist |

### Parallelization rules

- **Do not split `coderd/x/chatd/chatd.go` across multiple execution agents without an integration lead.** That file owns prompt building, tool registration, model resolution, and cost persistence.
- Workstreams 1 and 2 can be developed in parallel and then stacked onto the integration branch.
- Workstream 4 should begin once the backend result schema is agreed on, even if the backend is still behind a feature flag.
- Any agent that needs to re-check Mux behavior should clone `coder/mux` into a temporary directory (for example, `$(mktemp -d)/mux`) and inspect it read-only; do not vendor or copy code from Mux directly.

## Phase 0 — Preflight and guardrails

### Goals

- Align the team on the smallest shippable architecture.
- Prevent scope creep into MCP/dynamic-tool/sub-agent variants.
- Decide upfront what is MVP vs. follow-up.

### Tasks

1. **Confirm the MVP boundary.**
   - Ship a built-in advisor tool first.
   - Do **not** make MCP, dynamic tools, or sub-agents the primary implementation.
   - Do **not** add transient streaming phases in the first backend PR unless they fall out almost for free.

2. **Confirm local workflow hygiene before coding.**
   - Ensure the repo is using the project git hooks from `scripts/githooks`.
   - Do not bypass hooks with `--no-verify`.
   - Use `./scripts/develop.sh` for the full dev server rather than manual build/run commands.

3. **Lock the model-selection policy.**
   - **Recommended MVP:** advisor uses the same resolved provider/model/cost config as the current chat, with advisor-specific max-output and usage caps.
   - **Follow-up only if required:** add a separate `AdvisorModelConfigID`-style override that resolves through the existing `configCache`/model-config path. Do not invent a new free-form `provider:model` parser if chatd already stores provider/model separately.

4. **Lock the persistence policy.**
   - **Recommended MVP:** no DB migration. Persist advisor-visible metadata in the tool result and record separate metrics in memory/Prometheus.
   - **Only if product/billing explicitly asks for queryable advisor cost:** add a later DB migration or usage table, following the normal `queries/*.sql` + `make gen` workflow.

5. **Create an execution ADR note in the work item or tracking doc.**
   - Capture: built-in tool, tool-less nested call, root-chat-only rollout, exclusive execution policy, MVP no-DB-migration default.

### Quality gate

- Everyone on the team can state the same answers to these questions:
  - Is advisor a built-in tool? **Yes.**
  - Can advisor run with action tools in the same batch? **No.**
  - Does advisor get tools of its own? **No.**
  - Is a DB migration required for MVP? **No, unless billing insists.**

## Phase 1 — Build the advisor runtime and tool wrapper

### Goals

Create the core advisor implementation in a way that is easy to test and keeps `chattool/` thin.

### Files to add

- `coderd/x/chatd/chatadvisor/types.go`
- `coderd/x/chatd/chatadvisor/guidance.go`
- `coderd/x/chatd/chatadvisor/handoff.go`
- `coderd/x/chatd/chatadvisor/runtime.go`
- `coderd/x/chatd/chatadvisor/runner.go`
- `coderd/x/chatd/chattool/advisor.go`

### Responsibilities by file

1. **`types.go`**
   - Define the input/result schema used by the tool and UI.
   - Keep the result shape close to Mux so the UI and model both have predictable cases.
   - Recommended result variants:
     - `advice`
     - `limit_reached`
     - `error`

   Recommended shape:

   ```go
   type AdvisorArgs struct {
       Question string `json:"question"`
   }

   type AdvisorResult struct {
       Type          string              `json:"type"`
       Advice        string              `json:"advice,omitempty"`
       Error         string              `json:"error,omitempty"`
       AdvisorModel  string              `json:"advisor_model,omitempty"`
       RemainingUses int                 `json:"remaining_uses,omitempty"`
       Usage         *AdvisorUsageResult `json:"usage,omitempty"`
   }
   ```

2. **`guidance.go`**
   - Hold two strings:
     - the nested advisor system prompt;
     - the parent-agent guidance block to inject into the outer system prompt.
   - The nested advisor prompt must say, in plain language:
     - you are advising the parent agent;
     - you do not address the end user directly;
     - you do not claim actions happened;
     - you return concise strategic guidance and tradeoffs.

3. **`runtime.go`**
   - Define the per-run runtime state.
   - Recommended fields:
     - resolved model + model config;
     - provider keys/options reused from the outer chat;
     - `MaxUsesPerRun`;
     - `MaxOutputTokens`;
     - atomic/current call counter;
     - callback(s) to obtain the current prompt snapshot and current-step snapshot;
     - optional metrics/usage hook.
   - Add fail-fast validation for impossible config: nil model, non-positive limits, empty prompt builders, etc.

4. **`handoff.go`**
   - Build the advisor handoff message from:
     - the explicit question;
     - the exact prompt/messages the parent model just used;
     - the current step's text/reasoning snapshot, if available;
     - the most recent relevant tool outputs, if they are already in the prompt snapshot.
   - **Important:** use the already-prepared outer prompt tail, not a fresh DB reload. That keeps the advisor aligned with compaction and the exact context the outer model saw.
   - Apply hard truncation budgets with recent-context bias.

5. **`runner.go`**
   - Execute the nested advisor call.
   - **Recommended implementation:** call `chatloop.Run()` in an in-memory, one-step mode:
     - `Tools: nil`
     - `ProviderTools: nil`
     - `MaxSteps: 1`
     - `PersistStep`: capture the assistant output in memory instead of writing DB rows
   - Reuse the existing provider/model/cost path instead of building a second provider runner.
   - Assert that no tool definitions are passed to the nested call.

6. **`chattool/advisor.go`**
   - Keep this file thin and consistent with other built-ins.
   - Responsibilities:
     - decode `AdvisorArgs`;
     - validate `Question` is non-empty and bounded;
     - call the `chatadvisor` runner;
     - return a structured tool response.

### Defensive programming requirements

- Assert `Question` is non-empty after trimming.
- Assert runtime limits are positive.
- Assert the nested advisor call runs with zero tools/provider tools.
- Assert `AdvisorResult.Type` is one of the known variants before returning.
- Assert remaining uses never goes negative.

### Acceptance criteria

- A unit test can call the advisor tool with a fake model and receive a stable `advice` result.
- The nested advisor call is impossible to run with tools accidentally attached.
- The core logic lives in `chatadvisor/`, not embedded inside `chatd.go`.

## Phase 2 — Wire advisor into chatd and keep prompt/tool availability in sync

### Goals

Register the tool in the right place, expose it only when eligible, and inject system guidance only when the tool is present.

### Files to modify

- `coderd/x/chatd/chatd.go`
- optionally a small helper file if `chatd.go` becomes too crowded

### Tasks

1. **Compute one eligibility boolean in `processChat()`.**
   Recommended inputs:
   - server-level advisor enabled flag;
   - root chat only (`chat.ParentChatID == uuid.Nil` or equivalent existing root/child check);
   - a usable resolved model/provider exists;
   - optional experiment/workspace/org gate if product wants staged rollout.

2. **Create the runtime once per outer chat run.**
   - Use the model/config/keys resolved by `resolveChatModel()`.
   - Reuse provider options from the current chat's `ChatModelCallConfig`.
   - Set `MaxUsesPerRun` and `MaxOutputTokens` from advisor config defaults.

3. **Register the tool in the built-in tool block.**
   - Insert after the skill tools and before MCP tools in `processChat()`.
   - Record `builtinToolNames["advisor"] = true` so metrics stay bounded.

4. **Inject advisor guidance into the outer system prompt using the same boolean.**
   - Use `chatprompt.InsertSystem()` in the same prompt assembly path that already injects user/system instructions.
   - Place the block near the existing instruction insertion, before plan-path/skill context blocks.
   - Wrap the guidance in an explicit tag like `<advisor-guidance>` so it is easy to spot in tests and future refactors.

5. **Keep advisor out of child chats for the first release.**
   - That avoids recursion/cost blowups with `spawn_agent` / `wait_agent` flows.
   - Document this explicitly in the rollout notes and tests.

### Acceptance criteria

- If advisor is disabled, neither the tool nor the prompt guidance appears.
- If advisor is enabled, both the tool and the prompt guidance appear.
- Root chats can use advisor; child chats cannot.
- Built-in tool names include `advisor` so metrics do not collapse it into the generic `mcp` label.

## Phase 3 — Enforce planning-only execution policy in `chatloop`

### Goals

Prevent the model from calling `advisor` and action tools in the same execution batch.

### Files to modify

- `coderd/x/chatd/chatloop/chatloop.go`
- related chatloop tests

### Recommended implementation

Keep the MVP small; do **not** build a general policy engine yet.

1. Add a minimal field to `chatloop.RunOptions`, for example:

   ```go
   ExclusiveToolName *string
   ```

2. In `Run()` / `executeTools()`, detect the case where the exclusive tool appears in the same local-tool batch as any other locally executed tool.

3. When that happens, synthesize structured tool-result errors for the affected calls instead of executing anything in the batch.
   - `advisor` should receive a clear error like: _advisor must be called by itself before action tools_.
   - The sibling action tools should receive a paired policy error like: _this tool was skipped because advisor must run alone_.

4. Let the outer model see those tool errors and retry cleanly.
   - This is simpler and safer than partial execution or hidden deferral.
   - It preserves deterministic transcript history for debugging.

5. Pass the just-finished step snapshot into the tool execution context.
   - The advisor runtime should be able to see the current step's text/reasoning content, because that is often the best hint about what the outer model is trying to decide.

### Why this is the right fit

- It matches the intended semantics: advisor is consulted **before** taking action.
- It avoids subtle race conditions caused by concurrent built-in tool execution.
- It keeps the behavior easy to test with fake models.

### Acceptance criteria

- A model-emitted batch containing only `advisor` succeeds.
- A model-emitted batch containing `advisor` plus any other locally executed tool returns deterministic policy errors and executes nothing.
- Non-advisor tool execution stays unchanged for normal chats.

## Phase 4 — Usage limits, metrics, and configuration

### Goals

Make advisor safe to operate without over-designing billing/storage in the first release.

### Files to modify

- `coderd/x/chatd/chatd.go`
- `coderd/x/chatd/chatloop/metrics.go` as needed
- `coderd/x/chatd/chatd.go` `Config` struct and constructor path
- optional follow-up config/db files only if a separate advisor model or persistent billing is required

### Tasks

1. **Add explicit server config knobs for MVP.**
   Recommended fields on `chatd.Config` or a nested advisor config struct:
   - `AdvisorEnabled bool`
   - `AdvisorMaxUsesPerRun int`
   - `AdvisorMaxOutputTokens int64`

2. **Track usage per outer run.**
   - Reset the counter for each `processChat()` invocation.
   - Return `remaining_uses` in the tool result.
   - Return `limit_reached` when the cap is exhausted.

3. **Expose advisor usage metadata in the tool result.**
   - Include model name and token/cost summary if available.
   - Use the same `callConfig.Cost` calculation path as the outer chat for MVP if advisor reuses the same model.

4. **Record server-side metrics.**
   - Count advisor invocations, failures, and latency.
   - Ensure they show up under the built-in tool label `advisor`.

5. **Optional decision gate: separate advisor model.**
   - If product insists on a stronger/different advisor model, add a follow-up config hook that resolves another existing chat model config through the same `configCache` path.
   - Keep that out of the first landing PR unless it is required for acceptance.

6. **Optional decision gate: queryable advisor cost.**
   - If this becomes required, spin a follow-up DB task:
     - update `coderd/database/queries/*.sql`;
     - add migration files;
     - run `make gen`;
     - update audit mappings if a new auditable type/field is introduced.

### Acceptance criteria

- Advisor calls are capped per outer run.
- Limit exhaustion is user-visible in the tool result.
- Metrics distinguish advisor calls from other built-in tools.
- MVP does not require a schema migration unless explicitly approved.

## Phase 5 — Frontend rendering and Storybook coverage

### Goals

Make advisor feel intentional in the Agents UI without blocking the backend on fancy streaming UI.

### Files to modify

- `site/src/pages/AgentsPage/components/ChatElements/tools/Tool.tsx`
- new `site/src/pages/AgentsPage/components/ChatElements/tools/AdvisorTool.tsx`
- Storybook story file(s) in the same tools directory

### Delivery strategy

1. **Intermediate milestone during backend bring-up:** rely on the existing generic tool renderer if needed.
   - This is acceptable only as a short-lived integration checkpoint.

2. **Release milestone:** add a dedicated lightweight `AdvisorTool` renderer.
   - Reuse existing primitives:
     - `ToolCollapsible`
     - `ToolIcon`
     - `Response` for markdown/prose rendering
     - `ScrollArea` if the advice can be long
   - Keep styling light and consistent with the Agents page.
   - Do not add unnecessary React memoization in `site/src/pages/AgentsPage/`; that area is already React-Compiler aware.

3. **Render the structured result states cleanly.**
   - `advice` — readable prose/markdown with optional metadata footer.
   - `limit_reached` — warning-style message.
   - `error` — error state with visible fallback text.
   - `running` — existing tool loading state/spinner is enough for MVP.

4. **Add Storybook coverage instead of ad-hoc component tests.**
   Recommended stories:
   - successful advice;
   - running/loading;
   - limit reached;
   - error.

5. **Keep the UI contract narrow.**
   - Prefer one text field like `advice` plus small metadata rather than a deeply nested schema.
   - That keeps the UI resilient to prompt iteration.

### Acceptance criteria

- The advisor tool card renders readable content rather than raw quoted JSON in the final release branch.
- Running, limit, and error states are visibly distinct.
- Storybook stories and play assertions cover the new states.
- Existing tool rendering flows remain unchanged.

## Phase 6 — Automated tests and validation gates

### Backend tests to add

1. **Advisor runtime/tool tests**
   - question validation;
   - tool-less nested execution assertion;
   - success result shaping;
   - limit-reached result shaping;
   - error result shaping.

2. **Prompt/gating tests in chatd**
   - advisor disabled ⇒ no tool, no guidance;
   - advisor enabled/root chat ⇒ tool + guidance;
   - child chat ⇒ advisor absent.

3. **Chatloop policy tests**
   - advisor alone runs;
   - advisor + action tool mixed batch returns deterministic policy errors;
   - non-advisor tools still execute normally.

4. **Usage/metrics tests**
   - per-run cap resets correctly;
   - builtin tool labeling includes `advisor`;
   - returned metadata includes model/usage summary when available.

### Frontend tests to add

- Storybook `play()` assertions for the advisor renderer states.
- Verify expand/collapse behavior and visible fallback text.
- Verify the message timeline still renders adjacent tools correctly.

### Recommended command sequence

Run these as the implementation matures, not only at the end:

1. Backend-focused gate after phases 1–4:
   - `make test RUN=TestAdvisor`
   - `make test RUN=TestChatloopAdvisor`
   - `make lint`

2. Frontend-focused gate after phase 5:
   - `pnpm test:storybook src/pages/AgentsPage/components/ChatElements/tools/AdvisorTool.stories.tsx`
   - `pnpm lint`
   - `pnpm format`

3. Final repo gate before handoff:
   - `make pre-commit`
   - run any additional targeted `make test RUN=...` selections covering touched chatd paths

> Use the exact new test names the implementing agents create; the names above are recommended anchors, not existing tests.

## Dogfooding plan

### Principle

Dogfood the change as a real agent feature, not just a unit-tested backend. Per the dogfood and `agent-browser` skills, the reviewer should get **watchable repro videos** plus screenshots that make the behavior obvious without reading logs.

### Required setup

1. Start the full dev environment with:
   - `./scripts/develop.sh`
2. If the frontend renderer changes, also start Storybook from `site/` with:
   - `pnpm storybook --no-open`
3. Use `agent-browser` directly — **never `npx agent-browser`**.
4. Use named browser sessions and an output folder such as:
   - `./dogfood-output/advisor/`
   - with subfolders `screenshots/` and `videos/`

### Evidence protocol

For every interactive scenario below:

1. Start video recording **before** the action.
2. Capture step-by-step screenshots at human pace.
3. Capture one annotated screenshot of the final state.
4. Stop the recording.
5. Note the exact pass/fail observation in the QA report.

For static UI states (for example Storybook error/limit cards), an annotated screenshot is sufficient; video is optional but still encouraged by this project’s review preference.

### Dogfood scenarios

#### Scenario A — Happy path in the real Agents UI

**Goal:** prove that a root agent chat can invoke advisor and produce a readable recommendation before taking further action.

Steps:

1. Open the Agents page with an advisor-enabled root chat.
2. Start a repro video.
3. Send a prompt that should reasonably trigger strategic planning, such as an architecture or multi-tradeoff question.
4. Capture screenshots of:
   - the prompt before send;
   - the running advisor state;
   - the completed advisor card and the assistant’s follow-up response.
5. Stop recording.

Pass criteria:

- advisor appears in the timeline;
- the rendered result is readable;
- the assistant can continue after consuming the advisor output.

#### Scenario B — Advisor unavailable path

**Goal:** prove the feature is truly gated.

Suggested variants (at least one is required, both are better):

- feature flag/config off;
- child/sub-agent chat.

Evidence:

- annotated screenshot of the chat/tool state showing advisor is absent;
- short video if toggling the gate live is part of the repro.

Pass criteria:

- no advisor tool is available;
- no advisor-specific prompt behavior leaks through.

#### Scenario C — UI states in Storybook

**Goal:** prove the renderer handles non-happy states cleanly.

Required story states:

- success/advice;
- running;
- limit reached;
- error.

Evidence:

- one screenshot per state;
- at least one short video showing collapse/expand behavior.

Pass criteria:

- success renders readable advice;
- limit/error have visible fallback text;
- the component behaves like the other tool cards.

#### Scenario D — Regression sweep of nearby tools

**Goal:** ensure advisor does not break the surrounding chat timeline.

Check at minimum:

- another existing built-in tool still renders correctly near advisor;
- sub-agent/tool cards still expand/collapse normally;
- no obvious console errors appear in the Agents page during the advisor flow.

Evidence:

- screenshots of adjacent tool cards;
- console/error capture if anything suspicious appears.

### `agent-browser` usage notes for the QA agent

- Prefer `agent-browser batch` for 2+ sequential commands when no intermediate parsing is needed.
- Use `snapshot -i` to discover interactive refs.
- Re-snapshot after navigation or major DOM changes.
- Avoid `wait --load networkidle` unless the page is known to go idle; prefer explicit element/text waits or short fixed waits.
- Record videos at human pace and include pauses that a reviewer can follow.

## Rollout plan

### Initial rollout

- Gate behind a server-side advisor-enabled flag.
- Enable only for selected internal/root agent chats first.
- Watch metrics for:
  - invocation count;
  - failure rate;
  - latency;
  - obvious retry loops.

### Expansion conditions

Expand beyond the initial rollout only after the following are true:

- mixed-batch policy behavior is stable;
- cost impact is understood;
- frontend UX is readable in production-like dogfood;
- no recursion surprises have appeared with sub-agent flows.

### Explicit non-goals for the first release

- advisor inside child/sub-agent chats;
- provider-agnostic streaming phase UI;
- MCP-based external advisor implementation;
- mandatory DB-backed advisor cost reporting.

## Final acceptance checklist

- [ ] `advisor` is a built-in chatd tool, not an MCP/dynamic-tool substitute.
- [ ] The nested advisor call is tool-less and bounded to one in-memory step.
- [ ] One eligibility boolean controls both tool registration and prompt guidance injection.
- [ ] Root chats can use advisor; child chats cannot in the initial rollout.
- [ ] Mixed advisor/action batches produce deterministic policy errors instead of partial execution.
- [ ] Per-run usage caps and limit-reached behavior work.
- [ ] Advisor usage is visible in metadata/metrics without forcing a DB migration for MVP.
- [ ] The Agents UI has a readable advisor card and Storybook coverage.
- [ ] Dogfooding produced screenshots and repro videos for the required scenarios.
- [ ] Validation commands (`make lint`, targeted `make test`, Storybook tests, `make pre-commit`) passed before handoff.

## Suggested PR split

1. **PR 1 — Backend foundation**
   - `chatadvisor/` package
   - `chattool/advisor.go`
   - `chatloop` exclusive policy
   - chatd gating/prompt sync
   - backend tests

2. **PR 2 — Frontend + QA**
   - advisor renderer
   - stories/play assertions
   - dogfood artifacts and QA notes

3. **PR 3 — Optional follow-ups only if demanded by stakeholders**
   - separate advisor model override
   - persistent advisor billing/queryability
   - transient phase-stream UX


</details>

---
_Generated with [`mux`](https://github.com/coder/mux) • Model: `anthropic:claude-opus-4-7` • Thinking: `max`_
2026-04-30 14:53:08 +02:00
Asher be57af5ff0 feat: add exit code and status to workspace agent scripts (#24505)
For scripts that have not finished or in dry run cases these will be
omitted.
2026-04-29 12:24:26 -08:00
Zach 1c30d52b2b feat: audit user secret create, update, and delete (#24756)
Emit user secret audit log entries for create/update/delete operations.
Reads stay un-audited, matching every other resource.

Audit log entries record changes in user secret name, environment
variable name, file path, and value. The secret value column is marked
`ActionSecret` so the diff records the change without showing the
ciphertext or plaintext.

Closes a TOCTOU window on delete to ensure no phantom audit logs for a
delete of a non-existent secret. Secret update accepts a small TOCTOU
window matching the other audited resources (templates, workspaces,
chats). The two-query pattern is wrapped in a transaction so audit state
can't leak from a failed mutation.
2026-04-29 12:57:47 -06:00
Cian Johnston 70d6efa311 feat: chat auto-archive owner digest notifications (#24643)
Depends on #24642

Adds per-owner digest notifications onto the chat auto-archive
subsystem.

Each tick's archived rows are grouped by owner, the top 25 titles per
owner are rendered into a new `Chats Auto-Archived` notification
template, and any remainder surfaces as `and N more`. Each digest is
per-tick, so users with large amounts of purgeable data may get multiple
notifications in sequence (one per user per tick).

The template body branches on `retention_days`: when retention is
disabled (`retention_days=0`), users are told archived chats are kept
indefinitely rather than falsely claiming imminent deletion.

### Changes
- migration `000XXX_chat_auto_archive_notification_template` adds new
notification template
- `dbpurge`: threads `notifications.Enqueuer` through `New`; and
enqueues notification message.
- `cli/server.go`: passes `options.NotificationsEnqueuer` into
`dbpurge.New`.
- `coderd/notifications/events.go`: new `TemplateChatAutoArchiveDigest`
UUID.
- `coderd/inboxnotifications.go`: inbox registration.
- Docs: adds a `Notifications` section to `chat-auto-archive.md`.

> 🤖
2026-04-28 08:56:36 +01:00
Kyle Carberry 069223ae26 fix: recover web push subscriptions after PWA reinstall (#24720) 2026-04-26 14:49:10 -07:00
Cian Johnston a876287d36 feat: auto-archive inactive chats with audit trail (#24642)
Adds a background job in `dbpurge` that periodically archives chats
inactive beyond a configurable threshold. Each archived root chat gets a
background audit entry tagged `chat_auto_archive`. Disabled by default.

* New `AutoArchiveInactiveChats` SQL query with LATERAL last-activity
subquery and partial index on archive candidates
* `site_configs`-backed `auto_archive_days` setting with admin-only PUT,
any-authenticated-user GET
* Cascade archive via `root_chat_id`; pinned chats and active threads
exempt
* Root-only audit dispatch on detached context, matching manual archive
(`patchChat`) behavior
* 11 subtests covering disabled no-op, boundary, deleted messages, child
activity, pinned exemption, multi-owner, idempotency, and batch
pagination

PR #24643 adds per-owner digest notifications.
PR #24704 adds the requisite UI controls.

> 🤖
2026-04-24 14:18:28 +01:00
Danielle Maywood 3a9a60dff8 feat: add collapsible thinking blocks with configurable display mode (#24635) 2026-04-24 11:29:08 +00:00
Michael Suchacz 3d90546aae feat: add general subagent model override (#24610)
Adds a deployment-wide admin override for general delegated subagents.

## What changed
- store the general override in `site_configs` and expose it through the
shared `agent-model-override/{context}` API
- apply the general override when spawning delegated general subagents,
while preserving the existing Explore override behavior
- reuse a shared Agents settings form for the general and Explore
override sections

## Validation
- `make gen`
- `go test ./coderd -run 'TestChatModelOverrides'`
- `go test ./coderd/x/chatd -run
'TestSpawnAgent_(GeneralUsesConfiguredModelOverride|GeneralOverrideLogsAndFallsBackWhenCredentialsUnavailable|GeneralOverrideLogsAndFallsBackWhenProviderDisabled)'`
- `pnpm -C site lint:types`
- `pnpm -C site test:storybook --
AgentSettingsAgentsPageView.stories.tsx`
- `make lint`
- `make pre-commit`

> Mux is acting on Mike's behalf.
2026-04-24 12:37:20 +02:00
Ethan ad1906589d fix(coderd): allow deleting chat providers used in historical chats (#24568)
Drop the `chat_model_configs.provider -> chat_providers.provider`
foreign key and soft-delete model configs when their provider is
removed. The provider row is now hard-deleted inside a transaction that
also tombstones its model configs and promotes a replacement default
when needed.

Historical chats and messages keep pointing at the soft-deleted model
config rows, which are hidden from live/admin queries but still resolve
for read. The runtime chat path already falls back to the default model
config when a soft-deleted config is looked up.

Replaces the lost FK validation in the create/update model-config
handlers with an explicit provider lookup that returns the existing
`Chat provider is not configured.` 400.

## UX

**Admin deleting a chat provider that has historical usage**

- Before: blocked with 400 `Provider models are still referenced by
existing chats.` Admins had no in-product way to remove a provider that
had ever been used.
- After: delete succeeds (204). Any model configs under that provider
are soft-deleted. If the removed provider owned the default model
config, one of the remaining live configs is auto-promoted to the new
default. The promotion is deterministic (`ensureDefaultChatModelConfig`
picks the first live config by `provider ASC, model ASC, updated_at
DESC, id DESC`); there is no picker, and no toast or response detail
names which config became the new default.

**End users with chats that used a deleted provider's model**

- Old chats still open and their history still renders unchanged.
- Sending a new turn in such a chat silently falls back to the current
default model. No banner or warning tells the user the original model is
gone.
- The model picker no longer lists the deleted model.
- If no default model config exists at all after the delete, sending a
new turn fails with `no default chat model config is available`.

**Admin creating or updating a model config against a provider that is
not configured**

- Same as before: 400 `Chat provider is not configured.` Only the
detection mechanism changed (explicit `FOR UPDATE` lookup inside the
transaction, which also serializes against a concurrent provider
delete).

**Admin updating a model config whose row disappears mid-transaction**

- Now returns the standard 404 `Resource not found or you do not have
access to this resource` instead of the previous 500 that leaked `sql:
no rows in result set` in the detail. Unrelated internal races (for
example a race on the promoted default candidate) are still reported as
500 so they are not misclassified as "your target is gone".

Closes CODAGT-23
2026-04-22 19:34:34 +10:00
Jaayden Halko 410f9a5e19 feat: allow renaming of agent chat title (#24489)
Co-authored-by: Coder Agents <noreply@coder.com>
2026-04-20 14:00:46 +01:00
Thomas Kosiewski df7e838c21 feat(coderd): wire debug logging into chat lifecycle (#23917) 2026-04-20 12:27:16 +02:00
Mathias Fredriksson fc2493780f fix: exclude subagent chats from sidebar pagination (#24404)
GetChats now returns only root chats (parent_chat_id IS NULL).
A new GetChildChatsByParentIDs query fetches children for visible
roots and embeds them in each parent's Children field. The
singular getChat endpoint does the same.

Archive invariant is one-way: parent archived implies child
archived. Parent archive/unarchive cascades via root_chat_id.
Individual child archive is permitted; child unarchive while the
parent is archived is rejected atomically (row lock on child,
re-read parent inside the transaction). Embedded children are
filtered by the caller's archive state so individually-archived
children stay hidden from active-parent views.

Gitsync MarkStale uses GetChatsByWorkspaceIDs directly;
MarkStaleParams.OwnerID removed (dead after the switch).

Frontend: buildChatTree reads from the embedded children field,
WebSocket handlers route child events into the parent's children
array, and archiving a child strips it from the parent cache.
2026-04-20 13:19:59 +03:00
Spike Curtis e19b21b7d5 chore: add GetLatestWorkspaceBuildWithStatusByWorkspaceID query (#24441)
<!--

If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.

-->

relates to GRU-18  
  
Adds new database query supporting the Agent Connection Watch we will add.
2026-04-17 22:47:08 -04:00
Thomas Kosiewski 91f9de27a1 feat(coderd): add chat debug service and summary aggregation (#23916) 2026-04-17 16:27:53 +02:00
Dean Sheather 4ba74dcdc8 feat(coderd): add PR status summary to telemetry snapshots (#24379)
Adds aggregate PR counts (total, open, merged, closed) from
`chat_diff_statuses` to telemetry snapshots, giving visibility into AI
agent PR outcomes across deployments.

The existing telemetry system reports `Chats`, `ChatMessageSummaries`,
and `ChatModelConfigs`, but had no PR-level data. This adds a
`ChatDiffStatusSummary` field to the `Snapshot` struct with four
all-time counts derived from a single aggregate query.

<details>
<summary>Implementation details</summary>

- New SQL query `GetChatDiffStatusSummary` counts `chat_diff_statuses`
rows with non-NULL `pull_request_state`, grouped by state
(open/merged/closed).
- `ChatDiffStatusSummary` struct added to telemetry `Snapshot`,
collected via a parallel `eg.Go()` block in `createSnapshot()`.
- `dbauthz` wrapper uses `rbac.ResourceSystem` (telemetry-only pattern).
- Test covers both empty state (zero counts) and populated state (mixed
states + NULL-state exclusion).

</details>

> 🤖 Generated by Coder Agents
2026-04-17 21:56:11 +10:00
Michael Suchacz 73b5058923 feat: add Explore mode as subagent-only modality (#24448)
> This PR was authored by Mux on behalf of Mike.

Introduce Explore mode, a read-only subagent modality for delegated
discovery and code investigation.

## What

Adds a `spawn_explore_agent` tool that creates child chats restricted to
read-only operations. An admin can optionally configure a
deployment-wide
model override so Explore subagents use a model optimized for large
context
or reasoning without changing the root chat's model.

### Backend

- New `ChatModeExplore` enum value (migration 000471).
- `spawn_explore_agent` tool definition with read-only allowlist:
`read_file`, `execute`, `process_output`, `read_skill`,
`read_skill_file`.
  Write tools, file editors, and nested subagent spawning are blocked.
- Deployment config storage for the Explore model override
  (`agents_chat_explore_model_override` in `site_configs`).
- Model resolution hierarchy: configured override, then current turn
model,
then global default. Silent fallback with warning log when the override
  becomes unavailable.
- RBAC: `AsChatd` for daemon reads, `ActionRead` and `ActionUpdate` on
  `ResourceDeploymentConfig` for admin API calls.
- Plan mode root chats can use `spawn_explore_agent` for read-only
research,
  matching the planning prompt guidance.
- The Explore override config API now reports malformed saved overrides
as
  "treated as unset" so admins can clear them explicitly.

### Frontend

- `ExploreModelOverrideSettings` component in admin agent behavior
settings.
  Uses `ModelSelector`, handles unavailable model warnings, and supports
  explicit Save and Clear actions.
- Malformed saved overrides show a warning and require an explicit Save
to
  clear, instead of Clear auto-submitting behind the scenes.

### Tests

- Integration: `TestExploreSubagentIsReadOnly` (full spawn flow, tool
  verification, prompt overlay, DB state).
- Unit: tool allowlist tests for explore, plan, and default modes.
- Internal: model override resolution with valid, invalid UUID,
disabled, and
  unconfigured override scenarios.
- RBAC: `dbauthz_test.go` for `GetChatExploreModelOverride` and
  `UpsertChatExploreModelOverride`.
- API: admin set and clear, malformed stored override reporting,
disabled
  model rejection, non-admin denial.
2026-04-17 13:40:17 +02:00
Michael Suchacz 1092093e98 feat: add internal subagent model override wiring (#24399)
> Mux working on behalf of Mike.

## Summary
- add an enabled chat model config lookup by ID for internal callers
- keep `spawn_agent` unchanged while threading an internal model
override through child subagent chat creation
- extend chatd coverage for inherited bindings, plan mode, and internal
override behavior

## Validation
- `go test ./coderd/x/chatd ./coderd/database/dbauthz`
- `make lint`
2026-04-16 17:08:02 +02:00
Michael Suchacz e5707a13d6 feat: support multiple agents with shared instance-identity auth (#24325)
> This PR was authored by Mux on behalf of Mike.

## Summary

Adds support for multiple peer root workspace agents sharing the same
`auth_instance_id`, so AWS, Azure, and GCP instance-identity auth can
issue the correct session token for a selected agent instead of assuming
a
single root agent per instance.

## Problem

When a Terraform template attaches two or more `coder_agent` resources
(with `auth = "aws-instance-identity"`) to a single compute instance,
every agent shares the same cloud instance ID. The existing singular
lookup picks whichever agent was created most recently, silently
ignoring
the others.

## Solution

Introduce an optional pre-auth agent selector (`CODER_AGENT_NAME`) and
make the server-side lookup ambiguity-aware.

**Database layer:**
- `GetWorkspaceAgentsByInstanceID` (`:many`): returns all matching root
  agents for an instance ID.
- `GetWorkspaceAgentByInstanceIDAndName` (`:one`): returns the named
root
  agent for disambiguation.

**SDK and CLI:**
- `agent_name` field added to AWS, Azure, and GCP request structs
  (`omitempty` for backward compatibility).
- `CODER_AGENT_NAME` env var and `--agent-name` flag wired into the
agent
  bootstrap before instance-identity auth runs.

**Server handler (`handleAuthInstanceID`):**
- When `agent_name` is present: direct lookup by (instance ID, name).
- When absent: legacy lookup, then resource-scoped ambiguity check.
  Returns 409 with available agent names if multiple root agents match.
- Whitespace-only names are trimmed and treated as unspecified.
- Sub-agents remain excluded (`parent_id IS NULL` filter).

**Verification template:**
- `examples/templates/aws-multi-agent/` provisions one EC2 instance with
  two agents (`main` and `dev`), both using instance-identity auth with
  `CODER_AGENT_NAME` set in the cloud-init user data.

## Backward compatibility

Existing single-agent deployments work unchanged. The `agent_name` field
is optional with `omitempty`, and the unnamed path preserves today's
behavior when only one root agent matches.
2026-04-16 13:59:09 +02:00
Michael Suchacz 1cf0354f72 feat: add plan mode with restricted tool boundary (#24236)
> This PR was authored by Mux on behalf of Mike.

## Summary
- add persistent plan mode for chats and the chat-specific plan file
flow
- add structured planning tools such as `ask_user_question` and
`propose_plan`
- keep `write_file` and `edit_files` constrained to the chat-specific
plan file during plan turns
- allow shell exploration in plan mode, including subagents, via
`execute` and `process_output`
- block implementation-oriented, provider-native, MCP, dynamic, and
computer-use tools during plan turns
- update the chat UI, tests, and docs for the new planning flow
2026-04-16 11:12:01 +02:00
Cian Johnston c552f9f281 fix: stop group spend limits from leaking across org boundaries (#24294)
Three SQL queries (`GetUserGroupSpendLimit`,
`ResolveUserChatSpendLimit`, `GetUserChatSpendInPeriod`) aggregated chat
spend limits and usage globally across all organizations. A restrictive
group limit in org A would bleed into org B.

## Changes

- Add `organization_id` parameter to all three SQL queries in
`coderd/database/queries/chats.sql`
- When nil UUID is passed, queries fall back to global behavior
(backward compat for HTTP dashboard endpoints)
- When real org ID is passed, limits and spend are scoped to that
organization
- Thread `organizationID` through `ResolveUsageLimitStatus` →
`checkUsageLimit` → all chatd call sites
- Update dbauthz wrappers for new param structs
- HTTP endpoints (`chatCostSummary`, `getMyChatUsageLimitStatus`) pass
`uuid.Nil` with TODO for future org-scoped UI
- Add `TestResolveUsageLimitStatus_OrgScoped` with 5 test cases covering
org isolation, nil-UUID fallback, spend scoping, and user override
priority

Closes coder/internal#1466

> 🤖
2026-04-14 16:56:17 +01:00
Thomas Kosiewski 6ab30123bf feat: add chat debug log tables, queries, and SDK types (#23913) 2026-04-13 15:06:06 +02:00
Matt Vollmer 36141fafad feat: stack insights tables vertically and paginate Pull requests table (#24198)
The "By model" and "Pull requests" tables on the PR Insights page
(`/agents/settings/insights`) were side-by-side at `lg` breakpoints, and
the Pull requests table was hard-capped at 20 rows by the backend.

- Replaced `lg:grid-cols-2` with a single-column stacked layout so both
tables span the full content width.
- Removed the `LIMIT 20` from the `GetPRInsightsRecentPRs` SQL query so
all PRs in the selected time range are returned.
- Can add this back if we need it. If we do, we should add a little
subheader above this table to indicate that we're not showing all PRs
within the selected timeframe.
- Added client-side pagination to the Pull requests table using
`PaginationWidgetBase` (page size 10), matching the existing pattern in
`ChatCostSummaryView`.
- Renamed the section heading from "Recent" to "Pull requests" since it
now shows the full set for the time range.
<img width="1481" height="1817" alt="image"
src="https://github.com/user-attachments/assets/0066c42f-4d7b-4cee-b64b-6680848edc68"
/>


> 🤖 PR generated with Coder Agents
2026-04-10 10:48:54 -04:00
Garrett Delfosse 3462c31f43 fix: update directory for terraform-managed subagents (#24220)
When a devcontainer subagent is terraform-managed, the provisioner sets
its directory to the host-side `workspace_folder` path at build time. At
runtime, the agent injection code determines the correct
container-internal
path from `devcontainer read-configuration` and sends it via
`CreateSubAgent`.

However, the `CreateSubAgent` handler only updated `display_apps` for
pre-existing agents, ignoring the `Directory` field. This caused
SSH/terminal
sessions to land in `~` instead of the workspace folder (e.g.
`/workspaces/foo`).

Add `UpdateWorkspaceAgentDirectoryByID` query and call it in the
terraform-managed subagent update path to also persist the directory.

Fixes PLAT-118

<details><summary>Root cause analysis</summary>

Two code paths set the subagent `Directory` field:

1. **Provisioner (build time):** `insertDevcontainerSubagent` in
`provisionerdserver.go`
   stores `dc.GetWorkspaceFolder()` — the **host-side** path from the
   `coder_devcontainer` Terraform resource (e.g. `/home/coder/project`).

2. **Agent injection (runtime):**
`maybeInjectSubAgentIntoContainerLocked` in
`api.go` reads the devcontainer config and gets the correct
**container-internal**
path (e.g. `/workspaces/project`), then calls `client.Create(ctx,
subAgentConfig)`.

For terraform-managed subagents (those with `req.Id != nil`),
`CreateSubAgent`
in `coderd/agentapi/subagent.go` recognized the pre-existing agent and
entered
the update path — but only called `UpdateWorkspaceAgentDisplayAppsByID`,
discarding the `Directory` field from the request. The agent kept the
stale
host-side path, which doesn't exist inside the container, causing
`expandPathToAbs` to fall back to `~`.

</details>

> [!NOTE]
> Generated by Coder Agents
2026-04-10 10:11:22 -04:00
Zach 95cff8c5fb feat: add REST API handlers and client methods for user secrets (#24107)
Add the five REST endpoints for managing user secrets, SDK client
methods, and handler tests.

Endpoints:
- `POST /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets/{name}`
- `PATCH /api/v2/users/{user}/secrets/{name}`
- `DELETE /api/v2/users/{user}/secrets/{name}`

Routes are registered under the existing `/{user}` group with
`ExtractUserParam`. The delete query was changed from `:exec` to
`:execrows` so the handler can distinguish "not found" from success
(DELETE with `:exec` silently returns nil for zero affected rows).
2026-04-09 12:12:55 -06:00