Commit Graph

12622 Commits

Author SHA1 Message Date
Kyle Carberry 30d534b36b fix(chatd): fix relay race conditions, extract enterprise relay logic, move pubsub to OSS (#22589)
## Summary

Fixes a bug where interrupting a streaming chat and sending a new message
left the relay connected to the wrong replica. Expanded into a broader
refactor that cleanly separates concerns:

- **OSS** owns pubsub subscription, message catch-up, queue updates,
  status forwarding, and local parts merging.
- **Enterprise** (`enterprise/coderd/chatd`) only manages relay dialing,
  reconnection, and stale-dial discarding for cross-replica streaming.

## Architecture

### OSS `coderd/chatd/chatd.go`

`Subscribe()` builds the initial snapshot then runs a single merge
goroutine that handles:

- Pubsub subscription for durable events (status, messages, queue,
errors)
- Message catch-up via `AfterMessageID`
- Local `message_part` forwarding
- Relay events from enterprise (when `SubscribeFn` is set)
- Sends `StatusNotification` to enterprise so it can manage relay
lifecycle

Key types:

- `SubscribeFn` — enterprise hook, returns relay-only events channel
- `SubscribeFnParams` — `ChatID`, `Chat`, `WorkerID`,
`StatusNotifications`, `RequestHeader`, `DB`, `Logger`
- `StatusNotification` — `Status` + `WorkerID`, sent to enterprise on
pubsub status changes

### Enterprise `enterprise/coderd/chatd/chatd.go`

`NewMultiReplicaSubscribeFn(cfg MultiReplicaSubscribeConfig)` returns a
`SubscribeFn` that:

- Opens an initial synchronous relay if the chat is running on a remote
worker
- Reads `StatusNotifications` from OSS to open/close relay connections
- Handles async dial, reconnect timers, stale-dial discarding
- Returns only relay `message_part` events

## Bug fixes

### Original bug: stale relay dial after interrupt

`openRelayAsync` goroutines used `mergedCtx` (subscription-level), not a
per-dial context. `closeRelay()` could not cancel in-flight dials. When
the user interrupts and a new replica picks up the chat, the old dial
goroutine could complete after the new one and deliver a stale
`relayResult`.

**Fix**: per-dial `dialCtx`/`dialCancel`, `expectedWorkerID` tracking,
`workerID` on `relayResult`. `closeRelay()` cancels the dial context and
drains `relayReadyCh`. Merge loop rejects mismatched worker IDs.
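
A minimal sketch of the per-dial cancellation and worker-ID check described above. It is a single-threaded model rather than the actual enterprise code: `relayState`, `beginDial`, `accept`, and `runScenario` are hypothetical names, and only `expectedWorkerID`, `relayResult`, `relayReadyCh`, and `closeRelay` echo identifiers from this description.

```go
package main

import (
	"context"
	"fmt"
)

// relayResult is a hypothetical stand-in for what a dial goroutine reports.
type relayResult struct {
	workerID string
}

// relayState tracks the single dial the merge loop currently cares about.
type relayState struct {
	expectedWorkerID string
	dialCancel       context.CancelFunc
	relayReadyCh     chan relayResult
}

func newRelayState() *relayState {
	return &relayState{relayReadyCh: make(chan relayResult, 1)}
}

// closeRelay cancels any in-flight dial and drains a result it already queued.
func (s *relayState) closeRelay() {
	if s.dialCancel != nil {
		s.dialCancel()
		s.dialCancel = nil
	}
	s.expectedWorkerID = ""
	select {
	case <-s.relayReadyCh:
	default:
	}
}

// beginDial replaces the current dial with one scoped to its own context.
func (s *relayState) beginDial(parent context.Context, workerID string) context.Context {
	s.closeRelay()
	dialCtx, cancel := context.WithCancel(parent)
	s.dialCancel = cancel
	s.expectedWorkerID = workerID
	return dialCtx
}

// accept reports whether a result belongs to the dial we are waiting for.
func (s *relayState) accept(res relayResult) bool {
	return res.workerID != "" && res.workerID == s.expectedWorkerID
}

// runScenario simulates an interrupt that moves the chat from worker-a to worker-b.
func runScenario() (oldDialCanceled, staleAccepted, freshAccepted bool) {
	s := newRelayState()
	oldCtx := s.beginDial(context.Background(), "worker-a")
	_ = s.beginDial(context.Background(), "worker-b") // interrupt: a new replica took over
	oldDialCanceled = oldCtx.Err() != nil
	staleAccepted = s.accept(relayResult{workerID: "worker-a"})
	freshAccepted = s.accept(relayResult{workerID: "worker-b"})
	return
}

func main() {
	fmt.Println(runScenario()) // prints: true false true
}
```

The point of the sketch is that cancellation and the ID check are complementary: cancellation stops most stale dials early, and the ID comparison catches any result that was already in flight.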

### Additional fixes

- `statusNotifications` send-on-closed-channel race — goroutine now owns
  `close()` via defer
- Enterprise spin-loop on `StatusNotifications` close — two-value
  receive with nil-out
- `hasPubsub` set from `p.pubsub != nil` instead of subscription success
  — now tracks actual subscription result
- `lastMessageID` not initialized from `afterMessageID` — caused
  duplicate messages on catch-up
- `wrappedParts` goroutine leaked remote connection on `dialCtx` cancel
- `closeRelay()` did not drain `relayReadyCh`
- `setChatWaiting` race with `SendMessage(Interrupt)` — wrapped in
`InTx`
- `processChat` post-TX side effects fired when chat was taken by another
  worker — added `errChatTakenByOtherWorker` sentinel
- Cancel closure data race on `reconnectTimer`
- Bare blocking send on pubsub error path
- `localParts` hot-spin after channel close
- No-pubsub branch dropped relay events and initial snapshot
- Failed relay dial caused permanent stall (no reconnect retry)
- DB error during reconnect timer caused permanent stall
- `time.NewTimer` replaced with `quartz.Clock` for testable timing
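
The "two-value receive with nil-out" pattern behind the spin-loop fixes can be sketched as follows. This is a generic Go illustration, not the code from this PR; `mergeUntilClosed` and `demo` are hypothetical names.

```go
package main

import "fmt"

// mergeUntilClosed reads both channels until each has been closed.
// Without the ok check, a closed channel would be ready on every iteration
// and the select would spin on zero values.
func mergeUntilClosed(status, parts <-chan string) (statuses, received []string) {
	for status != nil || parts != nil {
		select {
		case s, ok := <-status:
			if !ok {
				status = nil // a nil channel blocks forever, so this case goes quiet
				continue
			}
			statuses = append(statuses, s)
		case p, ok := <-parts:
			if !ok {
				parts = nil
				continue
			}
			received = append(received, p)
		}
	}
	return statuses, received
}

// demo feeds two buffered channels, closes them, and merges the contents.
func demo() (statuses, received []string) {
	status := make(chan string, 2)
	parts := make(chan string, 3)
	status <- "running"
	status <- "waiting"
	parts <- "a"
	parts <- "b"
	parts <- "c"
	close(status)
	close(parts)
	return mergeUntilClosed(status, parts)
}

func main() {
	statuses, received := demo()
	fmt.Println(statuses, received)
}
```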

## Tests

9 enterprise tests covering:

- Relay reconnect on drop (mock clock)
- Async dial does not block merge loop
- Relay snapshot delivery
- Stale dial discarded after interrupt
- Cancel during in-flight dial
- Running-to-running worker switch
- Failed dial retries (mock clock)
- Local worker closes relay
- Multiple consecutive reconnects (mock clock)

All pass with `-race`.
2026-03-04 18:42:28 -05:00
Kyle Carberry 0ccfc4da06 feat(site): add specialized renderer for process_output tool in agent chats (#22628)
Adds a `ProcessOutputTool` component that renders `process_output` tool
calls with a clean terminal-style output block instead of falling
through to the generic JSON renderer.

## Changes

**New file:** `ProcessOutputTool.tsx`
- Output shown directly with no header
- Copy button and status indicators float top-right on hover
- Collapsible output with the same expand/collapse chevron bar used by
`ExecuteTool`
- Exit code badge shown only for non-zero exits
- Spinner shown while process is still running

**Modified files:**
- `Tool.tsx` — `ProcessOutputRenderer` + registered in `toolRenderers`
map
- `ToolIcon.tsx` — `process_output` falls through to `TerminalIcon`
- `ToolLabel.tsx` — shows "Reading process output" label
2026-03-04 18:04:34 -05:00
Danielle Maywood 96926cf189 fix(site): add optimistic updates for chat archive/unarchive (#22622) 2026-03-04 21:51:33 +00:00
Danielle Maywood 93e5d04896 fix(site): only play completion chime for top-level chat agents (#22623) 2026-03-04 21:17:11 +00:00
david-fraley 9bd5a8d4e9 docs: tasks vscode extension update (#22582) 2026-03-04 20:38:03 +00:00
Zach be019d9a23 chore: bump boundary version to capture dropped logs (#22618) 2026-03-04 13:08:13 -07:00
Kyle Carberry e4bdfbebd3 feat(site): improve agent chat header design (#22621)
## Changes

- **User dropdown → sidebar bottom**: Moved from the TopBar into the
sidebar footer with avatar + display name, whole row clickable to open
the dropdown menu
- **Diff stats inline badge**: Compact green/red pill badge next to the
chat title showing `+additions −deletions`, clickable to toggle the diff
panel
- **Reordered TopBar actions**: Ellipsis menu first, then drawer toggle
button on the far right
- **Notification bell scoped**: Removed from individual chat pages
(remains on `/agents` listing)
- **Cleanup**: Removed unused `signOut`/`buildInfo` destructuring from
AgentsPage

### Files changed
- `site/src/pages/AgentsPage/AgentDetail/TopBar.tsx`
- `site/src/pages/AgentsPage/AgentsPage.tsx`
- `site/src/pages/AgentsPage/AgentsSidebar.tsx`

<img width="1876" height="1597" alt="image"
src="https://github.com/user-attachments/assets/8ec33955-f8b4-4064-9767-19147951b3ff"
/>
2026-03-04 13:35:55 -05:00
Kayla はな e35717bc19 fix: show a notice when workspace sharing is disabled globally in organization settings (#22580) 2026-03-04 11:14:52 -07:00
Spike Curtis fda181bb26 chore: modify task status scaletest to use Agent API dRPC (#22356)
relates to #21335

Modifies our taskstatus scaletest load generator to use the dRPC connection to mimic what an actual running Task would do via the MCP server (cf. PRs below this one in the stack).

Disclosure: I used AI to generate large portions of this PR, but hand-reviewed and tweaked.
2026-03-04 22:12:35 +04:00
Spike Curtis 8327e1f65f chore: mark PatchAppStatus as deprecated in agentsdk (#22355)
relates to #21335

Marks the sdk method that directly calls Coderd to patch App status as deprecated in favor of the Agent API.
2026-03-04 21:52:22 +04:00
Mathias Fredriksson c7dd429bbf fix(coderd/database/dbfake): prevent cross-test job stealing in WorkspaceBuildBuilder (#22598)
Previously, WorkspaceBuildBuilder.doInTX() inserted provisioner jobs
with empty tags and used a loop in AcquireProvisionerJob that could
match other tests' pending jobs when parallel tests share a database.

Add a unique tag (jobID -> "true") to each provisioner job at insert
time, then use that tag in AcquireProvisionerJob to target only the
correct job. This follows the same pattern used in dbgen.ProvisionerJob.
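
As a rough illustration of why a unique tag prevents job stealing, here is a simplified in-memory model. It assumes a job is acquirable when all of its tags appear in the acquirer's tags; `job`, `acquire`, `subset`, and `acquiredWith` are hypothetical and do not mirror the real query.

```go
package main

import "fmt"

// job is a hypothetical, simplified provisioner job.
type job struct {
	ID   string
	Tags map[string]string
}

// subset reports whether every tag in want is present in have.
func subset(want, have map[string]string) bool {
	for k, v := range want {
		if have[k] != v {
			return false
		}
	}
	return true
}

// acquire returns the first pending job whose tags are all present in the
// acquirer's tags.
func acquire(pending []job, acquirerTags map[string]string) (job, bool) {
	for _, j := range pending {
		if subset(j.Tags, acquirerTags) {
			return j, true
		}
	}
	return job{}, false
}

// acquiredWith reports which job a test that inserted "job-b" ends up with.
func acquiredWith(uniqueTags bool) string {
	tagsFor := func(id string) map[string]string {
		if !uniqueTags {
			return map[string]string{} // old behavior: empty tags
		}
		return map[string]string{id: "true"} // new behavior: jobID -> "true"
	}
	// job-a belongs to another parallel test sharing the same database.
	pending := []job{
		{ID: "job-a", Tags: tagsFor("job-a")},
		{ID: "job-b", Tags: tagsFor("job-b")},
	}
	got, _ := acquire(pending, tagsFor("job-b"))
	return got.ID
}

func main() {
	fmt.Println("empty tags: ", acquiredWith(false)) // job-a is stolen
	fmt.Println("unique tags:", acquiredWith(true))  // job-b is targeted
}
```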

Closes coder/internal#1367
2026-03-04 17:47:34 +00:00
Spike Curtis 1a30ca1a2a chore: use agentsocket for task status updates in MCP server (#22354)
relates to #21335

Modifies our local MCP server used in Tasks to push task status updates over the agentsocket, rather than directly dialing Coderd. This will significantly reduce pressure on the database at scale because we can avoid expensive authentication of the agent API key.

Disclosure: I used AI to generate a lot of this PR, but hand-reviewed and tweaked it.
2026-03-04 21:41:21 +04:00
Spike Curtis 7cc2b22568 chore: expose UpdateAppStatus on agentsocket (#22353)
relates to #21335

Adds UpdateAppStatus on the agentsocket, wired up to forward to Coderd over the dRPC connection the agent maintains.

Disclosure: I used AI to generate significant portions of this PR, but hand-reviewed and tweaked the code. I consider it approximately indistinguishable from what I would have done by hand.
2026-03-04 21:18:17 +04:00
Kyle Carberry 9db39fb358 fix: remove queue message button and clear localStorage draft on submit (#22615)
Fixes two bugs in the agents chat input:

1. **Remove queue message button next to stop button** — The send button
(which showed a ListPlusIcon during streaming) is now hidden when
streaming and not editing a queued message. Messages are still queued
via Enter key; only the visual button is removed. The stop button
remains.

2. **Clear localStorage draft on submit** — The `agents.empty-input`
localStorage key is now cleared synchronously in `handleSend` before the
async `onCreateChat` call. Previously, the draft was only cleared inside
the async `handleCreateChat` after `mutateAsync` resolved, allowing
Lexical editor change events to re-persist the draft during the async
gap.
2026-03-04 16:30:37 +00:00
Danielle Maywood 474e80b646 feat(site): add completion chime for agent tasks (#22608) 2026-03-04 16:23:31 +00:00
Kyle Carberry 17d214b4a4 fix(site): resolve WS/HTTP race condition on workspace parameters page (#22556)
## Problem

Flaky e2e test: `update workspace, new required, mutable parameter
added`

```
Error: Timed out 15000ms waiting for expect(locator).toHaveValue(expected)
Locator: getByTestId('parameter-field-Sixth parameter').locator('input')
Expected string: "99"
Received string: ""
```

## Root Cause

When the workspace parameters page loads, the WebSocket sends an initial
response with template defaults. For parameters with no default (like
`sixth_parameter`), the server returns `{valid: false, value: ""}`. On
first render, `useSyncFormParameters` sees this invalid server value and
overwrites the form's correctly-autofilled value ("99" from the previous
build) with "".

## Fix

When the server value is `{valid: false}`, preserve the current form
value instead of overwriting with "". This prevents the sync hook from
clobbering autofilled values before the server has had a chance to
process them.

## Verification

- TypeScript: zero type errors
- Biome lint: clean
- Unit tests: 2/2 passing
- **E2E soak test: 849/854 passed across 854 runs (99.4% pass rate)**
  - 0 occurrences of the original flake (empty value on settings page)
  - 5 residual failures are a separate pre-existing race in
    `fillParameters` where user input is overwritten during the 500ms
    debounce window
2026-03-04 16:20:49 +00:00
Danielle Maywood 619023f5fc fix(site): disable chat input editor on archived agent chats (#22609) 2026-03-04 16:19:18 +00:00
Matt Vollmer 8a1dd518db fix(docs): reorder Coder Agents section in manifest.json (#22604) (#22614)
## Changes

- Removed the Coder Agents entry from the middle of the children array
in `docs/manifest.json`.
- Added the Coder Agents entry back at the end of the children array to
improve the organization of the documentation structure.

<img width="368" height="688" alt="image"
src="https://github.com/user-attachments/assets/3117acfd-8c8a-4522-84e7-a748a7596cc6"
/>


2026-03-04 11:12:51 -05:00
Danielle Maywood ac298a2537 perf(site): improve FilesChangedPanel rendering for large diffs (#22610) 2026-03-04 16:07:13 +00:00
Kyle Carberry ec89abd6e5 feat(chatd): use lightweight model candidates for title generation (#22605)
## Problem

Title generation uses the same model the user selected for chat. This
breaks when:

1. **Thinking/extended thinking models** — `ToolChoice: None` conflicts
with extended thinking on Anthropic. The bare call has no thinking
config, so provider-level defaults can conflict.
2. **Expensive models** — User picks `o3` or `claude-opus-4`, and a
trivial 8-word title generation burns through tokens/cost unnecessarily.
3. **Provider quirks** — Different providers have different constraints
around thinking mode + tool choice combinations.

## Solution

Modeled after how `coder/mux` handles this with
`NAME_GEN_PREFERRED_MODELS` + ordered candidate fallback:

### Phase 1: Candidate model list with fallback
- New `TitleModelFunc` type returns an ordered list of candidate models
- Tries `claude-haiku-4-5` → `gpt-4o-mini` → user's model
- Gracefully skips unavailable candidates (missing API key, provider not
configured)
- Falls back to the user's chat model as last resort
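
The ordered fallback can be sketched like this. It is an assumption-laden illustration: `pickTitleModel` and `choose` are hypothetical helpers, the availability check is reduced to a set lookup, and `o3` stands in for the user's model. Only the candidate order comes from the list above.

```go
package main

import "fmt"

// pickTitleModel returns the first usable candidate, falling back to the
// user's chat model when none of the lightweight candidates are available.
func pickTitleModel(candidates []string, userModel string, available func(string) bool) string {
	for _, c := range candidates {
		if available(c) {
			return c
		}
	}
	return userModel
}

// choose runs the selection for a deployment where only the given models
// are configured. The user's model is assumed to be "o3".
func choose(configured ...string) string {
	set := map[string]bool{}
	for _, m := range configured {
		set[m] = true
	}
	candidates := []string{"claude-haiku-4-5", "gpt-4o-mini"}
	return pickTitleModel(candidates, "o3", func(m string) bool { return set[m] })
}

func main() {
	fmt.Println(choose("claude-haiku-4-5", "gpt-4o-mini"))
	fmt.Println(choose("gpt-4o-mini"))
	fmt.Println(choose())
}
```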

### Phase 2: Provider-safe call options
- Removed `ToolChoice: None` which conflicts with extended thinking on
some providers
- Added `MaxOutputTokens: 256` to cap token usage
- Improved title prompt with verb-noun format guidance (`Fix sidebar
layout`, `Add user authentication`) and explicit
no-markdown/no-code-fences instructions

### Files changed
- `coderd/chatd/title.go` — Candidate loop, improved prompt, safe call
options
- `coderd/chatd/chatd.go` — Build `TitleModelFunc` closure with
lightweight candidates
2026-03-04 16:03:03 +00:00
Kyle Carberry f4a7fa5b95 fix(chatd): block subagents from spawning workspaces (#22603)
## Summary

Subagent (child) chats were previously given access to workspace
provisioning tools (`list_templates`, `read_template`,
`create_workspace`), which could lead to uncontrolled resource
consumption. This PR moves those tools behind the same
`!chat.ParentChatID.Valid` gate that already protects the subagent tools
(`spawn_agent`, `wait_agent`, etc.).
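
A minimal sketch of the gate, assuming a trimmed-down chat record. `chat`, `hasParent` (standing in for `chat.ParentChatID.Valid`), `toolsFor`, `has`, and the two always-on tool names are hypothetical; the gated tool names come from this description.

```go
package main

import "fmt"

// chat is a hypothetical, trimmed-down chat record.
// hasParent stands in for chat.ParentChatID.Valid.
type chat struct {
	hasParent bool
}

// toolsFor lists the tools registered for a chat. Provisioning and subagent
// tools are only registered for root chats.
func toolsFor(c chat) []string {
	tools := []string{"read_file", "execute"} // hypothetical always-on tools
	if !c.hasParent {
		tools = append(tools,
			"list_templates", "read_template", "create_workspace",
			"spawn_agent", "wait_agent",
		)
	}
	return tools
}

// has reports whether name is in tools.
func has(tools []string, name string) bool {
	for _, tool := range tools {
		if tool == name {
			return true
		}
	}
	return false
}

func main() {
	root := toolsFor(chat{})
	sub := toolsFor(chat{hasParent: true})
	fmt.Println("root can create workspaces:    ", has(root, "create_workspace"))
	fmt.Println("subagent can create workspaces:", has(sub, "create_workspace"))
}
```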

## Changes

- **`coderd/chatd/chatd.go`**: Moved `list_templates`, `read_template`,
and `create_workspace` tool registration into the root-chat-only block
alongside subagent tools.
- **`coderd/chatd/chatd_test.go`**: Added
`TestSubagentChatExcludesWorkspaceProvisioningTools` — an E2E test that
spawns a subagent via a root chat and verifies the subagent's LLM call
does not include workspace provisioning or subagent tools.
- **`coderd/chatd/chattest/openai.go`**: Added `Tools` field to
`OpenAIRequest` and supporting `OpenAITool`/`OpenAIToolFunction` types
so tests can inspect which tools are sent to the model.
2026-03-04 15:49:14 +00:00
Kyle Carberry f56563b406 fix(site): replace modal delete confirmation with inline UI in agents admin (#22587)
## Problem

The agents admin panel (`/agents` → Admin button) is rendered inside a
Radix Dialog (`ConfigureAgentsDialog`). Deleting a model or provider
previously opened a MUI `DeleteDialog` on top, creating a modal-on-modal
situation. The two dialog systems (Radix and MUI) don't coordinate focus
trapping, scroll locking, or backdrop behavior, so the delete
confirmation was broken.

## Solution

Replace the modal `DeleteDialog` in both `ModelForm` and `ProviderForm`
with an inline confirmation strip rendered in the footer area. Clicking
"Delete" now swaps the footer to show:

- A warning message ("Are you sure? This action is irreversible.")
- Cancel and a destructive confirm button with loading spinner

This keeps everything within the existing Radix Dialog content pane — no
layering issues, no second modal.

## Changes

| File | Change |
|---|---|
| `ModelForm.tsx` | Added `isDeleting` prop, changed `onDeleteModel`
signature to async, added `confirmingDelete` state, inline confirmation
footer |
| `ProviderForm.tsx` | Removed `DeleteDialog` import/usage, replaced
with inline confirmation footer |
| `ModelsSection.tsx` | Removed `DeleteDialog` import/usage, removed
`modelToDelete` state, passes new props to `ModelForm` |
2026-03-04 10:09:13 -05:00
Matt Vollmer 77c80c30c0 docs: add Coder Agents overview page (#22584)
Adds a new documentation page at `docs/ai-coder/agents.md` describing
Coder Agents — the built-in chat interface, API, and lightweight AI
coding agent that runs in the Coder control plane.

## What's included

- Overview of what Coder Agents is and who it's for (regulated
industries, platform teams, existing Coder deployments)
- How the architecture works (agent loop in coderd, outbound to LLM
providers, connects to workspaces via existing daemon connection)
- Key features: automatic template/workspace selection, sub-agents, chat
persistence, message queuing
- Security benefits of the control plane architecture (no API keys in
workspaces, simpler network boundaries, centralized enforced control,
user identity attached)
- LLM provider support table (verified against
`coderd/chatd/chatprovider/chatprovider.go`)
- Built-in tools reference
- Comparison to Coder Tasks
- Product status (internal preview, early access next)
2026-03-04 10:06:48 -05:00
Ethan e738ff5299 ci: remove dylib build pipeline (#22592)
## Summary

The macOS `.dylib` is only used by Coder Desktop macOS v0.7.2 or older.
v0.7.2 was released in August 2025. v0.8.0 of Coder Desktop macOS, also
released in August 2025, uses a signed Coder slim binary from the
deployment instead.

It's unlikely customers will be using Coder Desktop macOS v0.7.2 and the
next release of Coder simultaneously, so I think we can safely remove
this process, given it slows down CI & release processes.

## Changes

- **Makefile**: Remove `DYLIB_ARCHES`, `CODER_DYLIBS` variables and
`build/coder-dylib` target
- **scripts/build_go.sh**: Remove `--dylib` flag and all dylib-specific
logic (c-shared buildmode, CGO, plist embedding, vpn/dylib entrypoint)
- **scripts/sign_darwin.sh**: Remove dylib-specific comment
- **CI (ci.yaml)**: Remove `build-dylib` job, artifact download/insert
steps, and `build-dylib` dependency from `build` job
- **Release (release.yaml)**: Remove `build-dylib` job, artifact
download/insert steps, and `build-dylib` dependency from `release` job
- **vpn/dylib/**: Delete entire directory (`lib.go` + `info.plist.tmpl`)
- **vpn/router.go, vpn/dns.go**: Clean up comments referencing dylib

The slim and fat binary builds are completely unaffected — the dylib was
an independent build target with its own CI job.

_Generated by mux but reviewed by a human_
2026-03-05 01:50:50 +11:00
Kyle Carberry 1635b18856 fix: persist draft message in localStorage on agent detail page (#22600)
## Problem

On the `/agents/:agentId` detail page, text typed into the chat input
was lost when navigating away and returning. The empty-state page
(`/agents`) already persisted drafts via `localStorage`, but individual
conversation pages did not.

## Solution

Adds per-conversation draft persistence to `useConversationEditingState`
in `AgentDetail.tsx`, following the same patterns used elsewhere in the
agents page:

- Drafts are stored under `agents.draft-input.<chatID>` keys
- The saved draft is read as the editor's initial value on mount
- `localStorage` is updated on every content change
- The key is removed when the input is cleared or a message is sent
successfully
2026-03-04 14:42:13 +00:00
Kacper Sawicki 52a42af1ca chore(deps): bump clistat to v1.2.1 (#22599)
Bumps `github.com/coder/clistat` from v1.2.0 to v1.2.1.
2026-03-04 15:29:00 +01:00
Danielle Maywood 90f686d684 feat(agents): add unarchive agent support (#22579) 2026-03-04 14:08:12 +00:00
Sas Swart 8c09df52f9 fix(coderd): use WaitSuperLong in TestReinitializeAgent (#22593)
Fixes coder/internal#642

We recently fixed Windows-specific flakes for this test and re-enabled
it. It then failed intermittently due to context deadline expiration.
The temporary path created on Windows contained invalid characters. This
resulted in a silent startup script failure on Windows. The test then
fruitlessly waited until context expiration. The test now uses a valid
path on Windows.
2026-03-04 15:22:43 +02:00
Kyle Carberry 012a0497ce fix(agents): remove optimistic message rendering and fix auto-promote delivery (#22588)
## Problem

Two bugs in the agents chat flow:

1. **Optimistic rendering glitch**: When sending a message while the
agent is busy, a fake message with a negative ID appears in the
timeline, then gets rolled back to the queued state. This causes a
jarring flash.

2. **Auto-promoted messages not appearing**: When the server
auto-promotes a queued message after finishing a task, the promoted user
message doesn't show up in the timeline until the LLM finishes its
response.

## Root Causes

**Bug 1**: The optimistic rendering system injected placeholder messages
with `id: -Date.now()` into the store. When the server responded with
`queued: true`, the optimistic message was rolled back — but the user
had already seen it flash in the timeline.

**Bug 2**: In `processChat`'s deferred cleanup, the auto-promoted
message was published via `publishEvent()`, which only delivers to local
in-process stream subscribers. The SSE subscriber goroutine only
forwards `message_part` events from the local channel — it ignores
`message` events. Durable events reach the SSE client via pubsub → DB
read, but `publishEvent` doesn't trigger a pubsub notification. The
explicit `PromoteQueued` endpoint correctly used `publishMessage()`
(which does both), but the auto-promote path did not.

## Changes

### Frontend (`site/`)
- **AgentDetail.tsx**: Remove optimistic message injection from send and
edit flows. Instead, use the `CreateChatMessageResponse.message` from
the POST response to insert the real server message into the store
immediately.
- **ChatContext.ts**: Remove the negative-ID cleanup logic from
`upsertDurableMessage` that stripped optimistic placeholders when real
messages arrived.
- **chatStore.test.ts**: Remove 2 tests for negative-ID optimistic
message behavior.

### Backend (`coderd/chatd/`)
- **chatd.go**: In `processChat` cleanup, replace `publishEvent()` with
`publishMessage()` for auto-promoted messages. This ensures the pubsub
notification (`AfterMessageID`) is sent, so SSE subscribers read the new
message from the DB immediately.
2026-03-04 07:49:39 -05:00
Danielle Maywood f28f56d02c test(coderd/rbac): parallelize TestRolePermissions subtests (#22259) 2026-03-04 12:47:39 +00:00
Jeremy Ruppel f07fdce20a flake: add page event logging to e2e tests (#22569)
I'm having a hard time reproducing [this
Heisenbug](https://github.com/coder/internal/issues/1154) in PR CI, but
it seems to happen pretty often on main, so I would like to add some
logging for a few more page events to the ones we already have.
2026-03-04 07:39:20 -05:00
Danielle Maywood a0b3a32cd3 fix(site): refactor agents sidebar timestamp/action cell (#22595) 2026-03-04 11:36:54 +00:00
Sas Swart cfcb81fb0f fix: user status change chart accommodates DST (#22191)
closes https://github.com/coder/internal/issues/464

# Summary

This PR resolves a flaky test that was sensitive to DST transitions in
various time zones. The root of the flake was:
* a bug: the query and its tests assumed that every day has 24 hours
* the tests used local system time, which caused failures on dates
close to DST transitions

# Changes

Query:

The original query assumed 24 hour intervals between each day, which is
not a valid assumption. It now increments `1 day` at a time.

Database tests:

Database level tests for the query all assumed 24 hour days. They now
increment in DST-aware days instead. Instead of using time.Now() as a
base for testing, the test uses a series of dates over the course of an
entire year, to ensure that DST transition dates are present in every
test run.
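
The difference between adding 24 hours and adding one calendar day can be shown in Go. This is only an analogy for the fix, since the query itself changes a SQL interval; `nextDayHours` is a hypothetical helper, and `America/New_York` on 2025-03-09 is picked as an example transition date.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works anywhere
)

// nextDayHours returns the local hour reached from midnight on a DST
// transition date by adding 24 hours versus adding one calendar day.
func nextDayHours() (byDuration, byCalendar int, err error) {
	loc, err := time.LoadLocation("America/New_York")
	if err != nil {
		return 0, 0, err
	}
	// Clocks in New York sprang forward on 2025-03-09, a 23-hour day.
	start := time.Date(2025, time.March, 9, 0, 0, 0, 0, loc)
	byDuration = start.Add(24 * time.Hour).Hour()
	byCalendar = start.AddDate(0, 0, 1).Hour()
	return byDuration, byCalendar, nil
}

func main() {
	byDuration, byCalendar, err := nextDayHours()
	if err != nil {
		panic(err)
	}
	fmt.Println("+24h lands at hour:  ", byDuration) // 1, drifted past midnight
	fmt.Println("+1 day lands at hour:", byCalendar) // 0, still midnight
}
```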

API endpoint:

The endpoint that delivers the user status chart now accepts an IANA
timezone name as a parameter and passes it to the database query,
keeping the existing offset as a fallback.

API level tests were added to ensure the correct response form and error
behaviour. Correctness of content is tested at the database level.
2026-03-04 12:54:39 +02:00
Danielle Maywood 2882e36222 fix(site): move chat input outside flex-col-reverse scroller (#22585) 2026-03-04 01:04:04 +00:00
Mathias Fredriksson 13411c8a8a docs: add task lifecycle and agent compatibility pages (#22222)
Closes coder/internal#1359
Closes coder/internal#1329
2026-03-04 02:39:48 +02:00
Kyle Carberry 47199ab475 refactor(site): replace bespoke chat model provider UI with schema-driven rendering (#22577)
## Summary

Replace hand-coded per-provider field components, form state types,
validation schemas, and builder functions with generic schema-driven
code that reads from the auto-generated
`chatModelOptionsGenerated.json`.

## Changes

### `ModelConfigFields.tsx` (492 → 341 lines)
- Remove 6 per-provider components (`OpenAIFields`, `AnthropicFields`,
`GoogleFields`, `OpenAICompatFields`, `OpenRouterFields`,
`VercelFields`)
- Remove exported option arrays (`modelConfigReasoningEffortOptions`,
etc.)
- Add `renderSchemaField()` that dispatches to
`InputField`/`SelectField`/`JSONField` based on `field.input_type` from
the generated schema
- `ModelConfigFields` now calls `getVisibleProviderFields()` instead of
a switch statement
- `GeneralModelConfigFields` now calls `getVisibleGeneralFields()`
instead of hard-coding 6 InputField instances

### `modelConfigFormLogic.ts` (742 → 525 lines)
- Remove 6 per-provider form state types and empty defaults
- Remove 6 per-provider Yup validation schemas
- Remove 6 per-provider builder functions (`buildOpenAIOptions`, etc.)
- Remove 2 switch-case dispatch blocks (validation + build)
- Add `buildEmptyProviderState()` that walks schema fields to create
empty form state
- Add schema-driven `extractModelConfigFormState()` and
`buildModelConfigFromForm()`
- Add `yupTestForField()` + `buildYupSchema()` generating Yup validation
from field metadata
- Lazy-cache per-provider Yup schemas for performance

### `modelConfigFormLogic.test.ts`
- All 83 tests updated for the new nested state shape
- Uses `toContain` for error message assertions since labels now come
from schema descriptions

## Motivation

The auto-generated schema (`chatModelOptionsGenerated.json`) was merged
in #22568 but not yet consumed by the UI. This PR wires it up so that
when a new provider or field is added in Go (`codersdk/chats.go`),
running `make gen` regenerates the JSON schema and the UI automatically
picks up the new fields — no manual TypeScript changes needed.

**Production code reduced from 1234 to 866 lines (-30%).**
2026-03-03 17:52:35 -05:00
Kyle Carberry 4ee5306eca fix(site): request notification permission before push subscription (#22576)
## Problem

The subscribe flow in `useWebpushNotifications` called
`pushManager.subscribe()` without first requesting the `Notification`
permission. When the browser permission state is `"denied"` (e.g. from a
previous prompt dismissal), the browser throws:

```
DOMException: Registration failed - permission denied
```

This surfaced as a confusing error toast on the agents page. The error
has nothing to do with Coder RBAC roles — it's the browser denying the
push subscription because notification permission was previously
declined. An admin who had granted browser permission wouldn't see this;
a user who previously dismissed or denied the prompt would.

## Fix

Added an explicit `Notification.requestPermission()` call before
`pushManager.subscribe()`. This:

1. **Re-prompts** the user if the permission state is `"default"` (not
yet decided)
2. **Throws a clear, actionable error** if the permission is `"denied"`:
*"Notifications are blocked by your browser. Please allow notifications
for this site in your browser settings."*
3. **Only proceeds** to `pushManager.subscribe()` after permission is
confirmed as `"granted"`

## Tests

New test file `useWebpushNotifications.jest.ts`:
- **requests notification permission before subscribing** — verifies
`requestPermission()` is called before `pushManager.subscribe()`
- **throws a clear error when permission is denied** — verifies the
user-friendly error message
- **does not call pushManager.subscribe when permission is denied** —
verifies we bail out early
2026-03-03 17:13:31 -05:00
Matt Vollmer 39bde165b8 fix(site): open View Workspace link in new window on agents page (#22578)
On the `/agents` page, the "View Workspace" link in the header dropdown
menu was navigating in the same tab via `navigate()`. This changes it to
`window.open(workspaceRoute, "_blank")` so it opens in a new browser
window/tab instead.

It's frustrating when I want to view my workspace and then I have to go
back and find my chat.
2026-03-03 17:10:11 -05:00
Kyle Carberry f758443f44 feat(codersdk): generate chat model provider options schema from Go structs (#22568) 2026-03-03 21:29:58 +00:00
Kyle Carberry 5b1cf4a6a3 fix(chatd): start stream buffering before publishing running status (#22571)
## Problem

There is a race condition in the chat stream reconnect path. When a
client connects (or reconnects) to `/stream`, sometimes they only see a
`status: running` event but never receive any `message_part` events —
the stream appears stuck.

## Root Cause

In `processChat`, the sequence is:

1. `publishStatus(running)` — broadcasts `status: running` to all
subscribers and via pubsub.
2. `runChat()` is called.
3. Inside `runChat`, there's significant setup work (model resolution,
DB queries, title generation, prompt building, instruction resolution).
4. Only **after** all that setup does `runChat` set `buffering = true`
on the stream state.

If a client connects to `/stream` between steps 1 and 4:
- `Subscribe()` reads `chat.Status == running` from the DB, so it
includes `status: running` in the snapshot.
- But `buffering` is still `false`, so `subscribeToStream` returns an
**empty** local snapshot (no message_parts).
- `publishToStream` **drops** all `message_part` events when `buffering`
is false.
- Result: client sees `running` but never gets any streaming content.

## Fix

Move the `buffering = true` setup (and its deferred cleanup) from
`runChat` into `processChat`, right before `publishStatus(running)`.
This guarantees the buffer is active before any subscriber can observe
`status: running`, so:
- The snapshot always includes any in-flight `message_part` events.
- `publishToStream` never drops parts because buffering is already on.
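
The ordering issue can be modeled in a few lines. This is a deliberately simplified, single-threaded sketch, not the real stream state: `stream`, `publishPart`, `snapshotParts`, and `partsInLateSnapshot` are hypothetical, and the "window" is collapsed into sequential statements.

```go
package main

import "fmt"

// stream is a hypothetical model of the per-chat stream state.
type stream struct {
	buffering bool
	buffer    []string
	status    string
}

// publishPart keeps a message part only while buffering is enabled.
func (s *stream) publishPart(part string) {
	if !s.buffering {
		return // dropped, as described for publishToStream
	}
	s.buffer = append(s.buffer, part)
}

// snapshotParts is what a newly connected subscriber would receive.
func (s *stream) snapshotParts() []string {
	if !s.buffering {
		return nil
	}
	return append([]string(nil), s.buffer...)
}

// partsInLateSnapshot simulates a client that connects right after
// "running" becomes visible and returns how many parts its snapshot holds.
func partsInLateSnapshot(bufferBeforeStatus bool) int {
	s := &stream{}
	if bufferBeforeStatus {
		s.buffering = true // new order: buffer first
	}
	s.status = "running"   // subscribers can observe "running" from here on
	s.publishPart("hello") // a part emitted while setup is still in progress
	return len(s.snapshotParts())
}

func main() {
	fmt.Println("status first:", partsInLateSnapshot(false))
	fmt.Println("buffer first:", partsInLateSnapshot(true))
}
```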
2026-03-03 21:27:59 +00:00
Danielle Maywood f98761ff67 refactor(site): use button instead of div role="button" (#22575) 2026-03-03 21:26:01 +00:00
Steven Masley f6b4b7edab ci: remove sqlc push to cloud (#22574)
I left a vestigial piece, whooops
2026-03-03 14:49:00 -06:00
Danielle Maywood d2d956edb1 fix: add archived query parameter to chat list endpoint (#22562)
Despite the SDK type having an `Archived` field for chats, this data was
never fetched from the database — the `GetChatsByOwnerID` query
hardcoded `AND archived = false`, and the `convertChat` function never
mapped the field.

This PR adds an optional `archived` query parameter to `GET
/api/experimental/chats`:

| Value | Behavior |
|-------|----------|
| *(not provided)* | Returns all chats (active and archived) |
| `archived=false` | Returns only non-archived chats |
| `archived=true` | Returns only archived chats |

This follows the same pattern used by template versions
(`sqlc.narg('archived')` nullable boolean).
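
The three-way behavior in the table maps naturally onto a nullable boolean. The sketch below filters an in-memory slice instead of running a query; `parseArchived`, `filterChats`, `titlesFor`, and the sample chats are hypothetical, and treating an empty value as "not provided" is an assumption.

```go
package main

import (
	"database/sql"
	"fmt"
	"net/url"
	"strconv"
)

// chat is a hypothetical, trimmed-down chat row.
type chat struct {
	Title    string
	Archived bool
}

// parseArchived maps the optional query parameter onto a nullable boolean.
func parseArchived(q url.Values) (sql.NullBool, error) {
	raw := q.Get("archived")
	if raw == "" {
		return sql.NullBool{}, nil // not provided: no filter
	}
	b, err := strconv.ParseBool(raw)
	if err != nil {
		return sql.NullBool{}, fmt.Errorf("invalid archived value %q: %w", raw, err)
	}
	return sql.NullBool{Bool: b, Valid: true}, nil
}

// filterChats applies the filter only when a value was provided.
func filterChats(chats []chat, archived sql.NullBool) []chat {
	var out []chat
	for _, c := range chats {
		if !archived.Valid || c.Archived == archived.Bool {
			out = append(out, c)
		}
	}
	return out
}

// titlesFor parses a raw query string and lists the matching chat titles.
func titlesFor(rawQuery string) ([]string, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return nil, err
	}
	archived, err := parseArchived(q)
	if err != nil {
		return nil, err
	}
	chats := []chat{{Title: "active", Archived: false}, {Title: "old", Archived: true}}
	var titles []string
	for _, c := range filterChats(chats, archived) {
		titles = append(titles, c.Title)
	}
	return titles, nil
}

func main() {
	for _, q := range []string{"", "archived=false", "archived=true"} {
		titles, err := titlesFor(q)
		fmt.Printf("%q -> %v (err: %v)\n", q, titles, err)
	}
}
```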

Also fixes `convertChat` to populate the `Archived` field in API
responses, which was never being set despite existing on the SDK type.
2026-03-03 20:39:19 +00:00
Kyle Carberry 8a2635285b fix(site): remove stale ArchivedAgentsSearchAutoExpands story (#22573)
The search input was removed from `AgentsSidebar` but the
`ArchivedAgentsSearchAutoExpands` story still referenced the `Search
agents...` placeholder, causing the Storybook interaction test to fail:

```
within(<div#storybook-root>).getByPlaceholderText("Search agents...")
Unable to find an element with the placeholder text of: Search agents...
```

This PR removes the stale story.
2026-03-03 20:28:04 +00:00
Steven Masley 8ea0c2f3bc ci: remove ci action to push schema to sqlc cloud (#22572)
SQLc cloud no longer exists
2026-03-03 14:21:07 -06:00
Kyle Carberry 810b509290 feat: refactor the agents admin UI layout (#22567)
I am working on a subsequent change to make the fields auto-generated
with `make gen` from the Go code itself, rather than us needing to
create a UI compatibility layer.

Once the above is done, I'll be adding in the payload so users can very
easily just click "Opus 4.6" to add the model, and the config values
will be set appropriately.

This is really just UI changes, nothing functionally should change here.
But the code will be cleaned up a lot post the above changes.

<img width="1197" height="978" alt="image"
src="https://github.com/user-attachments/assets/45f9afff-89bb-47a6-b9a1-534f50a9676e"
/>
<img width="1180" height="949" alt="image"
src="https://github.com/user-attachments/assets/b3fd963f-1c1d-4d2c-b501-ac8118b019ec"
/>
<img width="1185" height="957" alt="image"
src="https://github.com/user-attachments/assets/08faca29-2b38-476a-adab-0bd8ab17ddcc"
/>
2026-03-03 15:19:07 -05:00
Jeremy Ruppel b73f21662b flake: verify parameters in parallel in e2e tests (#22557)
This is an attempt to address coder/internal#1154

Tests appear to fail often on `verifyParameters`, which asserts input
visibility and value in series for all expected parameters. This change
makes the same assertions in parallel, hopefully completing before
timeout.
2026-03-03 14:56:41 -05:00
Danielle Maywood 6acdd6ca7d fix: wire agents-tab-visible metadata to experiments flag (#22553) 2026-03-03 13:51:10 -06:00
Zach 5b7377c375 feat: add Prometheus metrics for boundary log drop reporting (#22521)
Add Prometheus metrics to the boundary log proxy for observability:
- batches_dropped_total (reason: buffer_full, forward_failed)
- logs_dropped_total (reason: buffer_full, forward_failed,
  boundary_channel_full, boundary_batch_full)
- batches_forwarded_total

Also add BoundaryStatus to the BoundaryMessage envelope so boundary
can report dropped log counts as a separate wire message. The agent
records these as Prometheus metrics, making boundary-side data loss
visible. Backwards compatibility for older versions of boundary is maintained.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 12:42:34 -07:00
Danny Kopping 9b5573d7fa feat: store tool call IDs to determine interception lineage (#22246)
Adds database columns and server-side logic to track interception lineage via tool call IDs. When an interception ends, the server resolves the correlating tool call ID to find the parent interception and links them via `parent_id`.

New `provider_tool_call_id` column on `aibridge_tool_usages` and `parent_id` column on `aibridge_interceptions`, with indexes for lookup. `findParentInterceptionID` queries by tool call ID and filters out the current interception to find the parent.
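
As an in-memory illustration of the lookup, assuming each tool usage row records the interception that issued it: `toolUsage`, `findParent`, `parentOf`, and the sample IDs are hypothetical, and the real implementation queries the database rather than scanning a slice.

```go
package main

import "fmt"

// toolUsage is a hypothetical in-memory row: a tool call recorded against
// the interception that issued it.
type toolUsage struct {
	InterceptionID     string
	ProviderToolCallID string
}

// findParent returns the interception that issued toolCallID, ignoring the
// current interception itself.
func findParent(usages []toolUsage, toolCallID, currentID string) (string, bool) {
	for _, u := range usages {
		if u.ProviderToolCallID == toolCallID && u.InterceptionID != currentID {
			return u.InterceptionID, true
		}
	}
	return "", false
}

// parentOf resolves the parent for a sample interception that correlates
// with the given tool call ID. An empty result means no parent was found.
func parentOf(currentID, toolCallID string) string {
	usages := []toolUsage{
		{InterceptionID: "child", ProviderToolCallID: "call-1"},
		{InterceptionID: "root", ProviderToolCallID: "call-1"},
	}
	parent, _ := findParent(usages, toolCallID, currentID)
	return parent
}

func main() {
	fmt.Println(parentOf("child", "call-1")) // the current interception is skipped
	fmt.Println(parentOf("child", "call-9") == "")
}
```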

Adapted from the [coder/coder `dk/prompt_provenance_poc`](https://github.com/coder/coder/compare/main...dk/prompt_provenance_poc) branch.
Depends on [coder/aibridge#188](https://github.com/coder/aibridge/pull/188).  
  
Closes https://github.com/coder/internal/issues/1334
2026-03-03 21:07:01 +02:00