Commit Graph

20 Commits

Author SHA1 Message Date
Kyle Carberry c9ed1e17fc feat(agents): add desktop notifications via VAPID web push (#22454)
## Summary

Wire VAPID web push notifications into the Agents (chat) system so users
get desktop notifications when an agent finishes running.

### Backend

- Add `webpush.Dispatcher` to `chatd.Server` and pass it through from
`coderd.Options.WebPushDispatcher`
- In `processChat()`'s deferred cleanup, dispatch a web push
notification when the chat reaches a terminal state:
  - **`waiting`** (success): "Agent has finished running."
- **`error`** (failure): the error message, or "Agent encountered an
error."
- Sub-agent chats (`ParentChatID.Valid`) are skipped to avoid
notification spam from internal delegation
- Gracefully no-ops when the dispatcher is nil (web push disabled)

### Frontend

- New `WebPushButton` component — a bell icon that uses the existing
`useWebpushNotifications` hook
  - Returns `null` when the `web-push` experiment is off
- Three states: loading spinner, green bell (subscribed), muted bell-off
(unsubscribed)
  - Tooltip + toast feedback on toggle
- Added to both the Agents page empty state top bar and the AgentDetail
top bar
- The Agents page has its own layout (no standard Navbar), so it needs
its own subscribe button

### End-to-end flow

1. User clicks the bell icon on `/agents` → browser subscribes via VAPID
2. User starts an agent chat → chat enters `running` status
3. Agent finishes → `processChat` defer sets status to `waiting`/`error`
→ dispatches web push
4. Browser service worker shows a desktop notification with the chat
title and status

---------

Co-authored-by: Coder <coder@users.noreply.github.com>
2026-02-28 23:40:17 -05:00
Kyle Carberry 533b90a3a4 fix: resolve chat title update race conditions and improve resilience (#22450)
## Problem

Chat titles sometimes don't update in the UI. The generated AI title
gets stuck as the fallback (first 6 words of the message) even though
the backend successfully generates a proper title.

## Root Causes

### 1. Cancelable context used during cleanup DB read (P0)
In `processChat`, the deferred cleanup re-reads the chat from the DB to
pick up the AI-generated title for the `status_change` pubsub event. But
it used the cancelable `ctx` instead of `cleanupCtx`:

```go
// Before — ctx may already be canceled here
if freshChat, readErr := p.db.GetChatByID(ctx, chat.ID); readErr == nil {
```

When the context is canceled, the DB read fails silently and the
`status_change` event carries the stale fallback title.

### 2. Title goroutine not tracked by inflight WaitGroup (P2)
The `maybeGenerateChatTitle` goroutine was fire-and-forget — not tracked
by `p.inflight`. During graceful shutdown, the server could exit before
the goroutine completes its DB write or pubsub publish.

### 3. No recovery when watchChats() WebSocket misses events
The frontend relies entirely on the `watchChats()` SSE connection for
title updates. If the connection drops or misses events, titles never
recover — the only fix was a full page reload.

## Fixes

1. **Use `cleanupCtx`** for the `GetChatByID` call and logger in the
deferred cleanup block.
2. **Track the title goroutine** with `p.inflight.Add(1)` / `defer
p.inflight.Done()` so shutdown waits for it.
3. **Invalidate chats query** on WebSocket open/close/error events so
missed updates are recovered via refetch. Also enable
`refetchOnWindowFocus` for the chats query.

Co-authored-by: Coder <coder@users.noreply.github.com>
2026-02-28 21:38:16 -05:00
Kyle Carberry 1c71fd69f6 fix: workspace auto-refresh during the chat flow (#22447) 2026-02-28 19:07:17 -05:00
Kyle Carberry 2abe55549c fix: return in-flight chats to pending on server shutdown (#22443)
When a chatd server shuts down (`Close()`), the server context is
canceled. Previously, in-flight chats would be marked as `error` because
the `context.Canceled` error was not distinguished from actual
processing failures.

This adds `isShutdownCancellation()` to detect when the error is caused
by the server context being canceled (as opposed to a chat-specific
cancellation like `ErrInterrupted`). When detected, the chat status is
set to `pending` with no `last_error`, allowing another replica to pick
it up and retry.

Extracted from #22440 — only the context cancellation bug fix, no
chattest changes.
2026-02-28 17:14:11 -05:00
Kyle Carberry 22d4539a7a fix(chatd): clear stream buffer after each step is persisted (#22445)
The in-memory stream buffer accumulated message-part events for the
entire duration of a chat run. Late-joining subscribers received all
buffered parts even though the backing messages had already been
committed to the database, wasting memory and potentially duplicating
content.

Clear the buffer at the end of each `persistStep` call so that only
in-flight (uncommitted) parts remain in the buffer.
2026-02-28 16:51:04 -05:00
Kyle Carberry 34d9392e37 chore(db): remove workspace_agent_id from chats table (#22442)
## Summary

Remove the `workspace_agent_id` column from the `chats` table and
dynamically look up the first workspace agent instead.

## Problem

When a workspace is stopped and restarted, the workspace agent gets a
new ID. The `workspace_agent_id` stored on the chat at creation time
becomes stale, making the agent unreachable. This caused chats to break
after workspace restarts.

## Solution

Instead of persisting the agent ID, dynamically look up the first agent
from the workspace's latest build via
`GetWorkspaceAgentsInLatestBuildByWorkspaceID` whenever an agent
connection is needed. The `workspace_id` on the chat remains stable
across restarts.

This behavior may be refined later (e.g., agent selection heuristics),
but picking the first agent resolves the immediate breakage.

## Changes

- **Migration 000425**: Drop `workspace_agent_id` column from `chats`
- **SQL queries**: Remove `workspace_agent_id` from `InsertChat` and
`UpdateChatWorkspace`
- **chatd.go**: `getWorkspaceConn` and `resolveInstructions` now look up
agents dynamically from workspace ID
- **chatd.go**: Remove `refreshChatWorkspaceSnapshot` (no longer needed)
- **createworkspace.go**: Stop persisting agent ID when associating
workspace with chat
- **subagent.go**: Stop passing agent ID to child chats
- **SDK/frontend**: Remove `WorkspaceAgentID` / `workspace_agent_id`
from Chat type

---------

Co-authored-by: Kyle Carberry <kylecarbs@gmail.com>
2026-02-28 16:46:51 -05:00
Kyle Carberry c316d0a3e7 fix(chatd): improve subagent tool descriptions and strip tools from child agents (#22441)
Two changes:

1. **Gate subagent tools behind `!chat.ParentChatID.Valid`** so child
agents never receive `spawn_agent`, `wait_agent`, `message_agent`, or
`close_agent`. Previously all 4 tools were given to every chat.
`spawn_agent` would fail at runtime ("delegated chats cannot create
child subagents") but the other 3 had no guard at all — meaning a child
could theoretically operate on sibling chats. Removing the tools
entirely is cleaner and saves context window.

2. **Rewrite tool descriptions to explain *when* to use them**, not just
what they do. `spawn_agent` now says to use it for clearly scoped,
independent, self-contained tasks (e.g. fixing a specific bug, writing a
single module, running a migration) and explicitly says *not* to use it
for simple operations you can handle with
`execute`/`read_file`/`write_file`. It also states that child agents
cannot spawn their own subagents. The other 3 tools get similar
guidance-oriented descriptions.

Co-authored-by: Coder <coder@users.noreply.github.com>
2026-02-28 16:30:25 -05:00
Kyle Carberry c5619746d1 fix(chat): fix stream state discrepancies between frontend and backend (#22437)
## Summary

Fixes four frontend↔backend discrepancies in chat stream state
management that could cause duplicate content, UI flicker, and stale
stream state.

### Backend fixes (`coderd/chatd/chatd.go`)

**1. No-pubsub path double-replayed message_part events**

`Subscribe()` built an `initialSnapshot` containing `message_part`
events from `localSnapshot`, then the no-pubsub goroutine replayed the
same `localSnapshot` into the `mergedEvents` channel. Since `streamChat`
sends the snapshot first then reads the channel, the frontend received
every `message_part` twice. `applyMessagePartToStreamState` doesn't
deduplicate — text gets concatenated, so content appeared doubled.

Fix: Only forward live `localParts` in the no-pubsub goroutine; the
snapshot already contains the historical events.

**2. Snapshot missing status event**

The initial snapshot never included a `status` event. The frontend's
`shouldApplyMessagePart()` gates on status (`pending`/`waiting`), but
the initial status came from a separate REST query via `useEffect`.
During the race window between snapshot arrival and REST resolution,
`message_part` events could be incorrectly accepted or rejected.

Fix: Prepend a `status` event to the snapshot after loading the chat
from DB, so the frontend has the authoritative status from the very
first batch.

### Frontend fixes (`ChatContext.ts`)

**3. Scheduled stream reset not canceled by subsequent message_parts**

When a `message` event arrived, `scheduleStreamReset()` queued
`clearStreamState` via `requestAnimationFrame`. If new `message_part`
events arrived in the next WebSocket frame before the rAF fired, they
were pushed to `pendingMessageParts` without canceling the scheduled
reset. The rAF would fire between frames, clearing stream state, then
the next flush would re-populate it — causing a visible flash.

Fix: Call `cancelScheduledStreamReset()` when accumulating
`message_part` events.

**4. startTransition race with synchronous clearStreamState**

`flushMessageParts` wrapped `applyMessageParts` in `startTransition`,
which React can defer. If a `status: "waiting"` event arrived in the
same batch after `message_part` events, the status handler cleared
stream state synchronously, but the deferred `applyMessageParts`
callback could fire afterward and re-populate it.

Fix: Re-check `shouldApplyMessagePart()` inside the `startTransition`
callback at execution time.

### Tests added

- **Go**: `TestSubscribeSnapshotIncludesStatusEvent` — asserts the first
snapshot event is a status event
- **Go**: `TestSubscribeNoPubsubNoDuplicateMessageParts` — asserts the
events channel doesn't replay snapshot events
- **TS**: `cancels scheduled stream reset when message_part arrives
after message` — verifies stream state survives a [message,
message_part] batch
- **TS**: `does not apply message parts after status changes to waiting`
— verifies deferred applyMessageParts respects status transitions
2026-02-28 13:35:23 -05:00
Kyle Carberry a621c3cb13 feat(agent): add process execution API and rewrite execute tool (#22416)
## Summary

Adds a new agent-side process management HTTP API and rewrites the chat
execute tool to use it instead of SSH sessions.

## What changed

### New agent/agentproc/ package

- **headtail.go** — Thread-safe io.Writer with bounded memory (16KB head
+ 16KB tail ring buffer). Provides LLM-ready output with truncation
metadata and long-line truncation at 2048 bytes.
- **headtail_test.go** — 16 tests including race detector coverage for
concurrent writes.
- **process.go** — Manager + Process types for lifecycle management
using agentexec.Execer for proper OOM/nice scores.
- **api.go** — HTTP API following the agentfiles chi router pattern. 4
endpoints: start, list, output, signal.

### Agent wiring (agent/agent.go, agent/api.go)

Mounts the process API at /api/v0/processes, mirroring how agentfiles is
mounted.

### SDK (codersdk/workspacesdk/agentconn.go)

4 new AgentConn interface methods + 7 request/response types:
- StartProcess, ListProcesses, ProcessOutput, SignalProcess

### Execute tool rewrite (coderd/chatd/chattool/execute.go)

- SSH to Agent API: conn.StartProcess() + conn.ProcessOutput() polling
- New parameters: workdir, run_in_background
- Structured response: success, exit_code, wall_duration_ms, error,
truncated, note, background_process_id
- Non-interactive env vars: GIT_EDITOR=true, TERM=dumb, NO_COLOR=1,
PAGER=cat, etc.
- Output truncation: HeadTailBuffer caps at 32KB for LLM consumption
- File-dump detection with advisory notes suggesting read_file
- Default timeout: 60s to 10s
- Foreground polling: 200ms intervals until exit or timeout

## Architecture

State lives on the agent, surviving coderd failover and instance
changes. Any coderd replica can query any agent via HTTP over tailnet.
2026-02-28 12:33:52 -05:00
Kyle Carberry 0ad2f9ecd7 feat(chatd): persist last_error on chats table (#22436)
Adds a nullable `last_error` column to the `chats` table so error
reasons survive page reloads.

**Backend:**
- Migration adds `last_error TEXT` (nullable) to chats
- `UpdateChatStatus` writes the error reason when status transitions to
`error`, clears it (NULL) on recovery
- `convertChat` maps `sql.NullString` to `*string` in the SDK

**Frontend:**
- Sidebar falls back to `chat.last_error` when no stream error reason is
cached
- Chat detail page does the same for `persistedErrorReason`
- Fixtures updated for new required field
2026-02-28 12:27:26 -05:00
Kyle Carberry 2bdacae5f5 feat(chatd): add LLM stream retry with exponential backoff (#22418)
## Summary

Adds automatic retry with exponential backoff for transient LLM errors
during chat streaming and title generation. Inspired by
[coder/mux](https://github.com/coder/mux)'s retry mechanism.

## Key Behaviors

- **Infinite retries** with exponential backoff: 1s → 2s → 4s → ... →
60s cap
- **Deterministic delays** (no jitter)
- **Error classification**: retryable (429, 5xx, overloaded, rate limit,
network errors) vs non-retryable (auth, quota, context exceeded, model
not found, canceled)
- **Retry status published to SSE stream** so frontend can show
"Retrying in Xs..." UI
- **Title generation** retries silently (best-effort, nil onRetry
callback)

## New Package: `coderd/chatd/chatretry/`

| File | Purpose |
|------|---------|
| `classify.go` | `IsRetryable(err)` and `StatusCodeRetryable(code)` |
| `backoff.go` | `Delay(attempt)` — exponential doubling with 60s cap |
| `retry.go` | `Retry(ctx, fn, onRetry)` — infinite loop with
context-aware timer |

## Test Helpers: `coderd/chatd/chattest/errors.go`

Anthropic and OpenAI error response builders for use in chattest
providers:
- `AnthropicErrorResponse()`, `AnthropicOverloadedResponse()`,
`AnthropicRateLimitResponse()`
- `OpenAIErrorResponse()`, `OpenAIRateLimitResponse()`,
`OpenAIServerErrorResponse()`

## SDK Changes: `codersdk/chats.go`

- New `ChatStreamEventType: "retry"`
- New `ChatStreamRetry` struct with `Attempt`, `DelayMs`, `Error`,
`RetryingAt` fields
- TypeScript types auto-generated

## Changed Files

- `coderd/chatd/chatloop/chatloop.go` — wraps `agent.Stream()` in
`chatretry.Retry()`
- `coderd/chatd/chatd.go` — publishes retry events to SSE stream with
logging
- `coderd/chatd/title.go` — wraps `model.Generate()` in silent retry
- `coderd/chatd/chattest/anthropic.go` / `openai.go` — error injection
support

## Tests

42 tests covering classification (33), backoff (9), and retry scenarios
(8).
2026-02-27 18:34:33 -05:00
Kyle Carberry 4b5ec8a9a4 feat: add diff_status_change event to /chats/watch pubsub stream (#22419)
## Summary

Adds a new `diff_status_change` event kind to the `/chats/watch` pubsub
stream so the sidebar can update diff status (PR created, files changed,
branch info) without a full page reload.

### Problem

When a chat's diff status changes (e.g. PR created via GitHub, git
branch pushed), the sidebar didn't update because:
1. The backend `publishChatPubsubEvent` didn't include diff status data
2. The frontend watch handler only merged `status`, `title`, and
`updated_at` from events

### Solution

A **notify-only** approach: a new `ChatEventKindDiffStatusChange` event
kind tells the frontend "diff status changed for chat X" — the frontend
then invalidates the relevant React Query cache entries to re-fetch.

### Backend changes

- **`coderd/pubsub/chatevent.go`**: New `ChatEventKindDiffStatusChange =
"diff_status_change"` constant
- **`coderd/chatd/chatd.go`**: New `PublishDiffStatusChange(ctx,
chatID)` method on `Server`
- **`coderd/chats.go`**: New `publishChatDiffStatusEvent` helper.
Published from:
- `refreshWorkspaceChatDiffStatuses` — after each chat's diff status is
refreshed via GitHub API
- `storeChatGitRef` — after persisting git branch/origin info from
workspace agent

### Frontend changes

- **`AgentsPage.tsx`**: Handle `diff_status_change` event by
invalidating `chatDiffStatusKey` and `chatDiffContentsKey` queries
- **`ChatContext.ts`**: Remove redundant diff status invalidation that
fired on every chat status change (the new event kind handles this
properly)
2026-02-27 18:06:54 -05:00
Kyle Carberry 12083441e0 feat(chats): archive chats instead of hard-deleting them (#22406)
## Summary

The UI has always labeled the action as "Archive agent" but the backend
was performing a hard `DELETE`, permanently destroying chats and all
their messages.

This change replaces the hard delete with a soft archive, consistent
with the pattern used by template versions.

## Changes

### Database
- **Migration 000423**: Add `archived boolean DEFAULT false NOT NULL`
column to `chats` table
- Replace `DeleteChatByID` query with `ArchiveChatByID` (`UPDATE SET
archived = true`)
- Add `UnarchiveChatByID` query (`UPDATE SET archived = false`)
- Filter archived chats from `GetChatsByOwnerID` (`WHERE archived =
false`)

### API
- Remove `DELETE /api/experimental/chats/{chat}`
- Add `POST /api/experimental/chats/{chat}/archive` — archives a chat
and all its descendants
- Add `POST /api/experimental/chats/{chat}/unarchive` — unarchives a
single chat (API only, no UI yet)

### Backend
- `archiveChatTree()` recursively archives child chats (replaces
`deleteChatTree()` which hard-deleted)
- Chat daemon's `ArchiveChat()` archives the full chat tree in a
transaction
- Authorization uses `ActionUpdate` instead of `ActionDelete`

### SDK
- Replace `DeleteChat()` with `ArchiveChat()` and `UnarchiveChat()`
- Add `Archived` field to `Chat` struct

### Frontend
- `archiveChat` API call uses `POST .../archive` instead of `DELETE`
- No UI changes — the "Archive agent" button now actually archives
instead of deleting

## Design Decision

This follows the **template version archive pattern** (Pattern B in the
codebase):
- `archived boolean` column (not `deleted boolean`)
- Dedicated `POST .../archive` and `POST .../unarchive` routes (not
repurposing `DELETE`)
- Reversible — users can unarchive via the API (UI for this will come
later)
2026-02-27 16:46:19 -05:00
Kyle Carberry 360df1d84f fix(chatd): publish streaming message_part events during compaction (#22410)
## Problem

Context compaction in chatd persisted durable messages for the
`chat_summarized` tool call and result via `publishMessage`, but never
published `message_part` streaming events via `publishMessagePart`. This
meant connected clients had no streaming representation of the
compaction.

The client's `streamState` (built entirely from `message_part` events in
`streamState.ts`) never saw the compaction tool call, so:

- No **"Summarizing..."** running state was shown to the user during
summary generation (which can take up to 90s).
- The durable `message` events arrived after or interleaved with the
`status: waiting` event, causing the tool to appear as "Summarized" with
the chat appearing to just stop.

## Fix

### 1. `CompactionOptions.OnStart` callback (chatloop)

Added an `OnStart` callback to `CompactionOptions`, called in
`maybeCompact` right before `generateCompactionSummary` (the slow LLM
call). This gives `chatd` a hook to publish the tool-call `message_part`
immediately when compaction begins.

### 2. Tool-result streaming part (chatd)

`persistChatContextSummary` now publishes a tool-result `message_part`
before the durable `message` events, so clients transition from
"Summarizing..." to "Summarized" before the status change arrives.

### Event ordering is now:
1. `message_part` (tool call via `OnStart`) — client shows
"Summarizing..."
2. LLM generates summary (up to 90s)
3. `message_part` (tool result) — client shows "Summarized" in stream
state
4. `message` (assistant) — durable message persisted, stream state
resets
5. `message` (tool) — durable tool result persisted
6. `status: waiting` — chat transitions to idle

## Tests

- **`OnStartFiresBeforePersist`**: Verifies callback ordering is
`on_start` → `generate` → `persist`.
- **`OnStartNotCalledBelowThreshold`**: Verifies `OnStart` is not called
when context usage is below the compaction threshold.
2026-02-27 16:33:39 -05:00
Kyle Carberry f509c841cf fix(chatd): recover stale chats after coderd redeployment (#22405)
## Problem

When coderd instances are redeployed (e.g. rolling deployment on
dogfood), in-flight chats get stuck in `running` status permanently. The
UI shows them as "thinking" with a spinning indicator, but no worker is
actually processing them. They never error or resume.

## Root Cause

Two bugs combine to cause this:

### Bug 1: Shutdown cleanup uses a canceled context

The `processChat` defer block updates the chat status in the DB when
processing completes. But it uses `ctx`, which `Close()` cancels
*before* the defer runs. The DB transaction silently fails with
`context.Canceled`, leaving the chat in `status=running` with a dead
`worker_id`.

```go
// Close() calls p.cancel() which cancels ctx
// Then the defer tries to use the now-canceled ctx:
defer func() {
    err := p.db.InTx(func(tx database.Store) error {
        tx.GetChatByIDForUpdate(ctx, chat.ID) // FAILS
        tx.UpdateChatStatus(ctx, ...)          // FAILS
    }, nil)
}()
```

### Bug 2: Stale recovery runs only once at startup

`recoverStaleChats()` was called only once in `start()`, not
periodically. During a rolling deployment, the new instance starts while
the old one is still alive (fresh heartbeat). By the time the old
instance crashes, no one checks again.

## Fix

1. **Use `context.WithoutCancel(ctx)` in the processChat defer** — the
cleanup transaction now completes even during graceful shutdown.

2. **Run `recoverStaleChats` periodically** — a second ticker in the
`start()` loop checks for stale chats at `inFlightChatStaleAfter / 5`
intervals (default: every 1 minute). This catches orphaned chats even
when the instance that owns them crashes without clean shutdown.

## Tests

- `TestRecoverStaleChatsPeriodically` — Verifies chats orphaned *after*
startup are recovered by the periodic loop (not just the startup check).
- `TestNewReplicaRecoversStaleChatFromDeadReplica` — Verifies a new
replica recovers stale chats on startup.
- `TestWaitingChatsAreNotRecoveredAsStale` — Negative test: `waiting`
chats are not incorrectly modified by recovery.
2026-02-27 15:25:40 -05:00
Kyle Carberry b65c0766d2 feat: add line-based read_file tool with safety limits (#22400)
## Summary

Adds a new line-based file reading endpoint to the workspace agent,
replacing the unbounded byte-based approach for the `read_file` chat
tool and `coder_workspace_read_file` MCP tool.

**Problem**: The current `read_file` tool returns the entire file
contents with no limits, which can blow up LLM context windows and cause
OOM issues with large files.

**Solution**: Inspired by [`coder/mux`](https://github.com/coder/mux)
and [`openai/codex`](https://github.com/openai/codex), implement a
line-based reader with safety limits.

## Changes

### Agent (`agent/agentfiles/`)
- New `/read-file-lines` endpoint with `HandleReadFileLines` handler
- Line-based `offset` (1-based line number, default: 1) and `limit`
(line count, default: 2000)
- Safety constants:
  | Constant | Value | Purpose |
  |---|---|---|
  | `MaxFileSize` | 1 MB | Reject files larger than this at stat |
| `MaxLineBytes` | 1,024 | Per-line truncation with `... [truncated]`
marker |
  | `MaxResponseLines` | 2,000 | Max lines per response |
  | `MaxResponseBytes` | 32 KB | Max total response size |
  | `DefaultLineLimit` | 2,000 | Default when no limit specified |
- Line numbering format: `1\tcontent` (tab-separated)
- Structured JSON response: `{ success, file_size, total_lines,
lines_read, content, error }`
- Hard errors when limits exceeded — tells the LLM to use
`offset`/`limit`
- Existing byte-based `/read-file` endpoint preserved (used by
`instruction.go`)

### SDK (`codersdk/workspacesdk/`)
- `ReadFileLinesResponse` type added
- `ReadFileLines` method added to `AgentConn` interface
- Mock regenerated

### Chat tool (`coderd/chatd/chattool/`)
- `read_file` tool now uses `conn.ReadFileLines()` instead of
`conn.ReadFile()`
- Updated tool description to document line-based parameters
- Response includes `file_size`, `total_lines`, `lines_read` metadata

### MCP tool (`codersdk/toolsdk/`)
- `coder_workspace_read_file` updated to use line-based reading
- Schema descriptions updated for line-based offset/limit
- Removed `maxFileLimit` constant (agent handles limits now)

### Tests
- 13 new test cases for `TestReadFileLines`:
- Path validation (empty, relative, non-existent, directory, no
permissions)
  - Empty file handling
  - Basic read, offset, limit, offset+limit combinations
  - Offset beyond file length
  - Long line truncation (>1024 bytes)
  - Large file rejection (>1MB)
- All existing tests pass unchanged

## Design decisions

| Decision | Rationale |
|---|---|
| Line-based, not byte-based | Both coder/mux and openai/codex use
line-based — matches how LLMs reason about code |
| Default limit of 2000 | Matches codex; prevents accidental full-file
dumps while being generous |
| 32 KB response cap | Compromise between mux (16 KB) and codex (no cap)
|
| 1024 byte/line truncation with marker | More generous than codex
(500), marker helps LLM know data is missing |
| Hard errors on overflow | Matches mux; forces LLM to paginate rather
than getting partial data |
| Preserve byte-based endpoint | `instruction.go` needs raw byte access
for AGENTS.md |
2026-02-27 15:12:56 -05:00
Kyle Carberry ff687aa780 fix: re-read chat before publishing status event to preserve AI title (#22402)
## Problem

Chat titles revert to the fallback truncated title after briefly showing
the AI-generated title. Even reloading the page doesn't help — the
correct title flashes then gets overwritten.

## Root Cause

Single bug, two symptoms.

In `processChat` (`coderd/chatd/chatd.go`), the `chat` variable is
passed by value. The flow:

1. `processChat(ctx, chat)` receives `chat` with the initial fallback
title (truncated first message).
2. Inside `runChat`, `maybeGenerateChatTitle` generates an AI title,
writes it to the DB via `UpdateChatByID`, and publishes a `title_change`
event. **The DB has the correct title.** The client briefly displays it.
3. `runChat` returns. The **deferred cleanup** in `processChat`
publishes `publishChatPubsubEvent(chat, StatusChange)` — but `chat` here
is the original value copy that still has the **old fallback title**.
4. The frontend receives the `status_change` SSE event and
**unconditionally applies `title` from every event kind** (see
`AgentsPage.tsx` line ~305: `title: updatedChat.title`). This overwrites
the correct AI title with the stale fallback.

**Why reload doesn't help:** If the chat is still processing when the
page reloads, `listChats` loads the correct title from the DB, but then
the deferred `status_change` event arrives moments later and clobbers
it. The title was always in the DB — it was the pubsub event that kept
overwriting it.

## Fix

Re-read the chat from the database in the deferred cleanup before
publishing the final `status_change` event, so it carries the current
(AI-generated) title.
2026-02-27 15:06:36 -05:00
Kyle Carberry 344d11fa22 feat: include OS and working directory in workspace agent prompt injection (#22399)
When injecting system instructions into the chat prompt, include:

1. **Operating system** and **working directory** from the
`workspace_agents` table
2. **Home-level instructions** from `~/.coder/AGENTS.md` (existing
behavior)
3. **Project-level instructions** from `<pwd>/AGENTS.md` (new)

The XML tag is renamed from `<coder-home-instructions>` to
`<system-instructions>` since it now carries more than just the home
instruction file.

### Example output (both files present)

```xml
<system-instructions>
Operating System: linux
Working Directory: /home/coder/coder

Source: /home/coder/.coder/AGENTS.md
... home instructions ...

Source: /home/coder/coder/AGENTS.md
... project instructions ...
</system-instructions>
```

### Example output (no AGENTS.md files)

```xml
<system-instructions>
Operating System: linux
Working Directory: /home/coder/coder
</system-instructions>
```

### Changes

- **`coderd/chatd/instruction.go`**:
- Renamed types: `homeInstructionContext` → `agentContext`, added
`instructionFile` struct
  - Extracted `readInstructionFileAtPath` shared helper
- Added `readWorkingDirectoryInstructionFile` to read `<pwd>/AGENTS.md`
- Replaced `formatHomeInstruction` with `formatInstructions` that
renders both files under `<system-instructions>`
- **`coderd/chatd/chatd.go`**:
- Renamed `resolveHomeInstruction` → `resolveInstructions`; now reads
both home and pwd instruction files
- `resolveAgentContext` returns `agentContext` (renamed from
`homeInstructionContext`)
- pwd file read is skipped gracefully if directory is empty or file
doesn't exist
- **`coderd/chatd/instruction_test.go`**:
- Added `TestReadWorkingDirectoryInstructionFile` (success, not-found,
empty-directory)
- Replaced `TestFormatHomeInstruction` with `TestFormatInstructions`
covering all combinations
- Added ordering test (`AgentContextBeforeFiles`) to verify OS/pwd
appear before file sources
2026-02-27 14:21:23 -05:00
Kyle Carberry 59cec5be65 feat: add pagination and popularity sorting to chattool list_templates (#22398)
## Summary

The `chattool` `list_templates` tool previously returned all templates
in a single response with no popularity signal. On deployments with many
templates (e.g. 71 on dogfood), this wastes tokens and makes it hard for
the AI to pick the right template for broad user questions.

## Changes

Single file: `coderd/chatd/chattool/listtemplates.go`

- **`page` parameter** — optional, 1-indexed, 10 results per page
- **Popularity sort** — queries
`GetWorkspaceUniqueOwnerCountByTemplateIDs` to get active developer
counts, then sorts descending (most popular first). The DB query returns
templates alphabetically, so this explicit sort is needed.
- **`active_developers`** — included on each template item so the agent
can see the signal
- **Pagination metadata** — `page`, `total_pages`, `total_count` in the
response so the agent knows there are more results
- **Updated tool description** — tells the agent that results are
ordered by popularity and paginated

## Frontend

No frontend changes needed. The renderer already reads `rec.templates`
and `rec.count` from the response — the new fields (`page`,
`total_pages`, `total_count`) are additive and safely ignored.
2026-02-27 14:06:22 -05:00
Kyle Carberry edee917d88 feat: add experimental agents support (#22290)
feat: add AI chat system with agent tools and chat UI

Introduce the chatd subsystem and Agents UI for AI-powered chat
within Coder workspaces.

- Add chatd package with chat loop, message compaction, prompt
  management, and LLM provider integration (OpenAI, Anthropic)
- Add agent tools: create workspace, list/read templates, read/write/
  edit files, execute commands
- Add chat API endpoints with streaming, message editing, and
  durable reconnection
- Add database schema and migrations for chats, chat messages, chat
  providers, and chat model configs
- Add RBAC policies and dbauthz enforcement for chat resources
- Add Agents UI pages with conversation timeline, queued messages
  list, diff viewer, and model configuration panel
- Add comprehensive test coverage including coderd integration tests,
  chatd unit tests, and Storybook stories
- Gate feature behind experiments flag

---------

Co-authored-by: Cian Johnston <cian@coder.com>
Co-authored-by: Danielle Maywood <danielle@themaywoods.com>
Co-authored-by: Jeremy Ruppel <jeremy@coder.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 16:50:56 +00:00