Commit Graph

9 Commits

Author SHA1 Message Date
Kyle Carberry 0a026fde39 refactor: remove reasoning title extraction from chat pipeline (#22926)
Removes the backend and frontend logic that extracted compact titles
from reasoning/thinking blocks. The `Title` field on `ChatMessagePart`
remains for other part types (e.g. source), but reasoning blocks no
longer have titles derived from first-line markdown bold text or
provider metadata summaries.

**Backend:**
- Remove `ReasoningTitleFromFirstLine`, `reasoningTitleFromContent`,
`reasoningSummaryTitle`, `compactReasoningSummaryTitle`, and
`reasoningSummaryHeadline` from chatprompt
- Simplify `marshalContentBlock` to plain `json.Marshal` (no title
injection)
- Remove title tracking maps and `setReasoningTitleFromText` from
chatloop stream processing
- Remove `reasoningStoredTitle` from db2sdk
- Remove related tests from db2sdk_test

**Frontend:**
- Remove `mergeThinkingTitles` from blockUtils
- Simplify `appendTextBlock` to always merge consecutive thinking blocks
- Remove `applyStreamThinkingTitle` from streamState
- Simplify reasoning/thinking stream handler to ignore title-only parts
- Update tests accordingly

Net: **-487 lines / +42 lines**
2026-03-11 11:01:26 +00:00
Kyle Carberry f35b99a4fa fix(chatd): preserve context.Canceled in persistStep during shutdown (#22890)
## Problem

When a chat worker shuts down gracefully (e.g. Kubernetes pod SIGTERM)
while a tool is executing (like `wait_agent` polling for a subagent),
the chat gets stuck in `waiting` status forever — no other worker will
pick it up.

### Root Cause

`persistStep` in `chatd.go` unconditionally returned
`chatloop.ErrInterrupted` for **any** canceled context:

```go
if persistCtx.Err() != nil {
    return chatloop.ErrInterrupted  // BUG: doesn't check WHY the context was canceled
}
```

During shutdown, the context cause is `context.Canceled` (not
`ErrInterrupted`). But because `persistStep` returned `ErrInterrupted`,
the error handling in `processChat` hit the `ErrInterrupted` check first
(line 2011) and set status to `waiting` — the `isShutdownCancellation`
check (line 2017) was never reached:

```go
// Checked FIRST — matches because persistStep returned ErrInterrupted
if errors.Is(err, chatloop.ErrInterrupted) {
    status = database.ChatStatusWaiting  // Stuck forever
    return
}
// NEVER REACHED during shutdown
if isShutdownCancellation(ctx, chatCtx, err) {
    status = database.ChatStatusPending  // Would have been correct
    return
}
```

### Trigger scenario (from production logs)

1. Chat spawns a subagent via `spawn_agent`, then calls `wait_agent`
2. `wait_agent` blocks in `awaitSubagentCompletion` polling loop
3. Worker pod receives SIGTERM → `Close()` cancels server context
4. Context cancellation propagates to `awaitSubagentCompletion` →
returns `context.Canceled`
5. Tool execution completes, `persistStep` is called with canceled
context
6. `persistStep` returns `ErrInterrupted` (wrong!) → status set to
`waiting` (stuck!)

## Fix

Check `context.Cause()` before deciding which error to return:

```go
if persistCtx.Err() != nil {
    if errors.Is(context.Cause(persistCtx), chatloop.ErrInterrupted) {
        return chatloop.ErrInterrupted  // Intentional interruption
    }
    return persistCtx.Err()  // Shutdown → context.Canceled
}
```

This preserves `context.Canceled` for shutdown, allowing
`isShutdownCancellation` to match and set status to `pending` so another
worker retries the chat.

## Test

Added `TestRun_ShutdownDuringToolExecutionReturnsContextCanceled` which:
1. Streams a tool call to a blocking tool (simulating `wait_agent`)
2. Cancels the server context (simulating shutdown) while the tool
blocks
3. Verifies `Run` returns `context.Canceled`, NOT `ErrInterrupted`
2026-03-10 13:01:45 +00:00
Kyle Carberry aba3832b15 fix: update the compaction message to be the "user" role (#22819)
## Bug

After compaction in the chat loop, the loop re-enters and calls the LLM
with a prompt that has **no non-system messages**. Anthropic (and most
providers) require at least one user/assistant/tool message, so the API
errors with empty messages.

## Root Cause

The compaction summary was stored as `role=system`. After compaction,
`GetChatMessagesForPromptByChatID` returns only:
- The compressed system summary (matched by the CTE)
- Original non-compressed system messages (system prompts)

All original user/assistant/tool messages are excluded (they predate the
summary). The compaction assistant/tool messages are `compressed=TRUE`
and don't match the main query's `compressed=FALSE` clauses.

So `ReloadMessages` returned only system messages. The Anthropic
provider moves system messages into a separate `system` field, leaving
the `messages` API field as `[]`.

## Fix

1. **Changed compaction summary from `role=system` to `role=user`** —
the summary now appears as a user message in the reloaded prompt, giving
the model valid conversational context to respond to.

2. **Simplified the CTE** — removed the `role = 'system'` check and
narrowed `visibility IN ('model', 'both')` to just `visibility =
'model'`. The summary is the only compressed message with
`visibility=model` (the assistant has `visibility=user`, the tool has
`visibility=both`), so the role check was redundant.

## Test

`PostRunCompactionReEntryIncludesUserSummary`: verifies the re-entry
prompt contains a user message (the compaction summary) after compaction
+ reload.
2026-03-08 22:25:27 -04:00
Kyle Carberry eecb7d0b66 fix: resolve bugs in chatd streaming system (#22720)
Split from #22693 per review feedback.

Fixes multiple bugs in coderd/chatd and sub-packages including race
conditions, transaction safety, stream buffer bounds, retry limits, and
enterprise relay improvements.

See commit message for full list.
2026-03-06 21:02:25 +00:00
Kyle Carberry 5630390d94 fix(chatd): enable compaction between steps and re-enter after summarization (#22640)
## Problem

Three bugs with chat summarization (compaction) share a single root
cause: `ReloadMessages` was never wired up in the production
`chatloop.Run()` call.

### Bug 1: Compaction never fires between steps

The inline compaction guard in `chatloop.go` requires both `Compaction`
and `ReloadMessages` to be non-nil:

```go
if opts.Compaction != nil && opts.ReloadMessages != nil {
```

Since `ReloadMessages` was only set in tests, inline compaction was
**dead code in production**. Long multi-step turns could blow through
the context window.

### Bug 2: Compaction only occurs at end of turn

The post-run safety net doesn't check `ReloadMessages`, so it was the
only compaction path that fired:

```go
if !alreadyCompacted && opts.Compaction != nil { // no ReloadMessages check
```

This meant compaction only happened once, after the entire agent turn
finished.

### Bug 3: Agent stops after summarization

After post-run compaction, `Run()` unconditionally returned `nil`.
`processChat` then set the chat status to `waiting` (done). The agent
never had a chance to continue with its fresh summarized context.

## Fix

1. **Wire up `ReloadMessages`** in `chatd.go`: reloads persisted
messages from the database and re-applies system prompts (subagent
instruction, workspace AGENTS.md).

2. **Wrap the step loop in an outer compaction loop**: when compaction
fires on the model's final step (`compactedOnFinalStep`), reload
messages and `continue` the outer loop so the agent re-enters with
summarized context.

3. **Track `compactedOnFinalStep`** to distinguish inline compaction on
the last step (needs re-entry) from inline compaction mid-loop followed
by more tool-call steps (agent already consumed the compacted context,
no re-entry needed).

4. **Add `maxCompactionRetries = 3`** to prevent infinite compaction
loops.

## Testing

- All 7 existing compaction tests pass unchanged.
- Added `PostRunCompactionReEntersStepLoop` test: verifies that when a
text-only response triggers compaction, the outer loop re-enters and the
agent makes a second stream call with fresh context.
2026-03-04 22:28:23 -05:00
Kyle Carberry ddfe630757 refactor(chatd): replace fantasy.Agent with custom agent loop (#22507)
## Summary

Replaces fantasy's `Agent` abstraction with a direct step loop calling
`LanguageModel.Stream()`. Fantasy is retained as the provider
abstraction layer (streaming parsers, types, tool schema) but we no
longer use `fantasy.Agent`, `AgentStreamCall`, `AgentResult`, or
`StepResult`.

## Problems solved

| Problem | Before | After |
|---|---|---|
| **Sentinel prompt hack** | fantasy.Agent requires non-empty Prompt →
UUID sentinel generated and stripped in PrepareStep | Messages passed
directly to `model.Stream()` |
| **Discarded PersistStep errors** | `_ = opts.OnStepFinish(result)`
silently swallows errors | Errors propagate directly from
`PersistStep()` |
| **Shadow draft state** | ~160 LOC tracking content in parallel because
fantasy doesn't expose in-progress content on interruption |
`stepResult` owns content directly; `flushActiveState()` is trivial |
| **Nested retry layers** | fantasy's 2-attempt retry nested inside
chatretry's indefinite retry | Single `chatretry.Retry` layer |
| **Callback-mediated compaction** | Mutex + boolean flag + coordination
between OnStepFinish/PrepareStep callbacks | Inline `if` statement
between steps |
| **Duplicate compaction paths** | `compactStep()` + `maybeCompact()`
sharing ~80% logic | Single `tryCompact()` function |

## Changes

### `coderd/chatd/chatloop/chatloop.go` — Rewritten
- **Removed**: `fantasy.NewAgent()`, `AgentStreamCall`, sentinel prompt,
shadow draft state (~160 LOC of closures), `compactedMu`/`compacted`
flag, `PrepareStepResult`
- **Added**: `stepResult` struct, `processStepStream()` (stream
consumer), `executeTools()` (sequential tool execution),
`flushActiveState()` (interrupt handling), `buildToolDefinitions()`,
`toResponseMessages()`
- **Changed**: `Run()` return type from `(*fantasy.AgentResult, error)`
to `error` (callers already discarded the result)
- **Preserved**: Anthropic prompt caching, reasoning title extraction,
`extractContextLimit()`, `ErrInterrupted` semantics

### `coderd/chatd/chatloop/compaction.go` — Simplified
- Merged `compactStep()` + `maybeCompact()` → single `tryCompact()`
- Removed `[]StepResult` parameter from `generateCompactionSummary()`
(caller provides complete message list)
- Kept helper functions: `normalizedCompactionConfig`,
`contextTokensFromUsage`, `resolveContextLimit`, `shouldCompact`

### `coderd/chatd/chatd.go` — Caller updates
- Removed `AgentStreamCall` construction
- Changed `_, err = chatloop.Run(...)` to `err = chatloop.Run(...)`
- Model parameters moved from `AgentStreamCall` fields to `RunOptions`
fields

### Tests — 4 new tests
- `MidLoopCompactionReloadsMessages` — compaction fires mid-loop,
messages reloaded
- `PostRunCompactionSkippedAfterMidLoop` — no double compaction
- `MultiStepToolExecution` — tools execute between steps, results feed
next step
- `PersistStepErrorPropagates` — persistence errors propagate (was
silently discarded)
2026-03-02 18:51:57 -05:00
Kyle Carberry 2bdacae5f5 feat(chatd): add LLM stream retry with exponential backoff (#22418)
## Summary

Adds automatic retry with exponential backoff for transient LLM errors
during chat streaming and title generation. Inspired by
[coder/mux](https://github.com/coder/mux)'s retry mechanism.

## Key Behaviors

- **Infinite retries** with exponential backoff: 1s → 2s → 4s → ... →
60s cap
- **Deterministic delays** (no jitter)
- **Error classification**: retryable (429, 5xx, overloaded, rate limit,
network errors) vs non-retryable (auth, quota, context exceeded, model
not found, canceled)
- **Retry status published to SSE stream** so frontend can show
"Retrying in Xs..." UI
- **Title generation** retries silently (best-effort, nil onRetry
callback)

## New Package: `coderd/chatd/chatretry/`

| File | Purpose |
|------|---------|
| `classify.go` | `IsRetryable(err)` and `StatusCodeRetryable(code)` |
| `backoff.go` | `Delay(attempt)` — exponential doubling with 60s cap |
| `retry.go` | `Retry(ctx, fn, onRetry)` — infinite loop with
context-aware timer |

## Test Helpers: `coderd/chatd/chattest/errors.go`

Anthropic and OpenAI error response builders for use in chattest
providers:
- `AnthropicErrorResponse()`, `AnthropicOverloadedResponse()`,
`AnthropicRateLimitResponse()`
- `OpenAIErrorResponse()`, `OpenAIRateLimitResponse()`,
`OpenAIServerErrorResponse()`

## SDK Changes: `codersdk/chats.go`

- New `ChatStreamEventType: "retry"`
- New `ChatStreamRetry` struct with `Attempt`, `DelayMs`, `Error`,
`RetryingAt` fields
- TypeScript types auto-generated

## Changed Files

- `coderd/chatd/chatloop/chatloop.go` — wraps `agent.Stream()` in
`chatretry.Retry()`
- `coderd/chatd/chatd.go` — publishes retry events to SSE stream with
logging
- `coderd/chatd/title.go` — wraps `model.Generate()` in silent retry
- `coderd/chatd/chattest/anthropic.go` / `openai.go` — error injection
support

## Tests

42 tests covering classification (33), backoff (9), and retry scenarios
(8).
2026-02-27 18:34:33 -05:00
Kyle Carberry 360df1d84f fix(chatd): publish streaming message_part events during compaction (#22410)
## Problem

Context compaction in chatd persisted durable messages for the
`chat_summarized` tool call and result via `publishMessage`, but never
published `message_part` streaming events via `publishMessagePart`. This
meant connected clients had no streaming representation of the
compaction.

The client's `streamState` (built entirely from `message_part` events in
`streamState.ts`) never saw the compaction tool call, so:

- No **"Summarizing..."** running state was shown to the user during
summary generation (which can take up to 90s).
- The durable `message` events arrived after or interleaved with the
`status: waiting` event, causing the tool to appear as "Summarized" with
the chat appearing to just stop.

## Fix

### 1. `CompactionOptions.OnStart` callback (chatloop)

Added an `OnStart` callback to `CompactionOptions`, called in
`maybeCompact` right before `generateCompactionSummary` (the slow LLM
call). This gives `chatd` a hook to publish the tool-call `message_part`
immediately when compaction begins.

### 2. Tool-result streaming part (chatd)

`persistChatContextSummary` now publishes a tool-result `message_part`
before the durable `message` events, so clients transition from
"Summarizing..." to "Summarized" before the status change arrives.

### Event ordering is now:
1. `message_part` (tool call via `OnStart`) — client shows
"Summarizing..."
2. LLM generates summary (up to 90s)
3. `message_part` (tool result) — client shows "Summarized" in stream
state
4. `message` (assistant) — durable message persisted, stream state
resets
5. `message` (tool) — durable tool result persisted
6. `status: waiting` — chat transitions to idle

## Tests

- **`OnStartFiresBeforePersist`**: Verifies callback ordering is
`on_start` → `generate` → `persist`.
- **`OnStartNotCalledBelowThreshold`**: Verifies `OnStart` is not called
when context usage is below the compaction threshold.
2026-02-27 16:33:39 -05:00
Kyle Carberry edee917d88 feat: add experimental agents support (#22290)
feat: add AI chat system with agent tools and chat UI

Introduce the chatd subsystem and Agents UI for AI-powered chat
within Coder workspaces.

- Add chatd package with chat loop, message compaction, prompt
  management, and LLM provider integration (OpenAI, Anthropic)
- Add agent tools: create workspace, list/read templates, read/write/
  edit files, execute commands
- Add chat API endpoints with streaming, message editing, and
  durable reconnection
- Add database schema and migrations for chats, chat messages, chat
  providers, and chat model configs
- Add RBAC policies and dbauthz enforcement for chat resources
- Add Agents UI pages with conversation timeline, queued messages
  list, diff viewer, and model configuration panel
- Add comprehensive test coverage including coderd integration tests,
  chatd unit tests, and Storybook stories
- Gate feature behind experiments flag

---------

Co-authored-by: Cian Johnston <cian@coder.com>
Co-authored-by: Danielle Maywood <danielle@themaywoods.com>
Co-authored-by: Jeremy Ruppel <jeremy@coder.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 16:50:56 +00:00