mirror of
https://github.com/coder/coder.git
synced 2026-06-04 13:38:21 +00:00
a46336c3ec00906429e48e781f75a2202a3fdb8f
9 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
0a026fde39 |
refactor: remove reasoning title extraction from chat pipeline (#22926)
Removes the backend and frontend logic that extracted compact titles from reasoning/thinking blocks. The `Title` field on `ChatMessagePart` remains for other part types (e.g. source), but reasoning blocks no longer have titles derived from first-line markdown bold text or provider metadata summaries. **Backend:** - Remove `ReasoningTitleFromFirstLine`, `reasoningTitleFromContent`, `reasoningSummaryTitle`, `compactReasoningSummaryTitle`, and `reasoningSummaryHeadline` from chatprompt - Simplify `marshalContentBlock` to plain `json.Marshal` (no title injection) - Remove title tracking maps and `setReasoningTitleFromText` from chatloop stream processing - Remove `reasoningStoredTitle` from db2sdk - Remove related tests from db2sdk_test **Frontend:** - Remove `mergeThinkingTitles` from blockUtils - Simplify `appendTextBlock` to always merge consecutive thinking blocks - Remove `applyStreamThinkingTitle` from streamState - Simplify reasoning/thinking stream handler to ignore title-only parts - Update tests accordingly Net: **-487 lines / +42 lines** |
||
|
|
f35b99a4fa |
fix(chatd): preserve context.Canceled in persistStep during shutdown (#22890)
## Problem
When a chat worker shuts down gracefully (e.g. Kubernetes pod SIGTERM)
while a tool is executing (like `wait_agent` polling for a subagent),
the chat gets stuck in `waiting` status forever — no other worker will
pick it up.
### Root Cause
`persistStep` in `chatd.go` unconditionally returned
`chatloop.ErrInterrupted` for **any** canceled context:
```go
if persistCtx.Err() != nil {
return chatloop.ErrInterrupted // BUG: doesn't check WHY the context was canceled
}
```
During shutdown, the context cause is `context.Canceled` (not
`ErrInterrupted`). But because `persistStep` returned `ErrInterrupted`,
the error handling in `processChat` hit the `ErrInterrupted` check first
(line 2011) and set status to `waiting` — the `isShutdownCancellation`
check (line 2017) was never reached:
```go
// Checked FIRST — matches because persistStep returned ErrInterrupted
if errors.Is(err, chatloop.ErrInterrupted) {
status = database.ChatStatusWaiting // Stuck forever
return
}
// NEVER REACHED during shutdown
if isShutdownCancellation(ctx, chatCtx, err) {
status = database.ChatStatusPending // Would have been correct
return
}
```
### Trigger scenario (from production logs)
1. Chat spawns a subagent via `spawn_agent`, then calls `wait_agent`
2. `wait_agent` blocks in `awaitSubagentCompletion` polling loop
3. Worker pod receives SIGTERM → `Close()` cancels server context
4. Context cancellation propagates to `awaitSubagentCompletion` →
returns `context.Canceled`
5. Tool execution completes, `persistStep` is called with canceled
context
6. `persistStep` returns `ErrInterrupted` (wrong!) → status set to
`waiting` (stuck!)
## Fix
Check `context.Cause()` before deciding which error to return:
```go
if persistCtx.Err() != nil {
if errors.Is(context.Cause(persistCtx), chatloop.ErrInterrupted) {
return chatloop.ErrInterrupted // Intentional interruption
}
return persistCtx.Err() // Shutdown → context.Canceled
}
```
This preserves `context.Canceled` for shutdown, allowing
`isShutdownCancellation` to match and set status to `pending` so another
worker retries the chat.
## Test
Added `TestRun_ShutdownDuringToolExecutionReturnsContextCanceled` which:
1. Streams a tool call to a blocking tool (simulating `wait_agent`)
2. Cancels the server context (simulating shutdown) while the tool
blocks
3. Verifies `Run` returns `context.Canceled`, NOT `ErrInterrupted`
|
||
|
|
aba3832b15 |
fix: update the compaction message to be the "user" role (#22819)
## Bug
After compaction in the chat loop, the loop re-enters and calls the LLM
with a prompt that has **no non-system messages**. Anthropic (and most
providers) require at least one user/assistant/tool message, so the API
errors with empty messages.
## Root Cause
The compaction summary was stored as `role=system`. After compaction,
`GetChatMessagesForPromptByChatID` returns only:
- The compressed system summary (matched by the CTE)
- Original non-compressed system messages (system prompts)
All original user/assistant/tool messages are excluded (they predate the
summary). The compaction assistant/tool messages are `compressed=TRUE`
and don't match the main query's `compressed=FALSE` clauses.
So `ReloadMessages` returned only system messages. The Anthropic
provider moves system messages into a separate `system` field, leaving
the `messages` API field as `[]`.
## Fix
1. **Changed compaction summary from `role=system` to `role=user`** —
the summary now appears as a user message in the reloaded prompt, giving
the model valid conversational context to respond to.
2. **Simplified the CTE** — removed the `role = 'system'` check and
narrowed `visibility IN ('model', 'both')` to just `visibility =
'model'`. The summary is the only compressed message with
`visibility=model` (the assistant has `visibility=user`, the tool has
`visibility=both`), so the role check was redundant.
## Test
`PostRunCompactionReEntryIncludesUserSummary`: verifies the re-entry
prompt contains a user message (the compaction summary) after compaction
+ reload.
|
||
|
|
eecb7d0b66 |
fix: resolve bugs in chatd streaming system (#22720)
Split from #22693 per review feedback. Fixes multiple bugs in coderd/chatd and sub-packages including race conditions, transaction safety, stream buffer bounds, retry limits, and enterprise relay improvements. See commit message for full list. |
||
|
|
5630390d94 |
fix(chatd): enable compaction between steps and re-enter after summarization (#22640)
## Problem
Three bugs with chat summarization (compaction) share a single root
cause: `ReloadMessages` was never wired up in the production
`chatloop.Run()` call.
### Bug 1: Compaction never fires between steps
The inline compaction guard in `chatloop.go` requires both `Compaction`
and `ReloadMessages` to be non-nil:
```go
if opts.Compaction != nil && opts.ReloadMessages != nil {
```
Since `ReloadMessages` was only set in tests, inline compaction was
**dead code in production**. Long multi-step turns could blow through
the context window.
### Bug 2: Compaction only occurs at end of turn
The post-run safety net doesn't check `ReloadMessages`, so it was the
only compaction path that fired:
```go
if !alreadyCompacted && opts.Compaction != nil { // no ReloadMessages check
```
This meant compaction only happened once, after the entire agent turn
finished.
### Bug 3: Agent stops after summarization
After post-run compaction, `Run()` unconditionally returned `nil`.
`processChat` then set the chat status to `waiting` (done). The agent
never had a chance to continue with its fresh summarized context.
## Fix
1. **Wire up `ReloadMessages`** in `chatd.go`: reloads persisted
messages from the database and re-applies system prompts (subagent
instruction, workspace AGENTS.md).
2. **Wrap the step loop in an outer compaction loop**: when compaction
fires on the model's final step (`compactedOnFinalStep`), reload
messages and `continue` the outer loop so the agent re-enters with
summarized context.
3. **Track `compactedOnFinalStep`** to distinguish inline compaction on
the last step (needs re-entry) from inline compaction mid-loop followed
by more tool-call steps (agent already consumed the compacted context,
no re-entry needed).
4. **Add `maxCompactionRetries = 3`** to prevent infinite compaction
loops.
## Testing
- All 7 existing compaction tests pass unchanged.
- Added `PostRunCompactionReEntersStepLoop` test: verifies that when a
text-only response triggers compaction, the outer loop re-enters and the
agent makes a second stream call with fresh context.
|
||
|
|
ddfe630757 |
refactor(chatd): replace fantasy.Agent with custom agent loop (#22507)
## Summary Replaces fantasy's `Agent` abstraction with a direct step loop calling `LanguageModel.Stream()`. Fantasy is retained as the provider abstraction layer (streaming parsers, types, tool schema) but we no longer use `fantasy.Agent`, `AgentStreamCall`, `AgentResult`, or `StepResult`. ## Problems solved | Problem | Before | After | |---|---|---| | **Sentinel prompt hack** | fantasy.Agent requires non-empty Prompt → UUID sentinel generated and stripped in PrepareStep | Messages passed directly to `model.Stream()` | | **Discarded PersistStep errors** | `_ = opts.OnStepFinish(result)` silently swallows errors | Errors propagate directly from `PersistStep()` | | **Shadow draft state** | ~160 LOC tracking content in parallel because fantasy doesn't expose in-progress content on interruption | `stepResult` owns content directly; `flushActiveState()` is trivial | | **Nested retry layers** | fantasy's 2-attempt retry nested inside chatretry's indefinite retry | Single `chatretry.Retry` layer | | **Callback-mediated compaction** | Mutex + boolean flag + coordination between OnStepFinish/PrepareStep callbacks | Inline `if` statement between steps | | **Duplicate compaction paths** | `compactStep()` + `maybeCompact()` sharing ~80% logic | Single `tryCompact()` function | ## Changes ### `coderd/chatd/chatloop/chatloop.go` — Rewritten - **Removed**: `fantasy.NewAgent()`, `AgentStreamCall`, sentinel prompt, shadow draft state (~160 LOC of closures), `compactedMu`/`compacted` flag, `PrepareStepResult` - **Added**: `stepResult` struct, `processStepStream()` (stream consumer), `executeTools()` (sequential tool execution), `flushActiveState()` (interrupt handling), `buildToolDefinitions()`, `toResponseMessages()` - **Changed**: `Run()` return type from `(*fantasy.AgentResult, error)` to `error` (callers already discarded the result) - **Preserved**: Anthropic prompt caching, reasoning title extraction, `extractContextLimit()`, `ErrInterrupted` semantics ### `coderd/chatd/chatloop/compaction.go` — Simplified - Merged `compactStep()` + `maybeCompact()` → single `tryCompact()` - Removed `[]StepResult` parameter from `generateCompactionSummary()` (caller provides complete message list) - Kept helper functions: `normalizedCompactionConfig`, `contextTokensFromUsage`, `resolveContextLimit`, `shouldCompact` ### `coderd/chatd/chatd.go` — Caller updates - Removed `AgentStreamCall` construction - Changed `_, err = chatloop.Run(...)` to `err = chatloop.Run(...)` - Model parameters moved from `AgentStreamCall` fields to `RunOptions` fields ### Tests — 4 new tests - `MidLoopCompactionReloadsMessages` — compaction fires mid-loop, messages reloaded - `PostRunCompactionSkippedAfterMidLoop` — no double compaction - `MultiStepToolExecution` — tools execute between steps, results feed next step - `PersistStepErrorPropagates` — persistence errors propagate (was silently discarded) |
||
|
|
2bdacae5f5 |
feat(chatd): add LLM stream retry with exponential backoff (#22418)
## Summary Adds automatic retry with exponential backoff for transient LLM errors during chat streaming and title generation. Inspired by [coder/mux](https://github.com/coder/mux)'s retry mechanism. ## Key Behaviors - **Infinite retries** with exponential backoff: 1s → 2s → 4s → ... → 60s cap - **Deterministic delays** (no jitter) - **Error classification**: retryable (429, 5xx, overloaded, rate limit, network errors) vs non-retryable (auth, quota, context exceeded, model not found, canceled) - **Retry status published to SSE stream** so frontend can show "Retrying in Xs..." UI - **Title generation** retries silently (best-effort, nil onRetry callback) ## New Package: `coderd/chatd/chatretry/` | File | Purpose | |------|---------| | `classify.go` | `IsRetryable(err)` and `StatusCodeRetryable(code)` | | `backoff.go` | `Delay(attempt)` — exponential doubling with 60s cap | | `retry.go` | `Retry(ctx, fn, onRetry)` — infinite loop with context-aware timer | ## Test Helpers: `coderd/chatd/chattest/errors.go` Anthropic and OpenAI error response builders for use in chattest providers: - `AnthropicErrorResponse()`, `AnthropicOverloadedResponse()`, `AnthropicRateLimitResponse()` - `OpenAIErrorResponse()`, `OpenAIRateLimitResponse()`, `OpenAIServerErrorResponse()` ## SDK Changes: `codersdk/chats.go` - New `ChatStreamEventType: "retry"` - New `ChatStreamRetry` struct with `Attempt`, `DelayMs`, `Error`, `RetryingAt` fields - TypeScript types auto-generated ## Changed Files - `coderd/chatd/chatloop/chatloop.go` — wraps `agent.Stream()` in `chatretry.Retry()` - `coderd/chatd/chatd.go` — publishes retry events to SSE stream with logging - `coderd/chatd/title.go` — wraps `model.Generate()` in silent retry - `coderd/chatd/chattest/anthropic.go` / `openai.go` — error injection support ## Tests 42 tests covering classification (33), backoff (9), and retry scenarios (8). |
||
|
|
360df1d84f |
fix(chatd): publish streaming message_part events during compaction (#22410)
## Problem Context compaction in chatd persisted durable messages for the `chat_summarized` tool call and result via `publishMessage`, but never published `message_part` streaming events via `publishMessagePart`. This meant connected clients had no streaming representation of the compaction. The client's `streamState` (built entirely from `message_part` events in `streamState.ts`) never saw the compaction tool call, so: - No **"Summarizing..."** running state was shown to the user during summary generation (which can take up to 90s). - The durable `message` events arrived after or interleaved with the `status: waiting` event, causing the tool to appear as "Summarized" with the chat appearing to just stop. ## Fix ### 1. `CompactionOptions.OnStart` callback (chatloop) Added an `OnStart` callback to `CompactionOptions`, called in `maybeCompact` right before `generateCompactionSummary` (the slow LLM call). This gives `chatd` a hook to publish the tool-call `message_part` immediately when compaction begins. ### 2. Tool-result streaming part (chatd) `persistChatContextSummary` now publishes a tool-result `message_part` before the durable `message` events, so clients transition from "Summarizing..." to "Summarized" before the status change arrives. ### Event ordering is now: 1. `message_part` (tool call via `OnStart`) — client shows "Summarizing..." 2. LLM generates summary (up to 90s) 3. `message_part` (tool result) — client shows "Summarized" in stream state 4. `message` (assistant) — durable message persisted, stream state resets 5. `message` (tool) — durable tool result persisted 6. `status: waiting` — chat transitions to idle ## Tests - **`OnStartFiresBeforePersist`**: Verifies callback ordering is `on_start` → `generate` → `persist`. - **`OnStartNotCalledBelowThreshold`**: Verifies `OnStart` is not called when context usage is below the compaction threshold. |
||
|
|
edee917d88 |
feat: add experimental agents support (#22290)
feat: add AI chat system with agent tools and chat UI Introduce the chatd subsystem and Agents UI for AI-powered chat within Coder workspaces. - Add chatd package with chat loop, message compaction, prompt management, and LLM provider integration (OpenAI, Anthropic) - Add agent tools: create workspace, list/read templates, read/write/ edit files, execute commands - Add chat API endpoints with streaming, message editing, and durable reconnection - Add database schema and migrations for chats, chat messages, chat providers, and chat model configs - Add RBAC policies and dbauthz enforcement for chat resources - Add Agents UI pages with conversation timeline, queued messages list, diff viewer, and model configuration panel - Add comprehensive test coverage including coderd integration tests, chatd unit tests, and Storybook stories - Gate feature behind experiments flag --------- Co-authored-by: Cian Johnston <cian@coder.com> Co-authored-by: Danielle Maywood <danielle@themaywoods.com> Co-authored-by: Jeremy Ruppel <jeremy@coder.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |