coder

mirror of https://github.com/coder/coder.git synced 2026-06-07 15:08:20 +00:00

Author	SHA1	Message	Date
Kyle Carberry	1f37df4db3	perf(chatd): fix six scale bottlenecks identified by benchmarking (#22957) ## Summary Scale-tested the `chatd` package with mock-based benchmarks to identify performance bottlenecks. This PR fixes 6 of the 8 identified issues, ranked by severity. ## Changes ### 1. Parallel tool execution (HIGH) — `chatloop.go` `executeTools` ran tool calls sequentially. Now dispatches all calls concurrently via goroutines with `sync.WaitGroup`. Results are pre-allocated by index (no mutex needed). `onResult` callbacks fire as each tool completes. ### 2. Pubsub-backed subagent await (HIGH) — `subagent.go` `awaitSubagentCompletion` polled the DB every 200ms. Now subscribes to the child chat's `ChatStreamNotifyChannel` via pubsub for near-instant notifications. Fallback poll reduced to 5s. Falls back to 200ms only when `pubsub == nil` (single-instance / in-memory). ### 3. Per-chat stream locking (MEDIUM) — `chatd.go` Replaced single global `streamMu` + `map[uuid.UUID]*chatStreamState` with `sync.Map` where each `chatStreamState` has its own `sync.Mutex`. Zero cross-chat contention. ### 4. Batch chat acquisition (MEDIUM) — `chatd.go` `processOnce` acquired 1 chat per tick. Now loops up to `maxChatsPerAcquire = 10` per tick, avoiding idle time when many chats are pending. ### 5. Reduced heartbeat frequency (LOW-MEDIUM) — `chatd.go` `chatHeartbeatInterval` changed from 30s to 60s. Safe given the 5-minute `DefaultInFlightChatStaleAfter`. ### 6. O(depth) descendant check (LOW) — `subagent.go` Replaced top-down BFS (`O(total_descendants)` queries) with bottom-up parent-chain walk (`O(depth)` queries). Includes cycle protection. ## Not addressed (intentionally) - Message serialization overhead - Buffer eviction (`buffer[1:]` pattern)	2026-03-11 14:00:08 -04:00
Kyle Carberry	0a026fde39	refactor: remove reasoning title extraction from chat pipeline (#22926) Removes the backend and frontend logic that extracted compact titles from reasoning/thinking blocks. The `Title` field on `ChatMessagePart` remains for other part types (e.g. source), but reasoning blocks no longer have titles derived from first-line markdown bold text or provider metadata summaries. Backend: - Remove `ReasoningTitleFromFirstLine`, `reasoningTitleFromContent`, `reasoningSummaryTitle`, `compactReasoningSummaryTitle`, and `reasoningSummaryHeadline` from chatprompt - Simplify `marshalContentBlock` to plain `json.Marshal` (no title injection) - Remove title tracking maps and `setReasoningTitleFromText` from chatloop stream processing - Remove `reasoningStoredTitle` from db2sdk - Remove related tests from db2sdk_test Frontend: - Remove `mergeThinkingTitles` from blockUtils - Simplify `appendTextBlock` to always merge consecutive thinking blocks - Remove `applyStreamThinkingTitle` from streamState - Simplify reasoning/thinking stream handler to ignore title-only parts - Update tests accordingly Net: -487 lines / +42 lines	2026-03-11 11:01:26 +00:00
Kyle Carberry	eecb7d0b66	fix: resolve bugs in chatd streaming system (#22720) Split from #22693 per review feedback. Fixes multiple bugs in coderd/chatd and sub-packages including race conditions, transaction safety, stream buffer bounds, retry limits, and enterprise relay improvements. See commit message for full list.	2026-03-06 21:02:25 +00:00
Kyle Carberry	5630390d94	fix(chatd): enable compaction between steps and re-enter after summarization (#22640) ## Problem Three bugs with chat summarization (compaction) share a single root cause: `ReloadMessages` was never wired up in the production `chatloop.Run()` call. ### Bug 1: Compaction never fires between steps The inline compaction guard in `chatloop.go` requires both `Compaction` and `ReloadMessages` to be non-nil: ```go if opts.Compaction != nil && opts.ReloadMessages != nil { ``` Since `ReloadMessages` was only set in tests, inline compaction was dead code in production. Long multi-step turns could blow through the context window. ### Bug 2: Compaction only occurs at end of turn The post-run safety net doesn't check `ReloadMessages`, so it was the only compaction path that fired: ```go if !alreadyCompacted && opts.Compaction != nil { // no ReloadMessages check ``` This meant compaction only happened once, after the entire agent turn finished. ### Bug 3: Agent stops after summarization After post-run compaction, `Run()` unconditionally returned `nil`. `processChat` then set the chat status to `waiting` (done). The agent never had a chance to continue with its fresh summarized context. ## Fix 1. Wire up `ReloadMessages` in `chatd.go`: reloads persisted messages from the database and re-applies system prompts (subagent instruction, workspace AGENTS.md). 2. Wrap the step loop in an outer compaction loop: when compaction fires on the model's final step (`compactedOnFinalStep`), reload messages and `continue` the outer loop so the agent re-enters with summarized context. 3. Track `compactedOnFinalStep` to distinguish inline compaction on the last step (needs re-entry) from inline compaction mid-loop followed by more tool-call steps (agent already consumed the compacted context, no re-entry needed). 4. Add `maxCompactionRetries = 3` to prevent infinite compaction loops. ## Testing - All 7 existing compaction tests pass unchanged. - Added `PostRunCompactionReEntersStepLoop` test: verifies that when a text-only response triggers compaction, the outer loop re-enters and the agent makes a second stream call with fresh context.	2026-03-04 22:28:23 -05:00
Kyle Carberry	ddfe630757	refactor(chatd): replace fantasy.Agent with custom agent loop (#22507) ## Summary Replaces fantasy's `Agent` abstraction with a direct step loop calling `LanguageModel.Stream()`. Fantasy is retained as the provider abstraction layer (streaming parsers, types, tool schema) but we no longer use `fantasy.Agent`, `AgentStreamCall`, `AgentResult`, or `StepResult`. ## Problems solved \| Problem \| Before \| After \| \|---\|---\|---\| \| Sentinel prompt hack \| fantasy.Agent requires non-empty Prompt → UUID sentinel generated and stripped in PrepareStep \| Messages passed directly to `model.Stream()` \| \| Discarded PersistStep errors \| `_ = opts.OnStepFinish(result)` silently swallows errors \| Errors propagate directly from `PersistStep()` \| \| Shadow draft state \| ~160 LOC tracking content in parallel because fantasy doesn't expose in-progress content on interruption \| `stepResult` owns content directly; `flushActiveState()` is trivial \| \| Nested retry layers \| fantasy's 2-attempt retry nested inside chatretry's indefinite retry \| Single `chatretry.Retry` layer \| \| Callback-mediated compaction \| Mutex + boolean flag + coordination between OnStepFinish/PrepareStep callbacks \| Inline `if` statement between steps \| \| Duplicate compaction paths \| `compactStep()` + `maybeCompact()` sharing ~80% logic \| Single `tryCompact()` function \| ## Changes ### `coderd/chatd/chatloop/chatloop.go` — Rewritten - Removed: `fantasy.NewAgent()`, `AgentStreamCall`, sentinel prompt, shadow draft state (~160 LOC of closures), `compactedMu`/`compacted` flag, `PrepareStepResult` - Added: `stepResult` struct, `processStepStream()` (stream consumer), `executeTools()` (sequential tool execution), `flushActiveState()` (interrupt handling), `buildToolDefinitions()`, `toResponseMessages()` - Changed: `Run()` return type from `(fantasy.AgentResult, error)` to `error` (callers already discarded the result) - Preserved: Anthropic prompt caching, reasoning title extraction, `extractContextLimit()`, `ErrInterrupted` semantics ### `coderd/chatd/chatloop/compaction.go` — Simplified - Merged `compactStep()` + `maybeCompact()` → single `tryCompact()` - Removed `[]StepResult` parameter from `generateCompactionSummary()` (caller provides complete message list) - Kept helper functions: `normalizedCompactionConfig`, `contextTokensFromUsage`, `resolveContextLimit`, `shouldCompact` ### `coderd/chatd/chatd.go` — Caller updates - Removed `AgentStreamCall` construction - Changed `_, err = chatloop.Run(...)` to `err = chatloop.Run(...)` - Model parameters moved from `AgentStreamCall` fields to `RunOptions` fields ### Tests — 4 new tests - `MidLoopCompactionReloadsMessages` — compaction fires mid-loop, messages reloaded - `PostRunCompactionSkippedAfterMidLoop` — no double compaction - `MultiStepToolExecution` — tools execute between steps, results feed next step - `PersistStepErrorPropagates` — persistence errors propagate (was silently discarded)	2026-03-02 18:51:57 -05:00
Kyle Carberry	2bdacae5f5	feat(chatd): add LLM stream retry with exponential backoff (#22418) ## Summary Adds automatic retry with exponential backoff for transient LLM errors during chat streaming and title generation. Inspired by [coder/mux](https://github.com/coder/mux)'s retry mechanism. ## Key Behaviors - Infinite retries with exponential backoff: 1s → 2s → 4s → ... → 60s cap - Deterministic delays (no jitter) - Error classification: retryable (429, 5xx, overloaded, rate limit, network errors) vs non-retryable (auth, quota, context exceeded, model not found, canceled) - Retry status published to SSE stream so frontend can show "Retrying in Xs..." UI - Title generation retries silently (best-effort, nil onRetry callback) ## New Package: `coderd/chatd/chatretry/` \| File \| Purpose \| \|------\|---------\| \| `classify.go` \| `IsRetryable(err)` and `StatusCodeRetryable(code)` \| \| `backoff.go` \| `Delay(attempt)` — exponential doubling with 60s cap \| \| `retry.go` \| `Retry(ctx, fn, onRetry)` — infinite loop with context-aware timer \| ## Test Helpers: `coderd/chatd/chattest/errors.go` Anthropic and OpenAI error response builders for use in chattest providers: - `AnthropicErrorResponse()`, `AnthropicOverloadedResponse()`, `AnthropicRateLimitResponse()` - `OpenAIErrorResponse()`, `OpenAIRateLimitResponse()`, `OpenAIServerErrorResponse()` ## SDK Changes: `codersdk/chats.go` - New `ChatStreamEventType: "retry"` - New `ChatStreamRetry` struct with `Attempt`, `DelayMs`, `Error`, `RetryingAt` fields - TypeScript types auto-generated ## Changed Files - `coderd/chatd/chatloop/chatloop.go` — wraps `agent.Stream()` in `chatretry.Retry()` - `coderd/chatd/chatd.go` — publishes retry events to SSE stream with logging - `coderd/chatd/title.go` — wraps `model.Generate()` in silent retry - `coderd/chatd/chattest/anthropic.go` / `openai.go` — error injection support ## Tests 42 tests covering classification (33), backoff (9), and retry scenarios (8).	2026-02-27 18:34:33 -05:00
Kyle Carberry	edee917d88	feat: add experimental agents support (#22290) feat: add AI chat system with agent tools and chat UI Introduce the chatd subsystem and Agents UI for AI-powered chat within Coder workspaces. - Add chatd package with chat loop, message compaction, prompt management, and LLM provider integration (OpenAI, Anthropic) - Add agent tools: create workspace, list/read templates, read/write/edit files, execute commands - Add chat API endpoints with streaming, message editing, and durable reconnection - Add database schema and migrations for chats, chat messages, chat providers, and chat model configs - Add RBAC policies and dbauthz enforcement for chat resources - Add Agents UI pages with conversation timeline, queued messages list, diff viewer, and model configuration panel - Add comprehensive test coverage including coderd integration tests, chatd unit tests, and Storybook stories - Gate feature behind experiments flag --------- Co-authored-by: Cian Johnston <cian@coder.com> Co-authored-by: Danielle Maywood <danielle@themaywoods.com> Co-authored-by: Jeremy Ruppel <jeremy@coder.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-27 16:50:56 +00:00

7 Commits