Commit Graph

7 Commits

Author SHA1 Message Date
Mathias Fredriksson 72689c2552 fix(coderd): improve error handling in chattest, chattool, and chats (#23047)
- Use t.Errorf in chattest non-streaming helpers so encoding
  failures fail the test
- Thread testing.TB into writeResponsesAPIStreaming and log
  SSE write errors instead of silently dropping them
- Bump createworkspace DB error log from Warn to Error
- Use errors.Join for timeout + output error in execute.go
2026-03-13 21:41:24 +02:00
Mathias Fredriksson 9d33c340ec fix(coderd): handle ignored errors across coderd packages (#22851)
Handle previously ignored error return values in coderd:

- coderd/chats.go: check sendEvent errors, log on failure
- coderd/chatd/chattest: thread testing.TB through server structs,
  replace log.Printf with t.Logf, check writeSSEEvent errors
- coderd/chatd/chattool/createworkspace.go: log UpdateChatWorkspace
  failure instead of discarding both return values
- coderd/chatd/chattool/execute.go: surface ProcessOutput error in
  the timeout message returned to the caller
- coderd/provisionerdserver: log stream.Send failure in the
  DownloadFile error helper
2026-03-13 19:53:20 +02:00
Danielle Maywood f9891416c0 fix: emit Responses API lifecycle events in mock OpenAI server (#22702) 2026-03-06 12:35:44 +00:00
Kyle Carberry f4a7fa5b95 fix(chatd): block subagents from spawning workspaces (#22603)
## Summary

Subagent (child) chats were previously given access to workspace
provisioning tools (`list_templates`, `read_template`,
`create_workspace`), which could lead to uncontrolled resource
consumption. This PR moves those tools behind the same
`!chat.ParentChatID.Valid` gate that already protects the subagent tools
(`spawn_agent`, `wait_agent`, etc.).

## Changes

- **`coderd/chatd/chatd.go`**: Moved `list_templates`, `read_template`,
and `create_workspace` tool registration into the root-chat-only block
alongside subagent tools.
- **`coderd/chatd/chatd_test.go`**: Added
`TestSubagentChatExcludesWorkspaceProvisioningTools` — an E2E test that
spawns a subagent via a root chat and verifies the subagent's LLM call
does not include workspace provisioning or subagent tools.
- **`coderd/chatd/chattest/openai.go`**: Added `Tools` field to
`OpenAIRequest` and supporting `OpenAITool`/`OpenAIToolFunction` types
so tests can inspect which tools are sent to the model.
2026-03-04 15:49:14 +00:00
Kyle Carberry a33ca95df2 fix(chatd): prevent chat re-acquisition during server shutdown (#22497)
Fixes https://github.com/coder/internal/issues/1371

## Problem

`TestCloseDuringShutdownContextCanceledShouldRetryOnNewReplica` flakes
intermittently in CI. The observed failure is that the chat never
reaches `pending` status after `serverA.Close()`.

## Root cause

Race between context cancellation and the mock OpenAI server's stream
completion marker.

When `Close()` cancels the server context, the in-flight HTTP streaming
request is canceled. The mock server's handler detects this via
`req.Context().Done()` and closes its chunks channel. The mock's
`writeChatCompletionsStreaming` then writes `data: [DONE]` — the SSE
completion marker. On a loopback connection, this marker can reach the
client **before** the client's HTTP transport honors the context
cancellation.

When this happens:
1. The client sees a successful stream completion (not an error)
2. `chatloop.Run` returns `nil`
3. `processChat` falls through without error → status stays `waiting`
(the default)
4. The test expects `pending` → **flake**

## Fix

Skip writing the `[DONE]` marker when the request context is already
canceled, in both `writeChatCompletionsStreaming` and
`writeResponsesAPIStreaming`.
2026-03-02 18:00:21 +00:00
Kyle Carberry 2bdacae5f5 feat(chatd): add LLM stream retry with exponential backoff (#22418)
## Summary

Adds automatic retry with exponential backoff for transient LLM errors
during chat streaming and title generation. Inspired by
[coder/mux](https://github.com/coder/mux)'s retry mechanism.

## Key Behaviors

- **Infinite retries** with exponential backoff: 1s → 2s → 4s → ... →
60s cap
- **Deterministic delays** (no jitter)
- **Error classification**: retryable (429, 5xx, overloaded, rate limit,
network errors) vs non-retryable (auth, quota, context exceeded, model
not found, canceled)
- **Retry status published to SSE stream** so frontend can show
"Retrying in Xs..." UI
- **Title generation** retries silently (best-effort, nil onRetry
callback)

## New Package: `coderd/chatd/chatretry/`

| File | Purpose |
|------|---------|
| `classify.go` | `IsRetryable(err)` and `StatusCodeRetryable(code)` |
| `backoff.go` | `Delay(attempt)` — exponential doubling with 60s cap |
| `retry.go` | `Retry(ctx, fn, onRetry)` — infinite loop with
context-aware timer |

## Test Helpers: `coderd/chatd/chattest/errors.go`

Anthropic and OpenAI error response builders for use in chattest
providers:
- `AnthropicErrorResponse()`, `AnthropicOverloadedResponse()`,
`AnthropicRateLimitResponse()`
- `OpenAIErrorResponse()`, `OpenAIRateLimitResponse()`,
`OpenAIServerErrorResponse()`

## SDK Changes: `codersdk/chats.go`

- New `ChatStreamEventType: "retry"`
- New `ChatStreamRetry` struct with `Attempt`, `DelayMs`, `Error`,
`RetryingAt` fields
- TypeScript types auto-generated

## Changed Files

- `coderd/chatd/chatloop/chatloop.go` — wraps `agent.Stream()` in
`chatretry.Retry()`
- `coderd/chatd/chatd.go` — publishes retry events to SSE stream with
logging
- `coderd/chatd/title.go` — wraps `model.Generate()` in silent retry
- `coderd/chatd/chattest/anthropic.go` / `openai.go` — error injection
support

## Tests

42 tests covering classification (33), backoff (9), and retry scenarios
(8).
2026-02-27 18:34:33 -05:00
Kyle Carberry edee917d88 feat: add experimental agents support (#22290)
feat: add AI chat system with agent tools and chat UI

Introduce the chatd subsystem and Agents UI for AI-powered chat
within Coder workspaces.

- Add chatd package with chat loop, message compaction, prompt
  management, and LLM provider integration (OpenAI, Anthropic)
- Add agent tools: create workspace, list/read templates, read/write/
  edit files, execute commands
- Add chat API endpoints with streaming, message editing, and
  durable reconnection
- Add database schema and migrations for chats, chat messages, chat
  providers, and chat model configs
- Add RBAC policies and dbauthz enforcement for chat resources
- Add Agents UI pages with conversation timeline, queued messages
  list, diff viewer, and model configuration panel
- Add comprehensive test coverage including coderd integration tests,
  chatd unit tests, and Storybook stories
- Gate feature behind experiments flag

---------

Co-authored-by: Cian Johnston <cian@coder.com>
Co-authored-by: Danielle Maywood <danielle@themaywoods.com>
Co-authored-by: Jeremy Ruppel <jeremy@coder.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 16:50:56 +00:00