Commit Graph

4 Commits

Author SHA1 Message Date
Ethan 70f031d793 feat(coderd/chatd): structured chat error classification and retry hardening (#23275)
> **PR Stack**
> 1. #23351 ← `#23282`
> 2. #23282 ← `#23275`
> 3. **#23275** ← `#23349` *(you are here)*
> 4. #23349 ← `main`

---

## Summary

Extracts a structured error classification subsystem for agent chat
(`chatd`) so that retry and error payloads carry machine-readable
metadata — error kind, provider name, HTTP status code, and retryability
— instead of raw error strings.

This is the **backend half** of the error-handling work. The frontend
counterpart is in #23282.

## Changes

### New package: `coderd/chatd/chaterror/`

Canonical error classification — extracts error kind, provider, status
code, and user-facing message from raw provider errors. One source of
truth that drives both retry policy and stream payloads.

- **`kind.go`**: Error kind enum (`rate_limit`, `timeout`, `auth`,
`config`, `overloaded`, `unknown`).
- **`signals.go`**: Signal extraction — parses provider name, HTTP
status code, and retryability from error strings and wrapped types.
- **`classify.go`**: Classification logic — maps extracted signals to an
error kind.
- **`message.go`**: User-facing message templates keyed by kind +
signals.
- **`payload.go`**: Projectors that build `ChatStreamError` and
`ChatStreamRetry` payloads from a classified error.

### Modified

- **`codersdk/chats.go`**: Added `Kind`, `Provider`, `Retryable`,
`StatusCode` fields to `ChatStreamError` and `ChatStreamRetry`.
- **`coderd/chatd/chatretry/`**: Thinned to retry-policy only;
classification logic moved to `chaterror`.
- **`coderd/chatd/chatloop/`**: Added per-attempt first-chunk timeout
(60 s) via `guardedStream` wrapper — produces retryable
`startup_timeout` errors instead of hanging forever.
- **`coderd/chatd/chatd.go`**: Publishes normalized retry/error payloads
via `chaterror` projectors.
2026-03-25 13:47:54 +11:00
Kyle Carberry 3812b504fc fix(coderd/x/chatd): prevent nil required field in MCP tool schemas for OpenAI (#23538) 2026-03-24 18:29:41 -04:00
Michael Suchacz 02356c61f6 fix: use previous_response_id chaining for OpenAI store=true follow-ups (#23450)
OpenAI Responses follow-up turns were replaying full assistant/tool
history even when `store=true`, which breaks after reasoning +
provider-executed `web_search` output.

This change persists the OpenAI response ID on assistant messages, then
in `coderd/x/chatd` switches `store=true` follow-ups to
`previous_response_id` chaining with a system + new-user-only prompt.
`store=false` and missing-ID cases still fall back to manual replay.

It also updates the fake OpenAI server and integration coverage for the
chaining contract, and carries the rebased path move to `coderd/x/chatd`
plus the migration renumber needed after rebasing onto `main`.
2026-03-24 14:57:40 +01:00
Cian Johnston 80a172f932 chore: move chatd and related packages to /x/ subpackage (#23445)
- Moves `coderd/chatd/`, `coderd/gitsync/`, `enterprise/coderd/chatd/`
under `x/` parent directories to signal instability
- Adds `Experimental:` glue code comments in `coderd/coderd.go`

> 🤖 This PR was created with the help of Coder Agents, and was
reviewed by my human. 🧑‍💻
2026-03-23 17:34:43 +00:00