coder

mirror of https://github.com/coder/coder.git synced 2026-06-04 13:38:21 +00:00

Author	SHA1	Message	Date
Ethan	65bf7c3b18	fix(coderd/x/chatd/chatloop): stabilize startup-timeout tests with quartz (#24193 ) The startup-timeout integration tests in `chatloop` used a 5ms real-time budget and relied on wall-clock scheduling to fire the startup guard timer before the first stream part arrived. On loaded CI runners the timer sometimes lost the race, producing `attempts == 2` instead of `attempts == 1` and flaking `TestRun_FirstPartDisarmsStartupTimeout`. Replace the real `time.Timer` in `startupGuard` with a `quartz.Timer` so tests can control time deterministically. Production behavior is unchanged: `RunOptions.Clock` defaults to `quartz.NewReal()` when nil, and the startup timeout still covers both opening the provider stream and waiting for the first stream part. - Add `RunOptions.Clock quartz.Clock` with nil-safe default. - Tag the startup guard timer as `"startupGuard"` for quartz trap targeting. - Rewrite the four startup-timeout integration tests to use `quartz.NewMock(t)` with trap/advance/release sequences instead of wall-clock sleeps. - Add `awaitRunResult` helper so tests fail with a clear message instead of hanging when `Run` does not complete. Closes https://github.com/coder/internal/issues/1460	2026-04-10 00:40:09 +10:00
Michael Suchacz	590235138f	fix: pin fixed anthropic/fantasy forks for streaming token accounting (#24077 )	2026-04-08 17:07:39 +00:00
Kyle Carberry	35c26ce22a	feat: add CreatedAt to tool-call and tool-result ChatMessageParts (#24101 ) Adds an optional `CreatedAt` timestamp to `tool-call` and `tool-result` `ChatMessagePart` variants so the frontend can compute tool execution duration (`result.created_at - call.created_at`). Timestamps are recorded at the correct moments in the chatloop: - Tool-call: when the model stream emits the tool call - Tool-result: when tool execution completes (or is interrupted) These are passed through `PersistedStep.PartCreatedAt` so the persistence layer can apply accurate timestamps to stored parts. SSE-published parts also carry `CreatedAt` for real-time display. Old persisted messages without `created_at` deserialize to `nil` — fully backward compatible. <details><summary>Implementation notes (Coder Agents generated)</summary> ### Why not stamp in `PartFromContent`? `PartFromContent` is called both for SSE publishing (correct timing) and during persistence (wrong timing — both tool-call and tool-result would get the same "persistence time" timestamp, yielding ~0 duration). Instead, timestamps are captured in the chatloop at the right moments and carried through `PersistedStep.PartCreatedAt` as a `map[string]time.Time` keyed by `"call:<id>"` / `"result:<id>"`. ### Interrupted tool calls `persistInterruptedStep` also stamps `CreatedAt` on synthetic error results for cancelled/interrupted tool calls, so partial duration is available. ### Files changed \| File \| Change \| \|------\|--------\| \| `codersdk/chats.go` \| Add `CreatedAt *time.Time` field \| \| `codersdk/chats_test.go` \| JSON round-trip test \| \| `coderd/database/dbtime/dbtime.go` \| Add `TimePtr` helper \| \| `coderd/x/chatd/chatloop/chatloop.go` \| Track timestamps, pass through `PersistedStep` \| \| `coderd/x/chatd/chatd.go` \| Apply timestamps during persistence \| \| `coderd/x/chatd/chatprompt/chatprompt_test.go` \| Verify `PartFromContent` does NOT stamp \| \| `site/src/api/typesGenerated.ts` \| Auto-generated \| </details> --------- Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>	2026-04-08 12:42:03 -04:00
Kyle Carberry	b969d66978	feat: add dynamic tools support for chat API (#24036 ) Adds client-executed dynamic tools to the chat API. Dynamic tools are declared by the client at chat creation time, presented to the LLM alongside built-in tools, but executed by the client rather than chatd. This enables external systems (Slack bots, IDE extensions, Discord bots, CI/CD integrations) to plug custom tools into the LLM chat loop without modifying chatd's built-in tool set. Modeled after OpenAI's Assistants API: the chat pauses with `requires_action` status when the LLM calls a dynamic tool, the client POSTs results back via `POST /chats/{id}/tool-results`, and the chat resumes. See [this example](https://github.com/coder/coder-slackbot-poc) as a reference for how this is used. It's highly-configurable, which would enable creating chats from webhooks, periodically polling, or running as a Slackbot. <details> <summary>Design context</summary> ### Architecture The chatloop exits when it encounters dynamic tools and re-enters when results arrive. No blocking channels, no pubsub for tool results, no in-memory registry. The DB is the only coordination mechanism. ``` Phase 1 (chatloop): LLM response → execute built-in tools only → Persist(assistant + built-in results) → status = requires_action → chatloop exits Phase 2 (POST /tool-results): Persist(dynamic tool results) → status = pending → wakeCh → chatloop re-enters ``` ### Validation (POST /tool-results) 1. Chat status must be `requires_action` (409 if not) 2. Read chat's `dynamic_tools` → set of dynamic tool names 3. Read last assistant message → extract tool-call parts matching dynamic tool names 4. Submitted tool_call_ids must match exactly (400 for missing/extra) 5. Persist tool-result message parts, set status to `pending`, signal wake ### Idempotency Tool call IDs scoped per LLM step. State machine (`requires_action` → `pending`) is the guard. First POST wins, subsequent get 409. ### Mixed tool calls When the LLM calls both built-in and dynamic tools in one step, built-in tools execute immediately. Their results are persisted in phase 1. Dynamic tool results arrive via POST in phase 2. The LLM sees all results when the chatloop resumes. </details> > 🤖 Generated by Coder Agents	2026-04-08 11:54:44 -04:00
Kyle Carberry	acd5f01b4b	fix: use GreaterOrEqual for step runtime assertion in chatloop test (#24067 ) Fixes https://github.com/coder/internal/issues/1418 The `TestRun_ActiveToolsPrepareBehavior` test asserts `persistedStep.Runtime > 0`, but on Windows the timer resolution (~15ms) means the in-memory mock model can complete within the same clock tick, producing a measured duration of `0s`. Change the assertion from `require.Greater` to `require.GreaterOrEqual` so that a legitimately measured zero duration on low-resolution clocks does not cause a flake. > Generated by Coder Agents	2026-04-07 02:08:49 +00:00
dylanhuff-at-coder	f796f3645f	fix(coderd): fix isContextLimitKey false positive on max_context_version (#23950 ) `isContextLimitKey` had a fallback heuristic that matched any key starting with `"max"` containing `"context"`, causing false positives on keys like `"max_context_version"`. A provider returning such metadata would have the value parsed as a context limit. Replace substring matching on the separator-stripped key with word-level matching. A new `metadataKeyWords` function tokenizes keys by splitting on separators and camelCase boundaries, then the fallback requires `"context"` paired with a limit-related word (`"limit"`, `"window"` + qualifier, `"length"` + qualifier, or `"tokens"` + qualifier). Known exact forms like `"context_window"` remain in the fast-path switch. Closes https://github.com/coder/coder/issues/23332	2026-04-02 10:07:01 -07:00
Ethan	21c2acbad5	fix: refine chat retry status UX (#23651 ) Follow-up to #23282. The retry and terminal error callouts had a few UX oddities: - Auto-retrying states reused backend error text that said "Please try again" even while the UI was already retrying on behalf of the user. - Terminal error states also said "Please try again" with no action the user could take. - `startup_timeout` had no specific title or retry copy — it fell through to the generic "Retrying request" heading. - The kind pill showed raw enum values like `startup_timeout` and `rate_limit`. - Terminal error metadata showed a "Retryable" / "Not retryable" label that does not help users. - A separate "Provider anthropic" metadata row duplicated information already present in the message body. - The `usage-limit` error kind used a hyphen while every backend kind uses underscores. Changes: Backend (`chaterror/message.go`) - Split message generation into `terminalMessage()` and `retryMessage()`, replacing the old `userFacingMessage()`. - Terminal messages include HTTP status codes and actionable guidance (e.g. "Check the API key, permissions, and billing settings."). - Retry messages are clean factual statements without status codes or remediation, suitable for the retry countdown UI (e.g. "Anthropic is temporarily overloaded."). - Removed "Please try again" / "Please try again later" from all paths. - `StreamRetryPayload` calls `retryMessage()` instead of forwarding `classified.Message`. Frontend - Removed the parallel frontend message-generation system: `getRetryMessage()`, `getProviderDisplayName()`, `getRetryProviderSubject()`, and the `PROVIDER_DISPLAY_NAMES` map are all deleted from `chatStatusHelpers.ts`. - `liveStatusModel.ts` passes `retryState.error` through directly — the backend owns the copy. - Added specific title and retry copy for `startup_timeout`, and extended the title mapping to cover `auth` and `config`. - Kind pills now show humanized labels ("Startup timeout", "Rate limit", etc.) instead of raw enum strings. - Removed the redundant "Provider anthropic" metadata row. - Removed the terminal "Retryable" / "Not retryable" badge. - Normalized `"usage-limit"` → `"usage_limit"` and added it to `ChatProviderFailureKind` so all error kinds follow the same underscore convention and live in one enum. Refs #23282.	2026-03-26 17:37:27 +11:00
Hugo Dutka	84740f4619	fix: save media message type to db (#23427 ) We had a bug where computer use base64-encoded screenshots would not be interpreted as screenshots anymore once saved to the db, loaded back into memory, and sent to Anthropic. Instead, they would be interpreted as regular text. Once a computer use agent made enough screenshots and stopped, and you tried sending it another message, you'd get an out of context error: <img width="808" height="367" alt="Screenshot 2026-03-23 at 12 02 54" src="https://github.com/user-attachments/assets/f0bf6be2-4863-47ca-a7a9-9e6d9dfceeed" /> This PR fixes that.	2026-03-25 17:11:21 +00:00
Ethan	70f031d793	feat(coderd/chatd): structured chat error classification and retry hardening (#23275 ) > PR Stack > 1. #23351 ← `#23282` > 2. #23282 ← `#23275` > 3. #23275 ← `#23349` (you are here) > 4. #23349 ← `main` --- ## Summary Extracts a structured error classification subsystem for agent chat (`chatd`) so that retry and error payloads carry machine-readable metadata — error kind, provider name, HTTP status code, and retryability — instead of raw error strings. This is the backend half of the error-handling work. The frontend counterpart is in #23282. ## Changes ### New package: `coderd/chatd/chaterror/` Canonical error classification — extracts error kind, provider, status code, and user-facing message from raw provider errors. One source of truth that drives both retry policy and stream payloads. - `kind.go`: Error kind enum (`rate_limit`, `timeout`, `auth`, `config`, `overloaded`, `unknown`). - `signals.go`: Signal extraction — parses provider name, HTTP status code, and retryability from error strings and wrapped types. - `classify.go`: Classification logic — maps extracted signals to an error kind. - `message.go`: User-facing message templates keyed by kind + signals. - `payload.go`: Projectors that build `ChatStreamError` and `ChatStreamRetry` payloads from a classified error. ### Modified - `codersdk/chats.go`: Added `Kind`, `Provider`, `Retryable`, `StatusCode` fields to `ChatStreamError` and `ChatStreamRetry`. - `coderd/chatd/chatretry/`: Thinned to retry-policy only; classification logic moved to `chaterror`. - `coderd/chatd/chatloop/`: Added per-attempt first-chunk timeout (60 s) via `guardedStream` wrapper — produces retryable `startup_timeout` errors instead of hanging forever. - `coderd/chatd/chatd.go`: Publishes normalized retry/error payloads via `chaterror` projectors.	2026-03-25 13:47:54 +11:00
Kyle Carberry	3812b504fc	fix(coderd/x/chatd): prevent nil required field in MCP tool schemas for OpenAI (#23538 )	2026-03-24 18:29:41 -04:00
Michael Suchacz	02356c61f6	fix: use previous_response_id chaining for OpenAI store=true follow-ups (#23450 ) OpenAI Responses follow-up turns were replaying full assistant/tool history even when `store=true`, which breaks after reasoning + provider-executed `web_search` output. This change persists the OpenAI response ID on assistant messages, then in `coderd/x/chatd` switches `store=true` follow-ups to `previous_response_id` chaining with a system + new-user-only prompt. `store=false` and missing-ID cases still fall back to manual replay. It also updates the fake OpenAI server and integration coverage for the chaining contract, and carries the rebased path move to `coderd/x/chatd` plus the migration renumber needed after rebasing onto `main`.	2026-03-24 14:57:40 +01:00
Cian Johnston	80a172f932	chore: move chatd and related packages to /x/ subpackage (#23445 ) - Moves `coderd/chatd/`, `coderd/gitsync/`, `enterprise/coderd/chatd/` under `x/` parent directories to signal instability - Adds `Experimental:` glue code comments in `coderd/coderd.go` > 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑‍💻	2026-03-23 17:34:43 +00:00

12 Commits