coder

mirror of https://github.com/coder/coder.git synced 2026-06-04 13:38:21 +00:00

Author	SHA1	Message	Date
Kyle Carberry	d11849d94a	fix: re-fetch context files and skills from workspace on each turn (#24360)	2026-04-15 16:41:15 -04:00

Context files (AGENTS.md) and skills were only fetched from the workspace on the first turn or when the agent changed. On subsequent turns, stale content from persisted messages was used. This meant that if AGENTS.md or skills were modified on the workspace between turns, the agent wouldn't see the changes until the user created a new chat.

## Changes

- Extract `fetchWorkspaceContext` from `persistInstructionFiles` to allow fetching workspace context without persisting
- On subsequent turns, re-fetch fresh context from the workspace instead of reading stale persisted content; falls back to persisted messages if the workspace dial fails
- Update `ReloadMessages` callback to re-derive instruction and skills from reloaded database messages after compaction, instead of using captured closure variables
- Add `formatSystemInstructionsFromParts` helper to build system instructions directly from agent parts without requiring separate OS/directory params
- Add tests for the new helper

<details><summary>Implementation Notes</summary>

### Root cause

In `runChat`, the `else if hasContextFiles` branch (subsequent turns) called `instructionFromContextFiles(messages)` which read stale content from persisted DB messages. The `ReloadMessages` callback (post-compaction) also used captured `instruction`/`skills` closure variables from the start of the turn, never re-deriving them.

### Approach

1. Extract `fetchWorkspaceContext`: Pure refactor of the fetch-only part of `persistInstructionFiles` (agent connection, context config retrieval, content sanitization, metadata stamping). Returns parts + skills without persisting.
2. Subsequent turns: Instead of reading from persisted messages, launch a `g2` goroutine that calls `fetchWorkspaceContext` to get fresh context from the workspace. Falls back gracefully to persisted messages if the workspace is unreachable.
3. ReloadMessages: Re-derive `instruction` from `instructionFromContextFiles(reloadedMsgs)` and `skills` from `skillsFromParts(reloadedMsgs)` using the freshly loaded messages, with fallback to captured values if the reloaded messages don't contain context (e.g. compacted away).

</details>

> 🤖 Generated by Coder Agents
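The fetch-then-fall-back behavior described in d11849d94a can be illustrated with a minimal Go sketch. The types and the `resolveContext` helper below are hypothetical and are not taken from the coder codebase; the real `fetchWorkspaceContext` signature is not shown in the commit message, so a plain function value stands in for it.

```go
package main

import (
	"errors"
	"fmt"
)

// workspaceContext is a hypothetical stand-in for the instruction text and
// skills that are derived from workspace context files.
type workspaceContext struct {
	Instruction string
	Skills      []string
}

// fetchFunc is an assumed shape for a fetch-only call; the real function's
// parameters are not part of the commit message.
type fetchFunc func() (workspaceContext, error)

// errDial simulates an unreachable workspace.
var errDial = errors.New("workspace dial failed")

// resolveContext prefers fresh context from the workspace and falls back to
// the persisted copy when the fetch fails. The boolean reports whether the
// returned context is fresh.
func resolveContext(fetch fetchFunc, persisted workspaceContext) (workspaceContext, bool) {
	fresh, err := fetch()
	if err != nil {
		return persisted, false
	}
	return fresh, true
}

func main() {
	persisted := workspaceContext{Instruction: "old AGENTS.md"}

	reachable := func() (workspaceContext, error) {
		return workspaceContext{Instruction: "new AGENTS.md"}, nil
	}
	unreachable := func() (workspaceContext, error) {
		return workspaceContext{}, errDial
	}

	got, fresh := resolveContext(reachable, persisted)
	fmt.Println(got.Instruction, fresh)

	got, fresh = resolveContext(unreachable, persisted)
	fmt.Println(got.Instruction, fresh)
}
```

Returning the freshness flag alongside the context is one possible design; it lets a caller log or surface the fact that stale content was used without treating the fallback as an error.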
Thomas Kosiewski	8382e96a81	feat: add types, context, and model normalization (#23914)	2026-04-13 19:59:47 +02:00
Ethan	65bf7c3b18	fix(coderd/x/chatd/chatloop): stabilize startup-timeout tests with quartz (#24193)	2026-04-10 00:40:09 +10:00

The startup-timeout integration tests in `chatloop` used a 5ms real-time budget and relied on wall-clock scheduling to fire the startup guard timer before the first stream part arrived. On loaded CI runners the timer sometimes lost the race, producing `attempts == 2` instead of `attempts == 1` and flaking `TestRun_FirstPartDisarmsStartupTimeout`.

Replace the real `time.Timer` in `startupGuard` with a `quartz.Timer` so tests can control time deterministically. Production behavior is unchanged: `RunOptions.Clock` defaults to `quartz.NewReal()` when nil, and the startup timeout still covers both opening the provider stream and waiting for the first stream part.

- Add `RunOptions.Clock quartz.Clock` with nil-safe default.
- Tag the startup guard timer as `"startupGuard"` for quartz trap targeting.
- Rewrite the four startup-timeout integration tests to use `quartz.NewMock(t)` with trap/advance/release sequences instead of wall-clock sleeps.
- Add `awaitRunResult` helper so tests fail with a clear message instead of hanging when `Run` does not complete.

Closes https://github.com/coder/internal/issues/1460
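The general idea behind 65bf7c3b18, injecting a clock with a nil-safe default so tests can move time by hand, can be sketched with the standard library alone. This is not the quartz API: the `clock` interface, `fakeClock`, `runOptions`, and `deadlineGuard` below are hypothetical simplifications, and the deadline check replaces the trap/advance/release timer mechanics that quartz provides.

```go
package main

import (
	"fmt"
	"time"
)

// clock is a hypothetical minimal abstraction; quartz.Clock is richer.
type clock interface {
	Now() time.Time
}

// realClock reads the wall clock and is the production default.
type realClock struct{}

func (realClock) Now() time.Time { return time.Now() }

// fakeClock only moves when a test advances it explicitly.
type fakeClock struct{ now time.Time }

func (f *fakeClock) Now() time.Time          { return f.now }
func (f *fakeClock) Advance(d time.Duration) { f.now = f.now.Add(d) }

// epoch is an arbitrary fixed start time for deterministic tests.
var epoch = time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)

// runOptions mirrors the idea of an optional clock field.
type runOptions struct {
	Clock          clock // nil means "use the real clock"
	StartupTimeout time.Duration
}

// clock returns the configured clock, or the real one when none is set.
func (o runOptions) clock() clock {
	if o.Clock == nil {
		return realClock{}
	}
	return o.Clock
}

// deadlineGuard reports whether the startup budget has been used up.
type deadlineGuard struct {
	clock    clock
	deadline time.Time
}

func newDeadlineGuard(o runOptions) *deadlineGuard {
	c := o.clock()
	return &deadlineGuard{clock: c, deadline: c.Now().Add(o.StartupTimeout)}
}

// Expired is true once the clock has reached or passed the deadline.
func (g *deadlineGuard) Expired() bool {
	return !g.clock.Now().Before(g.deadline)
}

func main() {
	fc := &fakeClock{now: epoch}
	guard := newDeadlineGuard(runOptions{Clock: fc, StartupTimeout: 5 * time.Millisecond})

	fc.Advance(4 * time.Millisecond)
	fmt.Println("expired after 4ms:", guard.Expired())

	fc.Advance(1 * time.Millisecond)
	fmt.Println("expired after 5ms:", guard.Expired())
}
```

Because the fake clock never moves on its own, the outcome no longer depends on how busy the machine is, which is the property the commit is after.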
Michael Suchacz	590235138f	fix: pin fixed anthropic/fantasy forks for streaming token accounting (#24077)	2026-04-08 17:07:39 +00:00
Kyle Carberry	35c26ce22a	feat: add CreatedAt to tool-call and tool-result ChatMessageParts (#24101)	2026-04-08 12:42:03 -04:00

Adds an optional `CreatedAt` timestamp to `tool-call` and `tool-result` `ChatMessagePart` variants so the frontend can compute tool execution duration (`result.created_at - call.created_at`).

Timestamps are recorded at the correct moments in the chatloop:

- Tool-call: when the model stream emits the tool call
- Tool-result: when tool execution completes (or is interrupted)

These are passed through `PersistedStep.PartCreatedAt` so the persistence layer can apply accurate timestamps to stored parts. SSE-published parts also carry `CreatedAt` for real-time display.

Old persisted messages without `created_at` deserialize to `nil`, so the change is fully backward compatible.

<details><summary>Implementation notes (Coder Agents generated)</summary>

### Why not stamp in `PartFromContent`?

`PartFromContent` is called both for SSE publishing (correct timing) and during persistence (wrong timing: both tool-call and tool-result would get the same "persistence time" timestamp, yielding ~0 duration). Instead, timestamps are captured in the chatloop at the right moments and carried through `PersistedStep.PartCreatedAt` as a `map[string]time.Time` keyed by `"call:<id>"` / `"result:<id>"`.

### Interrupted tool calls

`persistInterruptedStep` also stamps `CreatedAt` on synthetic error results for cancelled/interrupted tool calls, so partial duration is available.

### Files changed

| File | Change |
|------|--------|
| `codersdk/chats.go` | Add `CreatedAt *time.Time` field |
| `codersdk/chats_test.go` | JSON round-trip test |
| `coderd/database/dbtime/dbtime.go` | Add `TimePtr` helper |
| `coderd/x/chatd/chatloop/chatloop.go` | Track timestamps, pass through `PersistedStep` |
| `coderd/x/chatd/chatd.go` | Apply timestamps during persistence |
| `coderd/x/chatd/chatprompt/chatprompt_test.go` | Verify `PartFromContent` does NOT stamp |
| `site/src/api/typesGenerated.ts` | Auto-generated |

</details>

---------

Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>
Kyle Carberry	acd5f01b4b	fix: use GreaterOrEqual for step runtime assertion in chatloop test (#24067)	2026-04-07 02:08:49 +00:00

Fixes https://github.com/coder/internal/issues/1418

The `TestRun_ActiveToolsPrepareBehavior` test asserts `persistedStep.Runtime > 0`, but on Windows the timer resolution (~15ms) means the in-memory mock model can complete within the same clock tick, producing a measured duration of `0s`.

Change the assertion from `require.Greater` to `require.GreaterOrEqual` so that a legitimately measured zero duration on low-resolution clocks does not cause a flake.

> Generated by Coder Agents
Ethan	21c2acbad5	fix: refine chat retry status UX (#23651)	2026-03-26 17:37:27 +11:00

Follow-up to #23282. The retry and terminal error callouts had a few UX oddities:

- Auto-retrying states reused backend error text that said "Please try again" even while the UI was already retrying on behalf of the user.
- Terminal error states also said "Please try again" with no action the user could take.
- `startup_timeout` had no specific title or retry copy; it fell through to the generic "Retrying request" heading.
- The kind pill showed raw enum values like `startup_timeout` and `rate_limit`.
- Terminal error metadata showed a "Retryable" / "Not retryable" label that does not help users.
- A separate "Provider anthropic" metadata row duplicated information already present in the message body.
- The `usage-limit` error kind used a hyphen while every backend kind uses underscores.

Changes:

Backend (`chaterror/message.go`)

- Split message generation into `terminalMessage()` and `retryMessage()`, replacing the old `userFacingMessage()`.
- Terminal messages include HTTP status codes and actionable guidance (e.g. "Check the API key, permissions, and billing settings.").
- Retry messages are clean factual statements without status codes or remediation, suitable for the retry countdown UI (e.g. "Anthropic is temporarily overloaded.").
- Removed "Please try again" / "Please try again later" from all paths.
- `StreamRetryPayload` calls `retryMessage()` instead of forwarding `classified.Message`.

Frontend

- Removed the parallel frontend message-generation system: `getRetryMessage()`, `getProviderDisplayName()`, `getRetryProviderSubject()`, and the `PROVIDER_DISPLAY_NAMES` map are all deleted from `chatStatusHelpers.ts`.
- `liveStatusModel.ts` passes `retryState.error` through directly; the backend owns the copy.
- Added specific title and retry copy for `startup_timeout`, and extended the title mapping to cover `auth` and `config`.
- Kind pills now show humanized labels ("Startup timeout", "Rate limit", etc.) instead of raw enum strings.
- Removed the redundant "Provider anthropic" metadata row.
- Removed the terminal "Retryable" / "Not retryable" badge.
- Normalized `"usage-limit"` → `"usage_limit"` and added it to `ChatProviderFailureKind` so all error kinds follow the same underscore convention and live in one enum.

Refs #23282.
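Two of the small transformations listed in 21c2acbad5, normalizing `"usage-limit"` to `"usage_limit"` and showing humanized labels such as "Startup timeout", can be sketched in a few lines. The real change lives in the TypeScript frontend and may well use an explicit lookup table; the Go helpers `normalizeKind` and `humanizeKind` below are hypothetical and assume ASCII kind names.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeKind rewrites hyphenated kinds to the underscore convention.
func normalizeKind(kind string) string {
	return strings.ReplaceAll(kind, "-", "_")
}

// humanizeKind turns an enum-style kind into a label for display:
// underscores become spaces and the first letter is upper-cased.
// It assumes ASCII input, which holds for the kinds named in the commit.
func humanizeKind(kind string) string {
	words := strings.ReplaceAll(normalizeKind(kind), "_", " ")
	if words == "" {
		return ""
	}
	return strings.ToUpper(words[:1]) + words[1:]
}

func main() {
	for _, kind := range []string{"startup_timeout", "rate_limit", "usage-limit"} {
		// Prints, in order: startup_timeout -> Startup timeout,
		// rate_limit -> Rate limit, usage_limit -> Usage limit.
		fmt.Printf("%s -> %s\n", normalizeKind(kind), humanizeKind(kind))
	}
}
```

A derived label keeps new kinds readable without extra work, while an explicit mapping gives full control over the copy; the commit message does not say which approach the frontend takes.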
Ethan	70f031d793	feat(coderd/chatd): structured chat error classification and retry hardening (#23275)	2026-03-25 13:47:54 +11:00

> PR Stack
> 1. #23351 ← `#23282`
> 2. #23282 ← `#23275`
> 3. #23275 ← `#23349` (you are here)
> 4. #23349 ← `main`

---

## Summary

Extracts a structured error classification subsystem for agent chat (`chatd`) so that retry and error payloads carry machine-readable metadata (error kind, provider name, HTTP status code, and retryability) instead of raw error strings.

This is the backend half of the error-handling work. The frontend counterpart is in #23282.

## Changes

### New package: `coderd/chatd/chaterror/`

Canonical error classification: extracts error kind, provider, status code, and user-facing message from raw provider errors. One source of truth that drives both retry policy and stream payloads.

- `kind.go`: Error kind enum (`rate_limit`, `timeout`, `auth`, `config`, `overloaded`, `unknown`).
- `signals.go`: Signal extraction. Parses provider name, HTTP status code, and retryability from error strings and wrapped types.
- `classify.go`: Classification logic. Maps extracted signals to an error kind.
- `message.go`: User-facing message templates keyed by kind + signals.
- `payload.go`: Projectors that build `ChatStreamError` and `ChatStreamRetry` payloads from a classified error.

### Modified

- `codersdk/chats.go`: Added `Kind`, `Provider`, `Retryable`, `StatusCode` fields to `ChatStreamError` and `ChatStreamRetry`.
- `coderd/chatd/chatretry/`: Thinned to retry-policy only; classification logic moved to `chaterror`.
- `coderd/chatd/chatloop/`: Added per-attempt first-chunk timeout (60 s) via `guardedStream` wrapper, which produces retryable `startup_timeout` errors instead of hanging forever.
- `coderd/chatd/chatd.go`: Publishes normalized retry/error payloads via `chaterror` projectors.
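The shape of the classification in 70f031d793, mapping extracted signals to a kind plus retryability, can be sketched as below. The kind names come from the commit message; everything else is an assumption made for illustration, in particular the `classified` struct, the `classify` function, and the specific status-code-to-kind mapping (429, 401/403, 408/504, 503/529), none of which is confirmed by the source.

```go
package main

import (
	"fmt"
	"net/http"
)

// Kind is the machine-readable error kind carried in stream payloads.
type Kind string

const (
	KindRateLimit  Kind = "rate_limit"
	KindTimeout    Kind = "timeout"
	KindAuth       Kind = "auth"
	KindConfig     Kind = "config"
	KindOverloaded Kind = "overloaded"
	KindUnknown    Kind = "unknown"
)

// classified is a hypothetical result type holding the metadata the commit
// says payloads should carry.
type classified struct {
	Kind       Kind
	Provider   string
	StatusCode int
	Retryable  bool
}

// classify maps an HTTP status code to a kind. The mapping is an assumption
// for this sketch; the real package also inspects error strings and wrapped
// error types.
func classify(provider string, statusCode int) classified {
	c := classified{Kind: KindUnknown, Provider: provider, StatusCode: statusCode}
	switch statusCode {
	case http.StatusTooManyRequests:
		c.Kind, c.Retryable = KindRateLimit, true
	case http.StatusUnauthorized, http.StatusForbidden:
		c.Kind = KindAuth // retrying will not fix bad credentials
	case http.StatusRequestTimeout, http.StatusGatewayTimeout:
		c.Kind, c.Retryable = KindTimeout, true
	case http.StatusServiceUnavailable, 529:
		c.Kind, c.Retryable = KindOverloaded, true
	}
	return c
}

func main() {
	for _, code := range []int{429, 401, 504, 529, 418} {
		c := classify("anthropic", code)
		fmt.Printf("%d -> kind=%s retryable=%t\n", c.StatusCode, c.Kind, c.Retryable)
	}
}
```

With a single classification result, both the retry policy and the payload builders can read the same fields, which is the "one source of truth" property the summary describes.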
Cian Johnston	80a172f932	chore: move chatd and related packages to /x/ subpackage (#23445)	2026-03-23 17:34:43 +00:00

- Moves `coderd/chatd/`, `coderd/gitsync/`, `enterprise/coderd/chatd/` under `x/` parent directories to signal instability
- Adds `Experimental:` glue code comments in `coderd/coderd.go`

> 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑‍💻

9 Commits